research article open access multilevel comparative ...the distribution of gene expression (figure...

13
RESEARCH ARTICLE Open Access Multilevel comparative analysis of the contributions of genome reduction and heat shock to the Escherichia coli transcriptome Bei-Wen Ying 1, Shigeto Seno 1, Fuyuro Kaneko 1 , Hideo Matsuda 1 and Tetsuya Yomo 1,2,3* Abstract Background: Both large deletions in genome and heat shock stress would lead to alterations in the gene expression profile; however, whether there is any potential linkage between these disturbances to the transcriptome have not been discovered. Here, the relationship between the genomic and environmental contributions to the transcriptome was analyzed by comparing the transcriptomes of the bacterium Escherichia coli (strain MG1655 and its extensive genomic deletion derivative, MDS42) grown in regular and transient heat shock conditions. Results: The transcriptome analysis showed the following: (i) there was a reorganization of the transcriptome in accordance with preferred chromosomal periodicity upon genomic or heat shock perturbation; (ii) there was a considerable overlap between the perturbed regulatory networks and the categories enriched for differentially expressed genes (DEGs) following genome reduction and heat shock; (iii) the genes sensitive to genome reduction tended to be located close to genomic scars, and some were also highly responsive to heat shock; and (iv) the genomic and environmental contributions to the transcriptome displayed not only a positive correlation but also a negatively compensated relationship (i.e., antagonistic epistasis). Conclusion: The contributions of genome reduction and heat shock to the Escherichia coli transcriptome were evaluated at multiple levels. The observations of overlapping perturbed networks, directional similarity in transcriptional changes, positive correlation and epistatic nature linked the two contributions and suggest somehow a crosstalk guiding transcriptional reorganization in response to both genetic and environmental disturbances in bacterium E. coli. Keywords: Transcriptome, Negative epistasis, Genome reduction, Chromosomal periodicity, Regulatory network, Transcriptional change, Genomic interruption, Environmental perturbation, Heat shock, Directionality Background The bacterial transcriptome, as one of the important genome-level information, reflects the global cellular ac- tivity (i.e., fitness). Since the fitness and/or activity of the cells is directly influenced by the environmental perturb- ation (e.g., heat or cold stress) and genetic disturbance (e.g., mutations or deletions in genome), the evaluation of the environmental and genetic contributions to tran- scriptome turns to be highly essential. In recent, the transcriptome analysis (i.e., gene expression profiling) of bacteria (e.g., Escherichia coli) has been performed ex- tensively using a range of experimental and analytical tools [1-7], to illustrate an overall picture of the bio- chemical events occurring inside the cells. In particular, to examine the environmental contributions to bacterial transcriptome, gene expression profiling has been tested across multiple environmental conditions in a high- throughput manner [8-10]. As a consequence, the inten- sive studies on global transcriptional activity contributed by environmental perturbations successfully provided conclusions regarding the molecular mechanisms involved * Correspondence: [email protected] Equal contributors 1 Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan 2 Graduate School of Frontier Biosciences, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan Full list of author information is available at the end of the article © 2013 Ying et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Ying et al. BMC Genomics 2013, 14:25 http://www.biomedcentral.com/1471-2164/14/25

Upload: others

Post on 27-May-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

Ying et al. BMC Genomics 2013, 14:25http://www.biomedcentral.com/1471-2164/14/25

RESEARCH ARTICLE Open Access

Multilevel comparative analysis of thecontributions of genome reduction and heatshock to the Escherichia coli transcriptomeBei-Wen Ying1†, Shigeto Seno1†, Fuyuro Kaneko1, Hideo Matsuda1 and Tetsuya Yomo1,2,3*

Abstract

Background: Both large deletions in genome and heat shock stress would lead to alterations in the geneexpression profile; however, whether there is any potential linkage between these disturbances to thetranscriptome have not been discovered. Here, the relationship between the genomic and environmentalcontributions to the transcriptome was analyzed by comparing the transcriptomes of the bacterium Escherichiacoli (strain MG1655 and its extensive genomic deletion derivative, MDS42) grown in regular and transient heatshock conditions.

Results: The transcriptome analysis showed the following: (i) there was a reorganization of the transcriptome inaccordance with preferred chromosomal periodicity upon genomic or heat shock perturbation; (ii) there was aconsiderable overlap between the perturbed regulatory networks and the categories enriched for differentiallyexpressed genes (DEGs) following genome reduction and heat shock; (iii) the genes sensitive to genome reductiontended to be located close to genomic scars, and some were also highly responsive to heat shock; and (iv) thegenomic and environmental contributions to the transcriptome displayed not only a positive correlation but also anegatively compensated relationship (i.e., antagonistic epistasis).

Conclusion: The contributions of genome reduction and heat shock to the Escherichia coli transcriptome wereevaluated at multiple levels. The observations of overlapping perturbed networks, directional similarity intranscriptional changes, positive correlation and epistatic nature linked the two contributions and suggestsomehow a crosstalk guiding transcriptional reorganization in response to both genetic and environmentaldisturbances in bacterium E. coli.

Keywords: Transcriptome, Negative epistasis, Genome reduction, Chromosomal periodicity, Regulatory network,Transcriptional change, Genomic interruption, Environmental perturbation, Heat shock, Directionality

BackgroundThe bacterial transcriptome, as one of the importantgenome-level information, reflects the global cellular ac-tivity (i.e., fitness). Since the fitness and/or activity of thecells is directly influenced by the environmental perturb-ation (e.g., heat or cold stress) and genetic disturbance(e.g., mutations or deletions in genome), the evaluation

* Correspondence: [email protected]†Equal contributors1Graduate School of Information Science and Technology, Osaka University,1-5 Yamadaoka, Suita, Osaka 565-0871, Japan2Graduate School of Frontier Biosciences, Osaka University, 1-5 Yamadaoka,Suita, Osaka 565-0871, JapanFull list of author information is available at the end of the article

© 2013 Ying et al.; licensee BioMed Central LtCommons Attribution License (http://creativecreproduction in any medium, provided the or

of the environmental and genetic contributions to tran-scriptome turns to be highly essential. In recent, thetranscriptome analysis (i.e., gene expression profiling) ofbacteria (e.g., Escherichia coli) has been performed ex-tensively using a range of experimental and analyticaltools [1-7], to illustrate an overall picture of the bio-chemical events occurring inside the cells. In particular,to examine the environmental contributions to bacterialtranscriptome, gene expression profiling has been testedacross multiple environmental conditions in a high-throughput manner [8-10]. As a consequence, the inten-sive studies on global transcriptional activity contributedby environmental perturbations successfully providedconclusions regarding the molecular mechanisms involved

d. This is an Open Access article distributed under the terms of the Creativeommons.org/licenses/by/2.0), which permits unrestricted use, distribution, andiginal work is properly cited.

Page 2: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

Ying et al. BMC Genomics 2013, 14:25 Page 2 of 13http://www.biomedcentral.com/1471-2164/14/25

in specific pathways and/or regulatory networks [9,11-14],as well as the theoretical and/or overall outlook in regardto global dynamic changes in transcriptome [15].However, the genetic contribution to bacterial tran-

scriptome has been rarely reported. Although the previ-ous studies revealed the potential links betweentranscriptional regulation and chromosome organization[16] and the relation between changes in DNA methyla-tion and chromosomal structure and changes in globalgene expression [3,5], whether and how the genomicscars (e.g., deletions) influence the global transcriptionalactivity is unclear.Both genomic and environmental perturbations could

initiate a global change in the transcriptome, becauseboth are stressful for living cells. Whether there is anypotential linkage between the two contributions is an in-triguing question. To address the question, a compre-hensive analysis comparing the effects of genomic andenvironmental interruptions on the transcriptome ishighly required. Such analysis would not only enable acomparison of the internal and external influences onthe plasticity of genome-wide transcriptional activity butwould also provide a valuable example of a global andconceptual assessment of high-throughput biologicaldata. Therefore, a multi-level survey of genomic expres-sion is highly in demand, to show how genomic distur-bance causes the global transcriptional reorganizationand whether there is any relationship between the con-tributions of genomic and environmental interruptionsto transcriptome.As a pilot study, we evaluated these two distinct en-

tities, the genome and the environment, at the level ofthe transcriptome. We addressed how genomic and en-vironmental perturbations contribute to the bacterialtranscriptome and whether there is any interaction be-tween the two influences. A multiple deletion E. colistrain, MDS42 [17], was compared with its wild-type par-ent strain MG1655 [18,19] in this study. Because of itslack of insertion sequences (ISs) [17,20], MDS42 has beenwidely used in a number of applications [21-25]; however,its detailed, genome-wide analysis has been rarely reported[21-25]. The genes used for the comparative transcrip-tome analysis were selected on the basis of the genomesequences of MDS42 and MG1655. The comparison ofMG1655 to MDS42 provided insights into genomicdisturbance-induced transcriptional reorganization. Theevaluation of the heat shock response (a transientresponse to an elevated temperature) provided insightsinto environmental perturbation-induced transcriptionalchanges. We examined whether the losses of insertionelements in genome interrupt the genome-wide transcrip-tional activity (i.e., the transcriptome) and whether thesemultiple deletions in genome and the heat shock stressexert common or different effects on transcriptional

changes. Furthermore, we performed multilevel compara-tive analyses of chromosomal periodicity, regulatory net-works, the localization of differential gene expression,transcriptional sensitivity in response to genomic andenvironmental perturbations and the epistatic effect, thatis, a negatively compensated relationship, on the tran-scriptome of the genome and the environment.

ResultsStrains and genes subjected to transcriptome analysisGene expression was analyzed using a custom, high-density DNA microarray (a single nucleotide tiling array)with a precise and wide quantitative range of mRNAconcentrations [26,27], as previously described for gen-ome resequencing [28] and transcriptome analysis [29].First, the wild-type strain MG1655 and the multiple de-letion strain MDS42 [17] were used to examine whetherand how removing the nonessential genomic regions inE. coli contributed to its transcriptome. Repeated experi-ments were performed with cells grown precisely tomid-exponential phase in minimal medium. The genesshared between the two strains were determined bycomparing their genome sequences (MDS42: DDBJ No.AP012306; MG1655: GenBank No. U00096). In total,locus tags (b numbers) were assigned to 4485 and 3778open reading frames (ORFs) in MG1655 and MDS42, re-spectively (Figure 1A), which was consistent to the ori-ginal report [17]. Genes with repeated locus tags wereremoved from the analysis, for a final total of 4428 geneswith locus tags in MG1655. Excluding the 696 deletedgenes [17] and 22 mutated genes in MG1655 (deter-mined by comparing the genome sequences, see theMaterials and Methods), a total of 3710 genes weredetermined to be common between the two strains.Subsequently, as the heat shock stress is known as oneof many external perturbations and showed distinguish-able expression profiles [9,11], heat shock experimentswere performed to evaluate how the external disturb-ance contributed to the transcriptome for comparativeanalyses. The average gene expression levels fromrepeated experiments were plotted in Figure 1B andused for further analysis. We showed that neithergenomic reduction nor heat shock altered the shape ofthe distribution of gene expression (Figure 1B).

Priority in chromosomal periodicityThe periodicity of genome-wide transcriptional activitywas studied to provide a global view of gene expressionprofiling [4,6,31,32]. The average transcription levels(i.e., mRNA concentrations) over the entire genome wereplotted for both genomes using a 100-bp sliding window(Figure 2). The periodic patterns showed a major peak ofapproximately 663 kb (Figure 2A) in MG1655, consistentwith previous reports [4,32]. However, these periodic

Page 3: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

-4

-2

2

0

MG1655 MG1655_heatshock

MDS42 MDS42_heatshock

ori

ter

A B

MG1655 (outer)MDS42 (inner)

Exp

ress

ion

leve

l[lo

g 10

(mR

NA

con

cent

ratio

n)(p

M)]

Figure 1 Overview of genomes and transcriptomes. A. Circle diagram of the MG1655 and MDS42 genomes. The genes present in MG1655and MDS42 are shown in the outer and inner rings, respectively, and visualized using Circos [30]. The gold boxes indicate the deleted segmentsthat were originally reported [17]. The point mutations in MG1655 are indicated with red tick marks. The origin and terminus of replication areindicated outside of the circle. B. Box plot of gene expression in MG1655 and MDS42. The average expression levels of 4428 and 3710 genes inMG1655 and MDS42, in exponential phase growth or under heat shock conditions, are shown in pink (MG1655), green (MG1655_heatshock), cyan(MDS42) and purple (MDS42_heatshock), respectively. The expression levels are represented by the log-scale mRNA concentrations. The dotsrepresent the genes.

Ying et al. BMC Genomics 2013, 14:25 Page 3 of 13http://www.biomedcentral.com/1471-2164/14/25

patterns were disturbed by heat shock. The most signifi-cant wavelength, as indicated by the red, vertical, brokenline in Figure 2 (left panels), shifted from 663 to 773 kb(to the right of the red line, a close-up view can be find inAdditional file 1: Figure S4). This slight but statisticallysignificant change in the major peak caused a reduction inthe number of periods from seven to six (Figure 2A–B,right panels). This finding indicated that the heat shocktreatment initiated a reorganization of the transcriptomethat gave a high priority to a periodicity of six periods.Interestingly, the reduced genome also showed a fixed

periodicity of six periods. MDS42 maintained the paren-tal periodogram, with a major peak at 663 kb that wasaccompanied by a decrease in the chromosomal periodnumber (Figure 2C). This reduced period was unchangedin MDS42 in response to heat shock perturbation(Figure 2D). In addition, when the expression of the 3710common genes in MG1655 was plotted against theMDS42 genome, an identical chromosomal periodicity ofsix periods was observed independently of external condi-tions (Figure 2E and F). Statistical analyses confirmed thatthe altered periodicity was highly specific for the MDS42genome and was not due to random deletions in thegenome (P < 0.001). The stable six periods may be havebeen caused by the presence of six macrodomains in thechromosomal structure, as determined by recombinationactivity [33]. The results suggested that the absent geneticregions in MDS42 (Figure 1A, gold boxes) may contain aconsiderable number of genes that are not or rarelytranscribed under heat shock conditions, potentially con-tributing to an additional period. Thus, the multiple

deletions not only reduced the genome size but alsoprovided, probably by chance, a steady periodicity to theglobal transcriptional activity of the E. coli chromosome.

Overlaps in perturbed regulatory networksRegulatory network maps comprising the transcriptionfactors (TFs, or regulators) that control more than 15genes (42 of 175 regulatory networks) were constructedas shown in Figure 3. Significant changes in gene expres-sion are represented as gradations according to theirFDR values, which were determined using the rankproduct method [34,35]. Overall, both genome reductionand heat shock exerted broad effects on transcriptionalreorganization: the expression levels of a number ofgenes were changed (Figure 3). A small number of regu-lators showed remarkable, independent transcriptionalchanges; for example, gadX was induced only by genomereduction (A) and rpoD only by heat shock (B). It indi-cated that genome structural changes might activate theprimary activator of the gad system but fairly influencedthe global transcriptional factor (sigma factor). In addition,a comparable number of regulators were disturbed byboth perturbations in a correlated manner: those inducedby genome reduction were also induced by heat shock.The changes in these overlapping perturbed regulatorsoccurred in either the same or the reverse direction;e.g., the expression level of phoB was upregulated by bothgenome reduction and heat shock (orange in both A andB), whereas, the expression level of gadE was upregulatedby genome reduction but repressed by heat shock (orangein A, purple in B).

Page 4: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

10 0 10 1 10 2 10 3

0 1000 2000 3000 4000

Period (kb)

Spe

ctra

lpow

er

Genomic position (kb)

Average

expre ssio n

Deleted segments

A

C

E

B

D

F

0 1000 2000 3000 4000

0

50

100

150

200

0

50

100

150

200

0

50

100

150

200

0

50

100

150

200

0

50

100

150

200

0

50

100

150

200

[log10

(mR

NA

concentration)(pM)]

-1.0

-0.5

0.0

0.5

-1.5

-1.0

-0.5

0.0

0.5

-1.5

-1.0

-0.5

0.0

0.5

-1.5

-1.0

-0.5

0.0

0.5

-1.5

-1.0

-0.5

0.0

0.5

-1.5

-1.0

-0.5

0.0

-1.5

10 0 10 1 10 2 10 3

Figure 2 Periodicity of transcriptional activity. The periodograms of the average transcriptional levels of MG1655 at 37°C (A), MG1655following heat shock (B), MDS42 at 37°C (C), and MDS42 following heat shock (D), with the highest (major) peaks at 663, 773, 663, and 663 kb,respectively. The vertical, broken line in red indicates the major peak at 663 kb. The significant periodicities at the 95% confidence level areindicated by blue dashed lines. The expression levels of 4428 and 3710 genes were plotted for MG1655 and MDS42, respectively. In addition, theperiodicities of the expression of the 3710 common genes in MG1655 at 37°C (E) and following heat shock (F) are plotted against the MDS42genome. The average transcriptional levels of the genes for every 100-bp sliding window are shown as black lines, and the correspondingperiodicities are shown as red curves. The location of the origin of replication (ori) is indicated by the black broken line, and the deletedsegments from MG1655 are indicated by black bars. The average expression represents the log-scale mRNA concentration. The spectral powercalculated with Fourier transform indicates the strength of the periodicity of the average expression along the genome.

Ying et al. BMC Genomics 2013, 14:25 Page 4 of 13http://www.biomedcentral.com/1471-2164/14/25

To identify these regulators (regulatory networks) up-stream of genes with marked changes in transcriptioninduced by either genome reduction or heat shock, agene set enrichment analysis (GSEA) was applied [36].Among the 175 regulatory networks controlled by 5sigma factors and 170 TFs, 24 and 12 networks showedclear transcriptional changes (FDR < 0.25) associatedwith genome reduction and heat shock, respectively(Table 1; details in Additional file 2: Table S1). Heatshock interrupted approximately equal numbers of theinduced and repressed regulatory networks (7 vs. 5),

whereas genome reduction triggered many more upre-gulated than downregulated networks (22 vs. 2). More-over, most of the regulatory networks that responded toheat shock were highly sensitive to genome reduction aswell (9 out of 12). These overlapping regulatory net-works (e.g., rpoH and purR) showed a marked tendencyto undergo transcriptional changes in the same direction(i.e., increased or decreased) in response to genomic andheat shock perturbations. The exception (with changesin opposite directions) was found in the network con-trolled by gadX, where the downstream genes were

Page 5: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

A B

UpHigh HighLow

Down

log10(FDR-value)-3 -30

Significance

Figure 3 Transcriptional changes in regulatory networks. The information regarding the regulatory networks is from RegulonDB. The genesunder the control of the same transcription factor (TF) or sigma factor were assigned to the same group. The nodes and arcs represent genesand regulation, respectively. The large nodes denote sigma factors or regulators that control more than 15 genes. The genes with conservedexpression are shown in gray, and those with notable transcriptional changes are shown in color. The increased and decreased expression levelsresulting from genome reduction (A) or heat shock (B) are quantitatively marked as gradations, as indicated in the scale bar. Vivid orange indicatehighly significant increases in expression, and dark purple represent highly significant decreases in expression.

Ying et al. BMC Genomics 2013, 14:25 Page 5 of 13http://www.biomedcentral.com/1471-2164/14/25

induced by genome reduction but suppressed by heatshock (FDR < 0.02).Taken together, the transcriptional reorganization

observed in the regulatory networks was not only second-ary to the expression changes in the regulators (e.g., rpoH)but also occurred in networks controlled by the regulatorsof steady expression (e.g., purR). The latter case suggestedeither weakened transcriptional control by the regulatorsor a high level of sensitivity in the downstream genes(i.e., slight transcriptional changes in the regulatorsresulted in profound transcriptional changes in the down-stream genes). The discovery of regulatory networks (and/or regulators) that were universally involved in both per-turbations indicated that transcriptional reorganizationresponds similarly to both endogenous and exogenousdisruptions.

Genome location dependency and gene categoryenrichment of highly responsive genesGenes with significant transcriptional changes (differen-tially expressed genes, DEGs) among the 3710 common

genes were also determined with the rank productmethod. In all, 159 and 95 genes were identified thatshowed either up- or downregulated expression in re-sponse to genome reduction (DEGs_gr) or heat shock(DEGs_hs), respectively (FDR < 0.001). The gene names,gene categories, locus tags, and directions of the transcrip-tional changes are summarized in Additional file 3: TableS2, and the genome locations are marked in Additionalfile 1: Figure S1. An analysis of the distances between theDEGs and the genome scars caused by genome reduc-tion showed that the DEGs_gr, but not the DEGs_hs,were located much closer to the nearest genome scarthan were the genes with conserved expression (nonDEGs)(P < 0.001) (Figure 4A and B). This tendency indicatedthat the genomic disturbance caused by genome reductionbrought about considerable transcriptional changes inneighboring genes. In particular, the transcriptionalchanges were highly significant in the direction of upregu-lation (Figure 4A, orange), which suggests that theremoval of the dispensable genetic segment might in-crease the expression efficiency of the neighboring genes.

Page 6: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

Table 1 Regulatory networks with significantperturbations

Perturbation Direction Locustag

Nameof TF

Size FDRq-val

Genome reduction Increased b1951 rcsA 26 0.0000

b2217 rcsB 29 0.0000

b3938 metJ 15 0.0000

b3516 gadX 25 0.0000

b3512 gadE 49 0.0000

b0889 lrp 83 0.0038

b3461 rpoH 150 0.0065

b0761 modE 45 0.0061

b1237 hns 136 0.0145

b3261 fis 216 0.0513

b2193 narP 49 0.0870

b1221 narL 113 0.1043

b3912 cpxR 51 0.0972

b2531 iscR 26 0.1580

b0399 phoB 36 0.1572

b1712 ihfA 191 0.1477

b2741 rpoS 215 0.1448

b0912 ihfB 191 0.1398

b0076 leuO 20 0.1703

b1130 phoP 48 0.1627

b1334 fnr 272 0.1687

b1531 marA 36 0.2127

Decreased b1658 purR 31 0.0071

b0683 fur 81 0.0201

Heat shock Increased b3461 rpoH 150 0.0000

b3912 cpxR 51 0.0004

b3938 metJ 15 0.0026

b2193 narP 49 0.0837

b0399 phoB 36 0.1384

b0761 modE 45 0.2134

b0076 leuO 20 0.2345

Decreased b1658 purR 31 0.0000

b1275 cysB 19 0.0000

b0683 fur 81 0.0000

b3516 gadX 25 0.0123

b3868 glnG 44 0.0118

The information pertaining to transcription factors (TFs), sigma factors and thedownstream genes under their control is from RegulonDB. The transcriptionalchanges in the downstream genes were evaluated with gene set enrichmentanalysis (GSEA) [36], and the disturbed regulatory networks were determinedaccordingly (FDR < 0.25). Perturbation and Direction refer to the type ofdisturbance to the transcriptome and the directionality of the changes in geneexpression, respectively. Locus tag, Name of TF, Size and FDR q-val representthe gene ID, the corresponding regulator gene, the number of regulateddownstream genes and the statistical significance, respectively.

Ying et al. BMC Genomics 2013, 14:25 Page 6 of 13http://www.biomedcentral.com/1471-2164/14/25

We assumed that the deleted genes most likely perturbedthe local superhelical density and thereby affected thetranscriptional activity of the neighboring genes, althoughthe functions of the deleted genes near the deletion junc-tions or mutations remain to be further examined.The gene categories that were enriched in the DEGs

were analyzed to address the question of whether thegenomic and heat shock perturbations had affected anyspecific gene categories (details in Additional file 3:Table S2 and Additional file 4: Table S3). The heat map(Figure 4C) illustrates that the heat shock affected only alimited number of gene categories (i.e., Unknown func-tion and Factor) with high significance (P < 0.001). How-ever, the genome reduction influenced a relatively largenumber of gene categories with diverse directions in tran-scriptional changes; for example, the transporter genes (t)and the RNA-coding genes (n) were highly enriched inthe DEGs_gr in the directions of down- (P < 0.001) andupregulation (P < 0.002), respectively. Only the “Unknownfunction” (o) category was significantly enriched in boththe DEGs_gr and the DEGs_hs (All, P < 0.05). This resultsuggested that the genes with unknown functions werehighly sensitive to both genomic and environmental per-turbations, as might result from a loose control of geneexpression. The genes in the Factor category (f), of whichmost are heat shock proteins (chaperones), were not onlyenriched in the upregulated DEGs_hs (Up, P < 0.001), asextensively reported, but were also detected in the upregu-lated DEGs_gr (Up, P < 0.05). As a result, Factor (f) wasthe most highly enriched category among the 31 overlap-ping genes (P < 0.001). Intriguingly, some well-knownmolecular chaperones, such as clpB and ibpA, were highlyresponsive to both perturbations (Additional file 3: TableS2), indicating a universal response mechanism to internaland external stressors in these genes.

Positive correlations in changes of gene expressionThe directions of the transcriptional changes demon-strated by the DEGs were further analyzed, in particular,on the gene expression patterns by means of correla-tions. In a global view, the DEGs_gr (Figure 5A, upper)also showed remarkable transcriptional changes in re-sponse to heat shock (Figure 5B, upper), as evidenced bythe significantly lower correlation of the DEGs_gr(0.672) compared with that (0.876) of the 3710 commongenes (P < 0.001, Additional file 1: Figure S2). Similarly,the DEGs_hs showed large transcriptional changes in re-sponse to genomic perturbation (Figure 5B, bottom);their correlation (0.893) was significantly lower than that(0.968) of the 3710 common genes (P < 0.001, Additionalfile 1: Figure S2). Furthermore, the upregulated DEGs_hs(Figure 5A, yellow) were also induced by genome reduc-tion (Figure 5B, yellow), and the upregulated DEGs_gr(Figure 5A, orange) were also induced by heat shock

Page 7: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

DEGs_hs DEGs_gr Over-laps(31)

All(95)

Up(63)

Down(31)

All(159)

Up (83)

Down(76)

osu

sps

fpfh

phr

prdm

pme

pet

ptc

pcnl

cplp

log10 (P

va lue)

High

Low

A

B

C

50 150 250 350Distance from the nearest scar

(kb)

Down-regulated

Up -regulated

DEGs _hs

nonEGs

50 150 250 350Distance from the nearest scar

(kb)

Down-regulated

Up-regulated

DEGs _gr

nonDEGs

0.023

2.2×10-16

3.6×10-4

0.23

0.50

0.34

Figure 4 Differentially expressed genes (DEGs). A-B. Distance between genes and genomic scars. The distance from each gene to the nearestgenomic scar resulting from genome reduction is plotted. The genes with conserved (nonDEGs), increased, or reduced expression caused bygenome reduction are shown in gray, orange, and purple, respectively (A). The genes with conserved (nonDEGs), upregulated, or downregulatedexpression due to heat shock are shown in gray, yellow and cyan, respectively (B). All the genes of the DEGs_gr and DEGs_hs are shown in redand brown, respectively. Significance was determined with the Wilcoxon-Mann–Whitney test, and the P-values are indicated. C. Enriched genecategories. The DEGs were divided into 23 gene categories, according to a previous report [37]. One-tailed binominal tests were performed todetermine the significance of the enrichment of each category. The log-scale P-values are displayed as a heat map with gradations of blue; darkerblue indicates higher significance.

Ying et al. BMC Genomics 2013, 14:25 Page 7 of 13http://www.biomedcentral.com/1471-2164/14/25

(Figure 5B, orange). These correlated patterns of changesin gene expression deduced from the overall evaluationof the DEGs (Figure 5A and B) implied that the overlap-ping genes (Figure 4C) might play a crucial role inproducing such transcriptional fluctuation because notall the DEGs changed significantly when moving to theother condition. Among the 31 overlapping genes(Figure 4C, Additional file 4: Table S3), 20 and 10 geneswith up- or downregulated expression in response togenome reduction were up- or downregulated by heatshock in the same direction, respectively. Only a singlegene (yafF, a conserved protein) reversed the directionof its transcription; i.e., it was induced by heat shock butrepressed by genome reduction. The overall directionalsimilarities implied that the genes that were sensitive togenomic disturbance tended to be highly responsive toexternal perturbation and vice versa.Furthermore, the transcriptional changes of the 3710

genes that were induced by genome reduction were plot-ted against those that were induced by heat shock(Figure 5C). A clear, positive correlation between the tran-scriptional changes produced by the genetic and environ-mental contributions was observed (P < 0.001, Additionalfile 1: Figure S2). In other words, larger degrees of

transcriptional fluctuation triggered by genetic interrup-tion were correlated with greater changes in gene expres-sion induced by environmental perturbation. The positivecorrelation between these two contributions to the tran-scriptome implies somehow a linkage in transcriptionalreorganization in response to both genetic and environ-mental stresses.

Antagonistic epistasis in the transcriptomeTheoretically, the positive correlation in the transcrip-tional changes triggered by the genetic and environmen-tal interruptions (Figure 5C) potentially had a tendencyto amplify the magnitude of the changes in gene expres-sion that occurred in response to both genetic andenvironmental perturbations. For instance, the gene ex-pression levels of dnaJ were approximately 2.6 and 30folds up-regulated due to genome reduction and heatshock, respectively. Owing to the positive correlation,the change in the expression of dnaJ was assumed to bemultiplied, up to 78 folds, by the two simultaneously oc-curred perturbations. Actually, it was a 21-fold increase,a highly suppressed value, in the transcriptional change.To understand such repressed change in gene expres-sion, the genetic concepts of epistasis and additivity were

Page 8: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

A B

Genetic contribution[MDS42 – MG1655]

Env

ironm

ent a

l con

t rib

utio

n[M

G16

55_h

eats

hock

–M

G1 6

55]

-4 -2 0 2-4

-2

0

2

4

4

0.548

C

0.968

MD

S42

-4

-2

0

2

MG1655

0.876

-4 -2 0 2

MG

1655

_hea

tsh

ock

-4

-2

0

2

0.672

MG

1655

_hea

tsh

ock

MG1655

0.893

MD

S42

-4

-2

0

2

-4

-2

0

2

-4 -2 0 2

Figure 5 Positive correlations between the genomic and environmental contributions. Firstly, scatter plot of the expression levels of 3710common genes in MDS42 and MG1655 is shown (A, top panel). A total of 159 genes with differential expression in MDS42 caused by genomereduction (DEGs_gr) are highlighted, including 83 with increased expression (in orange) and 76 with reduced expression (in purple), respectively.The expression levels of these 159 genes (DEGs_gr) in MG1655 under the regular and heat shock conditions (MG1655 vs. MG1655_heatshock) areadditionally compared (A, bottom panel). Secondly, scatter plot of the expression levels of 3710 common genes in MG1655 in the presence orabsence of heat shock treatment is shown (B, top panel). A total of 95 genes with differential expression due to heat shock (DEGs_hs) in MG1655are highlighted, including 64 upregulated (in yellow) and 31 downregulated (in cyan) genes, respectively. The expression levels of these 95 genes(DEGs_hs) in both MG1655 and MDS42 under regular conditions (MG1655 vs. MDS42) are also compared (B, bottom panel). Finally, positivecorrelations between the genetic and environmental contributions to the transcriptional changes of total 3710 common genes are shown (C).The genetic contribution refers to the changes in gene expression caused by genome reduction, i.e., the subtraction of the expression level inMG1655 from the expression level in MDS42. The environmental contribution refers to the changes in gene expression due to heat shock inMG1655, i.e., the subtraction of the expression level in MG1655 from the expression level in MG1655_heatshock. The expression level isdetermined as log10 (mRNA concentration), with the unit of pM. The Pearson’s correlation coefficients are indicated.

Ying et al. BMC Genomics 2013, 14:25 Page 8 of 13http://www.biomedcentral.com/1471-2164/14/25

employed, as represented schematically in Figure 6A.These concepts are widely used to determine whetherthe final fitness or output is simply the sum of theeffects of two genes (additivity) or whether one genepartially inhibits (negative epistasis) or enhances (posi-tive epistasis) the effects of the other gene. The extendeddefinition of epistasis could be applied to the interac-tions among the chemicals and molecules in highlycomplex biochemical reactions [38]. Here, the two ana-lyzed effects were the transcriptional changes contribu-ted by genome reduction (ΔTG) and heat shock (ΔTE)individually, and the final output was defined as thetranscriptional changes that were produced when thegenomic and heat shock perturbations occurred simul-taneously (ΔTF), i.e., the reduced genome undergoingheat shock (MDS42_heatshock). The slopes representthe transcriptional changes (ΔTF) perturbed by the gen-ome and environment in an additive or epistatic manner(Figure 6A, indicated by the solid black line and thebroken gray line). The difference in the slopes, definedas α, represented the relative magnitude of the difference

(repressed or increased) between the additive and simul-taneous contributions.Accordingly, the relationship between the contribu-

tions of genome reduction and heat shock to transcrip-tome was subsequently analyzed in a genomic view.Considering the difference in the number of genes inMG1655 and MDS42, the common genes (3710 genes)shared with both strains were applied in the followinganalyses. The transcriptional changes due to these twoindependent contributors (genome and environment, asshown in Figure 5C) were added, and the sum was con-sidered to be the additive contribution (ΔTG +ΔTE). Thechanges in gene expression between MDS42_heatshockand MG1655 represented simultaneous genetic and en-vironmental interruptions (simultaneous contribution,ΔTF). The additive contribution was then plotted againstthe simultaneous contribution, as shown in Figure 6B.The simultaneous contribution showed a smaller magni-tude of transcriptional changes than did the additivecontribution (α < 0) (Figure 6B, upper panel), approxi-mately 30% suppressive effect (α = −0.29, p < 0.001). It

Page 9: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

BA

Additive contribution (ΔTG+ΔT

E)

Sim

ult

aneo

us

con

trib

uti

on

(ΔT

F)

ΔTF = ΔTE + ΔTG (α = 0, additivity)ΔTF > ΔTE + ΔTG (α > 0, positive epistasis)ΔTF < ΔTE + ΔTG (α < 0, negative epistasis)

C0, MG1655Ce, MG1655_heatshockCg, MDS42Cf, MDS42_heatshock

-4 -2 0 2 4

-4

-2

0

2

4

α = -0.29

-4

-2

0

2

4

α = -0.39

Tra

nscr

ipti

onal

EnvironmentalC0 Ce

CgCf

ΔTG

ΔTE

α

ΔTF

Figure 6 Epistasis between genomic and environmental perturbations. A. Schematic illustration of epistasis and additivity. Theenvironmentally and genetically induced transcriptional changes are defined as ΔTE and ΔTG, respectively. When genetic and environmentalperturbations occurred together, the simultaneous change in expression is defined as ΔTF. The relationships among these three parameters weredescribed with the indicated equations, where α represents the magnitude of the epistatic effect. B. Antagonistic epistasis in the transcriptome.The sum of the transcriptional changes caused separately by genetic and environmental perturbations (the additive contribution, ΔTE +ΔTG) isplotted against the transcriptional changes caused simultaneously by both perturbations (the simultaneous contribution, ΔTF), i.e., the changes ingene expression between MDS42_heatshock and MG1655. The 3710 common genes and the genes showing significant fluctuations induced byeither genetic or environmental perturbation (both DEGs_gr and DEGs_hs) are plotted in the upper and bottom panels, respectively. The valuesof α were calculated as follows: α = 1 – k, where k is the slope of the dot plot fit by linear regression, as indicated by the solid black line.

Ying et al. BMC Genomics 2013, 14:25 Page 9 of 13http://www.biomedcentral.com/1471-2164/14/25

strongly indicated a negatively compensated relationshipbetween the genomic and environmental contributionsto the changes in the transcriptome, as so-called “antag-onistic epistasis”. Moreover, the magnitude of the com-pensatory effect became larger (α = −0.39) in the genegroups of the DEGs_gr and the DEGs_hs (Figure 6B,bottom panel), suggesting that the negative epistasis be-came more significant, approximately 40%. Both thegenetic and environmental perturbations alone triggeredfluctuations in the transcriptome, whereas either per-turbation repressed or limited the fluctuation caused bythe other when both occurred simultaneously. Theseresults indicated that negative epistasis may be univer-sally involved in biological processes, although theunderlying mechanisms and theoretical principles re-main to be determined.

DiscussionThis study demonstrated a considerable interaction be-tween the genetic and environmental contributions tothe changes in the transcriptome, as evidenced by theidentical preferred chromosomal periodicity, the direc-tional similarities in the transcriptional changes, theoverlaps between the perturbed regulatory networks, thepositive correlation in transcriptional changes inducedby the genetic and environmental contributions and the

negative epistasis. Thus, our multilevel comparative ana-lysis provided a global comparison of the effects ofgenomic and heat shock perturbations to the E. colitranscriptome. A more comprehensive evaluation basedon the additional experiments under cross multiple con-ditions and a cross validation using the fruitful reporteddata sets are essential to figure out the common featuresof the genomic and environmental contributions to thereorganization in transcriptome.The transcriptome analysis caught the genome reduc-

tion effect. The deleted regions were assumed to havecrucial functions, on the basis of our novel observationsof the reduced genome, along with the previous reportsof its advantages in various applications [17,21,23] butits limited evolvability [20]. As these regions were largelyoccupied by genes that were derived from the IS/phage,that encoded structural components, or that had un-known or predicted functions, it highlighted the poten-tial importance of these genes. We observed that thedeletions altered the transcriptional efficiency, the stabi-lity of transcriptional patterns and the specificity of tran-scriptional changes. The high average levels of geneexpression (Figure 1B) and the strong significance of theDEGs_gr upregulation (Figure 4A) indicated that the elim-ination of the redundant genome sequence may save mate-rials and energy, leading to increases in the transcriptional

Page 10: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

Ying et al. BMC Genomics 2013, 14:25 Page 10 of 13http://www.biomedcentral.com/1471-2164/14/25

efficiency of the remaining genes. The observation that thefixed chromosomal periodicity (Figure 2C and D) wasidentical to the number of chromosomal macrodomains[33] indicated that the global transcriptional pattern maybe stabilized by the complete removal of the flexible IS/phage regions. The slight but significant increase in thecell growth rate most likely resulted from this steadyexpression pattern, which potentially accelerated celldivision [39]. The significant enrichment of genes of un-known function in the DEGs_gr and DEGs_hs (Figure 4C,Additional file 4: Table S3) suggested that these genes areeasily disturbed. Additional analyses revealed an apparentbias in the genomic positions of the unknown functiongenes; these genes were preferentially located in areas nearthe deleted regions (Additional file 1: Figure S3), whichexplained the finding of genomic scar dependency in theDEGs_gr (Figure 4A). The genomic distribution of thefunctionally undefined genes reflected shared geneorganization strategies [40] that were most likely shapedby evolution. Taken together, the data suggest the pos-sibility of a relationship between the identities of thedeleted genes (in particular, the genes of unknownfunction), the chromosomal distribution of the deletedgenes, the global expression patterns (periodicity), and thecell growth (fitness).The genomic and environmental contributions to tran-

scriptome reorganization indicated a negative epistaticrelationship (Figure 6B), that is, a negatively compensa-tive relation of the transcriptional changes caused by thetwo contributors. Such relationship would inhibitthe transcriptional fluctuation that was amplified by thepositive correlation between the changes in gene expres-sion triggered by the genetic and environmental inter-ruptions (Figure 5C). Epistasis has largely been studiedwithin the framework of genetics, i.e., how a mutation inone gene masks the phenotypic effect of a mutation inanother gene [41,42]. The present study was the first tocompare the contributions of the genome and the envir-onment to the transcriptome and to include an analysisof the antagonistic (negative) epistasis involved in thereorganization of transcriptome. Because MDS42 wassynthetically constructed and is therefore not a productof evolution in the wild nature, the antagonistic epistaticrelationship between the genomic and environmentalcontributions suggested a universal innate principle ofcell biology. This principle could be described as theneed to restrict the consequences of two positively cor-related effects to physiologically tolerable limits. Themagnitude of the compensatory effect detected in theDEGs was larger than that in the 3710 common genes(Figure 6B) strongly supported this theory. The adagethat the sum is less than its parts (e.g., ΔTG + ΔTE <ΔTF)appears to be applicable to the ability of living organismsto survive in severe environments for millions of years

and to an earlier report of the robustness and/or plasti-city of living systems [43], although more data arerequired to support this hypothesis.

ConclusionThis study investigated the relationship between thegenetic and environmental contributions to the bacterialtranscriptome. Intriguingly, in addition to similaritiesand overlaps in periodicity, regulatory networks, DEGsand directionality, both positive correlations and nega-tive epistasis between the genomic and environmentalcontributions to the global transcriptional activity wereobserved. These results linked the genome and the envir-onment at the level of the transcriptome and revealed theepistatic nature of the genome-wide transcriptionalreorganization that occurs universally in living cells in re-sponse to both endogenous and exogenous disturbances.

Materials and methodsStrains and culture conditionsThe wild-type E. coli strain MG1655 was provided bythe National BioResource Project, National Institute ofGenetics, Shizuoka, Japan. The reduced-genome E. colistrain MDS42 was purchased from Scarab Genomics.The genome sequences of MG1655 and MDS42 wereretrieved from the information deposited in the Gen-Bank and DDBJ databanks and have accession IDs ofU00096 and AP012306, respectively. When comparingthe two deposited genome sequences, we found that 696genes presented in MG1655 were absent in MDS42(i.e., 696 deleted genes), consistent with the original report[17]. In addition, 22 genes showed mismatched sequencesbetween two genomes (i.e., 22 mutated genes). Thus, atotal of 3710 genes (if including oriC, then 3711 genes) ofperfect matched sequences were determined to be com-mon between the two strains. The bacterial cells were cul-tured in 5 mL of M63 minimal medium (62 mM K2HPO4,39 mM KH2PO4, 15 mM (NH4)2SO4, 2 μM FeSO4•7H2O,15 μM thiamine hydrochloride, 203 μM MgSO4•7H2Oand 22 mM glucose) at 37°C with shaking at 200 rpm in aBR-21FP air incubator (Taitec).

Cell culture conditions for the precise measurement ofmRNA levels and growth ratesThe cell cultures were established from a glycerol stockwith two preculture steps, and serial transfers were per-formed during the exponential phase. The dilution rateneeded to keep the final cell concentration at approxi-mately 1-2 × 108 cells/mL was calculated according to thegrowth rate observed during the previous day. The initialcell concentration (C0) was determined on the basis of theconcentration of the preculture used for transfer and thedilution rate. The final cell concentration (Ct) was mea-sured using a flow cytometric analysis (Canto II; Becton,

Page 11: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

Ying et al. BMC Genomics 2013, 14:25 Page 11 of 13http://www.biomedcentral.com/1471-2164/14/25

Dickinson and Company) of the remainder of the cell cul-tures that had been sampled for microarray analysis. Thegrowth rate was calculated according to the following for-mula: ln(Ct/C0)/t, where Ct, C0, and t represent the finaland initial cell concentrations (cells/mL) and the culturetime (h), respectively. This formula has been previouslyapplied in the quantitative evaluation on bacterial growthin exponential phase [28,29,44].

Heat shock experimentAs previous studies showed that heat shock response inE. coli was evidently induced within a few minutes(i.e., 5–10 minutes) after the temperature upshift from30–37°C to 42–50°C [9,45-47], the condition for heatshock experiment was determined as a 5-min incubationfollowing the temperature upshift from 37°C to 45°C [28].Exponentially growing cells at approximately 108 cells/mLwere rapidly transferred to an adjacent water bath in-cubator (Personal-11EX; Taitec) set at 45°C. Followinga 5-min incubation at 45°C, the cell culture was im-mediately poured into a cold phenol-ethanol solutionto prepare the samples for microarray analysis. Eachheat shock experiment was performed separately toenable the precise control of the timing of the heatshock to ensure that the mRNA levels accuratelyreflected the stress response.

Microarrays and data extractionA high-density DNA microarray covering the entire gen-ome of the E. coli W3110 strain was utilized [26,27] asdescribed elsewhere [28,29]. The sample preparation, themicroarray analysis with the Affymetrix GeneChip sys-tem, and the data extraction based on the finitehybridization (FH) model [26] were performed as previ-ously described [29]. To avoid any significant errorsresulting from the diverse signal intensities of the Gene-Chips (i.e., to minimize experimental errors), only arrayresults with highly similar distributions of probe fluores-cence intensities were included in the subsequent ex-pression analysis. The results of 20 arrays for the twostrains under the two conditions were used (7 and 3replicas of exponential growth under 37°C and 45°C heatshock conditions, respectively). The raw data from these20 microarrays were deposited in the NCBI GeneExpression Omnibus database under the GEO Series ac-cession number GSE33212 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE33212). The gene namesare based on the genome information for W3110 andrepresent the genes in common between strains W3110and MG1655 [37]. The entire dataset of gene names andcategories is from GenoBase, Japan (http://ecoli.aist-nara.ac.jp/gb6/Download.html).

Computational analyses and graphicsThe transcription levels were determined as the log-scale mRNA concentrations (pM). The transcriptionalchanges (fluctuations) were calculated as the differencebetween the two transcription levels. Although severalmethods for the determination of chromosomal period-icity have been reported [4,32], a simple approach wasused here. Each expression value that had been deter-mined with the method described in the previoussection was projected onto the genome site that corre-sponded to the gene position, using 100-base bins. Next,the series of expression levels along the genome wassmoothed with a moving average of 500 bins (50 kb).The periodicity was calculated using a standard Fouriertransform, and the significance of the periodicity wasassessed with the chi-squared test. The approximate lineof the periodicity was calculated using the highest peak(statistic significance) of the periodogram and was fittedby minimizing the square error between the approximateline and the series of expression values. Gene set enrich-ment analysis (GSEA) was performed according to theoriginal report [36] using the available online tools(http://www.broadinstitute.org/gsea/index.jsp). The TFand sigma factor gene regulation datasets were fromRegulonDB. To compare the responses of MG1655 andMDS42, we limited the gene sets to those genes thatwere included among the 3710 common genes. The pre-ranked gene lists used as the input for GSEA comprisedgenes filtered by the absolute value of their expressiondifference between MG1655_hs and MG1655 (orMDS42 and MG1655). The regulatory networks werevisualized using Gephi software (http://gephi.org). Sub-sequently, the Bioconductor software package RankProd[34], which is based on the rank product method, wasemployed to identify the differential gene expressioncaused by genome reduction and the heat shock re-sponse. The rank product method is a nonparametricstatistical method derived from biological reasoning thatdetects items that are consistently ranked high or lowin a number of lists. An advantage of this method isits ability to identify biologically relevant expressionchanges among a relatively small number of samples[35]. Finally, the statistical analyses, with the exceptionof GSEA, were performed using R software [48] (http://www.r-project.org), and the array plot (heat map) of thegene categories was constructed using Mathematica 8(Wolfram Research).

Additional files

Additional file 1: Figure S1. Chromosomal locations of the DEGs.Figure S2. Statistical analysis. Figure S3. Genes of unknown function.Figure S4. A close-up view of the transition in major peaks.

Page 12: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

Ying et al. BMC Genomics 2013, 14:25 Page 12 of 13http://www.biomedcentral.com/1471-2164/14/25

Additional file 2: Table S1. Detailed results of the gene set enrichmentanalysis.

Additional file 3: Table S2. List of the DEGs.

Additional file 4: Table S3. Analysis of the enriched gene categories.

AbbreviationsE. coli: Escherichia coli; TF: Transcription factor; bp: Base pair; kb: Kilobase;gr: Genome reduction; hs: Heat shock; DEGs: Differentially expressed genes;FDR: False discovery rate; GSEA: Gene set enrichment analysis.

Competing interestsThe authors declare that there is no competing interest.

Authors’ contributionsBWY and TY conceived the research; BWY designed and performed theexperiments; BWY, SS and TY analyzed the data; FK, HM and TY provided theanalytic tools; and BWY, SS and TY wrote the manuscript. All the authorsread and approved the final manuscript for publication.

AcknowledgementsWe thank Junko Asada, Natsuko Yamawaki, Shingo Suzuki, Naoaki Ono andChikara Furusawa for technical assistance and analytical tools. We thank theanonymous reviewers for helping us to improve the clarity of themanuscript. BWY was supported by the project of Novel and Innovative R&Dmaking use of brain structures, from the Ministry of Internal Affairs andCommunications, Japan.

Author details1Graduate School of Information Science and Technology, Osaka University,1-5 Yamadaoka, Suita, Osaka 565-0871, Japan. 2Graduate School of FrontierBiosciences, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan.3Exploratory Research for Advanced Technology (ERATO), Japan Science andTechnology Agency (JST), Suita, Osaka 565-0871, Japan.

Received: 15 July 2012 Accepted: 29 December 2012Published: 16 January 2013

References1. Richmond CS, Glasner JD, Mau R, Jin H, Blattner FR: Genome-wide

expression profiling in Escherichia coli K-12. Nucleic Acids Res 1999,27:3821–3835.

2. Tao H, Bausch C, Richmond C, Blattner FR, Conway T: Functional genomics:expression analysis of Escherichia coli growing on minimal and richmedia. J Bacteriol 1999, 181:6425–6440.

3. Lobner-Olesen A, Marinus MG, Hansen FG: Role of SeqA and Dam inEscherichia coli gene expression: a global/microarray analysis. Proc NatlAcad Sci U S A 2003, 100:4672–4677.

4. Allen TE, Herrgard MJ, Liu M, Qiu Y, Glasner JD, Blattner FR, Palsson BO:Genome-scale analysis of the uses of the Escherichia coli genome:model-driven analysis of heterogeneous data sets. J Bacteriol 2003,185:6392–6399.

5. Peter BJ, Arsuaga J, Breier AM, Khodursky AB, Brown PO, Cozzarelli NR:Genomic transcriptional response to loss of chromosomal supercoilingin Escherichia coli. Genome Biol 2004, 5:R87.

6. Jeong KS, Ahn J, Khodursky AB: Spatial patterns of transcriptional activityin the chromosome of Escherichia coli. Genome Biol 2004, 5:R86.

7. Passalacqua KD, Varadarajan A, Ondov BD, Okou DT, Zwick ME, BergmanNH: Structure and complexity of a bacterial transcriptome. J Bacteriol2009, 191:3203–3211.

8. Faith JJ, Driscoll ME, Fusaro VA, Cosgrove EJ, Hayete B, Juhn FS, SchneiderSJ, Gardner TS: Many Microbe Microarrays Database: uniformlynormalized Affymetrix compendia with structured experimentalmetadata. Nucleic Acids Res 2008, 36:D866–D870.

9. Jozefczuk S, Klie S, Catchpole G, Szymanski J, Cuadros-Inostroza A,Steinhauser D, Selbig J, Willmitzer L: Metabolomic and transcriptomicstress response of Escherichia coli. Mol Syst Biol 2010, 6:364.

10. Nicolas P, Mader U, Dervyn E, Rochat T, Leduc A, Pigeonneau N, Bidnenko E,Marchadier E, Hoebeke M, Aymerich S, et al: Condition-dependenttranscriptome reveals high-level regulatory architecture in Bacillussubtilis. Science 2012, 335:1103–1106.

11. Gunasekera TS, Csonka LN, Paliy O: Genome-wide transcriptionalresponses of Escherichia coli K-12 to continuous osmotic and heatstresses. J Bacteriol 2008, 190:3712–3720.

12. Traxler MF, Summers SM, Nguyen HT, Zacharia VM, Hightower GA, Smith JT,Conway T: The global, ppGpp-mediated stringent response to aminoacid starvation in Escherichia coli. Mol Microbiol 2008, 68:1128–1148.

13. Durfee T, Hansen AM, Zhi H, Blattner FR, Jin DJ: Transcription profiling ofthe stringent response in Escherichia coli. J Bacteriol 2008, 190:1084–1096.

14. Rutherford BJ, Dahl RH, Price RE, Szmidt HL, Benke PI, Mukhopadhyay A,Keasling JD: Functional genomic study of exogenous n-butanol stress inEscherichia coli. Appl Environ Microbiol 2010, 76:1935–1945.

15. Guell M, Yus E, Lluch-Senar M, Serrano L: Bacterial transcriptomics: what isbeyond the RNA horiz-ome? Nat Rev Microbiol 2011, 9:658–669.

16. Hershberg R, Yeger-Lotem E, Margalit H: Chromosomal organization is shapedby the transcription regulatory network. Trends Genet 2005, 21:138–142.

17. Posfai G, Plunkett G 3rd, Feher T, Frisch D, Keil GM, Umenhoffer K,Kolisnychenko V, Stahl B, Sharma SS, de Arruda M, et al: Emergent propertiesof reduced-genome Escherichia coli. Science 2006, 312:1044–1046.

18. Blattner FR, Plunkett G 3rd, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, et al: The complete genomesequence of Escherichia coli K-12. Science 1997, 277:1453–1462.

19. Gerdes SY, Scholle MD, Campbell JW, Balazsi G, Ravasz E, Daugherty MD,Somera AL, Kyrpides NC, Anderson I, Gelfand MS, et al: Experimentaldetermination and system level analysis of essential genes in Escherichiacoli MG1655. J Bacteriol 2003, 185:5673–5684.

20. Umenhoffer K, Feher T, Baliko G, Ayaydin F, Posfai J, Blattner FR, Posfai G:Reduced evolvability of Escherichia coli MDS42, an IS-less cellular chassisfor molecular and synthetic biology applications. Microb Cell Fact 2010, 9:38.

21. Lee JH, Sung BH, Kim MS, Blattner FR, Yoon BH, Kim JH, Kim SC: Metabolicengineering of a reduced-genome strain of Escherichia coli for L-threonine production. Microb Cell Fact 2009, 8:2.

22. Chakiath CS, Esposito D: Improved recombinational stability of lentiviralexpression vectors using reduced-genome Escherichia coli. Biotechniques2007, 43:466, 468, 470.

23. Sharma SS, Blattner FR, Harcum SW: Recombinant protein production inan Escherichia coli reduced genome strain. Metab Eng 2007, 9:133–141.

24. Sharma SS, Campbell JW, Frisch D, Blattner FR, Harcum SW: Expression of tworecombinant chloramphenicol acetyltransferase variants in highly reducedgenome Escherichia coli strains. Biotechnol Bioeng 2007, 98:1056–1070.

25. Chae HS, Kim KH, Kim SC, Lee PC: Strain-dependent carotenoidproductions in metabolically engineered Escherichia coli. Appl BiochemBiotechnol 2010, 162:2333–2344.

26. Ono N, Suzuki S, Furusawa C, Agata T, Kashiwagi A, Shimizu H, Yomo T: Animproved physico-chemical model of hybridization on high-densityoligonucleotide microarrays. Bioinformatics 2008, 24:1278–1285.

27. Suzuki S, Ono N, Furusawa C, Kashiwagi A, Yomo T: Experimentaloptimization of probe length to increase the sequence specificity ofhigh-density oligonucleotide microarrays. BMC Genomics 2007, 8:373.

28. Kishimoto T, Iijima L, Tatsumi M, Ono N, Oyake A, Hashimoto T, Matsuo M,Okubo M, Suzuki S, Mori K, et al: Transition from positive to neutral inmutation fixation along with continuing rising fitness in thermaladaptive evolution. PLoS Genet 2010, 6:e1001164.

29. Tsuru S, Yasuda N, Murakami Y, Ushioda J, Kashiwagi A, Suzuki S, Mori K,Ying BW, Yomo T: Adaptation by stochastic switching of a monostablegenetic circuit in Escherichia coli. Mol Syst Biol 2011, 7:493.

30. Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ,Marra MA: Circos: an information aesthetic for comparative genomics.Genome Res 2009, 19:1639–1645.

31. Wright MA, Kharchenko P, Church GM, Segre D: Chromosomal periodicityof evolutionarily conserved gene pairs. Proc Natl Acad Sci U S A 2007,104:10559–10564.

32. Mathelier A, Carbone A: Chromosomal periodicity and positionalnetworks of genes in Escherichia coli. Mol Syst Biol 2010, 6:366.

33. Valens M, Penaud S, Rossignol M, Cornet F, Boccard F: Macrodomainorganization of the Escherichia coli chromosome. EMBO J 2004, 23:4330–4341.

34. Hong F, Breitling R, McEntee CW, Wittner BS, Nemhauser JL, Chory J:RankProd: a bioconductor package for detecting differentially expressedgenes in meta-analysis. Bioinformatics 2006, 22:2825–2827.

35. Breitling R, Armengaud P, Amtmann A, Herzyk P: Rank products: a simple,yet powerful, new method to detect differentially regulated genes inreplicated microarray experiments. FEBS Lett 2004, 573:83–92.

Page 13: RESEARCH ARTICLE Open Access Multilevel comparative ...the distribution of gene expression (Figure 1B). Priority in chromosomal periodicity The periodicity of genome-wide transcriptional

Ying et al. BMC Genomics 2013, 14:25 Page 13 of 13http://www.biomedcentral.com/1471-2164/14/25

36. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA,Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP: Gene setenrichment analysis: a knowledge-based approach for interpretinggenome-wide expression profiles. Proc Natl Acad Sci U S A 2005,102:15545–15550.

37. Riley M, Abe T, Arnaud MB, Berlyn MK, Blattner FR, Chaudhuri RR, GlasnerJD, Horiuchi T, Keseler IM, Kosuge T, et al: Escherichia coli K-12: acooperatively developed annotation snapshot–2005. Nucleic Acids Res2006, 34:1–9.

38. Matsuura T, Kazuta Y, Aita T, Adachi J, Yomo T: Quantifying epistaticinteractions among the components constituing the protein transclationsystem. Mol Syst Biol 2009, 5:e297.

39. Viollier PH, Shapiro L: Spatial complexity of mechanisms controlling abacterial cell cycle. Curr Opin Microbiol 2004, 7:572–578.

40. Lawrence JG: Shared strategies in gene organization among prokaryotesand eukaryotes. Cell 2002, 110:407–413.

41. Van Driessche N, Demsar J, Booth EO, Hill P, Juvan P, Zupan B, Kuspa A,Shaulsky G: Epistasis analysis with global transcriptional phenotypes. NatGenet 2005, 37:471–477.

42. Sanjuan R, Elena SF: Epistasis correlates to genomic complexity. Proc NatlAcad Sci U S A 2006, 103:14402–14405.

43. Ishii N, Nakahigashi K, Baba T, Robert M, Soga T, Kanai A, Hirasawa T, NabaM, Hirai K, Hoque A, et al: Multiple high-throughput analyses monitor theresponse of E. coli to perturbations. Science 2007, 316:593–597.

44. Shimizu Y, Tsuru S, Ito Y, Ying BW, Yomo T: Stochastic switching inducedadaptation in a starved Escherichia coli population. PLoS One 2011,6:e23953.

45. Tilly K, Erickson J, Sharma S, Georgopoulos C: Heat shock regulatory generpoH mRNA level increases after heat shock in Escherichia coli. J Bacteriol1986, 168:1155–1158.

46. Nagai H, Yuzawa H, Kanemori M, Yura T: A distinct segment of the sigma32 polypeptide is involved in DnaK-mediated negative control of theheat shock response in Escherichia coli. Proc Natl Acad Sci U S A 1994,91:10280–10284.

47. Yura T, Nakahigashi K: Regulation of the heat-shock response. Curr OpinMicrobiol 1999, 2:153–158.

48. Ihaka R, Gentleman R: R: A language for data analysis and graphics.J Comput Graph Stat 1996, 5:299–314.

doi:10.1186/1471-2164-14-25Cite this article as: Ying et al.: Multilevel comparative analysis of thecontributions of genome reduction and heat shock to the Escherichiacoli transcriptome. BMC Genomics 2013 14:25.

Submit your next manuscript to BioMed Centraland take full advantage of:

• Convenient online submission

• Thorough peer review

• No space constraints or color figure charges

• Immediate publication on acceptance

• Inclusion in PubMed, CAS, Scopus and Google Scholar

• Research which is freely available for redistribution

Submit your manuscript at www.biomedcentral.com/submit