International Journal of ISSN No. 2394-7772
Biomathematics and Systems Biology Official Journal of Biomathematical Society of India
Volume 3, No. 1, Year 2016
Functional annotation and network analysis of predicted
miRNA targets in leguminous plant Glycine max
Nela Pragathi Sneha Anjana Rajendiran Archana Pan*
Received: 14 Jun 2017; Revised: 10 Aug 2017
Accepted: 17 Aug 2017
Abstract. MicroRNAs (miRNAs) are small non-coding RNA molecules, usually 18-24 nucleotides in
length. They bind to the 3′ untranslated region of mRNAs, causing translational repression or mRNA
degradation, and thereby regulate gene expression. Identifying miRNA targets (mRNAs) and inferring
their functions is important for understanding their role in different biological processes. In the
present study, using an in silico approach, we predicted and functionally characterized miRNA targets in
the leguminous plant Glycine max. A total of 1333 potential targets were predicted using target prediction tools.
Functional enrichment analysis using bioinformatics resources revealed that 732, 710 and 643 targets were
involved in molecular functions, biological processes and cellular components, respectively.
Pathway enrichment analysis using the Kyoto Encyclopedia of Genes and Genomes (KEGG) indicated that the
identified targets were associated with various vital metabolic pathways, including biosynthesis
of antibiotics, glycolysis, amino acid metabolism, and starch and sucrose metabolism. Furthermore, network
analysis indicates the targets' functional importance in the biological networks of the plant. The outcome of
the analysis would broaden the understanding of the role of G. max miRNAs in flower development,
root nodulation and response to biotic and abiotic stresses and can therefore act as a lead towards
experimental understanding of the targets' roles in the plant system.
Keywords: Glycine max, miRNA targets, Functional annotation, Network analysis
∗Corresponding author, email: [email protected], Telephone:+91-413-2654-584
Centre for Bioinformatics, Pondicherry University, Pondicherry - 605014, India
1. Introduction
MicroRNAs (miRNAs) are small, non-coding, endogenous RNA molecules about 18-24 nt long. The first miRNA, lin-4, was discovered in Caenorhabditis elegans in 1993 by Rosalind Lee and Rhonda Feinbaum in Victor Ambros’s laboratory, and let-7 followed in the same organism in 2000 (Ambros, 2008). The discovery triggered a revolution in non-coding RNA (ncRNA) research and led to the discovery of ncRNAs in other species, such as flies, mammals and plants. miRNAs were found to play a major role in both animals and plants by targeting mRNAs, thereby acting as important post-transcriptional regulators. They mostly target the 3' untranslated region (UTR) of specific mRNAs, whereas a few bind in the 5' UTR and some even bind within the coding region.
A large proportion of protein-coding genes appears to be regulated by miRNAs, suggesting that miRNAs have a critical role in a
variety of biological functions. Many miRNAs have been identified in different plant species, including Arabidopsis thaliana (Rajagopalan
et al., 2006), rice (Zhu et al., 2008), apple (Xia et al., 2012), maize (Zhang et al., 2009) and soybean (Song et al., 2011). Traditional
methods for detecting miRNAs include cloning, northern blotting, RT-PCR, and microarray techniques. In plants, miRNAs play a crucial
role in growth, response to biotic and abiotic stresses, basal defense and flowering. They also regulate various plant developmental
processes, such as leaf morphogenesis and polarity, floral differentiation, root initiation and development, vascular development and
the transition from vegetative to reproductive growth (Khraiwesh et al., 2012). Various abiotic stresses, such as cold,
drought and salt stress, have a high impact on cultivation and yield in leguminous plants (Araújo et al., 2015). Nitrogen fixation in
legume plants is also affected by several abiotic stress factors, including soil salinity, low or high temperature, acidity and
waterlogging (Serraj, 2003).
Glycine max, commonly known as soybean, is a leguminous plant of the family Fabaceae (Michelfelder, 2009). Legumes are important crop
plants utilized as food and for soil improvement. They play a critical role in the natural environment by accessing atmospheric N2 through
symbiosis with a group of soil bacteria, Rhizobia (Graham and Vance, 2003). They require minimal amounts of N fertilizer and also
improve soil quality. Soybeans are a rich source of protein and vegetable oil for human consumption globally. The crop's ability to adapt to a wide
range of soils and climatic conditions and to fix nitrogen makes it agronomically important. A typical soybean contains 40%
protein and 20% oil (by weight). This composition ranks the crop highest in terms of protein content among all food crops and second in
terms of oil content, after peanut (48%), among all food legumes. However, the growth, development and yield of soybean are greatly
affected by several abiotic stresses, such as flood, drought, salinity and the heavy metal cadmium (Hossain et al., 2013). It has been reported
that miRNAs also play crucial roles in the development of this crop. For example, miR160 (Nizampatnam et al., 2015) controls auxin and
cytokinin sensitivities and directs soybean nodule development.
miR393 has been reported to stimulate the defense against Phytophthora sojae infection (Caudle et al., 2016). Moreover, soybean miRNAs are
reported to be involved in responses to biotic and abiotic stresses (Kulcheski et al., 2011) and play a crucial role in the regulatory network of the soybean
flower (Kulcheski et al., 2016). To understand the biological role of miRNAs in G. max, identification and analysis of their targets is
of utmost importance. Thus, in the present study an attempt was made to identify and annotate G. max targets using an in silico approach, as
experimental methods are highly demanding in terms of resources and time. The study identified 1333 miRNA targets, which were
found to be associated with various vital metabolic pathways, including biosynthesis of antibiotics, glycolysis, amino acid metabolism, and
starch and sucrose metabolism. Furthermore, network analysis of the target genes showed that they are involved in
vital plant processes, such as root growth, seed germination and flower development. The predicted targets are mainly involved in
regulating responses to stresses such as water stress, cold stress and osmotic stress.
The outcome of the analysis would broaden the understanding of the role of G. max miRNAs in stress regulation and can be
experimentally verified. Further, the results can be considered for engineering G. max with stress resistance traits.
2. Materials and methods
2.1 Sequence database
A total of 638 mature miRNA sequences of G. max were downloaded from miRBase (release 21) (http://www.mirbase.org) (Kozomara and
Griffiths-Jones, 2014), a searchable database of published miRNA sequences and annotation. The downloaded mature miRNAs were used
as the query sequence set for the prediction of G. max targets. The G. max UniGene transcript library, DFCI Gene Index (GMGI) (version 16), was
downloaded for target identification.
2.2 Tools and Software
Two tools, namely psRNATarget and TargetFinder, were used for the prediction of miRNA targets. The mature miRNA sequences of G. max
were submitted as queries to the target prediction server psRNATarget (Dai and Zhao, 2011), a small RNA target analysis server for plants.
The server aligns the sequences by dynamic programming with a modified Smith-Waterman algorithm and applies the RNAup
algorithm (Lorenz et al., 2011) from the Vienna package to analyze target accessibility. TargetFinder (Fahlgren et al., 2007), a small RNA
target prediction tool that computationally predicts small RNA binding sites on target transcripts using a position-weighted
scoring matrix, was also used. As miRNAs usually target protein-coding sequences, the identified target mRNAs
were checked for their ability to code for a protein using the Basic Local Alignment Search Tool (BLASTX), the homologous sequence finding
tool from NCBI. Annotation of the identified targets was done using the BLAST2GO (Conesa et al., 2005) suite, software for
high-throughput, automatic functional annotation of DNA or protein sequences based on the Gene Ontology vocabulary. The Pathway Studio Plant
(web) (Nikitin et al., 2003) tool was used for network construction, which helps in understanding the role of proteins in regulatory
networks.
2.3 Prediction of Glycine max targets
The downloaded 638 G.max mature miRNA sequences were used as query to identify their respective target transcript sequences.
2.3.1 psRNATarget
The downloaded miRNA sequences were submitted as query to the psRNATarget server with G.max UniGene library as target database
using the option ‘user submitted small RNAs/ preloaded transcripts’. The program was run with default values for the following parameters:
maximum expectation as 3.0, length for complementarity scoring (HSP size) as 20, number of top target genes for each small RNA as 200,
target accessibility (maximum energy allowed to unpair the target site, UPE) as 25.0, flanking length around the target site for target
accessibility analysis as 17 bp upstream and 13 bp downstream, and range of central mismatch leading to translational inhibition as 9-11 nt.
2.3.2 TargetFinder
TargetFinder is software that runs in a Linux environment. Herein, the mature G. max miRNAs were used as queries and the
downloaded G. max UniGene transcript sequences were used as the database to be searched. Targets were identified by aligning the
input small RNA sequences against all transcripts, followed by site scoring using a position-weighted scoring matrix. It uses a
‘FASTA’ program for aligning the sequences, along with a penalty scoring scheme for mismatches, bulges and gaps.
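The idea of position-weighted site scoring can be illustrated with a short sketch. The penalty values used here are an assumption, since the text above does not state them: a mismatch or gap costs 1, a G:U wobble pair costs 0.5, and penalties are doubled at positions 2-13 counted from the 5' end of the small RNA. The function names are hypothetical, and the sketch handles only ungapped, equal-length alignments rather than the full FASTA-based search.

```python
# Watson-Crick partner for each RNA base.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
# G:U wobble pairs (small RNA base, target base).
WOBBLE = {("G", "U"), ("U", "G")}


def pairing_partner(mirna):
    """Return the perfectly paired target bases, position by position."""
    return "".join(COMPLEMENT[base] for base in mirna)


def site_score(mirna, partner):
    """Score an ungapped miRNA:target alignment; lower means better pairing.

    mirna is written 5'->3'; partner[i] is the target base facing mirna[i].
    A '-' in partner is treated like a mismatch.
    """
    if len(mirna) != len(partner):
        raise ValueError("aligned sequences must have equal length")
    score = 0.0
    for pos, (m, t) in enumerate(zip(mirna, partner), start=1):
        if COMPLEMENT.get(m) == t:
            penalty = 0.0          # Watson-Crick pair
        elif (m, t) in WOBBLE:
            penalty = 0.5          # G:U wobble
        else:
            penalty = 1.0          # mismatch or gap
        if 2 <= pos <= 13:
            penalty *= 2           # assumed doubling in the core region
        score += penalty
    return score


mirna = "UGACAGAAGAGAGUGAGCAC"  # example 20-nt small RNA sequence
perfect = pairing_partner(mirna)
```

With these assumed weights, a perfect site scores 0, while a single mismatch at position 3 scores 2 and the same mismatch at position 20 scores 1, which is what makes the matrix position-weighted.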
2.3.3 Elucidation of common protein-coding targets
The identified mRNA targets were manually curated, and the targets identified by both psRNATarget and TargetFinder were considered
‘common targets’, as combining the results of the two tools delivers high true positive coverage (Srivastava et al., 2014). These common targets
were then checked for protein-coding potential. This was accomplished by querying the predicted target mRNA sequences against the
non-redundant protein database using BLASTX with the parameters (i) expect threshold: 10, (ii) word size: 6 and (iii) matrix: BLOSUM62. Only those
sequences capable of coding for a protein were retained for further analysis and were termed ‘common potential targets’.
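Because both tools report targets by the same transcript accession, the curation described above amounts to two set intersections. A minimal sketch follows, assuming the accessions from each tool and the accessions with a BLASTX hit have already been collected. The function names are hypothetical, and TC999999 and TC888888 are made-up accessions used only to show non-shared predictions.

```python
def common_targets(psrnatarget_ids, targetfinder_ids):
    """Transcripts predicted as targets by both tools."""
    return set(psrnatarget_ids) & set(targetfinder_ids)


def common_potential_targets(common_ids, blastx_hit_ids):
    """Common targets that also have a BLASTX hit (protein-coding)."""
    return set(common_ids) & set(blastx_hit_ids)


# Toy input: three accessions from Table 2 plus two made-up ones.
psrnatarget_ids = {"TC423678", "TC437576", "TC434615", "TC999999"}
targetfinder_ids = {"TC423678", "TC434615", "TC888888"}
blastx_hit_ids = {"TC423678", "TC437576"}

common = common_targets(psrnatarget_ids, targetfinder_ids)
potential = common_potential_targets(common, blastx_hit_ids)
```

In this toy input the common targets are TC423678 and TC434615, and only TC423678 survives the protein-coding filter.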
2.4 Annotation and Network analysis
The ‘common potential targets’ were subjected to Gene Ontology (GO) annotation using the Blast2GO (Conesa et al., 2005) suite.
Blast2GO performs the annotation in three steps: BLAST search, mapping and annotation. The tool provides GO
annotation information about the query sequences by mapping against databases such as PIR, PSD, UniProt, Swiss-Prot, TrEMBL, RefSeq,
GenPept, PDB, GO and NCBI. Furthermore, KEGG pathway analysis was done to identify the pathways in which the targets are
involved, followed by InterPro domain analysis.
For the annotated G. max targets, network analysis was done using Pathway Studio Plant (Nikitin et al., 2003), a tool that helps to
understand the role of proteins in regulatory networks. It builds pathways and networks from relationships between biological molecules
and processes extracted from databases, and identifies relationships among proteins, diseases, small molecules and cell processes. The
annotated common potential targets belonging to G. max were submitted to
Pathway Studio Plant, and a network with cellular processes, small molecules and various plant diseases was constructed. The proteins
with high connectivity were considered to be important targets of G. max in the network. Their importance was then analyzed using
Cytoscape 3.4.0 (Christmas et al., 2005), software for biological network visualization and analysis. The selected targets were individually
analyzed by observing the network parameters upon node (target) deletion. The importance of each target was assessed based on the
change in critical network parameter values, such as clustering coefficient, network heterogeneity, network centralization and characteristic
path length.
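The node-deletion analysis can be sketched in plain Python on a toy undirected network. The definitions below are assumptions based on commonly used formulas and are not taken from the text: clustering coefficient as the mean local coefficient, characteristic path length as the mean shortest-path length over connected node pairs, heterogeneity as the coefficient of variation of node degree, and centralization as n/(n-2) * (max degree/(n-1) - density). The toy graph and all names are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


def remove_node(graph, node):
    """Return a copy of the graph without the node and its edges."""
    return {n: nbrs - {node} for n, nbrs in graph.items() if n != node}


def clustering_coefficient(graph):
    coeffs = []
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        nbr_list = list(nbrs)
        links = sum(1 for i, u in enumerate(nbr_list)
                    for v in nbr_list[i + 1:] if v in graph[u])
        coeffs.append(2 * links / (k * (k - 1)))
    return mean(coeffs) if coeffs else 0.0


def characteristic_path_length(graph):
    total = pairs = 0
    for src in graph:                      # breadth-first search from each node
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1             # reachable nodes other than src
    return total / pairs if pairs else 0.0


def heterogeneity(graph):
    degrees = [len(nbrs) for nbrs in graph.values()]
    avg = mean(degrees) if degrees else 0
    return pstdev(degrees) / avg if avg else 0.0


def centralization(graph):
    n = len(graph)
    if n < 3:
        return 0.0
    degrees = [len(nbrs) for nbrs in graph.values()]
    density = sum(degrees) / (n * (n - 1))
    return n / (n - 2) * (max(degrees) / (n - 1) - density)


def network_parameters(graph):
    return {
        "clustering": clustering_coefficient(graph),
        "path_length": characteristic_path_length(graph),
        "heterogeneity": heterogeneity(graph),
        "centralization": centralization(graph),
    }


# Toy network: one hub connected to four nodes, plus an A-B edge.
toy = {
    "HUB": {"A", "B", "C", "D"},
    "A": {"HUB", "B"},
    "B": {"HUB", "A"},
    "C": {"HUB"},
    "D": {"HUB"},
}
before = network_parameters(toy)
after = network_parameters(remove_node(toy, "HUB"))
```

Comparing `before` and `after` shows how strongly the network depends on the deleted node; a highly connected target is expected to shift these values more than a peripheral one.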
3. Results and Discussion
3.1 Glycine max target prediction
A total of 4225 mRNA targets were predicted for 610 mature miRNA sequences of G. max against G.max UniGene library (DFCI Gene
Index (GMGI)) by psRNATarget server. The parameters (i) maximum allowed unpaired energy (UPE), (ii) flanking length around the target
site for analysis of target accessibility, (iii) complementarity scoring length (HSPsize) and (iv) central mismatch range leading to translation
inhibition were set to default values. Results from psRNATarget revealed that the identified UniGene targets were either cleaved or
translationally inhibited by the miRNAs. A total of 4284 targets were predicted by the TargetFinder tool for the 610 mature miRNA sequences against
the G. max UniGene library (DFCI Gene Index (GMGI)).
A previous comparison study (Srivastava et al., 2014) reported that combining the targets obtained from both tools gives high
true positive coverage. Therefore, the miRNA target genes obtained from the psRNATarget server and the TargetFinder tool were manually curated
to identify common targets. Since both tools used the UniGene transcripts as the database, the target accession IDs were the same for the predicted
targets. This enabled easy extraction of the transcripts predicted as targets by both tools, which were termed ‘common targets’. A
total of 1333 transcript sequences were predicted as common targets by both tools. It was observed that a single G. max miRNA targets
multiple transcripts and that a single transcript (target) is associated with multiple G. max miRNAs. These results are in accordance with
earlier reports that miRNAs exhibit multiplicity, i.e., a single miRNA can target multiple transcripts and a single transcript
(target) can be acted upon by multiple miRNAs. On average, 12 targets are associated with a single miRNA (Table 1). Twenty-two
miRNAs target only a single transcript, whereas the other miRNAs target multiple transcripts. miR9752 was found to have the maximum
number of targets (83) of all the miRNAs. Table 2 shows that each target is associated with one or multiple miRNAs.
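The multiplicity summarized here (targets per miRNA and miRNAs per target) can be tallied from a list of miRNA-target pairs. The sketch below uses only four rows taken from Table 2, so its counts describe this small subset and not the full set of 1333 targets. The function and variable names are hypothetical.

```python
from collections import defaultdict

# (miRNA, target) pairs from four rows of Table 2.
pairs = [
    ("gma-miR1513a-5p", "TC434615"),
    ("gma-miR1513b", "TC434615"),
    ("gma-miR156b", "TC425591"),
    ("gma-miR156f", "TC425591"),
    ("gma-miR156f", "TC439563"),
    ("gma-miR1534", "TC457335"),
]


def build_maps(pairs):
    """Return (targets per miRNA, miRNAs per target) as dicts of sets."""
    targets_of = defaultdict(set)
    mirnas_of = defaultdict(set)
    for mirna, target in pairs:
        targets_of[mirna].add(target)
        mirnas_of[target].add(mirna)
    return dict(targets_of), dict(mirnas_of)


targets_of, mirnas_of = build_maps(pairs)

# Average number of targets per miRNA in this subset.
mean_targets = sum(len(t) for t in targets_of.values()) / len(targets_of)
# miRNAs with exactly one target, and the miRNA with the most targets.
single_target = sorted(m for m, t in targets_of.items() if len(t) == 1)
top_mirna = max(targets_of, key=lambda m: len(targets_of[m]))
```

Applied to the complete prediction output, the same tallies would give the per-miRNA target counts of Table 1 and the per-target miRNA lists of Table 2.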
3.2 Functional annotation of miRNA target genes
To understand the functional importance of the transcripts identified as miRNA targets, functional annotation of the target genes
was carried out. Out of the 1333 ‘common targets’, 1052 were found to be protein-coding genes through BLASTX analysis and
were considered the ‘common potential targets’. As miRNAs usually target protein-coding genes, these transcripts were taken forward
for functional annotation. Functional annotation of the targets was done using the BLAST2GO suite, which performs GO annotation and
pathway analysis. As a part of annotation, Blast2GO first performs a BLAST analysis, which indicated that the top BLAST hits belong to G. max.
The GO annotation revealed that the identified targets are involved in various molecular functions, cellular components and
biological processes. In the present study, it was observed that 732 targets are involved in molecular functions, such as catalytic
activity, ion binding and transferase activity; 710 targets participate in biological processes, including metabolic processes, signaling and
regulation of biological processes; and 643 targets are associated with cellular components, such as cell periphery, membrane and nucleus (Figures
1, 2 and 3). InterPro domain analysis revealed that 697 targets are associated with different InterPro IDs.
Pathway enrichment analysis revealed that a total of 114 targets are involved in metabolic networks. The pathways include antibiotic
biosynthesis, amino acid biosynthesis, nitrogen metabolism, and starch and sucrose metabolism (Table 3). The largest number of targets (24) was found to be
involved in biosynthesis of antibiotics. According to the literature, genes responsible for the synthesis of antibiotics are
useful in disease management in crop production. Antibiotics can contribute to microbial competitiveness and the suppression of plant root
pathogens. It is therefore assumed that the identified targets involved in antibiotic biosynthesis might play a role in the defense mechanism of the
crop by preventing colonization by root pathogens, as the crop depends mainly on its roots for the crucial symbiosis that fixes
atmospheric nitrogen.
3.3 Network analysis of Glycine max targets
To understand the interactions of the target genes with small molecules and diseases, we considered all the G. max targets and
constructed a network using Pathway Studio Plant (Nikitin et al., 2003). From the constructed network (Figure 4), it was observed that the
identified targets are involved in important plant processes such as seed germination, plant growth, root growth, flower development, cell
death, defense response and photosynthesis. Targets such as peroxidase (UNIREF100 ID- O23961), Heat Shock Protein 70 (HSP70:
UNIREF100 ID- P26413), mitogen-activated protein kinase (MAPK: UNIREF100 ID- Q5K6Q4) and Ribosomal protein S6 (RPS6:
UNIREF100 ID- Q6SPR3) were found to be highly connected in the network. For example, Figure 5 shows that peroxidase is involved
in many regulation relationships (grey dotted lines) in the network and is also linked to osmotic stress, hypoxia and drought treatments. Similarly, Figure 6
indicates that HSP70 has many incoming and outgoing connections in the network, denoting expression
(blue arrowed lines), regulation (grey dotted lines) and binding (purple lines) relationships, and is linked to abiotic stress treatment, water stress
treatment, infection treatment etc.
These targets were individually analyzed using the STRING (Szklarczyk et al., 2011) database. Node deletion was performed in each network,
and considerable changes in the network parameters were observed (Table 4). These targets also had a higher centroid value
than the average centroid value of their respective networks, which implies that they are involved in regulating specific cell activities
(Scardoni et al., 2014).
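The size of the changes reported in Table 4 can be made explicit by subtracting each "before" value from the corresponding "after" value. The sketch below copies the values from Table 4. The dictionary layout and function names are hypothetical, and the sketch only restates the table arithmetic without adding any new result.

```python
# (before, after) node deletion, values copied from Table 4.
TABLE4 = {
    "HSP70": {"clustering": (0.225, 0.0), "centralization": (0.521, 0.637),
              "path_length": (1.471, 2.103), "heterogeneity": (0.331, 1.338)},
    "MAPK": {"clustering": (0.593, 0.590), "centralization": (0.695, 0.421),
             "path_length": (1.629, 1.363), "heterogeneity": (0.730, 0.819)},
    "RPS6": {"clustering": (0.990, 0.989), "centralization": (0.011, 0.012),
             "path_length": (1.010, 1.011), "heterogeneity": (0.020, 0.021)},
    "Peroxidase": {"clustering": (0.866, 0.851), "centralization": (0.579, 0.058),
                   "path_length": (1.524, 1.111), "heterogeneity": (0.276, 0.153)},
}


def changes(table):
    """After-minus-before difference for every target and parameter."""
    return {
        target: {name: round(after - before, 3)
                 for name, (before, after) in params.items()}
        for target, params in table.items()
    }


def most_affected(deltas, parameter):
    """Target whose deletion changes the given parameter the most."""
    return max(deltas, key=lambda target: abs(deltas[target][parameter]))


deltas = changes(TABLE4)
```

Read this way, deleting HSP70 produces the largest shifts in clustering coefficient, characteristic path length and heterogeneity, deleting peroxidase produces the largest shift in centralization, and the RPS6 values change only in the third decimal place.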
4. Conclusions
In the present study, we predicted 1333 targets for the mature miRNA sequences, of which 1052 are protein-coding genes. A
total of 710 target genes were found to be involved in various biological processes, 643 in cellular components and 732
in molecular functions such as ATP binding, DNA binding, hydrolase activity and transferase activity. InterPro domain
analysis showed that 698 InterPro IDs are associated with the target genes, from databases including CATH-Gene3D, HAMAP, PANTHER,
Pfam, PRINTS, PIRSF, ProDom, PROSITE, SMART and SUPERFAMILY. KEGG analysis showed that the targets are involved in 97 pathways, which
include antibiotic biosynthesis and starch and sucrose metabolism. Network analysis revealed that four targets, viz. peroxidase, HSP70, MAPK
and RPS6, are highly connected in the network and are involved in plant disease treatments and vital plant processes. It is noted from the
literature that MAPK is a mediator of several abiotic stresses, such as water, salt, osmotic and cold stress (Cristina et al., 2010), while RPS6
contributes to nodule development (Um et al., 2013). In addition, HSP70 (Yu et al., 2015) was found to be a major contributor to
stress-related cellular processes, and peroxidase helps the plant cope with biotic stress conditions and pathogen attack (Passardi et
al., 2005). Thus, the suggested targets can be taken further for experimental verification of their roles as stress regulators in the plant, which
will benefit the scientific community.
List of abbreviations
miRNA - microRNA; G. max - Glycine max; UPE - UnPaired Energy; KEGG - Kyoto Encyclopedia of Genes and Genomes; UTR -
Untranslated region; NCBI - National Center for Biotechnology Information; GO - Gene Ontology; HSP70 - Heat Shock Protein 70; MAPK
- Mitogen-activated protein kinase; RPS6 - Ribosomal protein S6.
References
Ambros, V. (2008) The evolution of our thinking about microRNAs. Nat. Med. 14, 1036–1040. doi: 10.1038/nm1008-1036.
Araújo, S. S., Beebe, S., Crespi, M., Delbreil, B., González, E. M., Gruber, V., Lejeune-henaut, I., Link, W., Monteros, M. J., Rao, I.,
Vadez, V. and Patto, M. C. V. (2015) Abiotic stress responses in legumes: strategies used to cope with environmental challenges. CRC Crit.
Rev. Plant Sci. 2689, 37–41. doi: 10.1080/07352689.2014.898450.
Caudle, A. S., Yang, W. T., Mittendorf, E. A. and Kuerer, H. M. (2016) HHS Public Access. 150, 137–143. doi:
10.1001/jamasurg.2014.1086.
Christmas, R., Avila-Campillo, I., Bolouri, H., Schwikowski, B., Anderson, M., Kelley, R., Landys, N., Workman, C., Ideker, T., Cerami,
E., Sheridan, R., Bader, G.D. and Sander, C. (2005) Cytoscape: A software environment for integrated models of bimolecular interaction
networks. Am. Assoc. Cancer Res. Educ. Book 12-16. doi: 10.1101/gr.1239303.
Conesa, A., Götz, S., García-Gómez, J. M., Terol, J., Talón, M. and Robles, M. (2005) Blast2GO: A universal annotation and visualization
tool in functional genomics research. Application note. Bioinformatics 21, 3674–3676. doi: 10.1093/bioinformatics/bti610.
Cristina, M., Petersen, M. and Mundy, J. (2010) Mitogen-activated protein kinase signaling in plants. Annu. Rev. Plant Biol. 61, 621–649.
doi: 10.1146/annurev-arplant-042809-112252.
Dai, X. and Zhao, P. X. (2011) psRNATarget: A plant small RNA target analysis server. Nucleic Acids Res. 39 (SUPPL. 2), 1–5. doi:
10.1093/nar/gkr319.
Fahlgren, N., Howell, M. D., Kasschau, K. D., Chapman, E. J., Sullivan, C. M., Cumbie, J. S., Givan, S. A., Law, T. F., Grant, S. R., Dangl,
J. L. and Carrington, J. C. (2007) High-throughput sequencing of Arabidopsis microRNAs: Evidence for frequent birth and death of
MIRNA genes. PLoS One 2. doi: 10.1371/journal.pone.0000219.
Graham, P. H. and Vance, C. P. (2003) Legumes: Importance and Constraints to Greater Use. Plant Physiol. 131, 872–877. doi:
10.1104/pp.017004.872.
Hossain, Z., Khatoon, A. and Komatsu, S. (2013) Soybean proteomics for unraveling abiotic stress response mechanism. J. Proteome Res.
12, 4670–4684. doi: 10.1021/pr400604b.
Khraiwesh, B., Zhu, J.-K. and Zhu, J. (2012) Role of miRNAs and siRNAs in biotic and abiotic stress responses of plants. Biochim.
Biophys. Acta. 1819, 137–148. doi: 10.1016/j.bbagrm.2011.05.001.
Kozomara, A. and Griffiths-Jones, S. (2014) miRBase: Annotating high confidence microRNAs using deep sequencing data. Nucleic Acids
Res. 42 (D1), 68–73. doi: 10.1093/nar/gkt1181.
Kulcheski, F. R., de Oliveira, L. F., Molina, L. G., Almerão, M. P., Rodrigues, F. A., Marcolino, J., Barbosa, J. F., Stolf-Moreira, R.,
Nepomuceno, A. L., Marcelino-Guimarães, F. C., Abdelnoor, R. V., Nascimento, L. C., Carazzolle, M. F., Pereira, G. A. and Margis, R.
(2011) Identification of novel soybean microRNAs involved in abiotic and biotic stresses. BMC Genomics 12, 307. doi: 10.1186/1471-2164-12-307.
Kulcheski, F. R., Molina, L. G., da Fonseca, G. C., de Morais, G. L., de Oliveira, L. F. V and Margis, R. (2016) Novel and conserved
microRNAs in soybean floral whorls. Gene 575, 213–223. doi: 10.1016/j.gene.2015.08.061.
Lorenz, R., Bernhart, S. H., Höner zu Siederdissen, C., Tafer, H., Flamm, C., Stadler, P. F. and Hofacker, I. L. (2011) ViennaRNA Package
2.0. Algorithms Mol. Biol. 6, 26. doi: 10.1186/1748-7188-6-26.
Michelfelder, A. J. (2009) Soy: A complete source of protein. Am. Fam. Physician 79, 43–47.
Nikitin, A., Egorov, S., Daraselia, N. and Mazo, I. (2003) Pathway studio - The analysis and navigation of molecular networks.
Bioinformatics 19, 2155–2157. doi: 10.1093/bioinformatics/btg290.
Nizampatnam, N. R., Schreier, S. J., Damodaran, S., Adhikari, S. and Subramanian, S. (2015) MicroRNA160 dictates stage-specific auxin
and cytokinin sensitivities and directs soybean nodule development. Plant J. 84, 140–153. doi: 10.1111/tpj.12965.
Passardi, F., Cosio, C., Penel, C. and Dunand, C. (2005) Peroxidases have more functions than a Swiss army knife. Plant Cell Rep. 24,
255–265. doi: 10.1007/s00299-005-0972-6.
Rajagopalan, R., Vaucheret, H., Trejo, J. and Bartel, D. P. (2006) A diverse and evolutionarily fluid set of microRNAs in Arabidopsis
thaliana. Genes Dev. 20, 3407–3425. doi: 10.1101/gad.1476406.
Scardoni, G., Laudanna, C. and Tosadori, G. (2014) CentiScaPe : Network centralities for Cytoscape. 1–11.
Serraj, R. (2003) Effects of drought stress on legume symbiotic nitrogen fixation: Physiological mechanisms. Indian J. Exp. Bio. 41, 1136–
1141.
Song, Q.-X., Liu, Y.-F., Hu, X.-Y., Zhang, W.-K., Ma, B., Chen, S.-Y. and Zhang, J.S. (2011) Identification of miRNAs and their target
genes in developing soybean seeds by deep sequencing. BMC Plant Bio. 11, 5. doi: 10.1186/1471-2229-11-5.
Srivastava, P.K., Moturu, T.R., Pandey, P., Baldwin, I.T. and Pandey, S.P. (2014) A comparison of performance of plant miRNA target
prediction tools and the characterization of features for genome-wide target prediction. BMC Genomics 15, 348. doi: 10.1186/1471-2164-15-348.
Szklarczyk, D., Franceschini, A., Kuhn, M., Simonovic, M., Roth, A., Minguez, P., Doerks, T., Stark, M., Muller, J., Bork, P., Jensen, L. J.
and Von Mering, C. (2011) The STRING database in 2011: Functional interaction networks of proteins, globally integrated and scored.
Nucleic Acids Res. 39 (SUPPL. 1), 561–568. doi: 10.1093/nar/gkq973.
Um, J. H., Kim, S., Kim, Y. K., Song, S. B., Lee, S. H., Verma, D. P. S. and Cheon, C. I. (2013) RNA interference-mediated repression of
S6 kinase 1 impairs root nodule development in soybean. Mol. Cells 35, 243–248. doi: 10.1007/s10059-013-2315-8.
Xia, R., Zhu, H., An, Y., Beers, E. P. and Liu, Z. (2012) Apple miRNAs and tasiRNAs with novel regulatory networks. Genome Bio. 13,
R47. doi: 10.1186/gb-2012-13-6-r47.
Yu, A., Li, P., Tang, T., Wang, J., Chen, Y. and Liu, L. (2015) Roles of Hsp70s in Stress Responses of Microorganisms, Plants, and
Animals. Biomed Res. Int. (Figure 1). doi: 10.1155/2015/510319.
Zhang, L., Chia, J. M., Kumari, S., Stein, J. C., Liu, Z., Narechania, A., Maher, C. A., Guill, K., McMullen, M. D. and Ware, D. (2009) A
genome-wide characterization of microRNA genes in maize. PLoS Genet. 5. doi: 10.1371/journal.pgen.1000716.
Zhu, Q. H., Spriggs, A., Matthew, L., Fan, L., Kennedy, G., Gubler, F. and Helliwell, C. (2008) A diverse set of microRNAs and
microRNA-like small RNAs in developing rice grains. Genome Res. 18, 1456–1465. doi: 10.1101/gr.075572.107.
Table Legends
Table 1: Glycine max miRNAs targeting multiple transcripts
Table 2: Targets associated with single and multiple miRNAs
Table 3: Pathway-related information of the targets
Table 4: Network parameters of target genes before and after node deletion
Figure Legends
Figure 1: Classification of predicted targets according to their biological process
Figure 2: Classification of predicted targets according to their molecular function
Figure 3: Classification of predicted targets according to cellular components
Figure 4: Network of Glycine max miRNA targets
Figure 5: Peroxidase interactions in the network
Figure 6: HSP70 interactions in the network
Table 1: Glycine max miRNAs targeting multiple transcripts
miRNA id           No. of targets   Mode of inhibition
gma-mir9732        10               cleavage
gma-mir5038a       9                cleavage
gma-mir5672        18               cleavage
gma-mir393c-5p     14               cleavage
gma-mir1520f-3p    10               cleavage
gma-mir9739        1                cleavage
gma-mir319p        6                cleavage
gma-mir156v        32               cleavage
gma-mir1520l       12               cleavage
gma-mir169a        12               cleavage
gma-mir2606a       16               cleavage
gma-mir403b        6                cleavage/translation
gma-mir5371-5p     3                cleavage
gma-mir482b-5p     20               cleavage
gma-mir9750        2                cleavage
gma-mir171i-3p     10               cleavage
gma-mir1532        3                cleavage
Table 2: Targets associated with single and multiple miRNAs
Target      miRNAs
TC423678    gma-miR1507c-5p
TC437576    gma-miR1508c
TC434615    gma-miR1513a-5p, gma-miR1513b
TC427051    gma-miR1514a-5p
TC420931    gma-miR1514b-5p
TC426727    gma-miR1516a-3p
TC472015    gma-miR1516b
TC457335    gma-miR1534
TC462825    gma-miR1536
FK514297    gma-miR156b, gma-miR156p, gma-miR156t, gma-miR156f
TC425591    gma-miR156b, gma-miR156f
TC439563    gma-miR156f
DB962750    gma-miR156v, gma-miR156b, gma-miR156s, gma-miR156a, gma-miR156u, gma-miR156q, gma-miR156h, gma-miR156y, gma-miR156g, gma-miR156x, gma-miR156w, gma-miR156f
TC430891    gma-miR156v, gma-miR156b, gma-miR156s, gma-miR156a, gma-miR156u, gma-miR156q, gma-miR156h, gma-miR156y, gma-miR156g, gma-miR156x, gma-miR156w, gma-miR156f
TC466550    gma-miR156v, gma-miR156b, gma-miR156s, gma-miR156a, gma-miR156u, gma-miR156q, gma-miR156h, gma-miR156y, gma-miR156g, gma-miR156x, gma-miR156w, gma-miR156f
TC437112    gma-miR171d
GE083705    gma-miR171k-5p, gma-miR171l
TC427597    gma-miR172i-3p, gma-miR172d, gma-miR172l, gma-miR172e, gma-miR172c
TC434089    gma-miR172i-3p, gma-miR172a, gma-miR172f, gma-miR172d, gma-miR172l, gma-miR172e, gma-miR172h-3p, gma-miR172b-3p, gma-miR172c, gma-miR172k
TC457282    gma-miR172i-3p, gma-miR172a, gma-miR172f, gma-miR172d, gma-miR172l, gma-miR172e, gma-miR172h-3p, gma-miR172b-3p, gma-miR172c, gma-miR172k
AW351184    gma-miR2108a, gma-miR2108b
EH222473    gma-miR2108a, gma-miR2108b
DB971500    gma-miR395l, gma-miR395i, gma-miR395k, gma-miR395h, gma-miR395j, gma-miR395m
TC489898    gma-miR396k-3p
TC444778    gma-miR397b-5p, gma-miR397a
TC479131    gma-miR4382
Table 3: Pathway-related information of the targets
Pathway                                      Seqs in pathway  Enzyme                    Enzyme ID    Sequence ID                   Pathway ID
Zeatin biosynthesis                          3                dimethylallyltransferase  ec:2.5.1.27  TC445284                      map00908
Zeatin biosynthesis                          3                dimethylallyltransferase  ec:2.5.1.75  TC470457, TC462920, TC445284  map00908
Valine, leucine and isoleucine degradation   4                C-acetyltransferase       ec:2.3.1.9   TC430049                      map00280
Valine, leucine and isoleucine degradation   4                dehydrogenase             ec:1.1.1.31  TC462912                      map00280
Valine, leucine and isoleucine degradation   4                dehydrogenase (NAD+)      ec:1.2.1.3   TC457335, TC482610            map00280
Valine, leucine and isoleucine biosynthesis  2                synthase                  ec:2.2.1.6   TC441223, BW669215            map00290
Tyrosine metabolism                          11               oxidase                   ec:1.10.3.1  NP8367007, TC461966           map00350
Tyrosine metabolism                          11               dehydrogenase             ec:1.1.1.1   TC432095, TC444770, TC424618  map00350
Table 4: Network parameters of target genes before and after node deletion
Target name           Clustering coefficient   Network centralization   Characteristic path length   Network heterogeneity
                      Before    After          Before    After          Before    After              Before    After
HSP70                 0.225     0              0.521     0.637          1.471     2.103              0.331     1.338
MAPK                  0.593     0.590          0.695     0.421          1.629     1.363              0.730     0.819
RPS6                  0.990     0.989          0.011     0.012          1.010     1.011              0.020     0.021
GMIPER (Peroxidase)   0.866     0.851          0.579     0.058          1.524     1.111              0.276     0.153
Figure 1: Classification of predicted targets according to their biological process
Figure 2: Classification of predicted targets according to their molecular function
Figure 3: Classification of predicted targets according to cellular components
Figure 4: Network of Glycine max miRNA targets
Figure 5: Peroxidase interactions in the network
Figure 6: HSP70 interactions in the network