inferring time-delayed gene regulatory networkstechlav.ncat.edu/seminars/2017/2017-02-17 mina moradi...


Inferring Time-Delayed Gene Regulatory Networks

Presented by: Mina Moradi

Advisor: Dr. Abdollah Homaifar

North Carolina A&T State University

Dept. of Electrical & Computer Engineering

[email protected]

acitcenter.ncat.edu

February 17, 2017


Outline

1 Introduction and Motivation

2 Literature Review

3 Objective of the Work

4 Proposed Method

5 Simulations and Results

6 Conclusion


Introduction and Motivation

Your genes are part of what makes you the person you are. You are different from everyone alive now and everyone who has ever lived.

Genes are not independent: they regulate each other and act collectively.

A gene regulatory network (GRN) is an abstract mapping of gene regulations in living cells.

GRNs identify the specific functional roles of individual genes in cellular systems and can open a window on disease progression and drug development.

Therapy: This time it’s personal, Lauren Gravitz, Nature (2014)


DNA microarray and RNA sequencing technologies measure the expression levels of thousands of genes inside a cell in response to specific environmental conditions [1].

A GRN is usually represented by a directed graph, with nodes representing the genes and links representing the regulatory relationships.
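
As a rough illustration of this graph view, the sketch below stores a GRN as an adjacency mapping from each regulator to the genes it regulates. The class name `RegulatoryGraph`, the gene labels, and the optional `sign` and `delay` edge attributes are assumptions made for the example; they are not taken from the presented work.

```python
from collections import defaultdict


class RegulatoryGraph:
    """Directed graph: nodes are genes, edges are regulatory relationships."""

    def __init__(self):
        # regulator -> {target: {"sign": ..., "delay": ...}}
        self._edges = defaultdict(dict)

    def add_regulation(self, regulator, target, sign=1, delay=0):
        """Add a directed edge regulator -> target (attributes are hypothetical)."""
        self._edges[regulator][target] = {"sign": sign, "delay": delay}

    def targets_of(self, regulator):
        """Genes directly regulated by `regulator`."""
        return sorted(self._edges.get(regulator, {}))

    def regulators_of(self, target):
        """Genes that have an edge pointing at `target`."""
        return sorted(r for r, ts in self._edges.items() if target in ts)


grn = RegulatoryGraph()
grn.add_regulation("g1", "g2", sign=1)            # g1 activates g2
grn.add_regulation("g3", "g2", sign=-1, delay=2)  # g3 represses g2 after 2 steps
print(grn.regulators_of("g2"))
```

Because the edges are directed, `grn.targets_of("g2")` is empty here even though `g2` has two regulators.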


Reverse engineering of GRNs is a challenging problem due to:

The stochastic characteristics of biological phenomena, the inherent noise of measured gene expression data, and high dimensionality [2].

Strong non-linearity in the temporal patterns of regulatory genes [3].

Genetic interactions among different genes can have different time delays, for example due to the time required for regulatory genes to express their protein products [4, 5].


Literature Review

Boolean networks: Based upon binary outcomes (on and off) for gene expression and therefore lack adequate dynamic resolution [6].

Bayesian networks: Represent probabilistic relationships among genes and capture the inherent noise and stochasticity of gene expression [7].

Ordinary differential equations: Deterministic models, where interactions among genes represent causal interactions rather than statistical dependencies [8].

TD-ARACNE [9]: Infers the time-delayed dependencies between genes in terms of mutual information, assuming a stationary Markov random field as its underlying probabilistic model.

HCC-CLINDE [10]: Infers a time-delayed GRN in the presence of hidden common causes, based on either a correlation test or a mutual information test.
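
To make the notion of a time-delayed dependency concrete, the following sketch scans a few candidate delays and reports the one at which a regulator's series correlates most strongly with a target's series. It is a generic lagged Pearson correlation, not the TD-ARACNE or HCC-CLINDE procedure; the function names and the toy series are assumptions for illustration only.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length series (assumes non-zero variance)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def best_delay(regulator, target, max_delay):
    """Return the delay d whose pairing regulator(t), target(t + d) correlates most."""
    scores = {}
    for d in range(max_delay + 1):
        x = regulator[: len(regulator) - d]  # regulator(t)
        y = target[d:]                       # target(t + d)
        scores[d] = pearson(x, y)
    best = max(scores, key=lambda d: abs(scores[d]))
    return best, scores


# Toy data: the target copies the regulator two time steps later.
regulator = [0, 1, 0, 2, 1, 3, 0, 1, 2, 0]
target = [0, 0] + regulator[:-2]
delay, scores = best_delay(regulator, target, max_delay=3)
print(delay)
```

A linear score like this one would miss the non-linear dependencies discussed earlier, which is one reason the methods above also rely on mutual information.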


Objective of the Work

Infer a time-delayed GRN that takes into account the non-linearity of gene interactions and the noise of measurements.

Recurrent neural networks (RNNs) are computational tools for processing temporal data, approximating nonlinear patterns, and tolerating noise in measurements.

RNNs are usually considered “black box” models: their internal structure and learned parameters are not interpretable [11].


Hierarchical Recurrent Neural Network

We propose a hierarchical RNN (HRNN) that overcomes the interpretation difficulties of RNNs when modeling GRNs.

Time-delayed regulations can be captured through hierarchical paths between leaf nodes (regulatory genes) and a target node (regulated gene) in the HRNN.

$x_1, \dots, x_C$ are context nodes. $x_{C+1}, \dots, x_{C+P}$ are genes.


A population of candidate HRNNs is randomly generated.

A network with c context nodes has c + 1 neurons.

In a network with c ≤ C context nodes, the first c context nodes and the genes (excluding the target gene) are potential inputs of the neurons.

The target gene is the output of the first neuron. The context node c_i is an input of neuron i and the output of neuron i + 1.

If the input of a neuron is a context node, its weight is positive.

\[ x_i(t+1) = f\left( \sum_{j} w_{k,j}\, x_j(t) \right) \tag{1} \]
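
The sketch below shows one way the update in equation (1) could be evaluated for a small HRNN, and how chaining neurons through a context node produces a delay. The activation f is not specified on the slide, so `math.tanh` is an assumed default; the data layout (`neurons` as lists of `(kind, index, weight)` tuples) and the function name `hrnn_step` are also assumptions for the example.

```python
import math


def hrnn_step(neurons, genes, context, f=math.tanh):
    """Advance the network by one time step using equation (1).

    neurons[0] outputs the target gene; neurons[i] (i >= 1) outputs context
    node i - 1. Each neuron is a list of (kind, index, weight) inputs, where
    kind is "gene" or "ctx". All outputs are computed from values at time t.
    """
    def value(kind, index):
        return context[index] if kind == "ctx" else genes[index]

    outputs = [
        f(sum(w * value(kind, index) for kind, index, w in inputs))
        for inputs in neurons
    ]
    return outputs[0], outputs[1:]  # target(t + 1), context(t + 1)


# Two neurons: gene 0 feeds context node 0, which feeds the target.
neurons = [
    [("ctx", 0, 1.0)],   # neuron 1: target <- context node 0 (positive weight)
    [("gene", 0, 1.0)],  # neuron 2: context node 0 <- gene 0
]
identity = lambda s: s  # linear activation, used only to make the delay visible

target_1, ctx_1 = hrnn_step(neurons, genes=[1.0], context=[0.0], f=identity)
target_2, ctx_2 = hrnn_step(neurons, genes=[0.0], context=ctx_1, f=identity)
print(target_1, target_2)
```

With the identity activation, a pulse on gene 0 reaches the target only at the second step, which is the kind of time-delayed path the hierarchy is meant to expose.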


The corresponding hierarchical model, which shows the direct regulation of x8 by gene x4, and time-delayed regulations of x7 by genes x4, x5, x7.


Representation of the Candidate HRNNs in the GA

Candidate networks in the genetic algorithm (GA) are represented by their number of neurons (Nn), the number of inputs to each neuron (Nin), the indices of the input nodes (In), the weights of the input connections (W), and the decay rate of the target gene’s expression level (µ), if it exists.
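
One possible in-memory layout for such a candidate is sketched below. The class and field names are illustrative assumptions; only the quantities they hold (Nn, Nin, In, W, µ) come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CandidateHRNN:
    """One individual in the GA population (field names are illustrative)."""
    n_inputs: List[int]             # Nin: number of inputs of each neuron
    input_indices: List[List[int]]  # In: indices of the input nodes, per neuron
    weights: List[List[float]]      # W: weights of the input connections
    decay: Optional[float] = None   # mu: decay rate of the target gene, if any

    @property
    def n_neurons(self) -> int:
        """Nn: number of neurons, implied by the per-neuron lists."""
        return len(self.n_inputs)

    def is_consistent(self) -> bool:
        """Check that every neuron has exactly Nin indices and Nin weights."""
        same_length = len(self.n_inputs) == len(self.input_indices) == len(self.weights)
        return same_length and all(
            n == len(idx) == len(w)
            for n, idx, w in zip(self.n_inputs, self.input_indices, self.weights)
        )


candidate = CandidateHRNN(
    n_inputs=[2, 1],
    input_indices=[[0, 4], [5]],
    weights=[[0.8, -0.3], [0.5]],
)
print(candidate.n_neurons, candidate.is_consistent())
```

Keeping the per-neuron lists parallel makes it easy for crossover and mutation to change one neuron without touching the others.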


Fitness of Candidate Networks

The performance of the candidate networks (fitness) is evaluated by measuring the trade-off between the goodness of fit and complexity of the model by using the Akaike information criterion (AIC) and the Akaike information criterion with correction (AICc).

\[ \mathrm{AIC} = n \ln\left( \frac{1}{n} \sum_{l} \sum_{t} \left( x_i^l(t) - \hat{x}_i^l(t) \right)^2 \right) + 2k \tag{2} \]

\[ \mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1} \tag{3} \]

k is the number of leaf nodes in the HRNN and n is the total number of temporal samples for gene expression. If n is small or k is large, the AICc is preferred rather than AIC. As n gets larger, AICc converges to AIC.
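
A minimal sketch of equations (2) and (3) follows, assuming the observed and predicted expression of the target gene are given as one list per time series l. The function names and the toy numbers are assumptions for illustration.

```python
import math


def aic(observed, predicted, k):
    """Equation (2): n * ln(SSE / n) + 2k, with n the total number of samples."""
    n = sum(len(series) for series in observed)
    sse = sum(
        (x - x_hat) ** 2
        for obs, pred in zip(observed, predicted)
        for x, x_hat in zip(obs, pred)
    )
    return n * math.log(sse / n) + 2 * k


def aicc(observed, predicted, k):
    """Equation (3): AIC plus the small-sample correction (requires n > k + 1)."""
    n = sum(len(series) for series in observed)
    return aic(observed, predicted, k) + 2 * k * (k + 1) / (n - k - 1)


observed = [[1.0, 2.0, 3.0, 4.0]]   # one time series of the target gene
predicted = [[1.5, 2.0, 2.5, 4.0]]  # model output for the same time points
print(aic(observed, predicted, k=1), aicc(observed, predicted, k=1))
```

Here n = 4 and k = 1, so the correction term is 2·1·2/(4 − 1 − 1) = 2; with more samples the same k would add a much smaller penalty, which is the convergence noted above.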


The Crossover Operator

Figure: Parent 1 Figure: Parent 2

Figure: Child 1 Figure: Child 2


The Mutation Operator

For a mutation site msite in the network, the mutation works as follows:

If msite is on the number of inputs of a neuron (Nin), it is mutated to Nin = Nin ± 1; a new input and its corresponding weight are therefore added or deleted.

If msite is on an input connection of a neuron (In), the selected connection is rewired to another node in the network.

If msite is on a connection weight of a neuron and the input is a context node, Gaussian mutation evolves the weight within the range [0, wmax]; otherwise, the weight is mutated within the range [wmin, wmax].
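
Two of these rules can be sketched as follows. The bounds `w_min` and `w_max`, the step size `sigma`, the use of clipping to stay inside the range, and the Gaussian step for non-context weights are all assumptions; the slide does not give these details. For brevity, the input-count rule draws new inputs from gene candidates only.

```python
import random


def mutate_weight(weight, is_context, w_min=-1.0, w_max=1.0, sigma=0.1, rng=random):
    """Gaussian perturbation, clipped to [0, w_max] for context inputs
    and to [w_min, w_max] otherwise (clipping is an assumed choice)."""
    low = 0.0 if is_context else w_min
    mutated = weight + rng.gauss(0.0, sigma)
    return min(max(mutated, low), w_max)


def mutate_input_count(inputs, weights, candidates, w_min=-1.0, w_max=1.0, rng=random):
    """Nin = Nin +/- 1: delete one input and its weight, or add a new pair."""
    inputs, weights = list(inputs), list(weights)
    if len(inputs) > 1 and rng.random() < 0.5:
        drop = rng.randrange(len(inputs))
        del inputs[drop]
        del weights[drop]
    else:
        inputs.append(rng.choice(candidates))
        weights.append(rng.uniform(w_min, w_max))
    return inputs, weights


rng = random.Random(0)  # fixed seed so the sketch is repeatable
new_weight = mutate_weight(0.05, is_context=True, rng=rng)
new_inputs, new_weights = mutate_input_count([3, 5], [0.4, -0.2], candidates=[1, 2, 6], rng=rng)
print(new_weight, new_inputs, new_weights)
```

Clipping keeps context-node weights non-negative, which matches the constraint that connections from context nodes carry positive weights.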


Simulations and Results

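The HRNN is evaluated on the GRN of Saccharomyces cerevisiae and on synthetically generated nonlinear data for different network sizes and noise variances. The results are compared with TD-ARACNE and HCC-CLINDE in terms of:

Links: a prediction is correct if and only if both the gene pair and the direction are correct.

Delays: a prediction is correct if and only if both the link and the time delay are correct.

Effects: a prediction is correct if and only if both the link and the sign of the effect are correct.

For each criterion, the following metrics are computed, where TP, FP, and FN are the numbers of true positives, false positives, and false negatives:

\[ \mathrm{Recall} = \frac{TP}{TP+FN}, \qquad \mathrm{Precision} = \frac{TP}{TP+FP}, \qquad F\text{-score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}} \]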
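
For the "Links" criterion, these metrics could be computed from sets of directed edges as in the sketch below. The function name, the gene labels, and the toy networks are assumptions for illustration; the convention of returning 0 when a denominator is zero is also an assumed choice.

```python
def link_scores(true_links, predicted_links):
    """Precision, recall, and F-score over directed (regulator, target) pairs."""
    true_links, predicted_links = set(true_links), set(predicted_links)
    tp = len(true_links & predicted_links)  # correct pair and direction
    fp = len(predicted_links - true_links)  # predicted but not in the true network
    fn = len(true_links - predicted_links)  # true links that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


true_links = {("g1", "g2"), ("g2", "g3"), ("g3", "g1"), ("g1", "g4")}
predicted_links = {("g1", "g2"), ("g3", "g2"), ("g3", "g1")}  # g3 -> g2 is reversed
precision, recall, f_score = link_scores(true_links, predicted_links)
print(round(precision, 3), round(recall, 3), round(f_score, 3))
```

The reversed edge `("g3", "g2")` counts as a false positive and leaves `("g2", "g3")` as a false negative, since a link is only correct when its direction matches. The "Delays" and "Effects" criteria could reuse the same function by extending each tuple with the delay or the sign.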


The Effect of Network Size

Figure: 5 genes Figure: 10 genes

Figure: 20 genes Figure: 30 genes


The Effect of Noise Level

Figure: σ² = 0.5 Figure: σ² = 1.0

Figure: σ² = 1.5 Figure: σ² = 2.0


Saccharomyces cerevisiae

IRMA is a recent, significant contribution to systems biology reported in [12], where the authors built a synthetic network in the yeast Saccharomyces cerevisiae.

Figure: True regulations

Figure: Proposed method

Figure: TD-ARACNE Figure: HCC-CLINDE


Saccharomyces cerevisiae

Table: Comparison of Results for GRN Reconstructions of IRMA.

Method        TP   FP   FN   Precision   Recall   F-score

Proposed       6    3    2   0.667       0.75     0.706

TD-ARACNE      2    1    6   0.667       0.25     0.366

HCC-CLINDE     1    3    7   0.25        0.125    0.166


Conclusion

In this study, we proposed a hierarchical recurrent neural network approach to identify time-delayed regulatory interactions of genes.

The designed HRNN facilitates capturing paths of different lengths from the leaf nodes in the network to the target node.

The hierarchy in the network and the possibility of recurrent connections in the HRNN provide the capability to model the temporal patterns of gene expression.

The proposed method outperformed TD-ARACNE and HCC-CLINDE on data with non-linear dynamics and high levels of measurement noise.


References

[1] C. Sima, J. Hua, and S. Jung, “Inference of gene regulatory networks using time-series data: a survey,” Current Genomics, vol. 10, pp. 416–29, Sept. 2009.

[2] Z. Bar-Joseph, A. Gitter, and I. Simon, “Studying and modelling dynamic biological processes using time-series gene expression data,” Nature Reviews Genetics, vol. 13, no. 8, pp. 552–564, 2012.

[3] J. Hasty, D. McMillen, F. Isaacs, and J. J. Collins, “Computational studies of gene regulatory networks: in numero molecular biology,” Nature Reviews Genetics, vol. 2, pp. 268–279, Apr. 2001.

[4] N. Morshed, M. Chetty, and N. Xuan Vinh, “Simultaneous learning of instantaneous and time-delayed genetic interactions using novel information theoretic scoring technique,” BMC Systems Biology, vol. 6, no. 1, p. 62, 2012.

[5] D. Bratsun, D. Volfson, L. S. Tsimring, and J. Hasty, “Delay-induced stochastic oscillations in gene regulation,” Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 41, pp. 14593–14598, 2005.

[6] S. Bornholdt, “Boolean network models of cellular regulation: prospects and limitations,” Journal of the Royal Society Interface, vol. 5, Suppl. 1, pp. S85–S94, 2008.

[7] N. Friedman, M. Linial, I. Nachman, and D. Pe’er, “Using Bayesian networks to analyze expression data,” Journal of Computational Biology, vol. 7, no. 3-4, pp. 601–620, 2000.

[8] E. Sakamoto and H. Iba, “Inferring a system of differential equations for a gene regulatory network by using genetic programming,” in Proceedings of the 2001 Congress on Evolutionary Computation, vol. 1, pp. 720–726, 2001.

[9] P. Zoppoli, S. Morganella, and M. Ceccarelli, “TimeDelay-ARACNE: reverse engineering of gene networks from time-course data by an information theoretic approach,” BMC Bioinformatics, vol. 11, no. 1, p. 154, 2010.

[10] L.-Y. Lo, M.-L. Wong, K.-H. Lee, and K.-S. Leung, “Time delayed causal gene regulatory network inference with hidden common causes,” PLoS ONE, vol. 10, pp. 1–47, Sept. 2015.

[11] D. Castelvecchi, “Can we open the black box of AI?,” Nature, vol. 538, no. 7623, p. 20, 2016.

[12] I. Cantone, L. Marucci, F. Iorio, M. A. Ricci, V. Belcastro, M. Bansal, S. Santini, M. di Bernardo, D. di Bernardo, and M. P. Cosma, “A yeast synthetic network for in vivo assessment of reverse-engineering and modeling approaches,” Cell, vol. 137, no. 1, pp. 172–181, 2009.


Acknowledgment

This work is partially supported by the National Science Foundation (NSF) under Cooperative Agreement No. CCF-1029731.

Thank you for your attention
