Post on 31-Dec-2015


Identification of Transition Models of Biological Systems in the Presence of Transition Noise

A. Srinivasan, M. Bain, D. Vatsa, S. Agarwal

School of Computer Science and Engineering

Part 1: Transition Models in Biology

Networks in Biology

• Biological processes are often represented as networks
– Gene-regulatory networks, signal-transduction networks, metabolic networks, protein-protein interaction networks, phylogenetic trees, food-webs, ecosystems
– Modelling, visualisation and analysis of these networks is a fundamental part of modern Biology

• Here, we will be looking at one kind of model for networks in Biology (transition models)
– Most well known: Petri Net (and variants)
– Generalisation to Logical Guarded Transition Systems (LGTSs)

Some Examples of Networks

Discrete System Observations

Petri Net Models

From Extended PNs to LGTSs

Part 2: Model Identification

Identification of Petri Nets

• Durzinsky et al. have proposed an algorithm that enumerates all Petri Nets consistent with a set of discrete state-pairs
– These are called conformal networks

• This work has since been extended to a procedure that enumerates conformal extended PNs (i.e. Petri nets with read/write arcs)

• Limitations
– Does not allow any explicit inclusion of background knowledge, though some constraints are "hard-wired"
– Some technical limitations when data are Boolean valued
– Unclear whether the technique scales to arbitrary combinations of read/write arcs; and does not extend to other forms of PNs

PN Identification

LGTS and FSMs

• With a bound on the number of tokens allowed in each place, the LGTS model for a sequence of observations S can be computed by a DFA (Takahashi, 1992)

• The DFA is a transducer that reads zero or one input symbols (observations) and writes out the tuples $T_j = (t_j, r_j, m_{j-1}, m_j)$

• This view of an LGTS will be useful when looking at noisy data
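
The transducer view can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function `transduce`, the dictionary layout and the guards in `WATER` are hypothetical, and only the case of one observation consumed per step is shown.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, ...]
Transition = Tuple[Callable[[State], bool], State]  # (guard, effect vector r)

def transduce(observations: List[State],
              transitions: Dict[str, Transition],
              bound: int = 1) -> List[tuple]:
    """Read observations one at a time; write out tuples (t, r, m_prev, m_next)."""
    output = []
    for m_prev, m_next in zip(observations, observations[1:]):
        for name, (guard, r) in transitions.items():
            target = tuple(m + d for m, d in zip(m_prev, r))
            within_bound = all(0 <= v <= bound for v in target)  # token bound per place
            if guard(m_prev) and within_bound and target == m_next:
                output.append((name, r, m_prev, m_next))
                break  # take the first enabled transition that matches
        else:
            raise ValueError(f"no transition explains {m_prev} -> {m_next}")
    return output

# Places are (h2, o2, h2o); these guards are assumptions made for the example.
WATER = {
    "t1": (lambda m: m[0] == 0 and m[1] == 0, (1, 1, 0)),
    "t2": (lambda m: m[0] >= 1 and m[1] >= 1, (-1, -1, 1)),
}
tuples = transduce([(0, 0, 0), (1, 1, 0), (0, 0, 1)], WATER)
```

The token bound keeps the set of reachable markings finite, which is what makes a finite-state reading possible.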

LGTS Identification

• System states are as in Petri nets (i.e., place-value vectors).

• System behaviours are sequences of system states $S_i = (s_{i,0}, s_{i,1}, \ldots, s_{i,n})$ or equivalently, a set of state-pairs $\{(s_{i,0}, s_{i,1}), (s_{i,1}, s_{i,2}), \ldots, (s_{i,n-1}, s_{i,n})\}$. Let StatePairs(S) be the union of the sets of state-pairs for a set of sequences $S = \{S_1, S_2, \ldots, S_j\}$.

• An LGTS trace for a state-pair $(s_i, s_f)$ is a set $Trace(s_i, s_f) = \{T_1, T_2, \ldots, T_k\}$, where $T_1 = (t_1, r_1, m_0, m_1)$, $T_2 = (t_2, r_2, m_1, m_2)$, …, $T_k = (t_k, r_k, m_{k-1}, m_k)$.
– (a) Each $t_j$ is a guarded transition; (b) $r_j = m_j - m_{j-1}$; (c) $s_i = m_0$; and (d) $s_f = m_k$
– $m_1, m_2, \ldots, m_{k-1}$ are intermediate states.

• An LGTS model for a state-pair $(s_i, s_f)$ is $T(s_i, s_f) = \{(t, r) : (t, r, m_a, m_b) \in Trace(s_i, s_f)\}$.

• Given a set of sequences $S = \{S_1, S_2, \ldots, S_j\}$, let TracePairs be $\bigcup_{(s_i, s_j) \in StatePairs(S)} Trace(s_i, s_j)$.

• Then $LGTS(S) = \{(t, r) : (t, r, m_a, m_b) \in TracePairs\}$.
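
The definitions on this slide translate fairly directly into code. The sketch below is illustrative only: the function names `state_pairs`, `is_trace` and `lgts_model` are assumptions, and condition (a), that each $t_j$ is a guarded transition, is not checked here.

```python
from typing import Iterable, List, Set, Tuple

State = Tuple[int, ...]
TraceTuple = Tuple[str, State, State, State]  # (t, r, m_prev, m_next)

def state_pairs(sequences: Iterable[List[State]]) -> Set[Tuple[State, State]]:
    """Union of consecutive state-pairs over all observed sequences."""
    pairs = set()
    for seq in sequences:
        pairs.update(zip(seq, seq[1:]))
    return pairs

def is_trace(trace: List[TraceTuple], s_i: State, s_f: State) -> bool:
    """Check conditions (b)-(d) and that consecutive tuples chain together."""
    if not trace or trace[0][2] != s_i or trace[-1][3] != s_f:
        return False
    for k, (_, r, m_prev, m_next) in enumerate(trace):
        if r != tuple(b - a for a, b in zip(m_prev, m_next)):
            return False  # violates r_j = m_j - m_{j-1}
        if k > 0 and trace[k - 1][3] != m_prev:
            return False  # tuples do not chain
    return True

def lgts_model(traces: Iterable[List[TraceTuple]]) -> Set[Tuple[str, State]]:
    """Project every trace tuple (t, r, m_a, m_b) onto (t, r)."""
    return {(t, r) for trace in traces for (t, r, _, _) in trace}

# Places are (h2, o2, h2o); here s3 plays the role of an intermediate state.
s0, s3, s4 = (0, 0, 0), (1, 1, 0), (0, 0, 1)
trace = [("t1", (1, 1, 0), s0, s3), ("t2", (-1, -1, 1), s3, s4)]
pairs = state_pairs([[s0, s3, s4], [s0, s3, s4]])
model = lgts_model([trace])
```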

System Identification Setting

[Figure: grid of system identification settings, crossing Data (Perfect / Imperfect) with Background Knowledge (Perfect / Imperfect)]

Identification of LGTSs

We can formulate this as logical consequence-finding:
– Given: (a) A set of sequences S of states, representing observations of the system behaviour; and (b) Background knowledge B containing generic and domain-specific constraints and definitions of guarded transitions; and (c) the definition of a relation G = lgts(S,T) that is TRUE for all pairs S and T s.t. T is an LGTS model of S, i.e., T = LGTS(S).
– Find: All T's s.t. $B \wedge G \models lgts(S,T)$

If B and G can be encoded as logic programs, then the T's can be computed using the usual theorem prover used by logic programming systems.
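
As a rough picture of what "find all T's" means, the following sketch enumerates candidate models by brute force. It is a stand-in for resolution-based theorem proving, not a reproduction of it: the background knowledge is reduced to a dictionary of hypothetical guarded transitions, and each state-pair is explained by a single transition (no intermediate states).

```python
from itertools import product
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

State = Tuple[int, ...]
Transition = Tuple[Callable[[State], bool], State]  # (guard, effect vector r)

def explanations(s_i: State, s_f: State,
                 background: Dict[str, Transition]) -> List[Tuple[str, State]]:
    """All (t, r) whose guard holds in s_i and whose effect maps s_i to s_f."""
    r = tuple(b - a for a, b in zip(s_i, s_f))
    return [(name, eff) for name, (guard, eff) in background.items()
            if guard(s_i) and eff == r]

def all_models(sequence: List[State],
               background: Dict[str, Transition]) -> Set[FrozenSet[Tuple[str, State]]]:
    """Every set T that explains each state-pair of the sequence."""
    pairs = sorted(set(zip(sequence, sequence[1:])))
    options = [explanations(s_i, s_f, background) for s_i, s_f in pairs]
    return {frozenset(choice) for choice in product(*options)}

# Hypothetical background knowledge over places (h2, o2, h2o).
B = {
    "t1": (lambda m: True, (1, 1, 0)),
    "t2": (lambda m: m[0] >= 1 and m[1] >= 1, (-1, -1, 1)),
    "t3": (lambda m: m[2] == 0, (-1, -1, 1)),  # rival explanation of the same effect
}
models = all_models([(0, 0, 0), (1, 1, 0), (0, 0, 1)], B)
unexplained = all_models([(0, 0, 0), (1, 0, 1)], B)
```

Here two models are returned because `t2` and `t3` both explain the second state-pair; the second call returns no model because no transition in `B` has the required effect.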

LGTS Identification: Completeness and Correctness

If B is complete and correct, and G is correct, then all T’s that satisfy the equation will be found by the system (refutation-completeness of resolution)

Every T found by the system will correctly explain S, in the sense that lgts(S,T) will be TRUE (soundness of resolution)

Given a data sequence S, for every (extended or normal) PN found by Durzinsky et al, there is some background knowledge B and an LGTS model T s.t. lgts(S,T) is a logical consequence of B and G

Background Knowledge

• The constraints provided as background knowledge can greatly reduce the search-space of possible answers to the system-identification task

• For example, we can restrict chemical reactions to those that break no more than 3 bonds (on grounds that any more would require too much energy in a cell)

• This along with the mass-balance restrictions can provide very effective constraints on the search
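
A small sketch of how such constraints might prune candidates. Everything concrete here is an assumption for illustration: the candidate reactions, the `bonds_broken` figures and the function names are invented, and only the two constraints named on this slide (mass balance and at most 3 broken bonds) are modelled.

```python
from collections import Counter
from typing import Dict

# Atom composition of each species (standard chemistry).
ATOMS = {"h2": Counter(H=2), "o2": Counter(O=2), "h2o": Counter(H=2, O=1)}

def atom_total(side: Dict[str, int]) -> Counter:
    """Total atom counts for one side of a reaction, given species -> coefficient."""
    total = Counter()
    for species, coeff in side.items():
        for atom, k in ATOMS[species].items():
            total[atom] += coeff * k
    return total

def admissible(reaction: dict, max_bonds_broken: int = 3) -> bool:
    """Keep reactions that are mass-balanced and break few enough bonds."""
    balanced = atom_total(reaction["lhs"]) == atom_total(reaction["rhs"])
    return balanced and reaction["bonds_broken"] <= max_bonds_broken

# Hypothetical candidates; the bonds_broken values are illustrative only.
candidates = [
    {"name": "c1", "lhs": {"h2": 2, "o2": 1}, "rhs": {"h2o": 2}, "bonds_broken": 3},
    {"name": "c2", "lhs": {"h2": 1, "o2": 1}, "rhs": {"h2o": 1}, "bonds_broken": 2},
    {"name": "c3", "lhs": {"h2": 4, "o2": 2}, "rhs": {"h2o": 4}, "bonds_broken": 6},
]
kept = [c["name"] for c in candidates if admissible(c)]
```

In this toy run `c2` fails mass balance and `c3` exceeds the bond limit, so only `c1` survives.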

Part 3: Model Identification with Transition Noise

System Identification Setting

[Figure: grid of system identification settings, crossing Data (Perfect / Imperfect: Incomplete, Incorrect) with Background Knowledge (Perfect / Imperfect); cell marked: LP]

System identification with noisy data

[Figure: pipeline diagram. Components: Discretiser, LGTS Identifier, Automaton Builder, Viterbi Estimator, Model Filtering. Data items: Sequence of Discrete System States, LGTS Trace, LGTS Model, PFA, Ranked Transition Sequences, LGTS model selection. Input: Background Knowledge (generic and problem-specific constraints; guarded transitions)]

Two kinds of incompleteness

• Data are missing intermediate states

• States are missing place values

• Of these, the first can be handled adequately by the capability of obtaining LGTS models with intermediate states. In DFA terms, this means allowing ε-transitions that do not consume input observations, and still produce T-tuples as outputs

• The second kind of incompleteness is handled by abduction
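
The first kind of incompleteness can be illustrated with a bounded search for traces that pass through unobserved intermediate states. This is a sketch under assumptions: the depth limit `max_len`, the token `bound` and the guards are hypothetical choices, and a plain depth-first search is used in place of a logic-programming engine.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, ...]
Transition = Tuple[Callable[[State], bool], State]  # (guard, effect vector r)

def find_traces(s_i: State, s_f: State, transitions: Dict[str, Transition],
                max_len: int = 2, bound: int = 1) -> List[List[tuple]]:
    """All traces of at most max_len tuples leading from s_i to s_f."""
    results: List[List[tuple]] = []

    def extend(m: State, trace: List[tuple]) -> None:
        if trace and m == s_f:
            results.append(list(trace))  # reached the observed post-state
            return
        if len(trace) == max_len:
            return
        for name, (guard, r) in transitions.items():
            nxt = tuple(a + d for a, d in zip(m, r))
            if guard(m) and all(0 <= v <= bound for v in nxt):
                trace.append((name, r, m, nxt))  # step that consumes no observation
                extend(nxt, trace)
                trace.pop()

    extend(s_i, [])
    return results

# Places are (h2, o2, h2o); the observation (1, 1, 0) is missing from the data.
WATER = {
    "t1": (lambda m: True, (1, 1, 0)),
    "t2": (lambda m: m[0] >= 1 and m[1] >= 1, (-1, -1, 1)),
}
found = find_traces((0, 0, 0), (0, 0, 1), WATER)
```

The single trace found passes through the unobserved marking (1, 1, 0), which is exactly the intermediate state the data omitted.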

System Identification Setting

[Figure: grid of system identification settings, crossing Data (Perfect / Imperfect: Incomplete, Incorrect) with Background Knowledge (Perfect / Imperfect); cells marked: LP, ALP]

“Noise”

• Chemical equations are symbolic representations of what may happen, not what must happen

• Filling a balloon with hydrogen and oxygen will not necessarily result in a balloon full of water vapour (the temperature has to be right)

• Reactions are subject to extrinsic and intrinsic sources of “noise”
– External conditions may not be suitable
– Molecular collisions may not happen properly for a reaction to take place

• In addition, data are subject to errors of observation, recording etc.

Noise and System Identification

• 3 kinds of incorrectness in the data
1. Signal noise: time-series data have noise
2. State noise: values of places have errors
3. Transition noise: outputs of transitions do not follow usual patterns

• In principle, if we assume all states are the output of some transition, then it is possible to model both (2) and (3) using a discrete probability model

– we will use the term transition noise for both kinds of errors

Transition Noise

Transitions have some probability of going to unexpected states.
– Transition-noise: unexpected states are related to the post-state of the transition
– State-noise: unexpected states are unrelated to the post-state of the transition

If transition $T = (t, r, s_{pre}, s_{post})$ then transition non-determinacy gives the transition set $T' = \{(t, r, s_{pre}, s'_{post}) : \mathrm{Hamming}(s_{post}, s'_{post}) \geq 0\}$.

A probability distribution on the set T' gives a probabilistic transition. Implemented in PRISM [4] as a probabilistic finite automaton (PFA).
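
One way to picture a probabilistic transition is to enumerate the alternative post-states and weight them by Hamming distance. The weighting below (independent flips of each Boolean place value with probability `flip`) is an assumption chosen for illustration; the slides do not specify the distribution, and this is not the PRISM encoding.

```python
from itertools import product
from typing import Dict, Tuple

State = Tuple[int, ...]

def hamming(a: State, b: State) -> int:
    """Number of places in which two Boolean states differ."""
    return sum(x != y for x, y in zip(a, b))

def noisy_transition(t: str, r: State, s_pre: State, s_post: State,
                     flip: float = 0.1) -> Dict[tuple, float]:
    """Distribution over tuples (t, r, s_pre, s_post') for all Boolean s_post'."""
    n = len(s_post)
    dist = {}
    for cand in product((0, 1), repeat=n):
        d = hamming(s_post, cand)
        # d places flipped, n - d places left unchanged
        dist[(t, r, s_pre, cand)] = flip ** d * (1 - flip) ** (n - d)
    return dist

# Transition t2 of the water example, places (h2, o2, h2o).
dist = noisy_transition("t2", (-1, -1, 1), (1, 1, 0), (0, 0, 1))
most_likely = max(dist, key=dist.get)
```

With `flip` below 0.5 the expected post-state keeps the largest share of the probability mass, and states further away in Hamming distance receive progressively less.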

LGTS models with noisy data

• With noisy data, there may not be any known transition between a pair of noisy states $s_0$ and $s_1$

• That is, with $S = (s_0, s_1)$, there is no T s.t. $B \wedge G \models lgts(S,T)$

• But allowing the abduction of new transitions will allow finding a T
– $T_{new} = (t_{new}, r, s_0, s_1)$ where $r = s_1 - s_0$ and guards of $t_{new}$ are always TRUE
– A new transition is abduced for each “unexpected” state-pair

• With logic programs this is similar to what is done when extending SLD-resolution to SOLD-resolution [7]
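
The abduction step can be sketched as follows. This is a plain-Python illustration, not SOLD-resolution: the function name, the `tnew` naming scheme and the guards are assumptions, and a known transition is preferred whenever one explains the state-pair.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, ...]
Transition = Tuple[Callable[[State], bool], State]  # (guard, effect vector r)

def explain_with_abduction(sequence: List[State],
                           background: Dict[str, Transition]):
    """Return a trace for the sequence plus the abduced transitions, if any."""
    trace = []
    abduced: Dict[Tuple[State, State], str] = {}
    for s0, s1 in zip(sequence, sequence[1:]):
        r = tuple(b - a for a, b in zip(s0, s1))  # r = s1 - s0
        known = [name for name, (guard, eff) in background.items()
                 if guard(s0) and eff == r]
        if known:
            name = known[0]
        else:
            # One new transition per unexpected state-pair; its guard is always TRUE.
            name = abduced.setdefault((s0, s1), f"tnew{len(abduced) + 1}")
        trace.append((name, r, s0, s1))
    return trace, abduced

# Places are (h2, o2, h2o); the last state is a noisy observation.
WATER = {
    "t1": (lambda m: m[0] == 0 and m[1] == 0, (1, 1, 0)),
    "t2": (lambda m: m[0] >= 1 and m[1] >= 1, (-1, -1, 1)),
}
trace, abduced = explain_with_abduction([(0, 0, 0), (1, 1, 0), (1, 0, 1)], WATER)
```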

System Identification Setting

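[Figure: grid of system identification settings, crossing Data (Perfect / Imperfect: Incomplete, Incorrect) with Background Knowledge (Perfect / Imperfect); cells marked: LP, ALP, PLP]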

PFA Identification from LGTS with Noisy Transitions

With abduction, it will always be possible to obtain a T s.t. $B \wedge G \models lgts(S,T)$. The corresponding NFA will contain the abduced transitions as output.

But some transitions may be more likely than others.

From the noisy data sequences we determine the parameters for transitions in a PFA using PRISM (Viterbi probability for an HMM where state pairs are observed data and transitions are internal states).

We show a worked example on the following slides.
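
To give a feel for what "parameters for transitions" means, the sketch below estimates transition probabilities by simple relative frequency per pre-state. This is a deliberately simplified stand-in, not PRISM's learning procedure and not a Viterbi computation; the function name and the input layout are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

def estimate_parameters(traces: List[List[Tuple[str, str]]]) -> Dict[str, Dict[str, float]]:
    """Relative frequency of each transition name, conditioned on the pre-state."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for trace in traces:
        for pre_state, transition in trace:
            counts[pre_state][transition] += 1
    return {
        pre: {t: c / sum(ctr.values()) for t, c in ctr.items()}
        for pre, ctr in counts.items()
    }

# Three observed sequences, written as (pre-state, transition) steps:
# two follow s0 -t1-> s3 -t2-> s4, one ends with the abduced step s3 -t4-> s5.
observed = [
    [("s0", "t1"), ("s3", "t2")],
    [("s0", "t1"), ("s3", "t2")],
    [("s0", "t1"), ("s3", "t4")],
]
params = estimate_parameters(observed)
```

On this toy data the abduced transition `t4` receives a lower weight than `t2` from the same pre-state, which is the kind of ranking the model-filtering step relies on.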

From “Noisy” to Probabilistic Transitions

Place   s0  s3  s4 | s0  s3  s4 | s0  s3  s5
h2       0   1   0 |  0   1   0 |  0   1   1
o2       0   1   0 |  0   1   0 |  0   1   0
h2o      0   0   1 |  0   0   1 |  0   0   1

From “Noisy” to Probabilistic Transitions

Place   s0  s3  s4 | s0  s3  s4 | s0  s3  s5
h2       0   1   0 |  0   1   0 |  0   1   1
o2       0   1   0 |  0   1   0 |  0   1   0
h2o      0   0   1 |  0   0   1 |  0   0   1

Transition  t1  t2 | t1  t2 | t1  t4
h2          +1  -1 | +1  -1 | +1   0
o2          +1  -1 | +1  -1 | +1  -1
h2o          0  +1 |  0  +1 |  0  +1
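
The effect vectors in the transition rows are the differences between consecutive states in the place rows. The following short check recomputes them; the state values are read off the table, while the dictionary layout and function name are choices made for this sketch.

```python
from typing import List, Tuple

PLACES = ("h2", "o2", "h2o")

# Place values per state, read off the table (order: h2, o2, h2o).
STATES = {
    "s0": (0, 0, 0),
    "s3": (1, 1, 0),
    "s4": (0, 0, 1),
    "s5": (1, 0, 1),
}
SEQUENCES = [("s0", "s3", "s4"), ("s0", "s3", "s4"), ("s0", "s3", "s5")]

def effects(sequence: Tuple[str, ...]) -> List[Tuple[int, ...]]:
    """Effect vector r = post - pre for each consecutive pair of states."""
    return [
        tuple(b - a for a, b in zip(STATES[pre], STATES[post]))
        for pre, post in zip(sequence, sequence[1:])
    ]

all_effects = [effects(seq) for seq in SEQUENCES]
```

The first two sequences yield the effects labelled t1 and t2; the third ends with (0, -1, +1), the effect labelled t4, in which h2 is left unchanged.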

From “Noisy” to Probabilistic Transitions

From “Noisy” to Probabilistic Transitions

Experiments

• Evaluating identification is hard on unknown systems, so we use reconstruction

• 3 standard biological models
– Water, MAPK and Glycolysis

• We vary
– Noise level (low, medium and high)
– Sample size (small and large)
– with multiple replicates

• Implementation
– LGTS in YAP, with data generation and Viterbi estimation in PRISM

Error (FNR) and Viterbi probability of transitions

Transitions in LGTS and probabilistic model

Related Work

• Durzinsky et al. (2011)
– Petri net identification as optimisation

• Inoue (2011) and Inoue et al. (2014)
– Learning from interpretation transition

• Bioinformatics and systems biology
– Probabilistic network identification

Conclusion

• Dynamic qualitative model identification
– Identification as logical consequence finding using logic programming (DFA)

• Transition model incompleteness
– Abductive LP (NFA)

• Transition model incorrectness
– Probabilistic LP (PFA)

• Future work
– Generalisation of probabilistic transitions

References

[1] M. Durzinsky, A. Wagler, and W. Marwan. Reconstruction of extended Petri nets from time series data and its application to signal transduction and to gene regulatory networks. BMC Systems Biology, 5:113, 2011.

[2] K. Inoue, T. Ribeiro, and C. Sakama. Learning from interpretation transition. Machine Learning, 94(1):51-79, 2014.

[3] R. King, K. Whelan, F. Jones, P. Reiser, C. Bryant, S. Muggleton, D. Kell, and S. Oliver. Functional genomic hypothesis generation and experimentation by a robot scientist. Nature, 427:247-252, 2004.

[4] T. Sato and Y. Kameya. PRISM: A symbolic-statistical modeling language. In Proc. 15th Intl. Joint Conf. on Artificial Intelligence (IJCAI97), pp. 1330-1335, 1997.

[5] A. Srinivasan and M. Bain. Knowledge-Guided Identification of Petri Net Models of Large Biological Systems. In S. Muggleton, A. Tamaddoni-Nezhad, and F. Lisi, (Eds.), Proc. 21st Intl. Conf. on Inductive Logic Programming (ILP 2011) LNCS 7207 pp. 317-331, Springer, 2012.

[6] A. Srinivasan and M. Bain. Identification of Transition-Based Models of Biological Systems using Logic Programming. Technical Report UNSW-CSE-TR-201425, University of New South Wales, Sydney, Australia, 2014.

[7] A. Yamamoto. Representing Inductive Inference with SOLD-Resolution. In Proceedings of the IJCAI'97 Workshop on Abduction and Induction in AI, 1997.