The Chemical Master Equation: From Reactions to
Complex Networks
Massimo Stella
April 21, 2015
Abstract
This project investigates the chemical master equation and its links to complex networks. The
report is composed of two parts: an introduction, deriving the chemical master equation from
some basic results of statistical mechanics and probability theory, and a second part, relating the
formalism of master equations to growing network models and random walks on graphs. At the
end of the first part, further analytical and numerical results about Markov processes are reported
and discussed.
1 The Physics behind the Chemical Master Equation
The mathematical modelling of chemically reacting gaseous systems, via the framework of
Markovian stochastic processes, relies on some delicate hypotheses from statistical mechanics
[3]. In this section, we review these basic results, with the aim of outlining a physically coherent
approach to the mathematics of the chemical master equation for chemical kinetics.
1.1 Some Physical Premises
Historically, the modelling of chemical reactions as stochastic processes was introduced in [2]
and became increasingly popular in the 1950s and 1960s. However, it was only in the nineties,
with the work of Gillespie [1], that a rigorous microphysical derivation of such an approach was
provided, in order to demonstrate its a priori modelling validity. Before that date, in fact,
such a fidelity check could be performed only a posteriori, through comparisons with real or
molecular dynamics experiments [2, 13].
Following the physical approach of [1], we use a frequentist probability interpretation, i.e.
probability is the fraction of trials in which an event E occurs. Such an approach is viable in the
context of chemical kinetics, where very high numbers of molecules engage in the very same
reactions [3, 5]. In addition, it allows one to derive results that should otherwise be
postulated (by using Kolmogorov and De Finetti axioms [4]), such as the following:
1. Addition Law: if events A and B are mutually exclusive (i.e. they never occur at the same
time), then the total probability of "either A or B" is given by $P(A \cup B) = P(A) + P(B)$;
2. Multiplication Law: the joint probability of two events A and B happening at the same
time is $P(A \cap B) = P(A, B) = P(A) \cdot P(B|A)$, where $P(B|A)$ is the conditional probability
of B happening, given the occurrence of A.
In our case, events are going to be chemical reactions at the molecular level [1]. Therefore, let
us consider a gas containing molecules of $N \in \mathbb{N}$ different species, $S_1, S_2, ..., S_N$, interacting
through $M$ chemical reaction channels $R_1, ..., R_M$ and all contained in a container of constant
volume $V$. Let $X_i(t)$ be a variable related to the number of molecules of type $S_i$ in the system
at time $t \geq 0$, with $i \in I := \{1, 2, ..., N\}$. We focus principally on the bimolecular elementary
reaction channels of the form $S_i + S_j \to S_k + ...$, with $i, j, k \in I$.
We restrict our analysis to close-to-ideal gases in thermodynamic equilibrium. In other
words, we consider the molecules as distinguishable, non-pointlike¹ hard spheres, of given
mass and radius, interacting mainly by collisions, with other types of long-range interactions
being negligible in both frequency and intensity. Furthermore, thermodynamic
equilibrium implies the existence of a well-defined temperature parameter $T$ for the whole system.
It also means that Boltzmann's molecular chaos hypothesis (i.e. the Stosszahlansatz) is valid: the
particle velocities are both uncorrelated and independent of position, mainly because of thermal
fluctuations [5]. These physical premises lead to two mathematical propositions [1, 3]:
• Spatial homogeneity: the probability of finding any randomly selected molecule inside any
subregion $\Delta V$ of the volume $V$ equals $\Delta V / V$; in mathematical terms, the molecule positions
are independent² random variables, uniformly distributed over the domain $V$.
• Maxwell-Boltzmann velocity distribution: denoting by $k_B$ Boltzmann's constant [5], then
1 Ideal gases require particles to be treated as pointlike masses. Furthermore, the distinguishability of particles refers to the possibility of identifying each particle in time, according to its Newtonian trajectory, given an initial "labelling". This concept loses any validity in quantum mechanics, where there is no quantum counterpart of the idea of trajectory [5].
2 Two random variables $X$ and $Y$ are independent (or pairwise independent) iff their joint probability distribution factorises; in formulas, $P(X \cap Y) = P(X, Y) = P(X)P(Y)$ [4].
the probability of finding a molecule of mass $m$ with velocity between $v$ and $v + dv$ is³:
$$p_{MB}(v)\,dv = \left(\frac{m}{2\pi k_B T}\right)^{3/2} \exp\left(-\frac{m |v|^2}{2 k_B T}\right) dv. \tag{1}$$
In mathematical terms, the above equation means that each Cartesian velocity component
of a randomly selected molecule is a normally distributed random variable, with zero mean
and variance kBT/m. Additionally, all such components are independent variables.
These two points are often summarised by saying that the system is "well-stirred", so that molecules
are well mixed throughout the whole spatial domain and in thermal equilibrium. It has to be
underlined that the above findings emerge from a deterministic chaotic (mixing) behaviour of
molecules at the microscopic level, in a scenario close to ideality and in thermal equilibrium. It is
ultimately this physical concept of "molecular chaos" that provides the "unreasonable efficacy"
of a mathematical stochastic treatment of such systems [5, 3, 14].
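The second premise can be illustrated with a small numerical sketch: sampling one Cartesian velocity component from the Maxwell-Boltzmann distribution and checking that its mean and variance match the values stated above ($0$ and $k_B T/m$). The mass and temperature below are illustrative assumptions, not values used in the report.

```python
import random
import math

kB = 1.380649e-23   # Boltzmann's constant, J/K
T = 300.0           # temperature, K (illustrative)
m = 4.65e-26        # molecular mass, kg (roughly a nitrogen molecule; illustrative)

random.seed(42)
sigma = math.sqrt(kB * T / m)  # predicted std of each Cartesian velocity component

# Each Cartesian component of a randomly selected molecule's velocity is
# Gaussian with zero mean and variance kB*T/m.
n = 200_000
vx = [random.gauss(0.0, sigma) for _ in range(n)]

mean = sum(vx) / n
var = sum(v * v for v in vx) / n

print(f"sample mean     = {mean:.1f} m/s (expected ~0)")
print(f"sample variance = {var:.3e} m^2/s^2 (expected {sigma**2:.3e})")
```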
1.2 Towards the Chemical Master Equation
We want to determine the evolution law for the species population vector⁴ $X(t) = (X_1(t), ..., X_N(t))$,
compatibly with the two above definitions of molecule positions and velocities and focusing on
bimolecular reactions. In order to perform such a task, we have to determine the probability
$\pi_\mu(t, dt)$ that two molecules, randomly selected at time $t$, react in the next $dt$ time interval,
according to the bimolecular channel $\mu$. However, according to the above physical discussion,
in order for a bimolecular reaction to occur, two (spherical) molecules $i$ and $j$ have to collide
with each other first. Additionally, their collision must be efficient [1].
Denoting by $u_\mu(t, dt)$ the probability of a collision (defined analogously to $\pi_\mu(t, dt)$, but
for a collision event) and by $P_\mu$ the probability of a chemical reaction being triggered, then:
$$\pi_\mu(t, dt) = u_\mu(t, dt) \cdot P_\mu. \tag{2}$$
In other words, the probability $\pi_\mu(t, dt)$ that an efficient collision (i.e. a reaction) happens in
the time interval $[t, t + dt)$ is equal to the product of the collision probability $u_\mu(t, dt)$ with the
conditional probability $P_\mu = P(\text{trigger a reaction} \mid \text{collision})$.

3 In statistical mechanics, given a Cartesian vector $v = (v_x, v_y, v_z)$, the differential element $dv$, sometimes also denoted $d^3v$, is equal to $dv_x\, dv_y\, dv_z$.

4 Because of the intrinsic stochasticity of our chemical system, we have to consider $X(t)$ as an $N$-dimensional random variable, with outcomes $o$ defined on a subset of $\mathbb{N}^N$. Rather than considering the time evolution of $X(t)$, we are more interested in determining the probability $P(X(t) = o)$, evolving over time.
In order to compute $u_\mu$ we can resort to the following:
Theorem 1. [1] Let $\{C_i\}_{i \in \mathbb{N}}$ be a set of mutually exclusive and collectively exhaustive events,
partitioning the sample space, and let $A$ be any event. Then:
$$P(A) = \sum_i P(C_i) \cdot P(A|C_i). \tag{3}$$
Proof. The $C_i$s represent a partition of the whole sample space, so that $A$ can be
decomposed over the set $\{C_i\}$ in terms of mutually exclusive subsets, i.e. $A = \cup_i (A \cap C_i)$. This
means that $P(A) = P(\cup_i (A \cap C_i)) = \sum_i P(A \cap C_i)$, from the addition law. Also, $\sum_i P(A \cap C_i) =
\sum_i P(A, C_i) = \sum_i P(C_i) \cdot P(A|C_i)$, with the last step being due to the multiplication law.
The above theorem is also valid in the continuous case (i.e. when $i$ is a real index, defined
on a set $K$), with the sum replaced by an integral, with a proper measure.
We consider $C_{v'}$, $v' \in \mathbb{R}^3$, to be the event that two randomly selected molecules (in the
channel $R_\mu$) at time $t$ have relative velocity $v' = v_j - v_i$. Given the symmetries of the Maxwell-Boltzmann
velocity distribution (which explicitly depends only on the modulus of the velocity), a simple
change of reference frame and the random variable transformation theorem for statistically
independent random variables [3, 4, 5] lead to
$$P(C_{v'}) = \left(\frac{m^*}{2\pi k_B T}\right)^{3/2} \exp\left(-\frac{m^* |v'|^2}{2 k_B T}\right), \tag{4}$$
where $m^* = m_i m_j/(m_i + m_j)$ is the reduced mass of the two reactant molecules (in the channel
$R_\mu$). In the reference frame of the $j$-th molecule, the $i$-th molecule moves on the straight
path connecting $i$ and $j$ at speed $|v'|$, covering a length $|v'|\,dt$ in the time interval $[t, t + dt)$.
Additionally, the two molecules collide when their relative distance is less than or equal to
$r_i + r_j$. These two quantities allow us to approximate the volume $V_{int}$, inside which the two
molecules collide, as a cylinder of radius $r_i + r_j$ and height $|v'|\,dt$ [1]. Since the collision
probability is ultimately related to the molecule positions in the volume $V$, because of the
spatial homogeneity premise, then
$$P(A_c | C_{v'}) = \frac{V_{int}}{V} = \frac{\pi (r_i + r_j)^2 |v'|\, dt}{V}, \tag{5}$$
where $A_c$ is a collision event, with probability equal to $u_\mu$; in other words, $P(A_c) = u_\mu(t, dt)$.
Similarly to Theorem 1, we now use all the collision-relevant probabilities to obtain:
$$u_\mu(t, dt) = \int P(C_{v'})\, P(A_c | C_{v'})\, dv' = \frac{1}{V} \left(\frac{8 k_B T \pi}{m^*}\right)^{1/2} (r_i + r_j)^2\, dt. \tag{6}$$
Interestingly, the resulting $u_\mu(t, dt)$ can be factorised as $u_\mu(t, dt) = a_\mu dt$, where $a_\mu$ is
independent of time. It has to be underlined that in computing⁵ the above integrals, we are
implicitly assuming that the molecule velocities do not change over the infinitesimal amount
of time $dt$, which is actually a reasonable assumption for a gas close to ideality (with
collisions as the only non-negligible intermolecular interactions).
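Equation (6) lends itself to a Monte Carlo cross-check: sample relative velocities $v'$ from the Maxwell-Boltzmann distribution with reduced mass $m^*$, average the collision-cylinder volume fraction $\pi (r_i + r_j)^2 |v'| / V$, and compare with the closed form for $a_\mu$. A minimal sketch follows; the masses, radii, temperature and volume are illustrative assumptions, not values from the report.

```python
import math
import random

kB = 1.380649e-23            # Boltzmann's constant, J/K
T = 300.0                    # temperature, K (illustrative)
V = 1e-6                     # container volume, m^3 (illustrative)
mi, mj = 4.65e-26, 5.31e-26  # molecular masses, kg (illustrative)
ri, rj = 1.8e-10, 2.1e-10    # molecular radii, m (illustrative)

m_star = mi * mj / (mi + mj)        # reduced mass of the reactant pair
sigma = math.sqrt(kB * T / m_star)  # std of each relative-velocity component

# Closed form of Eq. (6): u_mu = a_mu * dt with
a_closed = (1.0 / V) * math.sqrt(8.0 * kB * T * math.pi / m_star) * (ri + rj) ** 2

# Monte Carlo estimate: average pi*(ri+rj)^2*|v'| / V over relative
# velocities v' sampled with Gaussian components of variance kB*T/m*.
random.seed(0)
n = 100_000
acc = 0.0
for _ in range(n):
    v = [random.gauss(0.0, sigma) for _ in range(3)]
    speed = math.sqrt(sum(c * c for c in v))
    acc += math.pi * (ri + rj) ** 2 * speed / V
a_mc = acc / n

print(f"closed form a_mu = {a_closed:.4e} 1/s")
print(f"Monte Carlo a_mu = {a_mc:.4e} 1/s")
```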
Nevertheless, in order to compute the reaction probability $\pi_\mu(t, dt)$ from (2), it is necessary
to also compute the conditional probability $P_\mu$ of triggering a reaction in the channel $R_\mu$, given
a collision between two molecules of that channel. Without recourse to quantum mechanics,
our classical framework allows for the description of two "triggering" mechanisms:
1. Directionality: in order for the collision to be effective and trigger the reaction, it has to
bring specific molecular regions close enough. Given the spherical assumption, if those
regions correspond to solid angles $\omega_i$ and $\omega_j$ for molecules $i$ and $j$, respectively, then the
collision-conditioned reaction probability can be approximated as $P_\mu = \omega_i \omega_j / (4\pi)^2$;
2. Impact energy: every collision is characterised by the total kinetic energy $\epsilon$ of the colliding
molecules. If $\epsilon$ is less than a certain threshold $\epsilon_\mu$ (relative to the channel $R_\mu$), then new
chemical bonds cannot form and the reaction does not happen. In this case, with some
modifications to the probability apparatus, it is possible to show [3, 1] that the trigger
probability follows the so-called Arrhenius law $P_\mu = \exp(-\epsilon_\mu / k_B T)$.
In both cases, $P_\mu$ is also independent of time, therefore $\pi_\mu(t, dt) = u_\mu(t, dt) P_\mu =
a_\mu P_\mu dt = c_\mu dt$, where the probability rate⁶ $c_\mu$ is independent of time, i.e. stationary [14, 6].
1.3 Derivation of the Chemical Master Equation
Interestingly, in a well-stirred gas, close to ideality and at thermal equilibrium, each bimolecular
channel has a reaction probability quantifiable in a rather simple closed form, i.e. $c_\mu dt$, with
stationary probability rate $c_\mu$ [1]. Let us introduce the vectors $n = (n_1, n_2, ..., n_N) \in \mathbb{N}^N$ and
5 Even if the velocity components should be bounded by the speed of light, extending the Gaussian integrals appearing in $u_\mu$ over the whole real line leads to exponentially small errors, which can be neglected. Furthermore, this approximation trivially allows for an analytical solution of the integrals, by using differentiation under the integral sign [5].
6 A similar approximation $\pi_\mu \sim \alpha_\mu dt$, with $\alpha_\mu$ independent of time, can be performed also for monomolecular and trimolecular reactions, but only in specific instances [3].
$n_\mu = (n_{\mu 1}, n_{\mu 2}, ..., n_{\mu N}) \in \mathbb{Z}^N$ to address the population number of each species and the change
in each of the populations after an $R_\mu$ reaction, respectively. Then, $n$ and $n + n_\mu$ provide
the molecular populations of each species $S_1, S_2, ..., S_N$ before and after the occurrence of one
$R_\mu$ chemical reaction. In addition, each $R_\mu$ channel involves a different number $h_\mu$ of reactant
combinations, according to the stoichiometric coefficients in the relative chemical equations. For
instance, the channel $R_\alpha: S_1 + S_2 \to S_3$ encompasses $h_\alpha = n_1 n_2$ different combinations of
reactant molecules from species $S_1$ and $S_2$. The reactant combination function $h_\mu$ is evidently
a scalar function of $n$. Together with the jump vector $n_\mu$ and with the probability rate constant
$c_\mu$, $h_\mu(n)$ specifies the dynamics of the channel $R_\mu$, with $\mu \in \{1, ..., M\}$. In fact, we can now
determine the evolution of the species population vector $X(t) = (X_1(t), ..., X_N(t))$ over time.
Theorem 2. If $X(t) = n$, then the probability $p_1$ that exactly one $R_\mu$ reaction occurs in the time
interval $[t, t + dt)$ is given by $h_\mu(n)\, c_\mu\, dt + O(dt^2)$.
Proof. Since the system molecules are distinguishable (according to the Maxwell-Boltzmann
distribution), it is possible to uniquely label each one of them at time $t$. This allows
us to actually "select two random molecules at time $t$". Each of the $h_\mu(n)$ distinct
combinations of $R_\mu$ reactant molecules in the system has a nonzero probability, equal to $c_\mu dt$,
of reacting according to $R_\mu$ in the time interval $[t, t + dt)$. The complementary event of the
$R_\mu$ reaction not happening has probability $1 - c_\mu dt$, in the same time interval. This sets up a
Bernoulli process-like instance, in which the multiplication law implies that the probability
that one particular combination among the $h_\mu(n)$ reactant combinations participates in an $R_\mu$ reaction while
the other $h_\mu(n) - 1$ combinations do not is $c_\mu dt\,(1 - c_\mu dt)^{h_\mu(n) - 1} = c_\mu dt + O(dt^2)$.
Since any one of the $h_\mu(n)$ combinations may be the one reacting alone in the very same
infinitesimal time interval, and these possibilities are mutually exclusive (up to $O(dt^2)$ terms),
the addition law gives:
$$p_1 = h_\mu(n)\left[c_\mu\, dt + O(dt^2)\right] = h_\mu(n)\, c_\mu\, dt + O(dt^2). \tag{7}$$
From the multiplication law, a corollary of the above theorem is that the probability for
$k \geq 2$ reactions to occur in $[t, t + dt)$ is actually of order $O(dt^2)$. The case of no reactions
happening is quantified by the following:
Theorem 3. If $X(t) = n$, then the probability $p_0$ that no reaction occurs in the time interval
$[t, t + dt)$ is given by $1 - \sum_\mu h_\mu(n)\, c_\mu\, dt + O(dt^2)$.
Proof. [1] Let us underline that we have to consider only terms of order $dt$. As stated in the
previous proof, each of the $h_\mu(n)$ combinations of $R_\mu$ reactant molecules has a probability
$1 - c_\mu dt$ of not reacting in $[t, t + dt)$. Because of the multiplication law, then, the probability
of no reaction occurring in channel $R_\mu$ is simply $(1 - c_\mu dt)^{h_\mu(n)} = 1 - h_\mu(n)\, c_\mu\, dt + O(dt^2)$, where
we used a Taylor expansion. The joint probability that no reaction occurs in any of the $M$
available channels is, once again, provided by the multiplication law, as
$$p_0 = \prod_{\mu=1}^{M} \left[1 - h_\mu(n)\, c_\mu\, dt + O(dt^2)\right] = 1 - \sum_{\mu=1}^{M} h_\mu(n)\, c_\mu\, dt + O(dt^2). \tag{8}$$
The above two theorems constitute a first order (in time) machinery entirely built on physical
premises, which provides a deterministic analytical description of the probabilities regulating
$X(t)$, rather than of $X(t)$ directly. Let us fix the initial population vector $n_0 = n(t_0)$ at the
initial time $t_0$ and let us introduce the transition probability $P(n, t|n_0, t_0)$ as the probability
that $X(t) = n$, given that $X(t_0) = n_0$, for $t \geq t_0$. In order to obtain a continuous evolution
dynamics, we have to relate the transition probabilities before and after an infinitesimal
amount of time, encompassing also the initial conditions [3]. In formulas, we have to relate
$P(n, t + dt|n_0, t_0)$ to what might happen in $[t, t + dt)$, namely to the occurrence of one of the
following mutually exclusive events: "no reaction", "one reaction", "more than one reaction".
Since $P(n, t + dt|n_0, t_0)$ implies that $X(t_0) = n_0$ and $X(t + dt) = n$, in case no reaction occurs
in $[t, t + dt)$, the species populations are unaltered; in formulas, $X(t) = n$. However, we
defined the probability of transitioning from $X(t_0) = n_0$ to $X(t) = n$ as $P(n, t|n_0, t_0)$, therefore
the probability to further transition to the state $X(t + dt) = n$ is given by the following product:
$$P(n, t|n_0, t_0) \cdot \left(1 - \sum_{\mu=1}^{M} h_\mu(n)\, c_\mu\, dt + O(dt^2)\right), \tag{9}$$
where the second factor is the probability that no reaction occurs in $[t, t + dt)$, from Theorem 3.
Since the system has to transition to a state with $X(t + dt) = n$, in case exactly one reaction
from channel $R_\mu$ occurs in $[t, t + dt)$, then the species population must start from the state
$X(t) = n - n_\mu$. Denoting by $P(n - n_\mu, t|n_0, t_0)$ the probability of transitioning from $X(t_0) = n_0$
to $X(t) = n - n_\mu$, the probability to further transition to $X(t + dt) = n$ is:
$$P(n - n_\mu, t|n_0, t_0)\left(h_\mu(n - n_\mu)\, c_\mu\, dt + O(dt^2)\right), \tag{10}$$
where the second factor is the probability that one $R_\mu$ reaction occurs in $[t, t + dt)$, from
Theorem 2 applied to the state $n - n_\mu$. Straightforwardly from the same theorem, any probability
contribution coming from the "more than one reaction occurs" case is of order $O(dt^2)$ [1, 3].
Because of the mutual exclusivity of the above three events, we can finally quantify all the
contributions to $P(n, t + dt|n_0, t_0)$:
$$P(n, t + dt|n_0, t_0) = P(n, t|n_0, t_0) \cdot \left(1 - \sum_{\mu=1}^{M} h_\mu(n)\, c_\mu\, dt\right) + \sum_{\mu=1}^{M} P(n - n_\mu, t|n_0, t_0)\, h_\mu(n - n_\mu)\, c_\mu\, dt + O(dt^2). \tag{11}$$
Subtracting $P(n, t|n_0, t_0)$ from both sides, dividing by $dt$ and taking the limit $dt \to 0$ retrieves
the so-called chemical master equation [1, 3, 13, 2]:
$$\frac{\partial}{\partial t} P(n, t|n_0, t_0) = \sum_{\mu=1}^{M} h_\mu(n - n_\mu)\, c_\mu\, P(n - n_\mu, t|n_0, t_0) - \left(\sum_{\mu=1}^{M} h_\mu(n)\, c_\mu\right) P(n, t|n_0, t_0), \tag{12}$$
with initial condition $P(n, t = t_0|n_0, t_0) = 1$ if $n = n_0$ and $P(n, t = t_0|n_0, t_0) = 0$ otherwise.
Interestingly, this differential equation can be interpreted as a balance equation for the
probability of each discrete state $X(t) = n$ [6]. In fact, the probability evolution over time
has to take into account the "gain" due to transitions from other states with $X(t) = n - n_\mu$,
while the second term represents the "loss" due to transitions into other states, both
terms physically originating from chemical reactions. Interestingly, the physical possibility
of no chemical reaction occurring over a given time period shapes the typical path of $X(t)$,
consisting of piecewise stretches (which are constant in the discrete state case) interspersed with
discontinuous "jumps" (which can be present also in the continuous state case). Because of
this, the more general class of Markovian processes described by a gain-loss master equation
is also referred to as "jump processes" [6].
As an example, we simulated a rather simple chemical reaction network, with two species
A and B and two channels: a bimolecular degradation reaction, $A + B \xrightarrow{k_{deg}} B$,
coupled with a synthesis reaction, $\emptyset \xrightarrow{k_{syn}} A$. Assuming the chemical system was well-stirred,
at thermal equilibrium and close to ideality, we quantified the probabilities (i.e. propensities)
of the degradation and of the synthesis as $\alpha_{deg} = h_{deg} c_{deg} = A(t)B(t)\, k_{deg}/V$ and
$\alpha_{syn} = h_{syn} c_{syn} = k_{syn} V$, respectively, where $X(t) = (A(t), B(t))$ is the species population
vector at time $t$ and $V$ is the system volume. In predicting the stochastic dynamics of this
chemical network, we did not explicitly use its associated chemical master equation but,
instead, its equivalent algorithmic formulation, i.e. Gillespie's Stochastic Simulation Algorithm
(SSA) [1, 12]. We implemented the SSA in Mathematica 9, with parameter values $V = 1\,\mathrm{m}^3$,
$k_{deg}/V = 0.05\,\mathrm{s}^{-1}$, $k_{syn}V = 2\,\mathrm{s}^{-1}$, $t_0 = 0$, $A(0) = 5$ and $B(0) = B(t) = 1$ for $t \geq t_0$. For our
chemical network, analytic results [13] predict a Poisson stationary distribution $\Pi(n)$ for the
probability of having $A(t) = n$ at large times $t \gg t_0$; in formulas,
$$\Pi(n) = \frac{1}{n!}\, M_A^n\, \exp(-M_A), \qquad M_A = \frac{k_{syn} V^2}{k_{deg} B(0)}, \tag{13}$$
with $M_A = 40$ being the stationary average number of molecules of A in our case. Even
if our simulations are not numerically intensive, they corroborate the analytical convergence of
the Gillespie SSA to the exact results derived from the chemical master equation [12, 3, 13].
Figure 1: SSA simulation for our chemical network toy model. Top: an ensemble of 5 discrete "jump"
trajectories of the number of A species molecules $A(t)$ over time $t$. The trajectories highlight the
convergence of the fluctuations around the value $M_A = 40$, starting from the initial condition $A(0) =
5$. Bottom: normalised stationary distribution for (# of trajectories, # of transitions, $M_A$) equal to
$(5, 300, 39.78)$ (left), $(20, 1000, 40.28)$ (center) and $(30, 1500, 40.14)$ (right). Even if our
simulations are not numerically intensive, they suggest the (analytically proven) convergence of the Gillespie
SSA to the chemical master equation stationary distribution [13].
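Our implementation was in Mathematica 9; since the logic of Gillespie's direct method is compact, a minimal Python sketch of the same toy network (same parameter values, $k_{deg}/V = 0.05\,\mathrm{s}^{-1}$, $k_{syn}V = 2\,\mathrm{s}^{-1}$, $A(0) = 5$, $B = 1$) may clarify the algorithm. The time average of $A(t)$ should fluctuate around $M_A = 40$.

```python
import math
import random

random.seed(1)

# Toy network: A + B --kdeg--> B (degradation), 0 --ksyn--> A (synthesis)
kdeg_over_V = 0.05   # s^-1, the report's parameter k_deg / V
ksyn_times_V = 2.0   # s^-1, the report's parameter k_syn * V
B = 1                # B(t) is constant in this network

def ssa(A0, t_end):
    """Gillespie's direct method: returns the time average of A(t)."""
    t, A = 0.0, A0
    t_weighted_sum = 0.0
    while t < t_end:
        a_deg = kdeg_over_V * A * B      # degradation propensity
        a_syn = ksyn_times_V             # synthesis propensity
        a_tot = a_deg + a_syn
        # exponential waiting time until the next reaction
        tau = -math.log(1.0 - random.random()) / a_tot
        tau = min(tau, t_end - t)
        t_weighted_sum += A * tau
        t += tau
        # choose which reaction fires, proportionally to its propensity
        if random.random() * a_tot < a_deg:
            A -= 1
        else:
            A += 1
    return t_weighted_sum / t_end

avg_A = ssa(A0=5, t_end=5000.0)
print(f"time-averaged A(t) = {avg_A:.2f} (stationary mean M_A = 40)")
```

Drawing a histogram of the visited states (weighted by the waiting times) would reproduce the Poisson stationary distribution of Eq. (13).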
1.4 Beyond the Chemical Master Equation
From a purely mathematical point of view, master equations are a particular case of the
Chapman-Kolmogorov equation for Markov processes [14, 6, 4]. In the case of continuous
Markov processes, the Chapman-Kolmogorov equation relates the transition probability from state $y$ at
time $t_0$ to state $x$ at time $t$ by integrating over all possible intermediate transitions $y \to z \to x$
at any time $t_0 < t_1 < t$, analogously to what we did in deriving the chemical master equation.
In formulas, the Chapman-Kolmogorov equation can be stated as [14]:
$$P(x, t|y, t_0) = \int dz\, P(x, t|z, t_1)\, P(z, t_1|y, t_0). \tag{14}$$
The chemical master equation represents a first order approximation in time to the evolution
of transition probabilities for the chemical species populations. Such a finding [3] is analogous to
the property of a class of continuous Markov processes, in which a time interval $dt$ corresponds
to an $O(dt)$ displacement $x - y$, with the following features
$$a_i(x, dt) = \int dy\,(y_i - x_i)\, P(y, t_0 + dt|x, t_0) = O(dt), \qquad b_{ij}(x, dt) = \int dy\,(y_j - x_j)(y_i - x_i)\, P(y, t_0 + dt|x, t_0) = O(dt), \tag{15}$$
and also with negligible higher order terms. For such Markov processes, a Kramers-Moyal
expansion [3, 6] of the Chapman-Kolmogorov equation (namely a Taylor expansion in $x - y$
with $t_1 = t_0 + dt$) leads to the celebrated Fokker-Planck equation, largely recurring in
Brownian motion and in many other diffusion-related processes [6, 14, 9]:
$$\frac{\partial P_T}{\partial t} = -\sum_{i=1}^{N} \frac{\partial}{\partial x_i}\left[f_i P_T\right] + \frac{1}{2} \sum_{i,j=1}^{N} \frac{\partial^2}{\partial x_j \partial x_i}\left[Q_{ij} P_T\right], \tag{16}$$
where $P_T = P(x, t|y, t_0)$, while $f_i = \lim_{dt \to 0} a_i/dt$ and $Q_{ij} = \lim_{dt \to 0} b_{ij}/dt$. In case the $Q_{ij}$
are independent of $x$, it can be shown that the Fokker-Planck equation, with $t_0 = 0$, rules the
evolution of the probability density $\rho(x, t)$ associated with the stochastic process [14, 6]
$$x_i(t + dt) = x_i(t) + f_i(x(t))\, dt + \sqrt{dt}\, \eta_i(t), \tag{17}$$
where the $\eta_i(t)$s are zero-mean Gaussian random variables with $\langle \eta_i(t + n\, dt)\, \eta_j(t + m\, dt) \rangle = Q_{ji}\, \delta_{nm}$.
Such a relationship is fundamental for many simulation techniques for stochastic processes [6, 3].
In the limit $dt \to 0$, the above stochastic equation leads to the so-called Langevin equation [6],
which is a stochastic differential equation
$$\frac{dx_i}{dt} = f_i(x) + \eta_i(t), \tag{18}$$
with many applications in synchronisation theory [14] and which comprises multivariate Gaussian
white noise ($\langle \eta_i(t) \rangle = 0$, $\langle \eta_i(t)\, \eta_j(t') \rangle = Q_{ji}\, \delta(t - t')$, with $Q = \{Q_{ij}\}$ positive definite).
2 From Theoretical Chemistry to Complex Networks
In the last few decades, the challenge of tackling complexity in real-world systems has required
the development of a multidisciplinary field, an "umbrella" encompassing and combining
techniques from different disciplines, spanning from mathematics to physics, from the social sciences to
economics [9]. It is in this broader context of complexity science that network theory developed,
mainly drawing tools from graph theory, statistical mechanics and probability. Accordingly, it
is not a surprise that master equation approaches are widely used on networks [11, 10, 7, 9].
2.1 Growing Exponential Networks
Rigorously, a network is a physical representation of reality having the topological properties
of a finite graph $G = (V, E)$, which is formally a finite set $V$ of $N \in \mathbb{N}$ vertices (or nodes)
connected by a set $E$ of edges. For instance, the Internet can be represented as a network of
routers connected by wires, according to a given topology [9, 8]. In the following, however, we
use network as a synonym of graph. The network connectivity is contained in the adjacency
matrix $A = \{A_{ij}\}_{i,j=1,...,N}$. For an undirected, simple, loopless network, $A_{ji} = A_{ij} = 1$ if nodes
$i$ and $j$ are connected, while $A_{ji} = A_{ij} = 0$ otherwise. Let us define the degree $k_i$ of a node $i$
as the number of its connected first neighbors; in formulas, $k_i = \sum_j A_{ij}$.
Let us discuss a simple model of a network having the above properties and growing in size
over time [9, 11]. The model starts with an initial configuration having one node only, at time
step $t_0 = 1$. At each subsequent discrete time step, a new vertex is added to the network and
is connected uniformly at random to one older vertex. Therefore, at time step $t$ the network
consists of $t$ nodes and $t - 1$ links. The degree distribution of such a network can be retrieved
by a master equation approach [11]. Let $b$ be the birth (i.e. insertion) time of a node inside
the network. Within the network dynamics, each node $i$ transitions from a degree equal to 1 at
$b = b_i$ to a degree $k = k_i$ at time $t > b$. Let $P(k, t|1, b)$ be the conditional probability of such a
transition. Then the following discrete-time master equation holds
$$P(k, t + 1|1, b) = \frac{1}{t}\, P(k - 1, t|1, b) + \left(1 - \frac{1}{t}\right) P(k, t|1, b), \tag{19}$$
with the initial condition $P(k, 1|1, b) = \delta_{k,1}$. On the right-hand side, the above equation takes
into account only "gain" terms for the (left-hand side) fraction of nodes having degree $k$ at
time $t + 1$. The first term represents the probability for a vertex, originally with degree $k - 1$,
to receive a connection, with uniform probability $1/t$, from the new node added at time $t$. On
the other hand, the second term is the probability for a node, already of degree $k$, to keep its
degree fixed by not receiving any new connection (with probability $1 - 1/t$). Once again, a
frequentist probability interpretation quantifies the probability of finding a node with degree $k$
at time $t$, i.e. the degree distribution $p(k, t)$, as the fraction of nodes having degree $k$ at time $t$
[9]. In formulas, from the addition law we obtain:
$$p(k, t) = \frac{1}{t} \sum_{b=1}^{t} P(k, t|1, b). \tag{20}$$
Performing a sum over the different birth times $b$ on both sides of (19) and using the degree
distribution definition, we obtain the following:
$$(t + 1)\, p(k, t + 1) - t\, p(k, t) = p(k - 1, t) - p(k, t) + \delta_{k,1}. \tag{21}$$
Compatibly with the linear growth of the number of both nodes and links, it is possible to show
[11] that the number of nodes with degree $k$ grows linearly in time, factorising as $\sim t\, p(k)$ in the
$t \gg 1$ regime, so that the degree distribution becomes stationary. This implies that, in the same
regime, the above master equation reduces to a recurrence equation for the stationary degree
distribution $p(k)$ [11]:
$$2\, p(k) - p(k - 1) = \delta_{k,1} \quad \Rightarrow \quad p(k) = 2^{-k}. \tag{22}$$
Therefore, this simple model of a growing network displays an exponentially decreasing probability
of finding a node of degree $k$, which is rather unrealistic for many real-world networks [9]. A
similar master equation approach can be adopted also in less trivial scenarios. For instance,
in a model equivalent to the above one, except for the presence of a preferential attachment
procedure [9] (where each existing node at time step $t$ receives the connection from the newly inserted
vertex with a probability proportional to its degree, i.e. $k_i/2t$), it is possible to write down a
master equation similar to (19):
$$P(k, t + 1|1, b) = \frac{k - 1}{2t}\, P(k - 1, t|1, b) + \left(1 - \frac{k}{2t}\right) P(k, t|1, b), \tag{23}$$
with the same initial conditions and with a power-law degree distribution $p(k) \propto k^{-3}$. This
model with preferential attachment leads to a scale-free degree distribution (i.e. $p(k)$ satisfies
the functional equation $f(ak) = b f(k)$ with $a, b \in \mathbb{R}$) and is also known as the Barabási-Albert
model [9, 10, 11]. According to several empirical findings, many real-world networks
seem to display the scale-free property, even though such a finding is currently debated [8, 9].
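A direct simulation corroborates the stationary prediction $p(k) = 2^{-k}$ for the uniform-attachment model. The sketch below uses an illustrative network size of our own choosing; replacing the uniform choice with a degree-proportional one would reproduce the preferential attachment case.

```python
import random
from collections import Counter

random.seed(3)

def grow_uniform(t_max):
    """Grow a tree: at each time step a new node attaches to one
    uniformly random older node, as in the exponential model."""
    degree = [0]   # node 0 is the initial one-node configuration
    for new in range(1, t_max):
        old = random.randrange(new)   # uniform choice among existing nodes
        degree.append(1)              # the new node enters with degree 1
        degree[old] += 1              # the chosen old node gains one link
    return degree

t_max = 200_000
deg = grow_uniform(t_max)
counts = Counter(deg)
p = {k: counts[k] / t_max for k in sorted(counts)}

for k in range(1, 6):
    print(f"p({k}) = {p[k]:.4f}   stationary prediction 2^-{k} = {2.0 ** -k:.4f}")
```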
2.2 Random Walks on Networks
On a given network, it is possible to perform a discrete-time random walk, with a walker
transitioning from node $i$, at time step $t$, to one of $i$'s neighbors $j$, at time step $t + 1$, uniformly
at random. Similarly to the chemical reactions scenario, we want to derive a master equation
regulating this stochastic process. Given the uniform node hopping, the transition probability
$P(j, t + 1|i, t) = P_{i \to j} = P_{ij}$ is evidently stationary and equal to $A_{ij}/k_i$, with $1/k_i$ acting as a
normalisation constant. Notice that, as for molecular collisions, this random walk is also a
Markovian (i.e. memoryless) process [10]. For finite heterogeneous networks⁷ with arbitrary
degree distribution $p(k)$, it is possible to derive a master equation for the more interesting
transition probability $P(i, t|i_0, 0)$ of a random walker starting at node $i_0$ at time $t_0 = 0$ and
visiting node $i$ at time step $t$ as [7]:
$$\frac{\partial}{\partial t} P(i, t|i_0, 0) = \sum_{j=1}^{N} P_{ji}\, P(j, t|i_0, 0) - \left(\sum_{j=1}^{N} P_{ij}\right) P(i, t|i_0, 0). \tag{24}$$
In the above master equation, the first term on the right-hand side quantifies the "gain" probability
of moving to node $i$ from every other network node in one hop (the presence of $A_{ji}$ in $P_{ji}$
restricting the sum to $i$'s neighbors), including also the initial condition, while the negative
term constitutes the total probability of moving out of $i$, to any of its first neighbors (always
including the initial condition). It can be rigorously proven [10, 7] that, for such a "regular"
random walk on finite networks, without sinks or sources, the $P(i, t|i_0, 0)$s identify an ergodic,
irreducible and aperiodic Markov chain [4], which admits a unique stationary probability vector
$P_{sta} = (P^{(1)}_{sta}, ..., P^{(N)}_{sta})$, whose generic component $P^{(i)}_{sta}$ quantifies the probability for a walker to
be at node $i$, as
$$P^{(i)}_{sta} = \frac{k_i}{\langle k \rangle} \frac{1}{N}, \tag{25}$$
where $\langle k \rangle$ is the average node degree in the given network. This analytic finding implies that
a random walker visits nodes with higher degree "more often". Furthermore, the
hopping probabilities $P_{ij}$ can also be used to compute the mean first passage time $\langle T_i \rangle$ of
node $i$, i.e. the average number of time steps for a walker to leave $i$ and come back to it
7 The definition of a "heterogeneous" network is rather delicate, since it expresses the presence of different statistical properties between nodes. In [7] the authors referred to "heterogeneous" networks as graphs having nodes with different degrees, with nodes of the same degree also having the same statistical properties. However, this assumption neglects other higher-order correlations (e.g. assortativity) arising at mesoscopic levels and found in real-world networks [9]. Even if additional techniques for a better quantification of heterogeneity have been proposed [8], we still use "heterogeneous" in the degree-based sense of [7].
[10]. For heterogeneous finite networks, it is possible to show [7] that $\langle T_i \rangle$ is actually equal to
$1/P^{(i)}_{sta}$, which is intuitively compatible with the uniformity of the random walk. Additionally,
for sufficiently homogeneous scale-free networks, with degree distribution $p(k) \propto k^{-\gamma}$, the
probability $P(t_1)$ of performing a first passage in $t_1$ time steps follows a power law, i.e. $P(t_1) \propto
t_1^{-(2-\gamma)}$, with $\gamma$ typically lying between 2 and 3 for most technological and social real-world
networks [9].
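Both the stationary law (25) and the relation $\langle T_i \rangle = 1/P^{(i)}_{sta}$ can be verified on a small example. The sketch below uses an illustrative 5-node undirected network (our own choice, not taken from [7]) and simply power-iterates the hopping probabilities $P_{ij} = A_{ij}/k_i$ until the occupation probabilities converge:

```python
# Illustrative undirected, connected, non-bipartite 5-node network
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (0, 4)]
N = 5

# adjacency matrix and node degrees
A = [[0] * N for _ in range(N)]
for i, j in edges:
    A[i][j] = A[j][i] = 1
k = [sum(row) for row in A]

# power-iterate the gain term: P(i, t+1) = sum_j P_ji * P(j, t),
# with hopping probabilities P_ji = A_ji / k_j
prob = [1.0 if i == 0 else 0.0 for i in range(N)]   # walker starts at node 0
for _ in range(2000):
    prob = [sum(prob[j] * A[j][i] / k[j] for j in range(N)) for i in range(N)]

two_m = sum(k)   # equals N * <k>
for i in range(N):
    predicted = k[i] / two_m   # Eq. (25): k_i / (N <k>)
    print(f"node {i}: P_sta = {prob[i]:.4f}, k_i/(N<k>) = {predicted:.4f}, "
          f"<T_i> ~ {1 / prob[i]:.2f}")
```

The iteration converges because the chain is irreducible and aperiodic (the network is connected and contains a triangle), matching the ergodicity conditions stated above.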
3 Conclusions
In the first section of this project we derived the chemical master equation for chemical reactions
in a "well-stirred" gas, close to ideality and at thermal equilibrium. We discussed the physical
meaning of many mathematical findings of our approach, underlining also the importance of the
molecular chaos hypothesis for the system to be efficiently described as a Markov process. In
the same section, we performed and discussed numerical experiments on a simple bimolecular
reaction network, according to Gillespie's stochastic simulation algorithm. We also briefly linked
the master equation formalism to the Chapman-Kolmogorov equation and to the Langevin one.
In the second section, instead, we reviewed the master equation formalism in two different
areas of network theory, namely growing network models and random walks on networks,
discussing closed-form analytical results for quantities such as the degree distribution or the mean
first passage time.
All in all, our review presents the master equation, together with its simulation techniques,
as a powerful mathematical tool, with solid physical foundations, that can be successfully applied
to a variety of systems and models, inside the fascinating panorama of complexity science.
References
[1] D. T. Gillespie, A rigorous derivation of the chemical master equation, Physica A, 188 (1992).
[2] M. Delbrück, Statistical fluctuations in autocatalytic reactions, The Journal of Chemical Physics, 8 (1940).
[3] D. T. Gillespie, Markov Processes: An Introduction for Physical Scientists, Academic Press (1992).
[4] G. Grimmett, D. Stirzaker, Probability and Random Processes (3rd Edition), Oxford University Press (2001).
[5] L. D. Landau, E. M. Lifshitz, Statistical Physics Vol. 5 (3rd Edition), Butterworth-Heinemann (1980).
[6] C. W. Gardiner, Handbook of Stochastic Methods (3rd Edition), Springer (2004).
[7] J. D. Noh and H. Rieger, Random Walks on Complex Networks, Physical Review Letters, 92 (2004).
[8] E. Estrada, Quantifying network heterogeneity, Physical Review E, 82 (2010).
[9] M. E. J. Newman, Networks: An Introduction, Oxford University Press (2010).
[10] A. Barrat, M. Barthelemy and A. Vespignani, Dynamical Processes on Complex Networks, Cambridge University Press (2008).
[11] S. N. Dorogovtsev and J. F. F. Mendes, Evolution of Networks, Advances in Physics, 51 (2002).
[12] R. Erban, S. J. Chapman and P. K. Maini, A Practical Guide to Stochastic Simulations of Reaction-Diffusion Processes, CoRR (2007).
[13] R. Erban and S. J. Chapman, Stochastic modelling of reaction-diffusion processes: algorithms for bimolecular reactions, Physical Biology, 6 (2009).
[14] M. Cencini, F. Cecconi and A. Vulpiani, Chaos: From Simple Models to Complex Systems, World Scientific (2010).