Source: shodhganga.inflibnet.ac.in/bitstream/10603/8444/11/11_chapter 3.pdf
Chapter 3
MARKOV PROCESS
3.0 INTRODUCTION
A Markov process is a stochastic (random) process used in decision problems in which the probability of transition to any future state depends only on the current state and not on the manner in which that state was reached.
Mathematically,
P{X_{n+1} | X_1, X_2, …, X_{n-1}, X_n} = P{X_{n+1} | X_n}
Markov analysis involves studying the present behaviour of a system in order to predict its future behaviour. The concept was introduced by the Russian mathematician Andrey A. Markov, and the general theory of Markov processes was developed by A. N. Kolmogorov, W. Feller and others. Markov processes are a special class of mathematical models used in decision problems associated with dynamic systems. They are widely used, perhaps most notably, as a marketing aid for examining and predicting customer behaviour concerning loyalty to one brand of a given product and switching patterns to other brands of the same product. Other applications that have been found for Markov processes include:
Manpower planning
Monsoon rainfall prediction
Forecast of wind power generation
Assessing the behaviour of stock prices
Scheduling hospital admissions
A simple Markov chain is a discrete random process with the Markovian property. In general, the term Markov chain is used to refer to a Markov process that is discrete with a finite state space. Usually a Markov chain is defined for a discrete set of times (i.e. a discrete-time Markov chain), although some authors use the same terminology where time can take continuous values.
In a Markov process the possible states (the state space) that a system under focus can occupy at any point of time are clearly defined. Since the system changes its state randomly, the next state cannot be predicted with certainty; instead, the statistical properties of the system's future state are forecast. The change of the system from one state to another is called a transition, and the probability associated with this state change is called a transition probability. The state space and the associated transition probabilities characterize the Markov chain.
For instance, the system may be a customer interested in a certain commodity who has the option of selecting it from three available brands, namely A, B and C. If the customer selects brand A, we say the system is in state 1; if the customer selects brand B, the system is in state 2; and if he/she selects brand C, the system is in state 3. Similarly, the system might be the inventory level of a given item: zero inventory could be denoted by state 1, one to ten units of inventory by state 2, and so on. Now the relevant questions one may pose are:
1. If the system is in state 'i' (i = 1,2,3,…) on day one, what is the probability that it is in state 'j' (j = 1,2,3,…) in 'n' steps, i.e. on day 2, day 3, … or day n?
2. After a large number of steps (n → ∞), what is the probability that the system will be in state 's'?
3. If a company currently has a certain share of the market, what
share of the market will it have n steps from now?
Markov processes are capable of answering these and many other
questions relative to dynamic systems.
A simple Markov process is illustrated in the following example.
Let us consider monsoon rainfall over a particular region as a system with two states: one in which rainfall occurs and one in which it does not. During the monsoon, if rainfall occurs today, the probability that there will be rainfall a day later is 0.7, and the probability that there will be no rainfall a day later is 0.3. If there is no rainfall today, the probability that there will be rainfall a day later is 0.6 and the probability that there will be no rainfall a day later is 0.4.
Let state-1 represent rainfall and state-2 represent no rainfall; then the transition probabilities can be represented as given in Table 3.1.
From \ To               Rainfall (State-1)   No rainfall (State-2)
Rainfall (State-1)            0.7                   0.3
No rainfall (State-2)         0.6                   0.4
Table 3.1: Transition probabilities for the monsoon rainfall example
The process is represented in Fig.3.1 by two flow diagrams whose
upward branches represent moving to state-1 and downward
branches represent moving to state-2.
Suppose the monsoon starts in state-1 (rainfall); there is a 0.7 probability that there will be rainfall on the 2nd day. Now consider the state on the 3rd day. The probability of state-1 on the 3rd day is 0.67 (i.e. 0.7×0.7 + 0.3×0.6 = 0.49 + 0.18). The corresponding probability of state-2 on the 3rd day, given that the process started in state-1 on the 1st day, is 0.33 (i.e. 0.7×0.3 + 0.3×0.4 = 0.21 + 0.12). Transition probabilities for subsequent days can be computed in a similar manner.
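The 3rd-day calculation above can be reproduced as a matrix product; a minimal sketch using the Table 3.1 probabilities:

```python
import numpy as np

# Transition matrix of Table 3.1: state-1 = rainfall, state-2 = no rainfall.
P = np.array([[0.7, 0.3],
              [0.6, 0.4]])

# Two steps: entry (1,1) of P @ P is the 3rd-day probability of rainfall
# given rainfall on the 1st day.
P2 = P @ P
print(P2[0, 0])   # 0.49 + 0.18 = 0.67
print(P2[0, 1])   # 0.21 + 0.12 = 0.33
```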
Fig 3.1: Flow diagram representing the Markov process (two tree diagrams, one starting in state-1 and one in state-2, showing the branches from the 1st day through the 3rd day together with their transition probabilities)
Table 3.2 shows that the probability of state-1 on any future day tends towards 2/3, irrespective of the initial state on day-1. This probability is called the steady state probability of being in state-1. Similarly, the steady state probability of state-2 on any future day is 1/3 (i.e. 1 − 2/3).
Day       Probability of state-1 on a future day, given that the process started in
number    STATE-1 on day-1          STATE-2 on day-1
1         1.0                       0.0
2         0.7                       0.6
3         0.67      … (2/3)         0.66      … (2/3)
4         0.667     … (2/3)         0.666     … (2/3)
5         0.6667    … (2/3)         0.6666    … (2/3)
6         0.66667   … (2/3)         0.66666   … (2/3)
Table 3.2: Steady state probabilities for the monsoon rainfall example
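The convergence shown in Table 3.2 can be reproduced by raising P to a high power; a minimal sketch:

```python
import numpy as np

# Rainfall transition matrix (Table 3.1).
P = np.array([[0.7, 0.3],
              [0.6, 0.4]])

# Raise P to a high power: every row converges to the steady state
# distribution, irrespective of the starting state.
Pn = np.linalg.matrix_power(P, 20)
print(Pn[0])   # approximately [2/3, 1/3]
print(Pn[1])   # approximately [2/3, 1/3]
```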
These steady state probabilities are of much significance in several decision processes. For example, if we are deciding whether to hire a machine with two states, working (state-1) and break down (state-2), the steady state probability of state-2 indicates the fraction of time the machine would be in the break down condition in the long run, and this fraction of break down time would be the key factor in deciding whether to hire the machine or not.
3.1 CHARACTERISTICS OF MARKOV PROCESS:
Important Characteristics of a first order Markov process or the simple
Markov Process are:
1. The probabilities of going to each of the states of the system depend only on the current state and not on the manner in which the current state was reached. This means that the next state of the system depends on the current state and is completely independent of the previous states of the system. This property is popularly known as the property of "no memory", which simply means that there is no need to remember how the process reached a particular state at a particular period.
2. There are initial conditions that take on less and less importance as the process operates, eventually 'washing out' when the process reaches the steady state. Accordingly, the term steady state probability is defined as the long run probability of being in a particular state, after the process has been operating long enough to wash out the initial conditions.
3. In a Markov process we assume that the process is discrete in state
space and also in time.
4. We also assume that in a simple Markov process the switching behaviour is represented by the transition matrix (the matrix containing the transition probabilities). The conditional probabilities of moving from one state to another, or remaining in the same state, in a single time period are termed transition probabilities.
3.2 MARKOV PROCESS:
A stochastic system is said to follow a Markov process if the
occurrence of a future state depends on the immediately preceding
state only.
Therefore, if t_0 < t_1 < … < t_n represents instants on the time scale, then the set of random variables {X(t_n)} with state space S = {x_0, x_1, …, x_{n-1}, x_n} is said to follow a Markov process provided it satisfies the Markovian property:
P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}, …, X(t_0) = x_0} = P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}}
for all t_0 < t_1 < … < t_n.
If the random process at time t_n is in the state x_n, the future state X(t_{n+1}) of the random process at time t_{n+1} depends only on the present state x_n and not on the past states x_{n-1}, x_{n-2}, …, x_0.
Examples of Markov processes:
A first order differential equation is Markovian.
The probability of rain today depends on the weather conditions that existed over the last two days and not on earlier weather conditions.
3.2.1 First Order Markov Chain (FOMC)
First Order Markov process is based on the following three
assumptions:
(i) The set of possible outcomes is finite
(ii) The probability of next outcome (state) depends only on the
immediately preceding outcome.
(iii) The transition probabilities are constant over time.
A simple Markov Process is discrete and constant over time. A
system is said to be discrete in time if it is examined at regular
intervals, e.g. daily, monthly or yearly.
Mathematically, the probability
P_{x(n-1),x(n)} = P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}}
is called the FOMC transition probability; it represents the probability of moving from one state to another future state.
The second order Markov process assumes that the probability of the next outcome (state) may depend on the two previous outcomes. Likewise, an l-th order Markov process assumes that the probability of the next state can be calculated by taking into account the past 'l' states.
Second order Markov process is discussed in detail in Sec 3.10.
3.3 CLASSIFICATION OF MARKOV PROCESS
A Markov process can be classified into four types, based on the nature of the values taken by 't' and X_i.
(i) A continuous random process satisfying the Markov property is called a Continuous parameter Markov process, as 't' and X_i are both continuous.
(ii) A continuous random sequence satisfying the Markov property is called a Discrete parameter Markov process, as 't' is discrete and X_i is continuous.
(iii) A discrete random process satisfying the Markov property is called a Continuous parameter Markov chain, as 't' is continuous and X_i is discrete.
(iv) A discrete random sequence satisfying the Markov property is called a Discrete parameter Markov chain, as 't' and X_i are both discrete.
Classification of states of a Markov process:
The different states of a Markov process are listed below:
(i) State 'j' is said to be accessible from state 'i' if P_ij^(n) > 0 for some n ≥ 0.
(ii) If two states 'i' and 'j' are accessible from each other, they are said to communicate. In general,
(a) any state 'i' communicates with itself for all i ≥ 0 (from the definition of communicating states);
(b) if state 'i' communicates with state 'k' and state 'k' communicates with state 'j', then state 'i' communicates with state 'j' (from the Chapman-Kolmogorov equations).
(iii) Two states that communicate are in the same class. Two classes of states are either identical or disjoint.
(iv) The Markov chain is irreducible if there is only one class, i.e. if all states communicate with each other: P_ij^(n) > 0 for some n and for all 'i' and 'j'.
(v) A state is said to be an absorbing state if no other state is accessible from it; that is, for an absorbing state 'i', P_ii = 1. A Markov chain is absorbing if
(a) it has at least one absorbing state, and
(b) it is possible to go from every non-absorbing state to at least one absorbing state.
(vi)State ‘i’ is said to be recurrent, if starting in state ‘i’, the process
will ever reenter state ‘i’ with probability iP = 1. If state ‘i’ is
59
recurrent, then starting from state ‘i’, the process will reenter state
‘i’ again and again.
So a state ‘i’ is recurrent if
P1n
(n)ii
If state ‘i’ communicates with state ‘j’ and if state ‘i’ is recurrent
then state ‘j’ is also recurrent.
(n)iii nPμ is the expected or mean recurrence time of state ‘i’.
The state ‘i’ is positive recurrent if, starting in ‘i’, the expected time
until the process returns to state ‘i’ is finite. For a process, a
recurrent state need not be positive recurrent. Such recurrent
states are called null recurrent. But in a finite state Markov chain,
all recurrent states are positive recurrent.
(vii) State ‘i’ is said to be transient if starting from state ‘i’, the
process will ever reenter state ‘i’ with probability <1. If a state ‘i’ is
transient, then whenever the process enters state ‘i’, there is also
probability 1- Pi , that it will never enter again into the same state
‘i’ again. So if the state ‘i’ is transient then,
P1n
(n)ii .
(viii) A state ‘i’ is called an essential state if it communicates with
every state it leads to. Assume 0P (n)jk then if 0P (m)
kj , ‘j’ is an
essential state.
(ix) A state ‘i’ said to have period ‘d’ if 0P (n)ii whenever ‘n’ is not
divisible by ‘d’ i.e. for all values of ‘n’ other than n,2n,3n,…and ‘d’
is the largest integer with this property. If a state ‘i’ is periodic
60
with period ‘d’, if the GCD of the set { n 1n0,P (n)ii } is d>1. A
state is said to be aperiodic if its period is 1.
(x) Positive recurrent, aperiodic states are called ergodic. For an ergodic Markov chain it is possible to pass from one state to another in a finite number of steps, regardless of the present state. A special case of the ergodic Markov chain is the regular Markov chain. A regular Markov chain is defined as a chain whose transition matrix P has, for some power of P, only strictly positive probability values. All regular chains are ergodic chains. The easiest way to check whether an ergodic chain is regular is to square the transition matrix repeatedly until all zeros are removed.
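The squaring test described above can be sketched as follows (the function name is_regular and the example matrices are illustrative):

```python
import numpy as np

def is_regular(P, max_squarings=10):
    """Repeatedly square the transition matrix; the chain is regular
    if some power of P has only strictly positive entries."""
    Q = P.copy()
    for _ in range(max_squarings):
        if np.all(Q > 0):
            return True
        Q = Q @ Q
    return False

# The zero entry disappears after one squaring: regular (hence ergodic).
P = np.array([[0.0, 1.0],
              [0.5, 0.5]])
print(is_regular(P))   # True

# Periodic chain alternating between the two states: never regular.
Q = np.array([[0.0, 1.0],
              [1.0, 0.0]])
print(is_regular(Q))   # False
```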
3.4 TRANSITION PROBABILITY AND TRANSITION PROBABILITY
MATRIX
The probability of moving from one state to another, or remaining in the same state, during a single period is called the transition probability.
P_{x(n-1),x(n)} = P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}}
is called the FOMC transition probability. It represents the conditional probability that the system is in state x_n at time t_n, given that it was previously in state x_{n-1} at time t_{n-1}.
The transition probabilities can be arranged in a matrix of size m × m, and such a matrix is called the one step Transition Probability Matrix (TPM), represented as below:

P = | P_11  P_12  …  P_1m |
    | P_21  P_22  …  P_2m |
    | …                   |
    | P_m1  P_m2  …  P_mm |

where 'm' represents the number of states.
The matrix P is a square matrix whose elements are non-negative and the sum of the elements in each row is unity, i.e. Σ_{j=1}^m P_ij = 1 for i = 1 to m, and 0 ≤ P_ij ≤ 1.
The initial estimates of P_ij can be computed as
P_ij = N_ij / N_i , (i, j = 1 to m)
where N_ij is the number of items, units or observations in the raw data sample that transitioned from state i to state j, and N_i is the number of observations in state i.
In general, any matrix P whose elements are non-negative and whose elements sum to unity in each row (or each column) is called a Transition Matrix or Stochastic Matrix. Since the number of rows equals the number of columns, the matrix gives a complete description of the Markov process.
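The estimate P_ij = N_ij / N_i can be sketched as follows (the function estimate_tpm and the sample state sequence are illustrative):

```python
import numpy as np

def estimate_tpm(sequence, m):
    """Estimate P_ij = N_ij / N_i from an observed state sequence
    (states coded 0 .. m-1); N_ij counts one-step transitions i -> j."""
    N = np.zeros((m, m))
    for i, j in zip(sequence[:-1], sequence[1:]):
        N[i, j] += 1
    row_sums = N.sum(axis=1, keepdims=True)
    return N / np.where(row_sums == 0, 1, row_sums)  # guard empty rows

# Illustrative observed sequence over two states.
seq = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0]
P_hat = estimate_tpm(seq, 2)
print(P_hat)   # each row sums to 1
```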
3.5 n- STEP TRANSITION PROBABILITIES
Consider a system which is in state i at time t = 0; the n-step transition probability P_ij^(n) is the probability that the system moves to state j at time t = n (these time periods are sometimes counted as numbers of steps). The n-step transition probabilities can be represented in matrix form as:

P^(n) = | P_11^(n)  P_12^(n)  P_13^(n)  …  P_1m^(n) |
        | P_21^(n)  P_22^(n)  P_23^(n)  …  P_2m^(n) |
        | …                                         |
        | P_m1^(n)  P_m2^(n)  P_m3^(n)  …  P_mm^(n) |

In this matrix, P_21^(n) represents the probability that a system currently in state 2 will move to state 1 after 'n' steps.
3.5.1 State Probabilities
Let P_i(n) represent the probability that the system occupies state 'i' at time n, from which it may move to state 'j' in one transition. One should note that the transition probability P_ij is independent of time, whereas the absolute (or state) probability P_i(n) depends on time. If the number of possible states is 'm', then
Σ_{i=1}^m P_i(n) = 1  and  Σ_{j=1}^m P_ij = 1 for all 'i'.
If all the state probabilities are given at time t = n, then the state probabilities at time t = n+1 can be calculated by the equation:
P_j(n+1) = Σ_{i=1}^m P_i(n) P_ij ;  n = 0, 1, 2, …
i.e. the probability of being in state 'j' at time t = n+1 is the sum, over all values of 'i', of the probability of being in state 'i' at time t = n times the probability of moving from state 'i' to state 'j'.
To make the procedure more understandable, the equations for each state probability at time t = n+1 can be written as:
P_1(n+1) = P_1(n)P_11 + P_2(n)P_21 + … + P_m(n)P_m1
P_2(n+1) = P_1(n)P_12 + P_2(n)P_22 + … + P_m(n)P_m2
…
P_m(n+1) = P_1(n)P_1m + P_2(n)P_2m + … + P_m(n)P_mm
This set of equations can be arranged in matrix form as

[P_1(n+1)  P_2(n+1)  …  P_m(n+1)] = [P_1(n)  P_2(n)  …  P_m(n)] | P_11  P_12  …  P_1m |
                                                                | P_21  P_22  …  P_2m |
                                                                | …                   |
                                                                | P_m1  P_m2  …  P_mm |

In compact form it can be written as
X_{n+1} = X_n P    ---(3.1)
where X_{n+1} is the row vector of state probabilities at time t = n+1, X_n is the row vector of state probabilities at time t = n, and P is the matrix of transition probabilities.
If the state probabilities at time t = 0 are known, say X_0, then the state probabilities at any time can be determined by solving the matrix equation (3.1) repeatedly, i.e.
X_1 = X_0 P
X_2 = X_1 P = X_0 P^2
X_3 = X_2 P = X_0 P^3
…
X_n = X_{n-1} P = X_0 P^n
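Equation (3.1) applied repeatedly can be sketched as follows, using the rainfall matrix of Table 3.1 with the process starting in state-1:

```python
import numpy as np

P = np.array([[0.7, 0.3],
              [0.6, 0.4]])

# Row vector of state probabilities at t = 0 (process starts in state-1).
X = np.array([1.0, 0.0])

# X_n = X_{n-1} P = X_0 P^n  (equation 3.1 applied repeatedly)
for day in range(2, 5):
    X = X @ P
    print(f"day {day}: {X}")
```

The printed vectors reproduce the day-2 and day-3 columns of Table 3.2 for a start in state-1.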
3.6 CHAPMAN-KOLMOGOROV EQUATION
The Chapman-Kolmogorov equation provides a method to compute the n-step transition probabilities. It embodies the fact that the n-step probabilities, for any n, can be calculated from the 1-step probabilities.
The equation can be represented as
P_ij^(n+m) = Σ_{k=1}^M P_ik^(n) P_kj^(m)
for all n, m ≥ 0 and for all i, j = 1, 2, …, M (the state space).
P_ik^(n) P_kj^(m) represents the probability that a process beginning in state 'i' will go to state 'j' in (n+m) transitions (steps) through a path taking it into state 'k' at the n-th transition. So, summing over all intermediate states 'k' gives the probability that the process will be in state 'j' after (n+m) transitions.
Proof:
We know that P_ij^(n+m) = P{X_{n+m} = j | X_0 = i}.
State 'j' can be reached from state 'i' through an intermediate state 'k'. So
P_ij^(n+m) = Σ_{k=1}^M P{X_{n+m} = j, X_n = k | X_0 = i}
           = Σ_{k=1}^M P{X_{n+m} = j | X_n = k, X_0 = i} · P{X_n = k | X_0 = i}
           = Σ_{k=1}^M P{X_{n+m} = j | X_n = k} · P{X_n = k | X_0 = i}
           = Σ_{k=1}^M P{X_m = j | X_0 = k} · P{X_n = k | X_0 = i}
           = Σ_{k=1}^M P_kj^(m) P_ik^(n)
k
As mentioned at the start of this section, the Chapman-Kolmogorov equations embody the fact that the n-step probabilities, for any n, can be calculated from the 1-step probabilities. This can be seen by repeated application of the Chapman-Kolmogorov equations:
P^(1) = P  (since P^(1) is just P, the 1-step Transition Probability Matrix)
P^(2) = P^(1+1) = P^(1) P^(1) = P · P = P^2
P^(3) = P^(2+1) = P^(2) P^(1) = P^2 · P = P^3
Now by induction,
P^(n) = P^((n-1)+1) = P^(n-1) P^(1) = P^{n-1} · P = P^n
Hence, the n-step transition probabilities may be determined by multiplying the matrix P by itself 'n' times.
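The identity P^(n+m) = P^(n) P^(m) can be verified numerically for a small matrix; a minimal sketch:

```python
import numpy as np

# Rainfall transition matrix (Table 3.1).
P = np.array([[0.7, 0.3],
              [0.6, 0.4]])

n, m = 2, 3
lhs = np.linalg.matrix_power(P, n + m)                              # P^(n+m)
rhs = np.linalg.matrix_power(P, n) @ np.linalg.matrix_power(P, m)   # sum over intermediate states k
print(np.allclose(lhs, rhs))   # True
```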
3.7 STOCHASTIC MATRIX:
A special case of non-negative matrices is the stochastic matrix (Transition Matrix), for which A = (a_ij) is constrained by the following two conditions:
0 ≤ a_ij ≤ 1  and  Σ_j a_ij = 1
The stochastic matrix is regular if some power of A has only strictly positive entries.
A matrix is called doubly stochastic if, with 0 ≤ a_ij ≤ 1 for all i, j,
Σ_j a_ij = Σ_i a_ij = 1
Here a_ij is the transition probability of going from state i to state j.
3.8 EIGEN VALUES AND EIGEN VECTORS OF MATRIX
Let A_{n×n} X_{n×1} = λ X_{n×1} hold for some scalar λ.
Then λ is called a characteristic root (or eigen value or latent root) of A, and X is called a characteristic vector (or eigen vector) of A, or the right characteristic vector of A.
If A is non-symmetric (non-Hermitian) then A′ ≠ A, and if X′A = λX′ holds for some scalar λ, then X is called the left characteristic vector corresponding to λ.
If A is a symmetric matrix, then the left and right characteristic vectors are the same.
Since AX = λX,
(A − λI)X = 0 is a system of n linear, homogeneous equations.
|A − λI| is called the characteristic polynomial of A.
If A = (a_ij)_{n×n}, then

|A − λI| = | a_11−λ   a_12     …  a_1n   |
           | a_21     a_22−λ   …  a_2n   |
           | …                           |
           | a_n1     a_n2     …  a_nn−λ |

         = σ_0 λ^n − σ_1 λ^{n-1} + σ_2 λ^{n-2} − … + (−1)^n σ_n

where σ_0 = 1, σ_n = |A|, and σ_i = sum of the principal minors of order i in A, i = 1, 2, …, n−1.
The characteristic equation can be written as
σ_0 λ^n − σ_1 λ^{n-1} + σ_2 λ^{n-2} − … + (−1)^n σ_n = 0
i.e.  Σ_{r=0}^n (−1)^r σ_r λ^{n-r} = 0
Theorem 3.1
If λ_1, λ_2, …, λ_n are the characteristic roots of A, then A − KI will have the characteristic roots λ_1 − K, λ_2 − K, …, λ_n − K, where K is a scalar.
Proof: We have
0 = |A − λI| = Σ_{r=0}^n (−1)^r σ_r λ^{n-r}
             = σ_0 λ^n − σ_1 λ^{n-1} + σ_2 λ^{n-2} − … + (−1)^n σ_n
On factorization,
(λ − λ_1)(λ − λ_2) … (λ − λ_n) = 0
where λ_1, λ_2, …, λ_n are the roots of |A − λI| = 0, i.e. the characteristic roots of A.
The characteristic roots of A − KI are given by
|(A − KI) − λI| = 0
i.e.  Σ_{r=0}^n (−1)^r σ_r (λ + K)^{n-r} = 0
i.e.  σ_0 (λ+K)^n − σ_1 (λ+K)^{n-1} + σ_2 (λ+K)^{n-2} − … + (−1)^n σ_n = 0
(replacing λ by λ + K in Σ_{r=0}^n (−1)^r σ_r λ^{n-r} = |A − λI|)
On factorization,
(λ + K − λ_1)(λ + K − λ_2) … (λ + K − λ_n) = 0
(λ − (λ_1 − K))(λ − (λ_2 − K)) … (λ − (λ_n − K)) = 0
This shows that (A − KI) has the characteristic roots (λ_1 − K), (λ_2 − K), …, (λ_n − K) if A has the characteristic roots λ_1, λ_2, …, λ_n.
Theorem 3.2
If λ is a characteristic root of A, then λ^n is a characteristic root of A^n.
Proof:
AX = λX
Pre-multiplying both sides by A, we get
A^2 X = A(λX) = λ(AX) = λ(λX) = λ^2 X
A^3 X = A(λ^2 X) = λ^2 (AX) = λ^2 (λX) = λ^3 X
A^4 X = λ^4 X
…
A^n X = λ^n X, for any positive integral index n.
Hence λ^n is a characteristic root of A^n.
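Theorem 3.2 can be checked numerically; the matrix A below is an illustrative example with eigen values 1 and 3:

```python
import numpy as np

# Illustrative symmetric matrix with eigenvalues 1 and 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals = np.linalg.eigvals(A)                                # roots of |A - lambda*I| = 0
eigvals_A3 = np.linalg.eigvals(np.linalg.matrix_power(A, 3))  # eigenvalues of A^3

print(np.sort(eigvals))      # approximately [1, 3]
print(np.sort(eigvals_A3))   # approximately [1, 27], i.e. [1**3, 3**3]
```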
3.9 SPECTRAL DECOMPOSITION OF A MATRIX:
It is a method of matrix decomposition based on eigen values, and hence it is also called Eigen Decomposition. It is applicable to square matrices. In it, the matrix is decomposed into a product of three matrices, of which only one is diagonal. Thus the decomposition of a matrix into three matrices composed of its eigen values and eigen vectors is known as spectral decomposition.
An n × n matrix P always has 'n' eigen values, which can be arranged in more than one way to form a diagonal matrix D of size n × n and a corresponding matrix V of non-zero columns which satisfies the eigen value equation PV = VD.
Let P have eigen values λ_1, λ_2, λ_3, …, λ_k and corresponding eigen vectors X_1, X_2, X_3, …, X_k, where
X_1 = (x_11, x_12, …, x_1k)′ , X_2 = (x_21, x_22, …, x_2k)′ , … , X_k = (x_k1, x_k2, …, x_kk)′
Let the matrix of eigen vectors be

V = [X_1  X_2  …  X_k] = | x_11  x_21  …  x_k1 |
                         | x_12  x_22  …  x_k2 |
                         | …                   |
                         | x_1k  x_2k  …  x_kk |

and the matrix of eigen values be the diagonal matrix

D = | λ_1  0    …  0   |
    | 0    λ_2  …  0   |
    | …                |
    | 0    0    …  λ_k |
Then
PV = P[X_1  X_2  …  X_k]
   = [PX_1  PX_2  …  PX_k]          (as we know PX = λX)
   = [λ_1 X_1  λ_2 X_2  …  λ_k X_k]

   = | λ_1 x_11  λ_2 x_21  …  λ_k x_k1 |
     | λ_1 x_12  λ_2 x_22  …  λ_k x_k2 |
     | …                               |
     | λ_1 x_1k  λ_2 x_2k  …  λ_k x_kk |

   = VD
This results in the decomposition of P as
P = V D V^{-1}
For a square matrix P, this type of decomposition is always possible as long as V is invertible.
Further, by squaring both sides of the above equation,
P^2 = V D V^{-1} V D V^{-1}
    = V D D V^{-1}
    = V D^2 V^{-1}
Mathematically, the spectral decomposition can therefore be represented as
P^i = V D^i V^{-1}, where i = 1 to n
n-step Transition Probability Matrix (TPM) of First order Markov
Chain using Spectral Decomposition Method:
If P represents the four state TPM then the higher order
transition probabilities are obtained by the following procedure.
i. Determine the eigen values of the Transition Probability Matrix P by solving |P − λI| = 0.
ii. If all the eigen values λ_1, λ_2, λ_3, …, λ_k are distinct, obtain the k column vectors X_1, X_2, X_3, …, X_k corresponding to the eigen values by solving PV = VD, or PX = λX where X ≠ 0.
iii. Denote these column vectors (eigen vectors) by the matrix V = (X_1, X_2, X_3, …, X_k) and obtain V^{-1}.
iv. Compute D, the diagonal matrix formed from the eigen values of P:

D = | λ_1  0    …  0   |
    | 0    λ_2  …  0   |
    | …                |
    | 0    0    …  λ_k |

v. The higher order Transition Probability Matrix (TPM) of the four state Markov chain can then be computed using the equation P^i = V D^i V^{-1}, where i = 1 to n.
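Steps i-v can be sketched with numpy (using the two state rainfall TPM of Table 3.1 as an illustrative P):

```python
import numpy as np

# Rainfall TPM used as an illustrative P (Table 3.1).
P = np.array([[0.7, 0.3],
              [0.6, 0.4]])

# Steps i-ii: eigen values of P and eigen vectors (columns of V).
eigvals, V = np.linalg.eig(P)

# Step iv: diagonal matrix D of the eigen values.
D = np.diag(eigvals)

# Step v: P^n = V D^n V^-1, compared against direct matrix powering.
n = 5
Pn_spectral = V @ np.linalg.matrix_power(D, n) @ np.linalg.inv(V)
Pn_direct = np.linalg.matrix_power(P, n)
print(np.allclose(Pn_spectral, Pn_direct))   # True
```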
3.10 HIGHER ORDER MARKOV CHAIN
A Second Order Markov Chain (SOMC) assumes that the probability of the next state depends on the two immediately preceding states. Second order time dependence means that the transition probabilities depend on the states at lags of both one and two time periods, so the transition probabilities for a second order Markov chain require three subscripts. We then have a Second Order Markov Chain (SOMC) whose transition probabilities are
P_{x(n-2),x(n-1),x(n)} = P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}, X(t_{n-2}) = x_{n-2}}
Similarly, for a third order Markov chain the notation requires four subscripts on the transition counts and transition probabilities.
As the model under study consists of 4 states, the SOMC Transition Probability Matrix (TPM) (Andre Berchtold et al 2002) can be formulated as
P_TPM =
              I      II     III    IV
(I, I)      P_111  P_112  P_113  P_114
(I, II)     P_121  P_122  P_123  P_124
(I, III)    P_131  P_132  P_133  P_134
(I, IV)     P_141  P_142  P_143  P_144
(II, I)     P_211  P_212  P_213  P_214
  ⋮
(IV, IV)    P_441  P_442  P_443  P_444

where each row is indexed by the ordered pair of the two preceding states and each column by the next state.
The size of the TPM will be m^l × m, and the number of transition probabilities to be calculated in each TPM will be m^(l+1). Table 3.3 gives the number of transition probabilities for various combinations of order (l) and number of states (m).
Number of     Order (l) of     Size of     No. of transition
states (m)    Markov chain     the TPM     probabilities
2             1                2×2         4
2             2                4×2         8
2             3                8×2         16
2             4                16×2        32
3             1                3×3         9
3             2                9×3         27
3             3                27×3        81
3             4                81×3        243
4             1                4×4         16
4             2                16×4        64
4             3                64×4        256
4             4                256×4       1024
Table 3.3: Number of transition probabilities for various combinations of order (l) and states (m)
3.11 Parsimonious modeling of Higher order Markov Chain using
Weighted Moving Transition Probabilities
From Table 3.3 it is evident that for a higher order Markov chain with a larger state space, the size of the TPM becomes very large. Estimating several hundreds of parameters is very time consuming, and it also becomes difficult to analyse the results and draw conclusions or make decisions.
Moreover, as the maintenance cost is proportional to the number of items falling in each state at a given time, the state probabilities are considerably more important than the intermediate states through which the current state was reached. Therefore the focus on developing a model that yields a better forecast of the proportion of items in each state at a given time period is justified.
To address this, a parsimonious model, the Weighted Moving Transition Probabilities (WMTP) method, is introduced to approximate higher order Markov chains. It makes the number of parameters to be computed in each TPM far smaller: the size of the TPM is only m × m. Each element of the TPM is the probability of the occurrence of a particular event at time t given the probabilities of the immediately previous l (= order of the Markov chain) time periods, i.e. at times (t−l) to (t−1). The effect of each lag is taken into account by assigning weights.
For an l-th order Markov chain, in general, the probabilities can be estimated as
P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}, …, X(t_{n-l}) = x_{n-l}} = Σ_{g=1}^{l} δ_g (P_ij)_g
subject to Σ_{g=1}^{l} δ_g = 1 and δ_g ≥ 0.
δ_g is the weight parameter corresponding to the lag g, and P_ij are the transition probabilities of the corresponding m × m TPM. The method is based on the premise that the most recent value is the most relevant for estimating the future value; consequently the weights decrease as older lags are considered.
The Weighted Moving Transition Probabilities (WMTP) model (for l = 2, a second order Markov process) can be written as:
P_ij(n) = P{X(t_n) = x_n | X(t_{n-1}) = x_{n-1}, X(t_{n-2}) = x_{n-2}}
        = Σ_{g=1}^{2} δ_g (P_ij)_g
        = δ_{n-1} (P_ij)_{n-1} + δ_{n-2} (P_ij)_{n-2}
where δ_{n-1} + δ_{n-2} = 1 and δ ≥ 0.
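The WMTP combination can be sketched numerically; the lag-wise TPMs and the weights below are illustrative assumptions, not values from the text:

```python
import numpy as np

# Hypothetical one-step TPMs estimated at lag 1 and lag 2 (illustrative).
P_lag1 = np.array([[0.7, 0.3],
                   [0.6, 0.4]])
P_lag2 = np.array([[0.8, 0.2],
                   [0.5, 0.5]])

# Weights: the more recent lag gets the larger weight; they sum to 1.
delta_1, delta_2 = 0.7, 0.3

# WMTP approximation of a second order chain (l = 2): a weighted sum of
# the lag-wise m x m TPMs, so only m x m parameters are needed.
P_wmtp = delta_1 * P_lag1 + delta_2 * P_lag2
print(P_wmtp)
print(P_wmtp.sum(axis=1))   # each row still sums to 1
```

Because the weights sum to 1 and each lag-wise matrix is stochastic, the combined matrix remains a valid m × m stochastic matrix.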
As shown in Fig. 3.2, a real Second Order Markov Chain (SOMC) carries the combined influence of the lags, whereas the WMTP model analogue carries the independent influence of each lag on the present.
Fig. 3.2: Pictorial representation of the real SOMC and the WMTP modeling of SOMC
Summary: The Markov process, a stochastic mathematical model, is used in decision-making in the face of a great amount of uncertainty. A Markov process is based on the premise that the future state of the system depends on the current state but not on how the present state was reached. The general characteristics of the Markov process, the different states of a Markov process, the order of a Markov process, transition probabilities etc. are discussed in detail. The stochastic matrix, eigen values and eigen vectors of a matrix, and the spectral decomposition of a matrix are also discussed.
After defining higher order multi state Markov processes and the difficulties in estimating the corresponding transition probabilities, the reasons for introducing the Weighted Moving Transition Probabilities (WMTP) technique are discussed in detail. The WMTP technique is a parsimonious model that approximates higher order Markov chains.