
    MSCI 431 (Stochastic Models and Methods): Summary of Lectures, Winter 2014

    by Hossein Abouee Mehrizi, University of Waterloo

    Remark: These are point-form summaries of the lectures for MSCI 431. There is no guarantee of completeness or accuracy, and therefore they should not be regarded as a substitute for attending course lectures. Lectures are based on the book Introduction to Probability Models by Sheldon M. Ross.

    1. Introduction to Probability Theory:

    Lecture 1: Probability

    Experiment: any process whose outcome is not known in advance.

    Flip a coin

    Roll a die

    Sample space: set of all possible outcomes.

    Example. Flipping a coin: S = {H, T}

    Example. Rolling a die: S = {1, 2, ..., 6}

    Example. Flipping two coins: S = {(H, H), (H, T), (T, H), (T, T)}

    Example. Rolling two dice: S = {(m, n) : 1 <= m, n <= 6}

    Event: subset of the sample space.

    Example. Flipping a coin: E = {H}, the event that a head appears.

    Example. Rolling a die: E = {2, 4, 6}, the event that an even number appears.

    Union of events E and F (E ∪ F): all outcomes that are either in E, or in F, or in both.

    Example. Flipping a coin: E = {H}, F = {T}. Then, E ∪ F = {H, T}.

    Intersection of events E and F (E ∩ F, also written EF): all outcomes that are in both E and F.

    Example. Rolling a die: E = {1, 3, 5}, F = {1, 2, 3}. Then, E ∩ F = {1, 3}.

    Consider the events E1, E2, .... Then,

    The union of these events, ∪_{i=1}^∞ Ei, is a new event that includes all outcomes that are in En for at least one value of n = 1, 2, ....

    The intersection of these events, ∩_{i=1}^∞ Ei, is a new event that includes all outcomes that are in En for all n = 1, 2, ....


    Complement of E (E^c): outcomes that are in the sample space S and are not in E.

    Probability: Consider an experiment with sample space S. For an event E ⊆ S, we assume that P(E) is defined and satisfies

    (i) 0 <= P(E) <= 1,

    (ii) P(S) = 1,

    (iii) for a sequence of mutually exclusive events E1, E2, ... (En ∩ Em = ∅ for n ≠ m), P(∪_{i=1}^∞ Ei) = ∑_{i=1}^∞ P(Ei).

    Example. Flipping a fair coin: P({H}) = P({T}) = 1/2.

    Example. Rolling a die: P({1, 3, 5}) = P({1}) + P({3}) + P({5}) = 1/2.

    Two basic properties of probability:

    P(E^c) = 1 - P(E).

    P(E ∪ F) = P(E) + P(F) - P(E ∩ F).

    Example. Flipping two coins:

    E = {(H, H), (H, T)}: head appears on the first coin

    F = {(H, H), (T, H)}: head appears on the second coin

    P(E ∪ F) = 1/2 + 1/2 - 1/4 = 3/4.

    Conditional Probability

    Example. Rolling two dice: Suppose we observe that 4 has appeared on the first die. What is the probability that the sum is 6?

    E: event that 4 appears on the first die. F: event that the sum of the two dice is 6.

    P(F|E): the probability that the sum of the two dice is 6 given that 4 has appeared on the first die.

    The sample space reduces to {(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6)} given that 4 has appeared on the first die. Therefore, P(F|E) = 1/6.

    In general, P(E|F) = P(E ∩ F) / P(F).

    Example. Suppose cards are numbered 1 to 10 and they are placed in a hat and one of them is drawn. We are told that the number on the drawn card is at least 5. What is the probability that the number on the drawn card is 10?

    E: event that the number on the drawn card is 10

    F: event that the number on the drawn card is at least 5

    P(E|F) = P(E ∩ F) / P(F) = (1/10) / (6/10) = 1/6.
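    For small, equally likely sample spaces, conditional probabilities like this one can be checked by direct enumeration. A minimal Python sketch (the helper cond_prob is ours, not from the course):

        from fractions import Fraction

        def cond_prob(event, given, sample_space):
            """P(event | given) computed by counting equally likely outcomes."""
            f = [s for s in sample_space if given(s)]
            ef = [s for s in f if event(s)]
            return Fraction(len(ef), len(f))

        cards = range(1, 11)                   # cards numbered 1 to 10
        print(cond_prob(lambda c: c == 10,     # E: the card is 10
                        lambda c: c >= 5,      # F: the card is at least 5
                        cards))                # -> 1/6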


    Lecture 2: Conditional Probability (continued ...)

    Example. A family has two children. What is the probability that both are boys given that at least one of them is a boy?

    E: event that both are boys.

    F: event that at least one of them is a boy.

    P(E|F) = P(E ∩ F) / P(F) = (1/4) / (3/4) = 1/3.

    Example. Suppose that an urn contains 7 black balls and 5 white balls. We draw two balls from the urn without replacement. What is the probability that both balls are black?

    E: event that the second ball is black.

    F: event that the first ball is black.

    P(E ∩ F) = P(F)P(E|F) = (7/12)(6/11) = 42/132.

    Example. Bev can take a course in computers or chemistry. If Bev takes the computer course, then she will receive an A grade with probability 1/2. If she takes the chemistry course, then she will receive an A grade with probability 1/3. Bev decides to base her decision on the flip of a fair coin. What is the probability that Bev will get an A in chemistry?

    E: event that she receives an A.

    F: event that she takes chemistry.

    P(E ∩ F) = P(F)P(E|F) = (1/2)(1/3) = 1/6.

    Example. Suppose that each of three men at a party throws his hat into the center of the room. Each man then randomly selects a hat. What is the probability that none of the three men selects his own hat?

    Remark. P(E1 ∪ E2 ∪ E3) = P(E1) + P(E2) + P(E3) - P(E1 ∩ E2) - P(E1 ∩ E3) - P(E2 ∩ E3) + P(E1 ∩ E2 ∩ E3).

    Ei: event that the ith man selects his own hat.

    P(E1 ∪ E2 ∪ E3): probability that at least one of them selects his own hat.

    1 - P(E1 ∪ E2 ∪ E3): probability that none of them selects his own hat.

    P(Ei) = 1/3, i = 1, 2, 3,

    since each man is equally likely to select any of the three hats.

    P(E1 ∩ E2) = P(E1)P(E2|E1) = (1/3)(1/2) = 1/6,

    since given that the first man has selected his own hat, there remain two hats that the second man may select.


    P(Ei ∩ Ej) = (1/3)(1/2) = 1/6, i ≠ j.

    P(E1 ∩ E2 ∩ E3) = P(E1 ∩ E2)P(E3|E1 ∩ E2) = (1/6)(1) = 1/6.

    P(E3|E1 ∩ E2) = 1: given that the first two men get their own hats, it follows that the third man must also get his own hat.

    P(E1 ∪ E2 ∪ E3) = P(E1) + P(E2) + P(E3) - P(E1 ∩ E2) - P(E1 ∩ E3) - P(E2 ∩ E3) + P(E1 ∩ E2 ∩ E3) = 1/3 + 1/3 + 1/3 - 1/6 - 1/6 - 1/6 + 1/6 = 2/3.

    1 - P(E1 ∪ E2 ∪ E3) = 1 - 2/3 = 1/3.
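    The answer 1/3 can also be estimated by simulation; a short Python sketch (illustrative only, not part of the course notes):

        import random

        def no_match_prob(n_men=3, trials=100_000):
            """Estimate the probability that no man gets his own hat back."""
            count = 0
            for _ in range(trials):
                hats = list(range(n_men))
                random.shuffle(hats)                  # random assignment of hats to men
                if all(hats[i] != i for i in range(n_men)):
                    count += 1
            return count / trials

        print(no_match_prob())    # close to 1/3 for three men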

    Independent Events

    Two events E and F are independent if P(E ∩ F) = P(E)P(F).

    This implies P(E|F) = P(E) and P(F|E) = P(F).

    Example. Suppose we toss two fair dice.

    E1: event that the sum of the two dice is 6.

    F: event that the first die is 4.

    P(E1 ∩ F) = P({(4, 2)}) = 1/36.

    P(E1)P(F) = (5/36)(1/6) = 5/216.

    P(E1 ∩ F) ≠ P(E1)P(F)!


    Lecture 3: Independent Events

    Two events E and F are independent if P(E ∩ F) = P(E)P(F).

    This implies P(E|F) = P(E) and P(F|E) = P(F).

    Example. Suppose we toss two fair dice.

    E1: event that the sum of the two dice is 6.

    F: event that the first die is 4.

    P(E1 ∩ F) = P({(4, 2)}) = 1/36.

    P(E1)P(F) = (5/36)(1/6) = 5/216.

    P(E1 ∩ F) ≠ P(E1)P(F)!

    E1 and F are not independent, since our chance of getting a total of six depends on the outcome of the first die.

    E2: event that the sum of the dice is 7.

    P(E2 ∩ F) = P({(4, 3)}) = 1/36.

    P(E2)P(F) = (1/6)(1/6) = 1/36.

    P(E2 ∩ F) = P(E2)P(F)!  So E2 and F are independent.

    Bayes Formula

    Let E and F be events. Then,

    E = (E ∩ F) ∪ (E ∩ F^c),

    since a point is in E if it is either in both E and F, or in E and not in F.

    P(E) = P(E ∩ F) + P(E ∩ F^c) = P(F)P(E|F) + P(F^c)P(E|F^c).

    Example. Consider 2 urns. The first contains 2 white and 7 black balls, and the second contains 5 white and 6 black balls. We flip a fair coin and then draw a ball from the first urn or the second urn depending on whether the outcome was heads or tails. What is the probability that the outcome of the toss was heads given that a white ball was selected?

    W: event that a white ball is drawn. H: event that the coin comes up heads.

    P(H|W): probability that the outcome of the coin is heads given that a white ball was selected.

    P(H|W) = P(H ∩ W) / P(W).

    P(H ∩ W) = P(H)P(W|H) = (1/2)(2/9).


    P(W) = P(H)P(W|H) + P(H^c)P(W|H^c) = (1/2)(2/9) + (1/2)(5/11).

    P(H|W) = P(H ∩ W) / P(W) = (1/2)(2/9) / [(1/2)(2/9) + (1/2)(5/11)] = 22/67.
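    The arithmetic in such Bayes-formula computations is easy to verify with exact fractions; a short sketch (ours, not course code):

        from fractions import Fraction

        p_h    = Fraction(1, 2)       # P(H): the coin lands heads
        p_w_h  = Fraction(2, 9)       # P(W|H): white ball from the first urn
        p_w_hc = Fraction(5, 11)      # P(W|H^c): white ball from the second urn
        p_w = p_h * p_w_h + (1 - p_h) * p_w_hc
        print(p_h * p_w_h / p_w)      # -> 22/67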

    Example. In answering a question on a five-choice test, a student knows the correct answer with probability 1/2 and guesses with probability 1/2. Assume that a student who guesses at the answer will be correct with probability 1/5. What is the probability that the student knew the correct answer given that she answered it correctly?

    C: event that the student answers it correctly

    K: event that the student knows the answer

    P(K|C) = P(K ∩ C) / P(C) = P(K)P(C|K) / [P(K)P(C|K) + P(K^c)P(C|K^c)] = (1/2)(1) / [(1/2)(1) + (1/2)(1/5)] = 5/6.

    2. Random Variables:

    In performing an experiment we are often interested in some function of the outcome as opposed to the outcome itself.

    Example. Rolling two dice: we are interested in the sum of the two dice.

    Random variables: real-valued functions defined on the sample space.

    Example. Flipping two fair coins: Let Y denote the number of heads appearing.

    Y: a random variable taking on one of the values 0, 1, 2.

    P(Y = 0) = P({(T, T)}) = 1/4.

    P(Y = 1) = P({(T, H), (H, T)}) = 2/4.

    P(Y = 2) = P({(H, H)}) = 1/4.

    P(Y = 0) + P(Y = 1) + P(Y = 2) = 1.

    Example. Suppose that we flip a coin having probability p of coming up heads until the first head appears. Let N denote the number of flips required, and assume that the outcomes of successive flips are independent.

    N: a random variable taking on one of the values 1, 2, 3, ...

    P(N = 1) = P({H}) = p.

    P(N = 2) = P({(T, H)}) = (1 - p)p.

    P(N = 3) = P({(T, T, H)}) = (1 - p)^2 p.

    P(N = n) = P({(T, T, ..., T, H)}) = (1 - p)^(n-1) p, n >= 1.

    The cumulative distribution function (CDF), or distribution function, F(.) of the random variable X is defined for any real number b, -∞ < b < ∞, by F(b) = P(X <= b).


    F(b) denotes the probability that the random variable X takes on a value less than or equal to b.

    Example. Flipping two fair coins: Let Y denote the number of heads.

    Y: a random variable taking on one of the values 0, 1, 2.

    F(0) = P(Y <= 0) = P(Y = 0) = 1/4.

    F(1) = P(Y <= 1) = P(Y = 0) + P(Y = 1) = 3/4.

    F(2) = P(Y <= 2) = P(Y = 0) + P(Y = 1) + P(Y = 2) = 1.

    Discrete Random Variables:

    A random variable that can take on at most a countable number of possible values is said to be discrete.

    P(a) = P(X = a): probability mass function of X.

    Example. Flipping two fair coins: Let Y denote the number of heads.

    Probability mass function of Y: P(0) = 1/4, P(1) = 2/4, P(2) = 1/4.

    CDF of Y: F(0) = 1/4, F(1) = 3/4, F(2) = 1.


    Lecture 4: Discrete Random Variables

    Discrete random variables are often classified according to their probability mass functions.

    The Bernoulli Random Variable

    Consider a trial, or an experiment, whose outcome can be classified as either a success or a failure.

    Let X equal 1 if the outcome is a success and 0 if the outcome is a failure.

    Let 0 <= p <= 1 denote the probability that the trial is a success.

    The probability mass function of X is

    P(0) = P(X = 0) = 1 - p,

    P(1) = P(X = 1) = p.

    X is a Bernoulli random variable with parameter p.

    Example. Flipping a fair coin: consider heads as a success and tails as a failure.

    P(0) = 1/2, P(1) = 1/2.

    The Binomial Random Variable

    Suppose that n independent trials, each of which results in a success with probability p and in a failure with probability 1 - p, are to be performed.

    Let X represent the number of successes that occur in the n trials.

    X is a binomial random variable with parameters (n, p).

    Probability mass function of a binomial random variable with parameters (n, p):

    P(i) = C(n, i) p^i (1 - p)^(n-i) = [n! / ((n - i)! i!)] p^i (1 - p)^(n-i), i = 0, 1, ..., n.

    Example. Suppose that each patient in a hospital is discharged on day t with probability p. What is the distribution of the number of discharged patients on day t given that there are n patients in the hospital on that day?

    Consider discharge of a patient as a success.

    X: number of successes (discharged patients).

    There are n trials and each of them is a success with probability p.

    Therefore, the distribution of the number of discharged patients is binomial with parameters (n, p): P(i) = C(n, i) p^i (1 - p)^(n-i) = [n! / ((n - i)! i!)] p^i (1 - p)^(n-i), i = 0, 1, ..., n.

    Example. It is known that any item produced by a certain machine will be defective with probability 0.1, independently of any other item. What is the probability that in a sample of three items, at most one item will be defective?

    X: number of defective items in the sample.


    X is a binomial random variable with parameters (3, 0.1).

    Probability of at most one defective item in the sample: P(X = 0) + P(X = 1).

    P(X = 0) + P(X = 1) = C(3, 0) (0.1)^0 (1 - 0.1)^3 + C(3, 1) (0.1)^1 (1 - 0.1)^2 = 0.972.
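    For reference, the binomial mass function is easy to evaluate directly in Python; the sketch below (function name ours) reproduces the 0.972:

        from math import comb

        def binom_pmf(i, n, p):
            """P(X = i) for a binomial random variable with parameters (n, p)."""
            return comb(n, i) * p**i * (1 - p)**(n - i)

        # defective-items example: n = 3, p = 0.1, P(X <= 1)
        print(binom_pmf(0, 3, 0.1) + binom_pmf(1, 3, 0.1))   # -> 0.972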

    The Geometric Random Variable

    Suppose that independent trials, each having probability p of being a success, are performed until a success occurs.

    Let X denote the number of trials required until the first success.

    X is a geometric random variable with parameter p.

    Probability mass function of a geometric random variable with parameter p: P(n) = (1 - p)^(n-1) p, n = 1, 2, ....

    Example. Suppose that we flip a coin having probability p of coming up heads until the first head appears. Let N denote the number of flips required, and assume that the outcomes of successive flips are independent.

    N: a geometric random variable with parameter p.

    Therefore, P(N = n) = (1 - p)^(n-1) p, n >= 1.

    The Poisson Random Variable

    A random variable X taking on one of the values 0, 1, 2, ... is said to be a Poisson random variable with parameter λ if, for some λ > 0,

    P(i) = P(X = i) = e^(-λ) λ^i / i!, i = 0, 1, ....

    Example. Pedestrian deaths (true story): In January 2010 there were 7 pedestrian deaths in Toronto (14 in the GTA). On average there are 2.66 pedestrian deaths per month in Toronto.

    Suppose that the distribution of the number of pedestrian deaths in Toronto is Poisson with parameter 2.66. What is the probability of having 7 pedestrian deaths in a month in Toronto?

    P(X = 7) = e^(-2.66) (2.66)^7 / 7! = 0.013077, or 1.3%.

    Probability of having 7 or more pedestrian deaths in a month in Toronto:

    P(X >= 7) = ∑_{j=7}^∞ e^(-2.66) (2.66)^j / j! = 0.019.

    A Poisson random variable may be used to approximate a binomial random variable when

    the binomial parameter n is large,

    the binomial parameter p is small.

    The Poisson random variable that approximates a binomial random variable with parameters (n, p) has parameter λ = np.

    Example. Number of fires in Toronto per day.

    This is driven by a very large number n of buildings.


    Each building has a very small probability p of having a fire.

    Then, the number of fires per day can be approximated by a Poisson random variable with parameter λ = np.

    Suppose the number of buildings is n = 100000 and each has probability 2.5/100000 of having a fire. Let N denote the number of fires. Then λ = np = 2.5 and

    P(N = k) = e^(-2.5) (2.5)^k / k!.
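    A quick way to see the quality of the approximation is to compare the exact binomial probabilities with the Poisson ones for the numbers above; a small sketch (illustrative only):

        from math import comb, exp, factorial

        n, p = 100_000, 2.5 / 100_000
        lam = n * p                                   # = 2.5
        for k in range(5):
            exact  = comb(n, k) * p**k * (1 - p)**(n - k)   # binomial(n, p)
            approx = exp(-lam) * lam**k / factorial(k)      # Poisson(np)
            print(k, round(exact, 6), round(approx, 6))     # the two columns nearly agree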

    Continuous Random Variables:

    X is a continuous random variable if there exists a nonnegative function f(x), defined for all real x ∈ (-∞, ∞), having the property that for any set B of real numbers

    P(X ∈ B) = ∫_B f(x) dx.

    f(x): probability density function.

    P(X ∈ (-∞, ∞)) = ∫_{-∞}^{∞} f(x) dx = 1.

    Let B = [a, b]. Then, P(a <= X <= b) = ∫_a^b f(x) dx.

    P(X = a) = ∫_a^a f(x) dx = 0.

    F(a) = P(X ∈ (-∞, a]) = ∫_{-∞}^a f(x) dx.


    Lecture 5: Several Important Continuous Random Variables

    The Uniform Random Variable

    A random variable is said to be uniformly distributed over the interval (0, 1) if its probability density function (pdf) is

    f(x) = 1 for 0 < x < 1, and f(x) = 0 otherwise.

    More generally, X is uniformly distributed over (α, β) if f(x) = 1/(β - α) for α < x < β, and f(x) = 0 otherwise.

    Example. If X is uniformly distributed over (0, 10), calculate the probability that X < 7 and the probability that 1 < X < 6.


    The Exponential Random Variable

    A continuous random variable whose probability density function is given, for some λ > 0, by

    f(x) = λ e^(-λx) for x >= 0, and f(x) = 0 for x < 0,

    is said to be an exponential random variable with parameter λ.

    Expectation of a Random Variable: Discrete Case

    If X is a discrete random variable with probability mass function P(x), the expected value of X is defined by

    E[X] = ∑_{x:P(x)>0} x P(x).

    The expected value of X is a weighted average of the possible values that X can take on.

    Example. Find E[X] where X is the outcome when we roll a fair die.

    E[X] = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 7/2.

    Example. Calculate E[X] when X is a Bernoulli random variable with parameter p.

    E[X] = 0(1 - p) + 1(p) = p.

    Example. Calculate E[X] when X is a binomial random variable with parameters (n, p).

    E[X] = ∑_{i=0}^n i P(i) = ∑_{i=0}^n i C(n, i) p^i (1 - p)^(n-i) = ∑_{i=0}^n i [n! / ((n - i)! i!)] p^i (1 - p)^(n-i) = np.

    Example. Calculate E[X] when X is a geometric random variable with parameter p.

    E[X] = ∑_{i=1}^∞ i P(i) = ∑_{i=1}^∞ i (1 - p)^(i-1) p = p ∑_{i=1}^∞ i (1 - p)^(i-1) = 1/p.


    Example. Calculate E[X] when X is a Poisson random variable with parameter λ.

    E[X] = ∑_{i=0}^∞ i P(i) = ∑_{i=0}^∞ i e^(-λ) λ^i / i! = ∑_{i=1}^∞ e^(-λ) λ^i / (i - 1)! = λ ∑_{i=1}^∞ e^(-λ) λ^(i-1) / (i - 1)! = λ.


    Lecture 6: Discrete Case (continued ...)

    Example. Suppose that teams A and B are playing a series of games. Team A wins each game independently with probability 2/3 and Team B wins each game independently with probability 1/3. The winner of the series is the first team to win 2 games. Find the expected number of games that are played.

    X: number of games played.

    P(X = 2) = P(X = 2, A wins the first 2) + P(X = 2, B wins the first 2) = (2/3)^2 + (1/3)^2 = 5/9.

    P(X = 3) = P(X = 3, A wins the series) + P(X = 3, B wins the series) = C(2, 1)(2/3)(1/3)(2/3) + C(2, 1)(1/3)(2/3)(1/3) = 12/27.

    E[X] = 2 P(X = 2) + 3 P(X = 3) = 66/27.

    Continuous Case

    Consider a continuous random variable X with probability density function f(x). Then, the expected value of X is defined by

    E[X] = ∫_{-∞}^{∞} x f(x) dx.

    Example. Calculate E[X] when X is a random variable uniformly distributed over (α, β).

    E[X] = ∫_α^β x (1/(β - α)) dx = (β^2 - α^2) / (2(β - α)) = (β - α)(β + α) / (2(β - α)) = (α + β)/2.

    Example. Calculate E[X] when X is an exponential random variable with parameter λ.

    E[X] = ∫_0^∞ x (λ e^(-λx)) dx.

    Integrating by parts (u = x, dv = λ e^(-λx) dx) yields

    E[X] = ∫_0^∞ x (λ e^(-λx)) dx = [-x e^(-λx)]_0^∞ + ∫_0^∞ e^(-λx) dx = 0 - (1/λ)[e^(-λx)]_0^∞ = 1/λ.
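    Both expectations above are easy to check by simulation; a minimal sketch (parameter values chosen only for illustration):

        import random

        n = 200_000
        alpha, beta, lam = 2.0, 10.0, 0.5
        u_mean = sum(random.uniform(alpha, beta) for _ in range(n)) / n
        e_mean = sum(random.expovariate(lam) for _ in range(n)) / n
        print(u_mean, (alpha + beta) / 2)   # both near 6.0
        print(e_mean, 1 / lam)              # both near 2.0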

    Expectation of a Function of a Random Variable:

    Suppose we are interested in a function of X, say g(X).

    If X is a discrete random variable with probability mass function P(x), then for any real-valued function g(x),

    E[g(X)] = ∑_{x:P(x)>0} g(x) P(x).

    Example. Suppose X has the probability mass function P(0) = 0.2, P(1) = 0.5, P(2) = 0.3. Calculate E[X^2].

    E[X^2] = (0)^2 (0.2) + (1)^2 (0.5) + (2)^2 (0.3) = 1.7.

    If X is a continuous random variable with probability density function f(x), then for any real-valued function g(x),

    E[g(X)] = ∫_{-∞}^{∞} g(x) f(x) dx.

    Example. The dollar amount of damage involved in a car accident is an exponential random variable with expected value 1000. The insurance company pays the whole damage if it is more than 400 and 0 otherwise. What is the expected value that the company pays per accident?

    Define g(X) as

    g(X) = 0 for 0 < X <= 400, and g(X) = X for X > 400.

    Then the expected payment per accident is E[g(X)] = ∫_{400}^∞ x (1/1000) e^(-x/1000) dx.
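    One way to evaluate this integral is sketched below (assuming the exponential rate λ = 1/1000 implied by the mean of 1000); this is an illustration, not the notes' own solution:

        import random
        from math import exp

        lam = 1 / 1000                      # exponential rate; mean damage = 1000
        # closed form: integral_{400}^{inf} x*lam*e^(-lam*x) dx = (400 + 1/lam)*e^(-400*lam)
        closed_form = (400 + 1 / lam) * exp(-400 * lam)
        # simulation of the same expectation
        n = 200_000
        sim = sum((x if x > 400 else 0.0)
                  for x in (random.expovariate(lam) for _ in range(n))) / n
        print(closed_form, sim)             # both are about 938.4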


    Lecture 7: Expectation of a Function of a Random Variable ...

    Remark. If a and b are constants, then

    E[aX + b] = aE[X] + b.

    Remark. Var(X) = E[(X - E[X])^2]: the variance of X measures the expected squared deviation of X from its expected value.

    Var(X) = E[X^2] - (E[X])^2.

    Example. Calculate Var(X) when X is the outcome of rolling a fair die.

    E[X^2] = 1^2(1/6) + 2^2(1/6) + 3^2(1/6) + 4^2(1/6) + 5^2(1/6) + 6^2(1/6) = 91/6.

    E[X] = 7/2 (obtained in Lecture 5).

    Var(X) = E[X^2] - (E[X])^2 = 91/6 - (7/2)^2 = 35/12.

    Joint Distribution Functions:

    If X and Y are discrete random variables, the joint probability mass function of X and Y is defined by

    P(x, y) = P(X = x, Y = y).

    The probability mass function of X can be obtained from P(x, y) by

    P_X(x) = ∑_{y:P(x,y)>0} P(x, y).

    The probability mass function of Y can be obtained from P(x, y) by

    P_Y(y) = ∑_{x:P(x,y)>0} P(x, y).

    Example. Suppose X and Y are discrete random variables with joint probability mass function P(x, y),

    P(1, 1) = 1/4, P(1, 2) = 1/8, P(1, 3) = 1/16, P(1, 4) = 1/16,

    P(2, 1) = 1/16, P(2, 2) = 1/16, P(2, 3) = 1/4, P(2, 4) = 1/8.

    What is the probability that Y = 3?

    P_Y(3) = P(Y = 3, X = 1) + P(Y = 3, X = 2) = 1/16 + 1/4 = 5/16.
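    Marginal probabilities like P_Y(3) are easy to compute from a joint pmf stored as a table; a small Python sketch (the data structure is ours):

        from fractions import Fraction as F

        P = {(1, 1): F(1, 4),  (1, 2): F(1, 8),  (1, 3): F(1, 16), (1, 4): F(1, 16),
             (2, 1): F(1, 16), (2, 2): F(1, 16), (2, 3): F(1, 4),  (2, 4): F(1, 8)}

        def marginal_Y(y):
            """P_Y(y) obtained by summing the joint pmf over x."""
            return sum(p for (x, yy), p in P.items() if yy == y)

        print(marginal_Y(3))    # -> 5/16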

    Remark. For discrete random variables X and Y, and a real-valued function g(X, Y),

    E[g(X, Y)] = ∑_y ∑_x g(x, y) P(x, y).


    Remark. If X1, X2, ..., Xn are n independent random variables, then for any n constants a1, a2, ..., an,

    E[a1 X1 + a2 X2 + ... + an Xn] = a1 E[X1] + a2 E[X2] + ... + an E[Xn].

    Example. Calculate the expected sum obtained when three fair dice are rolled.

    Let X denote the sum obtained.

    Let Xi denote the value of the ith die.

    X = X1 + X2 + X3. Thus,

    E[X] = E[X1 + X2 + X3] = E[X1] + E[X2] + E[X3] = (7/2) + (7/2) + (7/2) = 21/2.

    Example. Suppose that there are 20 patients of type A and 15 patients of type B in a hospital. Each patient of type A is discharged today with probability 2/3, independent of other patients. Also, each patient of type B is discharged today with probability 1/3, independent of other patients. What is the expected number of patients who are discharged today?

    Let X denote the total number of discharged patients.

    Xi = 1 if the ith patient of type A is discharged today, and 0 otherwise.

    Yi = 1 if the ith patient of type B is discharged today, and 0 otherwise.

    E[Xi] = (0)(1/3) + (1)(2/3) = 2/3.

    E[Yi] = (0)(2/3) + (1)(1/3) = 1/3.

    X = X1 + X2 + ... + X20 + Y1 + Y2 + ... + Y15.

    Thus, E[X] = E[X1 + X2 + ... + X20 + Y1 + Y2 + ... + Y15] = E[X1] + E[X2] + ... + E[X20] + E[Y1] + E[Y2] + ... + E[Y15] = (20)(2/3) + (15)(1/3) = 55/3.

    Independent Random Variables:

    The random variables X and Y are independent if for all a and b,

    P(X <= a, Y <= b) = P(X <= a)P(Y <= b).

    If X and Y are independent, then for any functions g(X) and h(Y),

    E[g(X)h(Y)] = E[g(X)]E[h(Y)].


    Example. Two fair dice are rolled. What is the expected value of the product of their outcomes?

    Let Z denote the product of their outcomes.

    Let X denote the outcome of the first die.

    Let Y denote the outcome of the second die.

    E[Z] = E[XY] = E[X]E[Y] = (7/2)(7/2) = 49/4.


    Lecture 8: Conditional Probability: Discrete Case

    Recall that for any two events E and F, the conditional probability of E given F is defined, as long as P(F) > 0, by

    P(E|F) = P(E ∩ F) / P(F).

    If X and Y are discrete random variables, the conditional probability mass function of X given that Y = y is defined by

    P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y) = P(x, y) / P_Y(y).

    Example. Suppose that P(x, y), the joint probability mass function of X and Y, is given by

    P(1, 1) = 0.5, P(1, 2) = 0.1, P(2, 1) = 0.1, P(2, 2) = 0.3.

    Calculate the conditional probability mass function of X given that Y = 1.

    P(Y = 1) = P(1, 1) + P(2, 1) = 0.6.

    P(X = 1 | Y = 1) = P(X = 1, Y = 1) / P(Y = 1) = P(1, 1) / P(Y = 1) = 5/6.

    P(X = 2 | Y = 1) = P(X = 2, Y = 1) / P(Y = 1) = P(2, 1) / P(Y = 1) = 1/6.

    The conditional cumulative distribution function of X given Y = y is defined, for all y such that P(Y = y) > 0, by

    F(x|y) = P(X <= x | Y = y) = ∑_{a <= x} P(a|y).

    The conditional expectation of X given that Y = y is defined by

    E[X | Y = y] = ∑_x x P(X = x | Y = y) = ∑_x x P(x|y).

    Example. The joint probability mass function of X and Y, P(x, y), is given by

    P(1, 1) = 1/9, P(2, 1) = 1/3, P(3, 1) = 1/9,

    P(1, 2) = 1/9, P(2, 2) = 0, P(3, 2) = 1/18,

    P(1, 3) = 0, P(2, 3) = 1/6, P(3, 3) = 1/9.


    Calculate E[X | Y = 2].

    E[X | Y = 2] = (1)P(X = 1 | Y = 2) + (2)P(X = 2 | Y = 2) + (3)P(X = 3 | Y = 2)

    = (1) P(X = 1, Y = 2)/P(Y = 2) + (2) P(X = 2, Y = 2)/P(Y = 2) + (3) P(X = 3, Y = 2)/P(Y = 2)

    = (1) (1/9)/(1/6) + (2) (0)/(1/6) + (3) (1/18)/(1/6) = 5/3.
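    The same conditional expectation can be computed mechanically from the joint pmf; a Python sketch (ours, not course code):

        from fractions import Fraction as F

        P = {(1, 1): F(1, 9), (2, 1): F(1, 3), (3, 1): F(1, 9),
             (1, 2): F(1, 9), (2, 2): F(0),    (3, 2): F(1, 18),
             (1, 3): F(0),    (2, 3): F(1, 6), (3, 3): F(1, 9)}

        def cond_expectation_X(y):
            """E[X | Y = y] from the joint pmf."""
            p_y = sum(p for (x, yy), p in P.items() if yy == y)            # P(Y = y)
            return sum(x * p for (x, yy), p in P.items() if yy == y) / p_y

        print(cond_expectation_X(2))    # -> 5/3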

    Remark. If X is independent of Y, then P(X = x | Y = y) = P(X = x).

    Example. If X1 and X2 are independent binomial random variables with respective parameters (5, 0.4) and (10, 0.4), calculate the conditional probability mass function of X1 given that X1 + X2 = 8.

    Remark. If X1 and X2 are independent binomial random variables with parameters (n1, p) and (n2, p), respectively, then X1 + X2 is a binomial random variable with parameters (n1 + n2, p).

    P(X1 = k | X1 + X2 = 8) = P(X1 = k, X1 + X2 = 8) / P(X1 + X2 = 8) = P(X1 = k)P(X2 = 8 - k) / P(X1 + X2 = 8)

    = [C(5, k)(0.4)^k(0.6)^(5-k)] [C(10, 8-k)(0.4)^(8-k)(0.6)^(2+k)] / [C(15, 8)(0.4)^8(0.6)^7]

    = C(5, k) C(10, 8 - k) / C(15, 8), 0 <= k <= 5.

    Example. There are n components. On a rainy day, component i will function with probability pi. On a nonrainy day, component i will function with probability qi, for i = 1, ..., n. It will rain tomorrow with probability α. Calculate the conditional expected number of components that function tomorrow given that it rains.

    Define Xi and Y as

    Xi = 1 if component i functions tomorrow, and 0 otherwise.

    Y = 1 if it rains tomorrow, and 0 otherwise.

    Then,

    E[∑_{i=1}^n Xi | Y = 1] = ∑_{i=1}^n E[Xi | Y = 1] = ∑_{i=1}^n pi.

    The last equality holds because E[Xi | Y = 1] = pi.


    Conditional Probability: Continuous Case

    X and Y are jointly continuous if there exists a function f(x, y), defined for all real x and y, having the property that for all sets A and B of real numbers

    P(X ∈ A, Y ∈ B) = ∫_B ∫_A f(x, y) dx dy.

    The function f(x, y) is called the joint probability density function of X and Y.

    The probability density function of Y can be obtained from f(x, y) by

    P(Y ∈ B) = P(X ∈ (-∞, ∞), Y ∈ B) = ∫_B ∫_{-∞}^{∞} f(x, y) dx dy = ∫_B f_Y(y) dy,

    where f_Y(y) = ∫_{-∞}^{∞} f(x, y) dx.

    If X and Y have a joint density function f(x, y), then the conditional probability density function of X, given that Y = y, is defined for all values of y such that f_Y(y) > 0, by

    f(x|y) = f(x, y) / f_Y(y).


    Lecture 10: Conditional Probability: Continuous Case ...

    X and Y are jointly continuous if there exists a function f(x, y), defined for all real x and y, having the property that for all sets A and B of real numbers

    P(X ∈ A, Y ∈ B) = ∫_B ∫_A f(x, y) dx dy.

    The function f(x, y) is called the joint probability density function of X and Y.

    The probability density function of Y can be obtained from f(x, y) by

    P(Y ∈ B) = P(X ∈ (-∞, ∞), Y ∈ B) = ∫_B ∫_{-∞}^{∞} f(x, y) dx dy = ∫_B f_Y(y) dy,

    where f_Y(y) = ∫_{-∞}^{∞} f(x, y) dx.

    Example. Suppose the joint probability density of X and Y is given by

    f(x, y) = 6xy(2 - x - y) for 0 < x < 1, 0 < y < 1, and f(x, y) = 0 otherwise.


    If X and Y have a joint density function f(x, y), then the conditional probability density function of X, given that Y = y, is defined for all values of y such that f_Y(y) > 0, by

    f(x|y) = f(x, y) / f_Y(y).

    Example. Suppose the joint probability density of X and Y is given by

    f(x, y) = 6xy(2 - x - y) for 0 < x < 1, 0 < y < 1, and f(x, y) = 0 otherwise.


    Computing Expectations by Conditioning: for a discrete random variable Y,

    E[X] = ∑_y E[X | Y = y] P(Y = y).

    Example. Sam will read either one chapter of his probability book or one chapter of his history book. Suppose the number of misprints in a chapter of his probability book is Poisson distributed with mean 2 and the number of misprints in his history chapter is Poisson distributed with mean 5. Assume that Sam is equally likely to choose either book. What is the expected number of misprints that Sam will come across?

    X: the number of misprints.

    Y = 1 if Sam chooses his history book, and Y = 2 if Sam chooses his probability book.

    Then,

    E[X] = E[X | Y = 1]P(Y = 1) + E[X | Y = 2]P(Y = 2) = 5(1/2) + 2(1/2) = 7/2.

    Example. A miner is trapped in a mine containing three doors. The first door leads to a tunnel that takes him to safety after two hours of travel. The second door leads to a tunnel that returns him to the mine after three hours of travel. The third door leads to a tunnel that returns him to the mine after five hours. Assuming that the miner is at all times equally likely to choose any one of the doors, what is the expected length of time until the miner reaches safety?

    X: the time until the miner reaches safety.

    Y: the door he initially chooses.

    Then,

    E[X] = E[X | Y = 1]P(Y = 1) + E[X | Y = 2]P(Y = 2) + E[X | Y = 3]P(Y = 3).

    E[X | Y = 1] = 2. E[X | Y = 2] = 3 + E[X]. E[X | Y = 3] = 5 + E[X].

    Therefore,

    E[X] = (2)(1/3) + (3 + E[X])(1/3) + (5 + E[X])(1/3), which gives E[X] = 10.
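    The answer can also be checked by simulation; a short sketch (ours, not course code):

        import random

        # E[X] = (1/3)(2) + (1/3)(3 + E[X]) + (1/3)(5 + E[X])  =>  E[X] = 10.
        def escape_time():
            """One simulated escape of the miner."""
            t = 0.0
            while True:
                door = random.choice([1, 2, 3])
                if door == 1:
                    return t + 2          # door 1 leads to safety in 2 hours
                t += 3 if door == 2 else 5  # doors 2 and 3 return to the mine

        n = 100_000
        print(sum(escape_time() for _ in range(n)) / n)   # close to 10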

    Computing Probabilities by Conditioning

    Let E denote an arbitrary event and define X as

    X = 1 if E occurs, and 0 otherwise.

    Then, E[X] = (1)P(E) + (0)(1 - P(E)) = P(E).

    Therefore, P(E) = E[X].


    Lecture 11: Computing Probabilities by Conditioning ...

    Let E denote an arbitrary event and Y a discrete random variable. Then, the probability of event E can be obtained by

    P(E) = ∑_y P(E | Y = y) P(Y = y).

    If Y is a continuous random variable, then the probability of event E can be obtained by

    P(E) = ∫_{-∞}^{∞} P(E | Y = y) f_Y(y) dy.

    Example. Data indicate that the number of traffic accidents in Berkeley on a rainy day is a Poisson random variable with mean 9, whereas on a dry day it is a Poisson random variable with mean 3. Let X denote the number of traffic accidents tomorrow. Suppose it will rain tomorrow with probability 0.6. Calculate P(X = 0).

    Let Y = 1 if it rains tomorrow, and 0 otherwise.

    Then,

    P(X = 0) = P(X = 0 | Y = 1)P(Y = 1) + P(X = 0 | Y = 0)P(Y = 0)

    = (0.6)(e^(-9) 9^0 / 0!) + (1 - 0.6)(e^(-3) 3^0 / 0!) = (0.6)e^(-9) + (0.4)e^(-3).
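    Numerically (a sketch, not course code):

        from math import exp, factorial

        def poisson_pmf(k, lam):
            return exp(-lam) * lam**k / factorial(k)

        p_rain = 0.6
        p_x0 = p_rain * poisson_pmf(0, 9) + (1 - p_rain) * poisson_pmf(0, 3)
        print(p_x0)    # = 0.6*e^(-9) + 0.4*e^(-3), about 0.0200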

    Markov Chains

    Stochastic Processes: A discrete-time stochastic process {Xn, n = 0, 1, ...} is a collection of random variables.

    For each n = 0, 1, ..., Xn is a random variable.

    The index n is often interpreted as time and, as a result, we refer to Xn as the state of the process at time n.

    For example,

    Xn might be the total number of customers that have entered a supermarket by time n.

    Xn might be the number of customers in the supermarket at time n.

    A stochastic process is a family of random variables that describes the evolution through time of some process.

    Example (Frog Example). Suppose 1000 lily pads are arranged in a circle. A frog starts at pad number 1000. Each minute, she jumps either straight up, or one pad clockwise, or one pad counter-clockwise, each with probability 1/3.


    P(at pad #1 after 1 step) = 1/3.

    P(at pad #1000 after 1 step) = 1/3.

    P(at pad #999 after 1 step) = 1/3.

    P(at pad #428 after 987 steps) = ?

    Markov Chain: a discrete-time Markov chain is a discrete-time stochastic process specified by

    A state space S: any non-empty finite or countable set.

    In the frog example, the 1000 lily pads.

    Transition probabilities {Pij}, i, j ∈ S: the probability of jumping to j if you start at i,

    that is, the probability that the process will next make a transition into state j when it is in state i.

    Pij >= 0, and ∑_j Pij = 1 for all i.

    In the frog example,

    Pij = 1/3 if i - j = 0, i - j = 1, j - i = 1, i - j = 999, or j - i = 999, and Pij = 0 otherwise.

    Let Xn be the Markov chain's state at time n. Then,

    P(Xn+1 = j | X0 = i0, X1 = i1, ..., Xn-1 = in-1, Xn = i) = P(Xn+1 = j | Xn = i) = Pij.

    This is called the Markov property.

    Example (Gambler's ruin). Consider a gambling game in which on any turn you win $1 with probability 0.4 or lose $1 with probability 0.6. Suppose further that you adopt the rule that you quit playing if your fortune reaches $5.

    Let Xn be the amount of money you have after n plays.

    State space S = {0, 1, 2, 3, 4, 5}.

    Xn has the Markov property since, given the current state Xn, any other information about the past is irrelevant for predicting the next state Xn+1.

    Transition probabilities:

    P(Xn+1 = i + 1 | X0 = i0, ..., Xn = i) = P(Xn+1 = i + 1 | Xn = i) = 0.4 for 0 < i < 5,

    P(Xn+1 = i - 1 | Xn = i) = 0.6 for 0 < i < 5, and states 0 and 5 are absorbing.


    Transition matrix (rows and columns correspond to states 0 through 5):

    P =

        1    0    0    0    0    0
        0.6  0    0.4  0    0    0
        0    0.6  0    0.4  0    0
        0    0    0.6  0    0.4  0
        0    0    0    0.6  0    0.4
        0    0    0    0    0    1
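    In code, the chain is just its transition matrix, and multistep behaviour (covered in the next lectures) comes from matrix powers; a NumPy sketch (ours, not course code):

        import numpy as np

        P = np.array([[1.0, 0,   0,   0,   0,   0  ],
                      [0.6, 0,   0.4, 0,   0,   0  ],
                      [0,   0.6, 0,   0.4, 0,   0  ],
                      [0,   0,   0.6, 0,   0.4, 0  ],
                      [0,   0,   0,   0.6, 0,   0.4],
                      [0,   0,   0,   0,   0,   1.0]])

        P100 = np.linalg.matrix_power(P, 100)   # 100-step transition probabilities
        print(P100[3, 5])   # roughly the probability of eventually reaching $5 from $3
        print(P100[3, 0])   # roughly the probability of going broke starting from $3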

    Example (Inventory Chain). Consider an (s, S) inventory control policy. That is, when the stock on hand at the end of the day falls to s or below, we order enough to bring it back up to S. For simplicity, we assume the order arrives at the beginning of the next day.


    Lecture 12: Markov Chains ...

    Example (Inventory Chain). Consider an (s, S) inventory control policy. That is, when the stock on hand at the end of the day falls to s or below, we order enough to bring it back up to S. For simplicity, we assume the order arrives at the beginning of the next day. Suppose that s = 1 and S = 5. Also, assume that the distribution of the demand on day n + 1 is

    P(Dn+1 = 0) = 0.3, P(Dn+1 = 1) = 0.4, P(Dn+1 = 2) = 0.2, P(Dn+1 = 3) = 0.1.

    Xn: the amount of stock on hand at the end of day n.

    State space S = {0, 1, 2, 3, 4, 5}.

    Transition probabilities:

    P(Xn+1 = 0 | Xn = 0): when the stock on hand is zero at the end of day n, 5 units will be ordered and therefore there will be 5 units available at the beginning of day n + 1. Since the maximum demand on day n + 1 is 3, there will be at least 1 unit available at the end of day n + 1. This means that given Xn = 0, Xn+1 is greater than zero, or

    P(Xn+1 = 0 | Xn = 0) = P(Dn+1 >= 5) = 0.

    P(Xn+1 = 1 | Xn = 0) = P(Dn+1 = 4) = 0 (similar to the above discussion).

    P(Xn+1 = 2 | Xn = 0): when the stock on hand is zero at the end of day n, 5 units will be ordered and therefore there will be 5 units available at the beginning of day n + 1. If the demand on day n + 1 is exactly 3, there will be 2 units available at the end of day n + 1.

    P(Xn+1 = 2 | Xn = 0) = P(Dn+1 = 3) = 0.1.

    Similarly, P(Xn+1 = 3 | Xn = 0) = P(Dn+1 = 2) = 0.2.

    P(Xn+1 = 4 | Xn = 0) = P(Dn+1 = 1) = 0.4.

    P(Xn+1 = 5 | Xn = 0) = P(Dn+1 = 0) = 0.3.

    Similarly, for Xn = 1:

    P(Xn+1 = 0 | Xn = 1) = P(Dn+1 >= 5) = 0.

    P(Xn+1 = 1 | Xn = 1) = P(Dn+1 = 4) = 0.

    P(Xn+1 = 2 | Xn = 1) = P(Dn+1 = 3) = 0.1.

    P(Xn+1 = 3 | Xn = 1) = P(Dn+1 = 2) = 0.2.

    P(Xn+1 = 4 | Xn = 1) = P(Dn+1 = 1) = 0.4.

    P(Xn+1 = 5 | Xn = 1) = P(Dn+1 = 0) = 0.3.


    P(Xn+1 = 0 | Xn = 2): when the stock on hand is 2 at the end of day n, 0 units will be ordered and there will be 2 units available at the beginning of day n + 1. Therefore,

    P(Xn+1 = 0 | Xn = 2) = P(Dn+1 >= 2) = P(Dn+1 = 2) + P(Dn+1 = 3) = 0.3.

    P(Xn+1 = 1 | Xn = 2) = P(Dn+1 = 1) = 0.4.

    P(Xn+1 = 2 | Xn = 2) = P(Dn+1 = 0) = 0.3.

    P(Xn+1 = 3 | Xn = 2) = 0.  P(Xn+1 = 4 | Xn = 2) = 0.  P(Xn+1 = 5 | Xn = 2) = 0.

    Similarly, for Xn = 3:

    P(Xn+1 = 0 | Xn = 3) = P(Dn+1 >= 3) = P(Dn+1 = 3) = 0.1.

    P(Xn+1 = 1 | Xn = 3) = P(Dn+1 = 2) = 0.2.

    P(Xn+1 = 2 | Xn = 3) = P(Dn+1 = 1) = 0.4.

    P(Xn+1 = 3 | Xn = 3) = P(Dn+1 = 0) = 0.3.

    P(Xn+1 = 4 | Xn = 3) = 0.  P(Xn+1 = 5 | Xn = 3) = 0.

    Similarly, for Xn = 4:

    P(Xn+1 = 0 | Xn = 4) = P(Dn+1 >= 4) = 0.

    P(Xn+1 = 1 | Xn = 4) = P(Dn+1 = 3) = 0.1.

    P(Xn+1 = 2 | Xn = 4) = P(Dn+1 = 2) = 0.2.

    P(Xn+1 = 3 | Xn = 4) = P(Dn+1 = 1) = 0.4.

    P(Xn+1 = 4 | Xn = 4) = P(Dn+1 = 0) = 0.3.

    P(Xn+1 = 5 | Xn = 4) = 0.

    Similarly, for Xn = 5:

    P(Xn+1 = 0 | Xn = 5) = P(Dn+1 >= 5) = 0.  P(Xn+1 = 1 | Xn = 5) = P(Dn+1 >= 4) = 0.

    P(Xn+1 = 2 | Xn = 5) = P(Dn+1 = 3) = 0.1.

    P(Xn+1 = 3 | Xn = 5) = P(Dn+1 = 2) = 0.2.

    P(Xn+1 = 4 | Xn = 5) = P(Dn+1 = 1) = 0.4.

    P(Xn+1 = 5 | Xn = 5) = P(Dn+1 = 0) = 0.3.


    Transition matrix (rows and columns correspond to states 0 through 5):

    P =

        0    0    0.1  0.2  0.4  0.3
        0    0    0.1  0.2  0.4  0.3
        0.3  0.4  0.3  0    0    0
        0.1  0.2  0.4  0.3  0    0
        0    0.1  0.2  0.4  0.3  0
        0    0    0.1  0.2  0.4  0.3

    Example (Repair Chain). A machine has three critical parts that are subject to failure, but it can function as long as two of these parts are working. When two are broken, they are replaced and the machine is back in working order the next day. Assume that parts 1, 2, and 3 fail with probabilities 0.01, 0.02, and 0.04, respectively, but no two parts fail on the same day. Formulate the system as a Markov chain.

    Xn: the set of parts that are broken. State space S = {0, 1, 2, 3, 12, 13, 23}.

    Transition probabilities:

    P(Xn+1 = 0 | Xn = 0) = 1 - 0.01 - 0.02 - 0.04 = 0.93.

    P(Xn+1 = 1 | Xn = 0) = 0.01.

    P(Xn+1 = 2 | Xn = 0) = 0.02.

    P(Xn+1 = 3 | Xn = 0) = 0.04.

    If we continue, we get the transition matrix (rows and columns ordered 0, 1, 2, 3, 12, 13, 23):

    P =

        0.93  0.01  0.02  0.04  0     0     0
        0     0.94  0     0     0.02  0.04  0
        0     0     0.95  0     0.01  0     0.04
        0     0     0     0.97  0     0.01  0.02
        1     0     0     0     0     0     0
        1     0     0     0     0     0     0
        1     0     0     0     0     0     0

    Multistep Transition Probabilities:

    P(Xn+1 = j | Xn = i) = Pij gives the probability of going from i to j in one step.

    What is the probability of going from i to j in m > 1 steps?

    P(Xn+m = j | Xn = i) = P^m_ij = ?


    Chapman-Kolmogorov Equations:

    P^(n+m)_ij = ∑_k P^n_ik P^m_kj, for n, m >= 0 and all i, j.

    To go from i to j in n + m steps, we have to go from i to some state k in n steps and then from k to j in m steps.

    Theorem. The m-step transition probability P(Xn+m = j | Xn = i) is the (i, j) entry of the mth power of the transition matrix P,

    P(Xn+m = j | Xn = i) = P^m_ij = (P^m)_ij.


    Lecture 13: Multistep Transition Probabilities ...

    Theorem. The m-step transition probability P(Xn+m = j | Xn = i) is the (i, j) entry of the mth power of the transition matrix P,

    P(Xn+m = j | Xn = i) = P^m_ij = (P^m)_ij.

    Example. Suppose that if it rains today, then it will rain tomorrow with probability 0.7; and if it does not rain today, then it will rain tomorrow with probability 0.4. Calculate the probability that it will rain two days from today given that it is raining today. Also, calculate the probability that it will rain four days from today given that it is raining today.

    We model the problem as a Markov chain.

    State space S = {0, 1}, where 0 denotes that it rains and 1 denotes that it does not rain.

    Transition matrix

    P =

        0.7  0.3
        0.4  0.6

    Then,

    P^2 =

        0.61  0.39
        0.52  0.48

    The desired probability is P^2_00 = 0.61.

    To calculate the probability that it will rain four days from today given that it is raining today, we consider

    P^4 =

        0.5749  0.4251
        0.5668  0.4332

    The desired probability is P^4_00 = 0.5749.

    To obtain the mth power of a matrix, you can use www.wolframalpha.com. For example, enter {{0.7, 0.3}, {0.4, 0.6}}^4 on that website to get P^4.
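    The same powers can also be computed with NumPy; a sketch (ours, not course code):

        import numpy as np

        P = np.array([[0.7, 0.3],
                      [0.4, 0.6]])
        print(np.linalg.matrix_power(P, 2))   # P^2: first row is about [0.61, 0.39]
        print(np.linalg.matrix_power(P, 4))   # P^4: first row is about [0.5749, 0.4251]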

    What about P^8_00?

    P^8 =

        0.5714  0.4286
        0.5714  0.4286

    The desired probability is P^8_00 = 0.5714.

    What about P^10_00?

    P^10 =

        0.5714  0.4286
        0.5714  0.4286

    The rows are identical! This says that the probability that it will rain in 10 days, or 20 days, ... is 0.5714.


    The desired probability is P^10_00 = 0.5714.

    Example. Consider an (s, S) inventory control policy with s = 1 and S = 5. Assume that the distribution of the demand on day n + 1 is

    P(Dn+1 = 0) = 0.3, P(Dn+1 = 1) = 0.4, P(Dn+1 = 2) = 0.2, P(Dn+1 = 3) = 0.1.

    Suppose that today is day 0. What is the probability of having 3 units on hand at the end of day 20 given that there are 2 units available at the end of today?

    From the last session we have

    P =

        0    0    0.1  0.2  0.4  0.3
        0    0    0.1  0.2  0.4  0.3
        0.3  0.4  0.3  0    0    0
        0.1  0.2  0.4  0.3  0    0
        0    0.1  0.2  0.4  0.3  0
        0    0    0.1  0.2  0.4  0.3

    We are looking for P^20_23. Therefore,

    P^20 =

        0.0909  0.1556  0.231  0.2156  0.2012  0.1057
        0.0909  0.1556  0.231  0.2156  0.2012  0.1057
        0.0909  0.1556  0.231  0.2156  0.2012  0.1057
        0.0909  0.1556  0.231  0.2156  0.2012  0.1057
        0.0909  0.1556  0.231  0.2156  0.2012  0.1057
        0.0909  0.1556  0.231  0.2156  0.2012  0.1057

    The rows are identical! This says that the probability that there will be 3 units of inventory on hand in 20 days, or 25 days, ... is 0.2156.

    The desired probability is P^20_23 = 0.2156.

    Classification of States:

    State j is said to be accessible from state i if P^n_ij > 0 for some n >= 0.

    Example. Consider a Markov chain with the following transition matrix (states 1 and 2):

    P =

        0.2  0.8
        0    1.0

    2 is accessible from 1, but 1 is not accessible from 2.

    Example. Consider a Markov chain with the following transition matrix (states 1, 2, and 3):

    P =

        0.1  0.8  0.1
        0    0.9  0.1
        0.4  0    0.6


    2 and 3 are accessible from 1. 1 is accessible from 2 since with probability 0.1 we can go from 2 to 3, and with probability 0.4 we can go from 3 to 1. Similarly, 2 is accessible from 3.

    Two states i and j that are accessible to each other are said to communicate, and we write i ↔ j.

    Example. Consider a Markov chain with the following transition matrix (states 1 and 2):

    P =

        0.2  0.8
        0    1.0

    1 and 2 do not communicate.

    Example. Consider a Markov chain with the following transition matrix (states 1, 2, and 3):

    P =

        0.1  0.8  0.1
        0    0.9  0.1
        0.4  0    0.6

    1, 2, and 3 communicate with each other: 1 ↔ 2, 1 ↔ 3, 2 ↔ 3.

    Two states that communicate are said to be in the same class.

    The Markov chain is said to be irreducible if there is only one class, that is, if all states communicate with each other.

    Example. Consider a Markov chain with the following transition matrix (states 1 and 2):

    P =

        0.2  0.8
        0    1.0

    The Markov chain has two classes, {1} and {2}. Therefore, it is not irreducible (it is reducible).

    Example. Consider a Markov chain with the following transition matrix (states 1, 2, and 3):

    P =

        0.1  0.8  0.1
        0    0.9  0.1
        0.4  0    0.6

    The Markov chain has one class, {1, 2, 3}. Therefore, the Markov chain is irreducible.

    State i is said to be recurrent if, starting in state i, the process will eventually reenter state i with probability 1. Otherwise, state i is called transient.


    State i is said to have period d if P^n_ii = 0 whenever n is not divisible by d, and d is the largest integer with this property.

    For instance, starting in i, it may be possible for the process to enter state i only at times 2, 4, 6, 8, ..., in which case state i has period 2.

    A state with period 1 is said to be aperiodic.


    Lecture 14: Multistep Transition Probabilities ...

    State i is said to be recurrent if, starting in state i, the process will eventually reenter state i with probability 1. Otherwise, state i is called transient.

    Suppose state i is recurrent. Then it is positive recurrent if, starting in i, the expected time until the process returns to state i is finite.

    Remark. Every irreducible Markov chain with a finite state space is positive recurrent.

    State i is said to have period d if P^n_ii = 0 whenever n is not divisible by d, and d is the largest integer with this property.

    For instance, starting in i, it may be possible for the process to enter state i only at times 2, 4, 6, 8, ..., in which case state i has period 2.

    A state with period 1 is said to be aperiodic.

    Remark. An irreducible Markov chain is aperiodic if there is a state i for which Pii > 0.

    Example. Consider a MC with the following transition matrix (states 0 through 5):

    P =

        0    0.25  0    0.25  0.25  0.25
        0    0     0    0.5   0     0.5
        0.3  0     0.4  0     0.3   0
        0    0.3   0    0.4   0     0.3
        0    0     0.5  0     0.5   0
        0.5  0     0    0     0     0.5

    Is this MC irreducible?

    All states communicate with each other. Therefore, the MC is irreducible.

    Long-run Behavior (Limiting Behavior):

    Theorem. If a Markov chain is irreducible, positive recurrent, and aperiodic, then the long-run proportion of time that the process is in state j, πj, is

    πj = lim_{n→∞} P^n_ij, j >= 0.

    Remark. Positive recurrent, aperiodic states are called ergodic.

    Remark. The πj are called the stationary probabilities.


    Example. Suppose that if it rains today, then it will rain tomorrow with probability 0.7; and if it does not rain today, then it will rain tomorrow with probability 0.4. In the long run, what fraction of the time does it rain?

    We model the problem as a Markov chain.

    State space S = {0, 1}, where 0 denotes that it rains and 1 denotes that it does not rain.

    Transition matrix

    P =

        0.7  0.3
        0.4  0.6

    Then,

    P^20 =

        0.5714  0.4286
        0.5714  0.4286

    Therefore, in the long run the probability that it rains on a given day is 0.5714.

    Example. Three of every four trucks on the road are followed by a car, while only one of every five cars is followed by a truck. What fraction of vehicles on the road are trucks?

    Let Xn denote the type of the nth vehicle. Then S = {T, C}, where T and C denote truck and car, respectively (the first row and column correspond to trucks).

    P =

        0.25  0.75
        0.2   0.8

    Then,

    P^20 =

        4/19  15/19
        4/19  15/19

    So 4/19 of the vehicles on the road are trucks.

    Suppose that a Markov chain has the following transition matrix (states 1 and 2):

    P =

        1 - a    a
        b        1 - b

    Then π1 = b / (a + b) and π2 = a / (a + b).

    Example. A rapid transit system has just started operating. In the first month of operation, it was found that 25% of commuters are using the system while 75% are travelling by automobile. Suppose that each month 10% of transit users go back to using their cars, while 30% of automobile users switch to the transit system. What fraction of people will eventually use the transit system?


    The transition matrix (states: transit, automobile) is

    P =

        0.9  0.1
        0.3  0.7

    Then π1 = 0.3 / (0.3 + 0.1) = 0.75 and π2 = 0.1 / (0.3 + 0.1) = 0.25.

    Example. Market research suggests that in a five-year period 8% of people with cable television will get rid of it, and 26% of those without it will sign up for it. What is the long-run fraction of people with cable TV?

    P (states: Cable, No cable) =

        0.92  0.08
        0.26  0.74

    Then πCable = 0.26 / (0.26 + 0.08) = 26/34 = 0.7647.


    Lecture 15: Long-run Behavior (Limiting Behavior) ...

    Example. Consider an (s, S) inventory control policy. Assume that the distribution of the demand on day n is

    P(Dn = 0) = 0.3, P(Dn = 1) = 0.4, P(Dn = 2) = 0.2, P(Dn = 3) = 0.1.

    Suppose that each unit sold produces a profit of $12, but it costs $2 a day to keep an unsold unit in the store overnight. What are the optimal values of s and S that maximize the long-run net profit?

    The objective is to maximize the long-run expected net profit per day, i.e.,

    E[net profit] = E[sales] - E[holding costs].

    Let I denote the inventory level at the beginning of the day. Conditioning on the inventory level at the beginning of the day, we have

    E[net profit] = ∑_k E[net profit | I = k] P(I = k).

    Note that P(I = k) is the long-run probability of having k units at the beginning of the day.

    Since it is impossible to sell more than 3 units in a day, and it costs us to have unsold inventory, we should never have more than 3 units on hand.

    Based on the above discussion, the inventory level at the beginning of a day is either 3, 2, or 1. We consider these cases separately.

    Suppose that the inventory level at the beginning of a day is 3, i.e., I = 3.

    Then the expected sales revenue of the day is

    E[sales | I = 3] = E[sales | I = 3, Dn = 0] P(Dn = 0) + E[sales | I = 3, Dn = 1] P(Dn = 1) + E[sales | I = 3, Dn = 2] P(Dn = 2) + E[sales | I = 3, Dn = 3] P(Dn = 3)

    = [0 × 12] (0.3) + [1 × 12] (0.4) + [2 × 12] (0.2) + [3 × 12] (0.1) = 13.2.

    The expected holding cost of the day is

    E[costs | I = 3] = E[costs | I = 3, Dn = 0] P(Dn = 0) + E[costs | I = 3, Dn = 1] P(Dn = 1) + E[costs | I = 3, Dn = 2] P(Dn = 2) + E[costs | I = 3, Dn = 3] P(Dn = 3)

    = [3 × 2] (0.3) + [2 × 2] (0.4) + [1 × 2] (0.2) + [0 × 2] (0.1) = 3.8.


    The expected net profit of the day is

    E[net profit | I = 3] = E[sales | I = 3] - E[costs | I = 3] = 13.2 - 3.8 = 9.4.

    Suppose that the inventory level at the beginning of a day is 2, i.e., I = 2.

    Then the expected sales revenue of the day is

    E[sales | I = 2] = [0 × 12] (0.3) + [1 × 12] (0.4) + [2 × 12] (0.2) + [2 × 12] (0.1) = 12.

    The expected holding cost of the day is

    E[costs | I = 2] = [2 × 2] (0.3) + [1 × 2] (0.4) + [0 × 2] (0.2) + [0 × 2] (0.1) = 2.0.

    The expected net profit of the day is

    E[net profit | I = 2] = E[sales | I = 2] - E[costs | I = 2] = 12 - 2 = 10.

    Suppose that the inventory level at the beginning of a day is 1, i.e., I = 1.

    Then the expected sales revenue of the day is

    E[sales | I = 1] = [0 × 12] (0.3) + [1 × 12] (0.4) + [1 × 12] (0.2) + [1 × 12] (0.1) = 8.4.

    The expected holding cost of the day is

    E[costs | I = 1] = [1 × 2] (0.3) + [0 × 2] (0.4) + [0 × 2] (0.2) + [0 × 2] (0.1) = 0.6.

    The expected net profit of the day is

    E[net profit | I = 1] = E[sales | I = 1] - E[costs | I = 1] = 8.4 - 0.6 = 7.8.


    To obtain E[net profit] = ∑_{k=1}^{3} E[net profit | I = k] P(I = k), we need to calculate P(I = k), which depends on the inventory control policy.

    Since it is impossible to sell more than 3 units in a day, and it costs us to have unsold inventory, we should never have more than 3 units on hand. We therefore compare the profits of the (2, 3), (1, 3), (0, 3), (1, 2), and (0, 2) inventory policies.

    Consider the (2, 3) inventory policy. In this case we always start a day with 3 units; therefore (rows and columns correspond to 0, 1, 2, and 3 units at the end of the day),

    P =

        0.1  0.2  0.4  0.3
        0.1  0.2  0.4  0.3
        0.1  0.2  0.4  0.3
        0.1  0.2  0.4  0.3

    and

    P^20 =

        0.1  0.2  0.4  0.3
        0.1  0.2  0.4  0.3
        0.1  0.2  0.4  0.3
        0.1  0.2  0.4  0.3

    Therefore, under the (2, 3) inventory control policy, the long-run probabilities of having 0, 1, 2, and 3 units at the end of the day are π0 = 0.1, π1 = 0.2, π2 = 0.4, and π3 = 0.3, respectively.

    Also, under the (2, 3) inventory control policy, the inventory at the beginning of a day is always 3. Therefore,

    E[net profit] = ∑_{k=1}^{3} E[net profit | I = k] P(I = k)

    = E[net profit | I = 1] P(I = 1) + E[net profit | I = 2] P(I = 2) + E[net profit | I = 3] P(I = 3)

    = (7.8)(0) + (10)(0) + (9.4)(1) = 9.4.

    Consider the (1, 3) inventory policy. Then,

    P =

        0.1  0.2  0.4  0.3
        0.1  0.2  0.4  0.3
        0.3  0.4  0.3  0
        0.1  0.2  0.4  0.3

    and

    P^20 =

        19/110  30/110  40/110  21/110
        19/110  30/110  40/110  21/110
        19/110  30/110  40/110  21/110
        19/110  30/110  40/110  21/110


    Therefore, under the (1, 3) inventory control policy, the long-run probabilities of having 0, 1, 2, and 3 units at the end of the day are π0 = 19/110, π1 = 30/110, π2 = 40/110, and π3 = 21/110, respectively.

    Under the (1, 3) inventory control policy, the inventory at the beginning of a day is either 2 or 3. The long-run probability that the inventory level at the beginning of a day is 2 is P(I = 2) = π2 = 40/110, and P(I = 3) = π0 + π1 + π3 = 70/110. Therefore,

    E[net profit] = ∑_{k=1}^{3} E[net profit | I = k] P(I = k)

    = E[net profit | I = 1] P(I = 1) + E[net profit | I = 2] P(I = 2) + E[net profit | I = 3] P(I = 3)

    = (7.8)(0) + (10)(40/110) + (9.4)(70/110) = 9.61818.

    Consider the (0, 3) inventory policy. Then,

    P =

        0.1  0.2  0.4  0.3
        0.7  0.3  0    0
        0.3  0.4  0.3  0
        0.1  0.2  0.4  0.3

    and

    P^20 =

        343/1070  300/1070  280/1070  147/1070
        343/1070  300/1070  280/1070  147/1070
        343/1070  300/1070  280/1070  147/1070
        343/1070  300/1070  280/1070  147/1070

    Therefore, under the (0, 3) inventory control policy, the long-run probabilities of having 0, 1, 2, and 3 units at the end of the day are π0 = 343/1070, π1 = 300/1070, π2 = 280/1070, and π3 = 147/1070, respectively.

    Under the (0, 3) inventory control policy, the inventory at the beginning of a day is either 1, 2, or 3. Therefore, P(I = 1) = π1 = 300/1070, P(I = 2) = π2 = 280/1070, and P(I = 3) = π0 + π3 = 490/1070. Therefore,

    E[net profit] = ∑_{k=1}^{3} E[net profit | I = k] P(I = k)

    = E[net profit | I = 1] P(I = 1) + E[net profit | I = 2] P(I = 2) + E[net profit | I = 3] P(I = 3)

    = (7.8)(300/1070) + (10)(280/1070) + (9.4)(490/1070) = 9.108.


    Consider the (1, 2) inventory policy. Then (rows and columns correspond to 0, 1, and 2 units at the end of the day),

    P =

        0.3  0.4  0.3
        0.3  0.4  0.3
        0.3  0.4  0.3

    Therefore, under the (1, 2) inventory control policy, the long-run probabilities of having 0, 1, and 2 units at the end of the day are π0 = 0.3, π1 = 0.4, and π2 = 0.3, respectively. Then,

    E[net profit] = ∑_{k=1}^{2} E[net profit | I = k] P(I = k) = (7.8)(0) + (10)(1) = 10.

    Consider the (0, 2) inventory policy. Then,

    P =

        0.3  0.4  0.3
        0.7  0.3  0
        0.3  0.4  0.3

    Therefore, under the (0, 2) inventory control policy, the long-run probabilities of having 0, 1, and 2 units at the end of the day are π0 = 49/110, π1 = 40/110, and π2 = 21/110, respectively. The inventory at the beginning of a day is 1 with probability P(I = 1) = π1 = 40/110 and 2 with probability P(I = 2) = π0 + π2 = 70/110. Then,

    E[net profit] = ∑_{k=1}^{2} E[net profit | I = k] P(I = k) = (7.8)(40/110) + (10)(70/110) = 9.2.

    Therefore, the optimal values for s and S are s = 1 and S = 2.

    Remark. Let P be the transition matrix of a Markov chain. Then the stationary distribution of the Markov chain is the solution of πP = π, together with ∑_j πj = 1, where π is the row vector of stationary probabilities.
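    As a check on the long-run probabilities used above, the remark's equation can be solved numerically; a NumPy sketch (ours, not course code), shown for the (1, 3) inventory chain:

        import numpy as np

        P = np.array([[0.1, 0.2, 0.4, 0.3],
                      [0.1, 0.2, 0.4, 0.3],
                      [0.3, 0.4, 0.3, 0.0],
                      [0.1, 0.2, 0.4, 0.3]])
        n = P.shape[0]
        # Stack (P^T - I) pi = 0 with the normalization sum(pi) = 1 and solve by least squares.
        A = np.vstack([P.T - np.eye(n), np.ones(n)])
        b = np.zeros(n + 1)
        b[-1] = 1.0
        pi, *_ = np.linalg.lstsq(A, b, rcond=None)
        print(pi * 110)    # approximately [19, 30, 40, 21], i.e. 19/110, 30/110, 40/110, 21/110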