foss lecture1
TRANSCRIPT
8/12/2019 Foss Lecture1
Lectures on Stochastic Stability
Sergey FOSS
Heriot-Watt University
This mini-course presents an overview of stochastic stability methods, mostly motivated by (but not limited to) stochastic network applications. We work with stochastic recursive sequences, and, in particular, Markov chains, in a general Polish state space. We discuss and compare methods based on (i) Lyapunov functions and fluid limits, (ii) explicit coupling (renovating events and Harris chains), (iii) monotonicity, and some others. We also discuss instability methods and perfect simulation methods.
Lectures are based on handouts of my lecture notes (Colorado State Uni, 1996; Novosibirsk State Uni, 1997-2000; Kazakh National University, 2007), on the joint overview paper with Takis Konstantopoulos (2004), on notes written by us for a Short LMS/EPSRC Course for PhD students (September 2006), and on some (more-or-less) recent publications.
Table of Topics
1. Introduction.
2. Lyapunov techniques. Criteria for Positive Recurrence and for Instability.
3. Fluid Approximation Approach.
4. Coupling and Harris Chains.
5. Monotonicity and Saturation Rule.
6. Renovation Theory, Perfect Simulation.
7. Some intriguing open problems.
1 Lecture 1. Basic Tools.
1.1 Notation, Acronyms, and Basic Concepts
R.v. random variable
i.i.d. independent identically distributed
X, Y, Z, ξ, η, ζ, . . . for r.v.s
F, G distribution functions, f density function
P probability (probability measure), E expectation, D variance
ξ ∈ F means P(ξ ≤ x) = F(x) for all x
ξ ∈ P means P(ξ ∈ B) = P(B) for all B ∈ B.
I(A), or 1(A), is the indicator function of event A: I(A) = 1 if A occurs, and I(A) = 0 otherwise.
Here are standard families of distributions:
U[a, b] (uniform), G(p) (geometric),
E(λ) (exponential), B(m, p) (binomial),
N(a, σ²) (normal), Π(λ) (Poisson).
Convergence:
ξn →a.s. ξ means P(lim ξn = ξ) = 1 or, equivalently: for all ε > 0, P(sup_{m≥n} |ξm − ξ| > ε) → 0 as n → ∞.
ξn →p ξ means: for all ε > 0, P(|ξn − ξ| > ε) → 0 as n → ∞.
The same for random vectors.
Key Properties of Convergence. Let → mean either →a.s. or →p.
(1) If ξn → ξ and ηn → η, then (ξn, ηn) → (ξ, η).
(2) If ξn → ξ and if g is a continuous function, then g(ξn) → g(ξ).
(3) More generally, assume that g is not continuous everywhere and denote by Dg the set of its discontinuity points. If ξn → ξ and if P(ξ ∈ Dg) = 0, then g(ξn) → g(ξ).
Weak convergence of distribution functions: Fn ⇒ F if, for each x such that F is continuous at x,
Fn(x) → F(x).
Equivalent form: Fn ⇒ F if, for any bounded and continuous function g, ∫ g(x) dFn(x) → ∫ g(x) dF(x).
Comment on terminology: Weak convergence is the most common term. Other terms are convergence of/in distribution(s) and convergence in law.
Weak convergence of random variables: ξn ⇒ ξ. It means: ξn ∈ Fn, ξ ∈ F, and Fn ⇒ F.
Note that ξn ⇒ ξ is just a convenient notation! There is no real convergence of the random variables on sample paths.
Relations between convergence types:
ξn →a.s. ξ implies ξn →p ξ, and ξn →p ξ implies ξn ⇒ ξ.
Both converse statements are incorrect. Here are two examples:
Example 1. Weak convergence does not imply convergence in probability. Let P(ξ1 = 1) = P(ξ1 = −1) = 1/2 and ξ_{n+1} = −ξn, n = 1, 2, . . ..
Example 2. Convergence in probability does not imply a.s. convergence. Let (Ω, F, P) = ((0, 1], B(0,1], λ), where λ is the Lebesgue measure. Let ξ0 ≡ 1. For m = 1, 2, . . ., for n such that 1 + 2 + . . . + 2^{m−1} < n ≤ 1 + 2 + . . . + 2^{m−1} + 2^m, and for i = n − (1 + 2 + . . . + 2^{m−1}), let
ξn(ω) = 1 if ω ∈ ((i − 1)/2^m, i/2^m], and ξn(ω) = 0 otherwise.
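The mechanics of Example 2 can be checked by direct computation. The following Python sketch (illustrative code with invented helper names, not part of the original notes) enumerates the blocks of indices: for a fixed ω the event {ξn = 1} occurs once in every block, i.e. for infinitely many n, while P(ξn = 1) = 2^{−m} → 0.

```python
# Sketch of Example 2 (the "moving blocks" sequence): xi_n -> 0 in probability
# but not a.s., since every omega lands in infinitely many blocks.

def block_of(n):
    """Return (m, i): block number m and position i of index n (n >= 2)."""
    m, start = 1, 1          # block m = 1 holds the indices start+1 .. start+2
    while n > start + 2**m:
        start += 2**m
        m += 1
    return m, n - start      # i runs over 1 .. 2**m inside block m

def xi(n, omega):
    """xi_n(omega) for omega in (0, 1]."""
    m, i = block_of(n)
    return 1 if (i - 1) / 2**m < omega <= i / 2**m else 0

omega = 0.3
hits = [n for n in range(2, 200) if xi(n, omega) == 1]
print(hits)              # one hit per block: xi_n(omega) = 1 infinitely often
m, _ = block_of(150)
print(2**(-m))           # P(xi_n = 1) = 2**(-m) -> 0, so xi_n ->p 0
```

For any fixed ω the hit indices never stop, which is exactly why a.s. convergence fails.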
Laws of Large Numbers.
If ξ, ξ1, ξ2, . . . are i.i.d. random variables with a finite mean, say a = Eξ, and Sn = ξ1 + . . . + ξn, then the Weak Law of Large Numbers (WLLN) says:
Sn/n →p a as n → ∞,
and the Strong Law of Large Numbers (SLLN) says:
Sn/n →a.s. a as n → ∞.
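As a quick numerical illustration of the SLLN (a simulation sketch, not a proof; the exponential distribution with mean a = 2 is chosen arbitrarily here):

```python
# Sample means S_n / n of i.i.d. E(1/2) r.v.s (mean a = 2) settle near a.
import random

random.seed(1)
a = 2.0
S, n = 0.0, 0
for n in range(1, 100_001):
    S += random.expovariate(1 / a)   # exponential with mean a
    if n in (100, 10_000, 100_000):
        print(n, S / n)              # S_n / n drifts toward a
```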
Lebesgue and Beppo Levi Theorems.
Theorem (Beppo Levi). If {ξn} is an a.s. non-negative and non-decreasing sequence of random variables, then
E lim_n ξn = lim_n Eξn,
where both sides are either finite or infinite simultaneously.
Coupling.
ξ′ is a copy of ξ if they have the same distribution: ξ′ =_D ξ. In general, ξ and ξ′ may be defined on different probability spaces.
(a) Coupling of distribution functions (d.f.) or of probability measures.
For two d.f.s F1 and F2, their coupling is a construction of a two-variate distribution function F(x1, x2) such that F(x1, ∞) = F1(x1) and F(∞, x2) = F2(x2).
Similarly, for two probability measures P1 and P2 on the real line, their coupling is a probability measure P(·) on the plane such that its projections are P1 and P2.
The same definitions of coupling may be introduced for any number of distributions (distribution functions, probability measures).
Such a coupling may also be viewed as follows: we define a probability space (Ω, F, P) and two random variables η1 and η2 on this space such that η1 ∈ F1 and η2 ∈ F2 (or, in other notation, η1 ∈ P1 and η2 ∈ P2). Then their joint distribution, say F, has marginals F1 and F2 (or, equivalently, the probability measure P(B) = P((η1, η2) ∈ B) has marginals P1 and P2).
(b) Coupling of two random variables.
Let ξ1 be defined on (Ω1, F1, P1) and ξ2 be defined on (Ω2, F2, P2).
A coupling of these two r.v.s is defined by, first, an introduction of a new probability space, say (Ω, F, P), and then, by defining a pair of r.v.s η1, η2 on this space such that η1 =_D ξ1, η2 =_D ξ2.
Examples:
(1) F1 = U(0, 1), F2 = U(0, 1);
(2) F1 = U(0, 1), F2 = E(1);
(3) F1 = U(0, 1), F2 = Π(1);
(4) F1 = B(n, p), F2 = Π(np);
(5) F1 has a density 2x I(x ∈ (0, 1)) and F2 a density 2(1 − x) I(x ∈ (0, 1)).
In each example, there are many couplings!
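For instance, in example (2) one can realize two quite different couplings on the same computer-generated probability space. A Python sketch (helper names are invented): the "monotone" (quantile) coupling drives both coordinates by one uniform variable, while the independent coupling uses separate randomness; the marginals agree in both cases.

```python
# Two of the many couplings with marginals U(0,1) and E(1).
import math, random

random.seed(0)

def independent_pair():
    return random.random(), random.expovariate(1.0)

def monotone_pair():
    u = random.random()
    return u, -math.log(1.0 - u)   # F2^{-1}(u) for the E(1) d.f.

mono = [monotone_pair() for _ in range(10_000)]
indep = [independent_pair() for _ in range(10_000)]
# Same second marginal (mean of E(1) is 1) under both couplings:
print(sum(v for _, v in mono) / len(mono))
print(sum(v for _, v in indep) / len(indep))
# But only the monotone coupling makes the coordinates functionally dependent:
print(all(v == -math.log(1.0 - u) for u, v in mono))
```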
1.2 Weak and strong convergence
Lemma 0.
If Fn ⇒ F (all Fn and F are d.f.s), then there exists a coupling (ηn, η) of {Fn} and F:
ηn →a.s. η.
Proof. For a d.f. F, define its inverse F^{−1} by
F^{−1}(z) = inf{x : F(x) ≥ z}, z ∈ (0, 1).
Let Ω = (0, 1), F be the σ-algebra of Borel subsets of (0, 1), and P the Lebesgue measure on (0, 1).
Set U(ω) = ω, ω ∈ Ω. Then U ∈ U(0, 1).
Let ηn = Fn^{−1}(U), η = F^{−1}(U), and show ηn →a.s. η. Note that ηn ∈ Fn, η ∈ F.
In order to avoid some technicalities, assume, for simplicity, that all d.f.s are continuous. Let
η̲n = inf_{m≥n} ηm,  η̄n = sup_{m≥n} ηm,  F̄n = sup_{m≥n} Fm,  F̲n = inf_{m≥n} Fm.
Then η̲n ∈ F̄n, η̄n ∈ F̲n.
Indeed,
P(η̲n ≤ x) = P(η̲n < x) = P(∃ m ≥ n : ηm < x)
= P(∃ m ≥ n : Fm^{−1}(U) < x) = P(∃ m ≥ n : U < Fm(x))
= P(U < sup_{m≥n} Fm(x)) = F̄n(x).
Similarly, P(η̄n > x) = . . . = 1 − F̲n(x).
Since F̄n(x) ↓ F(x) and F̲n(x) ↑ F(x) for all x, it is sufficient to show that, for instance,
η̲n →a.s. η.
But both {F̄n} and {η̲n} are monotone as functions of n!
Then η̲n is a.s. non-decreasing and, therefore, there exists ζ such that η̲n →a.s. ζ. Moreover, η̲n is a non-decreasing function of ω with d.f. F̄n ≥ F, so η̲n ≤ F^{−1}(U) = η a.s., and hence ζ ≤ η a.s.
If P(ζ ≠ η) > 0, then there exists x:
P(ζ ≤ x) > P(η ≤ x).
But P(ζ ≤ x) = lim F̄n(x) = F(x) = P(η ≤ x)!
Thus, we get a contradiction, and η̲n →a.s. η. By similar arguments, η̄n →a.s. η.
Therefore, ηn →a.s. η.
Problem No 1. Prove this lemma without the additional assumption that all d.f.s are continuous.
Exercises: What is F^{−1} for the following distribution functions:
U(0, 1), E(λ), N(0, 1), B(1, p), B(n, p), Π(λ), . . .?
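The coupling construction of Lemma 0 is easy to try numerically. A sketch under the assumption Fn = d.f. of E(1 + 1/n), so that Fn ⇒ F = d.f. of E(1); for the exponential family the inverse is explicit:

```python
# Quantile coupling of Lemma 0 for exponential d.f.s: eta_n = F_n^{-1}(eta)
# converges pointwise on the common uniform variable eta.
import math

def quantile_exp(lam, z):
    # F^{-1}(z) = -log(1 - z)/lam for the E(lam) distribution
    return -math.log(1.0 - z) / lam

eta = 0.7                              # one fixed point of Omega = (0, 1)
eta_n = [quantile_exp(1 + 1/n, eta) for n in (1, 10, 100, 1000)]
limit = quantile_exp(1.0, eta)
print(eta_n, limit)                    # eta_n increases toward -log(0.3)
```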
1.3 Uniform integrability
Let {ξn}_{n≥1} be a sequence of real-valued r.v.s.
Definition 1.
{ξn} are uniformly integrable (UI) if E|ξn| < ∞ for all n and, moreover, there is a function h such that
sup_n E{|ξn| I(|ξn| ≥ x)} ≤ h(x) → 0 as x → ∞.
Comments:
Actually, we can put = instead of ≤ in the definition above. But I prefer to keep ≤, since I want the upper bound h(x) to be monotone non-increasing and right-continuous.
Clearly, if {ξn} are UI, then sup_n E|ξn| is finite.
Examples:
(1) ξn ∈ E(λn), n = 1, 2, . . ., are UI if and only if min_n λn > 0.
(2) Let P(ξn = 2^n) = P(ξn = −2^n) = 1/2^n and P(ξn = 0) = 1 − 1/2^{n−1}. Then
E|ξn| = 2, Eξn = 0, ξn →p 0,
but {ξn} are not UI!
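A direct computation confirming that example (2) is not UI (illustrative Python, using the two-point-plus-atom distribution above):

```python
# xi_n = +-2**n w.p. 2**(-n) each, 0 otherwise. E|xi_n| = 2 for every n, yet
# for any threshold x the sup over n of E{|xi_n| I(|xi_n| >= x)} stays 2.

def tail_mean(n, x):
    """E{|xi_n| * I(|xi_n| >= x)} computed exactly for this distribution."""
    value, prob = 2.0**n, 2.0**(-n)
    return 2 * value * prob if value >= x else 0.0

for x in (10.0, 1e6):
    sup = max(tail_mean(n, x) for n in range(1, 60))
    print(x, sup)    # the sup does not decay, however large x is
```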
Lemma 1.
The following are equivalent:
(i) {ξn} are UI;
(ii) there exists a function g : [0, ∞) → [0, ∞) such that:
(a) g(0) > 0; g is non-decreasing; lim_{x→∞} g(x) = ∞;
(b) sup_n E{|ξn| g(|ξn|)} < ∞.
Note: g(0) > 0 is not essential!
Proof.
(ii) ⟹ (i). For each n,
E{|ξn| I(|ξn| ≥ x)} = E{ (|ξn| g(|ξn|)/g(|ξn|)) I(|ξn| ≥ x)}
≤ (1/g(x)) sup_n E{|ξn| g(|ξn|)} → 0 as x → ∞.
(i) ⟹ (ii). Assume that h(x) > 0 for all x (otherwise the statement is trivial).
For m ∈ Z, let
Am = {x : 1/2^{2(m+1)} < h(x) ≤ 1/2^{2m}}
and, for x ∈ Am, let g(x) = 2^m. From h(0) < ∞ and h(x) → 0, the sets Am cover [0, ∞), and only finitely many Am with m < 0 are non-empty.
Note that Am is an interval which is closed from the left and open from the right. Denote by zm its left boundary point, zm ∈ Am. Then
E{|ξn| g(|ξn|)} = Σ_m E{|ξn| g(|ξn|) I(|ξn| ∈ Am)} =
= Σ_m 2^m E{|ξn| I(|ξn| ∈ Am)} ≤ Σ_m 2^m E{|ξn| I(|ξn| ≥ zm)}
≤ Σ_m 2^m h(zm) ≤ Σ_m 2^m (1/2^{2m}) = Σ_m 2^{−m} < ∞.
Lemma 2.
(1) If ξn → ξ (a.s. or in probability) and {ξn} are UI, then E|ξ| < ∞, Eξn → Eξ, and E|ξn| → E|ξ|.
(2) Conversely, if the r.v.s ξn are a.s. non-negative, ξn → ξ, and Eξn → Eξ < ∞, then {ξn} are UI.
Proof. (1) (a) Assume first that all the r.v.s have a common bounded support: for some K, P(|ξn| ≤ K) = 1 for all n and P(|ξ| ≤ K) = 1. Then, for any ε > 0,
E|ξn − ξ| ≤ ε + 2K P(|ξn − ξ| > ε) → ε;
since ε > 0 is arbitrary, E|ξn| → E|ξ| and Eξn → Eξ.
(b1) Assume now that at least one of the distributions of the r.v.s has an unbounded support. Take x > 0 such that P(|ξ| = x) = 0. Since ξn →a.s. ξ,
ξn I(|ξn| < x) →a.s. ξ I(|ξ| < x).
Then, for all n, P(|ξn I(|ξn| < x)| ≤ x) = P(|ξ I(|ξ| < x)| ≤ x) = 1 ⟹ E{ξn I(|ξn| < x)} → E{ξ I(|ξ| < x)} (see (a));
and
|ξn| →a.s. |ξ| ⟹ E|ξ| ≤ lim inf E|ξn| ≤ sup_n E|ξn| ≡ K ⟹ E|ξ| ≤ K.
(b2) Show first that E|ξ| < ∞. Indeed,
E|ξ| = lim_{N→∞} E{|ξ| I(|ξ| ≤ N)} ≤ K < ∞.
Given ε > 0, choose x such that P(|ξ| = x) = 0, h(x) ≤ ε, and E{|ξ| I(|ξ| ≥ x)} ≤ ε.
Let
γn = E{|ξn| I(|ξn| ≥ x)} and γ = E{|ξ| I(|ξ| ≥ x)}.
Then
E|ξn| = E{|ξn| I(|ξn| < x)} + γn,
E|ξ| = E{|ξ| I(|ξ| < x)} + γ.
Since γn ≤ ε for all n and γ ≤ ε, then
lim sup(E|ξn| − E|ξ|) ≤ 2ε and
lim inf(E|ξn| − E|ξ|) ≥ −2ε for any ε.
Letting ε to 0, we obtain the first statement of the lemma.
Prove now the second statement. First, from Eξn → Eξ < ∞: given ε > 0, choose x0 = x0(ε) such that P(ξ = x0) = 0 and
E{ξ I(ξ ≥ x0)} ≤ ε/2.
Then we may use part (b1) from the proof of (1): for the given x0,
Eξn → Eξ ⟹ E{ξn I(ξn ≥ x0)} = Eξn − E{ξn I(ξn < x0)}
→ Eξ − E{ξ I(ξ < x0)} = E{ξ I(ξ ≥ x0)} ≤ ε/2.
Therefore, there exists n(ε) such that
E{ξn I(ξn ≥ x0)} ≤ ε for all n > n(ε).
Now, for each n = 1, 2, . . . , n(ε),
Eξn < ∞ ⟹ there exists xn : E{ξn I(ξn ≥ xn)} ≤ ε.
Let x = max(x1, . . . , x_{n(ε)}, x0). Then
E{ξn I(ξn ≥ x)} ≤ ε for all n.
Thus,
sup_n E{ξn I(ξn ≥ x)} → 0 as x → ∞.
1.4 Some useful properties of UI
Property 1. If {ξn} are UI and if {ηn} are such that |ηn| ≤ |ξn| a.s., then {ηn} are UI.
Indeed, let h(x) be from Definition 1. Then, for all x > 0,
E{|ηn| I(|ηn| > x)} ≤ E{|ξn| I(|ηn| > x)} ≤ E{|ξn| I(|ξn| > x)} ≤ h(x).
Property 2.
If {ξn} is an i.i.d. sequence with finite mean, E|ξ1| < ∞, then {ξn} are UI.
In particular, if the r.v.s {ξn} admit a stochastic integrable majorant η, i.e.
|ξn| ≤st η for all n (that is, P(|ξn| > x) ≤ P(η > x) for all x),
and if Eη < ∞, then {ξn} are UI.
Remarks. Let T = [0, ∞) and consider a family {ξt}_{t∈T}.
(a) The statement and the proof of Lemma 1 stay the same if we replace n = 1, 2, . . . by t ∈ T.
(b) Similarly, the statement and the proof of Lemma 2 stay unchanged if we replace n = 1, 2, . . . by t ∈ T = [0, ∞).
(c) Properties 1 and 3 still hold if we replace n = 1, 2, . . . by t ∈ T.
1.5 Coupling inequality. Maximal coupling. Dobrushin's theorem.
In this section, we assume that random variables are not necessarily real-valued and may take values in a general measurable space (X, B_X), which is assumed to be a complete separable metric space.
The Coupling Inequality
Let η1, η2 : (Ω, F, P) → (X, B_X) be two X-valued r.v.s. Let
P1(B) = P(η1 ∈ B), P2(B) = P(η2 ∈ B), B ∈ B_X.
Then, for B ∈ B_X,
P1(B) − P2(B) = P(η1 ∈ B, η1 = η2) + P(η1 ∈ B, η1 ≠ η2)
− P(η2 ∈ B, η1 = η2) − P(η2 ∈ B, η1 ≠ η2)
= P(η1 ∈ B, η1 ≠ η2) − P(η2 ∈ B, η1 ≠ η2) ≤ P(η1 ≠ η2).
Therefore, for any B ∈ B_X, |P1(B) − P2(B)| ≤ P(η1 ≠ η2), that is,
(∗) sup_{B∈B_X} |P1(B) − P2(B)| ≤ P(η1 ≠ η2).
The Maximal Coupling
Now we reformulate the result obtained. Note that the LHS of inequality (∗) depends on the marginal distributions P1 and P2 only and does not depend on the joint distribution of η1 and η2. Therefore, we get the following:
for any coupling of the marginal distributions P1 and P2, inequality (∗) holds. Equivalently,
(∗∗) sup_{B∈B_X} |P1(B) − P2(B)| ≤ inf over all couplings of P(η1 ≠ η2).
The following questions seem to be natural:
(?) Is there equality in (∗∗)?
(??) If the answer is yes, then does there exist a coupling such that
sup_{B∈B_X} |P1(B) − P2(B)| = P(η1 ≠ η2)?
The answers to both questions are positive! And this is the content of
Dobrushin's theorem.
Theorem 1.
Let P1 and P2 be two probability measures on a complete separable metric space (X, B_X). There exists a coupling of these probability measures such that, for ηi ∈ Pi, i = 1, 2,
sup_{B∈B_X} |P1(B) − P2(B)| = P(η1 ≠ η2).
Proof. ν(B) = P1(B) − P2(B) is a signed measure. Then the Banach theorem states that there exists a subset C ⊆ X such that
(a) ν(B) ≥ 0 for all B ⊆ C;
(b) ν(B) ≤ 0 for all B ⊆ X \ C ≡ C̄.
Note:
1) if ν(C) = 0, then P1 = P2 and the coupling is obvious;
2) ν(C) = −ν(C̄).
Assume ν(C) > 0. Introduce 4 distributions (probability measures):
Q1,1 is defined by Q1,1(B) = P1(C̄ ∩ B)/P1(C̄), B ∈ B_X (if P1(C̄) = 0, let Q1,1 be an arbitrary distribution),
and
Q2,1 is defined by Q2,1(B) = (P1(C ∩ B) − P2(C ∩ B))/ν(C), B ∈ B_X.
Similarly,
Q2,2 is defined by Q2,2(B) = P2(C ∩ B)/P2(C), B ∈ B_X (if P2(C) = 0, let Q2,2 be an arbitrary distribution),
and
Q1,2 is defined by Q1,2(B) = (P2(C̄ ∩ B) − P1(C̄ ∩ B))/ν(C), B ∈ B_X.
Then introduce 5 mutually independent r.v.s:
ζ1,1 ∈ Q1,1, ζ1,2 ∈ Q1,2, ζ2,1 ∈ Q2,1, ζ2,2 ∈ Q2,2,
and δ taking the values 1, 2, 0 with probabilities P1(C̄), P2(C), ν(C), respectively.
Now we can define η1 and η2 as follows:
η1 = ζ1,1 I(δ = 1) + ζ2,2 I(δ = 2) + ζ2,1 I(δ = 0),
η2 = ζ1,1 I(δ = 1) + ζ2,2 I(δ = 2) + ζ1,2 I(δ = 0).
Simple calculations show that ηi ∈ Pi, i = 1, 2. This is Problem No 3 for you.
Then,
P(η1 ≠ η2) ≤ P(δ = 0) = ν(C) ≤ sup_{B∈B_X} |P1(B) − P2(B)|.
So, by (∗),
P(η1 ≠ η2) = sup_{B∈B_X} |P1(B) − P2(B)|.
Comment. The Banach theorem and the Radon-Nikodym theorem are two equivalent statements formulated in slightly different ways.
There is (formally!) another proof (see, e.g., T. Lindvall's book on the coupling method) based on the Radon-Nikodym theorem:
Consider a new probability measure P(·) = (P1(·) + P2(·))/2. Let fi = dPi/dP be the corresponding densities. Then
sup_{B∈B_X} |P1(B) − P2(B)| = 1 − ∫ min(f1(x), f2(x)) P(dx),
and we may repeat the previous construction using densities.
What is the maximal coupling in the following examples?
(1) Two discrete two-point distributions.
(2) Two absolutely continuous distributions on (0, 1) with densities f1 and f2.
(3) Bernoulli and Poisson distributions.
(4) Normal and exponential distributions.
(This is another exercise for you.)
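For discrete distributions, the construction in the proof of Theorem 1 becomes a few lines of code: keep the common mass min(p1, p2) on the diagonal and spread the excesses off the diagonal. A sketch (function names are mine, not from the notes):

```python
# Maximal coupling of two discrete distributions, following the proof idea.

def maximal_coupling(p1, p2):
    """p1, p2: dicts value -> probability. Returns dict (v1, v2) -> prob."""
    support = set(p1) | set(p2)
    common = {v: min(p1.get(v, 0.0), p2.get(v, 0.0)) for v in support}
    c = sum(common.values())                 # = 1 - total variation distance
    r1 = {v: p1.get(v, 0.0) - common[v] for v in support}   # excess of P1
    r2 = {v: p2.get(v, 0.0) - common[v] for v in support}   # excess of P2
    joint = {(v, v): common[v] for v in support if common[v] > 0}
    for v1, q1 in r1.items():
        for v2, q2 in r2.items():
            if q1 > 0 and q2 > 0:            # excesses have disjoint supports
                joint[(v1, v2)] = joint.get((v1, v2), 0.0) + q1 * q2 / (1 - c)
    return joint

P1 = {0: 0.5, 1: 0.5}
P2 = {0: 0.2, 1: 0.8}
joint = maximal_coupling(P1, P2)
tv = 0.5 * sum(abs(P1.get(v, 0) - P2.get(v, 0)) for v in set(P1) | set(P2))
p_neq = sum(q for (a, b), q in joint.items() if a != b)
print(tv, p_neq)   # equal: P(eta1 != eta2) = sup_B |P1(B) - P2(B)|
```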
1.6 Probabilistic Metrics
Dobrushin's theorem provides a positive solution to one of the important problems in the theory of Probabilistic Metrics. We will briefly discuss basic concepts of this theory.
Again, consider a complete separable metric space (X, B_X) and introduce the following notation:
1) X² = X × X,
2) B²_X = B_X ⊗ B_X is the σ-algebra in X² generated by all sets B1 × B2, B1, B2 ∈ B_X,
3) diag(X²) = {(x, x), x ∈ X}.
Problem No 4. Prove that diag(X²) ∈ B²_X. (Actually, there is no need to assume that the state space is complete separable metric, and the minimal requirement for diag(X²) ∈ B²_X to hold is that the σ-algebra B_X is countably generated.)
Let P be any probability distribution on (X², B²_X). Denote by Pi, i = 1, 2, its marginal distributions:
P1(B) = P(B × X),
P2(B) = P(X × B), B ∈ B_X.
Let 𝒫 be the set of all probability distributions (measures) on (X², B²_X).
Definition 3.
A function d : 𝒫 → [0, ∞) is called a probabilistic metric if it satisfies the following conditions:
(1) P(diag(X²)) = 1 ⟹ d(P) = 0;
(2) d(P) = 0 ⟹ P1 = P2;
(3) the triangle inequality: if
P(1) has marginals P1 and P2,
P(2) has marginals P1 and P3,
P(3) has marginals P3 and P2,
then d(P(1)) ≤ d(P(2)) + d(P(3)).
Definition 4.
A probabilistic metric d is simple if it depends on the marginal distributions only (i.e., if P(1) and P(2) have the same marginals, then d(P(1)) = d(P(2))), and complex otherwise.
For a simple metric, it is reasonable to write d(P1, P2) instead of d(P), so d has the meaning of a distance between P1 and P2.
For a complex metric, we may also write d(η1, η2) instead of d(P), where (η1, η2) is a coupling of two r.v.s with joint distribution P,
P(B) = P((η1, η2) ∈ B), B ∈ B²_X.
So, d(η1, η2) may be considered as a distance between r.v.s.
We can also write d(η1, η2) for simple metrics. In this case,
d(η1, η2) = d(F1, F2) = d(P1, P2).
Examples.
Simple:
1) sup_{B∈B_X} |P1(B) − P2(B)| (Total variation norm (T.V.N.))
and, for real-valued r.v.s,
3) sup_x |F1(x) − F2(x)| (Uniform metric (U.M.))
4) inf{ε > 0 : F1(x − ε) − ε ≤ F2(x) ≤ F1(x + ε) + ε for all x} (Levy metric (L.M.))
Complex:
2) P(η1 ≠ η2) ≡ P(X² \ diag(X²)) (Indicator metric (I.M.))
5) inf{ε > 0 : P(|η1 − η2| > ε) < ε} (Ki Fan metric (K.F.M.))
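The simple metrics 3) and 4) can be evaluated numerically. A grid-based sketch (approximate by construction) for U(0,1) against its shift by h = 0.2, where one expects U.M. = h and L.M. = h/2:

```python
# Uniform and Levy metrics between the d.f. of U(0,1) and its shift by h.

h = 0.2
def F1(x):
    return min(max(x, 0.0), 1.0)       # d.f. of U(0,1)
def F2(x):
    return F1(x - h)                   # d.f. of U(0,1) + h

xs = [i / 1000 - 1 for i in range(3001)]   # grid on [-1, 2]

um = max(abs(F1(x) - F2(x)) for x in xs)   # uniform metric

def levy_ok(eps):
    tol = 1e-9                             # guard against float rounding
    return all(F1(x - eps) - eps <= F2(x) + tol for x in xs) and \
           all(F2(x) <= F1(x + eps) + eps + tol for x in xs)

lm = next(e / 1000 for e in range(1, 1001) if levy_ok(e / 1000))
print(um, lm)    # um is close to h, lm is close to h/2
```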
One of the key problems in the theory of probabilistic metrics is to find answers to the following questions.
Assume a simple metric d(P1, P2) is given. Does there exist a complex metric d̂ such that
(a) the following coupling inequality holds:
d(η1, η2) ≤ inf over all couplings of d̂(η1, η2)? (Compare with (∗∗).)
(b) If yes, then is it possible to replace ≤ by = in (a)?
(c) Does there exist a coupling such that d(η1, η2) = d̂(η1, η2)?
The following result holds:
Theorem 2.
The answers to the above questions are positive for the metrics:
(1) d = T.V.N., d̂ = I.M.;
(2) d = L.M., d̂ = K.F.M.
Comment. Statement (1) is Dobrushin's theorem. Statement (2) is Strassen's theorem (its proof is omitted).
1.7 Stopping times
Let (Ω, F, P) be a probability space and {ξn}_{n≥1} a sequence of r.v.s, ξn : Ω → R.
Denote by Fn the σ-algebra generated by ξn: Fn ⊆ F; Fn = {ξn^{−1}(B), B ∈ B}, where B is the σ-algebra of Borel sets in R.
Then, for 1 ≤ k ≤ n, F[k,n] is the σ-algebra generated by ξk, . . . , ξn; i.e.
F[k,n] is the minimal σ-algebra such that F[k,n] ⊇ Fl for all l = k, . . . , n.
Another way to describe F[k,n] is:
let ξ_{k,n} := (ξk, . . . , ξn) be a random vector; ξ_{k,n} : Ω → R^{n−k+1}. Then
F[k,n] = {ξ_{k,n}^{−1}(B), B ∈ B^{n−k+1}},
where B^{n−k+1} is the σ-algebra of Borel sets in R^{n−k+1}.
Finally, F[1,∞) is the σ-algebra generated by the whole sequence {ξn}_{n≥1}.
Good Property: for any A ∈ F[1,∞), there exists a sequence of events {An}_{n≥1}, An ∈ F[1,n], such that
P(A \ An) + P(An \ A) → 0 as n → ∞.
Let now τ : Ω → {1, 2, . . . , n, . . .} be an integer-valued r.v. (we say it is a counting r.v.).
Definition 5.
τ is a stopping time (ST) with respect to {ξn} if, for all n ≥ 1,
{τ = n} ∈ F[1,n]
(or, equivalently, {τ ≤ n} ∈ F[1,n]).
Another variant of a definition of a stopping time is:
Definition 6.
τ is an ST if there exists a family of functions hn : R^n → {0, 1} such that,
for all n ≥ 1, I(τ = n) = hn(ξ1, . . . , ξn) a.s.
(or, equivalently, I(τ ≤ n) = h̃n(ξ1, . . . , ξn) a.s.).
Examples of STs:
(1) τ = min{n ≥ 1 : ξn > x};
(2) τ = min{n ≥ 1 : Σ_{i=1}^{n} ξi > x};
(3) More examples...
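Example (2) can be written as code to make the defining property visible: whether τ = n is decided by ξ1, . . . , ξn alone. A Monte Carlo sketch (the parameter values are arbitrary):

```python
# tau = min{n >= 1 : xi_1 + ... + xi_n > x} for i.i.d. U(0,1) summands;
# the decision at step n uses only xi_1..xi_n, which is the ST property.
import random

random.seed(7)

def tau(xs, x):
    s = 0.0
    for n, xi in enumerate(xs, start=1):
        s += xi                  # information available up to time n
        if s > x:
            return n
    raise ValueError("threshold not crossed")

samples = [tau([random.random() for _ in range(200)], 5.0) for _ in range(20_000)]
print(sum(samples) / len(samples))   # Monte Carlo estimate of E tau for x = 5
```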
Assume now that {ξn} is an i.i.d. sequence and τ is an ST with P(τ < ∞) = 1. Consider the sequence θi ≡ ξ_{τ+i}, i = 1, 2, . . ..
Lemma 3.
The following statements hold:
1) {θi} is an i.i.d. sequence;
2) θi =_D ξ1;
3) {θi}_{i≥1} and the random vector (τ, ξ1, . . . , ξτ) are mutually independent.
Corollary 1.
{θi}_{i≥1} and Sτ ≡ ξ1 + . . . + ξτ are mutually independent.
Proof of Lemma 3. It is sufficient to show that,
for all k ≥ 1, m ≥ 1, and Borel sets B1, . . . , Bk and C1, . . . , Cm,
(∗) P({τ = k; ξ1 ∈ B1, . . . , ξk ∈ Bk} ∩ {θ1 ∈ C1, . . . , θm ∈ Cm})
= P(τ = k; ξ1 ∈ B1, . . . , ξk ∈ Bk) P(ξ1 ∈ C1, . . . , ξm ∈ Cm).
Indeed, (∗) ⟹ 1), 2), and 3).
First, take B1 = . . . = Bk = Bk+1 = . . . = R. Then, for all m,
(∗∗) P(θ1 ∈ C1, . . . , θm ∈ Cm)
= Σ_{k=1}^{∞} P(τ = k; θ1 ∈ C1, . . . , θm ∈ Cm)   (total probability formula)
= Σ_{k=1}^{∞} P(τ = k) Π_{i=1}^{m} P(ξ1 ∈ Ci)   (by (∗))
= Π_{i=1}^{m} P(ξ1 ∈ Ci).
In particular, for any j and Cj, we can take m ≥ j and Ci = R for i ≠ j.
Then the LHS of (∗∗) = P(θj ∈ Cj),
and the RHS of (∗∗) = P(ξ1 ∈ Cj).
⟹ 2)
Now, take any C1, . . . , Cm and replace in (∗∗) Π_{i=1}^{m} P(ξ1 ∈ Ci) by Π_{i=1}^{m} P(θi ∈ Ci).
⟹ 1)
Finally, take any B1, . . . , Bk and C1, . . . , Cm and replace in (∗) Π_{i=1}^{m} P(ξ1 ∈ Ci) by Π_{i=1}^{m} P(θi ∈ Ci).
⟹ 3)
So, we now prove (∗):
P({τ = k; ξ1 ∈ B1, . . . , ξk ∈ Bk} ∩ {θ1 ∈ C1, . . . , θm ∈ Cm})
= P({hk(ξ1, . . . , ξk) = 1; ξ1 ∈ B1, . . . , ξk ∈ Bk} ∩ {ξ_{k+1} ∈ C1, . . . , ξ_{k+m} ∈ Cm}),
where the first event belongs to F[1,k] and the second to F[k+1,k+m], so the two events are independent:
= P(. . .) P(. . .)
= P(. . .) Π_{i=1}^{m} P(ξ_{k+i} ∈ Ci) = P(. . .) Π_{i=1}^{m} P(ξ1 ∈ Ci).
Lemma 4.
(Wald identity)
Assume that E|ξ1| < ∞ and Eτ < ∞. Then
E Sτ = Eτ · Eξ1.
Now let us write ξi^{(1)} instead of ξi and τ^{(1)} instead of τ, and introduce further sequences and STs: {ξi^{(2)}}, τ^{(2)}, and so on, where {ξi^{(j+1)}} = {ξ^{(j)}_{τ^{(j)}+i}}.
Lemma 6. If τ^{(j)} is an ST w.r. to {ξi^{(j)}}_{i≥1} for each j = 1, . . . , J and if {ξi^{(j+1)}} = {ξ^{(j)}_{τ^{(j)}+i}}, then τ^{(1)} + . . . + τ^{(J)} is an ST w.r. to {ξi}_{i≥1}.
Problem No 5. Prove Lemma 6.
1.8 Two-dimensional stopping times
Let {ξ_{n,1}}_{n≥1} and {ξ_{n,2}}_{n≥1} be two sequences of r.v.s and F_{[k1,n1],[k2,n2]} the σ-algebra generated by ξ_{k1,1}, ξ_{k1+1,1}, . . . , ξ_{n1,1}; ξ_{k2,2}, ξ_{k2+1,2}, . . . , ξ_{n2,2}.
Definition 7.
A pair of r.v.s τ1, τ2 : Ω → {1, 2, . . .} is an ST w.r. to {ξ_{n,1}} and {ξ_{n,2}} if,
for all n1 ≥ 1, n2 ≥ 1, {τ1 = n1, τ2 = n2} ∈ F_{[1,n1],[1,n2]}.
Lemma 7.
If {ξ_{n,1}}_{n≥1} and {ξ_{n,2}}_{n≥1} are two mutually independent sequences and if (τ1, τ2) is an ST, then
1) each of the sequences
{θ_{i,1}} ≡ {ξ_{τ1+i,1}} and {θ_{i,2}} ≡ {ξ_{τ2+i,2}}
is i.i.d., and these sequences are mutually independent;
2) θ_{i,1} =_D ξ_{1,1}; θ_{i,2} =_D ξ_{1,2};
3) {{θ_{i,1}}_{i≥1}; {θ_{i,2}}_{i≥1}} and the random vector
(τ1, τ2; ξ_{1,1}, . . . , ξ_{τ1,1}; ξ_{1,2}, . . . , ξ_{τ2,2})
are mutually independent.
Proof is omitted.
Lemma 8.
Under the conditions of Lemma 7, assume, in addition, that
ξ_{1,1} =_D ξ_{1,2}.
Then the sequence {ψn}_{n≥1},
ψn = ξ_{n,1} if n ≤ τ1,  ψn = ξ_{n−τ1+τ2,2} if n > τ1,
is i.i.d.; ψn =_D ξ_{1,1}.
Proof. We have to show that, for all n = 1, 2, . . . and Borel sets B1, . . . , Bn,
P(ψ1 ∈ B1, . . . , ψn ∈ Bn) = Π_{i=1}^{n} P(ξ_{1,1} ∈ Bi).
1) For all n and B,
P(ψn ∈ B) = P(ξ_{n,1} ∈ B; n ≤ τ1) + P(ξ_{n−τ1+τ2,2} ∈ B; n > τ1).
Here
P(ξ_{n,1} ∈ B; n ≤ τ1) = P(ξ_{1,1} ∈ B) − P(ξ_{1,1} ∈ B) P(n > τ1)
= P(ξ_{1,1} ∈ B) P(n ≤ τ1)
and
P(ξ_{n−τ1+τ2,2} ∈ B; n > τ1) = Σ_{l=1}^{n−1} P(ξ_{τ2+n−l,2} ∈ B; τ1 = l)
= Σ_{l=1}^{n−1} P(θ_{n−l,2} ∈ B; τ1 = l)
= . . . = P(ξ_{1,2} ∈ B) P(τ1 < n).
2) Problem No 6. Prove the statement for joint distributions. Use induction arguments.
Here is another variant of a two-dimensional analogue of Lemma 3.
Lemma 9.
Assume that
(i) ζn = (ξ_{n,1}, ξ_{n,2}) is a sequence (n = 1, 2, . . .) of independent random vectors;
(ii) each of {ξ_{n,1}}_{n≥1} and {ξ_{n,2}}_{n≥1} is an i.i.d. sequence;
(iii) ξ_{1,1} =_D ξ_{1,2};
(iv) (τ1, τ2) is an ST and τ1 ≡ τ2 ≡ τ.
Then
ψn = ξ_{n,1} if n ≤ τ,  ψn = ξ_{n,2} if n > τ,
is an i.i.d. sequence; ψn =_D ξ_{1,1}.
Proof is very similar to that of Lemma 8 (omitted).
Finally, here is a further generalization of Lemma 9.
Lemma 10.
In the statement of Lemma 9, replace (i) by
(i′) there exist m1 ≥ 1, m2 ≥ 1 such that
ζn = (ξ_{(n−1)m1+1,1}, . . . , ξ_{nm1,1}; ξ_{(n−1)m2+1,2}, . . . , ξ_{nm2,2})
is an i.i.d. sequence;
and (iv) by
(iv′) (τ1, τ2) is an ST, P(τ1 ∈ {m1, 2m1, . . .}) = P(τ2 ∈ {m2, 2m2, . . .}) = 1,
and τ1/m1 ≡ τ2/m2.
Then
ψn = ξ_{n,1} if n ≤ τ1,  ψn = ξ_{n−τ1+τ2,2} if n > τ1,
is an i.i.d. sequence; ψn =_D ξ_{1,1}.
Problem No 7. Prove Lemma 10.
1.9 Stationary Sequences and Processes
Discrete Time
Definition 8.
(a) Let {ξn}_{n≥0} be a sequence of r.v.s.
It is stationary if, for all l = 1, 2, . . ., 0 ≤ i1 < i2 < . . . < il, B1, . . . , Bl ∈ B, and m = 1, 2, . . .,
P(ξ_{i1} ∈ B1, . . . , ξ_{il} ∈ Bl) = P(ξ_{i1+m} ∈ B1, . . . , ξ_{il+m} ∈ Bl). (1)
(b) Similarly, a double-infinite sequence {ξn}_{n=−∞}^{∞} is stationary if (1) holds for all m ∈ Z and B1, . . . , Bl ∈ B.
Continuous Time
Definition 8′.
(a) Let {ξt}_{t≥0} be a family of r.v.s.
It is stationary if, for all l = 1, 2, . . ., 0 ≤ t1 < t2 < . . . < tl, B1, . . . , Bl ∈ B, and u ≥ 0,
P(ξ_{t1} ∈ B1, . . . , ξ_{tl} ∈ Bl) = P(ξ_{t1+u} ∈ B1, . . . , ξ_{tl+u} ∈ Bl).
(b) Similarly, {ξt}_{t=−∞}^{∞} is stationary if the above equality holds for all u ∈ R and B1, . . . , Bl ∈ B.
Definition 9. A sequence of events {An}_{n=−∞}^{∞} is stationary if the sequence of random variables {I(An)}_{n=−∞}^{∞} is stationary.
Assume that {An}_{n=−∞}^{∞} is a stationary sequence and that P(A0) > 0 and P(∪_{n=0}^{∞} An) = 1.
Introduce the following r.v.s:
τ+ = min{n ≥ 1 : I(An) = 1} ≡ min{n ≥ 1 : ω ∈ An},
τ− = min{n ≥ 1 : I(A−n) = 1},
ν+ : P(ν+ > n) = P(Ā1 ∩ . . . ∩ Ān | A0),
ν− : P(ν− > n) = P(Ā−1 ∩ . . . ∩ Ā−n | A0).
Lemma 11.
(a) τ+ =_D τ−;
(b) ν+ =_D ν−;
(c) P(τ+ = n) = P(A0) P(ν+ ≥ n), n = 1, 2, . . .
Remark 4. The statement of the lemma is not obvious, in general.
Examples: Let {ξn} be an i.i.d. sequence, P(ξn > 0) > 0.
Then we can take a) An = {ξn > 0}; b) An = {ξn + ξ_{n−1} > 0}.
Proof of Lemma 11.
(a)
P(τ+ > n) = P(Ā1 ∩ . . . ∩ Ān)
(by stationarity, with m = −n − 1)
= P(Ā_{1+m} ∩ . . . ∩ Ā_{n+m}) = P(Ā−n ∩ . . . ∩ Ā−1) = P(τ− > n).
(b)
P(ν+ = n) = P(A0 ∩ Ā1 ∩ . . . ∩ Ā_{n−1} ∩ An) / P(A0)
= P(A−n ∩ Ā_{−n+1} ∩ . . . ∩ Ā−1 ∩ A0) / P(A0)
= P(ν− = n).
(c)
P(τ+ ≥ n) = P(Ā1 ∩ . . . ∩ Ā_{n−1}) = P(A0 ∩ Ā1 ∩ . . . ∩ Ā_{n−1}) + P(Ā0 ∩ Ā1 ∩ . . . ∩ Ā_{n−1})
= P(A0) P(Ā1 ∩ . . . ∩ Ā_{n−1} | A0) + P(Ā1 ∩ . . . ∩ Ān)
= P(A0) P(ν+ ≥ n) + P(τ+ ≥ n + 1).
⟹ P(τ+ = n) = P(τ+ ≥ n) − P(τ+ ≥ n + 1) = P(A0) P(ν+ ≥ n).
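Identity (c) can be sanity-checked by simulation in the simplest case a) above, where the events An = {ξn > 0} are independent and ν is geometric. A sketch (parameter values arbitrary):

```python
# Check of Lemma 11(c) for independent A_n with P(A_0) = p: then
# P(nu >= n) = (1 - p)**(n - 1), and P(tau = n) should equal p*(1-p)**(n-1).
import random

random.seed(3)
p = 0.3                                   # P(A_0) = P(xi_n > 0)
N = 200_000

def tau_sample():
    n = 1
    while random.random() >= p:           # A_n fails w.p. 1 - p
        n += 1
    return n

counts = {}
for _ in range(N):
    t = tau_sample()
    counts[t] = counts.get(t, 0) + 1

for n in (1, 2, 3):
    print(counts.get(n, 0) / N, p * (1 - p) ** (n - 1))   # close pairs
```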
Corollary 2.
For any k > 0, Eτ^k and Eν^{k+1} are either finite or infinite simultaneously (here τ = τ+, ν = ν+).
Indeed, by (c),
Eτ^k = Σ_{n≥1} n^k P(τ = n) = P(A0) Σ_{n≥1} n^k P(ν ≥ n) = P(A0) E(Σ_{n=1}^{ν} n^k) ≤ P(A0) Eν^{k+1}
and, using similar arguments with the lower bound,
Eτ^k ≥ (P(A0)/(k + 1)) Eν^{k+1}.
⟹ Eτ^k and Eν^{k+1} are either finite or infinite simultaneously.
1.10 On σ-algebras generated by a sequence of r.v.s
(1). Let (Ω, F, P) be a probability space and ξn : Ω → R, n = 1, 2, . . ., a sequence of r.v.s. Let F[k,n] = σ(ξk, . . . , ξn) and F[k,∞) = σ(ξk, ξ_{k+1}, . . .).
For A, B ∈ F, introduce the distance
d(A, B) = P(A \ B) + P(B \ A).
(A) Recall basic properties of σ-algebras.
1) If F(1), F(2) are σ-algebras on Ω, then F(1) ∩ F(2) is a σ-algebra, too, but F(1) ∪ F(2) may be not, in general.
2) More generally, let T be any parameter set and F(t), t ∈ T, σ-algebras on Ω; then ∩_{t∈T} F(t) is a σ-algebra, too.
By definition, F[1,∞) is the minimal σ-algebra which contains all σ-algebras F[1,n], n = 1, 2, . . .; it is the intersection of all σ-algebras containing every F[1,n], n = 1, 2, . . ..
Since F ⊇ F[1,n] for all n, ⟹ F[1,∞) ⊆ F.
(B) Now we study properties of the distance d:
(1) Clearly, d(A, B) = d(B, A) ≥ 0;
(2) d(A, C) ≤ d(A, B) + d(B, C) (the triangle inequality).
Indeed, A \ C = (A \ B) ∪ (A ∩ (B \ C)) ⊆ (A \ B) ∪ (B \ C)
⟹ P(A \ C) ≤ P(A \ B) + P(B \ C).
Similarly,
P(C \ A) ≤ P(B \ A) + P(C \ B).
(3) d(Ā, B̄) = d(A, B) (since P(Ā \ B̄) = P(B \ A));
(4) |P(A) − P(B)| = |P(A ∩ B) + P(A \ B) − P(A ∩ B) − P(B \ A)| ≤ d(A, B);
(5) d(A1 ∪ A2, B1 ∪ B2) ≤ d(A1, B1) + d(A2, B2).
Indeed, (A1 ∪ A2) \ (B1 ∪ B2) = (A1 \ (B1 ∪ B2)) ∪ (A2 \ (B1 ∪ B2)) ⊆ (A1 \ B1) ∪ (A2 \ B2)
⟹ P((A1 ∪ A2) \ (B1 ∪ B2)) ≤ P(A1 \ B1) + P(A2 \ B2).
Lemma 12.
For any A ∈ F[1,∞), there exist {An}_{n≥1}, An ∈ F[1,n], such that d(A, An) → 0.
Proof. Let U be the set of events A ∈ F such that there exist {An}_{n≥1}, An ∈ F[1,n], with d(A, An) → 0.
1) One can easily see that U ⊇ F[1,m] for all m = 1, 2, . . ..
Indeed, for any m and A ∈ F[1,m], let
An = Ω, if n < m;  An = A, if n ≥ m.
Therefore, A ∈ U.
2) Thus, it is sufficient to show that U is a σ-algebra. Then, with necessity, U ⊇ F[1,∞), and that completes the proof.
2.1) First we prove that U is an algebra, i.e.
(i) Ω ∈ U;
(ii) A ∈ U ⟹ Ā ∈ U;
(iii) for all k, A(1), . . . , A(k) ∈ U ⟹ A(1) ∪ . . . ∪ A(k) ∈ U.
(i) is obvious, (ii) follows from property (3), and (iii) follows from (5):
d(A(1) ∪ . . . ∪ A(k), A(1)_n ∪ . . . ∪ A(k)_n) ≤ Σ_{j=1}^{k} d(A(j), A(j)_n) → 0.
2.2) Now we prove that U is a σ-algebra:
(iii′) A(1), A(2), . . . ∈ U ⟹ A ≡ ∪_{j=1}^{∞} A(j) ∈ U.
Let B(k) = ∪_{j=1}^{k} A(j). Then B(k) ↑ A and P(B(k)) ↑ P(A).
⟹ for each k there exist {B(k)_n} : B(k)_n ∈ F[1,n], d(B(k), B(k)_n) → 0 as n → ∞.
Choose
n(1) = min{n ≥ 1 : d(B(1), B(1)_l) ≤ 1/2 for all l ≥ n}
and, for k ≥ 1,
n(k + 1) = min{n ≥ n(k) : d(B(k+1), B(k+1)_l) ≤ 1/2^{k+1} for all l ≥ n}.
Then let
An = Ω, if n < n(1);  An = B(k)_n, if n(k) ≤ n < n(k + 1).
Clearly, An ∈ F[1,n]. Then d(A, An) ≤ d(A, B(k)) + 1/2^k, for n(k) ≤ n < n(k + 1).
Since k → ∞ as n → ∞, d(A, An) → 0.
Lemma 13.
Let {ξn}_{n=−∞}^{∞} be a double-infinite sequence of r.v.s,
F(−∞,∞) = σ{. . . , ξ−2, ξ−1, ξ0, ξ1, ξ2, . . .}.
Then, for any A ∈ F(−∞,∞), there exist {An}, An ∈ F[−n,n], such that d(A, An) → 0.
Problem No 8. Prove Lemma 13.
(2). Sigma-algebras generated by sequences of independent r.v.s.
Definition 10.
For a sequence {ξn}_{n≥1} of r.v.s, its tail σ-algebra is
F∞ = ∩_{k=1}^{∞} F[k,∞).
Note: Since F[k+1,∞) ⊆ F[k,∞), ⟹ F∞ = ∩_{k=l}^{∞} F[k,∞) for any l.
Definition 11.
For a sequence {ξn}_{n=−∞}^{∞},
F∞ = ∩_{k=1}^{∞} F[k,∞) = ∩_{k=l}^{∞} F[k,∞), −∞ < l < ∞, and, symmetrically, F−∞ = ∩_{k} F(−∞,k].
(A σ-algebra is called trivial if each of its events has probability 0 or 1.)
Lemma 14 (Kolmogorov's zero-one law). If {ξn}_{n≥1} is a sequence of independent r.v.s, then the tail σ-algebra F∞ is trivial.
Lemma 15.
If {ξn}_{n=−∞}^{∞} is a sequence of independent r.v.s, then both F∞ and F−∞ are trivial.
Problem No 9. Prove Lemma 15.
(3). A stationary sequence of r.v.s.
Definition 12.
A sequence {ξn}_{n≥1} (or {ξn}_{n=−∞}^{∞}) is stationary if, for all l ≥ 1, 1 ≤ n1 < n2 < . . . < nl (or without the lower restriction), and k ≥ 1 (or −∞ < k < ∞),
(ξ_{n1}, . . . , ξ_{nl}) =_D (ξ_{n1+k}, . . . , ξ_{nl+k}).
For a stationary sequence, let θ denote the shift transformation taking {ξn} to {ξ_{n+1}}; θ^n A and θ^n η denote the corresponding shifts of events A and r.v.s η determined by the sequence.
Definition 13.
An F[1,∞)-measurable (or F(−∞,∞)-measurable) r.v. η is invariant (w.r. to θ) if
θη = η a.s. (i.e. P(θη = η) = 1).
An event A ∈ F[1,∞) (or A ∈ F(−∞,∞)) is invariant (w.r. to θ) if
P(A ∩ θA) = P(A).
Note that θη = η a.s. ⟺ for all x,
P({η ≤ x} ∩ {θη ≤ x}) = P(η ≤ x).
Comments, examples...
Definition 14.
A stationary sequence {ξn} is ergodic (w.r. to θ) if, for any A ∈ F[1,∞) (or A ∈ F(−∞,∞)),
A is invariant ⟹ P(A) = 0 or 1
(or: η is invariant ⟹ η = const a.s.).
Remark 5.
All invariant events (sets) form a σ-algebra F(inv) (the invariant σ-algebra).
Lemma 16.
(1) For any A ∈ F[1,∞) (or A ∈ F(−∞,∞)), the sequence of events {θ^n A, n ≥ 0} (or {θ^n A, −∞ < n < ∞}) is stationary;
(2) If {ξn} is stationary ergodic, then, for any A ∈ F[1,∞) (or A ∈ F(−∞,∞)) with P(A) > 0,
P(∪_{n=l}^{∞} θ^n A) = 1 for all l (and P(∪_{n=−∞}^{l} θ^n A) = 1 for all l).
Proof. (1) follows from definitions.
(2) Let B = ∪_{n=l}^{∞} θ^n A. Then
θB = ∪_{n=l}^{∞} θ^{n+1} A = ∪_{n=l+1}^{∞} θ^n A
and B ⊇ θB
⟹ P(B ∩ θB) = P(θB) = P(B) ⟹ B is invariant
⟹ P(B) = 0 or 1.
But P(B) ≥ P(θ^l A) = P(A) > 0 ⟹ P(B) = 1.
Lemma 17.
If A is invariant, then there exists B ∈ F∞ such that d(A, B) = 0.
Proof. There are two cases: (a) F[1,∞); (b) F(−∞,∞). Here we give a proof in the first case.
Problem No 10. Prove the lemma in case (b).
1) Let B_{0,m} = A ∩ θA ∩ θ²A ∩ . . . ∩ θ^m A, and B0 = ∩_{n=0}^{∞} θ^n A. Then
A = B_{0,0} ⊇ B_{0,1} ⊇ . . . ⊇ B_{0,m} ⊇ B_{0,m+1} ⊇ . . . ⊇ B0
and P(B_{0,m}) ↓ P(B0). But
P(B_{0,m}) = P(A) for all m ⟹ P(B0) = P(A) and d(B0, A) = 0.
2) For k ≥ 1, put Bk = θ^k B0 ≡ ∩_{n=k}^{∞} θ^n A.
Note that B_{k+1} ⊇ Bk and Bk ∈ F[k,∞),
P(Bk) = P(B0) = P(A) and d(Bk, A) = 0.
Let
B = lim_k Bk ⟹ P(B) = P(A) and d(B, A) = 0.
Since B ∈ F[k,∞) for all k ⟹ B ∈ F∞.
Remark 6.
In the case F(−∞,∞), the symmetric statement is true, too: if A is invariant, then there exists B ∈ F−∞ such that d(A, B) = 0.
Corollary 3. Any i.i.d. sequence is stationary ergodic.
Indeed, F∞ is trivial ⟹ if A is invariant, then there exists B ∈ F∞ with P(B) = 0 or 1 and d(A, B) = 0
⟹ P(A) = 0 or 1.
Remark 7.
There exists a number of weaker conditions that imply the triviality of the tail σ-algebra F∞ and, as a corollary, the ergodicity of a stationary sequence.
For instance, we can introduce the following mixing coefficients:
dk = sup_{B∈F[k,∞), A∈F(−∞,0]} |P(A ∩ B) − P(A) P(B)|,
and then show that if dk → 0 as k → ∞, then F∞ is trivial.
In general, there are examples where F∞ is not trivial, but F(inv) is trivial (i.e. the sequence is ergodic).
Example. ξ_{n+1} = −ξn, n ≥ 1; ξ1 = 1 w.pr. 1/2, = −1 w.pr. 1/2. Then