characterization of cutoff for reversible markov chainscharacterization of cutoff for reversible...

Post on 30-Dec-2019

7 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Characterization of cutoff for reversible Markov chainsYuval Peres

Joint work with Riddhi Basu and Jonathan Hermon

23 Feb 2015

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 1 / 35

Notation

Transition matrix - P (reversible).

Stationary dist. - π.

Reversibility: π(x)P (x, y) = π(y)P (y, x), ∀x, y ∈ Ω.

Laziness P (x, x) ≥ 1/2, ∀x ∈ Ω.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 2 / 35

TV distance

The total-variation distance between 2 dist. µ, ν on Ω is:

‖µ− ν‖TVd= max

A⊂Ωµ(A)− ν(A) .

d(t)d= max

x‖Ptx − π‖TV,

where Ptx(y) := P t(x, y).

The ε-mixing-time (0 < ε < 1) is:

tmix(ε)d= min t : d(t) ≤ ε

tmixd= tmix(1/4).

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 3 / 35

TV distance

The total-variation distance between 2 dist. µ, ν on Ω is:

‖µ− ν‖TVd= max

A⊂Ωµ(A)− ν(A) .

d(t)d= max

x‖Ptx − π‖TV,

where Ptx(y) := P t(x, y).

The ε-mixing-time (0 < ε < 1) is:

tmix(ε)d= min t : d(t) ≤ ε

tmixd= tmix(1/4).

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 3 / 35

Cutoff - definition

Def: a sequence of MCs (X(n)t ) exhibits cutoff if

t(n)mix(ε)− t(n)

mix(1− ε) = o(t(n)mix), ∀ 0 < ε < 1/4. (1)

(wn) is called a cutoff window for (X(n)t ) if: wn = o

(t(n)mix

), and

t(n)mix(ε)− t(n)

mix(1− ε) ≤ cεwn, ∀n ≥ 1,∀ε ∈ (0, 1/4).

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 4 / 35

Cutoff - definition

Def: a sequence of MCs (X(n)t ) exhibits cutoff if

t(n)mix(ε)− t(n)

mix(1− ε) = o(t(n)mix), ∀ 0 < ε < 1/4. (1)

(wn) is called a cutoff window for (X(n)t ) if: wn = o

(t(n)mix

), and

t(n)mix(ε)− t(n)

mix(1− ε) ≤ cεwn, ∀n ≥ 1, ∀ε ∈ (0, 1/4).

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 4 / 35

Cutoff

Figure : cutoff

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 5 / 35

Examples

Cutoff was first identified for random transpositions (Diaconis & Shashahani 81).tmix = n logn

2and wn = n . Many other card shuffling schemes have cutoff.

Lazy simple random walk on the hypercube 0, 1n exhibits a cutoff withtmix = n logn

2(half the coupon collector time) and wn = n (Aldous 83).

Biased nearest neighbor random walk on an 1, 2, . . . , n has cutoff, with a biastoward n has cutoff around E1[Tn].

The glauber dynamics of the Ising model in large temperature has cutoff for manyunderlying graphs (LS).

Lazy simple random walk on the n-cycle does not exhibit a cutoff (tmix = Θ(n2)).Same for higher dimensional tori of side length n.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 6 / 35

Examples

Cutoff was first identified for random transpositions (Diaconis & Shashahani 81).tmix = n logn

2and wn = n . Many other card shuffling schemes have cutoff.

Lazy simple random walk on the hypercube 0, 1n exhibits a cutoff withtmix = n logn

2(half the coupon collector time) and wn = n (Aldous 83).

Biased nearest neighbor random walk on an 1, 2, . . . , n has cutoff, with a biastoward n has cutoff around E1[Tn].

The glauber dynamics of the Ising model in large temperature has cutoff for manyunderlying graphs (LS).

Lazy simple random walk on the n-cycle does not exhibit a cutoff (tmix = Θ(n2)).Same for higher dimensional tori of side length n.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 6 / 35

Examples

Cutoff was first identified for random transpositions (Diaconis & Shashahani 81).tmix = n logn

2and wn = n . Many other card shuffling schemes have cutoff.

Lazy simple random walk on the hypercube 0, 1n exhibits a cutoff withtmix = n logn

2(half the coupon collector time) and wn = n (Aldous 83).

Biased nearest neighbor random walk on an 1, 2, . . . , n has cutoff, with a biastoward n has cutoff around E1[Tn].

The glauber dynamics of the Ising model in large temperature has cutoff for manyunderlying graphs (LS).

Lazy simple random walk on the n-cycle does not exhibit a cutoff (tmix = Θ(n2)).Same for higher dimensional tori of side length n.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 6 / 35

Background

Many chains are believed to exhibit cutoff. Verifying the occurrence of cutoffrigorously is usually hard.

The name cutoff was coined by Aldous and Diaconis in their seminal 86 paper:

(AD 86) “the most interesting open problem”: Find verifiable conditions for cutoff.

So far cutoff was mostly studied by understanding examples.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 7 / 35

Background

Many chains are believed to exhibit cutoff. Verifying the occurrence of cutoffrigorously is usually hard.

The name cutoff was coined by Aldous and Diaconis in their seminal 86 paper:

(AD 86) “the most interesting open problem”: Find verifiable conditions for cutoff.

So far cutoff was mostly studied by understanding examples.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 7 / 35

Background

Many chains are believed to exhibit cutoff. Verifying the occurrence of cutoffrigorously is usually hard.

The name cutoff was coined by Aldous and Diaconis in their seminal 86 paper:

(AD 86) “the most interesting open problem”: Find verifiable conditions for cutoff.

So far cutoff was mostly studied by understanding examples.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 7 / 35

Spectral gap & relaxation-time

Let λ2 be the largest non-trivial e.v. of P .

Definition: gap = 1− λ2 - the spectral gap.

Def: trel := 1gap

- the relaxation-time.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 8 / 35

The product condition (Prod. cond.)

In a 2004 Aim workshop I proposed that The product condition (Prod. Cond.) -gap(n)t

(n)mix →∞ (equivalently, t(n)

rel = o(t(n)mix))

should imply cutoff for ”nice” reversible chains.

(It is a necessary condition for cutoff)

It is not always sufficient (Aldous and Pak).

Problem: Find families of MCs s.t. Prod. Cond. =⇒ cutoff.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 9 / 35

The product condition (Prod. cond.)

In a 2004 Aim workshop I proposed that The product condition (Prod. Cond.) -gap(n)t

(n)mix →∞ (equivalently, t(n)

rel = o(t(n)mix))

should imply cutoff for ”nice” reversible chains.

(It is a necessary condition for cutoff)

It is not always sufficient (Aldous and Pak).

Problem: Find families of MCs s.t. Prod. Cond. =⇒ cutoff.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 9 / 35

Aldous’ example

Figure : Fixed bias to the right conditioned on a non-lazy step. Different holding probabilities along 2parallel branches of equal length.

t(n)rel = O(1).

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 10 / 35

Aldous’ example

d(t)

Figure : d(t) is up to negligible terms the probability that the rightmost state was not hit by time t(starting from the leftmost state).

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 11 / 35

Hitting and Mixing

Def: The hitting time of a set A ⊂ Ω = TA := mint : Xt ∈ A.

Hitting times of “worst” sets are related to mixing:

Fact (Random Target): Under laziness and reversibilitytmix ≤ C maxx

∑y π(y)Ex[Ty] ≤ C maxx,y Ex[Ty].

Generalization (Aldous 82 LW 95) under reversibility and laziness

tmix ≤ C maxx,A

π(A)Ex[TA].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 12 / 35

Hitting and Mixing

Def: The hitting time of a set A ⊂ Ω = TA := mint : Xt ∈ A.

Hitting times of “worst” sets are related to mixing:

Fact (Random Target): Under laziness and reversibilitytmix ≤ C maxx

∑y π(y)Ex[Ty] ≤ C maxx,y Ex[Ty].

Generalization (Aldous 82 LW 95) under reversibility and laziness

tmix ≤ C maxx,A

π(A)Ex[TA].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 12 / 35

Hitting and Mixing

Def: The hitting time of a set A ⊂ Ω = TA := mint : Xt ∈ A.

Hitting times of “worst” sets are related to mixing:

Fact (Random Target): Under laziness and reversibilitytmix ≤ C maxx

∑y π(y)Ex[Ty] ≤ C maxx,y Ex[Ty].

Generalization (Aldous 82 LW 95) under reversibility and laziness

tmix ≤ C maxx,A

π(A)Ex[TA].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 12 / 35

Hitting and Mixing

Oliveira (2011) and independently P.-Sousi (2011) (+ an by Griffiths et al. (2012)):for any irreducible reversible lazy MC

tmix = Θ(tH), where tH := maxx,A:π(A)≥1/2

Ex[TA]. (2)

The threshold 1/2 cannot be improved. For two n-cliques connected by a singleedge sets of π measure ≥ 1/2 + ε are hit in constant number of steps.

d(t) ∼ 12P(the other clique was hit by time t) =⇒ maybe we should look at tails

rather than on expectations.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 13 / 35

Hitting and Mixing

Oliveira (2011) and independently P.-Sousi (2011) (+ an by Griffiths et al. (2012)):for any irreducible reversible lazy MC

tmix = Θ(tH), where tH := maxx,A:π(A)≥1/2

Ex[TA]. (2)

The threshold 1/2 cannot be improved. For two n-cliques connected by a singleedge sets of π measure ≥ 1/2 + ε are hit in constant number of steps.

d(t) ∼ 12P(the other clique was hit by time t) =⇒ maybe we should look at tails

rather than on expectations.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 13 / 35

Hitting and Cutoff

Diaconis & Saloff-Coste (06) (separation cutoff) and Ding-Lubetzky-P. (10) (TVcutoff):

A seq. of birth and death chains exhibits cutoff iff the Prod. Cond. holds.

Both proofs reduce the problem to establishing concentration of hitting times.

We extend their results to weighted nearest-neighbor RWs on trees.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 14 / 35

Hitting and Cutoff

Diaconis & Saloff-Coste (06) (separation cutoff) and Ding-Lubetzky-P. (10) (TVcutoff):

A seq. of birth and death chains exhibits cutoff iff the Prod. Cond. holds.

Both proofs reduce the problem to establishing concentration of hitting times.

We extend their results to weighted nearest-neighbor RWs on trees.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 14 / 35

Cutoff for trees

Theorem

Let (V, P, π) be a lazy Markov chain on a tree T = (V,E). Then

tmix(ε)− tmix(1− ε) ≤ C| log ε|√treltmix, for any 0 < ε ≤ 1/4.

In particular, the Prod. Cond. implies cutoff with a cutoff window wn =

√t(n)rel t

(n)mix.

DLP (10) - for lazy BD chains tmix(ε)− tmix(1− ε) ≤ O(ε−1√treltmix).

In many cases any cutoff window must be Ω

(√t(n)rel t

(n)mix

).

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 15 / 35

Cutoff for trees

Theorem

Let (V, P, π) be a lazy Markov chain on a tree T = (V,E). Then

tmix(ε)− tmix(1− ε) ≤ C| log ε|√treltmix, for any 0 < ε ≤ 1/4.

In particular, the Prod. Cond. implies cutoff with a cutoff window wn =

√t(n)rel t

(n)mix.

DLP (10) - for lazy BD chains tmix(ε)− tmix(1− ε) ≤ O(ε−1√treltmix).

In many cases any cutoff window must be Ω

(√t(n)rel t

(n)mix

).

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 15 / 35

A mixing parameter in terms of hitting times

Let 0 < ε < 1.

hit(ε) := mint : Px[TA > t] ≤ ε : for all x and A ⊂ Ω s.t. π(A) ≥ 1/2.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 16 / 35

Main abstract result

Definition: A sequence has hit-cutoff if

hit(n)(ε)− hit(n)(1− ε) = o(hit(n)(1/4)) for all 0 < ε < 1/4.

Main abstract result:

Theorem

Let (Ωn, Pn, πn) be a seq. of finite reversible lazy MCs. Then TFAE:

(i) Cutoff.

(ii) hit-cutoff.

We show that (ii) implies the Prod. Cond. (for (i) this is known).

Hence (i)⇐⇒(ii) follows from the following quantitative relation between hit andtmix using hit = Θ(tmix):

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 17 / 35

Main abstract result

Definition: A sequence has hit-cutoff if

hit(n)(ε)− hit(n)(1− ε) = o(hit(n)(1/4)) for all 0 < ε < 1/4.

Main abstract result:

Theorem

Let (Ωn, Pn, πn) be a seq. of finite reversible lazy MCs. Then TFAE:

(i) Cutoff.

(ii) hit-cutoff.

We show that (ii) implies the Prod. Cond. (for (i) this is known).

Hence (i)⇐⇒(ii) follows from the following quantitative relation between hit andtmix using hit = Θ(tmix):

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 17 / 35

Main abstract result

Definition: A sequence has hit-cutoff if

hit(n)(ε)− hit(n)(1− ε) = o(hit(n)(1/4)) for all 0 < ε < 1/4.

Main abstract result:

Theorem

Let (Ωn, Pn, πn) be a seq. of finite reversible lazy MCs. Then TFAE:

(i) Cutoff.

(ii) hit-cutoff.

We show that (ii) implies the Prod. Cond. (for (i) this is known).

Hence (i)⇐⇒(ii) follows from the following quantitative relation between hit andtmix using hit = Θ(tmix):

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 17 / 35

To mix - escape and then relax

For any reversible irreducible finite lazy chain and any 0 < ε ≤ 1/4,

hit(3ε)− trel| log(2ε)| ≤ tmix(2ε) ≤ hit(ε) + trel| log(2ε)|

Terms involving trel are negligible under the Prod. Cond..

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 18 / 35

Hitting times when X0 ∼ π

Easy direction: to mix, the chain must first escape from small sets = “first stage ofmixing”.

Fact: Let A ⊂ Ω be such that π(A) ≥ 1/2. Then (under reversibility)

Pπ[TA > 2strel] ≤e−s

2, for all s ≥ 0.

Follows by the spectral decomposition of PAc (the chain killed upon hitting A).

By a coupling argument,

Px[TA > t+ 2strel] ≤ d(t) + Pπ[TA > 2strel].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 19 / 35

Hitting times when X0 ∼ π

Easy direction: to mix, the chain must first escape from small sets = “first stage ofmixing”.

Fact: Let A ⊂ Ω be such that π(A) ≥ 1/2. Then (under reversibility)

Pπ[TA > 2strel] ≤e−s

2, for all s ≥ 0.

Follows by the spectral decomposition of PAc (the chain killed upon hitting A).

By a coupling argument,

Px[TA > t+ 2strel] ≤ d(t) + Pπ[TA > 2strel].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 19 / 35

Hitting times when X0 ∼ π

Easy direction: to mix, the chain must first escape from small sets = “first stage ofmixing”.

Fact: Let A ⊂ Ω be such that π(A) ≥ 1/2. Then (under reversibility)

Pπ[TA > 2strel] ≤e−s

2, for all s ≥ 0.

Follows by the spectral decomposition of PAc (the chain killed upon hitting A).

By a coupling argument,

Px[TA > t+ 2strel] ≤ d(t) + Pπ[TA > 2strel].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 19 / 35

The hard direction

We say the chain is ε-mixed w.r.t. A ⊂ Ω at time t+ s if for every x

|Pt+sx (A)− π(A)| ≤ ε.

We say that the event TG ≤ t “serves as a certificate” of being ε-mixedw.r.t. A ⊂ Ω at time t+ s if

|Px[Xt+s ∈ A | TG ≤ t]− π(A)| ≤ ε.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 20 / 35

Application of the max ineq.

Claim: Denote t := hit(ε), s := trel × log(2/ε). Then

tmix(2ε) ≤ t+ s.

Proof: want for every A ⊂ Ω to have G = Gs(A) ⊂ Ω s.t. π(G) ≥ 1/2 and TG ≤ tserves as a certificate of “being ε-mixed w.r.t. A” at time t+ s.

Consider

G = Gs(A) :=g : ∀s ≥ s, |Psg(A)− π(A)| ≤ 2e−s/trel = ε

.

A consequence of Starr’s maximal ineq. is that π(G) ≥ 1/2.

=⇒ Px[TG > t] ≤ ε. (by def. of t).

For any x,A:

|Pt+sx (A)− π(A)| ≤ Px[TG > t] + maxg∈G,s≥s

|Psg(A)− π(A)| ≤ 2ε.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 21 / 35

Application of the max ineq.

Claim: Denote t := hit(ε), s := trel × log(2/ε). Then

tmix(2ε) ≤ t+ s.

Proof: want for every A ⊂ Ω to have G = Gs(A) ⊂ Ω s.t. π(G) ≥ 1/2 and TG ≤ tserves as a certificate of “being ε-mixed w.r.t. A” at time t+ s.

Consider

G = Gs(A) :=g : ∀s ≥ s, |Psg(A)− π(A)| ≤ 2e−s/trel = ε

.

A consequence of Starr’s maximal ineq. is that π(G) ≥ 1/2.

=⇒ Px[TG > t] ≤ ε. (by def. of t).

For any x,A:

|Pt+sx (A)− π(A)| ≤ Px[TG > t] + maxg∈G,s≥s

|Psg(A)− π(A)| ≤ 2ε.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 21 / 35

Application of the max ineq.

Claim: Denote t := hit(ε), s := trel × log(2/ε). Then

tmix(2ε) ≤ t+ s.

Proof: want for every A ⊂ Ω to have G = Gs(A) ⊂ Ω s.t. π(G) ≥ 1/2 and TG ≤ tserves as a certificate of “being ε-mixed w.r.t. A” at time t+ s.

Consider

G = Gs(A) :=g : ∀s ≥ s, |Psg(A)− π(A)| ≤ 2e−s/trel = ε

.

A consequence of Starr’s maximal ineq. is that π(G) ≥ 1/2.

=⇒ Px[TG > t] ≤ ε. (by def. of t).

For any x,A:

|Pt+sx (A)− π(A)| ≤ Px[TG > t] + maxg∈G,s≥s

|Psg(A)− π(A)| ≤ 2ε.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 21 / 35

Application of the max ineq.

Claim: Denote t := hit(ε), s := trel × log(2/ε). Then

tmix(2ε) ≤ t+ s.

Proof: want for every A ⊂ Ω to have G = Gs(A) ⊂ Ω s.t. π(G) ≥ 1/2 and TG ≤ tserves as a certificate of “being ε-mixed w.r.t. A” at time t+ s.

Consider

G = Gs(A) :=g : ∀s ≥ s, |Psg(A)− π(A)| ≤ 2e−s/trel = ε

.

A consequence of Starr’s maximal ineq. is that π(G) ≥ 1/2.

=⇒ Px[TG > t] ≤ ε. (by def. of t).

For any x,A:

|Pt+sx (A)− π(A)| ≤ Px[TG > t] + maxg∈G,s≥s

|Psg(A)− π(A)| ≤ 2ε.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 21 / 35

Remark:

The threshold 1/2 in the definition of hit can be replaced by any 0 < α < 1, i.e. ata price of an additive error of trel| log(1− α)| the analysis extends to

hitα,x(ε) := mint : Px[TA > t] ≤ ε : for all A ⊂ Ω s.t. π(A) ≥ α,

hitα(ε) := maxx∈Ω

hitα,x(ε)

If α > 1/2 one needs to assume that the Prod. Cond. holds to maintain theequivalence of the 2 notions of cutoff.

=⇒When trel is known, it suffices to consider escape probabilities from small setsin order to bound the mixing-time.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 22 / 35

Remark:

The threshold 1/2 in the definition of hit can be replaced by any 0 < α < 1, i.e. ata price of an additive error of trel| log(1− α)| the analysis extends to

hitα,x(ε) := mint : Px[TA > t] ≤ ε : for all A ⊂ Ω s.t. π(A) ≥ α,

hitα(ε) := maxx∈Ω

hitα,x(ε)

If α > 1/2 one needs to assume that the Prod. Cond. holds to maintain theequivalence of the 2 notions of cutoff.

=⇒When trel is known, it suffices to consider escape probabilities from small setsin order to bound the mixing-time.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 22 / 35

Tools

Def: For f ∈ RΩ, t ≥ 0, define P tf ∈ RΩ byP tf(x) := Ex[f(Xt)] =

∑y

P t(x, y)f(y).

For f ∈ RΩ define Eπ[f ] :=∑x∈Ω

π(x)f(x) and ‖f‖22 := Eπ[f2].

For g ∈ RΩ denote Varπg := ‖g − Eπ[g]‖22.

The following is well-known and follows from elementary linear-algebra.

Lemma (Contraction Lemma)

Let (Ω, P, π) be a finite rev. irr. lazy MC. Let A ⊂ Ω. Let t ≥ 0. Then

VarπPt1A ≤ e−2t/trel/4. (3)

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 23 / 35

Tools

Def: For f ∈ RΩ, t ≥ 0, define P tf ∈ RΩ byP tf(x) := Ex[f(Xt)] =

∑y

P t(x, y)f(y).

For f ∈ RΩ define Eπ[f ] :=∑x∈Ω

π(x)f(x) and ‖f‖22 := Eπ[f2].

For g ∈ RΩ denote Varπg := ‖g − Eπ[g]‖22.

The following is well-known and follows from elementary linear-algebra.

Lemma (Contraction Lemma)

Let (Ω, P, π) be a finite rev. irr. lazy MC. Let A ⊂ Ω. Let t ≥ 0. Then

VarπPt1A ≤ e−2t/trel/4. (3)

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 23 / 35

Tools

Def: For f ∈ RΩ, t ≥ 0, define P tf ∈ RΩ byP tf(x) := Ex[f(Xt)] =

∑y

P t(x, y)f(y).

For f ∈ RΩ define Eπ[f ] :=∑x∈Ω

π(x)f(x) and ‖f‖22 := Eπ[f2].

For g ∈ RΩ denote Varπg := ‖g − Eπ[g]‖22.

The following is well-known and follows from elementary linear-algebra.

Lemma (Contraction Lemma)

Let (Ω, P, π) be a finite rev. irr. lazy MC. Let A ⊂ Ω. Let t ≥ 0. Then

VarπPt1A ≤ e−2t/trel/4. (3)

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 23 / 35

Tools

Def: For f ∈ RΩ, t ≥ 0, define P tf ∈ RΩ byP tf(x) := Ex[f(Xt)] =

∑y

P t(x, y)f(y).

For f ∈ RΩ define Eπ[f ] :=∑x∈Ω

π(x)f(x) and ‖f‖22 := Eπ[f2].

For g ∈ RΩ denote Varπg := ‖g − Eπ[g]‖22.

The following is well-known and follows from elementary linear-algebra.

Lemma (Contraction Lemma)

Let (Ω, P, π) be a finite rev. irr. lazy MC. Let A ⊂ Ω. Let t ≥ 0. Then

VarπPt1A ≤ e−2t/trel/4. (3)

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 23 / 35

Maximal Inequality

The main ingredient in our approach is Starr’s maximal-inequality (66) (refines Stein’smax-inequality (61))

Theorem (Maximal inequality)

Let (Ω, P, π) be a lazy irreducible reversible Markov chain. Let f ∈ RΩ. Define thecorresponding maximal function f∗ ∈ RΩ as

f∗(x) := sup0≤k<∞

|P k(f)(x)| = sup0≤k<∞

|Ex[f(Xk)]|.

Then for 1 < p <∞,‖f∗‖p ≤ q‖f‖p 1/p+ 1/q = 1. (4)

Applying this (with p = 2) to fs := P s(1A − π(A)) in conjunction with the ContractionLemma yields the desired bound on the size of

Gc ⊂ (f∗s )2 > 8‖fs‖22 ⊂ (f∗s )2 > 2‖f∗s ‖22.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 24 / 35

Maximal Inequality

The main ingredient in our approach is Starr’s maximal-inequality (66) (refines Stein’smax-inequality (61))

Theorem (Maximal inequality)

Let (Ω, P, π) be a lazy irreducible reversible Markov chain. Let f ∈ RΩ. Define thecorresponding maximal function f∗ ∈ RΩ as

f∗(x) := sup0≤k<∞

|P k(f)(x)| = sup0≤k<∞

|Ex[f(Xk)]|.

Then for 1 < p <∞,‖f∗‖p ≤ q‖f‖p 1/p+ 1/q = 1. (4)

Applying this (with p = 2) to fs := P s(1A − π(A)) in conjunction with the ContractionLemma yields the desired bound on the size of

Gc ⊂ (f∗s )2 > 8‖fs‖22 ⊂ (f∗s )2 > 2‖f∗s ‖22.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 24 / 35

Trees

Let: T := (V,E) be a finite tree.

(V, P, π) a lazy MC corresponding to some (lazy) weighted nearest-neighbor walkon T (i.e. P (x, y) > 0 iff x, y ∈ E or y = x).

Fact: (Kolmogorov’s cycle condition) every MC on a tree is reversible.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 25 / 35

Trees

Can the tree structure be used to determine the identity of the “worst” sets?

Easier question: what set of π measure ≥ 1/2 is the “hardest” to hit in a birth &death chain with state space [n] := 1, 2, . . . , n ?

Answer: take a state m with π([m]) ≥ 1/2 and π([m− 1]) < 1/2. Then the setworst set would be either [m] or [n] \ [m− 1].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 26 / 35

Trees

Can the tree structure be used to determine the identity of the “worst” sets?

Easier question: what set of π measure ≥ 1/2 is the “hardest” to hit in a birth &death chain with state space [n] := 1, 2, . . . , n ?

Answer: take a state m with π([m]) ≥ 1/2 and π([m− 1]) < 1/2. Then the setworst set would be either [m] or [n] \ [m− 1].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 26 / 35

Trees

Can the tree structure be used to determine the identity of the “worst” sets?

Easier question: what set of π measure ≥ 1/2 is the “hardest” to hit in a birth &death chain with state space [n] := 1, 2, . . . , n ?

Answer: take a state m with π([m]) ≥ 1/2 and π([m− 1]) < 1/2. Then the setworst set would be either [m] or [n] \ [m− 1].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 26 / 35

Trees

How to generalize this to trees?

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 27 / 35

Central vertex

oC3

C2

C1

C4

∏(Ci)≤1/2for all i

Figure : A vertex o ∈ V is called a central-vertex if each connected component of T \ o hasstationary probability at most 1/2.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 28 / 35

Trees

There is always a central-vertex (and at most 2). We fix one, denote it by o andcall it the root.

There is a simple reduction: To is concentrated (from worst leaf) =⇒ hit-cutoff.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 29 / 35

Trees

There is always a central-vertex (and at most 2). We fix one, denote it by o andcall it the root.

There is a simple reduction: To is concentrated (from worst leaf) =⇒ hit-cutoff.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 29 / 35

Trees

It suffices to show that Ex[To] = Ω(tmix) =⇒ To is concentrated if X0 = x.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 30 / 35

Trees

It suffices to show that Ex[To] = Ω(tmix) =⇒ To is concentrated if X0 = x.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 30 / 35

Trees

o=vk

x=v0v1

v2v3

vk-1

Figure : Let v0 = x,v1, . . . , vk = o be the vertices along the path from x to o.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 31 / 35

Trees

Proof of Concentration: Varx[To] ≤ Ctreltmix:

It suffices to show that Varx[To] ≤ 4trelEx[To].

If X0 = x then To is the sum of crossing times of the edges along the pathbetween x: τi := Tvi − Tvi−1

d= Tvi underX0 = vi−1

τ1, . . . , τk are independent =⇒ it suffices to bound the sum of their variancesVarx[To] =

∑Varx[τi].

The law of each τi is a mixture of geometrics with parameters ≥ 1/2trel =⇒∑Varx[τi] ≤

∑4trelEx[τi] = 4trelEx[To].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 32 / 35

Trees

Proof of Concentration: Varx[To] ≤ Ctreltmix:

It suffices to show that Varx[To] ≤ 4trelEx[To].

If X0 = x then To is the sum of crossing times of the edges along the pathbetween x: τi := Tvi − Tvi−1

d= Tvi underX0 = vi−1

τ1, . . . , τk are independent =⇒ it suffices to bound the sum of their variancesVarx[To] =

∑Varx[τi].

The law of each τi is a mixture of geometrics with parameters ≥ 1/2trel =⇒∑Varx[τi] ≤

∑4trelEx[τi] = 4trelEx[To].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 32 / 35

Trees

Proof of Concentration: Varx[To] ≤ Ctreltmix:

It suffices to show that Varx[To] ≤ 4trelEx[To].

If X0 = x then To is the sum of crossing times of the edges along the pathbetween x: τi := Tvi − Tvi−1

d= Tvi underX0 = vi−1

τ1, . . . , τk are independent =⇒ it suffices to bound the sum of their variancesVarx[To] =

∑Varx[τi].

The law of each τi is a mixture of geometrics with parameters ≥ 1/2trel =⇒∑Varx[τi] ≤

∑4trelEx[τi] = 4trelEx[To].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 32 / 35

Trees

Proof of Concentration: Varx[To] ≤ Ctreltmix:

It suffices to show that Varx[To] ≤ 4trelEx[To].

If X0 = x then To is the sum of crossing times of the edges along the pathbetween x: τi := Tvi − Tvi−1

d= Tvi underX0 = vi−1

τ1, . . . , τk are independent =⇒ it suffices to bound the sum of their variancesVarx[To] =

∑Varx[τi].

The law of each τi is a mixture of geometrics with parameters ≥ 1/2trel =⇒∑Varx[τi] ≤

∑4trelEx[τi] = 4trelEx[To].

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 32 / 35

Beyond trees

The tree assumption can be relaxed. We can treat jumps to vertices of boundeddistance on a tree (i.e. the length of the path from u to v in the tree (which is nowjust an auxiliary structure) is > r =⇒ P (u, v) = 0) under some mild necessaryassumption.

Previously the BD assumption could not be relaxed mainly due to it beingexploited via a representation of hitting times result for BD chains.

In particular, if P (u, v) ≥ δ > 0 for all u, v s.t. dT (u, v) ≤ r (and otherwiseP (u, v) = 0), then

cutoff⇐⇒ the Prod. Cond. holds.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 33 / 35

Beyond trees

The tree assumption can be relaxed. We can treat jumps to vertices of boundeddistance on a tree (i.e. the length of the path from u to v in the tree (which is nowjust an auxiliary structure) is > r =⇒ P (u, v) = 0) under some mild necessaryassumption.

Previously the BD assumption could not be relaxed mainly due to it beingexploited via a representation of hitting times result for BD chains.

In particular, if P (u, v) ≥ δ > 0 for all u, v s.t. dT (u, v) ≤ r (and otherwiseP (u, v) = 0), then

cutoff⇐⇒ the Prod. Cond. holds.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 33 / 35

Beyond trees

The tree assumption can be relaxed. We can treat jumps to vertices of boundeddistance on a tree (i.e. the length of the path from u to v in the tree (which is nowjust an auxiliary structure) is > r =⇒ P (u, v) = 0) under some mild necessaryassumption.

Previously the BD assumption could not be relaxed mainly due to it beingexploited via a representation of hitting times result for BD chains.

In particular, if P (u, v) ≥ δ > 0 for all u, v s.t. dT (u, v) ≤ r (and otherwiseP (u, v) = 0), then

cutoff⇐⇒ the Prod. Cond. holds.

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 33 / 35

Open problems

What can be said about the geometry of the “worst” sets in some interestingparticular cases (say, transitivity or monotonicity)?

When can the worst sets be described as |f2| ≤ C (Pf2 = λ2f2)?

Is it true that for transitive graphs the mixing time is bounded by the expected timeit takes the walk to escape a ball containing at least half the vertices?

Does the product condition imply cutoff for transitive graphs under certain mildconditions?

Does contraction implies cutoff? (under some mild conditions...)

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 34 / 35

Open problems

What can be said about the geometry of the “worst” sets in some interestingparticular cases (say, transitivity or monotonicity)?

When can the worst sets be described as |f2| ≤ C (Pf2 = λ2f2)?

Is it true that for transitive graphs the mixing time is bounded by the expected timeit takes the walk to escape a ball containing at least half the vertices?

Does the product condition imply cutoff for transitive graphs under certain mildconditions?

Does contraction implies cutoff? (under some mild conditions...)

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 34 / 35

Open problems

What can be said about the geometry of the “worst” sets in some interestingparticular cases (say, transitivity or monotonicity)?

When can the worst sets be described as |f2| ≤ C (Pf2 = λ2f2)?

Is it true that for transitive graphs the mixing time is bounded by the expected timeit takes the walk to escape a ball containing at least half the vertices?

Does the product condition imply cutoff for transitive graphs under certain mildconditions?

Does contraction implies cutoff? (under some mild conditions...)

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 34 / 35

Open problems

What can be said about the geometry of the “worst” sets in some interestingparticular cases (say, transitivity or monotonicity)?

When can the worst sets be described as |f2| ≤ C (Pf2 = λ2f2)?

Is it true that for transitive graphs the mixing time is bounded by the expected timeit takes the walk to escape a ball containing at least half the vertices?

Does the product condition imply cutoff for transitive graphs under certain mildconditions?

Does contraction implies cutoff? (under some mild conditions...)

Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 34 / 35

top related