characterization of cutoff for reversible markov chainscharacterization of cutoff for reversible...
TRANSCRIPT
Characterization of cutoff for reversible Markov chainsYuval Peres
Joint work with Riddhi Basu and Jonathan Hermon
23 Feb 2015
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 1 / 35
Notation
Transition matrix - P (reversible).
Stationary dist. - π.
Reversibility: π(x)P (x, y) = π(y)P (y, x), ∀x, y ∈ Ω.
Laziness P (x, x) ≥ 1/2, ∀x ∈ Ω.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 2 / 35
TV distance
The total-variation distance between 2 dist. µ, ν on Ω is:
‖µ− ν‖TVd= max
A⊂Ωµ(A)− ν(A) .
d(t)d= max
x‖Ptx − π‖TV,
where Ptx(y) := P t(x, y).
The ε-mixing-time (0 < ε < 1) is:
tmix(ε)d= min t : d(t) ≤ ε
tmixd= tmix(1/4).
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 3 / 35
TV distance
The total-variation distance between 2 dist. µ, ν on Ω is:
‖µ− ν‖TVd= max
A⊂Ωµ(A)− ν(A) .
d(t)d= max
x‖Ptx − π‖TV,
where Ptx(y) := P t(x, y).
The ε-mixing-time (0 < ε < 1) is:
tmix(ε)d= min t : d(t) ≤ ε
tmixd= tmix(1/4).
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 3 / 35
Cutoff - definition
Def: a sequence of MCs (X(n)t ) exhibits cutoff if
t(n)mix(ε)− t(n)
mix(1− ε) = o(t(n)mix), ∀ 0 < ε < 1/4. (1)
(wn) is called a cutoff window for (X(n)t ) if: wn = o
(t(n)mix
), and
t(n)mix(ε)− t(n)
mix(1− ε) ≤ cεwn, ∀n ≥ 1,∀ε ∈ (0, 1/4).
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 4 / 35
Cutoff - definition
Def: a sequence of MCs (X(n)t ) exhibits cutoff if
t(n)mix(ε)− t(n)
mix(1− ε) = o(t(n)mix), ∀ 0 < ε < 1/4. (1)
(wn) is called a cutoff window for (X(n)t ) if: wn = o
(t(n)mix
), and
t(n)mix(ε)− t(n)
mix(1− ε) ≤ cεwn, ∀n ≥ 1, ∀ε ∈ (0, 1/4).
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 4 / 35
Cutoff
Figure : cutoff
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 5 / 35
Examples
Cutoff was first identified for random transpositions (Diaconis & Shashahani 81).tmix = n logn
2and wn = n . Many other card shuffling schemes have cutoff.
Lazy simple random walk on the hypercube 0, 1n exhibits a cutoff withtmix = n logn
2(half the coupon collector time) and wn = n (Aldous 83).
Biased nearest neighbor random walk on an 1, 2, . . . , n has cutoff, with a biastoward n has cutoff around E1[Tn].
The glauber dynamics of the Ising model in large temperature has cutoff for manyunderlying graphs (LS).
Lazy simple random walk on the n-cycle does not exhibit a cutoff (tmix = Θ(n2)).Same for higher dimensional tori of side length n.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 6 / 35
Examples
Cutoff was first identified for random transpositions (Diaconis & Shashahani 81).tmix = n logn
2and wn = n . Many other card shuffling schemes have cutoff.
Lazy simple random walk on the hypercube 0, 1n exhibits a cutoff withtmix = n logn
2(half the coupon collector time) and wn = n (Aldous 83).
Biased nearest neighbor random walk on an 1, 2, . . . , n has cutoff, with a biastoward n has cutoff around E1[Tn].
The glauber dynamics of the Ising model in large temperature has cutoff for manyunderlying graphs (LS).
Lazy simple random walk on the n-cycle does not exhibit a cutoff (tmix = Θ(n2)).Same for higher dimensional tori of side length n.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 6 / 35
Examples
Cutoff was first identified for random transpositions (Diaconis & Shashahani 81).tmix = n logn
2and wn = n . Many other card shuffling schemes have cutoff.
Lazy simple random walk on the hypercube 0, 1n exhibits a cutoff withtmix = n logn
2(half the coupon collector time) and wn = n (Aldous 83).
Biased nearest neighbor random walk on an 1, 2, . . . , n has cutoff, with a biastoward n has cutoff around E1[Tn].
The glauber dynamics of the Ising model in large temperature has cutoff for manyunderlying graphs (LS).
Lazy simple random walk on the n-cycle does not exhibit a cutoff (tmix = Θ(n2)).Same for higher dimensional tori of side length n.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 6 / 35
Background
Many chains are believed to exhibit cutoff. Verifying the occurrence of cutoffrigorously is usually hard.
The name cutoff was coined by Aldous and Diaconis in their seminal 86 paper:
(AD 86) “the most interesting open problem”: Find verifiable conditions for cutoff.
So far cutoff was mostly studied by understanding examples.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 7 / 35
Background
Many chains are believed to exhibit cutoff. Verifying the occurrence of cutoffrigorously is usually hard.
The name cutoff was coined by Aldous and Diaconis in their seminal 86 paper:
(AD 86) “the most interesting open problem”: Find verifiable conditions for cutoff.
So far cutoff was mostly studied by understanding examples.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 7 / 35
Background
Many chains are believed to exhibit cutoff. Verifying the occurrence of cutoffrigorously is usually hard.
The name cutoff was coined by Aldous and Diaconis in their seminal 86 paper:
(AD 86) “the most interesting open problem”: Find verifiable conditions for cutoff.
So far cutoff was mostly studied by understanding examples.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 7 / 35
Spectral gap & relaxation-time
Let λ2 be the largest non-trivial e.v. of P .
Definition: gap = 1− λ2 - the spectral gap.
Def: trel := 1gap
- the relaxation-time.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 8 / 35
The product condition (Prod. cond.)
In a 2004 Aim workshop I proposed that The product condition (Prod. Cond.) -gap(n)t
(n)mix →∞ (equivalently, t(n)
rel = o(t(n)mix))
should imply cutoff for ”nice” reversible chains.
(It is a necessary condition for cutoff)
It is not always sufficient (Aldous and Pak).
Problem: Find families of MCs s.t. Prod. Cond. =⇒ cutoff.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 9 / 35
The product condition (Prod. cond.)
In a 2004 Aim workshop I proposed that The product condition (Prod. Cond.) -gap(n)t
(n)mix →∞ (equivalently, t(n)
rel = o(t(n)mix))
should imply cutoff for ”nice” reversible chains.
(It is a necessary condition for cutoff)
It is not always sufficient (Aldous and Pak).
Problem: Find families of MCs s.t. Prod. Cond. =⇒ cutoff.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 9 / 35
Aldous’ example
Figure : Fixed bias to the right conditioned on a non-lazy step. Different holding probabilities along 2parallel branches of equal length.
t(n)rel = O(1).
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 10 / 35
Aldous’ example
d(t)
Figure : d(t) is up to negligible terms the probability that the rightmost state was not hit by time t(starting from the leftmost state).
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 11 / 35
Hitting and Mixing
Def: The hitting time of a set A ⊂ Ω = TA := mint : Xt ∈ A.
Hitting times of “worst” sets are related to mixing:
Fact (Random Target): Under laziness and reversibilitytmix ≤ C maxx
∑y π(y)Ex[Ty] ≤ C maxx,y Ex[Ty].
Generalization (Aldous 82 LW 95) under reversibility and laziness
tmix ≤ C maxx,A
π(A)Ex[TA].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 12 / 35
Hitting and Mixing
Def: The hitting time of a set A ⊂ Ω = TA := mint : Xt ∈ A.
Hitting times of “worst” sets are related to mixing:
Fact (Random Target): Under laziness and reversibilitytmix ≤ C maxx
∑y π(y)Ex[Ty] ≤ C maxx,y Ex[Ty].
Generalization (Aldous 82 LW 95) under reversibility and laziness
tmix ≤ C maxx,A
π(A)Ex[TA].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 12 / 35
Hitting and Mixing
Def: The hitting time of a set A ⊂ Ω = TA := mint : Xt ∈ A.
Hitting times of “worst” sets are related to mixing:
Fact (Random Target): Under laziness and reversibilitytmix ≤ C maxx
∑y π(y)Ex[Ty] ≤ C maxx,y Ex[Ty].
Generalization (Aldous 82 LW 95) under reversibility and laziness
tmix ≤ C maxx,A
π(A)Ex[TA].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 12 / 35
Hitting and Mixing
Oliveira (2011) and independently P.-Sousi (2011) (+ an by Griffiths et al. (2012)):for any irreducible reversible lazy MC
tmix = Θ(tH), where tH := maxx,A:π(A)≥1/2
Ex[TA]. (2)
The threshold 1/2 cannot be improved. For two n-cliques connected by a singleedge sets of π measure ≥ 1/2 + ε are hit in constant number of steps.
d(t) ∼ 12P(the other clique was hit by time t) =⇒ maybe we should look at tails
rather than on expectations.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 13 / 35
Hitting and Mixing
Oliveira (2011) and independently P.-Sousi (2011) (+ an by Griffiths et al. (2012)):for any irreducible reversible lazy MC
tmix = Θ(tH), where tH := maxx,A:π(A)≥1/2
Ex[TA]. (2)
The threshold 1/2 cannot be improved. For two n-cliques connected by a singleedge sets of π measure ≥ 1/2 + ε are hit in constant number of steps.
d(t) ∼ 12P(the other clique was hit by time t) =⇒ maybe we should look at tails
rather than on expectations.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 13 / 35
Hitting and Cutoff
Diaconis & Saloff-Coste (06) (separation cutoff) and Ding-Lubetzky-P. (10) (TVcutoff):
A seq. of birth and death chains exhibits cutoff iff the Prod. Cond. holds.
Both proofs reduce the problem to establishing concentration of hitting times.
We extend their results to weighted nearest-neighbor RWs on trees.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 14 / 35
Hitting and Cutoff
Diaconis & Saloff-Coste (06) (separation cutoff) and Ding-Lubetzky-P. (10) (TVcutoff):
A seq. of birth and death chains exhibits cutoff iff the Prod. Cond. holds.
Both proofs reduce the problem to establishing concentration of hitting times.
We extend their results to weighted nearest-neighbor RWs on trees.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 14 / 35
Cutoff for trees
Theorem
Let (V, P, π) be a lazy Markov chain on a tree T = (V,E). Then
tmix(ε)− tmix(1− ε) ≤ C| log ε|√treltmix, for any 0 < ε ≤ 1/4.
In particular, the Prod. Cond. implies cutoff with a cutoff window wn =
√t(n)rel t
(n)mix.
DLP (10) - for lazy BD chains tmix(ε)− tmix(1− ε) ≤ O(ε−1√treltmix).
In many cases any cutoff window must be Ω
(√t(n)rel t
(n)mix
).
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 15 / 35
Cutoff for trees
Theorem
Let (V, P, π) be a lazy Markov chain on a tree T = (V,E). Then
tmix(ε)− tmix(1− ε) ≤ C| log ε|√treltmix, for any 0 < ε ≤ 1/4.
In particular, the Prod. Cond. implies cutoff with a cutoff window wn =
√t(n)rel t
(n)mix.
DLP (10) - for lazy BD chains tmix(ε)− tmix(1− ε) ≤ O(ε−1√treltmix).
In many cases any cutoff window must be Ω
(√t(n)rel t
(n)mix
).
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 15 / 35
A mixing parameter in terms of hitting times
Let 0 < ε < 1.
hit(ε) := mint : Px[TA > t] ≤ ε : for all x and A ⊂ Ω s.t. π(A) ≥ 1/2.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 16 / 35
Main abstract result
Definition: A sequence has hit-cutoff if
hit(n)(ε)− hit(n)(1− ε) = o(hit(n)(1/4)) for all 0 < ε < 1/4.
Main abstract result:
Theorem
Let (Ωn, Pn, πn) be a seq. of finite reversible lazy MCs. Then TFAE:
(i) Cutoff.
(ii) hit-cutoff.
We show that (ii) implies the Prod. Cond. (for (i) this is known).
Hence (i)⇐⇒(ii) follows from the following quantitative relation between hit andtmix using hit = Θ(tmix):
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 17 / 35
Main abstract result
Definition: A sequence has hit-cutoff if
hit(n)(ε)− hit(n)(1− ε) = o(hit(n)(1/4)) for all 0 < ε < 1/4.
Main abstract result:
Theorem
Let (Ωn, Pn, πn) be a seq. of finite reversible lazy MCs. Then TFAE:
(i) Cutoff.
(ii) hit-cutoff.
We show that (ii) implies the Prod. Cond. (for (i) this is known).
Hence (i)⇐⇒(ii) follows from the following quantitative relation between hit andtmix using hit = Θ(tmix):
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 17 / 35
Main abstract result
Definition: A sequence has hit-cutoff if
hit(n)(ε)− hit(n)(1− ε) = o(hit(n)(1/4)) for all 0 < ε < 1/4.
Main abstract result:
Theorem
Let (Ωn, Pn, πn) be a seq. of finite reversible lazy MCs. Then TFAE:
(i) Cutoff.
(ii) hit-cutoff.
We show that (ii) implies the Prod. Cond. (for (i) this is known).
Hence (i)⇐⇒(ii) follows from the following quantitative relation between hit andtmix using hit = Θ(tmix):
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 17 / 35
To mix - escape and then relax
For any reversible irreducible finite lazy chain and any 0 < ε ≤ 1/4,
hit(3ε)− trel| log(2ε)| ≤ tmix(2ε) ≤ hit(ε) + trel| log(2ε)|
Terms involving trel are negligible under the Prod. Cond..
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 18 / 35
Hitting times when X0 ∼ π
Easy direction: to mix, the chain must first escape from small sets = “first stage ofmixing”.
Fact: Let A ⊂ Ω be such that π(A) ≥ 1/2. Then (under reversibility)
Pπ[TA > 2strel] ≤e−s
2, for all s ≥ 0.
Follows by the spectral decomposition of PAc (the chain killed upon hitting A).
By a coupling argument,
Px[TA > t+ 2strel] ≤ d(t) + Pπ[TA > 2strel].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 19 / 35
Hitting times when X0 ∼ π
Easy direction: to mix, the chain must first escape from small sets = “first stage ofmixing”.
Fact: Let A ⊂ Ω be such that π(A) ≥ 1/2. Then (under reversibility)
Pπ[TA > 2strel] ≤e−s
2, for all s ≥ 0.
Follows by the spectral decomposition of PAc (the chain killed upon hitting A).
By a coupling argument,
Px[TA > t+ 2strel] ≤ d(t) + Pπ[TA > 2strel].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 19 / 35
Hitting times when X0 ∼ π
Easy direction: to mix, the chain must first escape from small sets = “first stage ofmixing”.
Fact: Let A ⊂ Ω be such that π(A) ≥ 1/2. Then (under reversibility)
Pπ[TA > 2strel] ≤e−s
2, for all s ≥ 0.
Follows by the spectral decomposition of PAc (the chain killed upon hitting A).
By a coupling argument,
Px[TA > t+ 2strel] ≤ d(t) + Pπ[TA > 2strel].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 19 / 35
The hard direction
We say the chain is ε-mixed w.r.t. A ⊂ Ω at time t+ s if for every x
|Pt+sx (A)− π(A)| ≤ ε.
We say that the event TG ≤ t “serves as a certificate” of being ε-mixedw.r.t. A ⊂ Ω at time t+ s if
|Px[Xt+s ∈ A | TG ≤ t]− π(A)| ≤ ε.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 20 / 35
Application of the max ineq.
Claim: Denote t := hit(ε), s := trel × log(2/ε). Then
tmix(2ε) ≤ t+ s.
Proof: want for every A ⊂ Ω to have G = Gs(A) ⊂ Ω s.t. π(G) ≥ 1/2 and TG ≤ tserves as a certificate of “being ε-mixed w.r.t. A” at time t+ s.
Consider
G = Gs(A) :=g : ∀s ≥ s, |Psg(A)− π(A)| ≤ 2e−s/trel = ε
.
A consequence of Starr’s maximal ineq. is that π(G) ≥ 1/2.
=⇒ Px[TG > t] ≤ ε. (by def. of t).
For any x,A:
|Pt+sx (A)− π(A)| ≤ Px[TG > t] + maxg∈G,s≥s
|Psg(A)− π(A)| ≤ 2ε.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 21 / 35
Application of the max ineq.
Claim: Denote t := hit(ε), s := trel × log(2/ε). Then
tmix(2ε) ≤ t+ s.
Proof: want for every A ⊂ Ω to have G = Gs(A) ⊂ Ω s.t. π(G) ≥ 1/2 and TG ≤ tserves as a certificate of “being ε-mixed w.r.t. A” at time t+ s.
Consider
G = Gs(A) :=g : ∀s ≥ s, |Psg(A)− π(A)| ≤ 2e−s/trel = ε
.
A consequence of Starr’s maximal ineq. is that π(G) ≥ 1/2.
=⇒ Px[TG > t] ≤ ε. (by def. of t).
For any x,A:
|Pt+sx (A)− π(A)| ≤ Px[TG > t] + maxg∈G,s≥s
|Psg(A)− π(A)| ≤ 2ε.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 21 / 35
Application of the max ineq.
Claim: Denote t := hit(ε), s := trel × log(2/ε). Then
tmix(2ε) ≤ t+ s.
Proof: want for every A ⊂ Ω to have G = Gs(A) ⊂ Ω s.t. π(G) ≥ 1/2 and TG ≤ tserves as a certificate of “being ε-mixed w.r.t. A” at time t+ s.
Consider
G = Gs(A) :=g : ∀s ≥ s, |Psg(A)− π(A)| ≤ 2e−s/trel = ε
.
A consequence of Starr’s maximal ineq. is that π(G) ≥ 1/2.
=⇒ Px[TG > t] ≤ ε. (by def. of t).
For any x,A:
|Pt+sx (A)− π(A)| ≤ Px[TG > t] + maxg∈G,s≥s
|Psg(A)− π(A)| ≤ 2ε.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 21 / 35
Application of the max ineq.
Claim: Denote t := hit(ε), s := trel × log(2/ε). Then
tmix(2ε) ≤ t+ s.
Proof: want for every A ⊂ Ω to have G = Gs(A) ⊂ Ω s.t. π(G) ≥ 1/2 and TG ≤ tserves as a certificate of “being ε-mixed w.r.t. A” at time t+ s.
Consider
G = Gs(A) :=g : ∀s ≥ s, |Psg(A)− π(A)| ≤ 2e−s/trel = ε
.
A consequence of Starr’s maximal ineq. is that π(G) ≥ 1/2.
=⇒ Px[TG > t] ≤ ε. (by def. of t).
For any x,A:
|Pt+sx (A)− π(A)| ≤ Px[TG > t] + maxg∈G,s≥s
|Psg(A)− π(A)| ≤ 2ε.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 21 / 35
Remark:
The threshold 1/2 in the definition of hit can be replaced by any 0 < α < 1, i.e. ata price of an additive error of trel| log(1− α)| the analysis extends to
hitα,x(ε) := mint : Px[TA > t] ≤ ε : for all A ⊂ Ω s.t. π(A) ≥ α,
hitα(ε) := maxx∈Ω
hitα,x(ε)
If α > 1/2 one needs to assume that the Prod. Cond. holds to maintain theequivalence of the 2 notions of cutoff.
=⇒When trel is known, it suffices to consider escape probabilities from small setsin order to bound the mixing-time.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 22 / 35
Remark:
The threshold 1/2 in the definition of hit can be replaced by any 0 < α < 1, i.e. ata price of an additive error of trel| log(1− α)| the analysis extends to
hitα,x(ε) := mint : Px[TA > t] ≤ ε : for all A ⊂ Ω s.t. π(A) ≥ α,
hitα(ε) := maxx∈Ω
hitα,x(ε)
If α > 1/2 one needs to assume that the Prod. Cond. holds to maintain theequivalence of the 2 notions of cutoff.
=⇒When trel is known, it suffices to consider escape probabilities from small setsin order to bound the mixing-time.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 22 / 35
Tools
Def: For f ∈ RΩ, t ≥ 0, define P tf ∈ RΩ byP tf(x) := Ex[f(Xt)] =
∑y
P t(x, y)f(y).
For f ∈ RΩ define Eπ[f ] :=∑x∈Ω
π(x)f(x) and ‖f‖22 := Eπ[f2].
For g ∈ RΩ denote Varπg := ‖g − Eπ[g]‖22.
The following is well-known and follows from elementary linear-algebra.
Lemma (Contraction Lemma)
Let (Ω, P, π) be a finite rev. irr. lazy MC. Let A ⊂ Ω. Let t ≥ 0. Then
VarπPt1A ≤ e−2t/trel/4. (3)
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 23 / 35
Tools
Def: For f ∈ RΩ, t ≥ 0, define P tf ∈ RΩ byP tf(x) := Ex[f(Xt)] =
∑y
P t(x, y)f(y).
For f ∈ RΩ define Eπ[f ] :=∑x∈Ω
π(x)f(x) and ‖f‖22 := Eπ[f2].
For g ∈ RΩ denote Varπg := ‖g − Eπ[g]‖22.
The following is well-known and follows from elementary linear-algebra.
Lemma (Contraction Lemma)
Let (Ω, P, π) be a finite rev. irr. lazy MC. Let A ⊂ Ω. Let t ≥ 0. Then
VarπPt1A ≤ e−2t/trel/4. (3)
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 23 / 35
Tools
Def: For f ∈ RΩ, t ≥ 0, define P tf ∈ RΩ byP tf(x) := Ex[f(Xt)] =
∑y
P t(x, y)f(y).
For f ∈ RΩ define Eπ[f ] :=∑x∈Ω
π(x)f(x) and ‖f‖22 := Eπ[f2].
For g ∈ RΩ denote Varπg := ‖g − Eπ[g]‖22.
The following is well-known and follows from elementary linear-algebra.
Lemma (Contraction Lemma)
Let (Ω, P, π) be a finite rev. irr. lazy MC. Let A ⊂ Ω. Let t ≥ 0. Then
VarπPt1A ≤ e−2t/trel/4. (3)
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 23 / 35
Tools
Def: For f ∈ RΩ, t ≥ 0, define P tf ∈ RΩ byP tf(x) := Ex[f(Xt)] =
∑y
P t(x, y)f(y).
For f ∈ RΩ define Eπ[f ] :=∑x∈Ω
π(x)f(x) and ‖f‖22 := Eπ[f2].
For g ∈ RΩ denote Varπg := ‖g − Eπ[g]‖22.
The following is well-known and follows from elementary linear-algebra.
Lemma (Contraction Lemma)
Let (Ω, P, π) be a finite rev. irr. lazy MC. Let A ⊂ Ω. Let t ≥ 0. Then
VarπPt1A ≤ e−2t/trel/4. (3)
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 23 / 35
Maximal Inequality
The main ingredient in our approach is Starr’s maximal-inequality (66) (refines Stein’smax-inequality (61))
Theorem (Maximal inequality)
Let (Ω, P, π) be a lazy irreducible reversible Markov chain. Let f ∈ RΩ. Define thecorresponding maximal function f∗ ∈ RΩ as
f∗(x) := sup0≤k<∞
|P k(f)(x)| = sup0≤k<∞
|Ex[f(Xk)]|.
Then for 1 < p <∞,‖f∗‖p ≤ q‖f‖p 1/p+ 1/q = 1. (4)
Applying this (with p = 2) to fs := P s(1A − π(A)) in conjunction with the ContractionLemma yields the desired bound on the size of
Gc ⊂ (f∗s )2 > 8‖fs‖22 ⊂ (f∗s )2 > 2‖f∗s ‖22.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 24 / 35
Maximal Inequality
The main ingredient in our approach is Starr’s maximal-inequality (66) (refines Stein’smax-inequality (61))
Theorem (Maximal inequality)
Let (Ω, P, π) be a lazy irreducible reversible Markov chain. Let f ∈ RΩ. Define thecorresponding maximal function f∗ ∈ RΩ as
f∗(x) := sup0≤k<∞
|P k(f)(x)| = sup0≤k<∞
|Ex[f(Xk)]|.
Then for 1 < p <∞,‖f∗‖p ≤ q‖f‖p 1/p+ 1/q = 1. (4)
Applying this (with p = 2) to fs := P s(1A − π(A)) in conjunction with the ContractionLemma yields the desired bound on the size of
Gc ⊂ (f∗s )2 > 8‖fs‖22 ⊂ (f∗s )2 > 2‖f∗s ‖22.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 24 / 35
Trees
Let: T := (V,E) be a finite tree.
(V, P, π) a lazy MC corresponding to some (lazy) weighted nearest-neighbor walkon T (i.e. P (x, y) > 0 iff x, y ∈ E or y = x).
Fact: (Kolmogorov’s cycle condition) every MC on a tree is reversible.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 25 / 35
Trees
Can the tree structure be used to determine the identity of the “worst” sets?
Easier question: what set of π measure ≥ 1/2 is the “hardest” to hit in a birth &death chain with state space [n] := 1, 2, . . . , n ?
Answer: take a state m with π([m]) ≥ 1/2 and π([m− 1]) < 1/2. Then the setworst set would be either [m] or [n] \ [m− 1].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 26 / 35
Trees
Can the tree structure be used to determine the identity of the “worst” sets?
Easier question: what set of π measure ≥ 1/2 is the “hardest” to hit in a birth &death chain with state space [n] := 1, 2, . . . , n ?
Answer: take a state m with π([m]) ≥ 1/2 and π([m− 1]) < 1/2. Then the setworst set would be either [m] or [n] \ [m− 1].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 26 / 35
Trees
Can the tree structure be used to determine the identity of the “worst” sets?
Easier question: what set of π measure ≥ 1/2 is the “hardest” to hit in a birth &death chain with state space [n] := 1, 2, . . . , n ?
Answer: take a state m with π([m]) ≥ 1/2 and π([m− 1]) < 1/2. Then the setworst set would be either [m] or [n] \ [m− 1].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 26 / 35
Trees
How to generalize this to trees?
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 27 / 35
Central vertex
oC3
C2
C1
C4
∏(Ci)≤1/2for all i
Figure : A vertex o ∈ V is called a central-vertex if each connected component of T \ o hasstationary probability at most 1/2.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 28 / 35
Trees
There is always a central-vertex (and at most 2). We fix one, denote it by o andcall it the root.
There is a simple reduction: To is concentrated (from worst leaf) =⇒ hit-cutoff.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 29 / 35
Trees
There is always a central-vertex (and at most 2). We fix one, denote it by o andcall it the root.
There is a simple reduction: To is concentrated (from worst leaf) =⇒ hit-cutoff.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 29 / 35
Trees
It suffices to show that Ex[To] = Ω(tmix) =⇒ To is concentrated if X0 = x.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 30 / 35
Trees
It suffices to show that Ex[To] = Ω(tmix) =⇒ To is concentrated if X0 = x.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 30 / 35
Trees
o=vk
x=v0v1
v2v3
vk-1
Figure : Let v0 = x,v1, . . . , vk = o be the vertices along the path from x to o.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 31 / 35
Trees
Proof of Concentration: Varx[To] ≤ Ctreltmix:
It suffices to show that Varx[To] ≤ 4trelEx[To].
If X0 = x then To is the sum of crossing times of the edges along the pathbetween x: τi := Tvi − Tvi−1
d= Tvi underX0 = vi−1
τ1, . . . , τk are independent =⇒ it suffices to bound the sum of their variancesVarx[To] =
∑Varx[τi].
The law of each τi is a mixture of geometrics with parameters ≥ 1/2trel =⇒∑Varx[τi] ≤
∑4trelEx[τi] = 4trelEx[To].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 32 / 35
Trees
Proof of Concentration: Varx[To] ≤ Ctreltmix:
It suffices to show that Varx[To] ≤ 4trelEx[To].
If X0 = x then To is the sum of crossing times of the edges along the pathbetween x: τi := Tvi − Tvi−1
d= Tvi underX0 = vi−1
τ1, . . . , τk are independent =⇒ it suffices to bound the sum of their variancesVarx[To] =
∑Varx[τi].
The law of each τi is a mixture of geometrics with parameters ≥ 1/2trel =⇒∑Varx[τi] ≤
∑4trelEx[τi] = 4trelEx[To].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 32 / 35
Trees
Proof of Concentration: Varx[To] ≤ Ctreltmix:
It suffices to show that Varx[To] ≤ 4trelEx[To].
If X0 = x then To is the sum of crossing times of the edges along the pathbetween x: τi := Tvi − Tvi−1
d= Tvi underX0 = vi−1
τ1, . . . , τk are independent =⇒ it suffices to bound the sum of their variancesVarx[To] =
∑Varx[τi].
The law of each τi is a mixture of geometrics with parameters ≥ 1/2trel =⇒∑Varx[τi] ≤
∑4trelEx[τi] = 4trelEx[To].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 32 / 35
Trees
Proof of Concentration: Varx[To] ≤ Ctreltmix:
It suffices to show that Varx[To] ≤ 4trelEx[To].
If X0 = x then To is the sum of crossing times of the edges along the pathbetween x: τi := Tvi − Tvi−1
d= Tvi underX0 = vi−1
τ1, . . . , τk are independent =⇒ it suffices to bound the sum of their variancesVarx[To] =
∑Varx[τi].
The law of each τi is a mixture of geometrics with parameters ≥ 1/2trel =⇒∑Varx[τi] ≤
∑4trelEx[τi] = 4trelEx[To].
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 32 / 35
Beyond trees
The tree assumption can be relaxed. We can treat jumps to vertices of boundeddistance on a tree (i.e. the length of the path from u to v in the tree (which is nowjust an auxiliary structure) is > r =⇒ P (u, v) = 0) under some mild necessaryassumption.
Previously the BD assumption could not be relaxed mainly due to it beingexploited via a representation of hitting times result for BD chains.
In particular, if P (u, v) ≥ δ > 0 for all u, v s.t. dT (u, v) ≤ r (and otherwiseP (u, v) = 0), then
cutoff⇐⇒ the Prod. Cond. holds.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 33 / 35
Beyond trees
The tree assumption can be relaxed. We can treat jumps to vertices of boundeddistance on a tree (i.e. the length of the path from u to v in the tree (which is nowjust an auxiliary structure) is > r =⇒ P (u, v) = 0) under some mild necessaryassumption.
Previously the BD assumption could not be relaxed mainly due to it beingexploited via a representation of hitting times result for BD chains.
In particular, if P (u, v) ≥ δ > 0 for all u, v s.t. dT (u, v) ≤ r (and otherwiseP (u, v) = 0), then
cutoff⇐⇒ the Prod. Cond. holds.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 33 / 35
Beyond trees
The tree assumption can be relaxed. We can treat jumps to vertices of boundeddistance on a tree (i.e. the length of the path from u to v in the tree (which is nowjust an auxiliary structure) is > r =⇒ P (u, v) = 0) under some mild necessaryassumption.
Previously the BD assumption could not be relaxed mainly due to it beingexploited via a representation of hitting times result for BD chains.
In particular, if P (u, v) ≥ δ > 0 for all u, v s.t. dT (u, v) ≤ r (and otherwiseP (u, v) = 0), then
cutoff⇐⇒ the Prod. Cond. holds.
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 33 / 35
Open problems
What can be said about the geometry of the “worst” sets in some interestingparticular cases (say, transitivity or monotonicity)?
When can the worst sets be described as |f2| ≤ C (Pf2 = λ2f2)?
Is it true that for transitive graphs the mixing time is bounded by the expected timeit takes the walk to escape a ball containing at least half the vertices?
Does the product condition imply cutoff for transitive graphs under certain mildconditions?
Does contraction implies cutoff? (under some mild conditions...)
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 34 / 35
Open problems
What can be said about the geometry of the “worst” sets in some interestingparticular cases (say, transitivity or monotonicity)?
When can the worst sets be described as |f2| ≤ C (Pf2 = λ2f2)?
Is it true that for transitive graphs the mixing time is bounded by the expected timeit takes the walk to escape a ball containing at least half the vertices?
Does the product condition imply cutoff for transitive graphs under certain mildconditions?
Does contraction implies cutoff? (under some mild conditions...)
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 34 / 35
Open problems
What can be said about the geometry of the “worst” sets in some interestingparticular cases (say, transitivity or monotonicity)?
When can the worst sets be described as |f2| ≤ C (Pf2 = λ2f2)?
Is it true that for transitive graphs the mixing time is bounded by the expected timeit takes the walk to escape a ball containing at least half the vertices?
Does the product condition imply cutoff for transitive graphs under certain mildconditions?
Does contraction implies cutoff? (under some mild conditions...)
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 34 / 35
Open problems
What can be said about the geometry of the “worst” sets in some interestingparticular cases (say, transitivity or monotonicity)?
When can the worst sets be described as |f2| ≤ C (Pf2 = λ2f2)?
Is it true that for transitive graphs the mixing time is bounded by the expected timeit takes the walk to escape a ball containing at least half the vertices?
Does the product condition imply cutoff for transitive graphs under certain mildconditions?
Does contraction implies cutoff? (under some mild conditions...)
Joint work with Riddhi Basu and Jonathan Hermon Characterization of cutoff for reversible Markov chains 34 / 35