The Moser-Tardos Resample algorithm: Where is the limit? (an experimental inquiry)
Jan Dean Catarata∗ Scott Corbett∗ Harry Stern∗ Mario Szegedy∗
Tomas Vyskocil∗ Zheng Zhang∗
Abstract

The celebrated Lovász Local Lemma (LLL) guarantees that locally sparse systems always have solutions. The Moser-Tardos Resample algorithm not only finds such a solution in linear time, but its beautiful analysis has greatly enhanced LLL-related research [9, 10]. Nevertheless, two major questions remain open.
1. How far beyond Lovász's condition can we expect Resample to still perform in polynomial (linear) expected running time?
2. In Resample we have a choice among different constraint-selection strategies. How much does this choice matter?
To state the first question correctly is already a challenge. For a solvable fixed instance Resample always comes up with a solution, but the catch is that the number of steps may be very large. We have therefore looked at parameterized instance families and tried to identify phase transitions in terms of these parameters. Perhaps the biggest lesson we have learned is that if we want to see phase transition thresholds, i.e. identify parameter values where Resample “stops working,” we need to understand what happens when Resample does not work. We have noticed that in this case the algorithm settles at a meta-stable equilibrium (at least for the homogeneous instances we have considered), a phenomenon mostly studied for physical systems.
Concerning the policies for picking the violated constraints (such as first violated, random violated, recursive fix, etc.), in the context of the grid-coloring problem the methods worked for exactly the same parameter range, and the number of resample steps differed by no more than 20 percent.
All results are experimental, although we discuss a possible reason behind some of the phenomena.
1 Introduction
The problem of solving constraint systems over discrete variable sets is NP-hard in general. There are two notable exceptions, however.
1. When all constraints belong to certain easy families of algebraic type, such as linear equations over a finite field.
∗Department of Computer Science, Rutgers University, Piscataway, NJ. Email addresses: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]. This research was supported by NSF grants 1422102 AF, 1514164 AF and NSF-CCF-1628401.
2. When some combinatorial restriction holds, most importantly sparsity.
In this article we are concerned with the latter. The focus of our investigation is the celebrated Resample algorithm of Robin A. Moser and Gábor Tardos [9, 10], which finds, in expected linear time, a solution for any constraint system that meets the sparsity condition of the Lovász Local Lemma (LLL):
Theorem 1.1. (special case of LLL, [5]) Let X = {x1, . . . , xm} be a set of discrete-valued variables. Let C1, . . . , Cn be constraints, where each constraint Ci is some (true/false) predicate over a subset vbl(Ci) of the variable set X. Define pi as

  pi = (# of assignments to vbl(Ci) that do not satisfy Ci) / (# of all assignments to vbl(Ci)).

If each constraint intersects at most min_i 1/(e·pi) − 1 (e ≈ 2.71) other constraints (Ci and Cj intersect if vbl(Ci) ∩ vbl(Cj) ≠ ∅), i.e. the system meets the simple-LLL sparsity condition, then there is an assignment to X that satisfies all constraints. (In this 1975 theorem only existence was stated!)
The Moser-Tardos process is very simple: after an initial random assignment we keep picking violated constraints. In each such step we reassign random values to all of the variables of the newly picked constraint [resample step]. We do this until no violated constraint can be found. In code:
The input is a constraint system Φ with variables x1, . . . , xn and constraints C1, . . . , Cm. For every xi a probability distribution µi on the possible values of xi is given. Procedure ResampleConstraint(C) randomly resets every variable xi in vbl(C) according to µi. The algorithm starts with an initialization step in which every xi is randomly set according to µi. Note: Theorem 1.1 generalizes in the presence of the µi. Then pi becomes the probability that Ci does not hold
159 Copyright © by SIAM
Unauthorized reproduction of this article is prohibited
Algorithm 1 Resample
1: procedure Resample(Φ, µ1, . . . , µn)
2:   for all xi (1 ≤ i ≤ n) do
3:     xi ← RandomValue(µi)
4:   end for
5:   while ViolationExists(Φ, x1, . . . , xn) do
6:     C ← GetViolatedConstraint /* Unless said otherwise, a violated constraint is picked at random */
7:     ResampleConstraint(C) /* Resample step */
8:   end while
9: end procedure
10: procedure ResampleConstraint(C)
11:   for all xi ∈ vbl(C) do
12:     xi ← RandomValue(µi)
13:   end for
14: end procedure
under ∏(i=1..n) µi. The density criterion for both LLL and Resample remains the same as in Theorem 1.1, i.e. every constraint must intersect at most min_i 1/(e·pi) − 1 other constraints.
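As a concrete illustration (our own sketch, not the authors' code), the loop of Algorithm 1 can be written in Python; the constraint representation and the random-violated-constraint policy below are our assumptions:

```python
import random

def resample(constraints, mu, rng=random.Random(0)):
    """Moser-Tardos Resample sketch.

    constraints: list of (predicate, vbl) pairs; predicate reads the
                 assignment dict, vbl lists the variable names it depends on.
    mu:          dict name -> list of values; RandomValue(mu_i) is modeled
                 here as a uniform draw from that list.
    Returns (assignment, number_of_resample_steps)."""
    x = {v: rng.choice(vals) for v, vals in mu.items()}   # initialization step
    steps = 0
    while True:
        violated = [c for c in constraints if not c[0](x)]
        if not violated:
            return x, steps              # no violated constraint remains
        _, vbl = rng.choice(violated)    # GetViolatedConstraint: random pick
        for v in vbl:                    # resample step
            x[v] = rng.choice(mu[v])
        steps += 1

# A tiny 2-clause example: (a or b) and (not b or c)
clauses = [(lambda x: x['a'] or x['b'], ['a', 'b']),
           (lambda x: (not x['b']) or x['c'], ['b', 'c'])]
x, steps = resample(clauses, {'a': [0, 1], 'b': [0, 1], 'c': [0, 1]})
print(all(pred(x) for pred, _ in clauses))   # True
```

Termination is guaranteed only in expectation (and only under the LLL condition), which is exactly the question this paper probes experimentally.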
The ancestors of Resample, by J. Beck [3], followed by several others [2, 8, 4, 13], did not meet the LLL bound. Given the desirable properties of the Resample algorithm, we would like to understand its precise limits. The super-elegant proof of Moser and Tardos mysteriously breaks down exactly at the LLL threshold, but what is the algorithm's true limitation? Another question is the method that selects the violated constraint (i.e. GetViolatedConstraint above), which is arbitrary in the proof of Moser and Tardos. How do different selection methods compare in performance?
Figure 1: A “random” 3SAT instance on 15 variables with sparsity 0.2. (The figure shows variables x0, . . . , x14 and the clauses x3 ∨ x11 ∨ x14, x6 ∨ x9 ∨ x11 and x1 ∨ x4 ∨ x7.)
Example. A strict kSAT instance is

  C1 ∧ C2 ∧ . . . ∧ Cm

where every constraint Ci is a disjunction

  Ci = x(1,i)^ε(1,i) ∨ . . . ∨ x(k,i)^ε(k,i)   (ε(j,i) ∈ {1, −1})

of k literals (i.e. variables and their negations, where ε(j,i) = −1 indicates that x(j,i) is negated), such that for 1 ≤ j < j′ ≤ k the variables x(j,i) and x(j′,i) are different. The latter condition is necessary to ensure that the probability of the event Ai that the ith constraint does not hold under a random assignment is exactly 2^−k. The kSAT problem, which asks if an instance has a satisfying assignment, is NP-hard, but satisfiability automatically holds under the following:
Sparsity restriction: Every constraint Ci shares variables (negated or non-negated) with at most ⌊2^k/e⌋ − 1 other constraints.
The statement is an immediate consequence of Theorem 1.1. The theorem of Moser and Tardos [10] implies that under this condition Resample finds a satisfying assignment in expected Ok(m) time. Heidi Gebauer, Tibor Szabó and Gábor Tardos [6] have proved that there are unsatisfiable kSAT formulas in which every clause meets at most (1 + O(1)/√k) · 2^k/e other clauses, so the above bound is close to sharp. The GST construction is, however, a carefully designed instance.
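The two quantities above — the clause violation probability 2^−k and the LLL degree bound ⌊2^k/e⌋ − 1 — are easy to check mechanically (our own sketch):

```python
from itertools import product
from math import e, floor

def lll_ksat_degree_bound(k):
    """Max number of OTHER clauses a k-clause may intersect under the
    simple LLL: p_i = 2^-k, bound 1/(e * p_i) - 1, rounded down."""
    return floor(2**k / e) - 1

def clause_violation_probability(k):
    """Verify by enumeration that a k-clause on k distinct variables is
    violated by exactly one of the 2^k assignments to its variables
    (the all-false one, after orienting literals)."""
    bad = sum(1 for a in product([0, 1], repeat=k) if not any(a))
    return bad / 2**k

print(lll_ksat_degree_bound(3))          # 2^3/e ~ 2.94, so the bound is 1
print(clause_violation_probability(3))   # 0.125 = 2^-3
```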
How about random instances? They are known to be satisfiable up to a density threshold of 4.27 (meaning that (# of clauses)/(# of variables) = 4.27), which is far beyond what LLL could prove. We have experimentally found that the Resample algorithm works for random 3SAT instances with density up to roughly 2.45, still far beyond the LLL sparsity condition. At density 2.4 ± 0.05 a phase transition occurs: at density 2.6 Resample is practically unable to cope with 3SAT.
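A minimal generator for such random instances (our own sketch; each clause picks k distinct variables uniformly and negates each with probability 1/2):

```python
import random

def random_ksat(n, density, k=3, rng=random.Random(0)):
    """Random kSAT instance: n variables, round(density * n) clauses.
    A clause is a tuple of nonzero ints, DIMACS-style: v means x_v,
    -v means the negation of x_v."""
    m = round(density * n)
    clauses = []
    for _ in range(m):
        vs = rng.sample(range(1, n + 1), k)   # k distinct variables
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in vs))
    return clauses

def satisfied(clause, assignment):
    """assignment: dict var -> bool."""
    return any(assignment[abs(l)] == (l > 0) for l in clause)

inst = random_ksat(15, 0.2)   # the sparse regime of Figure 1
print(len(inst))              # 3 clauses
```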
2 Our Results
Given a system Φ of constraints, Theorem 1.1 gives a sufficient criterion for its solvability, and under the same criterion Resample also efficiently finds a solution. In fact, Theorem 1.1 has several variants, but for our applications none provides a significant improvement over Theorem 1.1. The blueprint of all variants is an abstract probability-theoretic version [5]. The best improvement is by Shearer [12], who has also proved the optimality of his version in Lovász's original general abstract setting. The Moser-Tardos argument can be pushed to the Shearer bound [7], but no further, and
quite surprisingly the same bound also appears in a different context in statistical physics [11].
Our goal was to challenge this notorious threshold and show that Resample carves out an applicability domain of its own from the parameter space. We also wanted to find out exactly what that domain is.
The applicability of Resample on individual instances is not well defined, at least if we want to capture it via running time analysis, which involves limit-taking. Therefore we consider families of instances parameterized by size and by another parameter. If the latter parameter is the density, it varies opposite to the solvability: the larger the density, the less solvable the system. The number of colors for a coloring problem, on the other hand, varies together with solvability: more colors means it is easier to find a solution. A third parameter we look at is the skew of the resample distributions µi. The skew does not affect the solvability of the instance at all, but it changes the LLL bound and the behavior of Resample in the same way as density does: the more skewed the µi are, the smaller the LLL bound is, and Resample gets worse too. For a constraint family with parameter α let αlovasz, αresample, αsolvability be the validity/applicability thresholds of LLL and the Resample algorithm, and the threshold at which the system becomes unsolvable, respectively. For density-like parameters we have:
αlovasz ≤ αresample ≤ αsolvability
For parameters varying together with solvability the inequalities are reversed. The example in the previous section is summarized in:
Random 3SAT:
  αlovasz   αresample   αsolvability
  0.55      2.4         4.27
For the first column we have just plugged the average number of neighbors of a clause into the LLL formula, a back-of-the-envelope estimate. For many problem families the third column is unknown, and this paper is the first to systematically study the second column.
The main message of this article is that αresample is definable, and at least experimentally exists. Also, we often find αlovasz ≪ αresample, so αresample is a much better predictor of αsolvability than αlovasz.
The existence of αresample means that there is a phase transition threshold for any reasonable parameterized family, and this threshold is usually different from (better than) the LLL threshold.
The phase transition phenomenon. Let Tresample(Φ) denote the expected number of resample steps in which Resample solves an instance
Figure 2: Expected number of resample steps for the random 3SAT problem as a function of the density and the instance size n (= number of variables), with a cutoff at 100000.
Φ. To reveal the sharp distinction between “good” and “bad” instances we have considered parameterized families of instances Φn,α with two parameters: a size parameter n, and another parameter α representing density, number of colors, etc. We say that there is a phase transition at α0 if fα(n) := Tresample(Φn,α) undergoes a growth rate change in α: for α < α0 the growth of fα(n) is polynomial, but for α > α0 it is exponential. First taking the limit in n and then looking at how its properties change in α is not unlike how physicists study spin chains, where n is the number of spinors and α is the temperature, magnetization, etc. For such a scheme to work, for a fixed α the members of the sequence {Φn,α}∞n=1 must only differ in size, but otherwise they should be the “same.” With modern graph limit theory we could rigorously define what we mean by a family of “same instances,” but we have omitted this complication, as in all of our examples (the n × n grid, random 3SAT, coin chains of length n) the sameness is abundantly clear. Figure 2 shows how in the case of random 3SAT a phase transition happens around α = 2.4 (n is the number of variables, α = (# of clauses)/(# of variables)).
Meta-stable equilibrium and the Coin Chain problem. Although Figure 2 does show that a transition happens around α = 2.4, the sharpness of the threshold remains a question. One might blame the complexity of the experiment: for each fixed parameter value we have to create several random instances, and on each we need to run Resample several times. As a result n cannot be much larger than the thousands, which, one might think, makes the picture jittery. But the
Figure 3: Capturing the phase transition phenomenon for the Coin Chain problem using very different concepts. In a.) we have run Resample until it solved the instance (the search needed to be cut off for some instances) and plotted the number of resample steps as a function of the size and the parameter value p of the instance. In b.) we have fixed the size of the instance and also the number of resample steps (both to sufficiently large numbers) and plotted the number of satisfied constraints as a function of p. The phase transition clearly jumps out as a sudden drop in the number of Tails at p0 ≈ 0.634.
issue is not that.

To be able to zoom into the phase transition we have created a very uniform family of instances we call Coin Chain: instance Φn,p is a chain x1, . . . , xn of binary variables (Head/Tail). The indices are taken mod n to make the chain a circle. The constraints are {Ci} (i = 1, . . . , n), defined via vbl(Ci) = {xi−1, xi, xi+1} and Ci = xi (Head = 0, Tail = 1). See Figure 8. Resample process: if there are any Heads, we pick a uniformly random Head, say xi, and resample {xi−1, xi, xi+1}, but with a biased coin: Prob(Head) = p, Prob(Tail) = 1 − p. Even though Ci depends only on xi, Resample must still address all variables in vbl(Ci). If there are no Heads, we have arrived at the all-Tails assignment, the only one that satisfies the instance. When we plot the number of resample steps in terms of n and p (see Figure 3 a.)), the graph is very similar to Figure 2. There is a phase transition around 0.63.
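The Coin Chain Resample process is simple enough to simulate directly; a sketch of ours (Head = 0, Tail = 1, indices mod n):

```python
import random

def coin_chain_resample(n, p, max_steps, rng=random.Random(0)):
    """Run Resample on the Coin Chain Phi_{n,p}: constraint C_i is violated
    iff x_i is a Head (0); resampling C_i rebiases x_{i-1}, x_i, x_{i+1}
    with Prob(Head) = p. Returns the number of resample steps, or None if
    the all-Tails assignment was not reached within max_steps."""
    x = [0 if rng.random() < p else 1 for _ in range(n)]  # init according to mu
    heads = {i for i, v in enumerate(x) if v == 0}
    for step in range(max_steps):
        if not heads:
            return step                      # all Tails: instance solved
        i = rng.choice(sorted(heads))        # uniformly random violated C_i
        for j in ((i - 1) % n, i, (i + 1) % n):
            x[j] = 0 if rng.random() < p else 1
            (heads.add if x[j] == 0 else heads.discard)(j)
        # note: x_i itself may come up Head again and be re-picked later
    return None

print(coin_chain_resample(200, 0.3, 100_000) is not None)  # below threshold: solves
```

Running this for p well above 0.634 with a moderate step budget never reaches all-Tails, which is exactly the meta-stable behavior described next.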
Since the process is so intuitive, we can understand what happens: if p is less than the phase transition threshold, the Heads are eliminated at a linear rate. If however p is above the threshold, relatively quickly a dynamic equilibrium is reached in which the process keeps finding assignments whose simple statistics (most notably the number of unsatisfied constraints) remain nearly stable. This so-called meta-stable equilibrium persists for an exponentially long time, until the all-Tails assignment is reached by chance. For a chain of length 40000 the equilibrium was already reached within 1000000 steps for every p, after which the fraction of Tails remained constant. We have plotted the most typical number of Tails (the peaks in Figure 9 a.)) as a function of p (see Figure 3 b.)), and the phase transition strikingly came out. We have made several experiments to explore other parameters of the equilibrium, such as the correlations that arise, for instance, for p = 0.7 and some other values of p. There are non-trivial short-range correlations, but there are no long-range correlations (i.e. between distant coins of the chain), as expected. In spite of our numerous experiments we cannot prove the existence of the equilibrium, and we cannot even prove that for p = 0.999 the process does not reach the all-Tails solution in polynomial expected time. The LLL gives that Resample works below p = 0.073. Thus:
Coin Chain:
  plovasz   presample   psolvability
  0.073     0.634       1
There is an even simpler model, for which we can show the meta-stable equilibrium by mathematical means (Section 4). When the equilibrium method is applied to the random 3SAT problem (see Figure 4), we get a picture similar to Figure 3 b.) and can place the phase transition with greater confidence around 2.4.
The grid graph. Most of our experiments concerned the 2D grid. In the k-coloring problem we must assign numbers from 1 through k to the nodes of the grid such that neighboring nodes get different colors. Here the natural parameter is k. What we have found is
Figure 4: The fraction of violated constraints for a random 3SAT with 1000 variables after 100000 resample steps, as a function of the density.
2D grid, k-coloring:
  χlovasz   χresample   χsolvability
  19        4           2
Here χsolvability = 2 means that the grid is 2-colorable, χresample = 4 means that the Resample algorithm efficiently 4-colors the grid, and χlovasz = 19 means that the LLL only implies that the grid is 19-colorable. Since k is discrete, the transition happens between three and four: k ≤ 3 requires an exponential, and k ≥ 4 a linear, number of resamples in terms of the grid size n².

The Selection Method. Resample (Algorithm 1) gives the freedom to choose any method GetViolatedConstraint for selecting a violated constraint C for the next resample step. We have looked at several reasonable resample methods in the case of the n × n torus k-coloring problem:
Recursive Fix: We iteratively call a Fix routine, just as in Moser's original paper. The routine does not return until all intersecting constraints are satisfied.
Random: We pick an unsatisfied (= badly colored) edge each time entirely at random from the set of all unsatisfied edges.
Fixed Order: We pick the first unsatisfied edge according to a fixed (lexicographic) ordering of the edges.
Random Fixed Order: We pick the first unsatisfied constraint according to a random permutation of the edges, fixed before the start of the process.
Cyclical: For (a certain fixed, lexicographic-type)
Figure 5: Experimental results of performing various resample methods on the n × n torus. The node colors are uniformly distributed on {1, 2, 3, 4, 5, 6}. The x-coordinate represents the size n and the y-coordinate is the number of resamples averaged over 100 tests and normalized (divided) by n². The methods compared are Random Fixed Order, Cyclical, Fixed Order, Random, Recursive and LBN.
cyclic order of the edges we always pick the next bad edge, every time moving only forward in the cycle.
Least Bad Neighbors: We randomly pick an unsatisfied edge among the ones that have the least number of unsatisfied neighbors. [This is to simulate a worst-case resample strategy.]
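All of these methods plug into the same Resample loop. As a deliberately unoptimized sketch of ours, here is the Random selection method on the n × n torus (a full rescan of the edges each step; a real implementation would maintain the violated set incrementally, as described in Section 5):

```python
import random

def torus_resample(n, k, max_steps=1_000_000, rng=random.Random(0)):
    """Resample for k-coloring the n x n torus. A constraint is an edge;
    it is violated when its endpoints share a color. Resampling an edge
    recolors both endpoints uniformly (the 'Random' selection method).
    Returns the number of resample steps, or None on timeout."""
    color = {(i, j): rng.randrange(k) for i in range(n) for j in range(n)}
    edges = [((i, j), ((i + 1) % n, j)) for i in range(n) for j in range(n)] + \
            [((i, j), (i, (j + 1) % n)) for i in range(n) for j in range(n)]
    for step in range(max_steps):
        bad = [e for e in edges if color[e[0]] == color[e[1]]]
        if not bad:
            return step
        u, v = rng.choice(bad)                  # random violated edge
        color[u] = rng.randrange(k)             # resample both endpoints
        color[v] = rng.randrange(k)
    return None
```

Consistent with the table above, with k = 4 this converges quickly on small tori, while k = 3 is hopeless already for modest n.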
The results of some of our experiments are summarized in Figure 5.

The support conjecture. The coin chain model was not the first for which we have tried to plot a phase transition in terms of the skew. In the familiar grid coloring problem let k = 4 be fixed, and set the resample probabilities of colors 1 through 4 to (p, (1−p)/3, (1−p)/3, (1−p)/3), with 0 < p < 1. The instances Φn,p were the n × n tori (nearly identical to the grid, but with both dimensions wrapped around to enhance symmetry). Figure 12 shows the number of resample steps for growing n in terms of p. We do not see any phase transition for any of the resample methods! This was our first attempt to see a phase transition in terms of a continuous parameter, a disappointment, but its lack of success has led us to formulate a conjecture:
Conjecture 2.1. (Support Conjecture) Whether Resample runs polynomially or exponentially on a growing-size family of the same instances is unaffected by how we fix the distributions µi, as long as each µi has full support on all values xi may take.
The conjecture seemed to be in line with DimitrisAchlioptas’s intuition that Resample’s success is hard-
Figure 6: Each curve in each diagram corresponds to some probability p (legend values 1/1000 through 1/10000); the panels show the methods Random Fixed Order, Random, Recursive, Least Bad Neighbors, Cyclical and Fixed Order. The grid is 4-colored with distribution (p, (1−p)/3, (1−p)/3, (1−p)/3). Each point is an average of 100 tests divided by n². Even p = 10−5 is linear for all methods (maybe except Fixed Order).
wired in the instance and independent of the resample probabilities (which are add-ons). Although the Coin Chain example refutes the Support Conjecture in general, the conjecture seems to hold for the grid-coloring problem, and probably for several other instances too, raising the question:
What combinatorial properties of Φn,α determine ifthe Support Conjecture holds for that family?
3 The Coin Chain
We have thoroughly investigated the coin chain problem discussed in the previous section, and the first thing we have noticed is that the meta-stable distribution is unique and attracts the state: no matter what initial assignment we start from, we end up in that distribution. This is demonstrated in Figure 7, starting from all-Heads and from 1% Heads. (When we start from an i.i.d. assignment with 0.634n expected number of Tails, the number of Tails first goes up and then settles again around 0.634n, but this time with the meta-stable distribution.) The Tails-density of the state, once it reaches the meta-stable distribution, oscillates around its most likely value. In Figure 9 a.) we have plotted the relative density of the Tails during this oscillation for various parameter values. In Figure 9 b.) we did the same with random 3SAT, which gives a remarkably similar picture, indicating that understanding Resample on the Coin Chain is a good first step.
Figure 7: The Resample process for the Coin Chain “mixes” to the meta-stable distribution (number of Heads vs. number of steps, starting from 100% Heads and from 1% Heads).
4 The Math of the Meta-stable Equilibrium
In this section we explain the meta-stable equilibrium phenomenon with rigor on the following little example:
Figure 8: A constraint in the coin chain problem: the middle coin decides if the constraint is violated (the constraint fails if it is a Head). Upon ResampleConstraint all three coins must be resampled.
Figure 9: Oscillation around the stable number of Tails. Left: Coin Chain (most common number of Tails for probabilities 0.63 through 0.9). Right: random 3SAT (ratio of unsatisfied clauses to all clauses for densities 2.4 through 2.8).
(The figure shows the instance Φ4,p: variables x1, . . . , x4 and the twelve constraints Ci,j.)
Instance Φn,p: n Boolean variables x1, . . . , xn; n(n − 1) constraints Ci,j(xi, xj) (1 ≤ i ≠ j ≤ n). The constraints are defined by

  Ci,j(xi, xj) = 0 if xi = 0,   Ci,j(xi, xj) = 1 if xi = 1.

Notice that Ci,j depends only on xi. Furthermore, the assignment 11 . . . 1 satisfies all constraints of the instance, and it is the only satisfying assignment. The instance has symmetry Sn, which greatly simplifies the analysis. The parameter p of the instance is the probability with which a resample sets a variable to zero:

  xi = 0 with probability p,
  xi = 1 with probability 1 − p.
The Resample algorithm is a Markov chain on all of the 2^n assignments, but because of the Sn symmetry the transition probabilities depend only on the number of zeros in the assignment. Thus the Markov chain projects to a smaller Markov chain X[t] on n + 1 states. If the current assignment has k zeroes — we simply call this state k — it transits to one of the states k − 2, k − 1, k, k + 1 with the following probabilities:
  to k − 2 with probability ((k − 1)/(n − 1)) · (1 − p)²
  to k − 1 with probability ((n(1 − p) + k(3p − 1) − 2p)/(n − 1)) · (1 − p)
  to k     with probability (((k − 1)p + 2(n − k)(1 − p))/(n − 1)) · p
  to k + 1 with probability ((n − k)/(n − 1)) · p²
These four numbers are p−2(k), p−1(k), p0(k) and p1(k), respectively. Let ∆(k) denote the expected change from k = X[t] to X[t + 1]. Then

  ∆(k) = E(X[t + 1] − X[t] | X[t] = k)
       = p1(k) − p−1(k) − 2·p−2(k)
       = ((2p − 1)n − k + 2(1 − p)) / (n − 1).
Figure 10: The Markov chain X[t] on states 0, 1, . . . , n, with transition probabilities p−2(k), p−1(k), p0(k) and p+1(k).
Figure 11: The arrow lengths indicate the expected move from different states under X[t]. The walk will likely stay around a middle point, where the expected move is zero, for a very long time. Below the middle point there is an upward drift, above the middle point there is a downward drift. The location of the neutral point depends on p. This schematic picture corresponds to p = 0.75.
This function is linear in k, and

  ∆(k) > 0 if k < (2p − 1)n + 2(1 − p),
  ∆(k) < 0 if k > (2p − 1)n + 2(1 − p).

If the current state is k and ∆(k) > 0, the number of zeroes is likely to increase, and if ∆(k) < 0, the number of zeroes is likely to decrease. This means that the system will oscillate around k = (2p − 1)n + 2(1 − p) ≈ (2p − 1)n. When p < 0.5 and n is large, the system settles in linear time at 11 . . . 1. When p = 0.5 and n is large, the system still reaches 11 . . . 1 in expected O(n²) time. Finally, when p > 0.5 and n is large, the system will have around (2p − 1)n zeros for a very long time.
5 Selection Methods
In this section we briefly describe the different types of methods we have compared for selecting a violated clause in the GetViolatedConstraint routine.
5.1 Recursive Resampling The recursive method of choosing violated clauses for grids/tori is as follows. First a random monochromatic edge is chosen and resampled; then all adjacent violated edges (including the edge just resampled, if it is still bad) are resampled. Furthermore, the neighbors of those neighbors are resampled, and so on, until all edges in the "neighborhood" have a good coloring. If no violated edge remains in the graph when the recursion returns to the top-level caller, then the process is finished. If not, then a new random edge is selected to be recursively resampled, and the process is repeated.

Pseudocode.
Algorithm 2 Recursive Resample Algorithm
1: procedure RecursiveResampling(G) . G a graph
2:   while exists a violated edge in G do
3:     e ← get random violated edge
4:     RecursiveRecolor(G, e)
5:   end while
6: end procedure
7: procedure RecursiveRecolor(G, e)
8:   Recolor(G, e)
9:   while exists a violated edge in neighborhood(e) do
10:     for every violated edge f in neighborhood(e) do
11:       RecursiveRecolor(G, f)
12:     end for
13:   end while
14: end procedure
Implementation. The biggest challenge here is to select a random edge in constant time. Each edge has pointers to its vertices in the graph. Violated edges are stored in an array A. If an edge is violated, it stores its index in A for quick access. To select a random edge, the method randomly selects one of the violated edges from A. That edge is then resampled, and a sub-routine v2e is called which returns all edges adjacent to a given vertex. Those edges are then checked for violation. If an edge is violated and not in A, then it is added to A. If an edge is no longer violated and it is in A, then it is removed from A. When edges are added to A they are appended to the end, and when they are removed they are replaced with the last edge of the array (or null if it is the last edge in the array). For each resample, checking edges and managing A
takes constant time.
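The array A described above is the standard swap-with-last trick; a sketch of ours in Python, as a set with O(1) add, remove and uniform random choice (here a separate index map stands in for the index that the paper stores on the edge object itself):

```python
import random

class ViolatedSet:
    """Set with O(1) add, remove and uniform random choice, implemented
    as an array plus a per-item index map."""
    def __init__(self):
        self.items, self.index = [], {}

    def add(self, e):
        if e not in self.index:
            self.index[e] = len(self.items)
            self.items.append(e)

    def remove(self, e):
        i = self.index.pop(e, None)
        if i is not None:
            last = self.items.pop()      # swap with last, then shrink
            if i < len(self.items):
                self.items[i] = last
                self.index[last] = i

    def random_choice(self, rng=random):
        return rng.choice(self.items)

s = ViolatedSet()
for e in [(0, 1), (1, 2), (2, 3)]:
    s.add(e)
s.remove((1, 2))
print(sorted(s.items))   # [(0, 1), (2, 3)]
```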
5.2 Random Resampling In this method, to get the next edge to be resampled, we choose a bad edge uniformly at random. The process repeats until all edges are nicely colored.

Pseudocode.
Algorithm 3 Dynamic Random Resample Algorithm
1: procedure DynamicRandomResampling(G) . G a graph
2:   while exists a violated edge in G do
3:     e ← get random violated edge
4:     Recolor(G, e)
5:   end while
6: end procedure
Implementation. Similar to Recursive Resampling, we use an array A to store all bad edges. The method randomly chooses an edge from A, resamples it, and does the same management of A as in Recursive Resampling.
5.3 Fixed Order Resampling Fixed Order first orders the edges. Starting from the top left at (0,0), we traverse the nodes from left to right and from top to bottom, enumerating the edge to the right of the node and then the edge below the node. With this ordering, we resample the lowest-numbered edge with a bad coloring.

Pseudocode.
Algorithm 4 Fixed Order Resample Algorithm
1: procedure FixedOrderResampling(G, π) . G a graph, π edge ordering
2:   while exists a violated edge in G do
3:     e ← get first violated edge of G in the π ordering
4:     Recolor(G, e)
5:   end while
6: end procedure
Implementation. This method requires a heap to keep track of the next edge to be resampled. All violated edges are put into a heap with the least-numbered edge on top. This heap is stored as an array A. Similar to the previous method implementations, each edge stores its position in A. To choose the next edge, the method takes the top edge from the heap and resamples it. Next it calls v2e to get all edges affected by the resampling, and adds or removes said edges from A if they are violated and not in A, or satisfied and in A, respectively. Adding and removing items from a heap takes O(log n).
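A heap-based selection of the least-numbered violated edge can be sketched with Python's heapq; here we use lazy deletion instead of the in-place position tracking described above (an implementation choice of ours):

```python
import heapq

class FixedOrderSelector:
    """Select the least-numbered violated edge. Instead of storing each
    edge's heap position, we push eagerly and discard stale entries on pop."""
    def __init__(self):
        self.heap = []
        self.violated = set()

    def mark_violated(self, rank, edge):
        if edge not in self.violated:
            self.violated.add(edge)
            heapq.heappush(self.heap, (rank, edge))

    def mark_satisfied(self, edge):
        self.violated.discard(edge)   # lazy: entry stays in heap until popped

    def pop_first_violated(self):
        while self.heap:
            rank, edge = heapq.heappop(self.heap)
            if edge in self.violated:
                self.violated.discard(edge)
                return edge
        return None                   # no violated edge remains

sel = FixedOrderSelector()
sel.mark_violated(7, "e7"); sel.mark_violated(2, "e2"); sel.mark_violated(5, "e5")
sel.mark_satisfied("e2")          # e2 got fixed as a side effect
print(sel.pop_first_violated())   # e5
```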
Studying a visual run of the Fixed Order resample, we observed a moving "frontier": at a certain row, all rows below it have a good coloring except for a few edges. The few form a path of bad edges from the "frontier" reaching as far down as the first row of the graph.
5.4 Fixed Random Resampling Similarly to Fixed Order, Fixed Random enumerates the edges, but in a random ordering. When selecting the next edge to be resampled, this method chooses the badly colored edge with the least number.

Pseudocode.
Algorithm 5 Fixed Random Resample Algorithm
1: procedure FixedRandomResampling(G, π)    ▷ G a graph, π edge ordering
2: while exists a violated edge in G do
3:     e ← get first violated edge of G in π ordering
4:     Recolor(G, e)
5: end while
6: end procedure
Implementation. This method first stores all edges in an array B. Next it repeatedly chooses a random edge from B, numbers it (every edge stores its rank in the ordering), and removes it from B. This process takes O(n). After the enumeration, all violated edges are stored in a heap implemented by an array A. As in Fixed Order Resampling, an edge is selected by taking the top of the heap; after that edge is resampled, the neighboring edges are added to or removed from A exactly as in Fixed Order. Choosing an edge to resample takes O(log n) time.
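The enumeration step can be sketched as a single shuffle, which is equivalent to repeatedly drawing and removing a random edge from B. This is a sketch; the function name is ours.

```python
import random

def random_edge_ranking(edges, seed=None):
    """Assign each edge a fixed random rank: shuffle once up front,
    then resampling always picks the violated edge of least rank."""
    rng = random.Random(seed)
    order = list(edges)
    rng.shuffle(order)
    return {e: i for i, e in enumerate(order)}
```

The resulting rank map can then drive the same violated-edge heap as in Fixed Order Resampling.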
5.5 Cyclical Resampling As in Fixed Order resampling, there must first be an ordering of the edges; we use the same ordering as for Fixed Order. The method then iterates over the list of edges, resampling each monochromatic edge it finds. If the graph is not properly colored by the time it reaches the end of the list, it repeats, iterating over the list until the graph is colored.
Pseudocode.
167 Copyright © by SIAM
Unauthorized reproduction of this article is prohibited
Algorithm 6 Cyclical Resample Algorithm
1: procedure CyclicalResampling(G, π)    ▷ G a graph, π edge ordering
2: while exists a violated edge in G do
3:     a ← get array of violated edges of G in π ordering
4:     for every violated edge e in a do
5:         Recolor(G, e)
6:         Update(a, e, G)
7:     end for
8: end while
9: end procedure
Implementation. To implement this method efficiently we use an array that stores all bad edges in order. When an edge is resampled, we insert or remove affected edges in the array, but only if they have not yet been passed in the current cycle.
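The cyclic sweep can be sketched as follows. This is a simplified sketch that re-checks every edge on each pass rather than maintaining the in-order bad-edge array; `is_violated` and `resample` are hypothetical callbacks. Like Resample itself, the loop terminates only if the process succeeds.

```python
def cyclical_passes(edges_in_order, is_violated, resample):
    """Sweep the fixed edge order repeatedly, resampling every violated
    edge encountered, until a full pass finds no violation.
    Returns the number of resample steps performed."""
    steps = 0
    done = False
    while not done:
        done = True
        for e in edges_in_order:
            if is_violated(e):
                resample(e)
                steps += 1
                done = False   # a resample happened; another pass needed
    return steps
```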
5.6 Least Bad Neighbors (LBN) We observed that when resampling an edge with many neighboring bad edges, the resampling is more likely to reduce the total number of bad edges. Conversely, resampling an edge with few neighboring bad edges is more likely to increase the total number of bad edges. The Least Bad Neighbors variation therefore chooses as the next edge to be resampled the edge with the least number of badly colored neighbors. If there are multiple candidates, the algorithm randomly selects one of them.
Pseudocode.
Algorithm 7 LBN Resample Algorithm
1: procedure LBNResampling(G)    ▷ G a graph
2: while exists a violated edge in G do
3:     e ← get edge with least number of bad neighbors
4:     Recolor(G, e)
5: end while
6: end procedure
Implementation. A min-heap keeps track of the edges, with an edge with the fewest bad neighbors on top. This heap implementation is very similar to the fixed order and fixed random ones, but the ordering is based on the number of neighboring bad edges.
6 Test Parameters for the Grid and Torus
There are certain factors that affect the performance ofthe Resample Algorithm. In particular we studied the
effects of changing the number of colors, the resample probabilities in the case of 4 and 6 colors, the method variation, and the size of the graph. We chose to study the behavior of the Resample Algorithm most closely on the coloring problem of the Grid/Torus graph type because the problem is very easy to picture.
Probability Distributions. For 6 colors, we used a probability distribution of (p, p, p, (1−3p)/3, (1−3p)/3, (1−3p)/3), where p is the probability of the corresponding color appearing. For 4 colors, we used a probability distribution of (p, (1−p)/3, (1−p)/3, (1−p)/3). We experimented with what happens as p gets close to zero. In both cases, the closer p is to zero, the more the coloring resembles one with 3 colors, yet the number of resample steps for any fixed p ≠ 0 was linear in n² (the number of constraints).
Graph Size. We ran tests on square Grids/Tori whose sides were of the following lengths: 100, 200, ..., 1000.
Grid vs. Torus. In testing Grids and Tori of large size it became apparent that there is very little difference between the two.
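The two distribution families, and sampling from them, can be sketched as follows (function names are ours):

```python
import random

def color_distribution(k, p):
    """Build the skewed distributions used in the experiments:
    for k=6 the first three colors get probability p and the rest
    (1-3p)/3 each; for k=4 the first color gets p and the rest
    (1-p)/3 each."""
    if k == 6:
        return [p, p, p] + [(1 - 3 * p) / 3] * 3
    if k == 4:
        return [p] + [(1 - p) / 3] * 3
    raise ValueError("only k in {4, 6} used in these experiments")

def sample_color(dist, rng=random):
    """Draw one color index according to `dist`."""
    return rng.choices(range(len(dist)), weights=dist, k=1)[0]
```

Resampling an edge then simply redraws both endpoint colors independently from the chosen distribution.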
7 The robustness of the support conjecture
Figure 12 and Figure 13 relate to similar experiments. Figure 13 uses 6 colors, with half the colors having probability p of appearing and the other half having probability (1−3p)/3. Figure 12 uses 4 colors, with one having probability p and the other three (1−p)/3. As p gets closer to 0, both distributions get closer to being supported on 3 colors. However, even at p = 1/10000 there is still linear behavior with respect to size. As a function of p, the number of resamples grows hyperbolically for the same n. These experiments gave rise to our support conjecture for the Grids/Tori coloring problem.
[Figure 12 plot: average resamples vs. probability p, comparing the Random, Fixed Order, and LBN policies.]
Figure 12: Comparison of best and worst resampling policy. Experiment with fixed graph size = 1000 and changing probability p, with 6 colors and distribution (p, p, p, (1−3p)/3, (1−3p)/3, (1−3p)/3). Each point is an average of 100 tests.
[Figure 13 plots: resamples/size² vs. size (200–1000) for the Random, Recursive, Least Bad Neighbors, Cyclical, and Fixed Order policies, each with curves for p = 1/1000 through 1/10000.]
Figure 13: Experiment with changing probability p and graph size n, with 6 colors and distribution (p, p, p, (1−3p)/3, (1−3p)/3, (1−3p)/3). Each point is an average of 100 tests divided by the number of vertices of the torus (n²).
8 Conclusions and Open Problems
On the theoretical, but also on the experimental side, most questions remain open. We believe the top priority among them is to develop theoretical methods to bound α_resample in cases when it is significantly beyond the magic α_lovasz. Simple ad hoc methods sometimes work: e.g. up to p = 1/3 for the coin chain and for ≥ 7 colors for the grid, but we do not have sweeping new general methods, different from the LLL, that address α_resample. We have no methods at all for showing that Resample does not work beyond some threshold (other than in the simplest cases, such as in Section 4).
The Full-support conjecture says that when we apply Resample to color the grid, aside from a constant factor it does not matter how we set the probabilities, as long as each µi has full support. This might be a hint for the existence of a more combinatorial alternative to Resample [in some cases] that avoids the µi altogether. The new framework of Achlioptas-Iliopoulos [1] is combinatorial, though it does not solve the support conjecture for grids.
Acknowledgements. We thank Dimitris Achlioptas for interesting discussions, and Alistair Sinclair, who pointed us to the physics term “meta-stable equilibrium.”
References
[1] D. Achlioptas and F. Iliopoulos, Random walks that find perfect objects and the Lovasz Local Lemma, in 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS, 2014, pp. 494–503.
[2] N. Alon, A parallel algorithmic version of the LocalLemma, in FOCS, 1991, pp. 586–593.
[3] J. Beck, An algorithmic approach to the Lovasz Local Lemma I, Random Struct. Algorithms, 2 (1991), pp. 343–366.
[4] A. Czumaj and C. Scheideler, Coloring non-uniform hypergraphs: a new algorithmic approach tothe general Lovasz Local Lemma, in SODA, 2000,pp. 30–39.
[5] P. Erdos and L. Lovasz, Problems and results on3-chromatic hypergraphs and some related questions,In A. Hajnal, R. Rado and V.T. Sos, editors, Infiniteand Finite Sets (to Paul Erdos on his 60th birthday),(1975), pp. 609–627.
[6] H. Gebauer, T. Szabo, and G. Tardos, The Local Lemma is tight for SAT, in SODA, 2011, pp. 664–674.
[7] K. B. R. Kolipaka and M. Szegedy, Moser andTardos meet Lovasz, in STOC, 2011, pp. 235–244.
[8] M. Molloy and B. A. Reed, Further algorithmicaspects of the Local Lemma, in STOC, 1998, pp. 524–529.
[9] R. A. Moser, A constructive proof of the Lovasz LocalLemma, in STOC, 2009, pp. 343–350.
[10] R. A. Moser and G. Tardos, A constructive proof ofthe general Lovasz Local Lemma, J. ACM, 57 (2010).
[11] A. D. Scott and A. D. Sokal, On dependencygraphs and the lattice gas, Combinatorics, Probability& Computing, 15 (2006), pp. 253–279.
[12] J. B. Shearer, On a problem of Spencer, Combina-torica, 5 (1985), pp. 241–245.
[13] A. Srinivasan, Improved algorithmic versions of theLovasz Local Lemma, in SODA, 2008, pp. 611–620.
APPENDIX
The LLL and the Resample algorithm
A general scenario that occurs again and again in computer science is the following: we have variables x1, . . . , xn, possibly each with a different domain. In our experiments the xi are either Boolean, i.e. take values from {0, 1}, or take values from a color set [k] = {1, . . . , k}. We also have constraints that express relations between (typically small) subsets of the variables. In our 3SAT experiment, for instance, the variables are Boolean and the constraints are of the form x1^ε1 ∨ x2^ε2 ∨ x3^ε3, where ε1, ε2, ε3 indicate possible negations. In our k-coloring experiments the variables range over the color set [k], and constraint C(x, y) expresses that x and y have different colors. This is the familiar graph coloring problem.
Definition .1. (vbl(C)) The variable set of a constraint C(xi1, . . . , xiℓ) is {xi1, . . . , xiℓ}, and it is denoted by vbl(C).
Dependency graph. To quantify the sparsity of a general constraint system C1, . . . , Cm we define the dependency graph G on the set of its constraints by

V(G) = {1, . . . , m},
E(G) = {(i, j) | vbl(Ci) ∩ vbl(Cj) ≠ ∅}.
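The construction of E(G) can be sketched directly from this definition (a small helper of our own, with constraints represented by their variable sets):

```python
from itertools import combinations

def dependency_graph(vbl):
    """Given vbl[i] = set of variables of constraint C_i,
    return the edge set of the dependency graph G: two constraints
    are adjacent whenever their variable sets intersect."""
    edges = set()
    for i, j in combinations(sorted(vbl), 2):
        if vbl[i] & vbl[j]:          # shared variable => dependency edge
            edges.add((i, j))
    return edges
```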
Bad event probabilities. For the Lovasz Local Lemma (and for the Resample algorithm) it is necessary that the domain of each variable is assigned a probability distribution: xi ∼ µi. Then we can talk about the probability that a constraint does NOT hold, assuming its variables are independently drawn. Let pi be this probability for constraint Ci. The instance we want to satisfy is C1 ∧ C2 ∧ . . . ∧ Cm, i.e. we want each constraint Ci to hold.
The Lovasz Local Lemma is expressed in terms ofthe constraint graph G and the bad event probabilityvector p = (p1, . . . , pm). A simple version states that if
(∆(G) + 1)× |p|∞ < 1/e
then all the bad events can be avoided with positive probability, where ∆(G) is the maximum degree of G and |p|∞ = max_i pi.
If we want to express the sparsity condition solely in terms of ∆(G) and |p|∞, the simple version is nearly sharp. The lemma is local in the sense that its condition can be checked for each constraint separately, just by looking at its probability and how many neighbors it has. A local but more complicated condition is:
Standard LLL condition, STD(G, p): For some 0 ≤ z1, . . . , zm < 1:

pi < zi ∏_{j:(i,j)∈E(G)} (1 − zj)    (.1)

From STD(G, p) one can recover the simple condition by setting zi = 1/∆ and assuming pi = 1/(e∆ + e).
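Both the simple condition and STD(G, p) are easy to check numerically. The sketch below is our own; constraints are indexed by dictionary keys as a convention.

```python
import math

def simple_lll_holds(max_degree, p_max):
    """Simple local condition: (Delta(G) + 1) * |p|_inf < 1/e."""
    return (max_degree + 1) * p_max < 1 / math.e

def std_lll_holds(p, z, neighbors):
    """Standard condition STD(G, p): for every constraint i,
    p_i < z_i * prod over dependency-graph neighbors j of (1 - z_j)."""
    for i in p:
        prod = 1.0
        for j in neighbors[i]:
            prod *= 1 - z[j]
        if not p[i] < z[i] * prod:
            return False
    return True
```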
Theorem .1. (Lovasz Local Lemma) Let C1, . . . , Cm be a system of constraints on variables x1, . . . , xn with dependency graph G such that when the xi ∼ µi independently, the probability that Ci does not hold is pi. Then STD(G, p), where p = (p1, . . . , pm), implies that the system is satisfiable.
The Standard LLL can beat the Simple LLL by a largefactor when G has very different vertex degrees. Weremark that LLL is often stated in a more generalabstract setting, with which we are not concerned here.
Resample: R. Moser, and later Moser and G. Tardos, discovered that when STD(G, p) holds, one can successfully run the very simple Algorithm 1 to actually find a solution for C1 ∧ . . . ∧ Cm. In Moser's original paper the violated clause had to be chosen in a certain way, but Moser and Tardos proved that even when an adversary chooses the clauses to be resampled, Resample works.
Theorem .2. (Moser-Tardos) Assume that the STD(G, p) condition holds with z1, . . . , zm in Equation (.1). Then Resample finds a solution for C1 ∧ . . . ∧ Cm in expected running time at most

∑_{i=1}^{m} zi/(1 − zi)    (.2)

regardless of how the GetViolatedConstraint procedure picks a violated clause.
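The bound (.2) is straightforward to evaluate for a given choice of the zi (helper name is ours):

```python
def moser_tardos_bound(z):
    """Moser-Tardos bound on the expected number of resample steps:
    sum over constraints i of z_i / (1 - z_i)."""
    return sum(zi / (1 - zi) for zi in z)
```

For example, with all zi = 1/∆ the per-constraint term is 1/(∆ − 1), so the total bound is linear in the number of constraints.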
[Figure: an example instance on variables x1, . . . , x9 with constraints C1, . . . , C12 (left), and its dependency graph G (right).]