The Moser-Tardos Resample algorithm: Where is the limit? (an experimental inquiry)
Jan Dean Catarata∗ Scott Corbett∗ Harry Stern∗ Mario Szegedy∗
Tomas Vyskocil∗ Zheng Zhang∗
Abstract

The celebrated Lovász Local Lemma (LLL) guarantees that locally sparse systems always have solutions. The Moser-Tardos Resample algorithm not only finds such a solution in linear time, but its beautiful analysis has greatly enhanced LLL-related research [9, 10]. Nevertheless, two major questions remain open.
1. How far beyond Lovász's condition can we expect Resample to still perform in polynomial (linear) expected running time?
2. In Resample we have a choice among different constraint-selection strategies. How much does this choice matter?
To state the first question correctly is already a challenge. For a solvable fixed instance Resample always comes up with a solution, but the catch is that the number of steps may be very large. We have therefore looked at parameterized instance families and tried to identify phase transitions in terms of these parameters. Perhaps the biggest lesson we have learned is that if we want to see phase transition thresholds, i.e. identify parameter values where Resample “stops working,” we need to understand what happens when Resample does not work. We have noticed that in this case the algorithm settles at a meta-stable equilibrium (at least for the homogeneous instances we have considered), a phenomenon mostly studied for physical systems.
Concerning the policies for picking the violated constraints (such as first violated, random violated, recursive fix, etc.), in the context of the grid-coloring problem the methods worked for exactly the same parameter range, and the number of resample steps differed by no more than 20 percent.
All results are experimental, although we discuss a possible reason behind some of the phenomena.
1 Introduction
The problem of solving constraint systems over discrete variable sets is NP-hard in general. There are two notable exceptions, however.
1. When all constraints belong to certain easy families of algebraic type, such as linear equations over a finite field.
∗Department of Computer Science, Rutgers University, Piscataway, NJ. Email addresses: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]. This research was supported by NSF grants 1422102 AF, 1514164 AF and NSF-CCF-1628401.
2. When some combinatorial restriction holds, most importantly sparsity.
In this article we are concerned with the latter. The focus of our investigation is the celebrated Resample algorithm of Robin A. Moser and Gábor Tardos [9, 10], which finds, in expected linear time, a solution for any constraint system that meets the sparsity condition of the Lovász Local Lemma (LLL):
Theorem 1.1. (special case of LLL, [5]) Let X = {x1, . . . , xm} be a set of discrete-valued variables. Let C1, . . . , Cn be constraints, where each constraint Ci is some (true/false) predicate over a subset vbl(Ci) of the variable set X. Define pi as

  pi = (# of assignments to vbl(Ci) that do not satisfy Ci) / (# of all assignments to vbl(Ci)).

If each constraint intersects at most min_i 1/(e·pi) − 1 (e ≈ 2.71) other constraints (Ci and Cj intersect if vbl(Ci) ∩ vbl(Cj) ≠ ∅), i.e. the system meets the simple-LLL sparsity condition, then there is an assignment to X that satisfies all constraints. (In this 1975 theorem only existence was stated!)
The Moser-Tardos process is very simple: after an initial random assignment we keep picking violated constraints. In each such step we reassign random values to all of the variables of the newly picked constraint [resample step]. We do this until no violated constraint can be found. In code:
The input is a constraint system Φ with variables x1, . . . , xn and constraints C1, . . . , Cm. For every xi a probability distribution µi on the possible values of xi is given. Procedure ResampleConstraint(C) randomly resets every variable xi in vbl(C) according to µi. The algorithm starts with an initialization step in which every xi is randomly set according to µi. Note: Theorem 1.1 generalizes in the presence of the µi. Then pi becomes the probability that Ci does not hold
159 Copyright © by SIAM
Unauthorized reproduction of this article is prohibited
Algorithm 1 Resample
1: procedure Resample(Φ, µ1, . . . , µn)
2:   for all xi (1 ≤ i ≤ n) do
3:     xi ← RandomValue(µi)
4:   end for
5:   while ViolationExists(Φ, x1, . . . , xn) do
6:     C ← GetViolatedConstraint /* Unless said otherwise, a violated constraint is picked at random */
7:     ResampleConstraint(C) /* Resample step */
8:   end while
9: end procedure
10: procedure ResampleConstraint(C)
11:   for all xi ∈ vbl(C) do
12:     xi ← RandomValue(µi)
13:   end for
14: end procedure
under ∏(i=1..n) µi. The density criterion for both LLL and Resample remains the same as in Theorem 1.1, i.e. every constraint must intersect at most min_i 1/(e·pi) − 1 other constraints.
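As a concrete illustration (our own sketch, not the authors' code), the loop of Algorithm 1 can be written in Python; the constraint representation and the random-violated-constraint policy below are our assumptions:

```python
import random

def resample(constraints, mu, rng=random.Random(0)):
    """Moser-Tardos Resample sketch.

    constraints: list of (predicate, vbl) pairs; predicate reads the
                 assignment dict, vbl lists the variable names it depends on.
    mu:          dict name -> list of values; RandomValue(mu_i) is modeled
                 here as a uniform draw from that list.
    Returns (assignment, number_of_resample_steps)."""
    x = {v: rng.choice(vals) for v, vals in mu.items()}   # initialization step
    steps = 0
    while True:
        violated = [c for c in constraints if not c[0](x)]
        if not violated:
            return x, steps              # no violated constraint remains
        _, vbl = rng.choice(violated)    # GetViolatedConstraint: random pick
        for v in vbl:                    # resample step
            x[v] = rng.choice(mu[v])
        steps += 1

# A tiny 2-clause example: (a or b) and (not b or c)
clauses = [(lambda x: x['a'] or x['b'], ['a', 'b']),
           (lambda x: (not x['b']) or x['c'], ['b', 'c'])]
x, steps = resample(clauses, {'a': [0, 1], 'b': [0, 1], 'c': [0, 1]})
print(all(pred(x) for pred, _ in clauses))   # True
```

Termination is guaranteed only in expectation (and only under the LLL condition), which is exactly the question this paper probes experimentally.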
The ancestors of Resample, by J. Beck [3], followed by several others [2, 8, 4, 13], did not meet the LLL bound. Given the desirable properties of the Resample algorithm, we would like to understand its precise limits. The super-elegant proof of Moser and Tardos mysteriously breaks down exactly at the LLL threshold, but what is the algorithm's true limitation? Another question is the method that selects the violated constraint (i.e. GetViolatedConstraint above), which is arbitrary in the proof of Moser and Tardos. How do different selection methods compare in performance?
Figure 1: A “random” 3SAT instance on 15 variables with sparsity 0.2. (The figure shows variables x0, . . . , x14 and the clauses x3 ∨ x11 ∨ x14, x6 ∨ x9 ∨ x11 and x1 ∨ x4 ∨ x7.)
Example. A strict kSAT instance is

  C1 ∧ C2 ∧ . . . ∧ Cm

where every constraint Ci is a disjunction

  Ci = x(1,i)^ε(1,i) ∨ . . . ∨ x(k,i)^ε(k,i)   (ε(j,i) ∈ {1, −1})

of k literals (i.e. variables and their negations, where ε(j,i) = −1 indicates that x(j,i) is negated), such that for 1 ≤ j < j′ ≤ k the variables x(j,i) and x(j′,i) are different. The latter condition is necessary to ensure that the probability of the event Ai that the ith constraint does not hold under a random assignment is exactly 2^−k. The kSAT problem, which asks if an instance has a satisfying assignment, is NP-hard, but satisfiability automatically holds under the following:
Sparsity restriction: Every constraint Ci shares variables (negated or non-negated) with at most ⌊2^k/e⌋ − 1 other constraints.
The statement is an immediate consequence of Theorem 1.1. The theorem of Moser and Tardos [10] implies that under this condition Resample finds a satisfying assignment in expected Ok(m) time. Heidi Gebauer, Tibor Szabó and Gábor Tardos [6] have proved that there are unsatisfiable kSAT formulas in which every clause meets at most (1 + O(1)/√k) · 2^k/e other clauses, so the above bound is close to sharp. The GST construction is, however, a carefully designed instance.
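The two quantities above — the clause violation probability 2^−k and the LLL degree bound ⌊2^k/e⌋ − 1 — are easy to check mechanically (our own sketch):

```python
from itertools import product
from math import e, floor

def lll_ksat_degree_bound(k):
    """Max number of OTHER clauses a k-clause may intersect under the
    simple LLL: p_i = 2^-k, bound 1/(e * p_i) - 1, rounded down."""
    return floor(2**k / e) - 1

def clause_violation_probability(k):
    """Verify by enumeration that a k-clause on k distinct variables is
    violated by exactly one of the 2^k assignments to its variables
    (the all-false one, after orienting literals)."""
    bad = sum(1 for a in product([0, 1], repeat=k) if not any(a))
    return bad / 2**k

print(lll_ksat_degree_bound(3))          # 2^3/e ~ 2.94, so the bound is 1
print(clause_violation_probability(3))   # 0.125 = 2^-3
```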
How about random instances? They are known to be satisfiable up to a density threshold of 4.27 (meaning that (# of clauses)/(# of variables) = 4.27), which is far beyond what LLL could prove. We have experimentally found that the Resample algorithm works for random 3SAT instances with density up to roughly 2.45, still far beyond the LLL sparsity condition. At density 2.4 ± 0.05 a phase transition occurs: at density 2.6 Resample is practically unable to cope with 3SAT.
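A minimal generator for such random instances (our own sketch; each clause picks k distinct variables uniformly and negates each with probability 1/2):

```python
import random

def random_ksat(n, density, k=3, rng=random.Random(0)):
    """Random kSAT instance: n variables, round(density * n) clauses.
    A clause is a tuple of nonzero ints, DIMACS-style: v means x_v,
    -v means the negation of x_v."""
    m = round(density * n)
    clauses = []
    for _ in range(m):
        vs = rng.sample(range(1, n + 1), k)   # k distinct variables
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in vs))
    return clauses

def satisfied(clause, assignment):
    """assignment: dict var -> bool."""
    return any(assignment[abs(l)] == (l > 0) for l in clause)

inst = random_ksat(15, 0.2)   # the sparse regime of Figure 1
print(len(inst))              # 3 clauses
```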
2 Our Results
Given a system Φ of constraints, Theorem 1.1 gives a sufficient criterion for its solvability, and under the same criterion Resample also efficiently finds a solution. In fact, Theorem 1.1 has several variants, but for our applications none provides a significant improvement over Theorem 1.1. The blueprint of all variants is an abstract probability-theoretic version [5]. The best improvement is by Shearer [12], who has also proved the optimality of his version in Lovász's original general abstract setting. The Moser-Tardos argument can be pushed to the Shearer bound [7], but no further, and
quite surprisingly the same bound also appears in a different context in statistical physics [11].
Our goal was to challenge this notorious threshold and show that Resample carves out an applicability domain of its own from the parameter space. We also wanted to find out exactly what that domain is.
The applicability of Resample on individual instances is not well defined, at least if we want to capture it via running time analysis, which involves limit-taking. Therefore we consider families of instances parameterized by size and by another parameter. If the latter parameter is the density, it varies opposite to the solvability: the larger the density, the less solvable the system. The number of colors for a coloring problem, on the other hand, varies together with solvability: more colors means it is easier to find a solution. A third parameter we look at is the skew of the resample distributions µi. The skew does not affect the solvability of the instance at all, but it changes the LLL bound and the behavior of Resample in the same way as density does: the more skewed the µi are, the smaller the LLL bound is, and Resample gets worse too. For a constraint family with parameter α let αlovasz, αresample, αsolvability be the validity/applicability thresholds of LLL and the Resample algorithm, and the threshold at which the system becomes unsolvable, respectively. For density-like parameters we have:
αlovasz ≤ αresample ≤ αsolvability
For parameters varying together with solvability the inequalities are reversed. The example in the previous section is summarized in:
Random 3SAT:
  αlovasz   αresample   αsolvability
  0.55      2.4         4.27
For the first column we have just plugged the average number of neighbors of a clause into the LLL formula, a back-of-the-envelope estimate. For many problem families the third column is unknown, and this paper is the first to systematically study the second column.
The main message of this article is that αresample is definable, and at least experimentally exists. Also, we often find αlovasz ≪ αresample, so αresample is a much better predictor of αsolvability than αlovasz.
The existence of αresample means that there is a phase transition threshold for any reasonable parameterized family, and this threshold is usually different from (better than) the LLL threshold.
The phase transition phenomenon. Let Tresample(Φ) denote the expected number of resample steps in which Resample solves an instance
Figure 2: Expected number of resample steps for the random 3SAT problem as a function of the density and the instance size n (= number of variables), with a cutoff at 100000.
Φ. To reveal the sharp distinction between “good” and “bad” instances we have considered parameterized families of instances Φn,α with two parameters: a size parameter n, and another parameter α representing density, number of colors, etc. We say that there is a phase transition at α0 if fα(n) := Tresample(Φn,α) undergoes a growth rate change in α: for α < α0 the growth of fα(n) is polynomial, but for α > α0 it is exponential. First taking the limit in n and then looking at how its properties change in α is not unlike how physicists study spin chains, where n is the number of spinors and α is the temperature, magnetization, etc. For such a scheme to work, for a fixed α the members of the sequence {Φn,α}∞n=1 must only differ in size, but otherwise they should be the “same.” With modern graph limit theory we could rigorously define what we mean by a family of “same instances,” but we have omitted this complication, as in all of our examples (the n × n grid, random 3SAT, coin chains of length n) the sameness is abundantly clear. Figure 2 shows how in the case of random 3SAT a phase transition happens around α = 2.4 (n is the number of variables, α = (# of clauses)/(# of variables)).
Meta-stable equilibrium and the Coin Chain problem. Although Figure 2 does show that a transition happens around α = 2.4, the sharpness of the threshold remains a question. One might blame the complexity of the experiment: for each fixed parameter value we have to create several random instances, and on each we need to run Resample several times. As a result n cannot be much larger than the thousands, which, one might think, makes the picture jittery. But the
Figure 3: Capturing the phase transition phenomenon for the Coin Chain problem using very different concepts. In a.) we have run Resample until it solved the instance (the search needed to be cut off for some instances) and plotted the number of resample steps as a function of the size and the parameter value p of the instance. In b.) we have fixed the size of the instance and also the number of resample steps (both to sufficiently large numbers) and plotted the number of satisfied constraints as a function of p. The phase transition clearly jumps out as a sudden drop in the number of Tails at p0 ≈ 0.634.
issue is not that.

To be able to zoom into the phase transition we have created a very uniform family of instances we call Coin Chain: instance Φn,p is a chain x1, . . . , xn of binary variables (Head/Tail). The indices are taken mod n to make the chain a circle. The constraints are {Ci} (i = 1, . . . , n), defined via vbl(Ci) = {xi−1, xi, xi+1} and Ci = xi (Head = 0, Tail = 1). See Figure 8. Resample process: if there are any Heads, we pick a uniformly random Head, say xi, and resample {xi−1, xi, xi+1}, but with a biased coin: Prob(Head) = p, Prob(Tail) = 1 − p. Even though Ci depends only on xi, Resample must still address all variables in vbl(Ci). If there are no Heads, we have arrived at the all-Tails assignment, the only one that satisfies the instance. When we plot the number of resample steps in terms of n and p (see Figure 3 a.)), the graph is very similar to Figure 2. There is a phase transition around 0.63.
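The Coin Chain Resample process is simple enough to simulate directly; a sketch of ours (Head = 0, Tail = 1, indices mod n):

```python
import random

def coin_chain_resample(n, p, max_steps, rng=random.Random(0)):
    """Run Resample on the Coin Chain Phi_{n,p}: constraint C_i is violated
    iff x_i is a Head (0); resampling C_i rebiases x_{i-1}, x_i, x_{i+1}
    with Prob(Head) = p. Returns the number of resample steps, or None if
    the all-Tails assignment was not reached within max_steps."""
    x = [0 if rng.random() < p else 1 for _ in range(n)]  # init according to mu
    heads = {i for i, v in enumerate(x) if v == 0}
    for step in range(max_steps):
        if not heads:
            return step                      # all Tails: instance solved
        i = rng.choice(sorted(heads))        # uniformly random violated C_i
        for j in ((i - 1) % n, i, (i + 1) % n):
            x[j] = 0 if rng.random() < p else 1
            (heads.add if x[j] == 0 else heads.discard)(j)
        # note: x_i itself may come up Head again and be re-picked later
    return None

print(coin_chain_resample(200, 0.3, 100_000) is not None)  # below threshold: solves
```

Running this for p well above 0.634 with a moderate step budget never reaches all-Tails, which is exactly the meta-stable behavior described next.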
Since the process is so intuitive, we can understand what happens: if p is less than the phase transition threshold, the Heads are eliminated at a linear rate. If however p is above the threshold, relatively quickly a dynamic equilibrium is reached in which the process keeps finding assignments whose simple statistics (most notably the number of unsatisfied constraints) remain nearly stable. This so-called meta-stable equilibrium persists for an exponentially long time, until the all-Tails assignment is reached by chance. For a chain of length 40000 the equilibrium was already reached within 1000000 steps for every p, after which the fraction of Tails remained constant. We have plotted the most typical number of Tails (the peaks in Figure 9 a.)) as a function of p (see Figure 3 b.)), and the phase transition strikingly came out. We have made several experiments to explore other parameters of the equilibrium, such as the correlations that arise, for instance, for p = 0.7 and some other values of p. There are non-trivial short-range correlations, but there are no long-range correlations (i.e. between distant coins of the chain), as expected. In spite of our numerous experiments we cannot prove the existence of the equilibrium, and we cannot even prove that for p = 0.999 the process does not reach the all-Tails solution in polynomial expected time. The LLL gives that Resample works below p = 0.073. Thus:
Coin Chain:
  plovasz   presample   psolvability
  0.073     0.634       1
There is an even simpler model, for which we can show the meta-stable equilibrium by mathematical means (Section 4). When the equilibrium method is applied to the random 3SAT problem (see Figure 4), we get a picture similar to Figure 3 b.) and can place the phase transition with greater confidence around 2.4.
The grid graph. Most of our experiments concerned the 2D grid. In the k-coloring problem we must assign numbers from 1 through k to the nodes of the grid such that neighboring nodes get different colors. Here the natural parameter is k. What we have found is
Figure 4: The fraction of violated constraints for a random 3SAT with 1000 variables after 100000 resample steps, as a function of the density.
2D grid, k-coloring:
  χlovasz   χresample   χsolvability
  19        4           2
Here χsolvability = 2 means that the grid is 2-colorable, χresample = 4 means that the Resample algorithm efficiently 4-colors the grid, and χlovasz = 19 means that the LLL only implies that the grid is 19-colorable. Since k is discrete, the transition happens between three and four: k ≤ 3 requires an exponential, and k ≥ 4 a linear, number of resamples in terms of the grid size n².

The Selection Method. Resample (Algorithm 1) gives the freedom to choose any method GetViolatedConstraint for selecting a violated constraint C for the next resample step. We have looked at several reasonable resample methods in the case of the n × n torus k-coloring problem:
Recursive Fix: We iteratively call a Fix routine, just as in Moser's original paper. The routine does not return until all intersecting constraints are satisfied.
Random: We pick an unsatisfied (= badly colored) edge each time entirely at random from the set of all unsatisfied edges.
Fixed Order: We pick the first unsatisfied edge according to a fixed (lexicographic) ordering of the edges.
Random Fixed Order: We pick the first unsatisfied constraint according to a random permutation of the edges, fixed before the start of the process.
Cyclical: For (a certain fixed, lexicographic-type)
Figure 5: Experimental results of performing various resample methods on the n × n torus. The node colors are uniformly distributed on {1, 2, 3, 4, 5, 6}. The x-coordinate represents the size n and the y-coordinate is the number of resamples averaged over 100 tests and normalized (divided) by n². The methods compared are Random Fixed Order, Cyclical, Fixed Order, Random, Recursive and LBN.
cyclic order of the edges we always pick the next bad edge, every time moving only forward in the cycle.
Least Bad Neighbors: We randomly pick an unsatisfied edge among the ones that have the least number of unsatisfied neighbors. [This is to simulate a worst-case resample strategy.]
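All of these methods plug into the same Resample loop. As a deliberately unoptimized sketch of ours, here is the Random selection method on the n × n torus (a full rescan of the edges each step; a real implementation would maintain the violated set incrementally, as described in Section 5):

```python
import random

def torus_resample(n, k, max_steps=1_000_000, rng=random.Random(0)):
    """Resample for k-coloring the n x n torus. A constraint is an edge;
    it is violated when its endpoints share a color. Resampling an edge
    recolors both endpoints uniformly (the 'Random' selection method).
    Returns the number of resample steps, or None on timeout."""
    color = {(i, j): rng.randrange(k) for i in range(n) for j in range(n)}
    edges = [((i, j), ((i + 1) % n, j)) for i in range(n) for j in range(n)] + \
            [((i, j), (i, (j + 1) % n)) for i in range(n) for j in range(n)]
    for step in range(max_steps):
        bad = [e for e in edges if color[e[0]] == color[e[1]]]
        if not bad:
            return step
        u, v = rng.choice(bad)                  # random violated edge
        color[u] = rng.randrange(k)             # resample both endpoints
        color[v] = rng.randrange(k)
    return None
```

Consistent with the table above, with k = 4 this converges quickly on small tori, while k = 3 is hopeless already for modest n.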
The results of some of our experiments are summarized in Figure 5.

The support conjecture. The coin chain model was not the first for which we have tried to plot a phase transition in terms of the skew. In the familiar grid coloring problem let k = 4 be fixed, and set the resample probabilities of colors 1 through 4 to (p, (1−p)/3, (1−p)/3, (1−p)/3), with 0 < p < 1. The instances Φn,p were the n × n tori (nearly identical to the grid, but with both dimensions wrapped around to enhance symmetry). Figure 12 shows the number of resample steps for growing n in terms of p. We do not see any phase transition for any of the resample methods! This was our first attempt to see a phase transition in terms of a continuous parameter, a disappointment, but its lack of success has led us to formulate a conjecture:
Conjecture 2.1. (Support Conjecture) Whether Resample runs polynomially or exponentially on a growing-size family of the same instances is unaffected by how we fix the distributions µi, as long as each µi has full support on all values xi may take.
The conjecture seemed to be in line with DimitrisAchlioptas’s intuition that Resample’s success is hard-
Figure 6: Each curve in each diagram corresponds to some probability p (legend values 1/1000 through 1/10000); the panels show the methods Random Fixed Order, Random, Recursive, Least Bad Neighbors, Cyclical and Fixed Order. The grid is 4-colored with distribution (p, (1−p)/3, (1−p)/3, (1−p)/3). Each point is an average of 100 tests divided by n². Even p = 10−5 is linear for all methods (maybe except Fixed Order).
wired in the instance and independent of the resample probabilities (which are add-ons). Although the Coin Chain example refutes the Support Conjecture in general, the conjecture seems to hold for the grid-coloring problem, and probably for several other instances too, raising the question:
What combinatorial properties of Φn,α determine ifthe Support Conjecture holds for that family?
3 The Coin Chain
We have thoroughly investigated the coin chain problem discussed in the previous section, and the first thing we have noticed is that the meta-stable distribution is unique and attracts the state: no matter what initial assignment we start from, we end up in that distribution. This is demonstrated in Figure 7, starting from all-Heads and from 1% Heads. (When we start from an i.i.d. assignment with 0.634n expected number of Tails, the number of Tails first goes up and then settles again around 0.634n, but this time with the meta-stable distribution.) The Tails-density of the state, once it reaches the meta-stable distribution, oscillates around its most likely value. In Figure 9 a.) we have plotted the relative density of the Tails during this oscillation for various parameter values. In Figure 9 b.) we did the same with random 3SAT, which gives a remarkably similar picture, indicating that understanding Resample on the Coin Chain is a good first step.
Figure 7: The Resample process for the Coin Chain “mixes” to the meta-stable distribution (number of Heads vs. number of steps, starting from 100% Heads and from 1% Heads).
4 The Math of the Meta-stable Equilibrium
In this section we explain the meta-stable equilibrium phenomenon with rigor on the following little example:
Figure 8: A constraint in the coin chain problem: the middle coin decides if the constraint is violated (the constraint fails if it is a Head). Upon ResampleConstraint all three coins must be resampled.
Figure 9: Oscillation around the stable number of Tails. Left: Coin Chain (most common number of Tails for probabilities 0.63 through 0.9). Right: random 3SAT (ratio of unsatisfied clauses to all clauses for densities 2.4 through 2.8).
(The figure shows the instance Φ4,p: variables x1, . . . , x4 and the twelve constraints Ci,j.)
Instance Φn,p: n Boolean variables x1, . . . , xn; n(n − 1) constraints Ci,j(xi, xj) (1 ≤ i ≠ j ≤ n). The constraints are defined by

  Ci,j(xi, xj) = 0 if xi = 0,   Ci,j(xi, xj) = 1 if xi = 1.

Notice that Ci,j depends only on xi. Furthermore, the assignment 11 . . . 1 satisfies all constraints of the instance, and it is the only satisfying assignment. The instance has symmetry Sn, which greatly simplifies the analysis. The parameter p of the instance is the probability with which a resample sets a variable to zero:

  xi = 0 with probability p,
  xi = 1 with probability 1 − p.
The Resample algorithm is a Markov chain on all of the 2^n assignments, but because of the Sn symmetry the transition probabilities depend only on the number of zeros in the assignment. Thus the Markov chain projects to a smaller Markov chain X[t] on n + 1 states. If the current assignment has k zeroes — we simply call this state k — it transits to one of the states k − 2, k − 1, k, k + 1 with the following probabilities:
  to k − 2 with probability ((k − 1)/(n − 1)) · (1 − p)²
  to k − 1 with probability ((n(1 − p) + k(3p − 1) − 2p)/(n − 1)) · (1 − p)
  to k     with probability (((k − 1)p + 2(n − k)(1 − p))/(n − 1)) · p
  to k + 1 with probability ((n − k)/(n − 1)) · p²
These four numbers are p−2(k), p−1(k), p0(k) and p1(k), respectively. Let ∆(k) denote the expected change from k = X[t] to X[t + 1]. Then

  ∆(k) = E(X[t + 1] − X[t] | X[t] = k)
       = p1(k) − p−1(k) − 2·p−2(k)
       = ((2p − 1)n − k + 2(1 − p)) / (n − 1).
Figure 10: The Markov chain X[t] on states 0, 1, . . . , n, with transition probabilities p−2(k), p−1(k), p0(k) and p+1(k).
Figure 11: The arrow lengths indicate the expected move from different states under X[t]. The walk will likely stay around a middle point, where the expected move is zero, for a very long time. Below the middle point there is an upward drift, above the middle point there is a downward drift. The location of the neutral point depends on p. This schematic picture corresponds to p = 0.75.
This function is linear in k, and

  ∆(k) > 0 if k < (2p − 1)n + 2(1 − p),
  ∆(k) < 0 if k > (2p − 1)n + 2(1 − p).

If the current state is k and ∆(k) > 0, the number of zeroes is likely to increase, and if ∆(k) < 0, the number of zeroes is likely to decrease. This means that the system will oscillate around k = (2p − 1)n + 2(1 − p) ≈ (2p − 1)n. When p < 0.5 and n is large, the system settles in linear time at 11 . . . 1. When p = 0.5 and n is large, the system still reaches 11 . . . 1 in expected O(n²) time. Finally, when p > 0.5 and n is large, the system will have around (2p − 1)n zeros for a very long time.
5 Selection Methods
In this section we briefly describe the different types of methods we have compared for selecting a violated clause in the GetViolatedConstraint routine.
5.1 Recursive Resampling The recursive method of choosing violated clauses for grids/tori is as follows. First a random monochromatic edge is chosen and resampled; then all adjacent violated edges (including the edge just resampled, if it is still bad) are resampled. Furthermore, the neighbors of those neighbors are resampled, and so on, until all edges in the "neighborhood" have a good coloring. If no violated edge remains in the graph when the recursion returns to the top-level caller, then the process is finished. If not, then a new random edge is selected to be recursively resampled, and the process is repeated.

Pseudocode.
Algorithm 2 Recursive Resample Algorithm
1: procedure RecursiveResampling(G) . G a graph
2:   while exists a violated edge in G do
3:     e ← get random violated edge
4:     RecursiveRecolor(G, e)
5:   end while
6: end procedure
7: procedure RecursiveRecolor(G, e)
8:   Recolor(G, e)
9:   while exists a violated edge in neighborhood(e) do
10:     for every violated edge f in neighborhood(e) do
11:       RecursiveRecolor(G, f)
12:     end for
13:   end while
14: end procedure
Implementation. The biggest challenge here is to select a random edge in constant time. Each edge has pointers to its vertices in the graph. Violated edges are stored in an array A. If an edge is violated, it stores its index in A for quick access. To select a random edge, the method randomly selects one of the violated edges from A. That edge is then resampled, and a sub-routine v2e is called which returns all edges adjacent to a given vertex. Those edges are then checked for violation. If an edge is violated and not in A, then it is added to A. If an edge is no longer violated and it is in A, then it is removed from A. When edges are added to A they are appended to the end, and when they are removed they are replaced with the last edge of the array (or null if it is the last edge in the array). For each resample, checking edges and managing A
takes constant time.
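The array A described above is the standard swap-with-last trick; a sketch of ours in Python, as a set with O(1) add, remove and uniform random choice (here a separate index map stands in for the index that the paper stores on the edge object itself):

```python
import random

class ViolatedSet:
    """Set with O(1) add, remove and uniform random choice, implemented
    as an array plus a per-item index map."""
    def __init__(self):
        self.items, self.index = [], {}

    def add(self, e):
        if e not in self.index:
            self.index[e] = len(self.items)
            self.items.append(e)

    def remove(self, e):
        i = self.index.pop(e, None)
        if i is not None:
            last = self.items.pop()      # swap with last, then shrink
            if i < len(self.items):
                self.items[i] = last
                self.index[last] = i

    def random_choice(self, rng=random):
        return rng.choice(self.items)

s = ViolatedSet()
for e in [(0, 1), (1, 2), (2, 3)]:
    s.add(e)
s.remove((1, 2))
print(sorted(s.items))   # [(0, 1), (2, 3)]
```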
5.2 Random Resampling In this method, to get the next edge to be resampled, we choose a bad edge uniformly at random. The process repeats until all edges are nicely colored.

Pseudocode.
Algorithm 3 Dynamic Random Resample Algorithm
1: procedure DynamicRandomResampling(G) . G a graph
2:   while exists a violated edge in G do
3:     e ← get random violated edge
4:     Recolor(G, e)
5:   end while
6: end procedure
Implementation. Similar to Recursive Resampling, we use an array A to store all bad edges. The method randomly chooses an edge from A, resamples it, and does the same management of A as in Recursive Resampling.
5.3 Fixed Order Resampling Fixed Order first orders the edges. Starting from the top left at (0,0), we traverse the nodes from left to right and from top to bottom, enumerating the edge to the right of the node and then the edge below the node. With this ordering, we resample the lowest-numbered edge with a bad coloring.

Pseudocode.
Algorithm 4 Fixed Order Resample Algorithm
1: procedure FixedOrderResampling(G, π) . G a graph, π edge ordering
2:   while exists a violated edge in G do
3:     e ← get first violated edge of G in the π ordering
4:     Recolor(G, e)
5:   end while
6: end procedure
Implementation. This method requires a heap to keep track of the next edge to be resampled. All violated edges are put into a heap with the least-numbered edge on top. This heap is stored as an array A. Similar to the previous method implementations, each edge stores its position in A. To choose the next edge, the method takes the top edge from the heap and resamples it. Next it calls v2e to get all edges affected by the resampling, and adds or removes said edges from A if they are violated and not in A, or satisfied and in A, respectively. Adding and removing items from a heap takes O(log n).
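A heap-based selection of the least-numbered violated edge can be sketched with Python's heapq; here we use lazy deletion instead of the in-place position tracking described above (an implementation choice of ours):

```python
import heapq

class FixedOrderSelector:
    """Select the least-numbered violated edge. Instead of storing each
    edge's heap position, we push eagerly and discard stale entries on pop."""
    def __init__(self):
        self.heap = []
        self.violated = set()

    def mark_violated(self, rank, edge):
        if edge not in self.violated:
            self.violated.add(edge)
            heapq.heappush(self.heap, (rank, edge))

    def mark_satisfied(self, edge):
        self.violated.discard(edge)   # lazy: entry stays in heap until popped

    def pop_first_violated(self):
        while self.heap:
            rank, edge = heapq.heappop(self.heap)
            if edge in self.violated:
                self.violated.discard(edge)
                return edge
        return None                   # no violated edge remains

sel = FixedOrderSelector()
sel.mark_violated(7, "e7"); sel.mark_violated(2, "e2"); sel.mark_violated(5, "e5")
sel.mark_satisfied("e2")          # e2 got fixed as a side effect
print(sel.pop_first_violated())   # e5
```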
Studying a visual run of the Fixed Order resample, we observed a moving "frontier": at a certain row, all rows below it have a good coloring except for a few edges. The few form a path of bad edges from the "frontier" reaching as far down as the first row of the graph.
5.4 Fixed Random Resampling Similarly to Fixed Order, Fixed Random enumerates the edges, but in a random ordering. When selecting the next edge to be resampled, this method chooses the badly colored edge with the least number.

Pseudocode.
Algorithm 5 Fixed Random Resample Algorithm
1: procedure FixedRandomResampling(G, π)    ▷ G a graph, π edge ordering
2: while exists a violated edge in G do
3:     e ← get first violated edge of G in π ordering
4:     Recolor(G, e)
5: end while
6: end procedure
Implementation. This method first stores all edges in an array B. Next it repeatedly chooses a random edge from B, numbers it (every edge stores its rank in the ordering), and removes it from B. This process takes O(n). After the enumeration, all violated edges are stored in a heap implemented by an array A. As in Fixed Order Resampling, an edge is selected by taking the top of the heap; after that edge is resampled, the neighboring edges are added to or removed from A exactly as in Fixed Order. Choosing an edge to resample takes O(log n) time.
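The enumeration step can be sketched as a single shuffle, which is equivalent to repeatedly drawing and removing a random edge from B. This is a sketch; the function name is ours.

```python
import random

def random_edge_ranking(edges, seed=None):
    """Assign each edge a fixed random rank: shuffle once up front,
    then resampling always picks the violated edge of least rank."""
    rng = random.Random(seed)
    order = list(edges)
    rng.shuffle(order)
    return {e: i for i, e in enumerate(order)}
```

The resulting rank map can then drive the same violated-edge heap as in Fixed Order Resampling.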
5.5 Cyclical Resampling As in Fixed Order resampling, there must first be an ordering of the edges; we use the same ordering as for Fixed Order. The method then iterates over the list of edges, resampling each monochromatic edge it finds. If the graph is not properly colored by the time it reaches the end of the list, it repeats, iterating over the list until the graph is colored.
Pseudocode.
167 Copyright © by SIAM
Unauthorized reproduction of this article is prohibited
Algorithm 6 Cyclical Resample Algorithm
1: procedure CyclicalResampling(G, π)    ▷ G a graph, π edge ordering
2: while exists a violated edge in G do
3:     a ← get array of violated edges of G in π ordering
4:     for every violated edge e in a do
5:         Recolor(G, e)
6:         Update(a, e, G)
7:     end for
8: end while
9: end procedure
Implementation. To implement this method efficiently we use an array that stores all bad edges in order. When an edge is resampled, we insert or remove affected edges in the array, but only if they have not yet been passed in the current cycle.
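The cyclic sweep can be sketched as follows. This is a simplified sketch that re-checks every edge on each pass rather than maintaining the in-order bad-edge array; `is_violated` and `resample` are hypothetical callbacks. Like Resample itself, the loop terminates only if the process succeeds.

```python
def cyclical_passes(edges_in_order, is_violated, resample):
    """Sweep the fixed edge order repeatedly, resampling every violated
    edge encountered, until a full pass finds no violation.
    Returns the number of resample steps performed."""
    steps = 0
    done = False
    while not done:
        done = True
        for e in edges_in_order:
            if is_violated(e):
                resample(e)
                steps += 1
                done = False   # a resample happened; another pass needed
    return steps
```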
5.6 Least Bad Neighbors (LBN) We observed that when resampling an edge with many neighboring bad edges, the resampling is more likely to reduce the total number of bad edges. Conversely, resampling an edge with few neighboring bad edges is more likely to increase the total number of bad edges. The Least Bad Neighbors variation therefore chooses as the next edge to be resampled the edge with the least number of badly colored neighbors. If there are multiple candidates, the algorithm randomly selects one of them.
Pseudocode.
Algorithm 7 LBN Resample Algorithm
1: procedure LBNResampling(G)    ▷ G a graph
2: while exists a violated edge in G do
3:     e ← get edge with least number of bad neighbors
4:     Recolor(G, e)
5: end while
6: end procedure
Implementation. A min-heap keeps track of the edges, with an edge with the fewest bad neighbors on top. This heap implementation is very similar to the fixed order and fixed random ones, but the ordering is based on the number of neighboring bad edges.
6 Test Parameters for the Grid and Torus
There are certain factors that affect the performance ofthe Resample Algorithm. In particular we studied the
effects of changing the number of colors, the resample probabilities in the case of 4 and 6 colors, the method variation, and the size of the graph. We chose to study the behavior of the Resample Algorithm most closely on the coloring problem of the Grid/Torus graph type because the problem is very easy to picture.
Probability Distributions. For 6 colors, we used a probability distribution of (p, p, p, (1−3p)/3, (1−3p)/3, (1−3p)/3), where p is the probability of the corresponding color appearing. For 4 colors, we used a probability distribution of (p, (1−p)/3, (1−p)/3, (1−p)/3). We experimented with what happens as p gets close to zero. In both cases, the closer p is to zero, the more the coloring resembles one with 3 colors, yet the number of resample steps for any fixed p ≠ 0 was linear in n² (the number of constraints).
Graph Size. We ran tests on square Grids/Tori whose sides were of the following lengths: 100, 200, ..., 1000.
Grid vs. Torus. In testing Grids and Tori of large size it became apparent that there is very little difference between the two.
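The two distribution families, and sampling from them, can be sketched as follows (function names are ours):

```python
import random

def color_distribution(k, p):
    """Build the skewed distributions used in the experiments:
    for k=6 the first three colors get probability p and the rest
    (1-3p)/3 each; for k=4 the first color gets p and the rest
    (1-p)/3 each."""
    if k == 6:
        return [p, p, p] + [(1 - 3 * p) / 3] * 3
    if k == 4:
        return [p] + [(1 - p) / 3] * 3
    raise ValueError("only k in {4, 6} used in these experiments")

def sample_color(dist, rng=random):
    """Draw one color index according to `dist`."""
    return rng.choices(range(len(dist)), weights=dist, k=1)[0]
```

Resampling an edge then simply redraws both endpoint colors independently from the chosen distribution.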
7 The robustness of the support conjecture
Figure 12 and Figure 13 relate to similar experiments. Figure 13 uses 6 colors, with half the colors having probability p of appearing and the other half having probability (1−3p)/3. Figure 12 uses 4 colors, with one having probability p and the other three (1−p)/3. As p gets closer to 0, both distributions get closer to being supported on 3 colors. However, even at p = 1/10000 there is still linear behavior with respect to size. As a function of p, the number of resamples grows hyperbolically for the same n. These experiments gave rise to our support conjecture for the Grids/Tori coloring problem.
[Figure 12 plot: average resamples vs. probability p, comparing the Random, Fixed Order, and LBN policies.]
Figure 12: Comparison of best and worst resampling policy. Experiment with fixed graph size = 1000 and changing probability p, with 6 colors and distribution (p, p, p, (1−3p)/3, (1−3p)/3, (1−3p)/3). Each point is an average of 100 tests.
[Figure 13 plots: resamples/size² vs. size (200–1000) for the Random, Recursive, Least Bad Neighbors, Cyclical, and Fixed Order policies, each with curves for p = 1/1000 through 1/10000.]
Figure 13: Experiment with changing probability p and graph size n, with 6 colors and distribution (p, p, p, (1−3p)/3, (1−3p)/3, (1−3p)/3). Each point is an average of 100 tests divided by the number of vertices of the torus (n²).
8 Conclusions and Open Problems
On the theoretical, but also on the experimental side, most questions remain open. We believe the top priority among them is to develop theoretical methods to bound α_resample in cases when it is significantly beyond the magic α_lovasz. Simple ad hoc methods sometimes work: e.g. up to p = 1/3 for the coin chain and for ≥ 7 colors for the grid, but we do not have sweeping new general methods, different from the LLL, that address α_resample. We have no methods at all for showing that Resample does not work beyond some threshold (other than in the simplest cases, such as in Section 4).
The Full-support conjecture says that when we apply Resample to color the grid, aside from a constant factor it does not matter how we set the probabilities, as long as each µi has full support. This might be a hint for the existence of a more combinatorial alternative to Resample [in some cases] that avoids the µi altogether. The new framework of Achlioptas-Iliopoulos [1] is combinatorial, though it does not solve the support conjecture for grids.
Acknowledgements. We thank Dimitris Achlioptas for interesting discussions, and Alistair Sinclair, who pointed us to the physics term “meta-stable equilibrium.”
References
[1] D. Achlioptas and F. Iliopoulos, Random walks that find perfect objects and the Lovasz Local Lemma, in 55th IEEE Annual Symposium on Foundations of Computer Science, FOCS, 2014, pp. 494–503.
[2] N. Alon, A parallel algorithmic version of the LocalLemma, in FOCS, 1991, pp. 586–593.
[3] J. Beck, An algorithmic approach to the Lovasz Local Lemma I, Random Struct. Algorithms, 2 (1991), pp. 343–366.
[4] A. Czumaj and C. Scheideler, Coloring non-uniform hypergraphs: a new algorithmic approach tothe general Lovasz Local Lemma, in SODA, 2000,pp. 30–39.
[5] P. Erdos and L. Lovasz, Problems and results on3-chromatic hypergraphs and some related questions,In A. Hajnal, R. Rado and V.T. Sos, editors, Infiniteand Finite Sets (to Paul Erdos on his 60th birthday),(1975), pp. 609–627.
[6] H. Gebauer, T. Szabo, and G. Tardos, The Local Lemma is tight for SAT, in SODA, 2011, pp. 664–674.
[7] K. B. R. Kolipaka and M. Szegedy, Moser andTardos meet Lovasz, in STOC, 2011, pp. 235–244.
[8] M. Molloy and B. A. Reed, Further algorithmicaspects of the Local Lemma, in STOC, 1998, pp. 524–529.
[9] R. A. Moser, A constructive proof of the Lovasz LocalLemma, in STOC, 2009, pp. 343–350.
[10] R. A. Moser and G. Tardos, A constructive proof ofthe general Lovasz Local Lemma, J. ACM, 57 (2010).
[11] A. D. Scott and A. D. Sokal, On dependencygraphs and the lattice gas, Combinatorics, Probability& Computing, 15 (2006), pp. 253–279.
[12] J. B. Shearer, On a problem of Spencer, Combina-torica, 5 (1985), pp. 241–245.
[13] A. Srinivasan, Improved algorithmic versions of theLovasz Local Lemma, in SODA, 2008, pp. 611–620.
APPENDIX
The LLL and the Resample algorithm
A general scenario that occurs again and again in computer science is the following: we have variables x1, . . . , xn, possibly each with a different domain. In our experiments the xi are either Boolean, i.e. take values from {0, 1}, or take values from a color set [k] = {1, . . . , k}. We also have constraints that express relations between (typically small) subsets of the variables. In our 3SAT experiment, for instance, the variables are Boolean and the constraints are of the form x1^ε1 ∨ x2^ε2 ∨ x3^ε3, where ε1, ε2, ε3 indicate possible negations. In our k-coloring experiments the variables range over the color set [k], and constraint C(x, y) expresses that x and y have different colors. This is the familiar graph coloring problem.
Definition .1. (vbl(C)) The variable set of a constraint C(xi1, . . . , xiℓ) is {xi1, . . . , xiℓ}, and it is denoted by vbl(C).
Dependency graph. To quantify the sparsity of a general constraint system C1, . . . , Cm we define the dependency graph G on the set of its constraints by

V(G) = {1, . . . , m},
E(G) = {(i, j) | vbl(Ci) ∩ vbl(Cj) ≠ ∅}.
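The construction of E(G) can be sketched directly from this definition (a small helper of our own, with constraints represented by their variable sets):

```python
from itertools import combinations

def dependency_graph(vbl):
    """Given vbl[i] = set of variables of constraint C_i,
    return the edge set of the dependency graph G: two constraints
    are adjacent whenever their variable sets intersect."""
    edges = set()
    for i, j in combinations(sorted(vbl), 2):
        if vbl[i] & vbl[j]:          # shared variable => dependency edge
            edges.add((i, j))
    return edges
```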
Bad event probabilities. For the Lovasz Local Lemma (and for the Resample algorithm) it is necessary that the domain of each variable is assigned a probability distribution: xi ∼ µi. Then we can talk about the probability that a constraint does NOT hold, assuming its variables are independently drawn. Let pi be this probability for constraint Ci. The instance we want to satisfy is C1 ∧ C2 ∧ . . . ∧ Cm, i.e. we want each constraint Ci to hold.
The Lovasz Local Lemma is expressed in terms ofthe constraint graph G and the bad event probabilityvector p = (p1, . . . , pm). A simple version states that if
(∆(G) + 1)× |p|∞ < 1/e
then all the bad events can be avoided with positive probability, where ∆(G) is the maximum degree of G and |p|∞ = max_i pi.
If we want to express the sparsity condition solely in terms of ∆(G) and |p|∞, the simple version is nearly sharp. The lemma is local in the sense that its condition can be checked for each constraint separately, just by looking at its probability and how many neighbors it has. A local but more complicated condition is:
Standard LLL condition, STD(G, p): For some 0 ≤ z1, . . . , zm < 1:

pi < zi ∏_{j:(i,j)∈E(G)} (1 − zj)    (.1)

From STD(G, p) one can recover the simple condition by setting zi = 1/∆ and assuming pi = 1/(e∆ + e).
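Both the simple condition and STD(G, p) are easy to check numerically. The sketch below is our own; constraints are indexed by dictionary keys as a convention.

```python
import math

def simple_lll_holds(max_degree, p_max):
    """Simple local condition: (Delta(G) + 1) * |p|_inf < 1/e."""
    return (max_degree + 1) * p_max < 1 / math.e

def std_lll_holds(p, z, neighbors):
    """Standard condition STD(G, p): for every constraint i,
    p_i < z_i * prod over dependency-graph neighbors j of (1 - z_j)."""
    for i in p:
        prod = 1.0
        for j in neighbors[i]:
            prod *= 1 - z[j]
        if not p[i] < z[i] * prod:
            return False
    return True
```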
Theorem .1. (Lovasz Local Lemma) Let C1, . . . , Cm be a system of constraints on variables x1, . . . , xn with dependency graph G such that when the xi ∼ µi independently, the probability that Ci does not hold is pi. Then STD(G, p), where p = (p1, . . . , pm), implies that the system is satisfiable.
The Standard LLL can beat the Simple LLL by a largefactor when G has very different vertex degrees. Weremark that LLL is often stated in a more generalabstract setting, with which we are not concerned here.
Resample: R. Moser, and later Moser and G. Tardos, discovered that when STD(G, p) holds, one can successfully run the very simple Algorithm 1 to actually find a solution for C1 ∧ . . . ∧ Cm. In Moser's original paper the violated clause had to be chosen in a certain way, but Moser and Tardos proved that even when an adversary chooses the clauses to be resampled, Resample works.
Theorem .2. (Moser-Tardos) Assume that the STD(G, p) condition holds with z1, . . . , zm in Equation (.1). Then Resample finds a solution for C1 ∧ . . . ∧ Cm in expected running time at most

∑_{i=1}^{m} zi/(1 − zi)    (.2)

regardless of how the GetViolatedConstraint procedure picks a violated clause.
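The bound (.2) is straightforward to evaluate for a given choice of the zi (helper name is ours):

```python
def moser_tardos_bound(z):
    """Moser-Tardos bound on the expected number of resample steps:
    sum over constraints i of z_i / (1 - z_i)."""
    return sum(zi / (1 - zi) for zi in z)
```

For example, with all zi = 1/∆ the per-constraint term is 1/(∆ − 1), so the total bound is linear in the number of constraints.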
[Figure: an example instance on variables x1, . . . , x9 with constraints C1, . . . , C12 (left), and its dependency graph G (right).]