Source: web.stanford.edu/~montanar/other/talks/stsp07.pdf
What do we know about mean field?
Andrea Montanari
Electrical Engineering and Statistics, Stanford University
November 12, 2007
Andrea Montanari What do we know about mean field?
Outline
1 An old example
2 Why do you care?
3 Criteria
4 Corrections
5 Puzzling mean field phenomena
6 Conclusion
An old example
Clausius-Mossotti-Lorentz-Lorenz formula
[Figure: charges (+, −) and molecular dipoles (+−) in an applied field $\vec{E}$; each molecule feels an effective field $\vec{E}_{\rm eff}$]
$\vec{p}$ = molecule polarization
$\vec{E}_{\rm eff} = \vec{E} + \frac{4\pi}{3} N \langle\vec{p}\rangle$
[Figure: a single molecule (+−) in the field $\vec{E}$, then in the field $\vec{E}_{\rm eff}$]
In the applied field: $\langle\vec{p}\rangle = \alpha\vec{E} + O(E^2)$
In the effective field: $\langle\vec{p}\rangle = \alpha\vec{E}_{\rm eff} + O(E^2)$
$\langle\vec{p}\rangle = \alpha\vec{E}_{\rm eff}$
$\vec{E}_{\rm eff} = \vec{E} + \frac{4\pi}{3} N\alpha\,\vec{E}_{\rm eff}$
$\vec{E}_{\rm eff} = \vec{E} / \left(1 - \frac{4\pi}{3} N\alpha\right)$
... interesting but fishy ...
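The self-consistency step in the Clausius-Mossotti argument can be checked numerically. This is a minimal sketch, assuming scalar fields and purely illustrative values of $N$ and $\alpha$ (none of these numbers come from the talk): it iterates $E_{\rm eff} \leftarrow E + \frac{4\pi}{3}N\alpha E_{\rm eff}$ and compares the result with the closed form.

```python
import math

def effective_field(E, N, alpha, iters=200):
    """Iterate E_eff <- E + (4*pi/3) * N * alpha * E_eff, starting from E_eff = E."""
    g = 4.0 * math.pi / 3.0 * N * alpha  # dimensionless feedback strength
    E_eff = E
    for _ in range(iters):
        E_eff = E + g * E_eff
    return E_eff

def closed_form(E, N, alpha):
    """Closed-form solution E / (1 - (4*pi/3) * N * alpha)."""
    return E / (1.0 - 4.0 * math.pi / 3.0 * N * alpha)

# Hypothetical values, chosen so that (4*pi/3) * N * alpha = 0.5.
E, N, alpha = 1.0, 1.0, 0.5 * 3.0 / (4.0 * math.pi)
print(effective_field(E, N, alpha), closed_form(E, N, alpha))
```

The iteration converges only when $\frac{4\pi}{3}N\alpha < 1$, and the closed form diverges as that product approaches 1.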
Why do you care?
Beyond physics
Artificial intelligence.
Computer vision.
Communications.
Coding theory.
Optimization.
Counting.
We cannot rely on the intuition of (good) physicists
Graphical models
[Figure: a graph with vertices $x_1, \dots, x_{12}$]
$\mu(x) = \frac{1}{Z}\prod_{(ij)\in E}\psi_{ij}(x_i, x_j)$, $\quad x = (x_1, \dots, x_n)$.
Generic computational tasks
Optimization
$x^* = \arg\max_x \prod_{(ij)\in E}\psi_{ij}(x_i, x_j)$.
Partition function
$Z = \sum_x \prod_{(ij)\in E}\psi_{ij}(x_i, x_j)$.
Marginals
$\mu(x_i) = \sum_{x_{\sim i}} \mu(x)$.
Sampling.
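For a very small graph, all four tasks can be carried out by exhaustive enumeration. The sketch below assumes a hypothetical toy model (four binary variables on a cycle, with an Ising-like potential whose form and strength are illustrative choices, not taken from the talk). The cost grows as $2^n$, which is why approximations such as Bethe-Peierls are of interest.

```python
import itertools
import math
import random

# Hypothetical toy model: 4 binary variables on a cycle.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]
N_VARS = 4
BETA = 0.5  # assumed coupling strength

def psi(xi, xj):
    """Pairwise potential psi_ij(x_i, x_j); favors equal neighbors (assumed form)."""
    return math.exp(BETA if xi == xj else -BETA)

def weight(x):
    """Unnormalized weight: product of psi_ij over the edges."""
    w = 1.0
    for i, j in EDGES:
        w *= psi(x[i], x[j])
    return w

configs = list(itertools.product((0, 1), repeat=N_VARS))

# Partition function: Z = sum over x of the product of potentials.
Z = sum(weight(x) for x in configs)

# Optimization: a configuration of maximum weight (ties broken by enumeration order).
x_star = max(configs, key=weight)

def marginal(i):
    """Marginal mu(x_i): sum mu(x) over all the other variables."""
    m = [0.0, 0.0]
    for x in configs:
        m[x[i]] += weight(x) / Z
    return m

# Sampling: draw configurations with probability mu(x) = weight(x) / Z.
rng = random.Random(0)
samples = rng.choices(configs, weights=[weight(x) for x in configs], k=5)

print(Z, x_star, marginal(0), samples)
```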
To be concrete
Coding over binary memoryless symmetric channels.
m → ENCODER → x → CHANNEL → y → DECODER → m̂
Noisy channel
[Figure: binary symmetric channel; each input bit (0 or 1) is received unchanged with probability $1-p$ and flipped with probability $p$]
[Figure: ENCODER, with the bit strings $\underbrace{1000010010111010}_{N}$ and $\underbrace{110111011}_{N/2}$]
encoder ⇔ constraints over message bits
LDPC codes [Gallager 1963, MacKay 1995]
$x_1\oplus x_2\oplus x_3\oplus x_4 = 0 \;\cdots\; x_5\oplus x_6\oplus x_8 = 0$
[Figure: parity-check nodes connected to the variable nodes $x_1, \dots, x_8$]
constraints over message bits ⇔ graphical representation
$x_1\oplus x_2\oplus x_3\oplus x_4 = 0 \;\cdots\; x_5\oplus x_6\oplus x_8 = 0$
[Figure: the same graph, with each variable node $x_i$ attached to its channel output $y_i$, $i = 1, \dots, 8$]
$\mu(x|y) = \frac{1}{Z(y)}\,\mathbb{I}(x_1\oplus x_2\oplus x_3\oplus x_4 = 0)\cdots\mathbb{I}(x_5\oplus x_6\oplus x_8 = 0)\cdot Q(y_1|x_1)\cdots Q(y_8|x_8)$
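The posterior $\mu(x|y)$ can be evaluated directly when the block length is tiny. The sketch below is an illustration under stated assumptions: it uses only the two parity checks that are written out on the slide (the elided ones are not guessed), a binary symmetric channel with an assumed flip probability `P_FLIP = 0.1`, and a hypothetical received word `y`.

```python
import itertools

P_FLIP = 0.1  # assumed flip probability p of the binary symmetric channel
# Only the two checks written on the slide, with zero-based indices for x1..x8.
CHECKS = [(0, 1, 2, 3), (4, 5, 7)]
N_BITS = 8

def Q(y, x, p=P_FLIP):
    """Channel transition probability Q(y|x) for a binary symmetric channel."""
    return 1.0 - p if y == x else p

def satisfies(x):
    """True if x satisfies every parity check (the indicator functions)."""
    return all(sum(x[i] for i in check) % 2 == 0 for check in CHECKS)

def posterior(y):
    """Return a dict x -> mu(x|y) over the configurations satisfying the checks."""
    weights = {}
    for x in itertools.product((0, 1), repeat=N_BITS):
        if satisfies(x):
            w = 1.0
            for xi, yi in zip(x, y):
                w *= Q(yi, xi)
            weights[x] = w
    Z = sum(weights.values())  # this is Z(y)
    return {x: w / Z for x, w in weights.items()}

def bit_marginal(post, i):
    """Posterior probability that bit i equals 1."""
    return sum(prob for x, prob in post.items() if x[i] == 1)

y = (1, 0, 0, 0, 0, 0, 0, 0)  # hypothetical received word
post = posterior(y)
print([round(bit_marginal(post, i), 4) for i in range(N_BITS)])
```

With only these two checks, $x_7$ is unconstrained, so its posterior depends on $y_7$ alone.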
Bethe-Peierls approximation
[Figure: variable node $i$ and an adjacent check node $a$]
$\nu_{i\to a}(x_i) = \mu_i(x_i \mid y,\ a \text{ is taken off})$
Bethe-Peierls equations: 1
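[Figure: variable node $i$ with adjacent check nodes $a$, $b$, $c$; messages $\hat\nu_{b\to i}$ and $\hat\nu_{c\to i}$ arrive at $i$, and $\nu_{i\to a}$ leaves toward $a$]
$\nu_{i\to a}(x_i) \propto Q(y_i|x_i)\,\hat\nu_{b\to i}(x_i)\,\hat\nu_{c\to i}(x_i)$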
Bethe-Peierls equations: 2
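[Figure: check node $a$ with adjacent variable nodes $i$, $j$, $l$; messages $\nu_{j\to a}$ and $\nu_{l\to a}$ arrive at $a$, and $\hat\nu_{a\to i}$ leaves toward $i$]
$\hat\nu_{a\to i}(x_i) \propto \sum_{x_j\oplus x_l = x_i}\nu_{j\to a}(x_j)\,\nu_{l\to a}(x_l)$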
Belief Propagation
$\nu^{(t+1)}_{i\to a}(x_i) \propto Q(y_i|x_i)\,\hat\nu^{(t)}_{b\to i}(x_i)\,\hat\nu^{(t)}_{c\to i}(x_i)$
$\hat\nu^{(t)}_{a\to i}(x_i) \propto \sum_{x_j\oplus x_l = x_i}\nu^{(t)}_{j\to a}(x_j)\,\nu^{(t)}_{l\to a}(x_l)$
[Gallager 1963, Pearl 1984]
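The two update rules can be turned into a short program. The sketch below is a hedged illustration, not the talk's implementation: it generalizes the updates to nodes of arbitrary degree, assumes a hypothetical five-bit code whose graph is a tree (checks $x_0\oplus x_1\oplus x_2 = 0$ and $x_2\oplus x_3\oplus x_4 = 0$), and uses a binary symmetric channel with an assumed flip probability. On a tree the resulting marginals should agree with exhaustive enumeration, which the code also computes for comparison.

```python
import itertools

P_FLIP = 0.1                      # assumed channel flip probability
CHECKS = [(0, 1, 2), (2, 3, 4)]   # hypothetical tree-shaped code
N_BITS = 5

def Q(y, x, p=P_FLIP):
    return 1.0 - p if y == x else p

def normalize(m):
    s = m[0] + m[1]
    return (m[0] / s, m[1] / s)

def belief_propagation(y, iters=10):
    """Run BP and return the estimated marginals [(P(x_i=0), P(x_i=1)), ...]."""
    edges = [(i, a) for a, check in enumerate(CHECKS) for i in check]
    nu = {(i, a): (0.5, 0.5) for i, a in edges}       # variable -> check
    nu_hat = {(a, i): (0.5, 0.5) for i, a in edges}   # check -> variable
    for _ in range(iters):
        # Check update: sum over the other variables with XOR equal to x_i.
        for i, a in edges:
            others = [j for j in CHECKS[a] if j != i]
            m = [0.0, 0.0]
            for xs in itertools.product((0, 1), repeat=len(others)):
                w = 1.0
                for j, xj in zip(others, xs):
                    w *= nu[(j, a)][xj]
                m[sum(xs) % 2] += w
            nu_hat[(a, i)] = normalize(m)
        # Variable update: channel evidence times messages from the other checks.
        for i, a in edges:
            m = [Q(y[i], 0), Q(y[i], 1)]
            for b, check in enumerate(CHECKS):
                if b != a and i in check:
                    m = [m[x] * nu_hat[(b, i)][x] for x in (0, 1)]
            nu[(i, a)] = normalize(m)
    # Marginal estimate: channel evidence times all incoming check messages.
    marginals = []
    for i in range(N_BITS):
        m = [Q(y[i], 0), Q(y[i], 1)]
        for a, check in enumerate(CHECKS):
            if i in check:
                m = [m[x] * nu_hat[(a, i)][x] for x in (0, 1)]
        marginals.append(normalize(m))
    return marginals

def brute_force_marginals(y):
    """Exact marginals of mu(x|y) by enumerating all codewords."""
    totals = [[0.0, 0.0] for _ in range(N_BITS)]
    for x in itertools.product((0, 1), repeat=N_BITS):
        if all(sum(x[i] for i in check) % 2 == 0 for check in CHECKS):
            w = 1.0
            for xi, yi in zip(x, y):
                w *= Q(yi, xi)
            for i, xi in enumerate(x):
                totals[i][xi] += w
    return [normalize(t) for t in totals]

y = (1, 0, 0, 0, 0)  # hypothetical received word
print(belief_propagation(y))
print(brute_force_marginals(y))
```

On graphs with loops the same iteration can still be run, but the agreement with the exact marginals is no longer guaranteed.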
Criteria
Intuition
Bethe approximation is reasonable if
1. G does not contain too many short loops.
2. ‘Far apart variables’ are weakly correlated.
Formalization
Definition
A ‘set of messages’ is a collection $\{\nu_{i\to j}(\,\cdot\,)\}$ indexed by directed edges in $G$, where $\nu_{i\to j}(\,\cdot\,)$ is a distribution over $\mathcal{X}$.
Given $F \subseteq G$, ${\rm diam}(F) \le 2\ell$,
$\nu_F(x_F) \equiv \frac{1}{W(\nu_F)}\prod_{(ij)\in F}\psi_{ij}(x_i, x_j)\prod_{i\in\partial F}\nu_{i\to j(i)}(x_i)$.
Bethe states
Definition
A probability distribution $\rho$ on $\mathcal{X}^V$ is an $(\varepsilon, r)$ Bethe state if there exists a set of messages $\{\nu_{i\to j}(\,\cdot\,)\}$ such that, for any $F \subseteq G$ with ${\rm diam}(F) \le 2r$,
$\|\rho_F - \nu_F\|_{TV} \le \varepsilon$.
Consistency Condition → Bethe Equations
Proposition
If $\rho$ is an $(\varepsilon, 2)$-Bethe state with respect to the message set $\{\nu_{i\to j}(\,\cdot\,)\}$, then, for any $i \to j$,
$\|\nu_{i\to j} - T\nu_{i\to j}\|_{TV} \le C\varepsilon$,
$T\nu_{i\to j}(x_i) \equiv \frac{1}{z_{i\to j}}\prod_{l\in\partial i\setminus j}\sum_{x_l}\psi_{il}(x_i, x_l)\,\nu_{l\to i}(x_l)$.
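The operator $T$ can be written out for a small pairwise model. The sketch below assumes a hypothetical star-shaped tree with arbitrary illustrative potentials and takes the total variation distance with the factor $1/2$ (the slides do not fix this convention). It iterates $T$ from uniform messages, measures the residual $\|\nu_{i\to j} - T\nu_{i\to j}\|_{TV}$, and compares the marginals built from the messages with exact enumeration, with which they should coincide on a tree.

```python
import itertools

# Hypothetical tree: center 1 with leaves 0, 2, 3. Tables are psi[(a, b)][x_a][x_b].
EDGES = {
    (0, 1): [[2.0, 1.0], [1.0, 3.0]],
    (1, 2): [[1.0, 2.0], [2.0, 1.0]],
    (1, 3): [[3.0, 1.0], [1.0, 1.0]],
}
N_VARS = 4
NEIGHBORS = {i: [] for i in range(N_VARS)}
for a, b in EDGES:
    NEIGHBORS[a].append(b)
    NEIGHBORS[b].append(a)

def psi(i, l, xi, xl):
    """Potential psi_il(x_i, x_l), looked up in either orientation."""
    if (i, l) in EDGES:
        return EDGES[(i, l)][xi][xl]
    return EDGES[(l, i)][xl][xi]

def incoming_product(nu, i, exclude=None):
    """Normalized product over l in the neighborhood of i (minus `exclude`)
    of sum_{x_l} psi_il(x_i, x_l) nu_{l->i}(x_l)."""
    m = [1.0, 1.0]
    for l in NEIGHBORS[i]:
        if l != exclude:
            for xi in (0, 1):
                m[xi] *= sum(psi(i, l, xi, xl) * nu[(l, i)][xl] for xl in (0, 1))
    z = m[0] + m[1]
    return (m[0] / z, m[1] / z)

def apply_T(nu):
    """One application of T to every message nu_{i->j}."""
    return {(i, j): incoming_product(nu, i, exclude=j) for (i, j) in nu}

def tv(p, q):
    """Total variation distance (with the 1/2 convention, an assumption)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def exact_marginal(i):
    m = [0.0, 0.0]
    for x in itertools.product((0, 1), repeat=N_VARS):
        w = 1.0
        for (a, b), table in EDGES.items():
            w *= table[x[a]][x[b]]
        m[x[i]] += w
    z = m[0] + m[1]
    return (m[0] / z, m[1] / z)

# Start from uniform messages on all directed edges and iterate T.
nu = {}
for a, b in EDGES:
    nu[(a, b)] = (0.5, 0.5)
    nu[(b, a)] = (0.5, 0.5)
for _ in range(5):
    nu = apply_T(nu)

T_nu = apply_T(nu)
residual = max(tv(nu[e], T_nu[e]) for e in nu)
print(residual)
print([incoming_product(nu, i) for i in range(N_VARS)])
print([exact_marginal(i) for i in range(N_VARS)])
```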
Correlation decay: Notation
$B(i, r)$: ball of radius $r$ and center $i$.
$x_{\sim i,r} = \{\, x_j : j \notin B(i, r) \,\}$.
Correlation decay: Definitions
Uniqueness:
$\sup_{x,x'}\sum_{x_i}\left|\mu(x_i|x_{\sim i,r}) - \mu(x_i|x'_{\sim i,r})\right| \to 0$
[cf. Tatikonda, Gamarnik, Bayati, ...]
Extremality:
$\sum_{x_i, x_{\sim i,r}}\left|\mu(x_i, x_{\sim i,r}) - \mu(x_i)\mu(x_{\sim i,r})\right| \to 0$, or, in terms of mutual information, $I(X_i; X_{\sim i,r}) \to 0$
[cf. Peres, Mossel]
Concentration:
$\sum_{x_{i(1)}\dots x_{i(k)}}\left|\mu(x_{i(1)}, \dots, x_{i(k)}) - \mu(x_{i(1)})\cdots\mu(x_{i(k)})\right| \to 0$
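The two forms of the extremality condition can be evaluated exactly on a small example. The sketch below assumes a hypothetical model, a free-boundary Ising chain of 8 spins at an illustrative inverse temperature, takes $i$ to be the first spin, and computes both the sum of absolute differences and the mutual information as functions of $r$.

```python
import itertools
import math

def chain_joint(n, beta):
    """Return a dict x -> mu(x) for a free-boundary Ising chain of n spins."""
    weights = {}
    for x in itertools.product((+1, -1), repeat=n):
        energy = -sum(x[k] * x[k + 1] for k in range(n - 1))
        weights[x] = math.exp(-beta * energy)
    Z = sum(weights.values())
    return {x: w / Z for x, w in weights.items()}

def decay_measures(mu, r):
    """For i = 0, with x_far = (x_j : j > r) the spins outside B(0, r), return
    (sum |mu(x_i, x_far) - mu(x_i) mu(x_far)|, mutual information I(X_i; X_far))."""
    joint, p_i, p_far = {}, {}, {}
    for x, p in mu.items():
        xi, far = x[0], x[r + 1:]
        joint[(xi, far)] = joint.get((xi, far), 0.0) + p
        p_i[xi] = p_i.get(xi, 0.0) + p
        p_far[far] = p_far.get(far, 0.0) + p
    l1 = sum(abs(p - p_i[a] * p_far[b]) for (a, b), p in joint.items())
    mi = sum(p * math.log(p / (p_i[a] * p_far[b])) for (a, b), p in joint.items())
    return l1, mi

N_SPINS, BETA = 8, 0.7  # hypothetical chain length and inverse temperature
mu = chain_joint(N_SPINS, BETA)
for r in range(1, 6):
    print(r, decay_measures(mu, r))
```

On this chain the first quantity equals $(\tanh\beta)^{r+1}$, so both measures decay exponentially in $r$ at any finite $\beta$.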
Conditions
Shortest loop $\ge 4\ell$
Theorem (Tatikonda-Jordan 02)
If $\mu$ is unique ‘with rate $\delta(\,\cdot\,)$’ then it is an $(\varepsilon, r)$ Bethe state for any $r < \ell$ and $\varepsilon \ge C\delta(\ell - r)$, with respect to the message set output by belief propagation.
Theorem (Dembo-Montanari M07)
If $\mu$ is extremal ‘with rate $\delta(\,\cdot\,)$’ then it is an $(\varepsilon, r)$ Bethe state for any $r < \ell$ and $\varepsilon \ge C\delta(\ell - r)$.
Corrections
Small loops
Kikuchi, Cluster Variational Method (60s)
Correlations within small subsets of variables
⇓
Yedidia, Freeman, Weiss (2002)
Generalized Belief Propagation
What about large loops?
Idea: Cavity distribution [AM-Rizzo 2005]
[Figure: vertices 1, 2, 3, 4 on the boundary of a cavity]
$\nu(x_1, x_2, x_3, x_4)$
‘Large’ loops
[Figure: adjacent vertices 0 and 1; vertex 0 has the further neighbors 2, 3, 4 and vertex 1 has the further neighbors 5, 6, 7]
$\sum_{2,3,4}\nu(1,2,3,4)\,\psi(0,2)\psi(0,3)\psi(0,4) \propto \sum_{5,6,7}\nu(0,5,6,7)\,\psi(1,5)\psi(1,6)\psi(1,7)$
‘Large’ loops
Bethe approximation: ν factorizes
→ keep correlations in ν
[Alternative approaches: Chertkov et al. 06, Parisi-Slanina 06]
Puzzling mean field phenomena
A piece of standard wisdom
Phase transitions:
1. Non-analytic free energy (thermodynamic limit).
2. Diverging correlation length from 2-point functions.
3. Slowdown of Glauber dynamics.
[Stroock-Zegarlinski, Martinelli, Dyer et al., ...]
A counter-example
$x_i \in \{+1, -1\}$
$E(x) = -\sum_{a=1}^{M} x_{i_1(a)}\, x_{i_2(a)}\, x_{i_3(a)}\, x_{i_4(a)}$,
$\mu(x) \propto \exp\{-\beta E(x)\}$.
→ Each spin $x_i$ participates in 3 interactions
→ Random graph
‘Dynamic phase transition’
Proposition
For any $\beta \ge 0$
$-\beta f(\beta) = \log 2 + \frac{3}{4}\log\cosh\beta$.
Proposition
If $\tau(\beta; N)$ is the autocorrelation time
$\tau(\beta; N) \le \tau_*(\beta)$ for $\beta \le \beta'_d$,
$\tau(\beta; N) \ge \exp\{C(\beta) N\}$ for $\beta \ge \beta''_d$.
Theorem
If $\ell(\beta; N)$ is the ‘point-to-set’ correlation length
$C_1 \ell \le \tau \le \exp\{C_2 |B(\ell)|\}$.
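The constants in the free energy formula can be traced back to the model definition. The following is a sketch of where they come from, not the proof of the proposition; it assumes the normalization $-\beta f(\beta) = \lim_{N\to\infty} \frac{1}{N}\log Z$, which the slide does not spell out.

```latex
% Counting incidences: each of the N spins is in 3 interactions,
% and each of the M interactions contains 4 spins.
3N = 4M \quad\Longrightarrow\quad M = \tfrac{3}{4} N .

% For s = \pm 1 one has e^{\beta s} = \cosh\beta \, (1 + s \tanh\beta), hence
Z = \sum_x \prod_{a=1}^{M} e^{\beta\, x_{i_1(a)} x_{i_2(a)} x_{i_3(a)} x_{i_4(a)}}
  = (\cosh\beta)^{M} \sum_x \prod_{a=1}^{M}
    \bigl( 1 + \tanh\beta \; x_{i_1(a)} x_{i_2(a)} x_{i_3(a)} x_{i_4(a)} \bigr) .

% Keeping only the term 1 from every factor gives
2^{N} (\cosh\beta)^{3N/4},
\qquad
\frac{1}{N} \log\bigl[ 2^{N} (\cosh\beta)^{3N/4} \bigr]
  = \log 2 + \tfrac{3}{4} \log\cosh\beta .
```

The proposition then amounts to saying that the remaining terms of the expansion do not change the exponential growth rate of $Z$, for any $\beta \ge 0$.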
Conclusion
Many open problems
Example
On which graphs does the Ising model have mean field behavior?
If you want to know more. . .
google Stat 316