paper review: an exact mapping between the variational renormalization group and deep learning
Post on 14-Apr-2017
An exact mapping between the Variational Renormalization Group and Deep Learning
Kai-Wen Zhao, kv
Physics, National Taiwan University
kelispinor@gmail.com
December 1, 2016
Kai-Wen Zhao, kv (NTU-PHYS) Review December 1, 2016 1 / 18
Outline
Overview
Renormalization Group
Physical world with various length scales
Symmetry and Scale Invariance
Restricted Boltzmann Machine
Generative, Energy-based Model, Unsupervised Learning Algorithm
Richard Feynman: What I Cannot Create, I Do Not Understand.
Mapping
Unsupervised Deep Learning Implements the Kadanoff Real Space Variational Renormalization Group
$$H^{RG}_\lambda[\{h_j\}] = H^{RBM}_\lambda[\{h_j\}]$$
Overview of Variational RG
Statistical Physics
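An ensemble of N spins $\{v_i\}$, each taking the value ±1, where i is the position index on some lattice. Boltzmann distribution and partition function
$$P(\{v_i\}) = \frac{e^{-H(\{v_i\})}}{Z}, \quad \text{where} \quad Z = \mathrm{Tr}_{v_i}\, e^{-H(\{v_i\})} = \sum_{v_1, v_2, \ldots = \pm 1} e^{-H(\{v_i\})}$$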
Typically, Hamiltonian depends on a set of couplings {Ks}
$$H[\{v_i\}] = -\sum_i K_i v_i - \sum_{ij} K_{ij} v_i v_j - \sum_{ijk} K_{ijk} v_i v_j v_k + \ldots$$
Free energy of spin system
$$F = -\log Z = -\log\left(\mathrm{Tr}_{v_i}\, e^{-H(\{v_i\})}\right)$$
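For a handful of spins, the trace in the partition function can be evaluated by direct enumeration. The sketch below is an illustration under stated assumptions, not something taken from the slides: the Hamiltonian is truncated after the pairwise term, each pair (i, j) is counted once, and the names `hamiltonian`, `partition_function`, `free_energy`, `K1` and `K2` are hypothetical.

```python
import itertools
import math

def hamiltonian(v, K1, K2):
    """H[v] = -sum_i K1[i] v_i - sum_{i<j} K2[i][j] v_i v_j.

    Assumption: couplings beyond pairwise are dropped and each pair
    (i, j) with i < j is counted once.
    """
    n = len(v)
    field = sum(K1[i] * v[i] for i in range(n))
    pair = sum(K2[i][j] * v[i] * v[j]
               for i in range(n) for j in range(i + 1, n))
    return -field - pair

def partition_function(n, K1, K2):
    """Z = Tr_v exp(-H[v]), summed over all 2^n configurations with v_i = +/-1."""
    return sum(math.exp(-hamiltonian(v, K1, K2))
               for v in itertools.product((-1, 1), repeat=n))

def free_energy(n, K1, K2):
    """F = -log Z."""
    return -math.log(partition_function(n, K1, K2))
```

The cost grows as 2^N, so this is only useful as a check on very small systems.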
Overview of Variational Renormalization Group
Idea behind RG: to find a new coarse-grained description of the spin system, in which short-distance fluctuations have been integrated out.
N physical spins: $\{v_i\}$, couplings $\{K\}$
M coarse-grained spins: $\{h_j\}$, couplings $\{\tilde{K}\}$, where M < N
Renormalization transformation is often represented as a mapping
$$\{K\} \mapsto \{\tilde{K}\}$$
Coarse-grained Hamiltonian
$$H^{RG}[\{h_j\}] = -\sum_i \tilde{K}_i h_i - \sum_{ij} \tilde{K}_{ij} h_i h_j - \sum_{ijk} \tilde{K}_{ijk} h_i h_j h_k + \ldots$$
From now on, we write $v_i$ for $\{v_i\}$ (and $h_j$ for $\{h_j\}$) where there is no ambiguity
Variational RG scheme (Kadanoff)
Coarse-graining procedure: $T_\lambda(v_i, h_j)$ couples the auxiliary spins $h_j$ to the physical spins $v_i$
Naturally, we marginalize over the physical spins
$$\exp\left(-H^{RG}_\lambda(h_j)\right) = \mathrm{Tr}_{v_i} \exp\left(T_\lambda(v_i, h_j) - H(v_i)\right)$$
The free energy of the coarse-grained system
$$F^h_\lambda = -\log\left(\mathrm{Tr}_{h_j}\, e^{-H^{RG}_\lambda(h_j)}\right)$$
Choose the parameters λ so that long-distance observables are invariant: minimize the free energy difference
$$\Delta F = F^h_\lambda - F^v$$
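The coarse-graining step and the free energy difference can be checked by brute force on a tiny chain. The sketch below is only an illustration: the four-spin open chain, the coupling `J`, and the particular block-spin form of `T` are hypothetical choices, not the scheme used in the reviewed paper. This `T` is normalized so that its trace over the hidden spins equals 1 for every visible configuration, which is the special case in which ΔF vanishes for any λ. A generic, unnormalized `T` would not have this property and λ would have to be optimized.

```python
import itertools
import math

J = 0.5          # hypothetical nearest-neighbour coupling
N, M = 4, 2      # 4 physical spins, 2 coarse-grained spins

def H(v):
    """Open-chain Ising Hamiltonian for the physical spins."""
    return -J * sum(v[i] * v[i + 1] for i in range(N - 1))

def T(v, h, lam):
    """Hypothetical block-spin coupling: h_j couples to the pair (v_2j, v_2j+1).

    The log term normalizes T so that sum_h exp(T(v, h)) = 1 for every v.
    """
    total = 0.0
    for j in range(M):
        s = v[2 * j] + v[2 * j + 1]
        total += lam * h[j] * s - math.log(2.0 * math.cosh(lam * s))
    return total

V = list(itertools.product((-1, 1), repeat=N))
HID = list(itertools.product((-1, 1), repeat=M))

def H_RG(h, lam):
    """exp(-H_RG(h)) = Tr_v exp(T(v, h) - H(v))."""
    return -math.log(sum(math.exp(T(v, h, lam) - H(v)) for v in V))

def delta_F(lam):
    """Free energy difference F^h_lambda - F^v."""
    F_v = -math.log(sum(math.exp(-H(v)) for v in V))
    F_h = -math.log(sum(math.exp(-H_RG(h, lam)) for h in HID))
    return F_h - F_v
```

Calling `delta_F(0.7)` or `delta_F(2.0)` returns zero up to floating-point rounding, because the trace over `h` of `exp(T)` is 1 by construction.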
RBMs and Deep Neural Networks
Restricted Boltzmann Machine
Binary data probability distribution $P(v_i)$. Energy function
$$E(v_i, h_j) = \sum_{ij} w_{ij} v_i h_j + \sum_i c_i v_i + \sum_j b_j h_j$$
where we denote parameters $\lambda = \{w, b, c\}$. Joint probability
$$p_\lambda(v_i, h_j) = \frac{e^{-E(v_i, h_j)}}{Z}$$
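The energy function and joint probability above can be written out directly for a small RBM. This is a minimal sketch by enumeration, assuming a model small enough to list every state. The function names and the `values` parameter (which selects ±1 spins or 0/1 binary units) are hypothetical and not from the slides.

```python
import itertools
import numpy as np

def rbm_energy(v, h, w, b, c):
    """E(v, h) = sum_ij w_ij v_i h_j + sum_i c_i v_i + sum_j b_j h_j."""
    return v @ w @ h + c @ v + b @ h

def rbm_joint(w, b, c, values=(-1, 1)):
    """Enumerate all (v, h) states and return p(v, h) = exp(-E(v, h)) / Z.

    Assumption: the model is tiny, so the 2^(n_v + n_h) states can be listed.
    """
    n_v, n_h = w.shape
    states = [(np.array(v), np.array(h))
              for v in itertools.product(values, repeat=n_v)
              for h in itertools.product(values, repeat=n_h)]
    weights = np.array([np.exp(-rbm_energy(v, h, w, b, c)) for v, h in states])
    return states, weights / weights.sum()
```

Practical RBM training avoids this enumeration, since Z is intractable for realistic layer sizes.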
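Variational distribution of visible variables
$$p_\lambda(v_i) = \sum_{h_j} p_\lambda(v_i, h_j) = \mathrm{Tr}_{h_j}\, p_\lambda(v_i, h_j) := \frac{e^{-H^{RBM}_\lambda(v_i)}}{Z}$$
and, likewise, of hidden variables
$$p_\lambda(h_j) = \sum_{v_i} p_\lambda(v_i, h_j) = \mathrm{Tr}_{v_i}\, p_\lambda(v_i, h_j) := \frac{e^{-H^{RBM}_\lambda(h_j)}}{Z}$$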
Kullback-Leibler divergence
$$D_{KL}\left(P(v_i)\,\|\,p_\lambda(v_i)\right) = \sum_{v_i} P(v_i) \log \frac{P(v_i)}{p_\lambda(v_i)}$$
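For distributions given as arrays over the same list of configurations, the Kullback-Leibler divergence is a one-line sum. The helper below is a small sketch. The name `kl_divergence` is hypothetical, and it assumes `p` is nonzero wherever `P` is nonzero (terms with `P = 0` contribute zero by convention).

```python
import numpy as np

def kl_divergence(P, p):
    """D_KL(P || p) = sum_v P(v) log(P(v) / p(v)).

    Assumption: P and p are normalized arrays over the same configurations,
    and p > 0 wherever P > 0.
    """
    P = np.asarray(P, dtype=float)
    p = np.asarray(p, dtype=float)
    mask = P > 0  # terms with P(v) = 0 contribute nothing
    return float(np.sum(P[mask] * np.log(P[mask] / p[mask])))
```

The divergence is zero exactly when the two distributions agree, which is the sense in which the RBM marginal is fitted to the data distribution.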
Exact Mapping VRG to DL
Mapping Variational RG to RBM
In the RG scheme, the couplings between visible and hidden spins are encoded by the operator $T_\lambda$. In an RBM, the analogous role is played by the joint energy function.
$$T_\lambda(v_i, h_j) = -E(v_i, h_j) + H(v_i)$$
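To derive the equivalent statement, start from the coarse-grained Hamiltonian
$$\frac{e^{-H^{RG}_\lambda(h_j)}}{Z} = \frac{\mathrm{Tr}_{v_i}\, e^{T_\lambda(v_i, h_j) - H(v_i)}}{Z} = \mathrm{Tr}_{v_i} \frac{e^{-E(v_i, h_j)}}{Z} = p_\lambda(h_j) = \frac{e^{-H^{RBM}_\lambda(h_j)}}{Z}$$
Substituting the right-hand side yields
$$H^{RG}_\lambda[\{h_j\}] = H^{RBM}_\lambda[\{h_j\}] \qquad (1)$$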
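Equation (1) can be confirmed numerically on a toy model. The sketch below is illustrative only: the three-spin open Ising chain, the coupling `J`, and the random RBM parameters are hypothetical. It builds T from the joint energy as T = -E + H and compares the two hidden-spin Hamiltonians for every hidden configuration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n_v, n_h = 3, 2
w = rng.normal(size=(n_v, n_h))   # hypothetical RBM weights
b = rng.normal(size=n_h)          # hidden biases
c = rng.normal(size=n_v)          # visible biases
J = 0.4                           # hypothetical Ising coupling

def H(v):
    """Physical Hamiltonian: open Ising chain."""
    return -J * np.sum(v[:-1] * v[1:])

def E(v, h):
    """RBM joint energy."""
    return v @ w @ h + c @ v + b @ h

def T(v, h):
    """Coupling operator defined through the mapping T = -E + H."""
    return -E(v, h) + H(v)

V = [np.array(s) for s in itertools.product((-1, 1), repeat=n_v)]
HID = [np.array(s) for s in itertools.product((-1, 1), repeat=n_h)]
Z = sum(np.exp(-E(v, h)) for v in V for h in HID)

def H_RG(h):
    """exp(-H_RG(h)) = Tr_v exp(T(v, h) - H(v))."""
    return -np.log(sum(np.exp(T(v, h) - H(v)) for v in V))

def H_RBM(h):
    """p(h) = exp(-H_RBM(h)) / Z, with p(h) the RBM marginal."""
    p_h = sum(np.exp(-E(v, h)) for v in V) / Z
    return -np.log(p_h * Z)

max_gap = max(abs(H_RG(h) - H_RBM(h)) for h in HID)
```

`max_gap` comes out at rounding-error level. The agreement holds for any choice of `H` and RBM parameters, since it follows from the definition of T alone.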
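The operator $T_\lambda$ can be viewed as a variational approximation for the conditional probability
$$e^{T_\lambda(v_i, h_j)} = e^{-E(v_i, h_j) + H(v_i)} = \frac{p_\lambda(v_i, h_j)}{p_\lambda(v_i)}\, e^{H(v_i) - H^{RBM}_\lambda(v_i)} = p_\lambda(h_j | v_i)\, e^{H(v_i) - H^{RBM}_\lambda(v_i)}$$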
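One consequence of this identity can be made explicit. The following is a worked step, not something stated on the slides: tracing over the hidden spins uses only the normalization of the conditional probability. If the trace of the exponentiated coupling equals 1 for every visible configuration, the RBM's visible Hamiltonian coincides with the physical one, so the two partition functions coincide as well and the variational distribution reproduces the data distribution.

```latex
\begin{align}
\mathrm{Tr}_{h_j}\, e^{T_\lambda(v_i, h_j)}
  &= e^{H(v_i) - H^{RBM}_\lambda(v_i)} \sum_{h_j} p_\lambda(h_j \mid v_i)
   = e^{H(v_i) - H^{RBM}_\lambda(v_i)} \\
\mathrm{Tr}_{h_j}\, e^{T_\lambda(v_i, h_j)} = 1 \ \text{for all } v_i
  &\iff H^{RBM}_\lambda(v_i) = H(v_i)
   \implies p_\lambda(v_i) = P(v_i)
   \iff D_{KL}\left(P \,\|\, p_\lambda\right) = 0
\end{align}
```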
Examples
Examples: 2D Ising Model
Two dimensional nearest neighbor Ising model with ferromagnetic coupling
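$$H(\{v_i\}) = -J \sum_{\langle ij \rangle} v_i v_j$$
The slide quotes the phase transition at $J/(k_B T) = 0.4352$; Onsager's exact result for the square lattice is $\ln(1 + \sqrt{2})/2 \approx 0.4407$.
Experiment Setup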
20,000 samples, 40x40 periodic lattice
RBM’s architecture 1600-400-100-25
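The layer sizes correspond to square grids of 40x40, 20x20, 10x10 and 5x5 units, so each layer cuts the unit count by a factor of four. The slides do not say how the 20,000 configurations were generated. A single-spin-flip Metropolis sampler is one common option and is sketched below purely as an assumption. The function names, the number of sweeps and the coupling passed as `K` are hypothetical.

```python
import numpy as np

def ising_energy(spins, J=1.0):
    """H = -J * sum over nearest-neighbour bonds with periodic boundaries.

    Each bond is counted once (right and down neighbours only).
    """
    right = np.roll(spins, -1, axis=1)
    down = np.roll(spins, -1, axis=0)
    return -J * float(np.sum(spins * right + spins * down))

def metropolis_sweep(spins, K, rng):
    """One sweep of single-spin-flip Metropolis at coupling K = J / (k_B T)."""
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        nn = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
              + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2.0 * K * spins[i, j] * nn  # change in H / (k_B T) if flipped
        if dE <= 0 or rng.random() < np.exp(-dE):
            spins[i, j] = -spins[i, j]
    return spins

def sample(L, K, n_sweeps=10, seed=0):
    """Return one L x L configuration after n_sweeps sweeps from a random start."""
    rng = np.random.default_rng(seed)
    spins = rng.choice(np.array([-1, 1]), size=(L, L))
    for _ in range(n_sweeps):
        metropolis_sweep(spins, K, rng)
    return spins
```

A 40x40 sample flattened with `sample(40, K).reshape(-1)` has 1600 entries, matching the visible layer. Near the critical coupling many more sweeps than the default would be needed for well-decorrelated samples.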
Figure: Top layer
Figure: Middle layer
Conclusion
Conclusion and Discussion
One-to-one mapping between RBM-based DNN and variational RG
Suggests that learning implements an RG-like scheme to extract important features from data
Relate to us
Relate to us: Auto-Encoder and Convolutional AE
z is the code extracted by the machine
$$\phi : X \to Z, \qquad \psi : Z \to X$$
$$\arg\min_{\phi, \psi} \left\| X - (\psi \circ \phi) X \right\|^2$$
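The reconstruction objective can be illustrated with the simplest possible encoder and decoder. The sketch below swaps the neural networks for linear maps, in which case the minimizer is given by principal component analysis. It is not the convolutional auto-encoder the slides refer to, and the function names are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X, code_dim):
    """Return (encode, decode) for the best rank-code_dim linear auto-encoder.

    Assumption: phi and psi are restricted to linear maps on centred data,
    so the optimum is spanned by the top right-singular vectors of X.
    """
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W = Vt[:code_dim]  # shape (code_dim, n_features)

    def encode(x):
        """phi: X -> Z, projection onto the leading components."""
        return (x - mean) @ W.T

    def decode(z):
        """psi: Z -> X, back-projection into data space."""
        return z @ W + mean

    return encode, decode

def reconstruction_error(X, encode, decode):
    """||X - (psi o phi) X||^2, summed over all samples and features."""
    return float(np.sum((X - decode(encode(X))) ** 2))
```

With `code_dim` equal to the number of features the error vanishes up to rounding. Smaller codes trade reconstruction accuracy for compression, which is the sense in which z holds the extracted features.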
Figure: Scheme of Auto-Encoder
Thanks