TRANSCRIPT
UNSUPERVISED RECOGNITION OF PHASE TRANSITIONS ON THE LATTICE
ANDREAS ATHENODOROU
MARIE SKŁODOWSKA-CURIE INDIVIDUAL POSTDOCTORAL FELLOW
INFN, UNIVERSITY OF PISA & THE CYPRUS INSTITUTE
8TH INTERNATIONAL CONFERENCE ON
NEW FRONTIERS IN PHYSICS (ICNFP 2019),
KOLYMBARI, 27TH OF AUGUST 2019
BASED ON ARXIV:1903.03506 AND UNPUBLISHED WORK
IN COLLABORATION WITH: C. ALEXANDROU, A. APSEROS, C. CHRYSOSTOMOU, C. HAVADJIA, S. SHIAKAS, S. PAUL, D. VADACCHINO
MACHINE LEARNING IN STATISTICAL SYSTEMS
Neural networks can be used to identify phases and phase transitions in condensed matter systems via
supervised machine learning. Readily programmable through modern software libraries, we show that
a standard feed-forward neural network can be trained to detect multiple types of order parameter
directly from raw state configurations sampled with Monte Carlo.
We show that this classification occurs within the neural network without knowledge of the Hamiltonian
or even the general locality of interactions. These results demonstrate the power of machine learning
as a basic research tool in the field of condensed matter and statistical physics.
Since then:
• J. Carrasquilla and R. G. Melko, Machine learning phases of matter, Nature Physics 13 (Feb, 2017) 431, [arXiv:1605.01735].
• L. Wang, Discovering phase transitions with unsupervised learning, Phys. Rev. B 94 (Nov, 2016) 195105, [arXiv:1606.00318].
• P. Broecker, J. Carrasquilla, R. G. Melko, and S. Trebst, Machine learning quantum phases of matter beyond the fermion sign problem, Scientific Reports 7
(2017), no. 1 8823, [arXiv:1608.07848].
• K. Ch’ng, J. Carrasquilla, R. G. Melko, and E. Khatami, Machine learning phases of strongly correlated fermions, Physical Review X 7 (2017), no. 3 031038,
[arXiv:1609.02552].
• E. van Nieuwenburg, Y.-H. Liu, and S. Huber, Learning phase transitions by confusion, Nature Physics 13 (Feb, 2017) 435, [arXiv:1610.02048].
• S. J. Wetzel, Unsupervised learning of phase transitions: From principal component analysis to variational autoencoders, Phys. Rev. E 96 (Aug, 2017)
022140, [arXiv:1703.02435].
• W. Hu, R. R. P. Singh, and R. T. Scalettar, Discovering phases, phase transitions, and crossovers through unsupervised machine learning: A critical
examination, Phys. Rev. E 95 (Jun, 2017) 062122, [arXiv:1704.00080].
• P. Broecker, F. F. Assaad, and S. Trebst, Quantum phase recognition via unsupervised machine learning, arXiv e-prints (July, 2017) arXiv:1707.00663,
[arXiv:1707.00663].
• N. Yoshioka, Y. Akagi, and H. Katsura, Learning disordered topological phases by statistical recovery of symmetry, Phys. Rev. B 97 (May, 2018) 205110,
[arXiv:1709.05790].
• J. Venderley, V. Khemani, and E.-A. Kim, Machine learning out-of-equilibrium phases of matter, Phys. Rev. Lett. 120 (Jun, 2018) 257204,
[arXiv:1711.00020].
• W. Zhang, J. Liu, and T.-C. Wei, Machine learning of phase transitions in the percolation and XY models, arXiv e-prints (Apr., 2018) arXiv:1804.02709,
[arXiv:1804.02709].
• S. J. Wetzel and M. Scherzer, Machine Learning of Explicit Order Parameters: From the Ising Model to SU(2) Lattice Gauge Theory, Phys. Rev. B 96 (2017),
no. 18 184410, [arXiv:1705.05582].
• A. Morningstar and R. G. Melko, Deep learning the Ising model near criticality, arXiv:1708.04622.
• D. Kim and D.-H. Kim, Smallest neural network to learn the Ising criticality, Phys. Rev. E 98 (Aug, 2018) 022138, [arXiv:1804.02171].
• R. B. Jadrich, B. A. Lindquist, and T. M. Truskett, Unsupervised machine learning for detection of phase transitions in off-lattice systems. I. Foundations, J. Chem. Phys. 149 (Nov., 2018) 194109, [arXiv:1808.00084].
• X. L. Zhao and L. B. Fu, Machine Learning Phase Transition: An Iterative Proposal, arXiv e-prints (Aug., 2018) arXiv:1808.01731, [arXiv:1808.01731].
• S. S. Funai and D. Giataganas, Thermodynamics and Feature Extraction by Machine Learning, arXiv:1810.08179.
• K. Kashiwa, Y. Kikuchi, and A. Tomiya, Phase transition encoded in neural network, arXiv:1812.01522.
• C. Giannetti, B. Lucini, and D. Vadacchino, Machine Learning as a universal tool for quantitative investigations of phase transition, arXiv:1812.06726
MACHINE LEARNING IN STATISTICAL SYSTEMS
[Figure: toy K-nearest-neighbours example: songs plotted by tempo ("relaxed" to "fast") and intensity ("light" to "soaring"), labelled Like/Dislike and grouped into clusters A, B, C.]
MACHINE LEARNING
Supervised learning
Unsupervised learning
Reinforcement learning
[Figure: classification example: training data with labels 1, 2, 3; a new input x is assigned label 2 with some high probability, e.g. distinguishing cat from dog.]
MACHINE LEARNING
1. Input layer
2. Hidden layer
3. Hidden layer
4. Hidden layer
…
10. Output layer

With inputs $x_1, \dots, x_8$, each layer computes its activations from the outputs of the previous one:
$$y_j = f\left(\sum_{i=1}^{8} w_{ij}\, x_i + b_j\right), \qquad j = 1, \dots, n$$
$$z_k = f\left(\sum_{i=1}^{n} w_{ik}\, y_i + b_k\right), \qquad k = 1, \dots, m$$
$$e_r = f\left(\sum_{i=1}^{m} w_{ir}\, z_i + b_r\right), \qquad r = 1, \dots, l$$
NEURAL NETWORKS
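As a minimal illustration of how such a feed-forward pass works, here is a sketch in NumPy (the layer widths, random weights, and tanh activation are placeholder choices for illustration, not the network used in this work):

```python
import numpy as np

def dense_layer(x, W, b, f=np.tanh):
    """One fully connected layer: out_j = f(sum_i W[i, j] * x[i] + b[j])."""
    return f(x @ W + b)

rng = np.random.default_rng(0)
x = rng.choice([-1.0, 1.0], size=8)            # 8 inputs, as in the sketch above
W1, b1 = rng.normal(size=(8, 4)), np.zeros(4)  # 8 -> 4 hidden units
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)  # 4 -> 2 outputs
y = dense_layer(x, W1, b1)  # hidden activations
z = dense_layer(y, W2, b2)  # output activations
print(z)
```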
• 1D: not so interesting, no phase transition (never magnetised)
• 2D: more interesting, there is a phase transition
• Simplest description of ferromagnetism
• Hamiltonian (sum over nearest neighbours $\langle ij \rangle$):
$$H = -J \sum_{\langle ij \rangle} s_i s_j - h \sum_i s_i$$
• The 2D Ising model is the simplest nontrivial statistical system one can investigate
• It is in the same universality class as the $Z_2$ gauge theory, as well as $SU(N)$ for $N = 2, 3, 4$
• It is exactly solvable at zero external magnetic field $h$ (Lars Onsager, 1944)
• Critical temperature (in units $J = k_B = 1$): $T_c = 2/\ln(1+\sqrt{2}) \approx 2.269$
THE ISING MODEL
Observables (near criticality, $T \approx T_c$):
• Magnetization is the order parameter:
$$m = \frac{1}{N} \sum_{i=1}^{N} s_i, \qquad m(T) \sim |T - T_c|^{\beta}$$
The 2D Ising model has a second-order phase transition (the magnetization is continuous).
• Magnetic susceptibility:
$$\chi = \frac{N}{T}\left(\langle m^2 \rangle - \langle m \rangle^2\right), \qquad \chi(T) \sim |T - T_c|^{-\gamma}$$
• Heat capacity:
$$C = \frac{\partial \langle E \rangle}{\partial T}, \qquad C(T) \sim |T - T_c|^{-\alpha}$$
THE ISING MODEL
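For concreteness, these observables can be evaluated on sampled spin configurations along the following lines (a NumPy sketch assuming J = 1, h = 0, and periodic boundaries; the function names are illustrative):

```python
import numpy as np

def energy(s, J=1.0):
    """H = -J * sum_<ij> s_i s_j over nearest-neighbour pairs, periodic boundaries."""
    return -J * np.sum(s * (np.roll(s, 1, axis=0) + np.roll(s, 1, axis=1)))

def magnetization(s):
    """m = (1/N) * sum_i s_i for one L x L configuration."""
    return s.mean()

def susceptibility(ms, N, T):
    """chi = (N/T) * (<m^2> - <m>^2), estimated over an ensemble of magnetizations ms."""
    ms = np.asarray(ms)
    return N / T * (np.mean(ms**2) - np.mean(ms)**2)
```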
• The autoencoder aims to define a representation (encoding) of our assemblage of data by performing dimensionality reduction:
• It encodes the input data $\{X\}$ from the input layer into a latent dimension $\{z\}$
• It then decompresses that latent dimension into an approximation $\{X'\}$ of the original data
• It consists of two components, an encoder function $g_\varphi$ and a decoder function $f_\theta$; the reconstructed input is $X' = f_\theta(g_\varphi(X))$
• The autoencoder learns the parameters $\varphi$ and $\theta$ together
• $X' = f_\theta(g_\varphi(X))$ can approximate an identity function
• Criterion: minimize the Mean Square Error (MSE) between input and reconstruction:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left\| X_i - f_\theta(g_\varphi(X_i)) \right\|^2$$
THE ISING MODEL IN THE AUTOENCODER
• Eight layers
• Fully connected (Dense)
• Single latent dimension
• Architecture chosen through experimentation
Dropout reduces overfitting (G. E. Hinton et al., arXiv:1207.0580)
THE ISING MODEL IN THE AUTOENCODER
Implementation:
• Using Keras (F. Chollet et al., "Keras", https://keras.io, 2015)
• Using TensorFlow (M. Abadi et al., OSDI, vol. 16, pp. 265–283, 2016)
• 66.66% of the data used for training
• 33.33% of the data used for validation
• Training is performed for 2000 iterations
• Example: [figure]
THE ISING MODEL IN THE AUTOENCODER
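A minimal sketch of such an autoencoder in Keras: only the overall structure (dense layers, a single latent unit, dropout, MSE loss, a 2/3 to 1/3 train/validation split) follows the slide; the layer widths, activations, and dropout rate here are illustrative guesses, not necessarily those of arXiv:1903.03506.

```python
from tensorflow import keras
from tensorflow.keras import layers

L = 25                                   # linear lattice size
inp = keras.Input(shape=(L * L,))        # flattened spin configuration
h = layers.Dense(256, activation="relu")(inp)
h = layers.Dropout(0.2)(h)               # dropout to reduce overfitting
h = layers.Dense(64, activation="relu")(h)
z = layers.Dense(1, activation="tanh", name="latent")(h)  # single latent dimension
h = layers.Dense(64, activation="relu")(z)
h = layers.Dense(256, activation="relu")(h)
out = layers.Dense(L * L, activation="tanh")(h)

autoencoder = keras.Model(inp, out)
encoder = keras.Model(inp, z)            # used later to read off the latent dimension
autoencoder.compile(optimizer="adam", loss="mse")

# X: array of shape (n_configs, L*L) with entries +-1
# autoencoder.fit(X, X, validation_split=1/3, epochs=2000, batch_size=128)
```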
• Each configuration is re-expressed in the form of a vector:
(-1, 1, -1,1,1, -1,1,-1,1,-1,1,1,-1,1,-1,1,1,1,1,-1)
Input into the encoder
[Figure: the L × L lattice is flattened into a vector of length L² and compressed by the encoder to the single latent dimension.]
• In other words, each configuration is
assigned a number, the latent
dimension, which includes all the
physically necessary information so
that the decoder re-creates the
actual configuration
PROCEDURE
• The autoencoder receives information for only one lattice volume and thus "knows" nothing about configurations produced for other volumes
PROCEDURE
Attempt to identify signals of the phase structure of the 2D Ising model.
Question: how does the latent dimension behave as a function of the temperature T for each configuration?
• We produce 40000 configurations in total
• 200 configurations for every single temperature T
• Temperature range T = 1 − 4.5
• We make sure that we cover the whole range of temperatures between the two extreme cases of the Ising behaviour:
• the nearly "frozen" at T = 1
• the disordered at T = 4.5
• We assume that we have no prior knowledge of what happens in between
• Wait! Isn't there a degree of supervision?
• We could have chosen different temperature ranges that cover all possible phase regions; for instance, T = 0.01 − 1000 (computationally more expensive)
PROCEDURE
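The ensemble described above can be generated with a standard Metropolis algorithm; a minimal sketch (the single-spin-flip update and the sweep counts are illustrative assumptions, not the production setup):

```python
import numpy as np

def metropolis_sweep(s, T, rng, J=1.0):
    """One Metropolis sweep over an L x L Ising lattice with periodic boundaries."""
    L = s.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(L, size=2)
        nn = s[(i+1) % L, j] + s[(i-1) % L, j] + s[i, (j+1) % L] + s[i, (j-1) % L]
        dE = 2.0 * J * s[i, j] * nn          # energy cost of flipping s[i, j]
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            s[i, j] *= -1

rng = np.random.default_rng(1)
L, configs = 25, []
for T in np.linspace(1.0, 4.5, 200):         # 200 temperatures in T = 1 - 4.5
    s = rng.choice([-1.0, 1.0], size=(L, L))
    for _ in range(500):                     # thermalisation sweeps (assumption)
        metropolis_sweep(s, T, rng)
    for _ in range(200):                     # 200 configurations per temperature
        metropolis_sweep(s, T, rng)          # consecutive sweeps; decorrelation ignored here
        configs.append((T, s.flatten().copy()))
```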
THE LATENT DIMENSION PER CONFIGURATION
We plot the latent dimension for each different configuration, as a function of the temperature 𝑇, for
four different lattice sizes, 𝐿 = 25, 35, 50, 150.
For low temperatures we obtain two plateaus, one located at z = 1 and one at z = −1.
• These correspond to two distinct states that are not connected through any kind of transformation
• This reflects the spontaneously broken $Z_2 \equiv \{-1, 1\}$ global symmetry group.
• One can interpret these two plateaus as the two cases where all spins are up or down.
• Is this true?
• We test the correlation coefficient between the latent dimension and the magnetization
[Figure: correlation coefficient between magnetization and latent dimension]
THE LATENT DIMENSION PER CONFIGURATION
• The latent-dimension values −1 and 1 correspond to the two orientations of the spins.
• Finally, the two plateaus become more distinct as the lattice size increases.
• Over some temperature range $\Delta T_{\mathrm{trans}}$ the aforementioned behaviour collapses to one state, located around z = 0. This reflects the restoration of the $Z_2$ symmetry.
• In other words, it corresponds to the case where the spins are disordered.
• Critical point: as the lattice size increases, the width of this transition decreases and the step becomes steeper and steeper. As $L \to \infty$, $T_c(L)$ is localised right on the critical temperature $T_c$.
THE LATENT DIMENSION PER CONFIGURATION
Hence:
The autoencoder predicts the two phases.
It provides, to a good approximation, the critical temperature!
In fact, for L = 150, by fitting a line we get $T_c \approx 2.28(4)$, already consistent with the analytical value $T_c \approx 2.269\ldots$
Of course, this does not tell us what happens at $L \to \infty$.
THE LATENT DIMENSION PER CONFIGURATION
What happens within different "temperature windows"?
What if we use a temperature window within the range T = 1 − 2 and apply the autoencoder?
• Two ordered states are visible, without the presence of a critical point
• Since there is no visible signal of phase-transition behaviour within this range of T, it is reasonable to use another temperature window
What if we choose T = 3 − 4.5?
• No particular pattern is observed
[Figure: latent dimension for the windows T = 1 − 2 and T = 3 − 4.5]
DIFFERENT TEMPERATURE WINDOWS
The next step would be to go to T = 2 − 3.
First: the average latent dimension $\langle z \rangle$ as a function of the temperature T (L = 150).
For low temperatures the latent dimension is, to a good approximation, equally distributed between the values z = 1 and z = −1.
It is therefore safe to consider the absolute value of the latent dimension.
CAN WE DEFINE MACROSCOPIC QUANTITIES?
Since the latent dimension per configuration is symmetric with respect to the T axis, it would be reasonable to define the average absolute latent dimension
$$\langle |z| \rangle = \frac{1}{N_{\mathrm{conf}}} \sum_{i=1}^{N_{\mathrm{conf}}} |z_i|$$
The latent dimension then resembles the behaviour of the magnetization per spin configuration, $m = \frac{1}{N}\sum_{i=1}^{N} s_i$, as a function of the temperature.
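Given a trained encoder, the average absolute latent dimension at one temperature is straightforward to evaluate; a sketch, where `encoder` is the Keras sub-model from the earlier sketch and `X_T` holds the configurations at that temperature:

```python
import numpy as np

def avg_abs_latent(encoder, X_T):
    """<|z|>: mean of |z_i| over the N_conf configurations at one temperature."""
    z = encoder.predict(X_T, verbose=0).flatten()
    return np.abs(z).mean()
```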
THE LATENT SUSCEPTIBILITY AND THE CRITICAL TEMPERATURE
We plot the average absolute latent dimension as a function of the temperature and compare with the magnetization:
[Figure: the magnetization shows second-order behaviour; the latent dimension looks as if first order]
The absolute average latent dimension can be used as an order parameter to identify the critical temperature, but it cannot capture the right order of the phase transition.
We plot the average absolute latent dimension against the magnetization again:
[Figure: the latent-dimension curve becomes steeper with increasing L]
$T_c(L)$ extracted from the autoencoder data will suffer less from finite-size scaling effects.
THE LATENT SUSCEPTIBILITY AND THE CRITICAL TEMPERATURE
We define the latent susceptibility, in analogy with the magnetic susceptibility, as
$$\chi_z = \frac{N}{T}\left(\langle z^2 \rangle - \langle |z| \rangle^2\right)$$
Like the magnetic susceptibility, its peaks determine $T_c(L)$.
More points are required. No overfitting! (Using the extracted weights.)
We extract the latent susceptibility as well as the magnetic susceptibility; from the peaks we extract $T_c(L)$.
Does it extrapolate to the right value?
THE LATENT SUSCEPTIBILITY AND THE CRITICAL TEMPERATURE
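A sketch of how the latent susceptibility and its peak position could be evaluated, mirroring the magnetic susceptibility (the exact definition used in the talk is read off the analogy above):

```python
import numpy as np

def latent_susceptibility(zs, N, T):
    """chi_z = (N/T) * (<z^2> - <|z|>^2) over the latent values zs at one temperature."""
    zs = np.asarray(zs)
    return N / T * (np.mean(zs**2) - np.mean(np.abs(zs))**2)

def peak_temperature(temps, chis):
    """Estimate T_c(L) as the temperature at which the susceptibility peaks."""
    return temps[int(np.argmax(chis))]
```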
Results obtained using the latent susceptibility suffer less from finite-size scaling effects than those obtained using the magnetic susceptibility.
We present $T_c(L)$, extracted from fitting the latent susceptibility and the magnetic susceptibility, as a function of $1/L$.
We adopt the standard finite-size scaling behavior:
$$T_c(L) = T_c + c\, L^{-1/\nu}$$
Analytically extracted value: $T_c = 2/\ln(1+\sqrt{2}) \approx 2.269$
For the magnetic susceptibility, $\nu = 1$.
THE LATENT SUSCEPTIBILITY AND THE CRITICAL TEMPERATURE
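The extrapolation to $L \to \infty$ amounts to a two-parameter least-squares fit of the scaling form above; a sketch with made-up $T_c(L)$ values (the real fit would use the values extracted from the susceptibility peaks):

```python
import numpy as np
from scipy.optimize import curve_fit

def tc_scaling(L, Tc_inf, c):
    """Standard finite-size scaling T_c(L) = T_c + c * L**(-1/nu), with nu fixed to 1."""
    nu = 1.0
    return Tc_inf + c * L**(-1.0 / nu)

Ls = np.array([25.0, 35.0, 50.0, 150.0])
TcL = np.array([2.32, 2.31, 2.30, 2.28])     # illustrative values, not the real data
popt, pcov = curve_fit(tc_scaling, Ls, TcL)
print(f"T_c(L -> inf) = {popt[0]:.3f} +/- {np.sqrt(pcov[0, 0]):.3f}")
```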
• We apply a deep-learning autoencoder to configurations of the 2D Ising model at zero magnetic field
• No prior knowledge of the system
• We determine both phase regions and the critical temperature
• We define the average absolute latent dimension as a quasi-order parameter
• We define the latent susceptibility
• We observe the spontaneously broken $Z_2 \equiv \{-1, 1\}$ global symmetry group
• We can determine the critical temperature $T_c$ to adequate accuracy
• We can extract $T_c(L)$ as a function of $1/L$ with small finite-size scaling effects
• The autoencoder does not know about the order of the phase transition but instead recognizes the phase sector as the important feature of the Ising model configurations
• WHY DOES THIS WORK???
CONCLUSIONS
FURTHER WORK: 3D-ISING MODEL
[Figure: latent dimension as a function of T for L = 16, 24, 32, 36]
• 3D-Ising Model
Critical Temperature 𝑇𝐶 = 4.511… (Talapov and Blöte (1996))
Second Order phase transition
[Figure: latent dimension with linear activation vs. tanh activation]
• 4D-Ising Model
Critical Temperature 𝑇𝐶 = 6.65 (P. H. Lundow, K. Markström 2012)
[Figure: latent dimension as a function of T for L = 10, 12, 16]
FURTHER WORK: 4D-ISING MODEL
• POTTS MODEL
The Potts (1952) model is a generalization of the Ising model such that each spin can have
𝑞 values.
Interesting because it is a multistate model with
2nd-order phase transition for q ≤ 4; 1st-order phase transition for q > 4
[Figure: latent dimension as a function of T for the q = 3 (3 states) and q = 4 (4 states) Potts models]
FURTHER WORK: 2D-POTTS
Autoencoders can “see” the broken Center Symmetry of the Underlying Group!
𝑍(𝐺) = {𝑧 ∈ 𝐺 ∣ ∀𝑔 ∈ 𝐺, 𝑧𝑔 = 𝑔𝑧}
Very Interesting to see what happens with field theories:
𝑈(1) – Infinite order phase transition with continuous center symmetry
𝑆𝑈(𝑁) – 𝑍𝑁 center symmetry
• Z(2): 2D Ising model
• Z(3): 2D q = 3 Potts
• Z(4): 2D q = 4 Potts
• continuous: 2D XY model
FURTHER WORK:
• Unsupervised investigation of the phase transition in 2D-Potts and XY-models
• Unsupervised investigation of the phase transition in field theories such as
• Compact 𝑈(1) (infinite order phase transition)
• SU(N) (second order for small N, first order as we increase N)
• Supervised/unsupervised investigation of Topological field objects in statistical systems and
quantum field theories (Instantons, Nielsen-Olesen vortices, Kosterlitz-Thouless vortices)
• Production of configurations using generative algorithms
FURTHER WORK
UPCOMING WORKSHOP
Machine Learning for High Energy Physics, on and off the Lattice
ECT* Trento, 23 – 27 March 2020
Organising Committee:
• Andreas Athenodorou (Pisa)
• Dimitrios Giataganas (NKUA)
• Enrico Rinaldi (RIKEN)
• Biagio Lucini (Swansea)
• Kyle Cranmer (NYU)
• Constantia Alexandrou (Cyprus)