
Physics of Complex Systems

— Lecture notes —

PRELIMINARY VERSION

July 6, 2015

Prof. Dr. Haye Hinrichsen

Lehrstuhl für Theoretische Physik III
Fakultät für Physik und Astronomie

Universität Würzburg

Summer term 2014


This lecture, starting in the summer term 2014, is concerned with the research field “Physics of Complex Systems”, which emerged in the early 1990s as a new branch of theoretical and experimental physics. Important landmarks were the foundation of the Santa Fe Institute in New Mexico, the Complex Systems Department at the Weizmann Institute of Science in Israel, and later the Max Planck Institute for the Physics of Complex Systems in Dresden, to name only a few.

“Complex” does not necessarily mean “complicated”. In fact, most physicists are not particularly interested in engineering complicated models with a large number of parameters. Instead the term “complex” refers to systems consisting of a large number of interacting building blocks. The idea is that such a composite system can exhibit new properties which are not part of the individual building blocks. Such new phenomena, which are not visible in the building blocks alone, have been termed “emergent”.

This idea, however, is not new. In fact, this is exactly what people have been doing for decades in solid-state physics. For example, a paramagnetic-ferromagnetic phase transition in a magnetic material is an emergent phenomenon which cannot be read off from an individual atom. Moreover, any kind of many-particle physics, including thermodynamics and statistical mechanics, can be viewed as a study of “complex systems”. However, in the late 1980s the idea came up that all these concepts can be applied to a much broader range of systems and phenomena, and that one does not necessarily need highly sophisticated experiments to observe them.

Early pioneers in this direction started to put beer foam between two glass plates, place it on a copy machine, xerox how the foam coarsens as a function of time, and send the copies to Physical Review Letters. Other people suggested that the statistical distribution of the water level of the Nile follows universal laws. And of course, the upcoming internet inspired new approaches to describe the properties of complex networks.

Meanwhile the field of “Complex Systems” has matured and is now established as an interdisciplinary field of theoretical and experimental physics. Moreover, it includes various interdisciplinary subtopics such as, for example, econophysics and social network studies. However, as the field is very broad and its boundaries are not clearly defined, any lecture on the physics of complex systems will inevitably depend on the specific background of the lecturer. The present lecture is rooted in the concepts of Statistical Physics. We will start with the notion of thermal equilibrium and then turn to the non-equilibrium case.

When writing lecture notes it is practically impossible to avoid mistakes. Please help me to improve these lecture notes by reporting mistakes and possible improvements by email

(hinrichsen at physik uni-wuerzburg de).


Thank you!

Haye Hinrichsen
Würzburg, summer term 2014


Contents

1. Many-particle systems on a lattice
   1.1. Classical cartoon of complex systems
   1.2. The exclusion process on a one-dimensional lattice

2. Equilibrium
   2.1. Entropy as an information measure
   2.2. Entropy in Statistical Physics
   2.3. Thermostatics with conserved quantities and reservoirs
   2.4. Conserved quantities and external reservoirs

3. Systems out of equilibrium
   3.1. Dynamics of subsystems
   3.2. Entropy production
   3.3. Fluctuation theorem
   3.4. Heat and work

4. Nonequilibrium phase transitions
   4.1. Directed percolation
        4.1.1. Directed bond percolation on a lattice
        4.1.2. Absorbing states and critical behavior
        4.1.3. The Domany-Kinzel cellular automaton
        4.1.4. The contact process
        4.1.5. The critical exponents η, η′, ν⊥, and ν‖
        4.1.6. Scaling laws
        4.1.7. Universality
        4.1.8. Langevin equation
        4.1.9. Multifractal properties of currents on directed percolation clusters
        4.1.10. Characterizing non-equilibrium transitions by Yang-Lee zeroes in the complex plane
   4.2. Other classes of absorbing phase transitions
        4.2.1. Parity-conserving particle processes
        4.2.2. The voter universality class
        4.2.3. Absorbing phase transitions with a conserved field
        4.2.4. The diffusive pair contact process
   4.3. Epidemic spreading with long-range interactions
        4.3.1. Immunization and mutations
        4.3.2. Long-range infections
        4.3.3. Incubation times
   4.4. Surface growth and non-equilibrium wetting

5. Equilibrium critical phenomena and spin glasses
   5.1. The Ising model
   5.2. Ising phase transition
   5.3. Numerical simulation of the Ising model
   5.4. Continuum limit of the Ising model
   5.5. Spin glasses

6. Neural networks
   6.1. Biological background
   6.2. Magnetically inspired neural networks
   6.3. Hierarchical networks

A. Mathematical details
   A.1. Perron-Frobenius Theorem
   A.2. Tensor products
        A.2.1. The physical meaning of tensor products


1. Many-particle systems on a lattice

1.1. Classical cartoon of complex systems

Configuration space

In the following let us consider an arbitrary physical system which consists of a large number of interacting building blocks. There is a large variety of such systems, including multi-particle systems such as gases and solids as well as more macroscopic phenomena such as granular flow or the dynamics of brain cells.

Ultimately every physical system follows the laws of quantum physics, which is the most fundamental physical theory of today. However, as we cannot even solve the Helium atom exactly within the framework of quantum mechanics, such a task will be practically impossible in the case of interacting complex systems. Therefore, we need a simplified description which is still able to account for the most salient features of complex systems.

It is well-known that the laws of quantum mechanics have far-reaching consequences. This includes the quantization of states in bound systems and the emergence of new phenomena such as quantum entanglement. However, usually these quantum features are not visible in our macroscopic world. The deep reason for the apparent classical behavior on macroscopic scales is the phenomenon of decoherence caused by ongoing interaction of the system with the environment. Roughly speaking, the environment permanently ‘measures’ the system, carrying away its quantum information and thereby destroying quantum effects in the system itself.

In many cases it is therefore sufficient to model a complex quantum-mechanical system as a classical one. Nevertheless this classical cartoon inherits some of the quantum-mechanical features. One of them is the assumption that the states of the system, which are now interpreted as classical configurations, are in some sense quantized so that they can be thought of as being discrete. Another one is that the dynamics between these states is of stochastic nature, i.e. the system jumps spontaneously from one state to the other, just as it would happen in quantum mechanics according to Fermi’s golden rule.

As an example let us consider molecular beam epitaxy (MBE), an experimental technique which is frequently used e.g. in Prof. Molenkamp’s lab. In such experiments one exposes a solid-state surface in a UHV chamber to a beam of incident particles evaporating from a thermal source. Some of these atoms land on the surface, forming a deposition layer. The actual microscopic processes depend on various parameters such as the temperature and the involved materials. Typically the deposited atoms (called adatoms) diffuse for some time on the surface until they find another adatom, forming the nucleus of an immobile deposition layer. This is shown schematically on the left side of Fig. 1.1.

Figure 1.1.: Cartoon of an MBE experiment. Left: The physical system is modeled as a substrate with a lattice structure on which certain atoms, here represented as colored cubes, are deposited and removed by evaporation according to specific dynamical rules. Right: Each classical configuration of the model can be thought of as a microstate (red dot) in a huge configuration space denoted as Ωsys.

With advanced microscopy techniques it is possible to track the motion of individual adatoms in real time. It turns out that the motion is discontinuous, i.e. the adatoms jump instantaneously from a given position on the lattice to a neighboring one. Moreover, these jumps occur spontaneously, similar to the clicks of a Geiger counter, indicating that the events of jumping are totally random. In fact, the jumps are not caused by quantum-mechanical tunneling, rather they are thermally induced by lattice vibrations. Since thermal fluctuations are fully chaotic, they can be considered as some kind of random noise, triggering diffusive moves of the adatoms every now and then.

In a minimalistic model one would of course not incorporate the chaotic thermal vibrations of the substrate in all detail; instead one would restrict the description of the model to the specific configurations of the adatoms with certain probabilistic transition rules. To this end we first have to specify the set of all possible configurations of the system, which will be denoted as Ωsys. This is shown schematically in the right panel of Fig. 1.1, where the configurations are represented as red dots.

The example of MBE nicely illustrates that the precise meaning of a configuration depends on the chosen level of abstraction in a given model. With respect to this level of abstraction, a configuration specifies the actual state of the system at a given time t in all detail. Such a configuration, which accounts for all details on the chosen level, is often referred to as a microstate of the system. However, many authors simply use the term ‘state’ instead of ‘microstate’ or ‘configuration’. This can lead to confusion since the term ‘state’ is also used for probability distributions and sometimes for macroscopic states such as (p, T) in thermodynamics. Therefore, we prefer to use the term ‘configuration’ throughout these lecture notes.

Summary: In these lecture notes we use the following terms in the following sense:

configuration:     All details about a microscopic configuration of a given model
microstate:        Same as ‘configuration’
Ωsys:              Configuration space = set of all possible configurations
state:             Probability distribution (measure) on configurations
thermodyn. state:  Set of macroscopic thermodynamical variables such as (p, T)


Stochastic dynamics

Having characterized the configuration space, we have to find a suitable formulation for the dynamics of the system, i.e. we need appropriate rules for how the system evolves in time. As outlined before, a large class of complex systems evolves by instantaneous jumps from one configuration to the other. Denoting the individual configurations by c ∈ Ωsys, such a jump (sometimes also called a microscopic transition) can be denoted by c → c′.

A particular transition c → c′ in a microscopic transition network.

The microscopic transitions give rise to a transition network in the configuration space. Note that in realistic systems this network is far from being fully connected because jumps between very different configurations are usually impossible. For example, in MBE it will never happen that 10 adatoms hop simultaneously in one direction. In fact, assuming instantaneous jumps it is clear that only a single atom can move at a given time since the probability that two random time intervals of size zero coincide vanishes. This means that the transition network is usually very sparse.

Moreover, the likelihood of different microscopic transitions may be different, meaning that any microscopic transition c → c′ occurs randomly with a certain individual rate wc→c′, where a zero rate indicates that the transition does not take place at all. Note that in general rates may be different in opposite directions, i.e.

\[ w_{c\to c'} \neq w_{c'\to c} \,. \tag{1.1} \]

At this point it is important to understand the difference between a probability p and a rate w. While a probability is a numerical value p ∈ [0, 1], a rate w ∈ ℝ₀⁺ is defined as a probability per unit time and thus carries the dimension [time]⁻¹. A rate can be interpreted as follows: If dt denotes an infinitesimal time span, the probability for the transition c → c′ to happen just within this time span is given by wc→c′ dt.

In principle the rates could vary as a function of time and they could also depend on the specific history of the evolution. However, in what follows we will assume that the system under consideration has no memory. This is the so-called Markov assumption, stating that the future evolution of the system does not (statistically) depend on the history but only on the actual configuration of the system. Unless stated otherwise, all systems considered in this lecture are Markov processes which fulfill this assumption. Moreover, we will usually assume that the rates do not depend on time.

Summary: The definition of a complex stochastic system requires:

• the definition of a configuration space Ωsys
• the definition of a set of transition rates wc→c′ > 0.

Ergodicity

A system with a transition network, where each configuration can be reached from any other configuration, is called ergodic. This means that the transition network does not decompose into separate disconnected pieces.

Figure 1.2.: Configuration space of a system with five sites (see below) which can be either empty or occupied by a particle. If the dynamics conserves the number of particles the total space of 32 configurations decomposes into six sectors with fixed particle number. Transitions within each sector are allowed, while transitions between different sectors are forbidden by the conservation law.

Such a decomposition emerges naturally in systems with conserved quantities. For example, let us consider a system that conserves the total number of particles (see Fig. 1.2). This means that transitions are only possible between configurations with the same particle number. Obviously such a system is not fully ergodic; instead it decomposes into a collection of dynamical sectors labeled by different particle numbers.

Another situation emerges in the presence of one-way transitions, i.e. wc→c′ > 0 while wc′→c = 0. The presence of such transitions does not automatically break ergodicity. However, it may happen that certain configurations (or sets of configurations) can be reached but cannot be left. Such trapping states or sectors, which will also play an important role later in these lectures, are called absorbing.

Finally, it is possible that a transition network decomposes only effectively into several parts. In this case the parts are still mutually connected, but the likelihood for the system to go from one part to the other tends to zero. This happens, for example, in a ferromagnet: Here the macroscopic states of positive and negative magnetization are fully symmetric to each other. However, the system stays in one of the magnetization directions because it can only flip if all internal degrees of freedom flip almost simultaneously. This means that a magnetization flip is in principle possible but highly unlikely so that the two subspaces of the configuration space are effectively disconnected. This happens only in systems with a virtually infinite configuration space (the so-called thermodynamic limit) and plays an important role in the context of spontaneous symmetry breaking.

Probability distribution and Master equation

So far we have seen that a classical stochastic complex system is given in terms of a set Ωsys of possible configurations c ∈ Ωsys. The system evolves in time by instantaneous transitions c → c′ which occur spontaneously like a radioactive decay with certain rates wc→c′ ≥ 0. The set of all configurations, the transition rates, and the initial configuration fully define the stochastic process under consideration.


As the time evolution is stochastic, it is of course impossible to predict the actual sequence of transitions, i.e., the stochastic trajectory is completely unpredictable. In fact, the only quantities that can be predicted in a stochastic process are probabilities. In this context an important object to study is the probability Pc(t) to find the system at time t in a certain configuration c. Obviously this probability distribution evolves continuously in time. Moreover it has to be normalized at any time, i.e.,

\[ \sum_c P_c(t) = 1 \quad \forall\, t \,. \tag{1.2} \]

For a given set of rates one can easily figure out how this probability distribution evolves in time. On the one hand, the probability Pc(t) will decrease due to the outgoing microscopic transitions from c to somewhere else, and this loss will be proportional to the sum of the corresponding rates ∑c′ wc→c′ (the so-called escape rate) and proportional to the probability Pc(t) itself. On the other hand, the probability Pc(t) will increase due to the incoming microscopic transitions, and this gain will be proportional to the corresponding rates wc′→c times the probability Pc′(t) to find the system in the configuration from where the transition originates. Collecting all these loss and gain contributions, one arrives at a system of linear differential equations. This system of equations, called master equation, is reminiscent of a continuity equation and describes the probability flow between different configurations in terms of gain and loss contributions:

\[ \partial_t P_c(t) = \underbrace{\sum_{c'} w_{c'\to c}\, P_{c'}(t)}_{\text{gain terms}} \;-\; \underbrace{\sum_{c'} w_{c\to c'}\, P_c(t)}_{\text{loss terms}} \,. \tag{1.3} \]

In this equation the gain and loss terms balance one another so that the normalization condition (1.2) is preserved as time proceeds. Note that the factor Pc(t) in the second term can be pulled out in front of the sum.

It is important to note that the rates wc→c′ carry the unit [time]⁻¹. Unlike probabilities, the numerical value of a rate depends on the unit of time and may be larger than 1. Rescaling all rates of a system by the same factor simply results in a change of the time scale. For example, multiplying all rates by 2 would mean that the whole process is simply running twice as fast.
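To see the master equation at work, the following minimal C sketch (with made-up rates for a toy system of three configurations) integrates Eq. (1.3) by simple Euler steps and checks that the normalization (1.2) is preserved:

#include <stdio.h>

#define N 3   /* number of configurations of the toy system */

int main(void)
{
    /* hypothetical rates w[c][cp] for the transition c -> cp (diagonal unused) */
    double w[N][N] = {{0.0, 1.0, 0.5},
                      {0.2, 0.0, 0.7},
                      {0.9, 0.3, 0.0}};
    double P[N] = {1.0, 0.0, 0.0};   /* initially the system is in configuration 0 */
    double dt = 1e-3;

    for (int step = 0; step < 10000; ++step) {
        double dP[N] = {0};
        for (int c = 0; c < N; ++c)
            for (int cp = 0; cp < N; ++cp) {
                dP[c] += w[cp][c] * P[cp];   /* gain terms of Eq. (1.3) */
                dP[c] -= w[c][cp] * P[c];    /* loss terms of Eq. (1.3) */
            }
        for (int c = 0; c < N; ++c) P[c] += dt * dP[c];
    }
    printf("P = (%f, %f, %f), norm = %f\n",
           P[0], P[1], P[2], P[0] + P[1] + P[2]);
    return 0;
}

The printed norm stays equal to 1 because gain and loss contributions cancel in the column sums of the generator.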

Formal properties of the master equation

Eigenmode decomposition: Let N = |Ωsys| be the number of configurations and let us enumerate the configurations by c1, . . . , cN in a specific order. The probability distribution may be thought of as a list Pc1(t), . . . , PcN(t) of N time-dependent non-negative functions which sum up to 1. Let us now interpret this list of probabilities as a column vector in some linear vector space V, using the Dirac notation |Pt⟩ ∈ V:

\[ |P_t\rangle = \big( P_{c_1}(t), \ldots, P_{c_N}(t) \big)^T \,. \tag{1.4} \]

Since the probability distribution Pc(t) is referred to as the state of the system, the vector |Pt⟩ is called the state vector. Likewise, the vector space V is denoted as the state space of the system. Note that a state (i.e. a vector in V) represents an ensemble of a large number of configurations with individual probabilities.

The dimension of the vector space V is equal to the number of configurations of the system, i.e. it is generally an extremely high-dimensional space, comparable with a Hilbert space in quantum physics. However, in contrast to quantum mechanics, V = ℝᴺ is a real vector space. In addition, its vectors are only physically meaningful if all components in the canonical representation are positive and sum up to 1. This means that the physical state space is actually a small subset of ℝᴺ in the positive hyperquadrant which has the form of a convex simplex.

Since the master equation is a linear system of differential equations, it is possible to write it in the compact form

∂t|Pt〉 = −L|Pt〉 , (1.5)

with a certain linear operator L, where the minus sign is introduced as a matter of convenience, as will be explained below. The operator L is the so-called Liouville operator or Liouvillian which generates the temporal evolution of the system.

In order to represent the Liouvillian as a matrix we have to define a suitable basis. The most natural choice is the so-called canonical configuration basis, defined by the unit vectors

\[ |c_1\rangle = (1, 0, 0, \ldots, 0)^T, \quad |c_2\rangle = (0, 1, 0, \ldots, 0)^T, \quad \ldots, \quad |c_N\rangle = (0, 0, 0, \ldots, 1)^T \,. \tag{1.6} \]

In the canonical configuration basis the Liouvillian is defined by the matrix elements

\[ \langle c'|L|c\rangle = -w_{c\to c'} + \delta_{c,c'} \sum_{c''} w_{c\to c''} \,. \tag{1.7} \]

Obviously, a formal solution of this first-order differential equation is given by

|Pt〉 = exp(−Lt)|P0〉, (1.8)

where exp(−Lt) is the matrix exponential function and |P0⟩ denotes the initial probability distribution at t = 0, the so-called initial state.

Recall: Matrix exponential function:
The exponential function e^A of a matrix A is defined by the usual power series \( e^A = \sum_k \frac{1}{k!} A^k \) or by the product representation \( e^A = \lim_{n\to\infty} (1 + A/n)^n \). Technically the matrix exponential function is most easily computed if one chooses the eigenbasis in which the operator A is diagonal. In this basis one can compute e^A simply by applying the exponential function to each diagonal element separately.
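In formulas: if A can be diagonalized as A = S D S⁻¹ with D = diag(λ₁, …, λ_N), the power series gives

\[ e^A = \sum_k \frac{1}{k!} \big( S D S^{-1} \big)^k = S \Big( \sum_k \frac{1}{k!} D^k \Big) S^{-1} = S\, \mathrm{diag}\big( e^{\lambda_1}, \ldots, e^{\lambda_N} \big)\, S^{-1} \,. \]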

This allows us to express the solution |Pt⟩ as a sum over exponentially varying eigenmodes. To this end we diagonalize the Liouville operator, solving the eigenvalue problem

L|k〉 = λk|k〉 . (1.9)

Expanding the initial state as a linear combination of these eigenvectors by

\[ |P_0\rangle = \sum_k a_k |k\rangle \tag{1.10} \]


with certain coefficients ak, the formal solution can be written as

\[ |P_t\rangle = e^{-Lt} \sum_k a_k |k\rangle = \sum_k a_k\, e^{-Lt} |k\rangle = \sum_k a_k\, e^{-\lambda_k t} |k\rangle \,. \tag{1.11} \]

This is the so-called eigenmode decomposition of the master equation.

Remark: Comparison with quantum mechanics
In quantum mechanics the situation is similar. Here we also have a linear evolution equation, namely the Schrödinger equation \( i\hbar\, \partial_t |\psi_t\rangle = H|\psi_t\rangle \). This equation can be solved formally by \( |\psi_t\rangle = \exp(-\frac{i}{\hbar} H t)|\psi_0\rangle \). Diagonalizing the Hamiltonian by \( H|n\rangle = E_n|n\rangle \) and expanding the initial state by \( |\psi_0\rangle = \sum_n a_n |n\rangle \), the general solution can be written as \( |\psi_t\rangle = \sum_n a_n e^{-\frac{i}{\hbar} E_n t}|n\rangle \), known as the eigenmode decomposition of the Schrödinger evolution equation.

Probability conservation: The gain and loss terms in the master equation correspond to the non-diagonal and diagonal matrix elements of the Liouville operator, respectively. As the Liouville operator was introduced together with a minus sign in front, its non-diagonal elements are always negative while its diagonal elements are positive (cf. Eq. (1.7)).

As mentioned above, the gain and loss terms balance one another in order to preserve the normalization of the probability distribution. To see this in the vector notation, let us introduce the row vector

〈1| = (1, 1, 1, . . . , 1) . (1.12)

Using this vector the normalization condition (1.2) can then be rewritten in the simple form

〈1|Pt〉 = 1 ∀t . (1.13)

This immediately implies that

\[ \langle 1| L = 0 \,, \tag{1.14} \]

i.e. the sum over all columns in the matrix of the Liouville operator is zero. In the mathematical literature such matrices (more precisely, their negatives −L) are known as stochastic operators or intensity matrices, having the property that all off-diagonal entries are real and non-negative and that the sum over each column of the matrix vanishes.

In contrast to a quantum-mechanical Hamiltonian, the Liouville operator of a stochastic system is in general neither Hermitean nor symmetric. Consequently the eigenvalues of an intensity matrix may be complex, indicating oscillatory behavior,¹ but one can show that their real part is always non-negative. This ensures that all eigenmodes are either stationary or decaying exponentially with time:

\[ |P(t)\rangle = \sum_k a_k\, e^{-\lambda_k t} |k\rangle \,, \qquad \mathrm{Re}(\lambda_k) \ge 0 \,. \tag{1.15} \]

Moreover, the eigenvectors of a non-symmetric matrix are not necessarily pairwise orthogonal, yet they can be used as a basis, justifying the decomposition in Eq. (1.10). In addition, the left and right eigenvectors (row and column eigenvectors) of L do not have the same components. In quantum mechanics we are used to obtaining the adjoint left eigenvector simply by taking the complex conjugate components of the right eigenvector. For a Liouvillian, these components of left and right eigenvectors are generally unrelated.

¹As will be discussed in Chapter xxx, the possibility of complex eigenvalues plays an important role in the context of chemical oscillations.

Intensity matrices obey the Perron-Frobenius theorem (see Appendix A.1). This theorem tells us that an intensity matrix has at least one eigenvector with the eigenvalue zero, and that the components of this vector behave like probabilities. If the eigenvalue zero is non-degenerate, this is the only state in the expansion (1.15) which survives in the limit t → ∞. This means that any system with a finite configuration space relaxes exponentially into a time-independent state |Pstat⟩ with

∂t|Pstat〉 = −L|Pstat〉 = 0 . (1.16)

This is the so-called stationary state, which is also denoted as |Ps⟩, |Pstat⟩ or |P∞⟩. Note that the term ‘stationary’ does not mean that the dynamics of the system is frozen; rather, the system may continue to jump between different configurations. Stationarity rather means that our knowledge about the system, the probability distribution Pc(t), no longer depends on time. As we will discuss in the following chapter, stationarity must not be confused with thermal equilibrium, which turns out to be a much stronger constraint.

Solving a stochastic process basically means diagonalizing its Liouvillian. To obtain such a complete solution is often very difficult. Sometimes it is already useful to determine only the lowest-lying eigenvector, which is the stationary state. In fact, most of the exact solution methods presented throughout the remainder of this chapter are solely concerned with finding the stationary state. In addition, it is sometimes of interest to find the second eigenstate with the smallest non-vanishing real part. The corresponding eigenvalue can be interpreted as the longest time scale, determining the asymptotic relaxation properties.

As mentioned before, the Liouvillian is generally not symmetric and thus the components of its left and right eigenvectors are generally different. The example of the stationary state nicely illustrates this difference between right and left eigenvectors. The stationary distribution of the system is given by the components of the right eigenvector, which is usually non-trivial to compute. What does the corresponding left eigenvector look like? The answer is very simple: If the spectrum is non-degenerate, it has to be the vector ⟨1| because this is by definition a left eigenvector with eigenvalue zero:

L|P∞〉 = 0 ⇔ 0 = 〈1|L . (1.17)

Remark: Comparing quantum mechanics and stochastic dynamics

Quantum theory                                    Stochastic Markov processes
complex Hilbert space Cⁿ                          real probability space Rⁿ
complex amplitude vectors |ψ⟩                     real probability vectors |P⟩
quadratic normalization ⟨ψ|ψ⟩ = 1                 linear normalization ⟨1|P⟩ = 1
unitary evolution ∂t|ψt⟩ = (1/iℏ) H|ψt⟩           probability-conserving evolution ∂t|Pt⟩ = −L|Pt⟩
eigenvalue problem H|φ⟩ = E|φ⟩                    eigenvalue problem L|φ⟩ = λ|φ⟩
energy E ∈ R                                      relaxation time 1/λ, Re[λ] ≥ 0
ground state ⇔ lowest E                           stationary state ⇔ λ = 0


Figure 1.3.: Biased random walk in one dimension with a wall on the left side.

Example: Biased random walk on a one-dimensional chain

Probably the simplest example of a stochastic Markov process in the framework described above is a (biased) random walk on a one-dimensional lattice. In this model a single particle (the random walker) is located at site n ∈ ℕ₀ of a one-dimensional lattice. As time evolves it jumps spontaneously to site n + 1 with rate wR and to site n − 1 with rate wL (see Fig. 1.3). If wL = wR the random walk is symmetric, otherwise it is said to be biased in one direction.

By constraining the dynamics to non-negative integers we introduce some kind of “wall” at n = 0, where the random walker is reflected. As we will see below, this guarantees the existence of a stationary state, provided that the walk is biased towards the wall.

Remark: Simulation on a computer:
The total rate for a jump in either direction is wL + wR. Therefore, the time elapsing between consecutive jumps is on average given by τ = 1/(wL + wR). Actually these events do not happen at regular time intervals, rather they occur randomly as in a radioactive decay. This means that the time intervals ∆t between consecutive events are exponentially distributed, as for the events of a Poisson process (also known as shot noise), i.e.

\[ P(\Delta t) \propto \frac{1}{\tau}\, e^{-\Delta t/\tau} \,. \tag{1.18} \]

On a computer (see lecture on “Computational Physics”) such time intervals can simply be generated by setting

∆t := −τ ln(r) , (1.19)

where r is a uniformly distributed random number between 0 and 1. This leads us to the following update procedure (written here in C/C++ style):

if (rnd() < wr/(wr+wl)) n++;     /* jump to the right with probability wR/(wR+wL) */
else if (n > 0) n--;             /* jump to the left, reflected at the wall n = 0 */
t += -log(rnd())/(wl+wr);        /* advance time by an exponential waiting time  */

Here double rnd(void) is a standard random number generator returning uniform values between 0 and 1, int n is the position, double t is the actual physical time, and double wl, wr are the rates for jumps to the left and to the right. Take care that random numbers equal to zero are excluded since otherwise the logarithm would diverge.
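For concreteness, here is a minimal self-contained version of this simulation (a sketch: the parameter values are placeholders, rnd() is implemented with the standard rand() for simplicity, and the reflecting wall at n = 0 is realized by rejecting jumps to the left). It estimates the stationary occupation probabilities from the fraction of time spent at each site, which can be compared with the geometric distribution derived below:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define NMAX 100

/* uniform random number in (0,1]; zero is excluded so that log() stays finite */
static double rnd(void) { return (rand() + 1.0) / ((double)RAND_MAX + 1.0); }

int main(void)
{
    double wl = 1.0, wr = 0.8;        /* bias towards the wall: b = wr/wl < 1 */
    double t = 0.0, tmax = 1e6;
    double hist[NMAX] = {0};          /* residence time accumulated at each site */
    int n = 0;

    while (t < tmax) {
        double dt = -log(rnd()) / (wl + wr);   /* exponential waiting time, Eq. (1.19) */
        if (n < NMAX) hist[n] += dt;
        t += dt;
        if (rnd() < wr/(wr + wl)) n++;         /* jump to the right */
        else if (n > 0) n--;                   /* jump to the left, reflected at n = 0 */
    }
    for (int k = 0; k < 10; ++k)               /* time-averaged occupation probabilities */
        printf("P[%d] = %f\n", k, hist[k] / t);
    return 0;
}

For wr/wl = 0.8 the printed values should approach Pk = (1 − b) bᵏ with b = 0.8, cf. Eq. (1.27) below.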

The master equation for this process takes the form

\[ \partial_t P_t(n) = \underbrace{w_R\, P_t(n-1)}_{\text{gain coming from left}} + \underbrace{w_L\, P_t(n+1)}_{\text{gain coming from right}} - \underbrace{(w_L + w_R)\, P_t(n)}_{\text{loss terms}} \,. \tag{1.20} \]

As for the compact vector notation, we first note that there are infinitely many possible configurations n ∈ ℕ₀, hence the corresponding vector space of probability vectors is infinite-dimensional. In the canonical configuration basis {0, 1, 2, . . .} the Liouvillian has the matrix elements

Ln,n′ = (wL + wR)δn,n′ − wRδn,n′+1 − wLδn,n′−1 (1.21)

or, in matrix notation

\[ L = \begin{pmatrix}
w_R & -w_L & & & \\
-w_R & w_L + w_R & -w_L & & \\
& -w_R & w_L + w_R & -w_L & \\
& & \ddots & \ddots & \ddots
\end{pmatrix} \,. \tag{1.22} \]

By construction, the vector ⟨1| is a left eigenvector to the eigenvalue zero. As discussed above, the components of the corresponding right eigenvector are different and can be computed recursively as follows. Assume that the first component is given by some number P0. Then the first line of the matrix tells us that

wRP0 − wLP1 = 0 , (1.23)

hence P1 = bP0 with b = wR/wL. As can be verified easily, the following lines provide a recursion relation

Pk = bPk−1 (1.24)

with the closed solution

\[ P_k = P_0\, b^k \,. \tag{1.25} \]

In order to interpret these components as probabilities, they need to be normalized. This can be done by computing the geometric series

\[ 1 = \sum_{k=0}^{\infty} P_k = P_0 \sum_{k=0}^{\infty} b^k = \frac{P_0}{1-b} \,, \tag{1.26} \]

fixing the value of P0. This leads to the result

\[ |P\rangle = \big( P_0, P_1, P_2, \ldots \big)^T \,, \qquad P_k = (1-b)\, b^k \,. \tag{1.27} \]

The right eigenvector with these components just describes the stationary state of the process. As expected, this result is only meaningful for b < 1, where the random walk is biased to move towards zero.

Remark: For b = 1 (unbiased case) one obtains one half of a Gaussian distribution which spreads continuously so that the width increases as √t. For b > 1 (biased to the right) one obtains a full Gaussian which spreads as √t and moves away from the origin at constant velocity. In both cases the solution is not stationary.

1.2. The exclusion process on a one-dimensional lattice

In this lecture we are mainly concerned with stochastic many-particle systems on a discrete lattice. The lattice consists of a finite or infinite number of sites which are arranged in a certain lattice geometry (see e.g. Fig. 1.4). For the purpose of this lecture, we will be most concerned with one-dimensional chains of sites.

Figure 1.4.: Various examples of two-dimensional lattice structures: square, triangular, and honeycomb lattices.

Each site can be vacant or occupied by one or several particles. Models in which the number of particles per site is unlimited are often referred to as bosonic models. In contrast, models with a restriction of the particle number per site are called fermionic. The different species of particles are usually enumerated by capital letters A, B, C, and so forth. In the simplest case, there is only one species of particles involved, denoted as A, and the lattice sites are restricted to carry at most one particle. This means that each site can be only in one of two local states, namely, vacant (∅) or occupied by a particle (A).

Many-particle diffusion: The Simple Exclusion Process (SEP)

Probably the simplest example of a non-trivial many-particle system is the so-called simple exclusion process (SEP). This model describes just many diffusing particles on a lattice. However, the individual random walks of these particles are not totally independent since it is assumed that each site is occupied by at most one particle. In other words, the particles ’exclude each other’, which explains the name of the model.

This means that the exclusion process falls into the simplest class of lattice models, where each site can be in only two different configurations, namely, vacant or occupied by a particle, denoted as ’∅’ and ’A’.

On a finite one-dimensional chain with L sites the exclusion principle ensures that the configuration space of the model is finite, containing 2^L different configurations which may be represented in the same way as binary numbers:

∅∅ . . . ∅∅∅    empty lattice
∅∅ . . . ∅∅A
∅∅ . . . ∅A∅
∅∅ . . . ∅AA
∅∅ . . . A∅∅
. . .
AA . . . AAA    fully occupied lattice

Figure 1.5.: Simple exclusion process with two lattice sites. In the figure shown above site 1 is vacant while site 2 is occupied by a particle A, i.e., the system is currently in the configuration “∅A”. In addition, the system may be connected to two external reservoirs. At site 1 particles may enter from the left reservoir at rate α while particles at site 2 may leave the system at rate β, moving to the right reservoir.

Remark: Numerical bit coding techniques
The possibility to enumerate all configurations by binary numbers can be exploited as a very efficient implementation on a computer. For example, an unsigned long int allows us to describe a chain with L = 64 sites. This so-called bit-coding is very efficient since the processor can handle 64 bits in parallel. However, the dynamical rules have to be implemented by bit manipulations which are not so easy to code for inexperienced programmers. For the purpose of this lecture it is much simpler to use a simple static array int s[L], storing the values 0 (vacant) and 1 (occupied).
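For illustration, a minimal sketch of such bit manipulations (the helper names occupied, deposit, and evaporate are our own, purely for illustration):

#include <stdio.h>

typedef unsigned long long config;   /* one bit per lattice site, L <= 64 */

int    occupied(config c, int i)  { return (c >> i) & 1ULL; }   /* test site i  */
config deposit(config c, int i)   { return c |  (1ULL << i); }  /* set bit i    */
config evaporate(config c, int i) { return c & ~(1ULL << i); }  /* clear bit i  */

int main(void)
{
    config c = 0;                    /* empty lattice */
    c = deposit(c, 0);
    c = deposit(c, 3);
    printf("site 3 occupied: %d\n", occupied(c, 3));
    c = evaporate(c, 3);
    printf("site 3 occupied: %d\n", occupied(c, 3));
    return 0;
}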

The dynamics of the exclusion process is defined in such a way that every particle jumps to the right (left) with rate wR (wL), provided that the target site is empty. This may be written symbolically as the following microscopic transition rules:

\[ A\emptyset \;\xrightarrow{\,w_R\,}\; \emptyset A \tag{1.28} \]
\[ \emptyset A \;\xrightarrow{\,w_L\,}\; A\emptyset \tag{1.29} \]

Special case of a chain with two sites: For a system with only two sites (L = 2) the Liouville operator, represented in the configuration basis (∅∅, ∅A, A∅, AA), would be given by

\[ L^{(2)} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & w_L & -w_R & 0 \\ 0 & -w_L & w_R & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \,. \tag{1.30} \]

This matrix has a three-fold degenerate eigenvalue 0, meaning that the stationary state of the system is not unique. This is of course not surprising since the dynamical rules preserve the total number of particles. Therefore, in a system with only two sites the configuration space decomposes into three decoupled dynamical sectors, namely with zero, one, and two particles.

In addition to the zeros, the matrix (1.30) has a single non-zero eigenvalue λ = wL + wR, describing a relaxation mode. The corresponding eigenstate has to belong to the sector with one particle because the empty and the fully occupied lattice are frozen configurations. In the one-particle sector the dynamics relaxes into a stationary state where the system flips randomly between A∅ and ∅A. The normalized probability to find the system in one of these configurations is given by

\[ P_{\rm stat}(A\emptyset) = \frac{w_L}{w_L + w_R} \,, \qquad P_{\rm stat}(\emptyset A) = \frac{w_R}{w_L + w_R} \,. \tag{1.31} \]


Obviously, if we decided to multiply both rates with the same number the system would simply switch faster between the two configurations, but the stationary probabilities (1.31) would remain unaffected. In fact, rescaling all rates of a given model by a common factor just changes the relaxation time scale while the stationary properties remain invariant. In the literature this freedom is often used to set one of the rates to 1. Another common choice is to choose reciprocal rates, i.e.

\[ w_R = q \,, \qquad w_L = q^{-1} \,. \tag{1.32} \]

A SEP with external particle input and output: Let us now couple the system to two external particle reservoirs, as sketched in Fig. 1.5. Instead of modeling the reservoirs explicitly, they are implemented as spontaneous particle creation and removal processes at the boundaries, i.e., particles can enter the system at the leftmost site at rate α and may leave the system from the rightmost site at rate β. In a two-site system, this amounts to adding the transitions

∅∅→ A∅ with rate α

∅A→ AA with rate α

∅A→ ∅∅ with rate β

AA→ A∅ with rate β .

Note that in a two-site system each process appears twice since the site which is not involved can be either vacant or occupied. In order to incorporate these rules into the Liouvillian, the two-site interaction matrix has to be extended by appropriate boundary terms, namely by

\[ L = L^{(2)} + A^{(2)} + B^{(2)} \tag{1.33} \]

where

\[ A = \begin{pmatrix} \alpha & 0 & 0 & 0 \\ 0 & \alpha & 0 & 0 \\ -\alpha & 0 & 0 & 0 \\ 0 & -\alpha & 0 & 0 \end{pmatrix} \,, \qquad B = \begin{pmatrix} 0 & -\beta & 0 & 0 \\ 0 & \beta & 0 & 0 \\ 0 & 0 & 0 & -\beta \\ 0 & 0 & 0 & \beta \end{pmatrix} \tag{1.34} \]

describe particle entry and exit at the left and right boundary, respectively. Adding all contributions, the Liouvillian for the system shown in Fig. 1.5 reads

\[ L = \begin{pmatrix} \alpha & -\beta & 0 & 0 \\ 0 & w_L + \alpha + \beta & -w_R & 0 \\ -\alpha & -w_L & w_R & -\beta \\ 0 & -\alpha & 0 & \beta \end{pmatrix} \,. \tag{1.35} \]

Unfortunately, the eigenvalues of this 4×4 matrix cannot be computed easily. In particular, there are no longer three different stationary states because the ongoing particle influx from the left reservoir and the exit into the right one break particle conservation, thereby mixing the three sectors mentioned above. Consequently, the system has only one stationary state, namely,

\[ |P_{\rm stat}\rangle = \frac{1}{N} \big( \beta^2 w_R,\; \alpha\beta w_R,\; \alpha\beta(\alpha + \beta + w_L),\; \alpha^2 w_R \big)^T \,, \tag{1.36} \]


where the normalization constant is given by

\[ N = \alpha\beta(\alpha + \beta + w_L) + w_R(\alpha^2 + \alpha\beta + \beta^2) \,. \tag{1.37} \]

Similarly we could construct the Liouvillian of a three-site chain by hand. Using the canonical configuration basis

∅∅∅, ∅∅A, ∅A∅, ∅AA, A∅∅, A∅A, AA∅, AAA

we would arrive at the matrix

\[ L = \begin{pmatrix}
\alpha & -\beta & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & w_L{+}\alpha{+}\beta & -w_R & 0 & 0 & 0 & 0 & 0 \\
0 & -w_L & w_L{+}w_R{+}\alpha & -\beta & -w_R & 0 & 0 & 0 \\
0 & 0 & 0 & w_L{+}\alpha{+}\beta & 0 & -w_R & 0 & 0 \\
-\alpha & 0 & -w_L & 0 & w_R & -\beta & 0 & 0 \\
0 & -\alpha & 0 & -w_L & 0 & w_L{+}w_R{+}\beta & -w_R & 0 \\
0 & 0 & -\alpha & 0 & 0 & -w_L & w_R & -\beta \\
0 & 0 & 0 & -\alpha & 0 & 0 & 0 & \beta
\end{pmatrix} \tag{1.38} \]

Formal construction of 1D lattice models: Tensor products

In practice it is rather inconvenient to construct such a matrix by hand. Therefore, it is useful to understand the general structure of a Liouvillian on a chain in terms of tensor products and then to automate its construction, using e.g. Mathematica®.

The tensor product of vector spaces, which is introduced in detail in Appendix A.2, works as follows. Suppose that U and V are two vector spaces with dimensions n and m. The tensor product combines these two vector spaces in such a way that the resulting vector space is n·m-dimensional, i.e., the dimension is multiplicative under the formation of the tensor product. This operation, denoted as ⊗, can be applied to both vectors and matrices. In Euclidean coordinate systems, the components of the resulting object are given by the product of all combinations of the components of the tensor factors. For example, for two vectors we have

\[ \begin{pmatrix} a \\ b \end{pmatrix} \otimes \begin{pmatrix} c \\ d \end{pmatrix} = \begin{pmatrix} ac \\ ad \\ bc \\ bd \end{pmatrix} \tag{1.39} \]

and likewise for matrices

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae & af & be & bf \\ ag & ah & bg & bh \\ ce & cf & de & df \\ cg & ch & dg & dh \end{pmatrix} \,. \tag{1.40} \]

In the context of one-dimensional lattices, we will associate with each lattice site a separate tensor factor. Let us, for example, reconsider the boundary terms of the Liouvillian discussed above. These boundary terms are represented by 4×4 matrices, but it is clear that the actual process takes place at only one of the two lattice sites. This circumstance is reflected by the fact that the boundary operators can be written as tensor products in the following form:

\[ A = \underbrace{\begin{pmatrix} \alpha & 0 \\ -\alpha & 0 \end{pmatrix}}_{A^{(1)}} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \,, \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \underbrace{\begin{pmatrix} 0 & -\beta \\ 0 & \beta \end{pmatrix}}_{B^{(1)}} \,. \tag{1.41} \]

Here the left tensor factor A^(1) describes a single-site particle creation process ∅ → A, whereas the meaning of the second tensor factor, which is a 2×2 unit matrix, is just that of “doing nothing”.

The insight that any local operation can be applied to a chain with many sites, simply by forming a suitable tensor product with unit matrices in all places that are not involved, allows us to systematically construct the Liouvillian for any lattice size, as will be explained in the following.

Remark: In practice it is useful to automate the tensor product. A very simple Mathematica® function, which can form the tensor product of both tensors and matrices, takes only a few lines:

Attributes[CircleTimes] = {Flat, OneIdentity};

CircleTimes[a_List, b_List] := KroneckerProduct[a, b];

For the tensor product |c〉 = |a〉 ⊗ |b〉 one simply writes

cvec = {a1, a2, a3} ⊗ {b1, b2}

where the symbol ⊗ can be obtained by typing ESC c * ESC .

Setting up the Liouvillian: In order to construct the Liouvillian formally, let us again consider a chain with 3 sites. Note that microscopic diffusion events A∅ ↔ ∅A only involve a pair of adjacent sites along the chain. As for the Liouvillian, which plays the role of a time evolution generator, this means that L is given by a sum of two-site operators:

L(3) = L12 + L23 = L(2) ⊗ 1 + 1⊗L(2) . (1.42)

Here L_{i,i+1} is a matrix describing hopping from site i to i + 1 and vice versa. Of course this matrix should have exactly the same form as in Eq. (1.30), hence we expect it to be a 4×4 matrix. On the other hand, the ’total’ Liouvillian L on the left-hand side acts on three sites, hence it has to be an 8×8 matrix, namely, exactly the one given in Eq. (1.38).

In the case of external reservoirs, there are additional boundary contributions of the form

L(3) = L(2) ⊗ 1 + 1⊗L(2) + A(1) ⊗ 1⊗ 1 + 1⊗ 1⊗B(1) . (1.43)

In the professional literature, nobody would write this in such a complicated way. Instead, the prevailing notation is:

\[ L^{(3)} = \sum_{i=1}^{2} L_{i,i+1} + A_1 + B_3 \tag{1.44} \]


with

\[ \begin{aligned} L_{1,2} &= L^{(2)} \otimes \mathbb{1} \\ L_{2,3} &= \mathbb{1} \otimes L^{(2)} \\ A_1 &= A^{(1)} \otimes \mathbb{1} \otimes \mathbb{1} \\ B_3 &= \mathbb{1} \otimes \mathbb{1} \otimes B^{(1)} \,. \end{aligned} \tag{1.45} \]

Now we can easily extend this formalism to an arbitrary number of sites. The Liouvillian can be expressed as

\[ L = \sum_{i=1}^{L-1} L_{i,i+1} + A_1 + B_L \tag{1.46} \]

with the bulk interaction

\[ L_{i,i+1} := \mathbb{1}^{\otimes(i-1)} \otimes L^{(2)} \otimes \mathbb{1}^{\otimes(L-i-1)} \tag{1.47} \]

and the boundary matrices

\[ A_1 := A^{(1)} \otimes \mathbb{1}^{\otimes(L-1)} \,, \qquad B_L := \mathbb{1}^{\otimes(L-1)} \otimes B^{(1)} \,. \tag{1.48} \]

In full form the Liouvillian is given by

\[
L = \sum_{i=1}^{L-1}
\underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \ldots \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}}_{\text{acting on sites } 1,\ldots,i-1}
\otimes
\underbrace{\begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & w_L & -w_R & 0 \\ 0 & -w_L & w_R & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}}_{=\,L^{(2)} \text{ acting on sites } i,\,i+1}
\otimes
\underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \ldots \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}}_{\text{acting on sites } i+2,\ldots,L}
\]
\[
\qquad + \begin{pmatrix} \alpha & 0 \\ -\alpha & 0 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \ldots \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
+ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \ldots \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \otimes \begin{pmatrix} 0 & -\beta \\ 0 & \beta \end{pmatrix} \,.
\]

Observables

As in quantum mechanics, we are interested in measuring certain observables of interest. In quantum physics, observables are represented by Hermitean operators M = M† whose expectation value is given by

〈M〉 = 〈ψ|M|ψ〉 , (1.49)

where |ψ⟩ is the actual quantum state normalized by ⟨ψ|ψ⟩ = 1. Note that the state vector enters twice, both in the expectation value and the normalization. In stochastic Markov processes, however, the situation is different. Here the actual state of the system is described by a probability vector |P⟩. This vector is normalized linearly by ⟨1|P⟩ = 1, and therefore it is no surprise that the same applies to measurements. More specifically, if M is a measurement operator, its expectation value is given by

\[ \langle M \rangle = \underbrace{\langle 1| M}_{\langle M|} \, |P\rangle \,. \tag{1.50} \]

As opposed to quantum theory, the left vector ⟨1| is constant in this case. Therefore, as indicated by the curly bracket in the expression above, we do not need measurement operators in the framework of this theory; rather it is fully sufficient to work with measurement vectors.

A measurement vector is a row vector where each component corresponds to a particular configuration of the system. The meaning of these components is very simple: each contains the value of what the measurement apparatus would measure in the respective configuration. For example, on a chain with two lattice sites, the probability of finding a particle at the left site can be measured by applying the measurement vector

\[ \langle M| = (0,\, 0,\, 1,\, 1) = (0, 1) \otimes (1, 1) \,, \tag{1.51} \]

where the components refer to the configurations (∅∅, ∅A, A∅, AA).

As shown above, the vector can be written as a tensor product of two local vectors. The left local vector (0, 1) represents a measurement which responds with ’1’ to a particle and with a ’0’ otherwise. The right local vector (1, 1), on the other hand, is neutral and has the simple meaning of ’measuring nothing’.

As a second example, let us consider a vector measuring the mean number of particles on the chain:

\[ \langle M| = (0,\, 1,\, 1,\, 2) = (0, 1) \otimes (1, 1) + (1, 1) \otimes (0, 1) \,, \tag{1.52} \]

again with the components ordered as (∅∅, ∅A, A∅, AA).

As can be seen, this is simply the sum of the particle occupancy vector (0, 1) at the leftmost and the rightmost site.

In the literature the notion of measurement vectors is very uncommon. Instead one uses diagonal measurement operators, where the diagonal elements are just the components of the corresponding measurement vector. These matrices are usually constructed as sums of the tensor products

\[ \chi_i = \mathbb{1}^{\otimes(i-1)} \otimes \chi \otimes \mathbb{1}^{\otimes(L-i)} \,, \qquad \chi = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \,, \tag{1.53} \]

which gives the probability of finding a particle at site i (while the meaning of the identity matrices is again that of “measuring nothing”). For example, the average density of particles ρ can be measured by applying the measurement operator

\[ \rho(t) = \frac{1}{L}\, \langle 1| N |P(t)\rangle \,, \qquad N = \sum_{i=1}^{L} \chi_i \,. \tag{1.54} \]

Another important example is that of a two-point correlation function

\[ C_{ij}(t) = \langle 1| \chi_i \chi_j |P(t)\rangle \,, \qquad \chi_i \chi_j = \mathbb{1} \otimes \ldots \otimes \mathbb{1} \otimes \underbrace{\chi}_{i} \otimes \mathbb{1} \otimes \ldots \otimes \mathbb{1} \otimes \underbrace{\chi}_{j} \otimes \mathbb{1} \otimes \ldots \otimes \mathbb{1} \tag{1.55} \]


which measures the probability that the lattice sites i and j are both occupied at the same time. However, finding positive values Cij(t) > 0 does not necessarily mean that the two sites are really correlated. For example, in a fully occupied system we have Cij = 1 although the two sites do not communicate with each other. For this reason it is meaningful to subtract the product of the expectation values at each site. This is known as the “connected part” of the correlation function:

\[ C^{\rm conn}_{ij}(t) := \langle 1| \chi_i \chi_j |P(t)\rangle - \langle 1| \chi_i |P(t)\rangle \cdot \langle 1| \chi_j |P(t)\rangle \,. \tag{1.56} \]

Note that the connected part of a correlation function is non-linear in the state vector |P(t)⟩. Therefore, it cannot be expressed in terms of a single observable C^conn_ij such that C^conn_ij(t) = ⟨1|C^conn_ij|P(t)⟩.

Product States: A (stationary) state |P⟩ is called factorizable or a product state if it can be written as a tensor product of the form

\[ |P\rangle = \begin{pmatrix} e_1 \\ d_1 \end{pmatrix} \otimes \begin{pmatrix} e_2 \\ d_2 \end{pmatrix} \otimes \begin{pmatrix} e_3 \\ d_3 \end{pmatrix} \otimes \ldots \otimes \begin{pmatrix} e_L \\ d_L \end{pmatrix} \,. \tag{1.57} \]

Note that if each of the tensor factors is rescaled individually, the scale factors can be pulled out in front of the expression, i.e.,

\[ \begin{pmatrix} \lambda_1 e_1 \\ \lambda_1 d_1 \end{pmatrix} \otimes \begin{pmatrix} \lambda_2 e_2 \\ \lambda_2 d_2 \end{pmatrix} \otimes \begin{pmatrix} \lambda_3 e_3 \\ \lambda_3 d_3 \end{pmatrix} \otimes \ldots \otimes \begin{pmatrix} \lambda_L e_L \\ \lambda_L d_L \end{pmatrix} = \lambda_1 \lambda_2 \lambda_3 \cdots \lambda_L\, |P\rangle \,. \tag{1.58} \]

Therefore, without loss of generality, each of these factors can be normalized in such a way that its components add up to 1, i.e., di + ei = 1. With this convention the resulting vector is already properly normalized, and di = pi may be interpreted as the probability of finding a particle at site i:

\[ |P\rangle = \begin{pmatrix} 1-p_1 \\ p_1 \end{pmatrix} \otimes \begin{pmatrix} 1-p_2 \\ p_2 \end{pmatrix} \otimes \begin{pmatrix} 1-p_3 \\ p_3 \end{pmatrix} \otimes \ldots \otimes \begin{pmatrix} 1-p_L \\ p_L \end{pmatrix} \,. \tag{1.59} \]

This argument shows that a product state involves only L degrees of freedom. Since a general normalized vector is characterized by 2^L − 1 independent degrees of freedom, it is clear that product states form only a small subset of the full state space. To find out in what sense product states are special, we note that the connected part of the two-point correlation function vanishes:

\[ C^{\rm conn}_{ij} = \langle 1| \chi_i \chi_j |P(t)\rangle - \langle 1| \chi_i |P(t)\rangle \cdot \langle 1| \chi_j |P(t)\rangle = p_i p_j - p_i p_j = 0 \,. \tag{1.60} \]

The same applies to any connected n-point correlation function. Thus, we can conclude that

Product states have no correlations.

Remark: Note that product states in classical stochastic systems are analogous to non-entangled states in quantum physics, which also factorize into tensor products.


Stationary product states: Since product states are simple but also very special, the question arises under which conditions a given system has a factorizing stationary state. As an example let us again consider the asymmetric simple exclusion process (ASEP) without boundary terms

\[ L = \sum_{i=1}^{L-1} L^{(2)}_{i,i+1} \,, \tag{1.61} \]

where L^(2) is again the 4×4 matrix defined in Eq. (1.30). Stationarity means that L|P⟩ = 0.

Obviously, there are two possibilities for how this can happen:

• Each of the summands in Eq. (1.61) applied to the product state vanishes individually, i.e. L^(2)_{i,i+1}|P⟩ = 0 for all i = 1, . . . , L−1.

• The individual terms do not vanish separately; only the total sum gives zero. This requires a nontrivial cancellation mechanism between adjacent pairs of sites.

Let us first consider the first case, where each term is supposed to vanish separately. This means that

\[ L^{(2)} \left[ \begin{pmatrix} 1-p_i \\ p_i \end{pmatrix} \otimes \begin{pmatrix} 1-p_{i+1} \\ p_{i+1} \end{pmatrix} \right] = 0 \qquad \forall\, i = 1, \ldots, L-1 \tag{1.62} \]

or equivalently

\[ \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & w_L & -w_R & 0 \\ 0 & -w_L & w_R & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} (1-p_i)(1-p_{i+1}) \\ (1-p_i)\, p_{i+1} \\ p_i\, (1-p_{i+1}) \\ p_i\, p_{i+1} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \,. \tag{1.63} \]

Since L^(2) has four rows one obtains in principle four different equations. In the present case, however, the first and the last row vanish identically while the second and the third are linearly dependent. Therefore, in the case of the ASEP one obtains only a single equation, namely

\[ w_L (1-p_i)\, p_{i+1} = w_R\, p_i (1-p_{i+1}) \,, \tag{1.64} \]

leading to the recursion relation

\[ p_{i+1} = f(p_i) = \frac{w_R\, p_i}{w_L (1-p_i) + w_R\, p_i} \,. \tag{1.65} \]

With the help of Mathematica® we can easily convince ourselves that this recursion relation leads to the exact solution

\[ p_i = \frac{w_R^{\,i-1}\, p_1}{w_L^{\,i-1} (1-p_1) + w_R^{\,i-1}\, p_1} \,. \tag{1.66} \]
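A short induction step confirms this closed form: for i = 1 it reduces to p₁, and inserting it into the recursion (1.65), using

\[ 1 - p_i = \frac{w_L^{\,i-1}(1-p_1)}{w_L^{\,i-1}(1-p_1) + w_R^{\,i-1}\, p_1} \,, \]

one finds

\[ f(p_i) = \frac{w_R^{\,i}\, p_1}{w_L^{\,i}(1-p_1) + w_R^{\,i}\, p_1} = p_{i+1} \,. \]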

As shown in Fig. 1.6 this allows us to compute density profiles describing the stationary state of the system. However, it should be kept in mind that the dynamics of the ASEP conserves the number of particles in the system, decomposing the configuration space into a large number of independent sectors with a fixed number of particles, and that the product state computed above actually describes a probabilistic superposition over many such sectors. In order to compute the stationary state for a given number of particles, it is therefore necessary to project this solution onto the corresponding sector (left as an exercise to the reader).

Figure 1.6.: Density profiles p_i according to Eq. (1.66) for the initial probabilities p1 = 0.98 and p1 = 0.8, respectively, and a rate ratio of wR/wL = 0.95. Since diffusion is biased to the left, the particles are found preferentially in the left part of the system.

Let us now turn to the generalized model with external reservoirs and find out under which conditions it possesses a factorizable stationary state. If all terms in the sum of the Liouvillian vanish separately, this would imply that its action on the first two sites of the system obeys the equation

\[ \big( L^{(2)} + A^{(1)} \otimes \mathbb{1} \big) \left[ \begin{pmatrix} 1-p_1 \\ p_1 \end{pmatrix} \otimes \begin{pmatrix} 1-p_2 \\ p_2 \end{pmatrix} \right] = 0 \,, \tag{1.67} \]

or, equivalently

\[ \begin{pmatrix} \alpha & 0 & 0 & 0 \\ 0 & w_L + \alpha & -w_R & 0 \\ -\alpha & -w_L & w_R & 0 \\ 0 & -\alpha & 0 & 0 \end{pmatrix} \begin{pmatrix} (1-p_1)(1-p_2) \\ (1-p_1)\, p_2 \\ p_1\, (1-p_2) \\ p_1\, p_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \,. \tag{1.68} \]

As can be verified easily, the system of equations is now over-determined and thus has no solutions for any non-vanishing value of α. Therefore, in the case of external reservoirs, there are no stationary product states for which all summands of the Liouvillian vanish separately. However, as we will see in the next paragraph, stationary product states are still possible thanks to an elaborate compensation mechanism between the terms.

Zipper-like compensation mechanism for stationary product states:


In the following we demonstrate that the asymmetric exclusion process (ASEP) coupled to external reservoirs still admits factorizable solutions thanks to a compensation mechanism between adjacent sites. Such product states are homogeneous, i.e., they consist of identical tensor factors:

\[ |P_{\rm stat}\rangle = \frac{1}{N} \begin{pmatrix} e \\ d \end{pmatrix} \otimes \begin{pmatrix} e \\ d \end{pmatrix} \otimes \ldots \otimes \begin{pmatrix} e \\ d \end{pmatrix} \,, \tag{1.69} \]

where N = (d + e)^L is a normalization factor. The notation with the letters e and d is frequently used in the literature and corresponds to the probabilities 1 − p and p.

First, we note that the application of the left boundary term A yields the vector

$$A\begin{pmatrix}e\\d\end{pmatrix}=\begin{pmatrix}\alpha&0\\-\alpha&0\end{pmatrix}\begin{pmatrix}e\\d\end{pmatrix}=+\alpha e\begin{pmatrix}1\\-1\end{pmatrix}.\tag{1.70}$$

Remarkably, the application of the right boundary term B yields the same vector, although with a different prefactor:

$$B\begin{pmatrix}e\\d\end{pmatrix}=\begin{pmatrix}0&-\beta\\0&\beta\end{pmatrix}\begin{pmatrix}e\\d\end{pmatrix}=-\beta d\begin{pmatrix}1\\-1\end{pmatrix}.\tag{1.71}$$

Clearly, the two vectors on the right-hand side coincide up to a minus sign, provided that eα = dβ. Thus let us choose e and d in such a way that

$$e\alpha=d\beta=1\,.\tag{1.72}$$

With this choice, seen on adjacent pairs of sites, the action of the boundary terms is given by

$$A_1\left[\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\right]=+\begin{pmatrix}1\\-1\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\tag{1.73}$$
$$B_2\left[\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\right]=-\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}1\\-1\end{pmatrix}.\tag{1.74}$$

For a zipper-like compensation mechanism to be established, the disturbing exchange vector (1,−1)^T has to be “commuted” from the leftmost to the rightmost site by means of the remaining contributions in the bulk of the chain. To see that, let us first apply the two-site operator to a local two-site product state

$$L^{(2)}\left[\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\right]=\begin{pmatrix}0&0&0&0\\0&w_L&-w_R&0\\0&-w_L&w_R&0\\0&0&0&0\end{pmatrix}\begin{pmatrix}e^2\\ed\\de\\d^2\end{pmatrix}=\begin{pmatrix}0\\-de\,(w_R-w_L)\\+de\,(w_R-w_L)\\0\end{pmatrix}.\tag{1.75}$$

This can be rewritten in the form

$$L^{(2)}\left[\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\right]=\frac{de\,(w_R-w_L)}{d+e}\left[-\begin{pmatrix}1\\-1\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}+\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}1\\-1\end{pmatrix}\right].\tag{1.76}$$


Thus, adding the boundary contributions (1.73) and (1.74), stationarity can be established if and only if the prefactor is tuned in such a way that

$$\frac{de\,(w_R-w_L)}{d+e}=1\,.\tag{1.77}$$

Because of αe = βd = 1 this means that the boundary rates are restricted by α + β = w_R − w_L. With this particular tuning of the rates it is easy to see that the two-site problem is indeed stationary:

$$\left(L^{(2)}+A_1+B_2\right)\left[\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\right]=0\,.\tag{1.78}$$

The same applies to chains with more than two sites. For example, for a chain with three sites, we obtain the contributions

$$A_1\left[\begin{pmatrix}e\\d\end{pmatrix}^{\otimes 3}\right]=+\begin{pmatrix}1\\-1\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\tag{1.79}$$
$$L^{(2)}_{12}\left[\begin{pmatrix}e\\d\end{pmatrix}^{\otimes 3}\right]=-\begin{pmatrix}1\\-1\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}+\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}1\\-1\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\tag{1.80}$$
$$L^{(2)}_{23}\left[\begin{pmatrix}e\\d\end{pmatrix}^{\otimes 3}\right]=-\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}1\\-1\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}+\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}1\\-1\end{pmatrix}\tag{1.81}$$
$$B_3\left[\begin{pmatrix}e\\d\end{pmatrix}^{\otimes 3}\right]=-\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}1\\-1\end{pmatrix}\tag{1.82}$$

which, when added up, cancel mutually on the right-hand side, hence

$$\left(L^{(2)}_{12}+L^{(2)}_{23}+A_1+B_3\right)\left[\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\right]=0\,.\tag{1.83}$$

Thus, we have shown that the ASEP coupled to external reservoirs possesses a stationary product state on a particular line in the phase diagram where

$$\alpha+\beta=w_R-w_L\,.\tag{1.84}$$

Obviously this implies the inequality w_R > w_L as a necessary condition. This is physically reasonable: a homogeneous stationary state can only be established if the chain transports particles preferentially to the right, since otherwise some kind of “traffic jam” would emerge at one of the boundaries.

Note that in the case of open boundaries there is no particle conservation, hence the system is ergodic and the stationary state is unique.

Since the tensor factors are identical along the chain, the particle density will be constant. Nevertheless, the random walk of the particles is biased to the right, generating a constant flux of particles. As will be discussed later, this is a simple example of a non-equilibrium steady state (NESS).
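The cancellation can also be verified mechanically. The sketch below (Python with NumPy; the rates are arbitrary values chosen such that α + β = w_R − w_L) builds the full three-site generator from the matrices of Eqs. (1.63), (1.70) and (1.71) and applies it to the homogeneous product state with e = 1/α and d = 1/β:

```python
import numpy as np

# Arbitrary rates obeying the product-state condition alpha + beta = wR - wL
wR, wL = 1.0, 0.3
alpha, beta = 0.4, 0.3                 # 0.4 + 0.3 = 0.7 = wR - wL

# Local generators in the conventions of Eqs. (1.63), (1.70), (1.71)
L2 = np.array([[0,   0,   0, 0],
               [0,  wL, -wR, 0],
               [0, -wL,  wR, 0],
               [0,   0,   0, 0]])
A = np.array([[ alpha, 0],
              [-alpha, 0]])
B = np.array([[0, -beta],
              [0,  beta]])

# Homogeneous product state with e*alpha = d*beta = 1, cf. Eq. (1.72)
v = np.array([1 / alpha, 1 / beta])
P = np.kron(np.kron(v, v), v)

I2 = np.eye(2)
gen = (np.kron(A, np.kron(I2, I2))     # boundary term A_1
       + np.kron(L2, I2)               # bulk term L^(2)_12
       + np.kron(I2, L2)               # bulk term L^(2)_23
       + np.kron(np.kron(I2, I2), B))  # boundary term B_3

print(np.max(np.abs(gen @ P)))         # numerically zero: Eq. (1.83) holds
```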


Matrix product states

So far we have seen that the stationary state of the asymmetric exclusion process is given by a product state, provided that α + β = w_R − w_L. This means that we have solved the stationary problem along a particular line in the phase diagram, where the lattice sites are uncorrelated. Is it possible to solve the same problem for arbitrary rates α, β, w_L, and w_R, where the system is expected to exhibit non-trivial correlations?

A very elegant solution to this problem was proposed by Bernard Derrida and coworkers in the middle of the 90s [6, 7]. They realized that the main restriction with ordinary product states comes from the fact that the product of two tensor factors

$$\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}=\begin{pmatrix}e^2\\ed\\de\\d^2\end{pmatrix}\tag{1.85}$$

always gives identical numbers in the second and third component of the resulting vector on the right-hand side. As a way out they proposed to replace the numbers e and d by non-commutative objects E and D. These objects may be thought of as nontrivial operators acting in some fictitious auxiliary space. This space must not be confused with the configuration space of the system; rather it is an additional space on top of that, whose only purpose is to establish a certain type of non-commutativity between E and D. Once a suitable representation of the commutation relations is found, E and D can be expressed as matrices acting in the auxiliary space. To help the reader and to avoid confusion, we mark all quantities acting in the auxiliary space by a tilde.

Before entering the problem of finding such operators, let us first consider the consequences of this approach. Replacing the numbers e and d in the product state

$$|P_{stat}\rangle=\begin{pmatrix}e\\d\end{pmatrix}\otimes\begin{pmatrix}e\\d\end{pmatrix}\otimes\cdots\otimes\begin{pmatrix}e\\d\end{pmatrix}\tag{1.86}$$

literally by non-commutative matrices E and D, we would obtain a vector with entries consisting of matrices instead of numbers. For example, on a chain with two sites we would get

$$|P_{stat}\rangle=\begin{pmatrix}E\\D\end{pmatrix}\otimes\begin{pmatrix}E\\D\end{pmatrix}=\begin{pmatrix}EE\\ED\\DE\\DD\end{pmatrix}.\tag{1.87}$$

In this vector the components would be matrices acting in the auxiliary space. However, what we need is a vector of real-valued probabilities. Therefore, another operation is needed to transform these matrices back into numbers. This mechanism depends on the boundary conditions under consideration. For example, if the system is open (i.e. coupled to reservoirs) we may take the expectation value of the matrix-valued components between two vectors ⟨α| and |β⟩ living in the auxiliary space, i.e.

$$|P_{stat}\rangle=\frac{1}{\mathcal N}\,\langle\alpha|\begin{pmatrix}E\\D\end{pmatrix}\otimes\begin{pmatrix}E\\D\end{pmatrix}\otimes\cdots\otimes\begin{pmatrix}E\\D\end{pmatrix}|\beta\rangle\,,\tag{1.88}$$


where N is a normalization factor given by

$$\mathcal N=\langle\alpha|\,C^L\,|\beta\rangle\,,\qquad C=D+E\,.\tag{1.89}$$

For example, in the case of L = 2 sites the explicit vector reads

$$|P_{stat}\rangle=\frac{1}{\langle\alpha|C^2|\beta\rangle}\begin{pmatrix}\langle\alpha|EE|\beta\rangle\\\langle\alpha|ED|\beta\rangle\\\langle\alpha|DE|\beta\rangle\\\langle\alpha|DD|\beta\rangle\end{pmatrix}.\tag{1.90}$$

Suppose we had found a valid representation of the matrices; this would allow us to compute any physical quantity, unfolding the full power of the approach. For example, the probability of finding the configuration ∅A∅A in a four-site system would be given by

$$P_{stat}(\emptyset A\emptyset A)=\frac{\langle\alpha|EDED|\beta\rangle}{\langle\alpha|C^4|\beta\rangle}\,.\tag{1.91}$$

That is, to get the probability of a given configuration, we simply have to replace a vacancy by the operator E and a particle by the operator D and then compute the product of these matrices. Note that the string EDED is an ordinary operator product (matrix product) in auxiliary space and should not be confused with the tensor product in configuration space.

Remark: Another important case is that of periodic boundary conditions. Here the most natural choice for the reduction of matrices to numbers would be a trace operation

$$|P_{stat}\rangle=\frac{1}{\mathcal N}\,\mathrm{Tr}\left[\begin{pmatrix}E\\D\end{pmatrix}\otimes\begin{pmatrix}E\\D\end{pmatrix}\otimes\cdots\otimes\begin{pmatrix}E\\D\end{pmatrix}\right]\tag{1.92}$$

with the normalization

$$\mathcal N=\mathrm{Tr}\big[C^L\big]\,,\qquad C=D+E\,,\tag{1.93}$$

where the trace is carried out in auxiliary space. Likewise, the probability of the configuration ∅A∅A in the case of periodic boundary conditions would be given by

$$P_{stat}(\emptyset A\emptyset A)=\frac{\mathrm{Tr}\big[EDED\big]}{\mathrm{Tr}\big[C^4\big]}\,.\tag{1.94}$$

So far we have only outlined the general idea of how to represent a stationary state with non-commutative operators, but we have not yet addressed the question whether such operators exist at all. In fact, the existence of such matrices depends significantly on the specific form of the Liouvillian.

As an example, let us find out under which conditions the asymmetric exclusion process (ASEP) with open boundaries admits a matrix representation obeying the same zipper-like compensation mechanism as outlined above for ordinary product states. In analogy to Eqs. (1.76)-(1.77) let us postulate the bulk relations

$$L^{(2)}\left[\begin{pmatrix}E\\D\end{pmatrix}\otimes\begin{pmatrix}E\\D\end{pmatrix}\right]=-\begin{pmatrix}1\\-1\end{pmatrix}\otimes\begin{pmatrix}E\\D\end{pmatrix}+\begin{pmatrix}E\\D\end{pmatrix}\otimes\begin{pmatrix}1\\-1\end{pmatrix}\tag{1.95}$$


or, explicitly:

$$\begin{pmatrix}0&0&0&0\\0&w_L&-w_R&0\\0&-w_L&w_R&0\\0&0&0&0\end{pmatrix}\begin{pmatrix}EE\\ED\\DE\\DD\end{pmatrix}=\begin{pmatrix}0\\w_L\,ED-w_R\,DE\\w_R\,DE-w_L\,ED\\0\end{pmatrix}=\begin{pmatrix}0\\-D-E\\D+E\\0\end{pmatrix}.\tag{1.96}$$

Again, the second and the third row are linearly dependent, giving a single algebraic relation for the operators E and D:

$$w_R\,DE-w_L\,ED=D+E\,.\tag{1.97}$$

This is the so-called matrix algebra induced by the Liouvillian. If the two operators were just numbers, this relation would be equivalent to that of Eq. (1.77). In other words, the product state solution discussed before is nothing but the special case of a one-dimensional representation of the matrix algebra.

Let us now turn to the boundary equations. Following the same strategy of literally replacing numbers by matrices in Eqs. (1.70) and (1.71), we would postulate the relations

$$A\begin{pmatrix}E\\D\end{pmatrix}=+\begin{pmatrix}1\\-1\end{pmatrix}\qquad\text{and}\qquad B\begin{pmatrix}E\\D\end{pmatrix}=-\begin{pmatrix}1\\-1\end{pmatrix},\tag{1.98}$$

giving

$$\alpha E=\beta D=1\,.\tag{1.99}$$

But obviously, this result is too restrictive because it would force us to use the one-dimensional representation, reproducing again the case of a product state. However, at this point we can exploit the fact that the boundary matrices at the leftmost and the rightmost site are always contracted with the boundary vectors ⟨α| and |β⟩. That is, instead of Eqs. (1.98) it rather suffices to have

$$\langle\alpha|\,A\begin{pmatrix}E\\D\end{pmatrix}=+\langle\alpha|\begin{pmatrix}1\\-1\end{pmatrix}\qquad\text{and}\qquad B\begin{pmatrix}E\\D\end{pmatrix}|\beta\rangle=-\begin{pmatrix}1\\-1\end{pmatrix}|\beta\rangle\,,\tag{1.100}$$

turning the scalar equation (1.99) into two eigenvalue problems

$$\langle\alpha|\,E=\frac{1}{\alpha}\,\langle\alpha|\tag{1.101}$$
$$D\,|\beta\rangle=\frac{1}{\beta}\,|\beta\rangle\,.\tag{1.102}$$

These eigenvalue equations are in fact more general and allow for a nontrivial matrix representation, as we will see in the following.

Finding matrix representations: With the ansatz described above, the problem of calculating the stationary state of the ASEP with open boundaries has now been shifted to the problem of finding a representation of the matrix algebra, i.e., identifying two matrices E and D as well as two boundary vectors ⟨α| and |β⟩ obeying the relations

$$w_R\,DE-w_L\,ED=D+E\,,\qquad\langle\alpha|\,E=\frac{1}{\alpha}\,\langle\alpha|\,,\qquad D\,|\beta\rangle=\frac{1}{\beta}\,|\beta\rangle\,.\tag{1.103}$$


Figure 1.7.: Lines in (α, β) parameter space for fixed q = 0.5, where one-, two-, and three-dimensional representations exist (indicated by the blue, orange, and green line, respectively).

Finding such a matrix representation is not easy at all and, if successful, can be considered as a little breakthrough. A general guide for finding matrix representations was written by Blythe and Evans [8], where one can find several representations of the quadratic algebra given above. It turns out that in this case a matrix representation generically has to be infinite-dimensional. One of them is:

$$D=\frac{1}{w_R-w_L}\begin{pmatrix}1+b&\sqrt{c_0}&0&0&\cdots\\0&1+bq&\sqrt{c_1}&0&\\0&0&1+bq^2&\sqrt{c_2}&\\0&0&0&1+bq^3&\\\vdots&&&&\ddots\end{pmatrix}$$

$$E=\frac{1}{w_R-w_L}\begin{pmatrix}1+a&0&0&0&\cdots\\\sqrt{c_0}&1+aq&0&0&\\0&\sqrt{c_1}&1+aq^2&0&\\0&0&\sqrt{c_2}&1+aq^3&\\\vdots&&&&\ddots\end{pmatrix}\tag{1.104}$$

$$\langle\alpha|=(1,0,0,\ldots)\,,\qquad|\beta\rangle=\begin{pmatrix}1\\0\\0\\\vdots\end{pmatrix},$$

where

$$q=\frac{w_L}{w_R}\,,\qquad a=\frac{1-q}{\alpha}-1\,,\qquad b=\frac{1-q}{\beta}-1\,,\qquad c_n=(1-q^{n+1})(1-ab\,q^n)\,.\tag{1.105}$$

This representation is not unique since it can be mapped onto other ones by similarity transformations. Let us again remind the reader that these matrices live in an infinite-dimensional auxiliary space, which must not be confused with the (finite-dimensional) configuration space.

Interestingly, there are exceptions: As one sees from the matrices, for certain choices of the parameters, namely for

$$1-ab\,q^n=0\,,\tag{1.106}$$


the representation becomes finite-dimensional since c_n = 0 and thus the upper left corner of the matrices becomes disconnected from the rest (we will study the special case of a two-dimensional representation in the tutorial). These special representations exist along certain submanifolds in the parameter space (see Fig. 1.7). But apart from these special cases one can show that generically an infinite-dimensional representation is needed.

Is this really useful? Is it really an advantage to compute the probability of a configuration with, say, four sites in terms of the product of four infinite-dimensional matrices? It turns out that for finite chains it is in fact possible to truncate the matrices. For example, for a system with only four sites, it suffices to consider only the first four rows and columns of the matrices. This is because the matrices are nonzero only along their diagonal and one of its neighboring diagonals, while the boundary vectors have only a single entry in the first component. This means that the matrices D and E can be thought of as some kind of ladder operators, moving forward and backward by one component in the auxiliary space.

Using matrix product states: Having determined a matrix representation, it is possible to compute almost any quantity of interest in the stationary state. The most important one is ρ_i^stat, the density of particles at site i. For example, in a system with only three sites, the density of particles at the first site would be given by

$$\rho_1^{stat}=P_{stat}(A\emptyset\emptyset)+P_{stat}(A\emptyset A)+P_{stat}(AA\emptyset)+P_{stat}(AAA)=P_{stat}(A\ast\ast)\tag{1.107}$$

with P_stat given by Eq. (1.88), that is, we have to “integrate out” the other two sites. In the matrix product formalism this can be accounted for by writing

$$\rho_1^{stat}=\frac{\langle\alpha|\,DCC\,|\beta\rangle}{\langle\alpha|\,CCC\,|\beta\rangle}\,,\tag{1.108}$$

or, as a general expression on a chain with L sites

$$\rho_i^{stat}=\frac{\langle\alpha|\,C^{i-1}D\,C^{L-i}\,|\beta\rangle}{\langle\alpha|\,C^L\,|\beta\rangle}\,,\tag{1.109}$$

where C = D + E. The density profile can now be computed by simply inserting the matrices of Eq. (1.104) and evaluating the corresponding matrix products. A tedious calculation (not shown here) gives the exact expression

$$\rho_i^{stat}=\text{xxx}\tag{1.110}$$

Surprisingly, although the matrix product state was assumed to be homogeneous (using the same pair of matrices in each of the tensor factors), the resulting density profile is generally not homogeneous.
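Even without the closed formula, Eq. (1.109) can be evaluated numerically by truncating the representation (1.104) and exploiting the ladder structure discussed above. The following sketch (Python with NumPy) does this for arbitrarily chosen rates with w_R > w_L; the parameters have to be chosen such that all c_n ≥ 0, so that the square roots are real:

```python
import numpy as np

def asep_profile(alpha, beta, wR, wL, L):
    """Density profile of the open ASEP from the representation (1.104).

    For a chain of L sites a truncation to (L+1) x (L+1) matrices is exact,
    because D and E shift the auxiliary-space index by at most one and the
    boundary vectors have support only on the first component.
    """
    dim = L + 1
    q = wL / wR
    a = (1 - q) / alpha - 1
    b = (1 - q) / beta - 1
    n = np.arange(dim)
    c = (1 - q**(n + 1)) * (1 - a * b * q**n)      # c_n of Eq. (1.105)
    sqc = np.sqrt(c[:dim - 1])

    D = (np.diag(1 + b * q**n) + np.diag(sqc,  1)) / (wR - wL)
    E = (np.diag(1 + a * q**n) + np.diag(sqc, -1)) / (wR - wL)
    C = D + E
    av = np.zeros(dim); av[0] = 1                  # <alpha|
    bv = np.zeros(dim); bv[0] = 1                  # |beta>

    norm = av @ np.linalg.matrix_power(C, L) @ bv
    return np.array([av @ np.linalg.matrix_power(C, i - 1) @ D
                        @ np.linalg.matrix_power(C, L - i) @ bv / norm
                     for i in range(1, L + 1)])

# Arbitrary illustrative rates; the profile comes out inhomogeneous
print(asep_profile(alpha=0.6, beta=0.7, wR=1.0, wL=0.2, L=10))
```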

Similarly, it is in principle possible to compute any type of n-point correlation function. For example, the two-point function is given by

$$C_{ij}^{stat}=\frac{\langle\alpha|\,C^{i-1}D\,C^{j-i-1}D\,C^{L-j}\,|\beta\rangle}{\langle\alpha|\,C^L\,|\beta\rangle}\,.\tag{1.111}$$


How useful is the matrix product technique? The matrix product method is a powerful technique which allows certain problems (such as the ASEP with open boundaries) to be solved exactly. Like quantum mechanics, which relies on the idea of replacing real-valued phase space variables by non-commutative operators, this method replaces real-valued probabilities by non-commutative matrices. Physical quantities such as density profiles and correlation functions can be expressed conveniently in terms of simple matrix products.

Nevertheless, the range of applicability seems to be limited. In non-equilibrium statistical physics, the range of solvable models seems to be restricted to systems describing various kinds of diffusion. So far only very few systems with particle reactions can be described in terms of matrix products.

Recently, the matrix product technique regained enormous importance in the field of quantum information theory, where it is used to solve chains of interacting qubits. This is the reason why we devoted so much attention to this method in this lecture.


2. Equilibrium

2.1. Entropy as an information measure

Information

Entropy is one of the most fundamental concepts in statistical physics. However, for beginners the notion of entropy is particularly difficult to understand. On the one hand, there is a large variety of possible definitions of entropy. On the other hand, unlike other physical quantities such as energy and momentum that can be measured and have a clear intuitive meaning, entropy as the “measure of uncertainty” is much more difficult to comprehend. In 1948, when Claude Shannon had found a lower bound for the required bandwidth of telephone lines and needed an appropriate name for it, J. v. Neumann is said to have given him the advice to

“...call it entropy. [...] Nobody knows what entropy really is, so in a debate you willalways have the advantage.” [3]

In physics, entropy is usually introduced in the context of thermodynamics, which is not an easy task for beginners. However, in our digital world of today, where terms like “Gigabyte” are part of everyday language, it is much easier to introduce entropy on an information-theoretical basis before discussing its application in physics. In fact, entropy in itself is nothing but a measure of information, and this perspective may be helpful to make the meaning of entropy more transparent and accessible [1].

In the following let us again consider a classical (i.e. non-quantum) physical system, which is at any time in a well-defined configuration. For simplicity let us assume that the set of possible configurations, the configuration space Ω, is finite, and let us denote by N = |Ω| the number of its elements. A light switch, for example, possesses a configuration space Ω = {on, off} with N = 2 states, while a die has the configuration space Ω = {1, 2, 3, 4, 5, 6} with N = 6 states. Within this framework we start out by defining verbally:

The information or entropy H of a system is the minimal number of bits needed to specify its configuration.

In other words, entropy or information can be thought of as the minimal length of a file on a computer that is needed to describe a system in detail. It is important that this length is minimal, i.e., one has to compress the file by removing all redundancies in the description of the configuration. Alternatively, one may define entropy as the number of binary yes-no questions which are necessary to fully identify the microscopic configuration of the system.


Byte (B)      = 8 bit            Byte (B)       = 8 bit
Kilobyte (kB) = 8 · 10^3 bit     Kibibyte (KiB) = 2^13 bit
Megabyte (MB) = 8 · 10^6 bit     Mebibyte (MiB) = 2^23 bit
Gigabyte (GB) = 8 · 10^9 bit     Gibibyte (GiB) = 2^33 bit
Terabyte (TB) = 8 · 10^12 bit    Tebibyte (TiB) = 2^43 bit

Table 2.1.: Commonly used units of information.

As discussed in the previous chapter, the characterization of the configuration space depends on the chosen level of abstraction. The same applies to the notion of entropy: The amount of information needed to specify a configuration depends of course on the level of abstraction to which the configuration refers. For example, we could characterize the configuration of a die by the number 1...6 shown on its upper face, or by the precise location of all its atoms at a given time. Of course, the information content is different in both cases.

As the entropy of a system is the minimal length of a describing file, any composite system can be characterized by first describing its parts and then concatenating the files into a single one. If the components of the system were uncorrelated, it would be impossible to compress the concatenated file even further. It is therefore obvious that entropy is an extensive, i.e. additive, quantity.

It should be emphasized that the term ‘information’ in the present context should not be confused with the notion of information in everyday life. According to the definition given above, a meaningless sequence of random numbers contains a lot of information since a large amount of data is needed to specify all random numbers in detail. Nevertheless these random numbers are meaningless, i.e., they have no ‘information’ in the usual sense.

The ‘bit’ as elementary unit of information

If the configuration space of a system contains only one element, meaning that the system is always in a unique configuration, it is already fully characterized and therefore it has the entropy zero. Contrarily, a binary system such as a light switch can be in two possible states, hence it is characterized by a single binary digit, called bit. A bit is the smallest possible portion of information and plays the role of a fundamental information unit, from which other commonly used units of information are derived (see Table 2.1).

It should be pointed out that ‘bit’ is not a physical unit like ‘meter’, which needs to be gauged by an international prototype in Paris. Since the unit ‘bit’ is not scalable, it is rather defined in itself. In fact, aliens on a different planet will use the same unit of information as we do. For this reason the universal unit ‘bit’ is often suppressed, treating entropy as a dimensionless quantity.


number of dice N    # of configs 6^N    necessary bits n    bits per die n/N
1                   6                   3                   3
2                   36                  6                   3
3                   216                 8                   2.666
4                   1296                11                  2.75
5                   7776                13                  2.6
∞                   ∞                   ∞                   log_2 6 ≈ 2.585

Table 2.2.: Number of bits n needed to describe N dice and the corresponding average number of bits per die n/N (see text).

Non-integer number of bits: Since n bits can form 2^n different combinations, it is immediately clear that a system with |Ω| = 2^n configurations can be characterized by n bits. But what happens if the number of states is not a power of 2? For example, two bits with 2^2 = 4 configurations would not suffice to encode the six faces of a die, but with three bits two of the 2^3 = 8 possibilities would be wasted. This suggests that the actual information content of a die is somewhere between 2 and 3 bits.

In order to see that such a non-integer value can be defined in a meaningful way, let us consider a composite system of N independent dice. Clearly, this system can be in 6^N different configurations (see Table 2.2). To describe such a system we need n bits, where n is the smallest integer obeying the inequality 2^n ≥ 6^N. Taking the logarithm on both sides and using its monotonicity, this inequality can be recast as n log 2 ≥ N log 6. Hence the mean number of bits per die for given N is the lowest rational number n/N with the property

$$\frac{n}{N}\;\geq\;\frac{\log 6}{\log 2}=\log_2 6\,.\tag{2.1}$$

With increasing N this inequality can be satisfied more and more sharply, reducing the redundancy in the description of the state, so that in the limit N → ∞ the entropy per die converges to H = log_2 6 ≈ 2.585 bit.
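The entries of Table 2.2 follow from n = ⌈N log_2 6⌉ and can be reproduced in a few lines (Python):

```python
import math

# Reproduce Table 2.2: smallest n with 2^n >= 6^N, and bits per die n/N
for N in range(1, 6):
    n = math.ceil(N * math.log2(6))
    print(N, 6**N, n, n / N)
print("limit:", math.log2(6))   # -> 2.5849625... bits per die
```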

Following this argument it is well motivated that the information or entropy of the system is given by

$$H=\log_2|\Omega|\,.\tag{2.2}$$

With this formula we can easily confirm the additivity of information (in physics known as the extensivity of entropy): Since the total number of configurations of a system composed of independent subsystems is obtained by multiplying the numbers of configurations of the respective subsystems, the total entropy is just the sum of their entropies.

Information of physical systems

Entropy is defined differently in different communities. While the above definition with a base-2 logarithm is commonly used in information science, mathematicians prefer a natural logarithm. For historical reasons physicists use instead the definition

$$S=k_B\ln|\Omega|\,,\tag{2.3}$$


introducing a conceptually superfluous constant k_B ≈ 1.38 · 10^−23 J/K, called the Boltzmann constant, giving the entropy a physical unit involving energy. From the conceptual point of view this is unfortunate since it hides the nature of the entropy as an information measure. We will come back to this point later.

Example: Information of a Helium balloon:

As an example let us consider a balloon filled with one mole of Helium. How much information is contained in such a balloon? According to the chemical literature [4], Helium at room temperature contains an entropy of roughly S = 126 J/K. This corresponds to

$$H=\frac{S}{k_B\ln 2}\approx 1.3\cdot 10^{25}\ \text{bit}\,.\tag{2.4}$$

This is more than 5,000 times the storage capacity ever produced. Printing out the string of bits we could cover the whole planet with paper 40 meters high. However, dividing by the number of particles, the so-called Avogadro constant N_A ≈ 6.022 · 10^23 mol^−1, the resulting entropy per particle

$$h=\frac{H}{N_A}\approx 22\ \text{bit}\tag{2.5}$$

turns out to be surprisingly small, given the fact that a single floating point number in a computer already consumes at least 32 bit. Is it possible to explain these results?

Entropy catastrophe: According to the definition above, the entropy of a gas is the amount of information needed to describe the complete microstate of the gas, i.e. the positions and momenta of all molecules. In the framework of classical physics this leads immediately to a paradoxical situation since positions and momenta are given by vectors with real-valued components. As real numbers have infinitely many digits, they naturally carry an infinite amount of information, implying that the entropy of each molecule is in principle infinite. This so-called entropy catastrophe was controversially debated at the end of the 19th century and was often used as an argument to dismiss Boltzmann's theory. The problem could only be solved with the advent of quantum mechanics, where the joint precision of position and momentum is bound by the uncertainty relation ∆q ∆p ≥ h, where h is Planck's constant. In a rough approximation we may therefore think of the classical phase space as being divided into cells of the volume h^3. The number of possible states of a particle is then obtained by counting the number of cells in the phase space volume explored by the particle.

With this insight we can now undertake a very rough estimate of the entropy as follows. Since Helium is a monoatomic gas, the momenta of the particles will be of the order

$$p\approx\sqrt{2mE}\approx\sqrt{2m\cdot\tfrac{3}{2}k_BT}\approx 9\cdot 10^{-24}\ \text{kg m/s}\,.\tag{2.6}$$

Ignoring the actual statistical distribution of the momenta, let us simply assume that the phase space is limited by |p⃗| ≤ p. Then the accessible phase space volume is given by

$$\Phi\approx V_{mol}\,(2p)^3\approx 1.3\cdot 10^{-70}\ (\text{J s})^3\,.\tag{2.7}$$

Hence, to specify the quantum state of a single particle, we would need an information of log_2(Φ/h^3) ≈ 99 bit, which is more than four times larger than the actual value of about 22 bit found in the literature (see above).


letter        E        T       A       . . .   X       Q       Z
frequency     12.70%   9.06%   8.17%   . . .   0.15%   0.09%   0.07%
Morse code    .        -       .-      . . .   -..-    --.-    --..

Table 2.3.: The six most and least frequently used characters of the English alphabet, their probabilities and the corresponding Morse code (see text).

Example: This contradiction is resolved by another subtlety of quantum physics, namely that identical particles are indistinguishable. While in classical physics each particle follows its own trajectory, allowing one to tag the particles individually, quantum physics no longer has the notion of definite particle trajectories, making it impossible to determine which particle came from where. Consequently we cannot distinguish permutations of identical particles, meaning that the actual total number of states is given by

$$|\Omega|\approx\frac{1}{N_A!}\left(\frac{\Phi}{h^3}\right)^{N_A},\tag{2.8}$$

giving

$$|\Omega|\approx\left(\frac{e\,\Phi}{N_A\,h^3}\right)^{N_A}.\tag{2.9}$$

Here Stirling's formula n! ≈ n^n/e^n was used. The corresponding entropy per particle is then

$$h=\frac{H}{N_A}\approx\log_2\frac{\Phi}{h^3}-\log_2 N_A+\log_2 e\approx 21\ \text{bit}\,,\tag{2.10}$$

which is in fair agreement with the literature value of 22 bits per particle, although the calculation uses simplifications and ignores the actual distribution of the momenta.
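The whole estimate fits into a few lines of code (a sketch; the numerical constants are standard values, the temperature is set to T = 300 K, and the molar volume is taken at standard conditions):

```python
import math

kB, h, NA = 1.381e-23, 6.626e-34, 6.022e23   # J/K, J s, 1/mol
m = 4 * 1.661e-27                            # mass of a He atom in kg
T = 300.0                                    # room temperature in K
Vmol = 22.4e-3                               # molar volume in m^3

p = math.sqrt(2 * m * 1.5 * kB * T)          # typical momentum, Eq. (2.6)
Phi = Vmol * (2 * p)**3                      # phase-space volume, Eq. (2.7)

bits_dist = math.log2(Phi / h**3)            # distinguishable particles
bits_indist = bits_dist - math.log2(NA) + math.log2(math.e)   # Eq. (2.10)
print(round(bits_dist), round(bits_indist))  # -> 99 21
```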

Information with previous knowledge in form of a probability distribution

In defining entropy we assumed that the observer, who is informed about the actual configuration of the system, does not have any previous knowledge about the system. However, if the observer possesses some partial information in advance, it is intuitively clear that less information is needed to characterize a specific configuration of the system. For example, if the observer already knows that a die is manipulated in such a way that the result is always even, the accessible configuration space is reduced from six to three possibilities, decreasing the necessary information by one bit.

In many cases an observer has previous knowledge in the form of a probability distribution. For example, in the English alphabet the letters ‘E’ and ‘T’ are much more frequent than the letters ‘X’ and ‘Q’. Samuel Morse was one of the first to recognize that it is then more efficient to represent frequent characters by short codes and rare ones by longer codes (see Table 2.3). The Morse alphabet is in fact a very early example of what is known today as entropy-optimized coding.

Individual entropy of a configuration: To understand the reduction of information by means of previous knowledge quantitatively, let us again consider an arbitrary space Ω of configurations c ∈ Ω occurring with the probabilities p_c ∈ [0, 1]. This probability distribution has to be normalized, i.e.

$$\sum_{c\in\Omega}p_c=1\,.\tag{2.11}$$


For simplicity, let us assume that these probabilities are rational numbers¹, which we will express in terms of their common denominator m ∈ ℕ by p_c = m_c/m. With these numbers let us now construct a fictitious set with m elements, where the configuration c occurs m_c times. In this fictitious set the relative frequency of each configuration is exactly equal to the given probability.

Let us then choose one element from this fictitious set. To specify which of the elements was selected, we would need an information of log_2 m bit. However, since the set may contain several copies of the same configuration which cannot be distinguished, the information which of the copies was selected is not of interest and has to be subtracted. Therefore, the configuration c has the information content

$$H_c=\log_2 m-\log_2 m_c=\log_2\!\left(\frac{m}{m_c}\right).\tag{2.12}$$

Because of m_c/m = p_c we arrive at the main result that the configurational information or configurational entropy H_c of configuration c with respect to a previously known probability distribution is given by

$$H_c=-\log_2 p_c\tag{2.13}$$

or, in physicists' notation, H_c = −k_B ln p_c. For example, in the English alphabet the most frequent character E, which occurs with the probability p_E = 0.127, carries an information of approximately three bit, while the rarely used letter Z carries an information of roughly 10 bits. The less likely a configuration is, the greater its information.

Example: Entropy of an information channel:
Let us consider a system (information channel) that can be in three different configurations, meaning that it can transmit three different characters A, B, and C. Furthermore let it be known that these characters occur with the probabilities p_A = 1/2, p_B = 1/3, and p_C = 1/6. According to the prescription above, we construct a fictitious set {A, A, A, B, B, C} in which the relative frequency of the characters equals these probabilities. In this set the configurational entropies are given by

H_A = −log_2(1/2) = 1 bit ,
H_B = −log_2(1/3) ≈ 1.585 bit ,
H_C = −log_2(1/6) ≈ 2.585 bit .

Mean entropy: In many cases, one is not interested in each of the configurational entropies H_c but only in their average

$$H:=\langle H_c\rangle_{\Omega}=-\sum_{c\in\Omega}p_c\log_2 p_c\,.\tag{2.14}$$

In the case of data transmission, this so-called Shannon entropy describes the minimal information capacity of the transmission channel that is required to transmit the signal. Note that configurations with a vanishing probability p_c = 0, for which the logarithm in the above expression diverges, do not occur and can be excluded. This can be accounted for by using the convention

$$0\cdot\log_2 0=0\tag{2.15}$$

¹ This restriction is not very severe, since any real number can be approximated by a rational number to arbitrary precision.


Example: In the previous example of the information channel transmitting three letters A, B, C with different probabilities, the Shannon entropy would be given by

$$H=\sum_{c=A,B,C}p_c\,H_c=1.459\ \text{bit}\,.$$

Without previous knowledge the entropy of the system (information channel) would have been H = log_2 3 ≈ 1.585 bit, which is larger. This demonstrates that previous knowledge in the form of a probability distribution reduces the amount of information.
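In code, the example of the information channel reads (a minimal sketch in Python):

```python
import math

# The channel of the example: three characters with known probabilities
p = {"A": 1/2, "B": 1/3, "C": 1/6}

Hc = {c: -math.log2(pc) for c, pc in p.items()}  # configurational entropies (2.13)
H = sum(pc * Hc[c] for c, pc in p.items())       # Shannon entropy (2.14)

print(Hc)   # {'A': 1.0, 'B': 1.585..., 'C': 2.585...}
print(H)    # 1.459... bit, smaller than log2(3) = 1.585 bit
```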

Numerical estimation of entropy by sampling on a computer

Everyone knows that the average number of dots on a die can be estimated by throwing the die many times and computing the arithmetic mean of the results. The reason is that the probabilities p_c can be estimated by the relative frequencies p_c ≈ n_c/n of the results in n repeated experiments, with an error vanishing as 1/√n.

The entropy of a system can simply be estimated in the same way by replacing the probabilities with the relative frequencies, i.e.

$$H\approx-\sum_{c\in\Omega}\frac{n_c}{n}\,\log_2\frac{n_c}{n}\,.\tag{2.16}$$

Like any average, this estimator converges to the exact value in the limit n → ∞. However, an important difference is the presence of systematic errors on top of the statistical ones. In order to demonstrate these errors, Fig. 2.1 shows the actual value of the estimator for the entropy of a die plotted against the number of experiments (red curve). As expected, the data converges to the theoretical value H = log_2 6, which is indicated as a green horizontal line. However, the red data is not symmetrically scattered around the expected limit; instead it approaches the green line from below, indicating the presence of systematic corrections. To rule out that this effect is just a fluctuation, we averaged over many such sequences, plotting the result as a red dashed line.

These systematic corrections can be traced back to the nonlinearity of the logarithm in the entropy. Even many experts do not know that various correction methods have been developed which can compensate these systematic deviations to varying extents. The simplest one is the correction term

$$H\approx\frac{|\Omega|}{2n\ln 2}-\sum_{c\in\Omega}\frac{n_c}{n}\,\log_2\frac{n_c}{n}\tag{2.17}$$

introduced by Miller [5]. As is shown in Fig. 2.1 in blue color, this simple 1/n correction improves the entropy estimates significantly.

An even better estimator was suggested by Grassberger in 2003, who added another term, reading [2]

$$H\approx\underbrace{\frac{|\Omega|}{2n\ln 2}}_{\text{Miller}}\;-\;\underbrace{\frac{1}{n\ln 2}\sum_{c\in\Omega}\frac{(-1)^{n_c}}{n_c+1}}_{\text{Grassberger}}\;-\;\underbrace{\sum_{c\in\Omega}\frac{n_c}{n}\,\log_2\frac{n_c}{n}}_{\text{naive estimator}}\,.\tag{2.18}$$


Figure 2.1.: Numerical estimation of the entropy of a die (red data) depending on the number of throws n, together with the average over many repetitions (dashed line). The systematic bias of the data can be compensated by applying Miller's correction method (blue data, see text).

2.2. Entropy in Statistical Physics

Gibbs postulate and Second Law

Entropy as a measure of information has no direct physical meaning in itself. It acquires a physical meaning only through the circumstance that sufficiently complex physical systems evolve chaotically. Chaotic behavior means that any kind of fluctuation is amplified by the nonlinear equations of motion, leading effectively to an apparently random behavior. As discussed before, this allows us to use the cartoon introduced in the previous chapter of a system jumping randomly in its classical configuration space Ω according to specific rates w_c→c′. How this simple picture can be justified within the theory of quantum chaos is an interesting research topic on its own.

Quantum reversibility: As already mentioned in the previous chapter, to our present knowledge any physical system is ultimately described by the laws of quantum physics. In a nonrelativistic setting this means that even a very complex system evolves in time according to the Schrödinger equation

$$i\hbar\,\frac{\partial}{\partial t}\,|\psi\rangle=H\,|\psi\rangle\,.\tag{2.19}$$

This equation has the remarkable property that it is invariant under time reversal combined with complex conjugation

$$t\to-t\,,\qquad\psi\to\psi^*\tag{2.20}$$

Likewise the Hamilton equations of motion, which play the role of a classical counterpart of the Schrödinger equation,

$$\dot p=-\frac{\partial H}{\partial q}\,,\qquad\dot q=+\frac{\partial H}{\partial p}\tag{2.21}$$


are invariant under time reversal:

$$t\to-t\,,\qquad p\to-p\,.\tag{2.22}$$

Time reversal means the following: if a movie showing a physical time evolution is played backward, it looks physically reasonable, i.e., it could be a valid solution of the evolution equation with appropriate initial conditions.

The underlying assumption of both the unitary Schrödinger evolution and the Hamilton equations, which is often omitted in textbooks, is that the system under consideration is completely isolated from the environment. In other words, these equations are only valid in isolated systems.

Equal a priori postulate: Transferring this insight to the cartoon of a system jumping randomly in its configuration space, we arrive at the conjecture that an isolated system described within this framework should be time-reversal invariant as well. This means that the probability of any stochastic sequence of transitions forward in time has to be exactly equal to the probability of the reversed sequence backward in time. This is the case if and only if the rates in forward and backward direction are identical, leading us directly to the most fundamental postulate of statistical physics:

In an isolated physical system the transition rates are symmetric:

$$w_{c\to c'}=w_{c'\to c}\tag{2.23}$$

This axiom holds for systems which are perfectly isolated so that they do not interact with their environment. Under this condition the axiom states that spontaneous jumps between a pair of states are equally probable in both directions. Therefore, the motion of the system has no preferred direction; it rather diffuses in its configuration space like a random walker.

For symmetric transition rates the Liouvillian is by construction symmetric. Hence the condition of probability conservation ⟨1|L = 0 (see Eq. (1.14)) immediately implies the equality

$$L^T|1\rangle=L|1\rangle=0\,,\tag{2.24}$$

i.e., the column vector (1, 1, 1, ...)^T is a right eigenvector to the eigenvalue zero. In addition, if the system is ergodic, this eigenvector has to be proportional to the stationary state of the system, implying

$$|P_{stat}\rangle=\frac{1}{|\Omega|}\,|1\rangle\,.\tag{2.25}$$

Thus, in the long-time limit t → ∞, an isolated ergodic system will relax into a stationary state where all configurations are equally probable. This is also known as the equipartition postulate, the equal-a-priori postulate, or as the Gibbs postulate:

In an isolated stationary ergodic system all states occur with the same probability p_i = 1/|Ω|.

As we will see below, the whole theory of equilibrium statistical mechanics and thermodynamics can be derived from this simple postulate.
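The statement is easy to check numerically: for any symmetric rate matrix the uniform distribution is annihilated by the Liouvillian (a sketch; the Liouvillian is built in the sign convention of Eq. (1.63), with loss terms on the diagonal):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6

# Symmetric rates w[c, c'] = w[c', c] of an isolated system, Eq. (2.23)
w = rng.random((N, N))
w = (w + w.T) / 2
np.fill_diagonal(w, 0)

# Liouvillian with losses on the diagonal;
# (Lmat @ P)[c] = sum_c' (J_{c->c'} - J_{c'->c})
Lmat = np.diag(w.sum(axis=1)) - w.T

P_uniform = np.ones(N) / N
print(np.max(np.abs(Lmat @ P_uniform)))   # numerically zero: uniform state is stationary
```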


Second law of thermodynamics: Suppose that we have obtained the information that the system is in the configuration c_0 at time t = 0, meaning that the initial probability distribution is given by P_c(0) = δ_{c,c_0}. If we let the system evolve without further measurement and if the rates are symmetric in both directions, the system will basically perform a random walk in its own configuration space. Thus, as time proceeds, it will be less and less clear where the system actually is, meaning that the information needed to characterize its configuration increases. This is the essential content of the famous Second Law of Thermodynamics:

In an isolated physical system the average entropy cannot decrease.

$$\Delta H\geq 0\tag{2.26}$$

We will come back to the second law when studying fluctuation theorems.

Subsystems

In most physical situations the system under consideration is not isolated; instead it interacts with the environment. In this case the usual approach of statistical physics is to consider the system combined with the environment as a composite system. From an extreme point of view this could encompass the entire Universe. This superordinate total system is then assumed to be isolated, following the same rules as outlined above.

To distinguish the total system from its parts, we will use the suffixes ‘tot’ for the total system, while ‘sys’ and ‘env’ refer to the embedded subsystem and its environment, respectively.

The total system, which includes the laboratory system as well as the environment, is characterized by a certain space Ω_tot of classical configurations c ∈ Ω_tot. The number of these configurations may be enormous and they are usually not accessible in experiments, but in principle there should exist a corresponding probability distribution P_c(t) evolving by some master equation (cf. Eq. (1.3))

$$\frac{d}{dt}\,P_c(t)=\sum_{c'\in\Omega_{tot}}\Big(J_{c'\to c}(t)-J_{c\to c'}(t)\Big)\,.\tag{2.27}$$

Here we introduced the so-called probability current

$$J_{c\to c'}(t)=P_c(t)\,w_{c\to c'}\tag{2.28}$$

flowing from configuration c to configuration c′, where w_c→c′ ≥ 0 denotes the corresponding time-independent transition rate.

Let us now consider a subsystem embedded into an environment. Obviously, for every classical configuration c ∈ Ω_tot of the total system we will find the subsystem


Figure 2.2.: A subsystem is defined by a projection π which maps each configuration c ∈ Ω_tot of the total system (left) onto a particular configuration s ∈ Ω_sys of the subsystem (right), dividing the configuration space of the total system into sectors. The figure shows a stochastic path in the total system together with the corresponding stochastic path in the subsystem.

in a well-defined unique configuration, which, for the sake of clarity, we will denote as s ∈ Ω_sys. Conversely, for a given configuration of the subsystem s ∈ Ω_sys the environment (and therewith the total system) can be in many different configurations. This relationship can be expressed in terms of a surjective map π : Ω_tot → Ω_sys which projects every configuration c of the total system onto the corresponding configuration s of the subsystem, as sketched schematically in Fig. 2.2.

The projection π divides the space Ω_tot into sectors π^−1(s) ⊂ Ω_tot which consist of all configurations which are mapped onto the same s. Therefore, the probability to find the subsystem in configuration s ∈ Ω_sys is the sum over all probabilities in the corresponding sector, i.e.

$$P_s(t)=\sum_{c(s)}P_c(t)\,,\tag{2.29}$$

where the sum runs over all configurations c ∈ Ω_tot with π(c) = s, denoted as c(s). Likewise, the projected probability current J_s→s′ in the subsystem flowing from configuration s to configuration s′ is the sum of all corresponding probability currents in the total system:

$$J_{s\to s'}(t)=\sum_{c(s)}\sum_{c'(s')}J_{c\to c'}(t)=\sum_{c(s)}P_c(t)\sum_{c'(s')}w_{c\to c'}\,.\tag{2.30}$$

This allows us to define effective transition rates in the subsystem by

$$w_{s\to s'}(t)=\frac{J_{s\to s'}(t)}{P_s(t)}=\frac{\sum_{c(s)}P_c(t)\sum_{c'(s')}w_{c\to c'}}{\sum_{c(s)}P_c(t)}\,.\tag{2.31}$$

In contrast to the transition rates of the total system, which are usually assumed to be constant, the effective transition rates in the subsystem may depend on time. With these time-dependent rates the subsystem evolves according to the master equation

$$\frac{d}{dt}\,P_s(t)=\sum_{s'\in\Omega_{sys}}\Big(J_{s'\to s}(t)-J_{s\to s'}(t)\Big)\,,\qquad J_{s\to s'}(t)=P_s(t)\,w_{s\to s'}(t)\,.\tag{2.32}$$

From the subsystem's point of view the time dependence of the rates reflects the unknown dynamics in the environment. Moreover, ergodicity plays a subtle role: Even if the dynamics of the total system was ergodic, the dynamics within the sectors π^−1(s) is generally non-ergodic and may decompose into several ergodic subsectors. As we will see later, this allows the environmental entropy to increase even if the subsystem is stationary.
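As a toy illustration of Eqs. (2.29)-(2.31), consider a total system of two bits whose first bit constitutes the subsystem (a sketch; the projection π and the random rates are deliberately simple choices, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(2)

# Total system: two bits, configurations c = 0..3; the subsystem is the first
# bit, so the projection is pi(c) = c // 2 and each sector has two elements
sector = np.arange(4) // 2
w = rng.random((4, 4))
np.fill_diagonal(w, 0)                     # rates w[c, c'] of the total system

def effective_rate(P, s, s2):
    """Effective subsystem rate w_{s->s'}(t) of Eq. (2.31) for distribution P."""
    cs, cs2 = np.where(sector == s)[0], np.where(sector == s2)[0]
    J = sum(P[c] * w[c, c2] for c in cs for c2 in cs2)   # current, Eq. (2.30)
    return J / P[cs].sum()

# Two different distributions P(t) of the total system give different
# effective rates: seen from the subsystem, the rates depend on time
print(effective_rate(np.array([0.7, 0.1, 0.1, 0.1]), 0, 1))
print(effective_rate(np.array([0.25, 0.25, 0.25, 0.25]), 0, 1))
```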


Detailed balance: Let us now consider a situation where the total system (the whole Universe) approaches a stationary state. In this limit, all configurations of the total system are equally probable, i.e., P_c^stat = 1/|Ω_tot|. For the laboratory system, however, the equipartition postulate does not apply. In fact, if we evaluate Eq. (2.29) we obtain

$$P_s^{stat}=\sum_{c(s)}P_c^{stat}=\frac{|\pi^{-1}(s)|}{|\Omega_{tot}|}\,.\tag{2.33}$$

This means that the stationary probability of a certain configuration s of the laboratory system is proportional to the number of the corresponding configurations of the total system, |π^−1(s)|.

Let us now turn to the probability currents. Assuming stationarity, Eq. (2.30) turns into

$$J_{s\to s'}(t)=\frac{1}{|\Omega_{tot}|}\sum_{c(s)}\sum_{c'(s')}w_{c\to c'}\,.\tag{2.34}$$

As the total system is isolated, we know that the transition rates of the total system have to be symmetric, i.e., w_c→c′ = w_c′→c. Inserting this symmetry into the equation above, we immediately recognize that the probability currents have to be symmetric as well:

$$J_{s\to s'}=J_{s'\to s}\,.\tag{2.35}$$

In other words, in the equilibrium state the probability currents cancel one another. This is of course not very surprising: Since the probability currents in the total system cancel by definition, we expect the same to hold in any subsystem.

Note that this cancellation takes place between all pairs of configurations and not only effectively between groups of configurations. For this reason such a situation, where all probability currents balance one another microscopically, is referred to as obeying detailed balance.

A stationary state is said to obey detailed balance if all probability currents cancel pairwise.²

Obviously, detailed balance implies stationarity since there is no net probability flow in the system. However, the effective rates of the laboratory system are not necessarily symmetric:

$$w_{s\to s'}\neq w_{s'\to s}\tag{2.36}$$

According to the equipartition postulate the whole Universe is expected to evolve into a state where all configurations are equally probable. This so-called thermal death hypothesis has been discussed intensively around the end of the 19th century. Today we know that our expanding Universe is actually far from thermalizing into such a state. However, the immediate surrounding of the laboratory system could be locally thermalized so that the considerations made above still make sense.

² As we will see in the following chapter, there are also stationary states which do not obey detailed balance. Such states are called non-equilibrium steady states (NESS).


Thermodynamic equilibrium: Throughout this lecture the terms “stationarity” and “equilibrium” mean different things.

• Stationarity means that the probability distribution P_s(t) does not depend on time, i.e. P_s(t) = P_s^stat.

• Equilibrium means that the system is stationary and additionally that the rates obey the condition of detailed balance.

Therefore, thermodynamic equilibrium is a much stronger condition than stationarity. This difference can also be seen by looking at the right-hand side of the master equation

$$\frac{d}{dt}\,P_s(t)=\sum_{s'\in\Omega_{sys}}\Big(J_{s'\to s}(t)-J_{s\to s'}(t)\Big)\,.\tag{2.37}$$

Stationarity requires that the sum on the right-hand side of this equation vanishes identically. Equilibrium, on the other hand, requires that each term of the sum vanishes individually. Clearly, the second condition is much more restrictive.

If the system under consideration is isolated, the Gibbs postulate tells us that in the stationary state all configurations are equally probable and that the rates have to be symmetric. Consequently, the entropy of such an isolated system attains the maximal possible value H = ln |Ω|.

As we have shown above, the Gibbs postulate allows us to derive that an open system interacting with its environment approaches a stationary state where it obeys the principle of detailed balance. In this situation, the configurations are not necessarily equally probable, meaning that the entropy is not necessarily maximal. Thus, although the total system (consisting of laboratory system and environment) attains the maximal entropy, this does not automatically apply to a subsystem. In fact, as will be shown, different thermodynamic potentials may become extremal in this case.

2.3. Thermostatics with conserved quantities and reservoirs

Systems with conserved quantities

Microcanonical ensemble: In an isolated system without conserved quantities, the Second Law tells us that the system would end up in an equally distributed state, which is not particularly interesting. However, as in any field of physics, the situation changes completely in the presence of symmetries and the associated conservation laws.

As the most important example let us consider energy conservation. As illustrated in Fig. 2.3, this conservation law divides the configuration space Ω_sys into sectors Ω_sys(E) of the same energy. For an isolated system this means that spontaneous jumps are now restricted to pairs of configurations within the same sector, while jumps between different sectors are forbidden. Within these sectors the Second Law remains valid, i.e. the system will approach a stationary equipartitioned state within this sector with the (energy-dependent) entropy

$$H(E)=\ln|\Omega_{sys}(E)|\,.\tag{2.38}$$


Figure 2.3.: Sectors in the state space of a system with energy conservation.

Figure 2.4.: Two systems exchanging energy. The composite system is considered as being isolated.

In statistical physics, this situation is referred to as the microcanonical ensemble.

Systems exchanging entropy and energy: Let us now consider the case of two weakly interacting systems A and B coupled by a thermal bridge, as shown in Fig. 2.4. In such a situation the energy conservation law

$$E=E_A+E_B\tag{2.39}$$

still holds for the combined total system but no longer for its parts. In fact, it may happen that subsystem A jumps from one energy sector to another one, provided that subsystem B simultaneously jumps to another sector in the opposite direction. Obviously such an interaction leads to an exchange of energy between the systems.

Furthermore let us assume that the coupling is so weak that the thermal bridge itself does not significantly contribute to the entropy of the entire setup, so that the total entropy of the system can be written as the sum of its parts, i.e.,

$$H=H_A+H_B\,.\tag{2.40}$$

Since the composite system AB can be thought of as being isolated, we can again apply the Second Law, meaning that the system approaches a stationary state in which H = log_2 |Ω| is maximal and constant. However, in the stationary state the system still fluctuates and the two parts continue to exchange energy. This means that H_A, H_B as well as E_A, E_B fluctuate in such a way that their sums are constant, i.e. a gain on one side implies an equal loss on the other side and vice versa:

$$\left.\begin{array}{l}H=H_A+H_B=\mathrm{const}\\E=E_A+E_B=\mathrm{const}\end{array}\right\}\;\Rightarrow\;\Delta H_A=-\Delta H_B\ \text{ and }\ \Delta E_A=-\Delta E_B\,.\tag{2.41}$$

Regarding the entropies of the subsystems H_A(E_A) and H_B(E_B) as continuous functions of the corresponding energies, the differential relation dE_A = −dE_B implies that

$$\frac{\partial H_A}{\partial E_A}=-\frac{\partial H_A}{\partial E_B}\,.\tag{2.42}$$


On the other hand, since the entropy of the total system is maximal and constant, we have dH_A = −dH_B, leading to

$$\frac{\partial H_A}{\partial E_B}=-\frac{\partial H_B}{\partial E_B}\qquad\Rightarrow\qquad\frac{\partial H_A}{\partial E_A}=\frac{\partial H_B}{\partial E_B}\,.\tag{2.43}$$

Thus, in the stationary state of the combined system, the derivative

$$\frac{1}{T}=\beta:=\frac{\partial H}{\partial E}\tag{2.44}$$

is equal on both sides. Here T is the so-called temperature of the system. Consequently we can interpret temperature as the change of the required information to specify the configuration of a system caused by a change of the energy. In other words, it tells us how much the logarithm of the size of the sector grows if we increase the energy of the system. According to the arguments given above, the two systems continue to exchange energy until their temperatures coincide.

The unit of temperature: In our everyday world we measure temperature in units of Kelvin, Celsius or Fahrenheit. However, from a fundamental point of view, the introduction of an independent unit of temperature is absolutely superfluous. In fact, since temperature is energy divided by information, the correct SI unit would be [T] = Joule/bit. The price we pay for introducing an unnecessary unit is the necessity of an unnecessary conversion factor, the so-called Boltzmann constant k_B = 1.380649 · 10^−23 J/K. The existence of this constant, like the existence of the vacuum permeabilities ε_0, µ_0 in electrodynamics, can be considered as a historical accident. However, of course it would be somewhat strange to specify the temperature in a weather forecast in units of electron volts per kilobyte.

Intuitive interpretation: To understand the meaning of β = 1/T, let us consider two economic systems such as the crude oil exchange markets in Rotterdam and New York. The traders working in these places are not interested in crude oil in itself, maybe they have never seen a single barrel; rather they are interested in making money. They do so by randomly trading oil for money between the two places, causing a fluctuating physical current of oil. Theories in economy tell us that this process continues until the markets reach an equilibrium state in which the price in dollars per barrel β = d$/d(oil) is the same on both sides. In this equilibrium state some oil will continue to fluctuate between the two places, but there will be no net flow any more; hence, both trading places will be in detailed balance.

The analogy with trading places allows us to understand intuitively the meaning of ‘hot’ and ‘cold’. If the system is hot, it can offer a lot of energy for very little entropy, i.e., the price of energy in terms of entropy is low. Conversely, a cold system has only little energy to sell and therefore the price of energy is high. If two such systems interact with each other, the cold system will be interested in buying energy from the hot one at a very low price. As a result energy flows from the hot to the cold system.


Figure 2.5.: Equilibration of markets (see text).

Summary: Interpretation of temperature:
β = 1/T is the price at which a system can provide energy in return for entropy. Hot systems offer energy cheaply. Contrarily, buying energy from a cold system is expensive.

Several conserved quantities: The same arguments are valid for any other conserved quantity such as volume, magnetization, and particle number. Let us denote these conserved quantities by X^(1), X^(2), .... If several subsystems A, B, C, ... exchange these conserved quantities among each other, they will eventually approach a stationary state, called thermodynamic equilibrium, in which the partial derivatives

$$\beta^{(\mu)}=\frac{\Delta H}{\Delta X^{(\mu)}}\approx\frac{\partial H}{\partial X^{(\mu)}}\,,\qquad\mu=1,2,\ldots\tag{2.45}$$

acquire the same value in all subsystems. Conservation of energy with β = 1/k_BT is just a special case.

In physics the situation is basically the same as in economics: The subsystems are not really interested in the conserved quantity itself; rather they trade conserved quantities in order to make as much entropy as possible. Depending on the initial condition this causes a macroscopically visible physical flow of the respective conserved quantities X^(µ) until the “price” β^(µ) in units of entropy per X^(µ) is the same among all subsystems.

2.4. Conserved quantities and external reservoirs

Systems coupled to a heat bath

Heat baths and free energy: If one of the two systems is so large that the smaller system cannot influence the ‘price’ of the exchanged physical quantity in terms of entropy, the larger one is called a reservoir. In what follows, we will use the suffixes ‘sys’ for the laboratory system and ‘env’ for the reservoir or, synonymously, the environment.

In the case of energy exchange, the reservoir is usually referred to as a “heat bath”, which is characterized by a given constant temperature T_env. The constant temperature allows us to relate energy and entropy changes in the heat bath without knowing its actual internal structure:

$$dE_{env}=T_{env}\,dH_{env}\,.\tag{2.46}$$

Since energy is conserved we have

$$dE_{sys}=-dE_{env}=-T_{env}\,dH_{env}\,.\tag{2.47}$$


Figure 2.6.: Physical system in contact with a heat bath at constant temperature, exchanging both energy and entropy.

By integrating this relation we obtain³

$$E_{sys}=\mathrm{const}-T_{env}H_{env}\,.\tag{2.48}$$

Defining the Helmholtz free energy

$$F_{sys}=E_{sys}-T_{env}H_{sys}\,.\tag{2.49}$$

(written as F = U − TS in most textbooks) we find that

$$F_{sys}=\mathrm{const}-T_{env}(H_{sys}+H_{env})=\mathrm{const}-T_{env}H_{tot}\,.\tag{2.50}$$

As can be seen on the right-hand side of this equation, the maximization of the total entropy is tantamount to the minimization of F_sys. Therefore, the potential called free energy, which looks as if it were a property of the system alone, already incorporates the interaction with the heat bath without knowing its specific internal structure. In other words, from a formal point of view, the definition of the free energy can be seen as a clever trick to write the total entropy of the combined system in the form of a potential for a single system. Therefore, whenever we deal with a free energy, we have to keep in mind that the system is tacitly assumed to be coupled to an external heat bath.

Summary: The meaning of the “free energy” :The free energy F is nothing but a different way to write the total entropy of the system plusheat bath (up to a minus sign and a constant) in the form of a single potential of the systemalone. It may be seen as a formal trick to write the combined entropy in a more convenientform, but it is important to realize that does not introduce a new kind of Second Law.

Individual and mean free energy: Let us be a bit more precise in deriving the free energy: For a given configuration s ∈ Ωsys the system has a well-defined energy Esys(s), which we will denote here as the configurational energy. The main assumption is that upon each microscopic fluctuation of the system the heat bath equilibrates so quickly that it responds to an energy change with an immediate, well-defined change of its entropy:

∆Henv = β ∆Eenv .   (2.51)

³This integration has to be taken with a grain of salt. On the one hand, the integration is only possible in the case of an exact differential. On the other hand, it is not fully clear whether the environment can be integrated at all, producing a finite integration constant, and whether there exists a definite entropy of the environment. For simplicity, we ignore these questions at this point.


Thus, whenever the system undergoes a transition s → s′, the energies and entropies change instantaneously as

∆Eenv(s→s′) = −∆Esys(s→s′) = −(Esys(s′) − Esys(s)) ,   (2.52)
∆Henv(s→s′) = −β(Esys(s′) − Esys(s)) ,   (2.53)

implying that the entropy of the environment is a well-defined function of the system's configuration s:

Henv(s) = const − βEsys(s) .   (2.54)

Therefore, we can express the entropy of the total system solely in terms of the configuration of the subsystem:

Htot(s) = const + Hsys(s) − βEsys(s) ,   (2.55)

where the last two terms define the potential V(s) = Hsys(s) − βEsys(s) = −βF(s).

This allows us to define the configurational free energy

Fs = −β⁻¹Vs = Esys(s) − T Hsys(s) .   (2.56)

Note that this definition associates with each configuration s ∈ Ωsys of the subsystem an individual free energy. The ordinary definition of the free energy, which can be found in most textbooks, is obtained by averaging over all subsystem configurations:

F = 〈Fs〉 = 〈Esys(s)〉 − T〈Hsys(s)〉 = U − TH .   (2.57)

The average energy U = 〈E〉 is also referred to as the internal energy of the subsystem. Clearly, minimizing the average free energy is equivalent to maximizing the average total entropy 〈Htot〉.

Summary: Configurational free energy:
As in the case of entropy, where we distinguish between the configurational entropy Hs and the average entropy H, we can define a configurational free energy Fs for every system configuration s ∈ Ωsys and an average free energy F, provided that the environment equilibrates almost instantaneously.


3. Systems out of equilibrium

So far we have studied complex systems in thermal equilibrium. Thermal equilibrium means that (a) the system is stationary, and (b) there are no physical currents flowing through the system, such as heat, particle, or electrical currents. Conversely, whenever such currents are present, the system is said to be out of equilibrium. In this chapter we are going to study such situations in more detail.

3.1. Dynamics of subsystems

Recalling basic facts about Markov processes

Before starting, let us briefly summarize the formalism of stochastic Markov processes developed in Chapters 1 and 2. At any time t the system is in a certain configuration c ∈ Ω, where Ω denotes the configuration space. The system evolves in time by spontaneous jumps from one configuration to another with certain transition rates wc→c′. These transition rates are not necessarily symmetric, i.e., wc→c′ ≠ wc′→c in general.

The actual trajectory of the system cannot be predicted. What can be predicted is the probability Pc(t) to find the system at time t in the configuration c. Introducing the probability currents

Jc→c′(t) = Pc(t) wc→c′   (3.1)

these probabilities evolve deterministically according to the master equation

∂t Pc(t) = ∑_{c′≠c} Jc′→c(t) − ∑_{c′≠c} Jc→c′(t) ,   (3.2)

where the first sum collects the gain terms and the second the loss terms. This is a linear differential equation. Listing these probabilities in a canonical order as components of a vector |Pt〉, we can rewrite the master equation compactly as

∂t|Pt〉 = −L|Pt〉 . (3.3)

Here L is the Liouville operator with the matrix elements

〈c′|L|c〉 = −wc→c′ + δc,c′ ∑_{c′′} wc→c′′ ,   (3.4)

where the diagonal elements 〈c|L|c〉 = ∑_{c′′} wc→c′′ are the so-called escape rates, describing how much probability flows away from c to elsewhere.


The formal solution of the master equation reads

|Pt〉 = exp(−Lt)|P0〉 . (3.5)

This solution looks simple, but the evaluation of the matrix exponential requires diagonalizing the Liouvillian, which is generally a non-trivial task.

Introducing the bra vector 〈1| = (1, 1, 1, …, 1), the normalization condition (probability conservation) requires that 〈1|Pt〉 = 1 holds for all times, implying 〈1|L = 0. This means that the column sum of the Liouvillian matrix vanishes. Consequently, the column sum of a finite-time evolution matrix T = e^{−Lt} is equal to 1.
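As a concrete illustration, the following minimal C++ sketch (not part of the original lecture notes; the three-state rates are arbitrary illustrative numbers) builds the Liouvillian of Eq. (3.4) for a small configuration space, verifies that its column sums vanish, and integrates the master equation (3.3) with simple Euler steps:

//---- minimal sketch: Liouvillian of a three-state Markov process ----
#include <cstdio>

const int M = 3;                      // number of configurations
double w[M][M] = {{0, 2, 1},          // w[c][d] = rate c -> d (arbitrary)
                  {1, 0, 3},
                  {2, 1, 0}};
double L[M][M];                       // Liouvillian <d|L|c>, Eq. (3.4)
double P[M] = {1, 0, 0};              // initial probability vector

int main()
{
   for (int c = 0; c < M; c++) {      // build the Liouvillian
      double esc = 0;                                // escape rate of c
      for (int d = 0; d < M; d++) esc += w[c][d];
      for (int d = 0; d < M; d++) L[d][c] = -w[c][d];
      L[c][c] = esc;
   }
   for (int c = 0; c < M; c++) {      // check <1|L = 0 (column sums)
      double colsum = 0;
      for (int d = 0; d < M; d++) colsum += L[d][c];
      printf("column sum %d: %g\n", c, colsum);
   }
   double dt = 0.001;                 // Euler steps for dP/dt = -L P
   for (int n = 0; n < 100000; n++) {
      double Q[M] = {0, 0, 0};
      for (int d = 0; d < M; d++)
         for (int c = 0; c < M; c++) Q[d] -= dt * L[d][c] * P[c];
      for (int d = 0; d < M; d++) P[d] += Q[d];
   }
   printf("stationary distribution: %g %g %g\n", P[0], P[1], P[2]);
   return 0;
}

For long times the printed vector approaches the right eigenvector of L with eigenvalue zero, i.e., the stationary state of Eq. (3.6).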

A Markov process is called stationary if the probability distribution Pc(t) does not depend on time. In the vector space formalism, this means that |Pstat〉 is a right eigenvector to the eigenvalue zero:

∂t|Pstat〉 = −L|Pstat〉 = 0 .   (3.6)

A system is called isolated if it does not interact by any means with the outside world. Arguing that the quantum-mechanical evolution of an isolated system would be unitary and thus invariant under time reversal, we can conclude that the corresponding transition rates have to be symmetric, i.e.,

wc→c′ = wc′→c ,   (isolated system)   (3.7)

meaning that the matrix L of an isolated system is symmetric. This implies that left and right eigenvectors have the same components, hence Lᵀ|1〉 = L|1〉 = 0. Thus, if the stationary state is unique, it is given by

|Pstat〉 = (1/|Ω|) |1〉 .   (3.8)

This result, known as the Gibbs postulate or equal-a-priori postulate, tells us that an isolated system relaxes into a stationary equilibrium state in which all configurations are equally probable. In this situation the entropy of the system takes on its maximal value H = ln |Ω|.

The celebrated second law of thermodynamics states in addition that the average entropy of an isolated system on its way into this equilibrium state can only increase, i.e., ∆H ≥ 0. One of the purposes of this chapter is to prove and generalize this inequality.

Systems embedded into the environment

As already discussed in Chapter 2, a system is usually embedded into a larger system, called the environment. This surrounding system could be very large, and we may think, in an over-simplified scenario, of the whole universe. Most importantly, this surrounding system itself is assumed to be isolated. We want to describe the whole environment in the same way as the embedded system itself, i.e., in terms of configurations and transition rates. These configurations, denoted by c, are elements of an incredibly large configuration space Ωtot, where 'tot' stands for 'total'. Contrarily, we denote the configurations of the laboratory system by s ∈ Ωsys.


Figure 3.1.: A subsystem is defined by a projection π which maps each configuration c ∈ Ωtot of the total system (left) onto a particular configuration s ∈ Ωsys of the subsystem (right), dividing the configuration space of the total system into sectors.

Remember: We consider the division into:

• System: The laboratory system, measurable, configurations s ∈ Ωsys

• Environment: Everything else, non-measurable.
• Total: System and environment together, configurations c ∈ Ωtot

For every configuration c of the 'universe' our laboratory system will be in a well-defined configuration denoted as s ∈ Ωsys. In order to express this relationship, we have introduced a projection

π : Ωtot → Ωsys : c ↦ s = π(c) .   (3.9)

In statistical physics the process of projecting a larger system onto a smaller one is often referred to as coarse-graining the configuration space, as illustrated in Fig. 3.1. As discussed previously, this divides the configuration space of the total system into sectors denoted as

Ω^tot_s = { c ∈ Ωtot | π(c) = s } =: π⁻¹(s) .   (3.10)

If the total system evolves only within such a sector, the configuration of the laboratory system will not change. However, it is important to realize that a sector π⁻¹(s) may not be fully connected within itself, even if the total system is ergodic. The sector π⁻¹(s) may rather decompose into several internally connected subsectors. Moving directly from one subsector to another within π⁻¹(s) is then impossible; they can only be connected by a path passing through different sectors, which requires temporarily changing the configuration of the laboratory system. As we will see below, this is basically what happens in any type of cyclically moving engine.


Coarse-grained master equation

Under projective coarse-graining the probability distribution can be coarse-grained simply by adding all probabilities in the respective sector:

Ps(t) = ∑_{c∈Ω^tot_s} Pc(t) = ∑_{c(s)} Pc(t) .   (3.11)

Here we introduced the compact notation ∑_{c(s)} for a sum running over all configurations of the total system with π(c) = s. Likewise, the probability currents between different configurations can be coarse-grained as well:

Js→s′(t) = ∑_{c(s)} ∑_{c′(s′)} Jc→c′(t) = ∑_{c(s)} Pc(t) ∑_{c′(s′)} wc→c′ .   (3.12)

If the total system (consisting of laboratory system and environment) evolves according to the master equation

(d/dt) Pc(t) = ∑_{c′∈Ωtot} ( Jc′→c(t) − Jc→c′(t) ) ,   (3.13)

one can easily show that the laboratory system will evolve according to the coarse-grained master equation

(d/dt) Ps(t) = ∑_{s′∈Ωsys} ( Js′→s(t) − Js→s′(t) ) .   (3.14)

Note that although the rates of the (isolated) total system are symmetric and time-independent, the effective rates

ws→s′(t) = Js→s′(t) / Ps(t) = [ ∑_{c(s)} Pc(t) ∑_{c′(s′)} wc→c′ ] / [ ∑_{c(s)} Pc(t) ]   (3.15)

are generally neither symmetric nor constant.
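To see Eq. (3.15) at work, consider the following toy construction (an illustrative example, not from the notes): a 'total system' with six configurations is projected onto two sectors of sizes 2 and 4, with symmetric unit rates connecting every pair of configurations in different sectors. With the uniform stationary distribution the effective coarse-grained rates come out asymmetric:

//---- toy coarse-graining: 6 total configurations, 2 sectors ----
#include <cstdio>

int pi(int c) { return (c < 2) ? 0 : 1; }   // projection: sector 0 = {0,1}

int main()
{
   const int M = 6;
   double w[M][M], P[M];
   for (int c = 0; c < M; c++) {
      P[c] = 1.0 / M;                        // uniform stationary state
      for (int d = 0; d < M; d++)            // symmetric unit rates between
         w[c][d] = (pi(c) != pi(d)) ? 1 : 0; // different sectors only
   }
   for (int s = 0; s < 2; s++) {             // effective rates, Eq. (3.15)
      int s2 = 1 - s;
      double J = 0, Ps = 0;
      for (int c = 0; c < M; c++) if (pi(c) == s) {
         Ps += P[c];
         for (int d = 0; d < M; d++) if (pi(d) == s2) J += P[c] * w[c][d];
      }
      printf("w_%d->%d = %g\n", s, s2, J / Ps);   // prints 4 and 2
   }
   return 0;
}

Although all microscopic rates are symmetric, the program prints w_{0→1} = 4 and w_{1→0} = 2: the coarse-grained rate into the larger sector is bigger, and the ratio equals the ratio of the sector sizes. This anticipates the entropic force discussed below.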

Detailed balance

At this point let us briefly recall the concept of detailed balance (see page 40). This condition holds whenever the system is in thermal equilibrium with the surrounding environment and the environment itself is in equilibrium. If a total system is in thermal equilibrium, we have seen that all configurations c ∈ Ωtot are equally probable, implying that the coarse-grained probabilities of the laboratory system are given by

P^stat_s = ∑_{c(s)} P^stat_c = ∑_{c(s)} 1/|Ωtot| = |Ω^tot_s| / |Ωtot| .   (3.16)

This means that the stationary probability of a certain configuration s of the laboratory system is proportional to the number of configurations in the corresponding sector. Likewise, the stationary probability currents in the laboratory system are given by

J^stat_{s→s′} = (1/|Ωtot|) ∑_{c(s)} ∑_{c′(s′)} wc→c′ .   (3.17)


Knowing that the transition rates of the total system have to be constant and symmetric, this equation implies that the probability currents have to be symmetric as well, leading directly to the condition of detailed balance:

J^stat_{s→s′} = J^stat_{s′→s} ,   (3.18)

or, equivalently, in textbook form:

P^stat_s ws→s′ = P^stat_{s′} ws′→s   ∀ s ≠ s′ .   (3.19)

Alternative definition of detailed balance: Less well known, even among experts, is an alternative definition of detailed balance which does not rely on knowledge of the stationary state. The starting point is the notion of stochastic paths or stochastic trajectories, which are nothing but sequences of configurations through which the system evolves. Let us now consider a cyclically closed stochastic path

s0 → s1 → s2 → … → sN → s0   (3.20)

which goes back to its starting point. Then it is easy to show that the system obeys detailed balance if and only if the following identity holds for any closed stochastic trajectory:

∏_{i=0}^{N} [ w_{s_i→s_{i+1}} / w_{s_{i+1}→s_i} ] = 1 ,   (3.21)

where s_{N+1} ≡ s0. This criterion tells us that if the product of the transition rates along any closed cycle of transitions in the forward direction is exactly equal to the product of the rates in the backward direction, we can conclude that the stationary state will obey detailed balance. To prove this, we first note that the trivial identity

∏_{i=0}^{N} [ p^stat_{s_i} / p^stat_{s_{i+1}} ] = 1   (3.22)

holds for any closed cycle of transitions, i.e., the stationary probabilities simply drop out along a closed cycle. Therefore, we have

1 = ∏_{i=0}^{N} [ J^stat_{s_i→s_{i+1}} / J^stat_{s_{i+1}→s_i} ]
  = ∏_{i=0}^{N} [ p^stat_{s_i} w_{s_i→s_{i+1}} / ( p^stat_{s_{i+1}} w_{s_{i+1}→s_i} ) ]
  = ∏_{i=0}^{N} [ p^stat_{s_i} / p^stat_{s_{i+1}} ] · ∏_{i=0}^{N} [ w_{s_i→s_{i+1}} / w_{s_{i+1}→s_i} ]
  = ∏_{i=0}^{N} [ w_{s_i→s_{i+1}} / w_{s_{i+1}→s_i} ] ,   (3.23)

where the first equality uses detailed balance, under which each ratio of stationary currents equals 1. This proves one direction of the criterion; the converse follows by constructing the stationary probabilities from the rate ratios along paths.

Remember: Definition of detailed balance:
A stationary state is said to obey detailed balance iff all probability currents cancel pairwise. As an alternative criterion, the product of the transition rates along any closed cycle has to be equal in the forward and backward direction.
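As a quick numerical illustration of the alternative criterion (an illustrative check, not part of the original notes), the following C++ sketch evaluates the product of rates along the 3-cycle of a three-state system, once for symmetric rates and once for cyclically biased ones:

//---- cycle criterion (3.21) for a three-state system ----
#include <cstdio>

// forward over backward rate product along the cycle 0 -> 1 -> 2 -> 0
double cycle_ratio(double w[3][3])
{
   return (w[0][1] * w[1][2] * w[2][0]) /
          (w[1][0] * w[2][1] * w[0][2]);
}

int main()
{
   double w_db[3][3]   = {{0, 1, 3}, {1, 0, 2}, {3, 2, 0}};  // symmetric
   double w_ness[3][3] = {{0, 2, 1}, {1, 0, 2}, {2, 1, 0}};  // biased

   printf("symmetric rates: ratio = %g\n", cycle_ratio(w_db));    // 1
   printf("biased rates:    ratio = %g\n", cycle_ratio(w_ness));  // 8
   return 0;
}

The symmetric system satisfies the criterion and therefore obeys detailed balance, while the cyclically biased system violates it, signaling a nonequilibrium stationary state.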


Nonequilibrium

If the transition rates ws→s′(t) do not obey the condition of detailed balance, the laboratory system is said to be out of equilibrium. Since we have derived detailed balance from the assumption that the total system has already reached its equally-distributed microcanonical stationary state, it conversely follows that a laboratory system out of equilibrium requires the total system to be nonstationary. As such, the total system has not yet reached the state of maximal entropy. Therefore, as a hallmark of nonequilibrium, we expect the entropy of the total system (and therewith the entropy of the environment) to increase. Such an increase of entropy in the environment is interpreted as an entropy production caused by the nonequilibrium dynamics of the laboratory system.

Before discussing entropy production in more detail, let us consider a simple example of a nonequilibrium system. First, we note that a system with only two configurations always obeys detailed balance by definition (exercise). Therefore, nonequilibrium requires at least three configurations.

Such a toy system with three configurations s1, s2, s3 is shown on the right-hand side. As usual, the configurations are symbolized by colored bullets while the transition rates are represented by arrows, their length indicating the magnitude. In this example the rates are cyclically biased in one direction, leading to a clockwise-oriented probability current.

If the jump rates in the respective direction are equal, as indicated by the arrows in the figure, this system will relax towards a stationary state with P^stat_{s1} = P^stat_{s2} = P^stat_{s3} = 1/3. Nevertheless the system keeps on jumping preferentially in the clockwise direction, even though the probability distribution is stationary. In the literature this is known as a nonequilibrium steady state (NESS).

Remember: Stationarity vs. equilibrium:Equilibrium systems are always stationary, but conversely stationary systems are not alwaysin equilibrium. The condition of equilibrium is stronger because it requires in addition thatthe probability currents between all pairs of configurations vanish. This happens if and onlyif the rates satisfy the condition of detailed balance.
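This can be verified directly by simulation. The following C++ sketch (an illustrative choice of rates, not from the notes: clockwise rate 2, counterclockwise rate 1) samples a long trajectory of the three-state clock model; the occupation probabilities converge to 1/3 each, while the net clockwise current remains finite:

//---- NESS of the three-state clock model ----
#include <cstdio>
#include <cstdlib>
#include <cmath>

double rnd() { return rand() / (RAND_MAX + 1.0); }   // flat in [0,1)

int main()
{
   const double wcw = 2, wccw = 1;       // clockwise / counterclockwise rates
   const double wesc = wcw + wccw;       // escape rate (same for all states)
   double tocc[3] = {0, 0, 0}, ttot = 0; // occupation times
   long jcw = 0, jccw = 0;               // jump counters
   int s = 0;

   for (long n = 0; n < 10000000; n++) {
      double dt = -log(1.0 - rnd()) / wesc;    // exponential waiting time
      tocc[s] += dt;  ttot += dt;
      if (rnd() < wcw / wesc) { s = (s + 1) % 3; jcw++; }   // clockwise jump
      else                    { s = (s + 2) % 3; jccw++; }  // counterclockwise
   }
   printf("P_stat = %g %g %g\n", tocc[0]/ttot, tocc[1]/ttot, tocc[2]/ttot);
   printf("net clockwise jumps per unit time: %g\n", (jcw - jccw) / ttot);
   return 0;
}

The stationary distribution is uniform, yet the net current of about one clockwise jump per unit time does not vanish, which is the hallmark of a NESS.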

If the system keeps on jumping preferentially in the clockwise direction, what does the corresponding dynamics in the environment look like? First of all, we expect each of the three configurations to correspond to a sector of configurations Ω^tot_s ⊆ Ωtot. Does this mean that in the case of the clock model there are only three such sectors?

In order to answer this question on an intuitive basis, recall that the transition rates of an isolated system have to be symmetric. Therefore, asymmetric rates in the laboratory system are only possible between sectors of different sizes. In other words, the system jumps preferentially from s to s′ if the target sector is larger, i.e., |Ω^tot_{s′}| > |Ω^tot_s|.

This is an example of a so-called entropic force which drags the system into the direction of increasing entropy. Obviously, this inequality cannot be satisfied with only three sectors; rather, the sector size has to increase whenever the system jumps in the clockwise direction. For this reason this toy model will be termed clock model in the following.


Figure 3.2.: Clock model: Unfolding of the closed cycle of transitions into a linear sequence which keepstrack of the winding number (see text).

Remark: Entropic force:
Imagine that a fly is disturbing you while you are doing the exercises. Usually one would get rid of this problem by opening the window. If there are no other flies outside, it is clear that the fly inside will eventually get out of the window and never return. In some sense it seems as if there were a 'force' dragging the fly out of the room. Of course, this is not a mechanical force in the usual sense; it is rather a purely probabilistic effect: the fly itself performs an undirected random walk, but as there is so much space outside (a large entropy), it is highly unlikely that the fly, once outside, will return. This apparent drag is a simple example of a so-called entropic force.

If the clock model completes one cycle s1 → s2 → s3 → s1, it ends up in the same configuration from where it started but, from the perspective of the total system, in a different sector of the environment. In other words, although the system returns to its original starting point, the environment has changed significantly. This is exactly what all kinds of heat engines do: They run in a closed cycle, but at the same time they interact with the outside world, consuming energy and producing entropy. In the following let us study this phenomenon of entropy production in more detail.

3.2. Entropy production

As the environment seems to keep track of the winding number, it is meaningful to unfold the closed cycle in our three-state clock model into a linear chain of transitions, as sketched in Fig. 3.2. In this linear chain each bullet is associated with a well-defined sector, which increases in size as we go to the right. Nevertheless keep in mind that we still have to identify every third bullet, indicated by the same color.

What does the environment look like? The simple answer is that we don't know. We made the assumption that it can be described as a stochastic Markov process as well, but neither the structure of the configuration space nor the network of transition rates wc→c′ is actually accessible to our observation. Then, if we don't know anything about the physics of the environment, how can we quantify the entropy production?

Before addressing this question, let us again study the three-state clock model in a simplified situation which is shown schematically in Fig. 3.3. Here we simply assume


Figure 3.3.: Linearized chain of transitions together with a fictitious structure of Ωtot.

that the number of states of the total system doubles whenever we jump in the clockwise direction. For simplicity, we also assume that jumps within the same sector (i.e., between the green bullets) are forbidden, while neighboring sectors are fully connected by microscopic transitions, as indicated by thin black lines in the figure. By definition, the corresponding rates have to be symmetric in both directions. Moreover, as the strongest assumption, let us assume that all these transition rates in the total system are identical:

wc→c′ = 1 if ws→s′ ≠ 0 ,   wc→c′ = 0 if ws→s′ = 0 .   (3.24)

With these assumptions, it is clear that whenever the laboratory system hops to the right or to the left, each of the corresponding configurations of the total system is by definition equally probable. In other words, each sector equilibrates instantaneously. Therefore, the actual entropy of the respective sector is just the logarithm of the number of configurations in the corresponding sector.

More specifically, let us assume that the laboratory system is in one particular configuration, let's say the third from the left (the blue bullet). Then, according to Fig. 3.3, the total system is in one of four possible configurations with equal probability. Suppose that the system wants to jump to the right, performing a transition s → s′. In the total system, there are eight possibilities to realize this transition. If all the corresponding transition rates wc→c′ are equal to 1, we have

Js→s′ = ∑_{c(s)} ∑_{c′(s′)} P^stat_c wc→c′ = 4 · 8 · (1/4) P^stat_s = 8 P^stat_s ,   (3.25)

so that ws→s′ = Js→s′/P^stat_s = 8. On the other hand, the current in the opposite direction is given by

Js′→s = ∑_{c′(s′)} ∑_{c(s)} P^stat_{c′} wc′→c = 8 · 4 · (1/8) P^stat_{s′} = 4 P^stat_{s′} ,   (3.26)

hence ws′→s = Js′→s/P^stat_{s′} = 4. Therefore, the rate of jumping to the right is twice as large as the rate of jumping to the left, simply because the corresponding configuration space of the total system doubles its size. Generally, we can conclude that in the case of


a fully connected transition network in the total system (as the one shown in the figure), the rates in the laboratory system obey the constraint

Ps(t) ws→s′(t) / [ Ps′(t) ws′→s(t) ] = |Ω^tot_{s′}| / |Ω^tot_s| .   (3.27)

Since all configurations in the corresponding sector are equally probable, the entropy of the total system is simply given by Htot(s) = ln |Ω^tot_s|. Therefore, we can conclude that the increase of the entropy in the total system during the jump of the laboratory system from s to s′ is given by

∆H^tot_{s→s′} = ln [ Js→s′(t) / Js′→s(t) ] = ln [ Ps(t) ws→s′(t) / ( Ps′(t) ws′→s(t) ) ] .   (3.28)

Obviously, this expression can be split into two parts, namely, the entropy change of the system

∆H^sys_{s→s′} = ln [ Ps(t) / Ps′(t) ] ,   (3.29)

which is compatible with the definition H^sys_s = −ln Ps, and the entropy production in the environment

∆H^env_{s→s′} = ln [ ws→s′(t) / ws′→s(t) ] .   (3.30)

However, we have to keep in mind that these formulas were derived under very special assumptions, namely, a fully connected transition network with equal transition rates between the respective sectors.

The Schnakenberg formula

Eq. (3.30) is known as the celebrated Schnakenberg formula because it was derived for the first time in 1976 by Jürgen Schnakenberg [12]. Today the Schnakenberg formula is accepted as a general expression which quantifies the entropy production in the environment, irrespective of the specific structure of the environment. This means that it is generally assumed that the entropy production, no matter whether the environment is equilibrated or not, is given by Eq. (3.30). Most researchers use this formula as it is, without paying attention to its range of validity.


It is therefore interesting to investigate how this formula was actually derived. Schnakenberg noticed that the master equation has exactly the same structure as the law of mass action for chemical reactions. This analogy can be established by associating with each configuration s a chemical substance Xs:

s ⇐⇒ Xs

These fictitious chemical substances are then mixed in a reactor where they react with one another. The chemical reaction rates kss′ are in turn chosen according to the transition rates of the master equation. Finally, the entropy production of the chemical reaction is studied on the basis of conventional thermodynamics.

Let us briefly sketch this derivation in more detail. In physical chemistry, an important quantity is the so-called extent of reaction ξss′, defined as the accumulated number of forward reactions Xs → Xs′ minus the number of backward reactions Xs′ → Xs. This quantity evolves in time according to

dξss′/dt = Ns kss′ − Ns′ ks′s ,   (3.31)

where Ns is the number of molecules of type Xs.

In conventional thermodynamics, each thermodynamic flux is associated with a conjugate thermodynamic force, defined as the partial derivative of the free energy with respect to the corresponding extensive quantity. In the present case, the temporal change of the extent of reaction defines the thermodynamic flux, and the conjugate thermodynamic force is the so-called chemical affinity

Ass′ = ( ∂F/∂ξss′ )|_{V,T} .   (3.32)

This implies that the chemical reaction changes the free energy according to

Ḟ = ∑_{ss′} Ass′ ξ̇ss′ = ∑_{ss′} Ass′ ( Ns kss′ − Ns′ ks′s ) .   (3.33)

This has to be compared with the chain rule

Ḟ = ∑_s ( ∂F/∂Ns ) Ṅs = ∑_s µs Ṅs ,   (3.34)

where µs denotes the chemical potential of the substance Xs. As a result, we obtain that the affinities are basically given by the chemical potential difference

Ass′ = µs′ − µs .   (3.35)

However, the chemical potentials may change as the reaction proceeds, i.e., they depend on the actual molecule numbers Ns. To take this additional change into account, we note that the free energy is usually given by an expression of the form

F = ∑_s ( consts · Ns + kB T Ns ln Ns ) .   (3.36)


For example, the free energy of an ideal gas can be written in this form. This implies that the chemical potentials can be written as

µs = µ^(0)_s + kB T ln Ns .   (3.37)

Inserting this result, the affinities are given by

Ass′ = µ^(0)_{s′} − µ^(0)_s + kB T ln ( Ns′ / Ns ) .   (3.38)

In order to access the difference of the two constants on the right-hand side, let us consider the stationary equilibrium state. Here the extent of reaction, and therewith the affinity, is zero, i.e., A^eq_{ss′} = 0, hence

µ^(0)_{s′} − µ^(0)_s = kB T ln ( N^eq_s / N^eq_{s′} ) = kB T ln ( ks′s / kss′ ) ,   (3.39)

where we used the balance condition N^eq_s kss′ = N^eq_{s′} ks′s in the last step. This leads us to

Ass′ = kB T ln [ Ns′ ks′s / ( Ns kss′ ) ] .   (3.40)

Putting everything together, we arrive at the free energy change

Ḟ = kB T ∑_{ss′} ξ̇ss′ ln [ Ns′ ks′s / ( Ns kss′ ) ] .   (3.41)

Introducing the concentrations [Xs] := Ns/N and recalling that the free energy is nothing but a trick to quantify entropy changes of the total system, consisting of laboratory system and environment, we get

dHtot/dt = −βḞ = ∑_{ss′} ξ̇ss′ ln [ [Xs] kss′ / ( [Xs′] ks′s ) ] .   (3.42)

We now identify the chemical problem with the original master equation by setting

ξ̇ss′ = Ps(t) ws→s′(t) ,   [Xs] = Ps(t) ,   kss′ = ws→s′(t) .   (3.43)

The courageous corollary of Schnakenberg is to identify the chemical entropy production with the total entropy production of the master equation:

dHtot/dt = ∑_{ss′} Ps(t) ws→s′(t) ln [ Ps(t) ws→s′(t) / ( Ps′(t) ws′→s(t) ) ] .   (3.44)

This can be split into two parts,

dHtot/dt = ∑_{ss′} Ps(t) ws→s′(t) ln [ Ps(t) / Ps′(t) ] + ∑_{ss′} Ps(t) ws→s′(t) ln [ ws→s′(t) / ws′→s(t) ] ,   (3.45)

where the first term can be identified with the entropy change of the system, dHsys/dt, and the second with that of the environment, dHenv/dt.


The environmental part derived by Schnakenberg can be written as

dHenv/dt = ∑_{ss′} Js→s′(t) ln [ ws→s′(t) / ws′→s(t) ] .   (3.46)

Obviously, in this expression the logarithm is averaged over the current flowing from configuration s to configuration s′. In 2005 this led Seifert and Speck to the conjecture that the entropy in the environment increases discontinuously whenever the system jumps to a different configuration:

∆H^env_{s→s′} = ln [ ws→s′(t) / ws′→s(t) ] .   (3.47)

Today this formula is commonly accepted as a valid expression for the entropy production in the surrounding medium. It is remarkable that this expression does not depend in any way on the physical properties of the environment. Does the environment really generate only that little entropy for the microscopic transition from s to s′? Could it be that the real entropy production in an inefficient setup is actually higher?

This is an important question which, to my knowledge, has not been addressed so far. In fact, both derivations given above, the one with the fully connected transition network and Schnakenberg's chemical variant, implicitly assume that the environment equilibrates almost instantaneously, on a much faster timescale than the intrinsic dynamics of the system. If it does not equilibrate directly after each transition, the actual entropy production may be higher. Probably Schnakenberg's formula provides only a lower bound.
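For the biased three-state clock model considered above (again with the illustrative choice of clockwise rates 2 and counterclockwise rates 1), the Schnakenberg rate can be evaluated in a few lines of C++. In the stationary state the system part of Eq. (3.45) vanishes and the entire entropy production takes place in the environment:

//---- Schnakenberg entropy production rate for the clock model ----
#include <cstdio>
#include <cmath>

int main()
{
   const int M = 3;
   double w[M][M] = {{0, 2, 1},          // clockwise rate 2,
                     {1, 0, 2},          // counterclockwise rate 1
                     {2, 1, 0}};
   double P[M] = {1.0/3, 1.0/3, 1.0/3};  // stationary distribution

   double dHsys = 0, dHenv = 0;          // the two parts of Eq. (3.45)
   for (int s = 0; s < M; s++)
      for (int s2 = 0; s2 < M; s2++) if (s2 != s && w[s][s2] > 0) {
         dHsys += P[s] * w[s][s2] * log(P[s] / P[s2]);
         dHenv += P[s] * w[s][s2] * log(w[s][s2] / w[s2][s]);
      }
   printf("dHsys/dt = %g\n", dHsys);     // 0 in the stationary state
   printf("dHenv/dt = %g\n", dHenv);     // = ln 2 = 0.693...
   return 0;
}

The program prints dHenv/dt = ln 2: per unit time the biased rotation exports ln 2 units of entropy to the environment, even though the probability distribution itself no longer changes.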

3.3. Fluctuation theorem

The discovery of fluctuation theorems is a very recent development. As we will see, the standard version of the fluctuation theorem includes the second law of thermodynamics, but being an equation rather than an inequality it is much stronger.

The second law states that the entropy of an isolated system increases on average. However, this does not mean that the entropy increases monotonically; it actually fluctuates in the positive and negative direction, but in such a way that the average is positive. This is shown schematically in Fig. 3.4. Accumulating infinitely many fluctuations in the histogram, one obtains a quasi-continuous distribution, as shown in the right panel of the figure. The second law of thermodynamics tells us that the average (the first moment) is non-negative, but it does not tell us anything about the shape of the curve. As we will see below, fluctuation theorems restrict the form of the curve by an equation rather than an inequality.

Formal definition: Strong, weak and integral fluctuation theorems

Consider a real-valued random variable X and the corresponding probability density P(X). A so-called strong fluctuation theorem is a relation of the following form:

P(X) = e^X P(−X) .   (3.48)


Figure 3.4.: Schematic drawing of how the entropy of an isolated system evolves in time. As shown in the middle, each transition contributes a little δ-peak, which may be positive or negative. Accumulating all peaks in the long-time limit, one obtains a quasi-continuous distribution, as shown in the right panel. The second law of thermodynamics states that the first moment of this distribution is non-negative.

This equation tells us that the left part of the distribution (for negative X) is exponentially suppressed. The relation restricts the form of the function P(X) in such a way that it relates the left and right halves of the distribution. In other words, if the function is known for positive X, it is automatically determined for all X < 0.

The strong fluctuation theorem has to be distinguished from the so-called weak fluctuation theorem, which relates two different probability densities P(X) and P†(X) defined on the same random variable X:

P(X) = e^X P†(−X) .   (3.49)

Obviously, the weak fluctuation theorem includes the strong one as a special case.

It can be shown easily (left as an exercise to the reader; see the sketch below) that the weak fluctuation theorem (and therefore also the strong one) implies the equality

〈e^{−X}〉 = 1   (3.50)

or, in full form,

∫_a^b P(X) e^{−X} dX = 1 ,   (3.51)

where [a, b] is the definition range of the distribution. Since this equation involves an integration, it is also known as the integral fluctuation theorem. Using Jensen's inequality for convex functions, 〈e^{−X}〉 ≥ e^{−〈X〉}, the integral fluctuation theorem implies that the first moment of the distribution is non-negative:

〈X〉 ≥ 0 .   (3.52)

This is the "second law" associated with the random variable X. As we will see below, the conventional second law of thermodynamics is obtained by setting X = ∆Htot.

Remember: There are three types of fluctuation theorems (FTs):

• Strong FTs relate the right and left halves of the same probability density P.
• Weak FTs relate two different probability densities P and P†.
• Integral FTs tell us that the expectation value 〈e^{−X}〉 = 1.
• Second Law: Integral FTs imply the inequality 〈X〉 ≥ 0 for the first moment.


The excitement about the fluctuation theorem stems from the fact that it provides an equation rather than an inequality and that it replaces the well-known second law after more than 150 years (the first to formulate the second law was Rudolf Clausius in 1854, who taught from 1867 to 1869 at the University of Würzburg).

Derivation of the strong fluctuation theorem for Htot

Let us now derive the fluctuation theorem for the total entropy of a stationary system. To this end we consider a stochastic path

Γ : c0 →[t1] c1 →[t2] c2 → … → c_{N−1} →[tN] cN ,   (3.53)

starting at time 0 and ending at time τ. Along this path the system jumps to the configuration cj at time tj. What is the probability to find this particular stochastic path? In order to compute this probability, we first have to know the probability to find the system in the initial configuration c0, i.e., the resulting expression will be proportional to P_{c0}(0). Furthermore, the probability to find this path will be proportional to the product of the rates corresponding to the transitions along the path. However, between the transitions the system stays in a certain configuration for some while, which also introduces a statistical weight.

More specifically: Weight of "doing nothing":
Suppose that the system is currently in the configuration ci. During an infinitesimal time span ∆t ≪ 1 the probability of a transition is proportional to the sum of all outgoing rates, i.e., ∆P = w^esc_{ci} ∆t, where w^esc_{ci} = ∑_{c′′} w_{ci→c′′} is the escape rate (see also Eq. (3.4)). The infinitesimal probability of "doing nothing" is therefore 1 − ∆P. Waiting for a longer period in configuration ci, let's say for a time N∆t, the corresponding probability of "doing nothing" is simply the product (1 − ∆P)^N over all small time intervals in between. Since the exponential function can be represented as the limit e^x = lim_{N→∞}(1 + x/N)^N, it follows immediately that the probability of doing nothing between two transitions is given by

P_{doing nothing} = exp( −∫_{ti}^{ti+1} dt w^esc_{ci}(t) ) .   (3.54)

Putting everything together, the probability to find the stochastic path Γ is given by

PΓ = P_{c0}(0) · exp( −∑_{i=0}^{N} ∫_{ti}^{ti+1} dt w^esc_{ci}(t) ) · ∏_{i=1}^{N} w_{ci−1→ci}(ti) ,   (3.55)

where the exponential factor accounts for "doing nothing" and the product for the transitions, where w^esc_c(t) = ∑_{c′} wc→c′(t) is the escape rate, and where we formally defined t0 = 0 and tN+1 = τ.

We now consider the reverse path

Γ† : cN →[τ−tN] cN−1 →[τ−tN−1] cN−2 → … → c1 →[τ−t1] c0 .   (3.56)

This path runs in exactly the opposite direction. Note that time still runs from 0 to τ and that the sequence of events is just reflected within this time interval. The


corresponding statistical weight is given by

PΓ† = P_{cN}(0) · exp( −∑_{i=0}^{N} ∫_{τ−t_{i+1}}^{τ−t_i} dt w^esc_{ci}(t) ) · ∏_{i=1}^{N} w_{ci→ci−1}(τ − ti) .   (3.57)

Substituting t → τ − t in the integral this turns into

PΓ† = P_{cN}(0) · exp( −∑_{i=0}^{N} ∫_{ti}^{ti+1} dt w^esc_{ci}(τ − t) ) · ∏_{i=1}^{N} w_{ci→ci−1}(τ − ti) .   (3.58)

The common goal of all types of fluctuation theorems is to make some sense out of the logarithmic quotient ln(PΓ/PΓ†) of the forward and the backward probability. Usually this is possible only if the terms for "doing nothing" cancel out. In the present case this requires that the rates are the same in both expressions, meaning that the rates are constant.

Moreover, computing the logarithmic quotient one has to find a suitable interpretation of ln( P_{c0}(0) / P_{cN}(0) ). This contribution is reminiscent of the change of the system entropy, ln( P_{c0}(0) / P_{cN}(τ) ), the only difference being the evaluation time of the denominator. However, if the system is stationary, this interpretation is indeed possible.

Therefore, in the following we shall make the assumption that the system

• has time-independent rates

• is in the stationary state

Obviously, this includes equilibrium states as well as so-called non-equilibrium steady states (NESS), as discussed before.

Under these assumptions the contributions for "doing nothing" are exactly identical in both expressions, and the quotient of the two probabilities is given by

PΓ / PΓ† = ( p_{c0} / p_{cN} ) ∏_{i=1}^{N} w_{ci−1→ci} / w_{ci→ci−1} .   (3.59)

Now we can easily recognize that the logarithmic quotient just gives the total entropy production:

∆H^tot_Γ = ln ( PΓ / PΓ† ) = ln [ ( p_{c0} / p_{cN} ) ∏_{i=1}^{N} w_{ci−1→ci} / w_{ci→ci−1} ]   (3.60)
         = ln ( p_{c0} / p_{cN} ) + ∑_{i=1}^{N} ln ( w_{ci−1→ci} / w_{ci→ci−1} ) = ∆Hsys + ∆Henv .

This formula allows us to compute the entropy changes for a given path Γ. Moreover, we know the probability density to find such a path. Combining these two pieces of information, we can compute the probability of observing a certain increase or decrease of the total entropy. To this end, one has to integrate over all possible paths,


i.e., one has to perform a path integral ∫DΓ …. Although the path integral formalism is beyond the scope of these lecture notes, let us assume that it basically works like an ordinary integration. This allows us to express the probability of finding a certain entropy change as

P(∆Htot) = ∫ DΓ PΓ δ( ∆Htot − ∆H^tot_Γ )   (3.61)
         = ∫ DΓ PΓ δ( ∆Htot − ln( PΓ / PΓ† ) ) .

Since the integration can be understood as a sum over all possible paths, including the reversed ones, we can simply replace Γ by Γ† in the integrand:

P(∆Htot) = ∫ DΓ PΓ† δ( ∆Htot − ln( PΓ† / PΓ ) ) = ∫ DΓ PΓ† δ( ∆Htot + ∆H^tot_Γ ) .   (3.62)

Remark: This is basically the same as replacing ∫_{−∞}^{+∞} f(x) dx = ∫_{−∞}^{+∞} f(−x) dx on the real line.

Finally, we use the relation ∆H^tot_Γ = ln( PΓ / PΓ† ) in order to express the backward probability PΓ† in terms of the forward probability PΓ:

P(∆Htot) = ∫ DΓ e^{−∆H^tot_Γ} PΓ δ( ∆Htot + ∆H^tot_Γ ) = e^{∆Htot} P(−∆Htot) .

These few lines complete the proof of the strong fluctuation theorem for the total entropy.

In thermal equilibrium the entropies of the system and the environment still fluctuate, but the condition of detailed balance ensures that the entropy changes are opposite to each other (∆Hsys = −∆Henv), so that ∆Htot = 0. Therefore the corresponding probability density of an equilibrium system is just a δ-peak, P(∆Htot) = δ(∆Htot). This solution formally obeys the FT, fulfilling the second law 〈∆Htot〉 = 0 sharply.

Summary: Strong FT for the total entropy:
If the system is stationary (rates and probabilities time-independent), the total entropy production obeys a strong fluctuation theorem

P(∆Htot) = e^{∆Htot} P(−∆Htot) ,

implying the IFT 〈e^{−∆Htot}〉 = 1 and the Second Law 〈∆Htot〉 ≥ 0. This strong FT holds for any stationary nonequilibrium system. In the special case of thermal equilibrium we have P(∆Htot) = δ(∆Htot).

3.4. Heat and work

In conventional thermodynamics one distinguishes two fundamentally different kinds of energy transfer between subsystems. The first one is heat, denoted as Q. According to Clausius, heat flows from the warm to the cold reservoir, changing their entropy by dH = β dQ. Heat is some kind of disordered energy transfer. Although heat carries the


Figure 3.5.: Example of a system with only four configurations and randomly chosen constant transition rates in the stationary state, observed over a time span ∆t = 5 in units of Monte Carlo updates. As can be seen, the probability density is not continuous but rather composed of a large number of δ-peaks, each corresponding to a different stochastic path. To test the fluctuation theorem, P(∆Htot) is plotted as a red histogram and e^{∆Htot} P(−∆Htot) as black crosses. Since the crosses are located on top of the peaks (except for small sampling errors at high values), this simulation confirms the strong FT for the total entropy production.

unit of energy, it is only defined in a thermodynamical context; you cannot find the notion of heat in any textbook on classical mechanics. The usefulness of heat is limited. Of course, heat can be used for heating, but in order to convert heat into a usable form of energy, a heat engine in combination with a cold reservoir is needed. As everyone knows, the efficiency of such a machine is limited by Carnot's formula.

The other fundamental form of energy transfer is work, denoted by W. Work may be thought of as a usable, directed form of energy, such as the mechanical motion of a piston, a torque generated by a motor, or a flowing electric current. By its directed nature, work can be used to full extent in a secondary process, i.e., the maximal efficiency is always 100%.

Usually heat and work are used in the context of energy transfer, i.e., one is interested in differences ∆Q and ∆W rather than absolute values. There is much confusion about the sign of heat and energy changes. However, in recent years it has become some kind of standard to define the sign of heat and work always with respect to the system itself. More specifically, heat flowing into the system is positive while heat flowing out of the system is negative. Likewise, work done on the system is positive while work performed by the system on the environment is considered to be negative.

Distinguishing heat and work in stochastic Markov processes

In the context of stochastic Markov processes – the cartoon of a system jumping between different configurations – there is nothing like heat and work. In fact, there isn't


even the notion of energy. Therefore, heat and work seem to emerge only in the presence of energy exchange, e.g., in the canonical equilibrium ensemble. Thus, in order to study heat and work, let us first consider the canonical case.

In the canonical ensemble each configuration carries the stationary Boltzmann weight p^stat_c ∝ e^{−βEc}, where Ec is the energy associated with the configuration of the system. Therefore, whenever the system jumps from c → c′, it undergoes an energy change ∆E = β⁻¹ ln( p^stat_c / p^stat_{c′} ). It should be emphasized that the corresponding probabilities have to be the stationary ones; if the system still relaxes towards the stationary state, the actual probabilities may be different. Since energy is conserved, ∆E has to come from somewhere, and clearly it comes from the environment. Since the system continuously jumps back and forth in its own configuration space, energy continues to fluctuate between the system and the reservoir in a random-walk-like manner. If the system is not yet in equilibrium, it may happen that there is a net flow from the system to the reservoir or vice versa, and this average flow of stochastic energy transfer is called heat.


4. Nonequilibrium phase transitions

In this chapter we are primarily interested in phase transitions in systems far from equilibrium. To this end we consider stochastic processes that violate detailed balance so strongly that concepts of equilibrium statistical physics can no longer be applied, even in an approximate sense. We are interested in the question whether stochastic processes far from equilibrium can exhibit new phenomena that cannot be observed under equilibrium conditions. This applies in particular to phase transitions far from equilibrium.

4.1. Directed percolation

Probably the most important class of non-equilibrium processes displaying a non-trivial phase transition from a fluctuating phase into an absorbing state is Directed Percolation (DP). As will be discussed below, the DP class comprises a large variety of models that share certain basic properties.

4.1.1. Directed bond percolation on a lattice

To start with, let us first consider a specific model called directed bond percolation, which is often used as a simple model for water percolating through a porous medium. The model is defined on a tilted square lattice whose sites represent the pores of the medium. The pores are connected by small channels (bonds) which are open with probability p and closed otherwise. As shown in Fig. 4.1, water injected into one of the pores will percolate along open channels, giving rise to a percolation cluster of wetted pores whose average size will depend on p.

There are two fundamentally different versions of percolation models. In isotropic percolation the flow is undirected, i.e., the spreading agent (water) can flow in any direction through open bonds (left panel of Fig. 4.1). A comprehensive introduction to isotropic percolation is given in the textbook by Stauffer [17]. In the present lecture, however, we are primarily interested in the case of directed percolation. Here the clusters are directed, i.e., the water is restricted to flow along a preferred direction in space, as indicated by the arrow in Fig. 4.1. In the context of porous media this preferred direction may be interpreted as a gravitational driving force. Using the language of electronic circuits, DP may be realized as a random diode network (cf. Sect. 4.1.9).

The strict order of cause and effect in DP allows one to interpret the preferred direction as a temporal coordinate. For example, in directed bond percolation, we may enumerate horizontal rows of sites by an integer time index t (see Fig. 4.2). Instead of a


Figure 4.1.: Isotropic versus directed bond percolation. The figure shows two identical realizations of open and closed bonds on a finite part of a tilted square lattice. A spreading agent (red) is injected at the central site (blue circle). In the case of isotropic percolation (left) the agent percolates through open bonds in any direction. Contrarily, in the case of directed percolation, the agent is restricted to percolate along a preferred direction, as indicated by the arrow.


Figure 4.2.: Directed bond percolation. The process shown here starts with a single active seed at the origin. It then evolves through a sequence of configurations along horizontal lines (called states) which can be labeled by a time-like index t. An important quantity to study would be the total number of active sites N(t) at time t.

static model of directed connectivity, we shall from now on interpret DP as a dynamical process which evolves in time. Denoting wetted sites as active and dry sites as inactive, the process starts with a certain initial configuration of active sites that can be chosen freely. For example, in Fig. 4.2 the process starts with a single active seed at the origin. As t increases the process evolves stochastically through a sequence of configurations of active sites, also called states at time t. An important quantity characterizing these intermediate states is the total number of active sites N(t), as illustrated in Fig. 4.2.

Regarding directed percolation as a reaction-diffusion process, the local transition rules may be interpreted as follows. Each active site represents a particle A. If the two subsequent bonds are both closed, the particle will have disappeared at the next time step by a death process A → ∅ (see Fig. 4.3a). If only one of the bonds is open, the particle diffuses stochastically to the left or to the right, as shown in Figs. 4.3b-4.3c. Finally, if both bonds are open, the particle creates an offspring A → 2A (Fig. 4.3d). However, it is important to note that each site in directed bond percolation can be either active or inactive. In the particle language this means that each site can be occupied


Figure 4.3.: Interpretation of the dynamical rules of directed bond percolation as a reaction-diffusion process: a) death process, b)-c) diffusion, d) offspring production, and e) coagulation.

by at most one particle. Consequently, if two particles happen to reach the same site, they merge irreversibly, forming a single one by coagulation 2A → A, as illustrated in Fig. 4.3e. Summarizing these reactions, directed bond percolation can be interpreted as a reaction-diffusion process which effectively follows the reaction scheme

A → ∅     death process
A → 2A    offspring production   (4.1)
2A → A    coagulation

combined with single-particle diffusion.

The dynamical interpretation in terms of particles is of course the natural language for any algorithmic implementation of DP on a computer. As the configuration at time t depends exclusively on the previous configuration at time t − 1, it is not necessary to store the entire cluster in memory; instead it suffices to keep track of the actual configuration of active sites at a given time and to update this configuration in parallel sweeps according to certain probabilistic rules. In the case of directed bond percolation, one obtains a stochastic cellular automaton in which each active site of the previous configuration activates its nearest neighbors of the actual configuration with probability p. In fact, as shown in Fig. 4.4, a simple non-optimized C-code for directed bond percolation takes less than a page, and the core of the update rules takes only a few lines.

4.1.2. Absorbing states and critical behavior

As only active sites at time t can activate sites at time t + 1, the configuration without active sites plays a special role. Obviously, such a state can be reached by the dynamics but it cannot be left. In the literature such states are referred to as absorbing. Absorbing states can be thought of as a trap: Once the system reaches the absorbing state it becomes trapped and will stay there forever. As we will see below, a key feature of directed percolation is the presence of a single absorbing state, usually represented by the empty lattice.

The mere existence of an absorbing state demonstrates that DP is a dynamical process far from thermal equilibrium. As explained in the Introduction, equilibrium statistical mechanics deals with stationary equilibrium ensembles that can be generated by a dynamics obeying detailed balance, meaning that probability currents between pairs of sites cancel each other. As the absorbing state can only be reached but not left, there is always a non-zero current of probability into the absorbing state that violates detailed balance. Consequently the temporal evolution before reaching the absorbing


//============== Simple C-code for directed percolation =============

#include <fstream>
#include <cstdlib>
using namespace std;

const int T=1000;          // number of rows
const int R=10000;         // number of runs
const double p=0.7;        // percolation probability
int N[T];                  // cumulative occupation number

//--- random number generator returning doubles between 0 and 1 -----
inline double rnd(void) { return (double)rand()/0x7FFFFFFF; }

//------------ construct directed percolation cluster ---------------
void DP (void)
{
   int t,i;
   static int s[T][T];                                // static array
   for (t=0;t<T;t++) for (i=0;i<=t;i++) s[t][i]=0;    // clear lattice
   s[0][0]=1;                                         // place seed
   // perform loop over all active sites:
   for (t=0; t<T-1; t++) for (i=0; i<=t; i++) if (s[t][i])
   {
      N[t]++;
      if (rnd()<p) s[t+1][i]=1;                       // offspring right
      if (rnd()<p) s[t+1][i+1]=1;                     // offspring left
   }
}

//-------- perform R runs and write average N(t) to disk -----------
int main (void)
{
   for (int t=0; t<T; t++) N[t]=0;                    // reset N(t)
   for (int r=0; r<R; r++) DP();                      // perform R runs
   ofstream os("N.dat");                              // write result to disk
   for (int t=0; t<T-1; t++) os << t << "\t" << (double)N[t]/R << endl;
   return 0;
}

Figure 4.4.: Simple non-optimized program in C for directed bond percolation. The numerical results are shown below in Fig. 4.6.


Figure 4.5.: Typical DP clusters in 1+1 dimensions grown from a single seed below, at, and above criticality.

state cannot be described in terms of thermodynamic ensembles, proving that DP is anon-equilibrium process.

The enormous theoretical interest in DP – more than 800 articles refer to this class of models – is related to the fact that DP displays a non-equilibrium phase transition from a fluctuating phase into the absorbing state, controlled by the percolation probability p. The existence of such a transition is quite plausible since offspring production and particle death compete with each other. As will be discussed below, phase transitions into absorbing states can be characterized by certain universal properties which are independent of the specific details of the microscopic dynamics. In fact, the term 'directed percolation' does not stand for a particular model; rather, it denotes a whole universality class of models which display the same type of critical behavior at the phase transition. The situation is similar to equilibrium statistical mechanics, where for example the Ising universality class comprises a large variety of different models. In fact, DP is probably as fundamental in non-equilibrium statistical physics as the Ising model is in equilibrium statistical mechanics.

In DP the phase transition takes place at a certain well-defined critical percolation probability pc. As illustrated in Fig. 4.5, the behavior on both sides of pc is very different. In the subcritical regime p < pc any cluster generated from a single seed has a finite lifetime and thus a finite mass. Contrarily, in the supercritical regime p > pc there is a finite probability that the generated cluster extends to infinity, spreading roughly within a cone whose opening angle depends on p − pc. Finally, at criticality finite clusters of all sizes are generated. These clusters are sparse and are reminiscent of self-similar fractal objects. As we will see, a hallmark of such scale-free behavior is a power-law dependence of various quantities.

The precise value of the percolation threshold pc is non-universal (i.e., it depends on the specific model) and can only be determined numerically. For example, in the case of directed bond percolation in 1+1 dimensions, the best estimate is pc = 0.644700185(5) [18]. Unlike isotropic bond percolation in two dimensions, where the critical value is exactly given by p^iso_c = 1/2, an analytical expression for the critical threshold of DP in finite dimensions is not yet known. This seems to be related to the fact that DP is a non-integrable process. It is in fact amazing that so far, in spite of its simplicity and the enormous effort of many scientists, DP has resisted all attempts to be solved exactly, even


Figure 4.6.: Directed bond percolation: number of particles N(t) as a function of time for different values of the percolation probability p. The critical point is characterized by an asymptotic power-law behavior. The inset demonstrates the crossover to an exponential decay in the subcritical phase for p = 0.6.

in 1+1 dimensions.

In order to describe the phase transition quantitatively, an appropriate order parameter is needed. For simulations starting from a single active seed a suitable order parameter is the average number of particles 〈N(t)〉 at time t, where 〈…〉 denotes the average over many independent realizations of randomness (called runs in the numerical jargon). For example, the program shown in Fig. 4.4 averages this quantity over 10,000 runs and stores the result in a file that can be viewed with a graphical tool such as xmgrace. As shown in Fig. 4.6, there are three different cases:

• For p < pc the average number of active sites first increases and then decreases rapidly. As demonstrated in the inset, this decrease is in fact exponential. Obviously, the typical crossover time, where the exponential decay sets in, depends on the distance from criticality pc − p.

• At criticality the average number of active sites increases according to a power law 〈N(t)〉 ∼ t^θ. A standard regression of the data gives the exponent θ ≈ 0.302, shown in the figure as a thin dashed line. As can be seen, there are deviations for small t, i.e., the process approaches the power law only asymptotically.

• In the supercritical regime p > pc the slow increase of 〈N(t)〉 crosses over to a fast linear increase with time. Again the crossover time depends on the distance from criticality p − pc.

The properties of 〈N(t)〉 above and below criticality can be used to determine the critical threshold numerically. This iterative procedure works as follows: Starting with a moderate simulation time it is easy to specify a lower and an upper bound for pc by

Haye Hinrichsen — Complex Systems

Page 77: Complex system

4.1 Directed percolation 71

hand, e.g., 0.64 < pc < 0.65 in the case of directed bond percolation. This intervalis then divided into two equal parts and the process is simulated in between, e.g., atp = 0.645. In order to find out whether this value is sub- or supercritical one has tocheck deviations for large t from a straight line in a double logarithmic plot. If thecurve veers down (up) the procedure is iterated using the upper (lower) interval. Indetecting the sign of curvature the human eye is quite reliable but it is also possibleto recognize it automatically. If there is no obvious deviation from a straight line thesimulation time and the number of runs has to be increased appropriately.
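The curvature test can indeed be automated. The sketch below reuses average_N from the previous example; the choice of the fitting windows (first and last decade) is an assumption, and ns[t] is assumed to stay positive inside these windows:

```python
import math

def loglog_slope(ns, t_lo, t_hi):
    """Two-point log-log slope of <N(t)>; assumes ns[t] > 0 in the window."""
    return (math.log(ns[t_hi]) - math.log(ns[t_lo])) / (math.log(t_hi) - math.log(t_lo))

def find_pc(p_lo=0.64, p_hi=0.65, tmax=1000, runs=2000, iterations=10):
    """Halve the interval [p_lo, p_hi]: a downward-curving <N(t)> in a
    log-log plot (late slope < early slope) signals a subcritical midpoint."""
    for _ in range(iterations):
        p_mid = 0.5 * (p_lo + p_hi)
        ns = average_N(p_mid, tmax=tmax, runs=runs)
        early = loglog_slope(ns, 10, 100)            # first decade
        late = loglog_slope(ns, tmax // 10, tmax)    # last decade
        if late < early:
            p_lo = p_mid      # curve veers down: use the upper interval
        else:
            p_hi = p_mid      # curve veers up: use the lower interval
        print(f"pc in [{p_lo:.6f}, {p_hi:.6f}]")
    return 0.5 * (p_lo + p_hi)
```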

Warning: When determining the numerical error of an estimated critical exponent, never use the statistical χ²-error of a standard regression! For example, for the data produced by the minimal program discussed above, a linear regression by xmgrace would give the result θ = 0.3017(2). However, the estimate in the literature, θ = 0.313686(8), lies clearly outside these error margins, meaning that the actual error must be much higher. A reliable estimate of the error can be obtained by comparing the slopes over the last decade of simulations performed at the upper and the lower bound of pc.

In principle the accuracy of this method is limited only by the available CPU time. We note, however, that this standard method assumes clean asymptotic power-law scaling, which for DP is indeed the case. In some cases, though, the power law may be superposed by slowly varying deviations, e.g. logarithmic corrections, so that the plotted data at criticality is actually not straight but slightly curved. With the method outlined above one is then tempted to 'compensate' this curvature by tuning the control parameter, leading to unknown systematic errors. Recently this happened, for example, in the case of the diffusive pair contact process, as will be described in Sect. 4.2.4.

4.1.3. The Domany-Kinzel cellular automaton

An important model for DP, which includes directed bond percolation as a special case, is the celebrated Domany-Kinzel model [19, 20]. The Domany-Kinzel model is a stochastic cellular automaton defined on a diagonal square lattice which evolves by parallel updates according to certain conditional transition probabilities P[si(t+1) | si−1(t), si+1(t)], where si(t) ∈ {0, 1} denotes the occupancy of site i at time t. These probabilities depend on two parameters p1, p2 and are defined by

P[1|0, 0] = 0 ,
P[1|0, 1] = P[1|1, 0] = p1 ,        (4.2)
P[1|1, 1] = p2 ,

with P[0|·, ·] = 1 − P[1|·, ·]. On a computer the Domany-Kinzel model can be implemented as follows. To determine the state si(t+1) of site i at time t+1 we generate for each site a random number zi(t) ∈ (0, 1) from a flat distribution and set

si(t + 1) = { 1 if si−1(t) ≠ si+1(t) and zi(t) < p1 ,
             1 if si−1(t) = si+1(t) = 1 and zi(t) < p2 ,        (4.3)
             0 otherwise .

This means that a site is activated with probability p2 if the two nearest neighbors at the previous time step were both active, while it is activated with probability p1 if only one of them was active.
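A direct transcription of rule (4.3) into Python might look as follows (a sketch; the ring geometry and the representation of the diagonal lattice by a single row are simplifications I have assumed):

```python
import random

def dk_step(s, p1, p2):
    """One parallel update of the Domany-Kinzel cellular automaton.
    s is a list of 0/1 occupancies with periodic boundary conditions."""
    L = len(s)
    new = [0] * L
    for i in range(L):
        left, right = s[(i - 1) % L], s[(i + 1) % L]
        z = random.random()
        if left != right:               # exactly one active neighbor
            new[i] = 1 if z < p1 else 0
        elif left == right == 1:        # two active neighbors
            new[i] = 1 if z < p2 else 0
        # zero active neighbors: the site stays inactive
    return new

# directed bond percolation as the special case p1 = p, p2 = p(2 - p)
p = 0.6447
s = [1 if i == 64 else 0 for i in range(128)]   # single active seed
for t in range(100):
    s = dk_step(s, p, p * (2 - p))
```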


Figure 4.7.: Phase diagram of the Domany-Kinzel model in the (p1, p2) plane, showing the inactive and active phases separated by a transition line with the special points for Wolfram rule 18 (W18), site DP, bond DP, and compact DP.

transition point    p1,c             p2,c             Ref.
Wolfram rule 18     0.801(2)         0                [22]
site DP             0.70548515(20)   0.70548515(20)   [18]
bond DP             0.644700185(5)   0.873762040(3)   [18]
compact DP          1/2              1                [?]

Table 4.1.: Special transition points in the (1+1)-dimensional Domany-Kinzel model.

Thus the model depends on two percolation probabilities p1 and p2, giving rise to the two-dimensional phase diagram shown in Fig. 4.7. The active and the inactive phase are now separated by a line of phase transitions. This line includes several special cases. For example, the previously discussed case of directed bond percolation corresponds to the choice p1 = p and p2 = p(2 − p). Another special case is directed site percolation [?], corresponding to the choice p1 = p2 = p. In this case all bonds are open but sites are permeable with probability p and blocked otherwise. Finally, the special case p2 = 0 is a stochastic generalization of the rule 'W18' of Wolfram's classification scheme of cellular automata [21]. Numerical estimates for the corresponding critical parameters are listed in Table 4.1.

There is strong numerical evidence that the critical behavior along the whole phase transition line (except for its upper terminal point) is that of DP, meaning that the transitions always exhibit the same type of long-range correlations. The short-range correlations, however, are non-universal and may change when moving along the phase transition line. They may even change the visual appearance of the clusters, as illustrated in Fig. 4.8, where four typical snapshots of critical clusters are compared. Although the large-scale structure of the clusters in the first three cases is roughly the same, the microscopic texture seems to become bolder as we move up along the phase transition line. As shown in Ref. [16], this visual impression can be traced back to an increase of the mean size of active islands.

Approaching the upper terminal point the mean size of active islands diverges and the cluster becomes compact. For this reason this special point is usually referred to as compact directed percolation. However, this nomenclature may be misleading for the following reasons.


Figure 4.8.: Domany-Kinzel model: critical clusters generated from a single active seed at different points along the phase transition line (panels: Wolfram 18, site DP, bond DP, compact DP; see text).

Figure 4.9.: Lattice geometry of directed bond percolation in 2+1 dimensions. The red lines represent a possible cluster generated at the origin.

The exceptional behavior at this point is due to an additional symmetry between active and inactive sites along the upper border of the phase diagram at p2 = 1. Here the DK model has two symmetric absorbing states, namely, the empty and the fully occupied lattice. For this reason the transition no longer belongs to the universality class of directed percolation; instead it becomes equivalent to the (1+1)-dimensional voter model [13, 23] or the Glauber-Ising model at zero temperature. Since the dynamic rules are invariant under the replacement p1 ↔ 1 − p1, the corresponding transition point is located at p1 = 1/2.

The Domany-Kinzel model can be generalized easily to higher spatial dimensions. For example, Fig. 4.9 shows a possible cluster in 2+1-dimensional directed bond percolation. Generally, in the d+1-dimensional Domany-Kinzel model the activation probability of site i at time t+1 depends on the number ni(t) = ∑_{j∈<i>} sj(t) of active nearest neighbors at time t, i.e. the conditional probabilities

P[1|0] = 0 ,
P[1|n] = pn ,   (1 ≤ n ≤ 2d)        (4.4)

are controlled by 2d parameters p1, . . . , p2d. The special case of directed bond percolation corresponds to the choice pn = 1 − (1 − p)^n, while for equal parameters pn = p one obtains directed site percolation in d+1 dimensions.


4.1.4. The contact process

Another important model for directed percolation, which is popular in mathematical communities, is the contact process. The contact process was originally introduced by Harris [24] as a model for epidemic spreading (see Sect. 4.3). It is defined on a d-dimensional square lattice whose sites can be either active (si(t) = 1) or inactive (si(t) = 0). In contrast to the Domany-Kinzel model, which is a stochastic cellular automaton with parallel updates, the contact process evolves by asynchronous updates, i.e., the three elementary processes (offspring production, on-site removal and diffusive moves) occur spontaneously at certain rates. Although the microscopic dynamics differs significantly from the Domany-Kinzel model, the contact process displays the same type of critical behavior at the phase transition. In fact, both models belong to the universality class of DP.

On a computer the d+1-dimensional contact process can be implemented as follows. For each attempted update a site i is selected at random. Depending on its state si(t) and the number of active neighbors ni(t) = ∑_{j∈<i>} sj(t), a new value si(t + dt) ∈ {0, 1} is assigned according to certain transition rates w[si(t) → si(t + dt), ni(t)]. In the standard contact process these rates are defined by

w[0 → 1, n] = λ n / 2d ,        (4.5)
w[1 → 0, n] = 1 .               (4.6)

Here the parameter λ plays the same role as the percolation probability in directed bond percolation. Its critical value depends on the dimension d. For example, in 1+1 dimensions the best-known estimate is λc ≈ 3.29785(8) [25].
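A common way to realize these rates in a simulation is to pick a random active particle and to choose between removal and offspring production; the sketch below is one such hedged implementation in 1+1 dimensions (the deterministic waiting-time increment is a simplification; for quantitative work one would draw exponentially distributed waiting times):

```python
import random

def contact_process(L=200, lam=3.2978, tmax=100.0):
    """Random-sequential simulation of the 1+1-dimensional contact process.
    Per active particle: removal with rate 1, offspring production with
    rate lam (a random neighbor is activated), matching Eqs. (4.5)-(4.6)."""
    active = [L // 2]                 # start from a single seed
    on = [False] * L
    on[L // 2] = True
    t = 0.0
    while active and t < tmax:
        t += 1.0 / (len(active) * (1.0 + lam))   # mean waiting time between events
        k = random.randrange(len(active))
        i = active[k]
        if random.random() < 1.0 / (1.0 + lam):
            on[i] = False                         # on-site removal, rate 1
            active[k] = active[-1]
            active.pop()
        else:
            j = (i + random.choice((-1, 1))) % L  # offspring at a random neighbor
            if not on[j]:
                on[j] = True
                active.append(j)
    return t, len(active)
```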

As demonstrated in Ref. [16], the evolution of the contact process can be described in terms of a master equation whose Liouville operator L can be constructed explicitly on a finite lattice. Diagonalizing this operator numerically one obtains a spectrum of relaxational modes with at least one zero mode which represents the absorbing state. In the limit of large lattices the critical threshold λc is the point beyond which the first gap in the spectrum of L vanishes.
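To illustrate this statement, the Liouville operator can be assembled explicitly for a small system. The following sketch is a didactic dense construction under the rate convention of Eqs. (4.5)-(4.6) on a ring (for serious work one would exploit sparseness and symmetries, which is not done here):

```python
import numpy as np

def liouvillian(L, lam):
    """Liouville operator of the contact process on a ring of L sites,
    acting on probability vectors over the 2^L configurations:
    d/dt P = -L_op P, with L_op[c2, c] = -w(c -> c2) off the diagonal."""
    dim = 2 ** L
    L_op = np.zeros((dim, dim))
    for c in range(dim):
        s = [(c >> i) & 1 for i in range(L)]
        for i in range(L):
            n = s[(i - 1) % L] + s[(i + 1) % L]
            if s[i] == 1:
                w = 1.0                  # removal 1 -> 0 with rate 1
            else:
                w = lam * n / 2.0        # activation with rate lam*n/2d (d = 1)
            if w > 0:
                c2 = c ^ (1 << i)        # configuration with site i flipped
                L_op[c2, c] -= w         # gain term
                L_op[c, c] += w          # loss term (diagonal)
    return L_op

# low-lying relaxation spectrum on a small ring; the lowest eigenvalue is 0
# (the empty lattice is absorbing)
ev = np.sort(np.linalg.eigvals(liouvillian(8, 3.2978)).real)
print(ev[:5])
```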

4.1.5. The critical exponents η, η′, ν⊥, and ν‖

In equilibrium statistical physics, continuous phase transitions such as the one in the Ising model can be described in terms of a phenomenological scaling theory. For example, the spontaneous magnetization M in the ordered phase vanishes as |M| ∼ (Tc − T)^η as the critical point is approached. Here η is a universal critical exponent, i.e. its value is independent of the specific realization of the model. Similarly, the correlation length ξ diverges as ξ ∼ |T − Tc|^−ν for T → Tc with another universal exponent ν. The critical point itself is characterized by the absence of a macroscopic length scale, so that the system is invariant under suitable scaling transformations (see below).

In directed percolation and other non-equilibrium phase transitions into absorbing states the situation is very similar. However, as a non-equilibrium system involves time, which is different from space in character, there are now two different correlation lengths,


namely, a spatial correlation length ξ⊥ and a temporal correlation length ξ‖ with two different associated exponents ν⊥ and ν‖. Their ratio z = ν‖/ν⊥ is called the dynamical exponent as it relates spatial and temporal scales at criticality.

What is the analogue of the magnetization in DP? As shown above, in absorbing phase transitions the choice of the order parameter depends on the initial configuration. If homogeneous initial conditions are used, the appropriate order parameter is the density of active sites at time t,

ρ(t) = lim_{L→∞} (1/L) ∑_i si(t) .        (4.7)

Here the density is defined as a spatial average in the limit of large system sizes L → ∞. Alternatively, for a finite system with periodic boundary conditions we may express the density as

ρ(t) = 〈si(t)〉 , (4.8)

where 〈. . .〉 denotes the ensemble average over many realizations of randomness. Because of translational invariance the index i is arbitrary. Finally, if the process starts with a single seed, possible order parameters are the average mass of the cluster

N(t) = 〈∑_i si(t)〉        (4.9)

and the survival probability

P(t) = 〈1 − ∏_i (1 − si(t))〉 .        (4.10)

These quantities allow us to define the four standard exponents

ρ(∞) ∼ (p − pc)^η ,        (4.11)
P(∞) ∼ (p − pc)^η′ ,       (4.12)
ξ⊥ ∼ |p − pc|^−ν⊥ ,        (4.13)
ξ‖ ∼ |p − pc|^−ν‖ .        (4.14)

The necessity of two different exponents η and η′ can be explained in the framework of a field-theoretic treatment, where these exponents are associated with particle creation and annihilation operators, respectively. In DP, however, a special symmetry, called rapidity reversal symmetry, ensures that η = η′. This symmetry can be proven most easily in the case of directed bond percolation, where the density ρ(t) starting from a fully occupied lattice and the survival probability P(t) for clusters grown from a seed are exactly equal for all t. Hence both quantities scale identically and the two corresponding exponents have to be equal. This is the reason why DP is characterized by only three instead of four critical exponents.

The cluster mass N(t) ∼ t^θ scales algebraically as well. The associated exponent, however, is not independent; instead it can be expressed in terms of the so-called generalized hyperscaling relation [26]

θ = (d ν⊥ − η − η′) / ν‖ .        (4.15)

In order to determine the critical point of a given model by numerical methods, N(t) turned out to be one of the most sensitive quantities.
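As a quick consistency check, inserting the d = 1 values from Table 4.2 below into Eq. (4.15) reproduces the cluster-mass exponent:

θ = (1 · 1.096854 − 0.276486 − 0.276486) / 1.733847 = 0.543882 / 1.733847 ≈ 0.31369 ,

in agreement with the directly measured value θ = 0.313686(8).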


4.1.6. Scaling laws

The starting point of a phenomenological scaling theory for absorbing phase transitions is the assumption that the macroscopic properties of the system close to the critical point are invariant under scaling transformations of the form

Δ → a Δ ,   ~x → a^−ν⊥ ~x ,   t → a^−ν‖ t ,   ρ → a^η ρ ,   P → a^η′ P ,        (4.16)

where a > 0 is some scaling factor and Δ = p − pc denotes the distance from criticality. Scaling invariance strongly restricts the form of functions. For example, let us consider the decay of the average density ρ(t) at the critical point starting with a fully occupied lattice. This quantity has to be invariant under rescaling, hence ρ(t) = a^η ρ(t a^−ν‖). Choosing a such that t a^−ν‖ = 1 we arrive at ρ(t) = t^−η/ν‖ ρ(1), hence

ρ(t) ∼ t^−δ ,   δ = η/ν‖ .        (4.17)

Similarly, starting from an initial seed, the survival probability P(t) decays as

P(t) ∼ t^−δ′ ,   δ′ = η′/ν‖        (4.18)

with δ = δ′ in the case of DP.

In an off-critical finite-size system, the density ρ(t, Δ, L) and the survival probability P(t, Δ, L) depend on three parameters. By a similar calculation it is easy to show that scaling invariance always reduces the number of parameters by 1, expressing the quantity of interest as a leading power law times a scaling function that depends on scaling-invariant arguments. Such expressions are called scaling forms. For the density and the survival probability these scaling forms read

ρ(t, Δ, L) ∼ t^−η/ν‖ f(Δ t^{1/ν‖}, t^{d/z}/L) ,        (4.19)
P(t, Δ, L) ∼ t^−η′/ν‖ f′(Δ t^{1/ν‖}, t^{d/z}/L) .       (4.20)
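In practice such scaling forms are verified by a data collapse: curves measured at different distances Δ fall onto a single master curve after rescaling. A hedged matplotlib sketch (assuming a user-supplied function rho_of_t(p, tmax) that returns the measured density for t = 1 .. tmax, e.g. from a simulation like the ones above, and using the d = 1 exponents of Table 4.2):

```python
import numpy as np
import matplotlib.pyplot as plt

DELTA = 0.159464          # delta = eta / nu_parallel in d = 1 (Table 4.2)
NU_PAR = 1.733847         # nu_parallel in d = 1
PC = 0.644700185          # critical point of directed bond percolation

def collapse_plot(rho_of_t, ps, tmax=10000):
    """Plot rho * t^delta against Delta * t^(1/nu_par); near-critical curves
    measured at different p should collapse onto a single master curve."""
    t = np.arange(1, tmax + 1)
    for p in ps:
        rho = np.asarray(rho_of_t(p, tmax))       # rho(t) for t = 1 .. tmax
        x = (p - PC) * t ** (1.0 / NU_PAR)        # scaling-invariant argument
        y = rho * t ** DELTA                      # compensated density
        plt.plot(x, y, label=f"p = {p}")
    plt.xlabel("Delta * t^(1/nu_par)")
    plt.ylabel("rho(t) * t^delta")
    plt.legend()
    plt.show()
```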

4.1.7. Universality

As outlined in the introduction, the working hypothesis in the field of continuous non-equilibrium phase transitions is the notion of universality. This concept expresses the expectation that the critical behavior of such transitions can be associated with a finite set of possible universality classes, each corresponding to a certain type of underlying field theory. The field-theoretic action involves certain relevant operators whose form is usually determined by the symmetry properties of the process, while other details of the microscopic dynamics lead to contributions which are irrelevant in the field-theoretic sense. This explains why various different models may belong to the same universality class.

In particular the DP class – the "Ising" class of non-equilibrium statistical physics – is extremely robust with respect to the microscopic dynamic rules. The large variety and robustness of DP models led Janssen and Grassberger to the conjecture that a model should belong to the DP universality class if the following conditions hold [27, 28]:


critical   MF     d = 1           d = 2      d = 3     d = 4 − ε
η          1      0.276486(8)     0.584(4)   0.81(1)   1 − ε/6 − 0.01128 ε²
ν⊥         1/2    1.096854(4)     0.734(4)   0.581(5)  1/2 + ε/16 + 0.02110 ε²
ν‖         1      1.733847(6)     1.295(6)   1.105(5)  1 + ε/12 + 0.02238 ε²
z          2      1.580745(10)    1.76(3)    1.90(1)   2 − ε/12 − 0.02921 ε²
δ          1      0.159464(6)     0.451      0.73      1 − ε/4 − 0.01283 ε²
θ          0      0.313686(8)     0.230      0.12      ε/12 + 0.03751 ε²

Table 4.2.: Critical exponents of directed percolation obtained by mean field (MF), numerical, and field-theoretical methods.

1. The model displays a continuous phase transition from a fluctuating active phase into a unique absorbing state.

2. The transition is characterized by a positive one-component order parameter.

3. The dynamic rules involve only short-range processes.

4. The system has no unconventional attributes such as additional symmetries or quenched randomness.

Although this conjecture has not yet been proven rigorously, it is strongly supported by numerical evidence. In fact, DP seems to be even more general and has been identified even in systems that violate some of the four conditions.

The universality classes can be characterized in terms of their critical exponents and scaling functions. Hence, in order to identify a certain universality class, a precise estimation of the critical exponents is an important numerical task. In the case of DP, the numerical estimates suggest that the critical exponents are given by irrational numbers rather than simple rational values. In addition, scaling functions (such as f and f′ in Eqs. (4.19)-(4.20)), which were ignored in the literature for a long time, provide a wealth of useful information, as shown e.g. in a recent review by Lübeck [15].

4.1.8. Langevin equation

On a coarse-grained level DP is often described in terms of a phenomenological Langevin equation with a field-dependent noise. This Langevin equation can be derived rigorously from the master equation of the contact process [27] and reads

∂t ρ(~x, t) = a ρ(~x, t) − λ ρ²(~x, t) + D ∇²ρ(~x, t) + ξ(~x, t) .        (4.21)

Here ξ(~x, t) is a density-dependent Gaussian noise field with the correlations

〈ξ(~x, t)〉 = 0 , (4.22)〈ξ(~x, t)ξ(~x′, t′)〉 = Γ ρ(~x, t) δd(~x−~x′) δ(t− t′) . (4.23)

Since the amplitude of ξ(~x, t) is proportional to √ρ(~x, t), the absorbing state ρ(~x, t) = 0 does not fluctuate. The square-root behavior is related to the fact that the noise describes density fluctuations on a coarse-grained scale, which can be viewed as the sum


Figure 4.10.: Electric current running through a random resistor-diode network at the percolation threshold from one point to another. The right panel shows a particular realization; the color and thickness of the lines represent the intensity of the current.

of individual noise contributions generated by each particle, averaged over some mesoscopic box size. According to the central limit theorem, if the number of particles in this box is sufficiently high, ξ(~x, t) approaches a Gaussian distribution with an amplitude proportional to the square root of the number of active sites in the box.
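For illustration, Eq. (4.21) can be integrated numerically. The sketch below uses a naive Euler-Maruyama step with the density clipped at zero (an assumption made purely for illustration: discretizing square-root multiplicative noise is delicate, and quantitative studies require specialized integration schemes):

```python
import numpy as np

def dp_langevin(L=256, steps=20000, dt=0.01, a=0.1, lam=1.0, D=1.0, Gamma=1.0):
    """Euler-Maruyama integration of the DP Langevin equation in d = 1
    with noise amplitude sqrt(Gamma * rho), Ito interpretation."""
    rho = np.ones(L)                      # fully occupied initial state
    for _ in range(steps):
        lap = np.roll(rho, 1) - 2 * rho + np.roll(rho, -1)   # discrete Laplacian
        noise = np.sqrt(Gamma * rho * dt) * np.random.randn(L)
        rho += dt * (a * rho - lam * rho**2 + D * lap) + noise
        np.clip(rho, 0.0, None, out=rho)  # enforce rho >= 0 (crude regularization)
    return rho.mean()

print(dp_langevin())
```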

Applying the scaling transformation (4.16) to Eqs. (4.21)-(4.22), a simple dimensional analysis gives the mean field critical point ac = 0 and the mean field exponents

η^MF = η′^MF = 1 ,   ν⊥^MF = 1/2 ,   ν‖^MF = 1 .        (4.24)

In this situation the noise is irrelevant in d > 4, marginal in d = 4, and relevant in d < 4 dimensions. This means that dc = 4 is the upper critical dimension of directed percolation, above which the mean field exponents are correct. Below dc the exponents can be determined by a renormalization group study of the corresponding field theory. A comprehensive introduction to the field theory of DP and other universality classes is beyond the scope of these lecture notes; the interested reader is referred to the recent and excellent review articles by Janssen, Täuber, Howard, and Lee [29, 30]. We note that Eq. (4.21) is the minimal Langevin equation needed to describe DP. It may also include higher-order terms such as ρ³(~x, t), ∇⁴ρ(~x, t), or higher-order contributions of the noise, but in the field-theoretic sense these contributions turn out to be irrelevant under renormalization group transformations, explaining the robustness of DP.

4.1.9. Multifractal properties of currents on directed percolation clusters

So far we have seen that the critical behavior of DP and other absorbing phase transitions can be described in terms of scaling laws that involve three independent critical exponents η, ν⊥, and ν‖. This type of scaling is usually referred to as simple scaling, as opposed to multiscaling, where a whole spectrum of exponents exists. For example, in DP at criticality, starting with a homogeneous initial state, any integral power ρⁿ of the order parameter ρ scales in the same way, i.e.

ρⁿ(t) ∼ t^−δ ,   n = 1, 2, 3, . . .        (4.25)

Let us now consider an electric current running on a directed percolation cluster according to Kirchhoff's laws, interpreting the cluster as a random resistor-diode network. By


introducing such a current the theory is extended by an additional physical concept. In fact, even though the DP cluster itself is known to be characterized by simple scaling laws, a current running on it turns out to be distributed in a multifractal manner. This phenomenon was first discovered in the case of isotropic percolation [31, 32] and then confirmed for DP [33, 34].

As shown in Fig. 4.10, in directed bond percolation at criticality an electric current I running from one point to another is characterized by a non-trivial distribution of currents. The multifractal structure of this current distribution can be probed by studying the moments

M_ℓ := ∑_b (I_b / I)^ℓ ,   ℓ = 0, 1, 2, . . . ,        (4.26)

where the sum runs over all bonds b that transport a non-vanishing current I_b > 0. For example, M_0 is just the number of conducting bonds, while M_1 is essentially the total resistance between the two points. M_2 is the second cumulant of the resistance fluctuations and can be considered as a measure of the noise in a given realization. Finally, M_∞ is the number of so-called red bonds that carry the full current I. The quantity M_ℓ is found to scale as a power law

M_ℓ(t) ∼ t^{ψ_ℓ/ν‖} .        (4.27)

In the case of simple scaling the exponents ψ_ℓ would depend linearly on ℓ. In the present case, however, a non-linear dependence is found both by field-theoretic and by numerical methods (see Ref. [34]). This proves that electric currents running on DP clusters have multifractal properties.

Again it should be emphasized that multifractality is not a property of DP itself; rather it emerges as a new feature whenever an additional process, here the transport of electric currents, is confined to live on the critical clusters of DP.

4.1.10. Characterizing non-equilibrium transitions by Yang-Lee zeros in the complex plane

In equilibrium statistical mechanics a large variety of continuous phase transitions has been analyzed by studying the distribution of so-called Yang-Lee zeros [35, 36, 37]. To determine these zeros the partition sum of a (finite) equilibrium system is expressed as a polynomial of the control parameter, which is usually a function of temperature. E.g., for the Ising model the zeros of this polynomial lie on a circle in the complex plane and pinch the real line from both sides in the vicinity of the phase transition as the system size increases. This explains why the analytic behavior in a finite system crosses over to a non-analytic behavior at the transition point in the thermodynamic limit.

Recently it has been shown that the concept of Yang and Lee can also be applied to non-equilibrium systems [38], including DP [39]. To this end one has to consider the order parameter in a finite percolation tree as a function of the percolation probability p in the complex plane. This can be done by studying the survival probability P(t) (see Eq. (4.10)), which is defined as the probability that a cluster generated from a single seed site at time t = 0 survives at least up to time t. In fact, the partition sum of an equilibrium system and the survival probability of DP are similar in many respects. They both


Figure 4.11.: Distribution of Yang-Lee zeros of the polynomial P(15) in the complex plane (Re(p) versus Im(p)). The transition point is marked by an arrow.

are positive in the physically accessible regime and can be expressed as polynomials in finite systems. As the system size tends to infinity, both functions exhibit a non-analytic behavior at the phase transition as the Yang-Lee zeros in the complex plane approach the real line.

In directed bond percolation the survival probability is given by the sum over the weights of all possible configurations of bonds, where each conducting bond contributes to the weight with a factor p, while each non-conducting bond contributes with a factor 1 − p. As shown in Ref. [39], the polynomial for the survival probability can be expressed as a sum over all cluster configurations c reaching the horizontal row at time t. The polynomial is of the form

P(t) = ∑_c p^n (1 − p)^m ,        (4.28)

where n denotes the number of bonds of the configuration while m is the number of bonds belonging to its cluster's hull. Summing up all weights in Eq. (4.28) one obtains a polynomial of degree t² + t. For example, the first few polynomials are given by

P(0) = 1        (4.29)
P(1) = 2p − p^2
P(2) = 4p^2 − 2p^3 − 4p^4 + 4p^5 − p^6
P(3) = 8p^3 − 4p^4 − 10p^5 − 3p^6 + 18p^7 + 5p^8 − 30p^9 + 24p^10 − 8p^11 + p^12
P(4) = 16p^4 − 8p^5 − 24p^6 − 8p^7 + 6p^8 + 84p^9 − 29p^10 − 62p^11 − 120p^12
       + 244p^13 + 75p^14 − 470p^15 + 495p^16 − 268p^17 + 83p^18 − 14p^19 + p^20

As t increases, the number of cluster configurations grows rapidly, leading to complicated polynomials with very large coefficients. The distribution of zeros in the complex plane for t = 15 is shown in Fig. 4.11. As can be seen, the distribution is reminiscent of a


fractal, perhaps being a signature of the non-integrable nature of DP. As expected, the zeros approach the phase transition point from above and below. Their distance to the transition point is found to scale as t^−1/ν‖, in agreement with basic scaling arguments.
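The zeros can be located numerically from the polynomials listed above. The following sketch feeds the coefficients of P(4), copied from the expansion in Eq. (4.29) ff., to numpy (the four roots at p = 0 simply reflect the overall factor p^4):

```python
import numpy as np

# coefficients of P(4), from the highest power p^20 down to p^0
coeffs = [1, -14, 83, -268, 495, -470, 75, 244, -120, -62, -29, 84,
          6, -8, -24, -8, 16, 0, 0, 0, 0]

for z in np.roots(coeffs):
    print(f"{z.real:+.4f} {z.imag:+.4f}i")
```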

4.2. Other classes of absorbing phase transitions

So far we have discussed directed percolation as the most important class of non-equilibrium phase transitions into absorbing states. Because of the robustness of DP it is interesting to search for other universality classes. The ultimate goal would be to set up a table of all possible non-trivial universality classes of transitions from active phases into absorbing states.

Although various exceptions from DP have been identified, the number of firmly established universality classes is still small. A recent summary of the status quo can be found in Refs. [14, 15]. In these lecture notes, however, we will only address the most important classes with local interactions.

4.2.1. Parity-conserving particle processes

The parity-conserving (PC) universality class comprises phase transitions that occur in reaction-diffusion processes of the form

A → (n + 1)A ,        (4.30)
2A → ∅ ,              (4.31)

combined with single-particle diffusion, where the number of offspring n is assumed to be even. As an essential feature, these processes conserve the number of particles modulo 2. A particularly simple model in this class with n = 2 was proposed by Zhong and ben-Avraham [40]. The estimated critical exponents

η = η′ = 0.92(2) , ν‖ = 3.22(6) , ν⊥ = 1.83(3) (4.32)

differ significantly from those of DP, establishing PC transitions as an independent universality class. The actual values of δ and θ depend on the initial condition. If the process starts with a single particle, it will never stop because of parity conservation; hence δ = 0, i.e. the usual relation δ = η/ν‖ no longer holds. However, if it starts with two particles, the roles of δ and θ are exchanged, i.e. θ = 0. The theoretical reasons for this exchange are not yet fully understood. A schematic simulation of this model class is sketched below.
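The following Python sketch implements a generic branching-annihilating random walk with two offspring in the spirit of this model class (the precise update rules of Ref. [40] may differ; the offspring placement and annihilation-on-contact conventions are assumptions):

```python
import random

def baw_step(occ, L, p_branch):
    """One update attempt of a branching-annihilating random walk
    A -> 3A, 2A -> 0 on a ring. occ is a boolean lattice; particles
    annihilate pairwise on contact, so every move changes the particle
    number by 0 or +-2, conserving parity."""
    sites = [i for i in range(L) if occ[i]]
    if not sites:
        return
    i = random.choice(sites)
    if random.random() < p_branch:
        # branching: place two offspring on the neighboring sites;
        # landing on an occupied site annihilates both particles
        for j in ((i - 1) % L, (i + 1) % L):
            occ[j] = not occ[j]
    else:
        # diffusion: hop to a random neighbor, annihilating on contact
        j = (i + random.choice((-1, 1))) % L
        occ[i] = False
        occ[j] = not occ[j]
```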

The relaxational properties in the subcritical phase also differ significantly from the standard DP behavior. While the particle density in DP models decays exponentially as ρ(t) ∼ e^−t/ξ‖, in PC models it decays algebraically, since the decay is governed by the annihilation process 2A → ∅.

A systematic field theory for PC models can be found in Refs. [41, 42], confirming the existence of the annihilation fixed point in the inactive phase. However, the field-theoretic treatment at criticality is extremely difficult, as there are two critical dimensions: dc = 2, above which mean-field theory applies, and d′c ≈ 4/3, where for d > d′c (d < d′c) the branching process is relevant (irrelevant) at the annihilation fixed point. Therefore, the physically interesting spatial dimension d = 1 cannot be accessed by a controlled ε-expansion down from the upper critical dimension dc = 2.


Figure 4.12.: Coarsening of a random initial state in the Glauber-Ising model at zero temperature (left) compared to the coarsening in the classical voter model (right).


4.2.2. The voter universality class

Order-disorder transitions in models with a Z2 symmetry which are driven by interfacial noise belong to the so-called voter universality class [23]. As will be explained below, the voter class and the parity-conserving class are identical in one spatial dimension but different in higher dimensions.

To understand the physical mechanism that generates the phase transition in the voter model, let us first discuss the difference between interfacial and bulk noise. Consider for example the Glauber-Ising model in two spatial dimensions at T = 0. This model has two Z2-symmetric absorbing states, namely, the two fully ordered states. Starting with a random initial configuration one observes a coarsening process forming ordered domains whose size grows as √t. In the Ising model at T = 0 domain growth is curvature-driven, leading to an effective surface tension of the domain walls. In fact, as shown in Fig. 4.12, the domain walls produced by the Glauber-Ising model appear to be smooth, and indeed the density of domain walls is found to decay as 1/√t. With increasing temperature occasional spin flips occur, leading to the formation of small minority islands inside the existing domains. At small temperature the influence of surface tension is strong enough to eliminate these minority islands, stabilizing the ordered phase. However, increasing T above a certain critical threshold Tc this mechanism breaks down, leading to the well-known order-disorder phase transition in the Ising model. Thus, from the perspective of a dynamical process, the Ising transition results from a competition between the surface tension of domain walls and bulk noise.

Let us now compare the Glauber-Ising model with the classical voter model in two spatial dimensions. The classical voter model [13] is a caricatural process in which sites (voters) on a square lattice adopt the opinion of a randomly chosen neighbor. Like the Ising model, the voter model has two symmetric absorbing states. Moreover, an initially disordered state coarsens. However, as shown in the right panel of Fig. 4.12, already the visual appearance is very different. In fact, in the voter model the domain sizes are found to be distributed over the whole range between 1 and √t. Moreover, in contrast


to the Glauber-Ising model, the density of domain walls decays only logarithmically as 1/ln t. This marginality of the voter model is usually attributed to the exceptional character of its analytic properties [43, 44, 45] and may be interpreted physically as the absence of surface tension.

In the voter model even very small thermal bulk noise would immediately lead to a disordered state. However, adding interfacial noise one observes a non-trivial continuous phase transition at a finite value of the noise amplitude. Unlike bulk noise, which flips spins everywhere inside the ordered domains, interfacial noise restricts spin flips to sites in the vicinity of domain walls.

Recently Al Hammal et al. [46] introduced a Langevin equation describing voter transitions. It is given by

∂t ρ = (aρ − bρ³)(1 − ρ²) + D∇²ρ + σ √(1 − ρ²) ξ ,        (4.33)

where ξ is a Gaussian noise with constant amplitude. For b > 0 this equation is found to exhibit separate Ising and DP transitions, while for b ≤ 0 a genuine voter transition is observed. With these new results the voter universality class now stands on a much firmer basis than before.

In one spatial dimension, kinks between domains can be interpreted as particles. Here interfacial noise between two domains amounts to generating pairs of additional domain walls nearby. This process, by its very definition, conserves parity and can be interpreted as offspring production A → 3A, 5A, . . ., while pairwise coalescence of domain walls corresponds to particle annihilation 2A → ∅. For this reason the voter class and the parity-conserving class coincide in one spatial dimension. However, their behavior in higher dimensions, in particular the corresponding field theories, is expected to be different. Loosely speaking, the parity-conserving class deals with the dynamics of zero-dimensional objects (particles), while in the voter class the objects of interest are (d−1)-dimensional hypermanifolds (domain walls).

4.2.3. Absorbing phase transitions with a conserved field

According to the conjecture by Janssen and Grassberger (cf. Sect. 4.1.7), non-DP behavior is expected if the dynamics is constrained by additional conservation laws. For example, as shown in the previous subsections, parity conservation or a Z2 symmetry may lead to different universality classes. Let us now consider phase transitions in particle processes in which the total number of particles is conserved. According to an idea by Rossi et al. [47] this leads to a different universality class of phase transitions which is characterized by an effective coupling of the process to a non-diffusive conserved field. Models in this class have infinitely many absorbing states and are related to certain models of self-organized criticality (for a recent review see Ref. [15]).

As an example let us consider the conserved threshold transfer process (CTTP). In this model each lattice site can be vacant or occupied by either one or two particles. Empty and singly occupied sites are considered as inactive, while doubly occupied sites are regarded as active. According to the dynamical rules each active site attempts to


move the two particles randomly to neighboring sites, provided that these target sites are inactive. By definition of these rules the total number of particles is conserved. Clearly, it is the background of solitary particles that serves as a conserved field to which the dynamics of active sites is coupled. A minimal simulation sketch is given below.
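A minimal Python sketch of the CTTP update in one dimension (the handling of blocked transfers varies between implementations; the choices below are assumptions):

```python
import random

def cttp_step(n, L):
    """One update attempt of the conserved threshold transfer process.
    n[i] in {0, 1, 2}; sites with n[i] == 2 are active. An active site
    tries to move each of its two particles to a random neighbor; a move
    is accepted only if the target site is inactive (n < 2 before the move)."""
    active = [i for i in range(L) if n[i] == 2]
    if not active:
        return False
    i = random.choice(active)
    for _ in range(2):                      # try to transfer both particles
        j = (i + random.choice((-1, 1))) % L
        if n[j] < 2:                        # target must be inactive
            n[i] -= 1
            n[j] += 1
    return True
```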

In d ≥ 2 spatial dimensions this model shows the same critical behavior as the Manna sand pile model [48]. The corresponding critical exponents in d = 2 dimensions were estimated in Ref. [15] as

η = 0.639(9) , η′ = 0.624(29) , ν⊥ = 0.799(14) , ν‖ = 1.225(29). (4.34)

Obviously, this set of exponents differs from those of all other classes discussed above. In one spatial dimension the situation is more complicated because the CTTP and Manna universality classes split, as described in detail in Ref. [15].

4.2.4. The diffusive pair contact process

Among the known transitions into absorbing states, the transition occurring in the so-called pair contact process with diffusion (PCPD) is probably the most puzzling one (see Ref. [49] for a recent review). The PCPD is a reaction-diffusion process of particles which react spontaneously whenever two of them come into contact. In its simplest version the PCPD involves two competing reactions, namely

fission:        2A → 3A ,
annihilation:   2A → ∅ .

In addition, individual particles are allowed to diffuse. Moreover, there is an additional mechanism ensuring that the particle density cannot diverge. In models with at most one particle per site this mechanism is incorporated automatically.

The PCPD displays a non-equilibrium phase transition caused by the competition of fission and annihilation. In the active phase the fission process dominates, maintaining a fluctuating steady state, while in the subcritical phase the annihilation process dominates, so that the density of particles decreases continuously until the system reaches one of the absorbing states. The PCPD actually has two absorbing states, namely, the empty lattice and a homogeneous state with a single diffusing particle.

The pair contact process with diffusion was already suggested in 1982 by Grassberger [50], who expected a critical behavior "distinctly different" from DP. Eight years ago the problem was rediscovered by Howard and Täuber [51], who proposed a bosonic field theory for the one-dimensional PCPD. In this theory the particle density is unrestricted and thus diverges in the active phase. The first quantitative study of a restricted PCPD by Carlon et al. [52] using DMRG techniques led to controversial results and triggered a still ongoing debate concerning the asymptotic critical behavior of the 1+1-dimensional PCPD at the transition. Currently the main viewpoints are that the PCPD

• represents a new universality class with well-defined critical exponents [53],

• represents two different universality classes depending on the diffusion rate [54,55] and/or the number of space dimensions [56],


Figure 4.13.: High-performance simulation of the PCPD model introduced by Kockelkoren and Chaté. The plot shows the density of active sites multiplied by the expected power law (ρ(t) t^0.2 versus t in Monte Carlo steps, for p = 0.795400 to 0.795420, with p = 0.795410 being the previously estimated critical point). As can be seen, the lines are slightly curved. Kockelkoren and Chaté simulated the process up to about 10^7 Monte Carlo updates (dotted line), identifying the middle curve as critical. Extending these simulations by one decade one recognizes that this curve is actually subcritical and that the true critical threshold has to be slightly higher. Obviously a slow drift towards DP (slope indicated by dashed line) cannot be excluded.

• can be interpreted as a cyclically coupled DP and annihilation process [57],

• is a marginally perturbed DP process with continuously varying exponents [58],

• may have exponents depending continuously on the diffusion constant [59],

• may cross over to DP after a very long time [60, 61], and

• is perhaps related to the problem of non-equilibrium wetting in 1+1 dimensions [62].

Personally I am in favor of the conjecture that the PCPD in 1+1 dimensions belongs to the DP class. This DP behavior, however, is masked by extremely slow (probably logarithmic) corrections. Searching for the critical point by fitting straight lines in a double-logarithmic plot may therefore lead to systematic errors in the estimate of the critical threshold, since the true critical line is not straight but slightly curved. This in turn leads to even larger systematic errors for the critical exponents. However, as the computational effort is increased, these estimates seem to drift towards DP exponents.

The problem of systematic errors and drifting exponents can be observed, for example, in the work by Kockelkoren and Chaté, who tried to establish the PCPD as a new universality class as part of a general classification scheme [53]. Introducing a particularly efficient model they observed clean power laws in the decay of the density over several decades, leading to the estimates

δ = η/ν‖ = 0.200(5) ,   z = ν‖/ν⊥ = 1.70(5) ,   η = 0.37(2) .        (4.35)

However, increasing the numerical effort by a decade in time, it turns out that their critical point pc = 0.795410(5), including its error margin, lies entirely in the inactive


Figure 4.14.: Directed percolation as a caricature of an epidemic process (infection and recovery).

phase (see Fig. 4.13). In the attempt to obtain an apparent power-law behavior, it seems that the authors systematically underestimated the critical point.

Presently it is still not clear whether the PCPD belongs to the DP universality class or not. Apparently computational methods have reached their limit and more sophisticated techniques are needed to settle this question.

4.3. Epidemic spreading with long-range interactions

Directed percolation is often used as a caricatural process for epidemic spreading. Suppose that infected and healthy individuals are sitting in a train, as shown in Fig. 4.14. On the one hand, infected people infect their nearest neighbors with a certain probability per unit time. On the other hand, infected individuals may recover spontaneously. Depending on the rates for infection and recovery, this toy model for epidemic spreading just resembles a simple DP process.

Although DP is too simplistic to describe epidemic spreading in reality, there are some important analogies. Certainly, epidemic spreading in Nature is a non-equilibrium process with a transition-like behavior at some threshold of the infection rate. For example, as an increasing number of people refuse vaccinations, the question arises at which percentage of unprotected individuals certain diseases that have become almost extinct will again percolate through the society.

Epidemic spreading in Nature is of course a much more complex phenomenon. For example, it takes place in a very disordered environment and involves short- and long-range interactions. Moreover, individuals protect themselves by sophisticated immunization strategies. Certainly, physicists will never be able to predict epidemic spreading in Nature quantitatively. However, it is possible to extend DP towards a more realistic description of epidemic spreading and to study how such extensions influence the behavior at the transition. Some of these extensions will be discussed in the following.

4.3.1. Immunization and mutations

As a first step towards a more realistic description of epidemic spreading we may include the effect of immunization. For example, we may declare all sites that were active at least once in the past as immune. One then introduces two different infection probabilities, namely, a probability for first infection p0, and a second (usually smaller)


Figure 4.15.: Phase diagram for directed percolation with immunization in the plane of first-infection probability p0 and reinfection probability p, with regions of no growth, annular growth (GEP line), and compact growth meeting in a multicritical point (see text). The right panel shows spreading by annular growth with fresh (green), active (red), and immune (yellow) individuals.

probability p for the reinfection of immune sites. The case of perfect immunization (vanishing reinfection probability) is known as the general epidemic process [63], which can be regarded as a dynamical procedure to grow isotropic percolation clusters.

Introducing a finite reinfection probability one obtains the phase diagram shown in Fig. 4.15. It comprises a curved phase transition line with the same critical behavior as in the general epidemic process, which separates the phases of finite and annular growth. Moreover, there is a horizontal transition line above which compact cluster growth is observed. The critical properties along this line are partly dictated by the DP behavior inside immune regions, combined with non-universal properties for the growth of the clusters at their boundaries [64]. Both transition lines meet in a point with an interesting multicritical behavior. Extending this model by possible mutations of the spreading agent, the memory of immunization is lost; as a result one observes a controlled crossover back to DP [65].

4.3.2. Long-range infections

Realistic diseases spread by different transport mechanisms, including direct contact between local individuals, transport by carriers such as mosquitoes, and long-range transport, e.g. by airplanes. Usually it is very difficult to predict how these transport mechanisms contribute to epidemic spreading. As an interesting empirical approach, Brockmann and Geisel traced the spatio-temporal trajectories of individual dollar notes within the United States [66, 67]. In agreement with previous conjectures [68] they found that the transport distances are distributed algebraically with some empirical exponent. Moreover, the time intervals at which the dollar notes were registered were found to obey a power law as well.

Motivated by such empirical studies it is natural to generalize DP such that the spreading distances r are distributed as a power law

P(r) ∼ r^−d−σ ,   (σ > 0)        (4.36)


Figure 4.16.: Phase diagram of DP with spatio-temporal Lévy flights in the plane of the control exponents σ and κ, comprising mean-field regions (MF, MFL, MFI), a DP region, a region dominated by spatial Lévy flights, a region dominated by incubation times, and a mixed phase.

where σ is a control exponent. In the literature such algebraically distributed long-range displacements are known as Lévy flights [69] and have been studied extensively, e.g. in the context of anomalous diffusion [70]. In the present context of epidemic spreading it turns out that such long-range flights do not destroy the transition; instead they change the critical behavior provided that σ is sufficiently small. More specifically, it was observed both numerically and in mean field approximations that the critical exponents change continuously with σ [71, 72, 73]. As a major breakthrough, Janssen et al. introduced a renormalizable field theory for epidemic spreading transitions with spatial Lévy flights [74], computing the critical exponents to one-loop order. Because of an additional scaling relation only two of the three exponents were found to be independent. These results were confirmed numerically by Monte Carlo simulations [75]. (A sketch of how such flight distances can be sampled is given below.)
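Algebraically distributed flight distances of the form (4.36) can be generated by inverse transform sampling; a minimal sketch in d = 1 (the lower cutoff r ≥ 1 is an assumption):

```python
import random

def levy_distance(sigma):
    """Sample a flight distance r >= 1 with P(r) ~ r^(-1-sigma) in d = 1.
    Inverting the cumulative distribution F(r) = 1 - r^(-sigma) gives
    r = (1 - u)^(-1/sigma) for uniform u in [0, 1)."""
    u = random.random()
    return (1.0 - u) ** (-1.0 / sigma)
```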

4.3.3. Incubation times

As a second generalization one can introduce a similar long-range mechanism in the temporal direction. Such 'temporal' Lévy flights may be interpreted as incubation times Δt between catching and passing on the infection. As in the first case, these incubation times are assumed to be algebraically distributed as

P(Δt) ∼ Δt^−1−κ ,   (κ > 0)        (4.37)

where κ is a control exponent. However, unlike spatial Lévy flights, which take place equally distributed in all directions, such temporal Lévy flights have to be directed forward in time. Again it was possible to compute the exponents by a field-theoretic renormalization group calculation [76].

Recently we studied the mixed case of epidemic spreading by spatial Lévy flights combined with algebraically distributed incubation times [77]. In this case the corresponding field theory was found to yield two additional scaling relations, namely,



2η + (σ − d) ν⊥ − ν‖ = 0 ,        (4.38)
2η − d ν⊥ + (κ − 1) ν‖ = 0 .       (4.39)

Hence only one of the three exponents is independent. In particular, subtracting Eq. (4.39) from Eq. (4.38) gives σν⊥ = κν‖, so that the dynamical exponent locks onto the ratio z = ν‖/ν⊥ = σ/κ. A systematic numerical and field-theoretic study leads to a generic phase diagram in terms of the control exponents σ and κ, shown in Fig. 4.16. It includes three types of mean-field phases, a DP phase, two phases corresponding to purely spatial or purely temporal Lévy flights, and a novel fluctuation-dominated phase describing the mixed case, in which the critical exponents have been computed by a field-theoretic renormalization group calculation to one-loop order.

4.4. Surface growth and non-equilibrium wetting

Another interesting direction of non-equilibrium physics is the study of wetting far from equilibrium. Wetting phenomena occur in a large variety of experiments where a planar substrate is exposed to a gas phase. Usually the term 'wetting' refers to a situation where a bulk phase in contact with a substrate coexists with a layer of a different phase which is preferentially attracted to the surface of the substrate. By changing physical parameters such as temperature and chemical potential, the system may undergo a wetting transition from a non-wet phase, where the thickness of the layer stays finite, to a wet phase, where the layer becomes macroscopic.

In many experimental situations it is reasonable to assume that a stationary wetting layer is in thermal equilibrium. In fact, methods of equilibrium statistical mechanics turned out to be very successful in a large variety of theoretical and experimental studies [78]. Therefore the question arises whether models for wetting far from equilibrium may exhibit new physical phenomena which cannot be observed under equilibrium conditions.

Non-equilibrium wetting is usually modeled as a Kardar-Parisi-Zhang (KPZ) growth process [79] growing on top of a hard substrate. Theoretically such a system can be described by a KPZ equation in a potential [80, 81],

∂h(~x, t)/∂t = σ∇²h(~x, t) − ∂V(h(~x, t))/∂h(~x, t) + λ (∇h(~x, t))² + ξ(~x, t) ,        (4.40)

where ξ(~x, t) is a Gaussian noise. It is important to note that the nonlinear term λ(∇h(~x, t))² in this equation is a relevant perturbation of the underlying field theory, i.e., even if λ is very small, it will be amplified under renormalization group transformations, driving the system away from thermal equilibrium. In fact, as can be shown by constructing a closed loop, it is this term that breaks detailed balance.

Some time ago we introduced a simple solid-on-solid (SOS) model for non-equilibrium wetting in 1+1 dimensions [82, 83, 84]. The model is controlled by an adsorption rate q, a desorption rate p, and optionally by a special deposition rate q0 on sites at height


Figure 4.17.: Dynamical rules of the restricted solid-on-solid model for non-equilibrium wetting: random deposition (rate q, rate q0 at the substrate), evaporation from the middle of plateaus (rate p), and evaporation at the edges of terraces (rate 1). Neighboring heights are restricted to differ by at most one unit.

Figure 4.18.: Phase diagram of the solid-on-solid model for non-equilibrium wetting in the (p, q) plane, with a pinned and a moving phase. The bold black line represents the second-order phase transition line. For sufficiently small q0 the transition becomes first order and a region emerges (striped in the figure) where the pinned and the moving phase coexist.

zero (desorption at the edges takes place at rate 1, see Fig. 4.17). Setting q0 = q and varying the growth rate, the model exhibits a continuous wetting transition at a certain critical growth rate qc(p). This wetting transition is related to the unpinning of an interface from the substrate. Moreover, for q0 ≠ q the model can emulate a short-range interaction between the interface and the substrate [85, 86]. It was found that a sufficiently strong attractive interaction modifies the nature of the wetting transition and makes it first order. In addition, it has been demonstrated that there exists an extended region in the phase diagram where the pinned and the moving phase coexist, in the sense that the transition time from the pinned to the moving phase grows exponentially with the system size, so that the two phases become stable in the thermodynamic limit. This type of phase coexistence is in fact a new phenomenon that occurs only far away from equilibrium and should be experimentally observable.


5. Equilibrium critical phenomena and spin glasses

5.1. The Ising model

Definition

Equilibrium systems: As we have seen in the preceding chapter, stochastic systems in contact with external reservoirs approach an equilibrium state in which the entropy of the system plus environment is maximized. Moreover, we have shown that the maximization of the entropy is equivalent to finding the extremum of an appropriate thermodynamic potential V. The equilibrium state itself is characterized by a condition called "detailed balance", stating that all probability currents in the system between pairs of different configurations cancel each other.

Magnetization of Nickel (http://www.doitpoms.ac.uk)

In this chapter we discuss an important field of equilibrium statistical mechanics, namely, phase transitions and critical phenomena in stochastic lattice models at thermal equilibrium. Roughly speaking, the term "phase transition" stands for a sudden change of the macroscopic behavior when external parameters of the surrounding reservoir, e.g. the temperature or the pressure, are varied.

For simplicity let us begin with the simplest situation, namely, with systems exchanging energy with a heat bath. In this case the potential to be extremalized is

V = Hsys − βE   or, in textbook form:   F = E − TH .        (5.1)

As can be seen, the Helmholtz free energy contains two terms with different signs. One of them is the internal energy, the other one is essentially the entropy of the system. Hence for systems exchanging energy with their surroundings, internal energy and entropy compete with one another. If these two components favor different macroscopic situations, such a competition can lead to a phase transition when the temperature is varied.

The "harmonic oscillator" of systems with an equilibrium phase transition is the celebrated Ising model, named after Ernst Ising, who investigated this model for the first time in his PhD thesis.


Figure 5.1.: Two-dimensional Ising model in an external field in contact with a heat bath.

The Ising model is a lattice model of interacting classical spins si whose quantum nature is taken into account only insofar as they can be either "up" or "down", corresponding to the values si = ±1.

Like any model in equilibrium statistical mechanics, the Ising model is solely defined by its energy functional Es, which associates with each system configuration s ∈ Ωsys a well-defined energy Es. In the Ising model this energy functional is defined in such a way that aligned spins are favored. This introduces some kind of surface tension, which tends to reduce the surface between domains of equally oriented spins and therefore orders the system. On the other hand, entropy favors disorder, trying to orient the spins randomly. Obviously these two mechanisms compete with one another. As we will see in the following, this can lead to an order-disorder phase transition.

Definition of the Ising model: The Ising model is defined on a regular d-dimensional lattice of sites enumerated by some index i (see Fig. 5.1). The lattice structure enters the definition of the model by specifying which of the sites are nearest neighbors. In order to indicate that two sites i and j are nearest neighbors, we will use the notation <i, j>. For example, a sum running over pairs of neighboring sites will be denoted by ∑_{<i,j>}. With this notation the energy functional of the Ising model is given by

E_s = −J ∑_{<i,j>} s_i s_j ,        (5.2)

where J is the coupling constant. For J > 0 the energy is lowest when the spins are aligned. Furthermore, we may apply an additional external magnetic field h, extending the energy functional by a second term,

where J is the coupling constant. For J > 0 , the energy is least when the spins arealigned. Furthermore, we may apply an additional external magnetic field h, extendingthe energy functional by a second term

Es = −J ∑<i,j>

sisj − h ∑i

si . (5.3)

This expression holds for arbitrary lattices. In the special case of the one-dimensionallattice with periodic boundary conditions, where the sites are enumerated by i = 1, . . . , L,we have

Es = −JL

∑i=1

sisi+1 − hL

∑i=1

si . (5.4)


Note that in the case of a vanishing external magnetic field h = 0 the Ising model is symmetric under the replacement s_i → −s_i, i.e., the Ising model exhibits a Z2 symmetry under a global reversal of all spins. Clearly, this symmetry is broken if an external magnetic field h is applied.

Thermostatics of the Ising model: The Ising model is assumed to be in contact with a thermal reservoir at temperature T = 1/β. This is a realistic assumption, since in almost all experiments the spins, located for example on the surface of a substrate, are in thermal contact with the supporting three-dimensional solid-state body. Allowing this spin system to exchange energy with its environment, its probability distribution will be given by the Boltzmann weight

Ps =1Z

exp(−βEs) . (5.5)

The thermodynamic potential in the canonical ensemble, which is equivalent to theHelmholtz free energy, is (see Table ?? on page ??)

V = Hsys − βE = ln Z , (5.6)

where the partition sum

Z(β, h) = ∑s∈Ωsys

exp

(βJ ∑

<i,j>sisj + βh ∑

isi

)(5.7)

depends on the inverse temperature β = 1/T and on the external magnetic field h.Sometimes it is also convenient to define the thermodynamic potential per lattice site

v =VN

, (5.8)

where N is the total number of sites. As we will see below, this allows us to definespecific (i.e. volume-independent) properties of the system.

Before proceeding, let us define the most important observables of interest.

• Magnetization

The most important quantity is the magnetization, which plays the role of an orderparameter at the phase transition. We distinguish between the total configurationalmagnetization

Ms = ∑i

si .. (5.9)

the total mean magnetization

M(β, h) = 〈Ms〉 = ∑s∈Ωsys

Ps Ms , (5.10)

and the corresponding average magnetization per site

m(β, h) =1N

M(β, h) . (5.11)

Haye Hinrichsen — Complex Systems

Page 100: Complex system

94 Equilibrium critical phenomena and spin glasses

Because of Eq. (5.7) these averages can be expressed as partial derivatives withrespect to the external field:

M(β, h) =1

βZ∂Z(β, h)

∂h=

∂V(β, h)∂h

(5.12)

and

m(β, h) =1β

∂v(β, h)∂h

. (5.13)

Note that the configurational total magnetization Ms appears in the energy func-tional Es = −J ∑ij sisj − hMs, where it is multiplied by h, the external magneticfield. The two quantities, the internal order parameter MS and the external field h,are said to form a conjugate pair. This is very common in physics. Whenever wehave an observable O of interest, it is useful to define a conjugate field or cur-rent J and to add the product ±JO to the action or the energy functional. Then,the mean of O can be computed by differentiating with respect to J.

• Susceptibility

Susceptibility of Nickel (http://www.doitpoms.ac.uk)

Furthermore, we are interested inthe susceptibility, defined as the re-sponse of the magnetization persite to a variation of the externalfield

χ(β, h) =∂m(β, h)

∂h, (5.14)

which can also be expressed as

χ(β, h) =1β

∂2v(β, h)∂h2 . (5.15)

Usually the susceptibility increases as we approach the phase transition and fi-nally diverges at the critical point, as shown in the figure.

• Energy

Likewise, another quantity of interest is the configurational energy Es, the meanenergy

E(β, h) = 〈Es〉 = ∑s∈Ωsys

PsEs . (5.16)

and the corresponding average energy per site

ε(β, h) =1N

E(β, h) . (5.17)

Again, these averages may be expressed as partial derivatives of the potential

E(β, h) = −∂V(β, h)∂β

, ε(β, h) = −∂v(β, h)∂β

, (5.18)

Haye Hinrichsen — Complex Systems

Page 101: Complex system

5.1 The Ising model 95

• Heat capacity

Finally, according to Eq. (??), the total and the specific heat capacity are given by

C(β, h) =∂2V(β, h)

∂β2 , c(β, h) =∂2v(β, h)

∂β2 . (5.19)

Mean field theory

The Ising model is defined in such a way that each spin feels only the local magneticfield hi of its nearest neighbors and, if existent, the additional external magnetic field h.In fact, we can rewrite the energy functional as

Es = −∑i(Jhi + h) si , hi = ∑

j∈<i>sj . (5.20)

where the sum runs over all nearest neighbors j of site i.

The mean field limit is an approximation in which the local magnetic field is replacedby the average magnetic field

hi ≈ 2d〈s〉 (5.21)

caused by all spins, where d is the dimension of the system, giving

Es ≈ = −∑i

(2dJ〈s〉+ h

)︸ ︷︷ ︸

H

si . (5.22)

Apart from the indirect coupling via the global magnetic mean field, H, all spins arenow decoupled from each other and fluctuate independently. Therefore, each of themcan be seen as a single spin exchanging energy with the heat bath of temperature T.As such, each spin is oriented according to a probability given by the Boltzmann factor.That is, the probability distribution of a spin configuration si factorizes and is givenby

Ps =exp(−βEs)

Z=

1Z ∏

ie−βE(si) = ∏

ip(si) (5.23)

with

p(si) =e+βHsi

e−βH + e+βH. (5.24)

Therefore, the average spin magnetization 〈s〉 is given by

〈s〉 = p(i)−(1− p(i)

)=

e+βH − e−βH

e+βH + e−βH= tanh

(βH)

(5.25)

i.e.,〈s〉 = tanh

(2dJβ〈s〉+ βh

). (5.26)

Note that there is an interesting twist in this equation. On the one hand, we use theaverage magnetic field to determine the Boltzmann factors for the spin orientation, but

Haye Hinrichsen — Complex Systems

Page 102: Complex system

96 Equilibrium critical phenomena and spin glasses

Figure 5.2.: Average magnetization of the Ising model for a vanishing external magnetic fieldh = 0, as predicted by mean field theory. The two solutions of the mean field equa-tions are shown as a blue line (zero magnetization) and a red line (spontaneousmagnetization).

on the other hand, these Boltzmann factors in turn determine the average magneticfield. In other words, in order to compute the average magnetic field we already needto know the average magnetic field.

Assuming that the average magnetic field fluctuates only slowly compared to the in-dividual spins, this results into a self-consistent feedback loop, allowing us to interpretEq. (5.26) as an implicit equation. For a vanishing external magnetic field h = 0 thisequation has the obvious solution

〈x〉 = 0 . (h = 0) (5.27)

Furthermore, Eq. (5.26) has further nontrivial solutions which cannot be given in aclosed form. However, it is possible to plot them parametrically. To this end, it ismost convenient to introduce the new variables

Tc = 2dJ , z = βH (5.28)

such thatz = βTc〈s〉+ βh . (5.29)

Then in terms of the parameter z we can compute the average magnetization

〈s〉 = tanh z (5.30)

and the quotientTTc

=tanh z

z+

hJz

. (5.31)

This allows us to draw the magnetization as a function of the temperature quotientT/Tc as a parametric plot.

Spontaneous symmetry breaking: The result for vanishing external magnetic fieldh = 0 is shown in figure 5.1. In this figure the trivial solution 〈s〉 = 0 is indicated

Haye Hinrichsen — Complex Systems

Page 103: Complex system

5.1 The Ising model 97

Figure 5.3.: Average magnetization of Ising model for a small positive external magnetic fieldh > 0 within mean field theory. The dashed lines indicate the previous solution forh = 0.

as a horizontal blue line. Obviously, this solution is mapped onto itself under the Z2symmetry transformation si → −si. In addition, for T < Tc, there is another pair of so-lutions, marked by red lines. In this situation, the disordering influence of temperatureis so weak that the spins order themselves due to the positive feedback via the meanfield H. Due to the Z2-symmetry there are two symmetric solutions with positive andnegative magnetization. Actually, the system selects only one of them, meaning thatthe Z2-symmetry is spontaneously broken. The solution with vanishing magnetization(the blue line) still exists, but it becomes unstable for T < Tc , so that any fluctuationwith drive the system away from the blue line to one of the red branches. In physics,this phenomenon is known as spontaneous symmetry breaking.

The same phenomenon happens in ferromagnetic materials. If we increase the tem-perature of a ferromagnet , its magnetic field will decrease and finally vanish at theso-called Curie-temperature (e.g. 1043 Kelvin for iron). Here, the material exhibits aphase transition from the ferromagnetic to the paramagnetic phase. Cooling it downagain, the spins order spontaneously, magnetizing the material randomly in positive ornegative direction.

Influence of the external magnetic field: Let us now switch on the external magneticfield h. Obviously, the external magnetic field will break the Z2 invariance. This can beseen very nicely when plotting the corresponding magnetization curves (see Fig. 5.1),where the red lines are distorted and no longer symmetric to each other. Moreover, thesolution with zero magnetization does not exist anymore.

To understand this figure, let us assume that the system is magnetized in negativedirection. For sufficiently small temperatures, the external magnetic field positive di-rection is not yet strong enough to flip the global orientation of the magnetization.However, as temperature is increased, there will be a certain threshold (indicated bythe green arrow) where the system suddenly jumps to the upper branch. This occursbelow the aforementioned critical temperature Tc = 1/βc.

Haye Hinrichsen — Complex Systems

Page 104: Complex system

98 Equilibrium critical phenomena and spin glasses

Applicability of the mean field approximation: The mean field approximation relieson the assumption that each spin interacts effectively with all other spins of the system.Therefore, mean field theory ignores the notion of nearest neighbors and the latticestructure completely; the dimension of the lattice enters only as an effective number ofnearest neighbors, the so-called coordination number of the interaction.

Usually, the quality of the mean field approximation increases with the dimension ofthe system. Roughly speaking, a high-dimensional system facilitates diffusive mixing,bringing the system closer to the mean field limit. In fact, in the limit of infinitely manydimensions, every site interacts with infinitely many other sites, and therefore meanfield theory is expected to become exact.

As we will see below, there is actually a so-called upper critical dimension dc < ∞above which mean field theory applies1. In low-dimensional systems, however, the lo-cal interactions lead to correlations among neighboring sites, resulting in a different be-havior. Loosely speaking, the spins cannot be longer thought of as a structureless soup,instead they form stochastically correlated patterns which are relevant for the macro-scopic behavior, by studying low dimensional, in particular one-dimensional systems,we can learn a lot about the influence of such correlation effects.

Remember: Mean field theory ignores the lattice structure and lattice dimension, replacinglocal interactions by averaged global ones.

5.2. Ising phase transition

Scale invariance and critical behavior

As we have seen in Fig. 5.1, the Ising model with a zero external field described withinmean field theory is spontaneously magnetized, provided that the temperature is suf-ficiently low. This spontaneous magnetization decreases continuously with increasingtemperature and vanishes at a well-defined critical temperature Tc. Within mean fieldtheory, this critical temperature is given by

TMFc = 2dJ . (5.32)

Such a transition, where the order parameter (the magnetization) vanishes continu-ously at the critical point, is referred to as a continuousphase transition!continuousr second-order phase transition.

Remark: In the theory of critical phenomena one distinguishes between various types ofphase transitions. In experimental physics, the most frequent ones are discontinuous phasetransitions, where the order parameter jumps abruptly at the critical point. Continuousphase transitions are less frequent and appear usually at the ending points of discontinuousphase transition lines. In theoretical physics continuous phase transitions are particularlyinteresting because they exhibit universal properties, as will be discussed below. Contrarily,discontinuous phase transitions are usually non-universal.

1For dc < d < ∞ mean field theory applies in so far as it predicts the correct critical exponents, while theproportionality factors in the power laws may be different. Right at the critical dimension the powerlaws are superseded by logarithmic corrections.

Haye Hinrichsen — Complex Systems

Page 105: Complex system

5.2 Ising phase transition 99

Scale invariance and power laws: In physics, continuous phase transitions are usuallycharacterized by scale invariance and long-range correlations. Scale invariance standsfor a phenomenological approach based on the assumption that the model at the criticalpoint does not exhibit any intrinsic length scale apart from the lattice spacing, providedthat the system is infinite. This postulate imposes constraints on the possible functionalform of physical observables. For example, it rules out that the correlation functionhas an exponential form C(r) = er/r0 because an exponential function would require acertain reference scale r0. Likewise, any other functional dependence represented by apower series is forbidden because it would require a reference scale as well. The onlyfunctional dependence, which is scale invariant by itself, is a power law dependencesuch as C(r) = r−α.

In fact, power laws are ubiquitous in the theory of continuous phase transitions andcritical phenomena. However, usually such power laws do not extend over the fullrange, instead they hold only asymptotically, e.g. in the limit r → ∞. The asymptoticlimit in which the power law becomes exact is referred to as the scaling regime. Far awayfrom the asymptotic scaling regime we expect corrections to scaling caused by the un-derlying lattice structure. Moreover, a power law involves an unknown proportionalityfactor which depends on the specific realization of the model. Ignoring this prefactorwe indicate an asymptotic power law by writing e.g.

C(r) ∼ r−α . (5.33)

Here, the symbol ’∼’ has the meaning of being “asymptotic proportional to”.

Scaling properties of the Ising model: In the context of the Ising model, the criticalbehavior is characterized by various power laws with associated critical exponents,which for historical reasons have been termed as α, β, γ, δ, . . .. As we will see later,only two of them are independent while the other ones are related by so-called scalingrelations.

For example, we may study how the average magnetization per site for a vanishingexternal field h = 0 scales with the temperature distance from the critical point. Thismeans that we are looking for an exponent β such that2

m(β, 0) ∼ (β− βc)β ∼ (Tc − T)β . (5.34)

Further interesting quantities are the susceptibility χ and the specific heat c in the fer-romagnetic phase near the critical point for vanishing external field. These quantitiesscale algebraically as

χ(β, 0) ∼ (β− βc)−γ ∼ (Tc − T)−γ (5.35)

c(β, 0) ∼ (β− βc)−α ∼ (Tc − T)−α . (5.36)

with negative exponents, indicating that both quantities diverge at the critical point.

Finally, an important quantity of interest is the connected part of the spin-spin correla-tion function

C(i, j, β, h) := 〈sisj〉 − 〈si〉〈sj〉 . (5.37)

2The critical exponent β must not be confused with the temperature parameter β = 1/T. Therefore, theexponent is marked by an additional tilde. In the literature the tilde is usually missing.

Haye Hinrichsen — Complex Systems

Page 106: Complex system

100 Equilibrium critical phenomena and spin glasses

In a translationally invariant Ising model we expect this correlator to depend only onthe distance r = |~ri −~rj| between sites i and j:

C(i, j, β, h) = C(r, β, h) (5.38)

In the ferromagnetic ordered phase, this correlation function is usually found to decayin the asymptotic limit r → ∞ exponentially as

C(r, β, 0) ∼ er/ξ(β,0) , (5.39)

where ξ(β, h) is the prevailing correlation length in the system. Aproaching the criti-cal point this correlation length increases and eventually diverges algebraically at thecritical point as

ξ(β, 0) ∼ (β− βc)−ν ∼ (Tc − T)−ν , (5.40)

where ν is the critical exponent associated with the correlation length.

Precisely at the transition point the correlation length ξ is infinite, indicating scaleinvariance. At this point the correlation function does no longer decay exponentially inthe asymptotic regime r → ∞, instead it displays an algebraic behavior

C(r, βc, 0) ∼ r2−d−η , (5.41)

where d is the dimension of the system and η is another critical exponent.

Finally, an interesting situation emerges at the critical point T = Tc , when a smallexternal field is applied. Here we expect that the system responds to the external fieldh > 0 with the magnetization

m(βc, h) ∼ h1/δ , (5.42)

where δ is yet another critical exponent. Since the susceptibility diverges at the criticalpoint, the infinitesimal response to an external field is infinite, implying that δ > 1.

Scaling relations and universality: So far we have introduced six different criticalexponents, namely, α, β, γ, δ, η, and ν. As already mentioned, only two of them areindependent while the other ones are related by four simple relations, namely threeordinary scaling relations

Rushbrooke: α + 2β + γ = 2 (5.43)Widom: γ = β(δ− 1) (5.44)

Fisher: γ = (2− η)ν (5.45)

and the so-called hyperscaling relation

Josephson: 2− α = νd . (5.46)

The difference is the following: hyperscaling relations depend explicitly on the dimen-sion d, while ordinary scaling relations do not.

As we will see, the amazing finding is that the critical exponent α, β, γ, δ, η, and νdepend only on the dimension of the system and the symmetry of the interaction, butnot on the lattice structure and the specific realization of the Ising model. Rather theytake on the same values in a large variety of models which constitute the so-called Isinguniversality class.

Haye Hinrichsen — Complex Systems

Page 107: Complex system

5.2 Ising phase transition 101

d α β γ δ η ν1 – – – – – –2 0 1/8 7/4 15 1/4 13 0.110(1) 0.3265(3) 1.27372(5) 4.789(2) 0.0364(5) 0.6301(4)≥ 4 0 1/2 1 3 0 1/2

Table 5.1.: Critical exponents of Ising model in various dimensions.

Remember: Continuous phase transitions are usually characterized by asymptotic powerlaws (=algebraic decay). The corresponding exponents are expected to be universal, i.e.,they coincide in all models with phase transitions belonging to the same class.

Critical exponents within mean field theory: This is part of our tutorial.

To compute the critical exponent β of the Ising model within mean field theory, let usTaylor-expand the parametric representation of the red curve:

〈s〉 = tanh z = z− z3

3+O(z5) (5.47)

TTc

=tanh z

z= 1− z2

3+O(z4) (5.48)

Solving the second equation for z and inserting the resultant that the first one we obtain

〈s〉 =√

3Tc

(Tc − T)1/2 + . . . . (5.49)

Hence the critical exponent (within mean field theory) is given by β = 1/2.

Ising critical exponents – General picture: Studying the Ising model in variousdimensions, the following picture emerges:

• In one dimension, the Ising model is always in the paramagnetic phase, i.e., aphase transition is absent. Therefore, we cannot define critical exponents. How-ever, in some situations we may interpret the behavior of the one-dimensionalIsing model as having a phase transition at zero temperature Tc = 0, correspond-ing to βc = ∞.

Remark: To understand the physical reason why the Ising model in one dimension doesnot exhibit a phase transition, let us recall the physical mechanism behind the transition.Spontaneous symmetry breaking means that the system magnetizes itself in a particulardirection without any external field. It does so because such a state is entropically favorable.This requires that the spontaneously magnetized state is stable against fluctuations.

For example, let us assume that the system is magnetized in the positive direction. Becauseof the interaction with the heat bath, the spins will nevertheless fluctuate, forming littledroplets of spins in opposite direction. If such a droplet was able to grow it could destroythe magnetized state. Therefore, a robust mechanism is needed, which eliminates minorityislands of down spins. In the Ising model this machanism is due to an effective surfacetension of domain walls. In fact, looking at the energy functional, we immediately see thatneighboring spins oriented in opposite direction cost energy. In other words, domain walls

Haye Hinrichsen — Complex Systems

Page 108: Complex system

102 Equilibrium critical phenomena and spin glasses

between differently oriented domains are energetically punished. Consequently, the systemtries to reduce the total length of domain walls, which effectively introduces some kind ofsurface tension.

Because of the surface tension, the droplets acquire a roundish form. In addition, smalldroplets tend to shrink. It is this mechanism which stabilizes the spontaneously magnetizedstate. Obviously, this mechanism can work only in space dimensions larger or equal than2, the simple reason being that in one dimension domain walls are just points which cannotshrink. This explains why the Ising transition does not exist in one-dimensional systems.

According to a famous argument by Landau, this applies to all equilibrium critical phenom-ena, , i.e., equilibrium phase transitions are impossible in one-dimensional systems.

• In two dimensions, the Ising model exhibits a phase transition and the corre-sponding critical exponents are given by simple rational values. In the theoryof critical phenomena, rational values are an exception rather than the rule. Thedeep reason behind these rational values is the existence of the powerful symme-try in the two-dimensional case, the so-called conformal symmetry.

• In three dimensions, the Ising model still exhibits a nontrivial phase transition.An exact solution of this case is still unknown. The critical exponents can only bedetermined numerically and by means of field theoretic methods, suggesting thatthey are given by irrational values. If the reader could explain the values of theseexponents this would guarantee her/him a professorship.

• In four dimensions and above, the interaction becomes so strongly mixed thatthe mean field approximation becomes valid. Phenomenologically, there existsusually a well-defined dimension, called the upper critical dimension dc, at which agiven model crosses over to mean field behavior. For the Ising model, the uppercritical dimension is 4. For d ≥ dc , the mean field exponents are exact. Preciselyat the critical dimension d = dc, the mean field power laws are superseded bylogarithmic corrections.

5.3. Numerical simulation of the Ising model

Recall that a stochastic Markov process is defined by its configuration space Ωsys anda set of rates ws→s′ . Contrarily, equilibrium models are solely defined in terms of theirconfiguration space Ωsys and their energy functional Es. As a great advantage of equi-librium models, the stationary distribution Ps = 1

Z e−βEs is (almost) for free. However,the definition of equilibrium models does not provide any information about the relax-ation modes. In other words, the energies Es tell us how the stationary state looks likebut they don’t tell us how the stationary state is actually reached.

On the other hand, if we want to simulate the model on a computer, meaning that thecomputer just mimics the Markovian random dynamics, we have to know the transitionrates. This means that we have to invent rates in such a way that the system relaxes intoa stationary state which is just exactly given by the Boltzmann weights. It turns out thatthis procedure is not unique.

How should the rates be designed? in principle they can be chosen freely, provided

Haye Hinrichsen — Complex Systems

Page 109: Complex system

5.3 Numerical simulation of the Ising model 103

c c cT = T T = 1.05 TT=0.98 T

Figure 5.4.: Typical screenshots of the two-dimensional Ising model in the equilibrium state. Left: Or-dered phase below the critical point. Middle: Critical state at the critical point. Right: Su-percritical disordered state.

Figure 5.5.: Illustration of the meaning of detailed balance. The figure shows a system with three statesA,B,C while the blue arrows symbolize transitions at rate 1. The left panel shows a situationwhere the probability currents cancel one another in the stationary state, obeying detailedbalance. Clearly, the stationary state given by pA = pB = pC = 1/3. The system shown onthe right hand side possesses the same stationary state. However, there is a non-vanishingcyclic probability current, meaning that the system is still out of equilibrium.

that they meet the following two criteria:

• The network of transitions has to be ergodic, that is, every configuration can bereached from any other configuration by a sequence of dynamic transitions.

• In the stationary state the dynamics should by the condition of detailed balance.

As discussed in section 2.2 on page 40 the condition of detailed balance states that theprobability currents between pairs of configurations vanishes in both directions:

Js→s′ = Js′→s ⇔ Psws→s′ = Ps′ws′→s ∀s, s′ ∈ Ωsys. (5.50)

Knowing that Ps = 1Z e−βEs this means that the rates have to be chosen in such a way

that obey the condition

ws→s′

ws′→s=

Ps′

Ps= exp

(β(Es − Es′)

). (5.51)

Thus, if we choose any rate, the condition of detailed balance determines the corre-sponding rate in opposite direction. In other words, if we choose the matrix elementsin the upper triangle of the Liouvillian, detailed balance gives us the lower triangle,thereby halving the degrees of freedom.

This leaves us with enormous number of (|Ωsys|2− |Ωsys|)/2 undetermined the ratesof which sufficiently many have to be nonzero in order to make sure that the resulting

Haye Hinrichsen — Complex Systems

Page 110: Complex system

104 Equilibrium critical phenomena and spin glasses

transition network is ergodic. This gives rise to a virtually infinite zoo of legitimatedynamical procedures which all relax into the same equilibrium state. In practice, how-ever, one introduces further conditions in order to reach one of the following goals:

• We may either seek for a dynamics, which mimics what is happening in nature asfaithfully as possible. The most important examples are local spin flip algorithms.

• Alternatively, we may be looking for a dynamics, which is particularly efficientwhen implemented on a computer. as will be discussed below, this includes theso-called cluster algorithms.

Local spin flip dynamics

The simplest class of dynamical procedures for the Ising model involve only singlespin flips. As a starting point, note that the energy functional authorizing model can bewritten as

Es = −∑i

Hi si , Hi = h + J ∑j∈<i>

sj . (5.52)

Considering a single spin as a subsystem, its probability to be oriented in positive di-rection is given by the Boltzmann weight

pi = pi(↑) =eβHi

eβHi + e−βHi. (5.53)

Heat bath dynamics: This suggests to introduce the following dynamics. For eachupdate a site i is randomly selected and the probability pi is computed according to theformula given above. Then the selected spin is oriented according to this probability,i.e., we generate a uniformly distributed random number z between zero and one andset the new value at the selected site to

snewi := sign(pi − z) . (5.54)

Obviously, this dynamic a procedure doesn’t care about the previous orientation ofthe updated spin. As can be verified easily, this procedure amounts to introduce thefollowing transition rates for local spin flips:

w(↓→↑) ∝ eβHi (5.55)w(↑→↓) ∝ e−βHi (5.56)

The corresponding a ratio of the rates is

w(↓→↑)w(↑→↓) = e2βHi . (5.57)

In order to prove that these dynamical rules obey detailed balance. Let us compare theenergy of the configurations before and after a spin flip. For example, if we considerthe transition ↓→↑, the resulting energy change is given by

∆E = Enew − Eold = E↑ − E↓ = −2Hi . (5.58)

Haye Hinrichsen — Complex Systems

Page 111: Complex system

5.3 Numerical simulation of the Ising model 105

Detailed balance is established if and only if

w(↓→↑)w(↑→↓) =

Pnew

Pold=

e−βEnew

e−βEold= e−β∆E = e+2βHi , (5.59)

reproducing Eq. (5.57).

Glauber dynamics: Glauber dynamics differs from heat bath dynamics in so far as thelocal spins are not oriented but flipped with a certain probability. . More specifically,the following steps are carried out:

• A site i is randomly selected.

• The probabilities pi are computed by Eq. (5.53).

• If si = −1 the spin is flipped with probability pi. Otherwise, if si = +1, the spinis flipped with probability 1− pi.

As can be verified easily, this amounts to the rates

w(↓→↑) ∝ eβHi (5.60)w(↑→↓) ∝ e−βHi , (5.61)

hence Glauber dynamics and heat bath dynamics are statistically equivalent.

Metropolis dynamics: The Metropolis algorithm differs from Glauber dynamics inthat one of the moves is carried out with certainty and not with a finite probability. TheMetropolis algorithm consists of the following steps.

• Choose a site i randomly.

• Compute the energy gain ∆E = Enew − Eold = 2si Hi that would result from car-rying out a spin flip at site i (si is the orientation beforebut the update) and set

• If ∆E < 0, i.e., if the system goes into an energetically more favorable state, thespin at site i is flipped with certainty (probability 1).Otherwise, if ∆E ≥ 0 the spin is flipped only with probability p.

This means that one accepts the spin flip with the probability

p := min(1, e−β∆E). (5.62)

and dismisses it otherwise.

In the Metropolis algorithm the rates for local spin flips and no longer independent,instead they depend on the actual contract duration of the immediate environment, andcoded in the sign of the local field Hi. If Hi ≥ 0 we have

w(↓→↑) ∝ 1 (5.63)w(↑→↓) ∝ e−2βHi , (5.64)

Haye Hinrichsen — Complex Systems

Page 112: Complex system

106 Equilibrium critical phenomena and spin glasses

while for Hi < 0 the rates are given by

w(↓→↑) ∝ e2βHi (5.65)w(↑→↓) ∝ 1 , (5.66)

In both cases, the ratio of the rates is again compatible with Eq. (5.57), establishingdetailed balance.

Summary: Local spin flip dynamics for the Ising modelChoose a random site i and let pi = eβHi /(eβHi + e−βHi ). Furthermore, let z ∈ [0, 1] be arandom number. Then perform an update as follows:

• Standard heat bath dynamics:

snewi := sign(pi − z)

• Glauber dynamics:

snewi :=

+sign(pi − z) if sold

i = +1−sign(1− pi − z) if sold

i = −1

• Metropolis dynamics:

snewi :=

+sign(p+i − z) if sold

i = +1−sign(p−i − z) if sold

i = −1

where p±i = min(1, e∓2βHi ).

All the dynamical procedures listed above relax into the same equilibrium state of the Isingmodel. In the equilibrium stationary state the rates obey detailed balance.

5.4. Continuum limit of the Ising model

So far we have introduced and quantitatively understood the Ising model on a d-dimensional lattice. Is it possible to devise a theory for the Ising model in terms ofcontinuous degrees of freedom?

By “continuous” we mean that any discrete element of the Ising model is replaced bya continuous option. More specifically, we would like to replace the lattice positions iby a continuous vector~r ∈ Rd. Moreover, we would like to replace the discrete spinssi = ±1 by a continuous field φ(~r) ∈ R.

Before introducing continuous coordinates let us consider the discretization of thespin itself. Replacing the binary spin variable si = ±1 by a continuous degree of free-dom φ we have to make sure that the continuous variable φ favors two Z2-symmetricvalues. This is most easily achieved if we introduce a symmetric double-well potential

V(φ) = φ2 + λ(φ2 − 1)2 = λφ4 + (1− 2λ)φ2 + 1 (5.67)

Next, let us write down the “action” for this continuous field on the lattice. To this end,let us denote by~ri the position of the lattice sites. Then, the energy functional reads

Haye Hinrichsen — Complex Systems

Page 113: Complex system

5.5 Spin glasses 107

E[φ] = ∑i

[(−2κ ∑

~uφ(~ri + ~u)φ(~ri)

)+ φ2(~ri) + λ(φ2(~ri)− 1)2

], (5.68)

where the second sum runs over all displacements pointing to the nearest neighborsof~ri.

The continuum limit of this expression is straightforward. We know that on a squarelattice the Laplacian has the approximate discrete representation

∆φ(~r) ≈(∑~u

φ(~r + ~u))− 2dφ(~r) , (5.69)

where d is the dimension of the lattice. This allows us to express the energy functionalas an integral

E[φ] =∫

ddrL[φ](~r) (5.70)

with the Lagrange density

L[φ] = −2κφ∆φ + (1− 4κd)φ2 + λ(φ2 − 1)2 (5.71)

Assuming that the field vanishes asymptotically at infinity, we can partially integratethis expression, obtaining the Lagrange density

L[φ] = +2κ(∇φ)(∇φ) + (1− 4κd)φ2 + λ(φ2 − 1)2 (5.72)

Usually one introduces new constants, rewriting the Lagrangian as

L[φ] =12(∇φ)(∇φ)− m

2φ2 +

g4!

φ4 . (5.73)

The partition function is then given as a functional integral over the Boltzmann factorof the energy functional:

Z =∫Dφ exp

[−β

(12(∇φ)(∇φ)− m

2φ2 +

g4!

φ4)]

(5.74)

Here, the first integral has to be read as the “sum over all possible configurations ofthe continuous field φ.” This partition sum is known as φ4 field theory, the simplestnontrivial field theory with so-called loop corrections.

5.5. Spin glasses

The Ising model on a square lattice is highly homogeneous in various respects. On theone hand, the underlying lattice is regular, without any distortions and defects. Onthe other hand, the nearest neighbors are mutually coupled with the same intensityeverywhere, i.e., the coupling constant J space does not depend on the position. Exper-imentally realized ferromagnets fulfill these conditions approximately.

The extreme opposite of such a highly regular ferromagnet is a so-called spin glass. Aspin glass is a strongly disordered magnet. This disorder could be caused by frustrated

Haye Hinrichsen — Complex Systems

Page 114: Complex system

108 Equilibrium critical phenomena and spin glasses

interactions or by stochastic positions of the spins on the lattice. Another possibilityis that ferromagnetic and anti-ferromagnetic bonds are randomly distributed on thelattice. The term “glass” comes from an analogy with the positional disorder in con-ventional glasses.

A conventional magnet has a high-temperature paramagnetic phase, whether lo-cal magnetization 〈si〉 at site i vanishes, and a low-temperature ordered phase, where〈si〉 > 0. A spin glass has a third phase in between, termed spin glass phase, where thetotal magnetization

M =1N

N

∑i=1〈si〉 = 0 (5.75)

still vanishes while the so-called Edwards-Anderson order parameter

q =1N

N

∑i=1〈si〉2 6= 0 . (5.76)

Illustratively stated the spins a locally magnetized in random orientations in such away that the macroscopic field is still zero.

The hallmark of a spin glass is a very complex landscape of the free energy. For thisreason a spin glass does not easily find its equilibrium state, instead it is frequentlycaptured in intermediate metastable states. Consequently the relaxation is extremelyslow.

Edwards-Anderson model

The simplest spin glass model is the Edwards-Anderson model. In this model, we havespins arranged on a regular d-dimensional lattice with nearest neighbor interactions inthe same way as in the Ising model. However, in the present case, each interacting pairof spins has its individual (time independent) coupling constant Jij:

Es = − ∑<ij>

Jijsisj . (5.77)

The coupling constants may take any value, even negative ones. A negative value of Jijdenotes an anti-ferromagnetic type of interaction between the spins while the positivevalue denotes a ferromagnetic one. It is important to note that the disordering thecoupling constants is quenched, i.e., time-independent.

To simplify the notation, we will denote the set of all coupling constants Jij by thebold letter J:

J := Jij (5.78)

Like the Ising model, the spin glass is assumed to be in equilibrium with a thermalheat bath. As such, it maximizes the potential V = ln Z which now depends on thetemperature as well as on the set of frozen coupling constants

V(

β, J)= ln Z

(β, J)

, (5.79)

Haye Hinrichsen — Complex Systems

Page 115: Complex system

5.5 Spin glasses 109

where

Z(

β, J)= ∑

s∈Ωsysexp

(+β ∑

<ij>Jijsisj

)(5.80)

is the usual partition sum. From that we can compute various physical quantities ofinterest, e.g., the total heat capacity

C(

β, J)=

∂2V(

β, J)

∂β2 . (5.81)

However, it is important to note that this heat capacity is only valid for one particularchoice of the coupling constants. In fact, if these coupling constants are chosen ran-domly, what is needed is a second average over the Jij, denoted as 〈. . .〉J, giving

C(β) =⟨

C(

β, J)⟩

J(5.82)

Of course, when averaging and we have to specify certain distribution over which theaverage is carried out. Here, one usually assumes that the coupling constants are mu-tually uncorrelated and distributed randomly according to a normal distribution

P(Jij) =1√

2π J2exp

(−

J2ij

2J2

), (5.83)

where J controls the variance of the distribution. Another common choice is the bi-modal distribution

P(Jij) =12

δ(Jij − J0) +12

δ(Jij + J) . (5.84)

In both cases, the average 〈Jij〉 = 0 vanishes while the variance is given by 〈J2ij〉 = J2.

Remark: Tthis is a very subtle point for the understanding of spin glasses: there are actuallytwo averages to be carried out. The first one is the average over the fluctuating spins, asdescribed by the Boltzmann weights and the partition sum. This average is carried outbefore the partition sum is evaluated. The second average is carried out over the randomlyquenched coupling constants Jij. This average is performed after evaluation of the partitionsum.

For computing the average 〈. . .〉J we have to integrate over all coupling constants, i.e.we have to carry out as many integrals as there are coupling in the system. For anarbitrary quantity X(J) this average may be written as

⟨X(J)

⟩J

:=

(∏<ij>

∫ +∞

−∞dJij P(Jij)

)X(Jij

)(5.85)

for which we will use the compact notation⟨X(J)

⟩J

:=∫DJ X(J) (5.86)

Haye Hinrichsen — Complex Systems

Page 116: Complex system

110 Equilibrium critical phenomena and spin glasses

Replica trick

The main difficulty in evaluating the second average over the randomly quenched cou-pling constants lies in the circumstance that the physical quantities of interest are usu-ally given in terms of partial derivatives of the logarithm of Z. For example, the totalheat capacity is given by

C(β) =⟨∂2lnZ

(β, J)

∂β2

⟩J

(5.87)

Since the logarithm is highly nonlinear, it is impossible to commute the average with alogarithm, i.e. ⟨∂2lnZ

(β, J)

∂β2

⟩J6=

∂2ln⟨

Z(

β, J)⟩

J

∂β2 (5.88)

Example: Let us, for example, consider the average 〈 f (E)〉, where f is some function. If f islinear it commutes with the process of averaging:⟨

f (E)⟩= ∑

s∈Ωsys

ps f (Es) = f(

∑s∈Ωsys

psEs

)= f

(〈E〉)

(5.89)

This is because the arithmetic average itself is linear. Obviously, this does not work if thefunction is nonlinear.

So the main problem with spin glasses is to get rid of the nonlinear logarithm in frontof the partition sum. If you try to evaluate the right-hand side of Eq. (5.88) you will seethat this is a highly nontrivial problem.

In fact, this problem can be solved with a very elegant mathematical technique, calledreplica trick. This mathematical technique is based on the formula

ln x = limn→0

xn − 1n

. (5.90)

Proof: To prove this formula let us take the usual representation of the exponential function

ey = limn→∞

(1 +

yn

)n.

Setting y = ln x this turns into

x = limn→∞

(1 +

ln xn

)n

or

limn→∞

[x1/n −

(1 +

ln xn

)]= 0.

Hence

ln x = limn→∞

x1/n − 11/n

= limn→0

xn − 1n

.

Alternatively, this relationship can be proven using l’Hospital’s rule.

Applying this formula to the partition sum

ln Z = limn→0

Zn − 1n

(5.91)

we can formally express quantities such as the total heat capacity in Eq. (5.93) by

C(β) =⟨∂2lnZ

(β, J)

∂β2

⟩J= lim

n→0

⟨∂2Zn(β, J)

∂β2

⟩J

(5.92)

Haye Hinrichsen — Complex Systems

Page 117: Complex system

5.5 Spin glasses 111

because the constant contribution drops out upon differentiation. Assuming that thederivative commutes with the average we get

C(β) =∂2

∂β2 limn→0

⟨Zn(β, J

)⟩J

(5.93)

Therefore, we can express the heat capacity in terms of the averages 〈Zn〉. These aver-ages are still nonlinear in the partition sum, but now they attain a new interpretation.The main observation is the following: if we take n independent copies of the samesystem, so-called replicas, then the partition sum of the composed system is obviouslygiven by Zn. This means that we simply have to compute the heat capacity of n uncor-related copies of the same system and then to take the limit n→ 0.

The good news is that this computation is in principle feasible because it boils downto calculating integral moments of a Gaussian distribution. However, the main ob-stacle with this method is that the number of copies n can only take integer valuesn = 1, 2, 3, . . ., while the formula given above requires to perform a continuous limitn → 0. This is a serious problem, but the magic part of the replica method, at least forphysicists, is to ignore this problem.

What we actually do is to compute the quantities of interest for integer values, thento guess the general formula which holds for any value of n, and finally to carry outthe formal limit n → 0 as if the guessed formula was valid for any n ∈ R+. It is, ofcourse, by no means clear whether such an approach is mathematically consistent, butphysicists are already happy if it yields a reasonable result and leave this problem tothe mathematicians.

Computing the spin glass partition sum using the replica method

As we are going to see in the subsection, the replica trick unfolds its power only incombination with the saddle point method and another assumption called replica sym-metry. To explain this tedious calculation, we proceed step-by-step. The reader shouldnot be intimidated by the complexity of the lengthy formulas, instead it is important tosee the essential steps of the derivation.

According to Eq. (5.93) we have to compute the two-fold average 〈Zn〉J. To this end,we first compute the partition sum Zn of n replicas of the system enumerated by a =1, . . . , n with the same given set of coupling constants J in all copies. Since the energy ofn replicas is simply the sum over all individual energies, this partition sum is given by

Zn =

n

∏a=1

∑s(a)∈Ωsys

a

︸ ︷︷ ︸configuration sum

exp

(−β

n

∑a=1

Es(a)

). (5.94)

Denoting the set of all configurations by

s = s(1), . . . , s(1) ∈ Ωsysn . (5.95)

we may shortly writeZn = ∑

se−β ∑a E

s(a) (5.96)

Haye Hinrichsen — Complex Systems

Page 118: Complex system

112 Equilibrium critical phenomena and spin glasses

Having computed Zn, the partition sum of the n copies is averaged over the Gaussiandistribution P(Jij) given in Eq. (5.83). Written explicitly this average is given by

⟨Zn(

β, J)⟩

J=

(∏<ij>

∫ +∞

−∞dJij P(Jij)

)︸ ︷︷ ︸weighted average over all Jij

n

∏a=1

∑s(a)∈Ωsys

a

︸ ︷︷ ︸configuration sum

exp

(−β

n

∑a=1

Es(a)

)(5.97)

or, using the compact notations in Eqs. (5.86) and (5.95), by

〈Zn〉 =∫DJ ∑

se−β ∑a E

s(a) . (5.98)

Since both the average over the coupling constants and the sum of all configurationsconsists only of summations and integrations, we can (disregarding possible mathe-matical subtleties) exchange the configurational sum and the average:

〈Zn〉J = ∑s

∫DJ e−β ∑a E

s(a) . (5.99)

Since the product in front of the integral runs over all pairs of coupled sites, we do notneed a separate sum over coupled sites in the argument of exponential, rather we cansimply rewrite the expression as

⟨Zn(β, J

)⟩J

= ∑s

(∏<ij>

∫ +∞

−∞dJij P(Jij)

)︸ ︷︷ ︸

average over all Jij

exp

n

∑a=1

∑<ij>

Jijs(a)i s(a)

j

)(5.100)

= ∑s

(∏<ij>

∫ +∞

−∞dJij P(Jij)

)︸ ︷︷ ︸

average over all Jij

∏<ij>

exp

n

∑a=1

Jijs(a)i s(a)

j

)(5.101)

= ∑s

∏<ij>

∫ +∞

−∞dJij P(Jij)︸ ︷︷ ︸

average over all Jij

exp

(Jij β

n

∑a=1

s(a)i s(a)

j︸ ︷︷ ︸=λ

)(5.102)

Now the coupling constants Jij stands in front of the sum in the argument of the expo-nential function, allowing us to carry out the integration over the Gaussian distributionin Eq. (5.83). Using the well-known formula∫ +∞

−∞dJij P(Jij) eJijλ = exp

(J2λ2

2

)(5.103)

we arrive at the expression

⟨Zn(β, J

)⟩J= ∑

sexp

(β2 J2

2 ∑<ij>

n

∑a,b=1

s(a)i s(a)

j s(b)i s(b)j

)(5.104)

which is no quartic in the spin variables. Experience tells us that exponentials withquartic arguments are not easy to integrate.

Haye Hinrichsen — Complex Systems

Page 119: Complex system

5.5 Spin glasses 113

Mean field approximation: Sherrington-Kirkpatrick model: In order to proceed, weconsider the mean field limit of the Edwards-Anderson model, which is known as theSherrington-Kirkpatrick model in the literature. The idea is that not only nearest neigh-bors, but all possible pairs of sites are mutually coupled. This means that we can re-place the sum ∑<ij> by a sum ∑i ∑j with independently running indices. However, thismeans that that the number of couplings scales with N2 instead of N, where N is thetotal number of sites. Consequently, the integrated coupling strength in such a meanfield model would be much larger than in the original one. In order to compensate thisdifferent type of scaling, the coupling constants have to be rescaled as

Jij → Jij/√

N . (5.105)

As a result, we obtain⟨Zn(

β, J)⟩

J= ∑

sexp

(β2 J2

2N

n

∑a,b=1

[∑

is(a)

i s(b)i

][∑

js(a)

j s(b)j

]). (5.106)

Let us now reorder the sum over the replica indices by taking out the diagonal partand letting the sum run over a < b, which yields an additional constant contribution infront of the sum:⟨

Zn(

β, J)⟩

J= ∑

sexp

(β2 J2Nn

2+

β2 J2

2N

n

∑a<b

[∑

is(a)

i s(b)i

][∑

js(a)

j s(b)j

])(5.107)

This constant contribution can be pulled out as a prefactor in front of the configurationalsum: ⟨

Zn(

β, J)⟩

J= e

β2 J2 Nn2 ∑

sexp

(β2 J2

2N

n

∑a<b

[∑

is(a)

i s(b)i

][∑

js(a)

j s(b)j

])(5.108)

Obviously, this expression is still quartic in the spin variables, and the goal would beto reduce it to the quadratic expression. To achieve this goal, one uses the so-calledHubbard-Stratanovich identity, which is nothing but the linearly shifted Gaussian inte-gration:

eλr2/2 =

)1/2 ∫ +∞

−∞dq e−λq2/2+λrq . (5.109)

Replacing the scalar variables r and q by d-dimensional vectors r and q, the Hubbard-Stratanovich identity becomes

exp(

λ

2r · r)

=

)d/2 d

∏µ=1

∫ +∞

−∞dqµ exp

(−λ

2q · q + λr · q

). (5.110)

Now let us identify these variables with those of our spin glass problem. Setting

λ = Nβ2 J2 , d =n(n− 1)

2, rµ = rab =

1N

[∑

js(a)

j s(b)j

](5.111)

and introducing the abbreviation

∫Dq :=

(Nβ2 J2

) n(n−1)4 n

∏a<b

∫ +∞

−∞dqab (5.112)

Haye Hinrichsen — Complex Systems

Page 120: Complex system

114 Equilibrium critical phenomena and spin glasses

the exponential term in Eq. (5.108) turns into

exp

(β2 J2

2N

n

∑a<b

[∑

is(a)

i s(b)i

][∑

js(a)

j s(b)j

])

=∫Dq exp

(−Nβ2 J2

2

n

∑a<b

q2ab + β2 J2 ∑

a<bqab

[∑

is(a)

i s(b)i

])(5.113)

which is now quartic in the spin variables. Putting all things together and commutingthe configurational sum to the right, we arrive at

⟨Zn⟩

= eβ2 J2 Nn

2 ∑s

exp

(β2 J2

2N

n

∑a<b

[∑

is(a)

i s(b)i

][∑

js(a)

j s(b)j

])(5.114)

= eβ2 J2 Nn

2

∫Dq exp

(−Nβ2 J2

2

n

∑a<b

q2ab

)∑

sexp

(β2 J2 ∑

a<bqab

[∑

is(a)

i s(b)i

]).

This expression is not only quartic in the spin variables, the spin variables also decou-ple from each other. This allows us to recast the configurational sum as a sum overindividuals spins raised to the power N. the resulting expression reads

⟨Zn⟩=∫Dq exp

(Nβ2 J2

2

(n−

n

∑a<b

q2ab

))[( n

∏a=1

∑u(a)=±1︸ ︷︷ ︸

single spin

)exp

(β2 J2 ∑

a<bqabu(a)u(b)

)]N

Introducing the functionL[q, u] = β2 J2 ∑

a<bqabu(a)u(b) (5.115)

and using the notation

∑u

:=n

∏a=1

∑u(a)=±1

= ∑u(1)=±1

∑u(2)=±1

. . . ∑u(n)=±1

(5.116)

this can be written in a more compact form as

⟨Zn⟩=∫Dq exp

(Nβ2 J2

2

(n−

n

∑a<b

q2ab

)+ N ln ∑

ueL[q,u]

). (5.117)

Saddle point method: At this point it is important to note that the argument of theexponential function scales linearly with a number of sites N. Therefore, in the limitN → ∞ the integral will be dominated by the maximum of the function inside theexponential, allowing us to approximate the value of the integral. This is known as theso-called saddle point method.

Recall: Method of steepest descent:Let f (x) and g(x) are smooth functions. Then the integral

I =∫ +∞

−∞dx e−N f (x) g(x) (5.118)

Haye Hinrichsen — Complex Systems

Page 121: Complex system

5.5 Spin glasses 115

can be estimated for large N → ∞ by

I ≈ e−N f (x0)g(x0)

√2π

N| f ′′(x0)|

(1 +O(N−1/2)

). (5.119)

Here x0 is the point where the function f (x) is maximal in the integration range, i.e. f ′(x0) =0 and f ′′(x0) < 0. More generally, if x ∈ Rd is a vector and f (x), g(x) smooth functions onthe vector space, then we may approximate the d-dimensional integral in the limit N → ∞by

I =∫

ddx e−N f (x) g(x) ≈(

N

)d/2 e−N f (x0)g(x0)

||H(x0)||1/2 , (5.120)

where Hij = (∂ f /∂xi∂xj) is the Hessian matrix of the function f and ||H|| the absolute valueof its determinant.

The saddle point method is in fact the most important step in the simplification of ourformula. Applying it to Eq. (5.117) with

f (q) =β2 J2

2

(n−

n

∑a<b

q2ab

)+ ln ∑

ueL[q,u] (5.121)

we find that this function is maximal at the point q where

∂ f (q)qab

= 0 ⇒ qab =∏n

a=1 ∑u(a)=±1 eL[q,u]u(a)u(b)

∏na=1 ∑u(a)=±1 eL[q,u] =:

⟨u(a)u(b)

⟩L

(5.122)

Moreover, one can show that the Hessian matrix is diagonal and that its determinant istaking the value ||H||1/2 = (βJ)d, where d = n(n− 1)/2. Note that this is an implicitequation since q enters also on the right hand side as an argument of L.

Doing the calculation we obtain the expression

⟨Zn⟩

=

(Nβ2 J2

)d/2

︸ ︷︷ ︸from (5.112)

(2π

Nβ2 J2

)d/2

︸ ︷︷ ︸from (5.120)

exp

(Nβ2 J2

2

(n−

n

∑a<b

q2ab

)+ N ln ∑

ueL[q,u]

), (5.123)

where the prefactors cancel one another. This expression is expected to become exactand the thermodynamic limit N → ∞.

As a last step, we have to calculate the free energy per lattice site

FN

= − 1Nβ

limn→0

1n

[⟨Zn⟩− 1]

(5.124)

giving

FN

= − 1β

limn→0

(β2 J2

2

[1− 1

n

n

∑a<b

q2ab

]+

1n

ln ∑u

eL[q,u]

). (5.125)

Recall that the qab have to be determined from the condition

∂F∂qab

= 0 ∀ab . (5.126)

Haye Hinrichsen — Complex Systems

Page 122: Complex system

116 Equilibrium critical phenomena and spin glasses

Replica symmetry: The determination of the constants qa,b is still a very hard prob-lem. Therefore, an additional assumption is needed, which is suitable to simplify theproblem. This is the postulate of replica symmetry. The idea Is that all replicas are equallyimportant, i.e. they play a symmetric role with respect to another. Therefore, the solu-tion of the maximization problem should be symmetric as well, meaning that all vari-ables in the extremum coincide:

qa,b ≡ q (5.127)

Of course, this is a nontrivial assumption, and in fact there are various cases where thisassumption does not hold (replica symmetry breaking).

Postulating replica symmetry the free energy per lattice site reduces to

FN

= − 1β

limn→0

(−β2 J2

4

[2− (n− 1)q2

]+

1n

ln ∑u

eL[q,u]

)(5.128)

where the function L is now proportional to q:

L[q, u] = β2 J2q ∑a<b

u(a)u(b) = β2 J2q

(∑n

a=1 u(a))2− n

2(5.129)

Inserting this function and taking out the constant part we get

FN

= − 1β

limn→0

(−β2 J2

4

[2 + 2q− (n− 1)q2

]+

1n

ln ∑u

eβ2 J2q(∑n

a=1 u(a))2

2

)(5.130)

To get rid of the square in the exponential, we again apply Eq. (5.109), namely

eλr2/2 =

)1/2 ∫ +∞

−∞dv e−λv2/2+λvr , (5.131)

setting

λ = β2 J2q , r =n

∑a=1

u(a) . (5.132)

asdf

∑u

eβ2 J2q(∑n

a=1 u(a))2

2 = ∑u

√β2 J2 q

∫ +∞

−∞dv e−

12 β2 J2qv2+βJ2qv ∑n

a=1 u(a)

=

√β2 J2 q

∫ +∞

−∞dv e−

12 β2 J2qv2

∑u

eβJ2qv ∑na=1 u(a)

=

√β2 J2 q

∫ +∞

−∞dv e−

12 β2 J2qv2

[∑

u=±1eβJ2qv ∑n

a=1 u

]n

=

√β2 J2 q

∫ +∞

−∞dv e−

12 β2 J2qv2 [

2 cosh(

β2 J2qv)]n

=1√2π

∫ +∞

−∞dz e−

12 z2 [

2 cosh(

βJ√

qz)]n (5.133)

Haye Hinrichsen — Complex Systems

Page 123: Complex system

5.5 Spin glasses 117

Inserting this result back into Eq. (5.130) we obtain

FN

= − 1β

limn→0

(− β2 J2

4

[2 + 2q− (n− 1)q2

]+

1n

ln1√2π

∫ +∞

−∞dz e−

12 z2[2 cosh

(βJ√

qz)]n)

(5.134)The parameter q has to be determined from the condition that the derivative of thebracket with respect to q vanishes. This leads directly to the condition

β2 J2

2((1− n)q− 1) +

1n

1√2π

∫ +∞−∞ dz e−

12 z2 βJzn

2n−1√q tanh(

βJ√

qz) [

cosh(

βJ√

qz)]n

1√2π

∫ +∞−∞ dz e−

12 z2 [2 cosh

(βJ√

qz)]n = 0

(5.135)Now the 1/n-factor in front of the integral cancels with the n in the integrand. Thisallows us to perform the limit n→ 0 of the replica trick, turning the equation into

β2 J2

2(q− 1) +

βJ2√

2πq

∫ +∞−∞ dz e−

12 z2

z tanh(

βJ√

qz)

1√2π

∫ +∞−∞ dz e−

12 z2

= 0 (5.136)

In this expression the denominator is just equal to 1. Because of ddz e−z2/2 = −ze−z2/2

and ddz tanh(βJ

√qz) = βJ

√q cosh−2(z) the integral in the nominator can be solved by

partial integration

β2 J2

2(q− 1) +

β2 J2

2√

∫ +∞

−∞dz e−

12 z2

cosh−2(βJ√

qz)= 0 (5.137)

Since cosh−2(x) = 1− tanh2(x) we arrive at the final equation for self-consistency

q =1√2π

∫ +∞

−∞dz e−z2/2 tanh2(βJ

√qz)

. (5.138)

Remark: Alternatively, starting from Eq. (5.134) we could try to first carry out the replicalimit n → 0 and then to determine the constant q. To this end, let us again consider Eq. (??),reading

FN

= − 1β

limn→0

(− β2 J2

4

[2 + 2q− (n− 1)q2

]+

1n

ln1√2π

∫ +∞

−∞dz e−

12 z2 [

2 cosh(

βJ√

qz)]n)

Since

limn→0

1n

ln1√2π

∫dz e−

12 z2

xn = limn→0

1n

ln1√2π

∫d ze−

12 z2

(1 + n ln x +O(n2))

= limn→0

1n

ln[

1 +1√2π

∫dze−

12 z2

(n ln x +O(n2))

]=

1√2π

∫dz e−

12 z2

ln x

we get

FN

= − 1β

(− β2 J2

4

[2+ 2q+ q2

]+

1√2π

∫ +∞

−∞dz e−

12 z2

ln[2 cosh

(βJ√

qz)]

. (5.139)

Solving the condition ∂F/∂q = 0 we get

− β2 J2

2(q + 1) = (5.140)

Haye Hinrichsen — Complex Systems

Page 124: Complex system

118 Equilibrium critical phenomena and spin glasses

Figure 5.6.: Generic phase diagram of the Sherrington-Kirkpatrick model.

For simplicity, we have carried out this calculation without an external magnetic fieldh. However, including such an external field is straightforward and leads essentiallyto slightly more general equations of the same structure. In fact, if we had includedthis field, we would find that the global magnetization for symmetrically distributedcoupling constant vanishes at any temperature, i.e., a ferromagnetic phase does notexist as long as the Jij are symmetrically distributed around zero.3

As shown before, the free energy is

xxx (5.141)

To investigate the properties of the system near the critical point, where the spin glassorder parameter q is small, we expand the right-hand side of the free energy as

xxxx (5.142)

The spin glass phase transition is expected to takes place at the point where the secondorder term q2 vanishes. Hence, we can conclude that the spin glass phase transition islocated at T = J, corresponding to the horizontal line in Fig. 5.6.

Note that the coefficient in front of q2 is negative for T > J, meaning that the para-magnetic solution q = 0 at high temperatures maximizes the free energy. Similarly, onecan find that this spin class solution with q > 0 maximizes the free energy in the low-temperature phase. This pathological behavior is a consequence of the replica methodin itself which can be explained as follows. As one can see from the expression for thefree energy before the limit n → 0 is carried out, the coefficient in front of q2 changessign at n = 1. Therefore, going from the physical situations with integer n to the un-physical limit n → 0 we change the sign of the coefficient, causing maximization in-stead of minimization of the free energy.

For very low temperatures the solution derived above becomes unstable, the reasonbeing that the assumption of replica symmetry does no longer apply in this case. In-stead, one has to search for a solution where the replica symmetry is spontaneouslybroken. One can show that the instability of the replica-symmetric solution is also re-flected in a negative entropy. However, a detailed analysis of such aspects is beyondthe scope of this lecture. For further reading we refer to the large variety of existingtextbooks in this field.

3An even more elaborate calculation reveals that the ferromagnetic phase exists in the case of non-symmetrically distributed coupling constants.

Haye Hinrichsen — Complex Systems

Page 125: Complex system

6. Neural networks

In this chapter we give a brief introduction to the theory of neural networks. The studyof artificial neural networks is an important interdisciplinary field with the broad rangeof applications. In these lecture notes we are going to summarize some basic elementswhich are relevant in the context of equilibrium spin models discussed above. Thepresentation follows handwritten notes by Michael Biehl [10] and Ref. [11].

6.1. Biological background

The human brain (see Fig. 6.1) is certainly the most advanced information-processingunit existing on earth. It is in many respects superior to a computer. For example, wecan easily enter room, look for an empty seat and sit down, – but none of the existingcomputers can accomplish such an easy task. In fact, the brain differs from a computerin many respects:

• Computers have a central processing unit (CPU) controlled by a serial sequence of commands. These commands are executed extremely fast (≈ 10⁹ per second). Information processing is purely deterministic. If the hardware or the software is partially damaged, the computer stops working.

• The brain works in a decentralized way through the cooperation of an enormous number of elementary units called neurons. There is no program which controls the information processing in detail. The timescale of information processing is less than a millisecond, i.e., rather slow compared to a computer. The low speed is compensated by a high degree of parallelization. The brain is extremely robust, i.e., in the case of partial damage it still functions properly.

Figure 6.1.: Left: Human brain. Right: Schematic view of the human brain. [Wikimedia commons]


The largest part of the human brain is the cortex. Although the functions of different parts of the cortex are highly differentiated, the cortex itself looks similar everywhere, suggesting that the brain is running the same type of “hardware” in all of its parts. This means that there is a common concept of information processing throughout the whole brain. This is the concept of neurons, which play the role of elementary information-processing units.

Neurons: A biological neuron is a cell which receives and transmits short electric pulses, called action potentials. The neurons are mutually coupled by so-called synapses. In the brain there are roughly 10,000 synapses per neuron, i.e. approximately 10¹⁴ in total. The interaction strength of the synapses is variable and can be positive (excitatory) or negative (inhibitory).

The figure on the right side shows two communicating neurons. Each neuron has a tree of dendrites which serve as input channels. The electric potential coming from the dendrites is continuously integrated inside the cell. When the potential surpasses a certain critical threshold, a complex biochemical process is triggered, leading to a significant increase of the voltage by almost a hundred millivolts. This pulse then propagates along the so-called axon. The axon can be enormously long, reaching lengths of about 1 m. At its terminal point the axon is connected to the dendrites of another neuron by synaptic contacts. The synaptic contact can be excitatory or inhibitory, increasing or decreasing the electric potential in the target cell.

Any kind of mental human activity, be it thinking, feeling, loving or hating – all of it is represented in the brain in the form of electric currents. This applies in particular to the phenomenon of consciousness, which is heavily debated these days. Likewise, any mental state, for example our knowledge, memory and personality, is encoded in terms of synaptic contacts. In particular, the process of learning manifests itself in an adaption of the synaptic contacts. Therefore, all mental and psychological elements of human life have a well-defined physical correlate in which they are represented. Since the number of neurons and synapses is finite, it follows trivially that our mental capability is finite as well.

Synapses and learning: The synapses are the “coupling constants” of the brain. Depending on the incoming electric potential, they determine how much electricity is released in the target cell. There are various types of synapses which are well understood. The most important one is the chemical synapse. Chemical synapses are always directed, i.e. they have a well-defined input (the presynaptic axon) and a well-defined output (the postsynaptic dendrite). Inside the chemical synapse little vesicles are formed which contain special substances called neurotransmitters. When an action potential arrives, the vesicles move to the cell membrane, attach to it and emit the neurotransmitters into the so-called synaptic cleft between the synapse and the dendrite of the target cell. The neurotransmitters in turn trigger certain ion channels in the dendrite's cell membrane, increasing or decreasing the electric potential inside the postsynaptic cell.

The concept of synapses allows the neurons to form a complex network without joining the cell bodies, i.e., they still function as separate cells, which is apparently important to keep them alive. On the other hand, the synapses allow the brain to store information and to control its overall reactivity. As mentioned above, the key property is the ability of the synapses to adapt their permeability, a property called plasticity. This adaption depends on the information crossing the synapse. Roughly speaking, we strengthen a synapse by using it frequently. Or, in other words:

“Cells which fire together wire together.”

Moreover, the use of neurotransmitters allows the brain to regulate its average activity globally by means of various chemical substances, controlling, for example, excitement, sleep, and various states of emotion. Consuming artificial drugs distorts this balance, modifying the synaptic transmission rate in a more or less specific way and affecting this delicate regulation mechanism. The synapses immediately try to compensate for the changes caused by the external drug intake, leading to habituation and drug addiction.

Neural coding:

[Figure: fMRI activity patterns, see http://en.wikipedia.org/wiki/File:Haxby2001.jpg]

In the neurosciences enormous progress has been made thanks to modern imaging techniques, in particular magnetic resonance imaging, which allows the activity of neurons to be monitored in real time. In other words, we cannot yet see what we think, but we can at least see where we think.

MRI imaging nicely illustrates that most of the neurons are usually at rest, not firing at all or only at a very low rate. Conversely, if a neuron is excited, it emits not only one action potential but usually a sequence of many pulses, a so-called burst or spike train.


[Figure: bursting spike trains, see http://www.scholarpedia.org/article/Bursting]

How neuronal information is actually coded is still debated. Some people argue that only the density of spikes matters, while others propose that the actual timing of the spike sequence is relevant as well. It seems that the truth is somewhere in between, depending on the specific situation. Clearly, most sensory cells encode their information only in the density of spikes. However, the sense of hearing definitely transmits information encoded in the spike phases.

Artificial neural networks

Artificial neural networks are simplified models which mimic features of biological neural networks to some extent. As usual, such models are based on a number of simplifications. For example, one usually ignores the possibility of phase coding, assuming that the information is encoded in the density of spikes. Using this assumption, the state of each neuron can be characterized by a single real-valued variable s ∈ R describing its actual firing rate. For simplicity, we will assume that the range is limited to the interval s ∈ [−1, 1], where s = 1 stands for the maximal firing rate while s = −1 means absolute rest.

In addition one has to set up a suitable network architecture. Among a large variety of possibilities there are two extreme architectures, namely:

(a) Fully connected networks, where every neuron is coupled to every other neuron.

(b) Feedforward networks, where the neurons are coupled unidirectionally, forming a hierarchical sequence of layers.

6.2. Magnetically inspired neural networks

Associative memory

Traditional computers store information at certain locations in their memory. To recall information an explicit address is provided, sent to the memory, and the content at this address is retrieved. Thus, conventional computers work with address-based (location-addressable) memory.

Recalling information in a traditional computer requires precise knowledge of the memory address, and a large part of database management is concerned with the handling of addresses, for example in index files.


Associative memory, in contrast, does not require an address; instead it requires a data set which looks similar to the one we would like to find. The memorized data is not stored at a specific location, instead it is stored non-locally, distributed over the whole system. The realization of an associative memory is one of the easiest and most natural tasks to be accomplished by a neural network.

To be more specific, let us assume that the data to be stored is given in the form of p different patterns ~ξ^1, …, ~ξ^p, each of them consisting of N binary values (ξ^k_1, …, ξ^k_N) with ξ^k_i = ±1. If we present a new pattern ~ν = (ν_1, …, ν_N) to the associative memory, the goal is to recall the stored pattern ~ξ which most strongly resembles the presented pattern. This means that ~ν and ~ξ should differ in as few places as possible, i.e., the Hamming distance, defined as the squared deviation

    h = ∑_{i=1}^{N} (ξ_i − ν_i)² = ||~ξ − ~ν||²   (6.1)

should be minimal. Since

    (ξ_i − ν_i)² = ξ_i² − 2 ξ_i ν_i + ν_i² = 2 − 2 ξ_i ν_i ,   (6.2)

the search for the minimum of the Hamming distance is equivalent to the search for the maximum of the Euclidean scalar product

    ~ν · ~ξ^k = ∑_{i=1}^{N} ν_i ξ^k_i .   (6.3)
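Since these notes contain no code, the following small numerical check is our own illustrative addition (a sketch in Python; all variable names are our choice). It verifies the equivalence h = 2N − 2 ~ν·~ξ implied by Eqs. (6.1)-(6.3), so that minimizing the distance is indeed the same as maximizing the overlap:

    import numpy as np

    rng = np.random.default_rng(0)
    N = 100
    xi = rng.choice([-1, 1], size=N)   # stored pattern
    nu = rng.choice([-1, 1], size=N)   # presented pattern

    h = np.sum((xi - nu) ** 2)         # distance, Eq. (6.1)
    overlap = nu @ xi                  # scalar product, Eq. (6.3)
    assert h == 2 * N - 2 * overlap    # minimizing h maximizes the overlap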

The Hopfield model

According to Hopfield (1982), the task of finding this maximum can be accomplished by a dynamical evolution of the network according to the following rules. We consider a network of N neurons (spins) s_i = ±1. This network evolves by random sequential updates, i.e., a neuron i is randomly selected. This neuron collects the electric potential from all other neurons via synaptic contacts J_ij, that is, the incoming electric potential is

    h_i = ∑_j J_ij s_j .   (6.4)

This electric potential is then shifted linearly by a local threshold parameter θ_i, which controls the firing threshold of the neuron. Depending on this shifted potential we assign a new value

    s_i := sign( ∑_j J_ij s_j − θ_i )   (6.5)

to the neuron i. In many cases the situation is further simplified by setting the threshold θ_i to zero.

Obviously, this update procedure can be used in any network architecture. In the simplest version of the Hopfield model, the network is recursive and fully connected. By “recursive” we mean that the set of neurons is coupled with itself. Moreover, the coupling constants are often assumed to be symmetric:

    J_ij = J_ji .   (6.6)


Here it is useful (although not necessary) to set the diagonal couplings to zero.

Hebb’s learning rule: The immediate problem consists in choosing the synaptic coupling constants J_ij, depending on the stored patterns ~ξ^1, …, ~ξ^p, in such a way that the network evolves from the presented pattern ~s := ~ν into the most similar stored pattern by virtue of its own inherent dynamics.

To start with, let us consider the case of a single stored pattern ~ξ. The configuration corresponding to this pattern remains invariant under the network dynamics (with threshold θ = 0) if the local field h_i = ∑_j J_ij s_j has the same sign as ξ_i. As can be shown easily, this condition is satisfied by the simple choice

    J_ij = (1/N) ξ_i ξ_j .   (6.7)

In fact, for s_i = ξ_i the local field is then given by

    h_i = ∑_j J_ij s_j = (1/N) ∑_j ξ_i ξ_j ξ_j = ξ_i ,   (6.8)

hence the pattern ~s = ~ξ is stable under the dynamics.

It is, of course, not sufficient to show that the stored pattern is invariant under the dynamics; in addition we have to show that the dynamics is attractive. To see this, we assume that the network starts its evolution not exactly in the memorized pattern, but in a slightly different configuration, where n < N of the bits have the wrong value. Without loss of generality, we can assume that these bits are the first n elements of the pattern, meaning that we start with

    s_i := { −ξ_i   for i = 1, …, n
           {  ξ_i   for i = n + 1, …, N      (6.9)

Then the synaptic potentials are given by

    h_i = ∑_j J_ij s_j = −(1/N) ∑_{j=1}^{n} ξ_i ξ_j ξ_j + (1/N) ∑_{j=n+1}^{N} ξ_i ξ_j ξ_j = ((N − n) − n)/N ξ_i = (1 − 2n/N) ξ_i .   (6.10)

Therefore, for n < N/2, the network evolves into the correct pattern after a single update of all neurons, i.e., the stored pattern acts as an attractor for the network dynamics.

We are now going to consider the general case, where p patterns ~ξ^1, …, ~ξ^p are to be stored simultaneously. If all patterns are equally important, it is natural to generalize Eq. (6.7) by a simple superposition of the form

    J_ij = (1/N) ∑_{k=1}^{p} ξ^k_i ξ^k_j .   (6.11)

This is what is usually referred to as Hebb's learning rule. This rule can be understood as follows: if two spins are aligned in the majority of patterns, the synapse between them becomes excitatory, otherwise it becomes inhibitory.
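As an illustration (our own sketch, not part of the original notes, with arbitrarily chosen parameters), the following Python fragment implements Hebb's rule (6.11) together with the zero-threshold update (6.5) and recalls a corrupted pattern:

    import numpy as np

    rng = np.random.default_rng(1)
    N, p = 200, 10
    patterns = rng.choice([-1, 1], size=(p, N))    # xi^1 ... xi^p

    J = (patterns.T @ patterns) / N                # Hebb's rule, Eq. (6.11)
    np.fill_diagonal(J, 0.0)                       # no self-couplings

    s = patterns[0].copy()                         # start near pattern xi^1 ...
    flip = rng.choice(N, size=20, replace=False)
    s[flip] *= -1                                  # ... with n = 20 wrong bits

    for _ in range(10 * N):                        # random sequential updates
        i = rng.integers(N)
        s[i] = 1 if J[i] @ s >= 0 else -1          # Eq. (6.5) with theta_i = 0

    print((s @ patterns[0]) / N)                   # overlap close to 1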


Let us first verify the invariance of a presented pattern ~s := ~ξ^ℓ. Here the local field is given by

    h_i = ξ^ℓ_i + (1/N) ∑_{k≠ℓ} ξ^k_i ∑_j ξ^k_j ξ^ℓ_j .   (6.12)

As can be seen, this field is not automatically aligned with the pattern since the sum in the second term could overcompensate the first term. However, if we assume that the stored patterns are mutually uncorrelated, then, according to the laws of statistics,¹ its value will typically be of size √((p−1)/N) for large N and p. As long as p ≪ N, meaning that the number of stored patterns is much smaller than the total number of neurons in the network, the additional term will most likely not affect the sign of the local field, so that the pattern is stable under the evolution dynamics.

Next, let us verify the stability of this fixed point. Again, let us assume that n neurons start out in the wrong state. Then it is straightforward to show that

    h_i = (1 − 2n/N) ξ^ℓ_i + (1/N) ∑_{k≠ℓ} ξ^k_i ∑_j ξ^k_j s_j = (1 − 2n/N) ξ^ℓ_i + O(√((p−1)/N)) .   (6.13)

Again, under the condition n, p ≪ N we still have sign(h_i) = ξ^ℓ_i, so that the network configuration will converge to the desired pattern within a single global update. However, if the number of stored patterns becomes comparable to the number of neurons, the second random term becomes of order one, so that the patterns can no longer be recalled reliably. As one can show by a replica calculation, this undesirable case occurs when the number of stored patterns exceeds about 14% of the number of neurons. It is important to note that this result holds only for random, mutually uncorrelated patterns. In the case of correlated patterns, the recognition capability is even worse. Conversely, if the patterns happen to be orthogonal to each other (meaning that their mutual scalar product vanishes), the capacity of the network can be higher than 14%.

Magnetic interpretation:

As we have seen so far, the Hopfield model is a paradigmatic, minimalistic model of an associative memory. It stores p prototype patterns as fixed-point attractors, which are encoded in the coupling matrix J. Loosely speaking, the coupling matrix is chosen in such a way that the network feels comfortable in configurations representing the learned patterns. The stored patterns can be specified by direct computation, for example by Hebb's learning rule, or, as will be discussed below, by a dynamic update scheme.

The neurons are updated asynchronously (= random-sequentially) according to the dynamical rule

    s_i := sign( ∑_j J_ij s_j − θ_i ) .   (6.14)

After learning p prototype patterns ξ^1, ξ^2, …, ξ^p, the network may be used to recognize similar patterns associatively. To this end a similar-looking pattern is applied to the network and the resulting output state is recursively fed back until the network stabilizes.

¹This is the main consequence of the central limit theorem.

The Hopfield model can be interpreted as some kind of spin glass with the energy functional

    E_s = −∑_{i≠j} J_ij s_i s_j − ∑_i θ_i s_i ,   (6.15)

where the sum runs over all connected neurons. Writing this energy functional again as

    E_s = −∑_i h_i s_i ,   h_i = ∑_{j≠i} J_ij s_j − θ_i ,   (6.16)

we can apply the heat bath algorithm

    s_i^new = sign(p_i − z) ,   (6.17)

where z ∈ [0, 1] is a random number drawn from a flat distribution and

    p_i = e^{βh_i} / (e^{βh_i} + e^{−βh_i}) .   (6.18)

Thus, the update rule in Eq. (6.14) is just the zero-temperature limit of heat bath dynamics.
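A possible implementation of a single heat bath update (our own sketch; the function name and parameters are hypothetical) reads:

    import numpy as np

    def heat_bath_update(s, J, theta, beta, rng):
        """One random sequential heat bath step, Eqs. (6.17)-(6.18)."""
        i = rng.integers(len(s))
        h = J[i] @ s - theta[i]
        p = 1.0 / (1.0 + np.exp(-2.0 * beta * h))  # equals p_i of Eq. (6.18)
        s[i] = 1 if rng.random() < p else -1       # sign(p_i - z), Eq. (6.17)
        return s

For beta → ∞ the probability p jumps between 0 and 1 at h = 0, reproducing the deterministic rule (6.14).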

Energy in the attractor basins: If the coupling constants J_ij are determined according to Hebb's learning rule, and if the configuration of the system corresponds to one of the stored patterns ~ξ^ℓ, then the energy functional takes the value

    E_{ξ^ℓ} = −(1/N) ∑_{i,j=1}^{N} ∑_{k=1}^{p} ξ^k_i ξ^k_j ξ^ℓ_i ξ^ℓ_j = −(1/N) [ N² + ∑_{k≠ℓ} ( ∑_{i=1}^{N} ξ^k_i ξ^ℓ_i )² ] .   (6.19)

Assuming that the patterns are uncorrelated and that the values ξ^k_i = ±1 are equally likely, one finds that the term (∑_{i=1}^{N} ξ^k_i ξ^ℓ_i) in the bracket is of the order √N. Therefore, the energy becomes

    E_{ξ^ℓ} ≈ −(1/2N) [ N² + O((p − 1) N) ] ≈ −N/2 + O((p − 1)/2) .   (6.20)

Thus, in the fixed point of a stored pattern, the energy is essentially equal to −N/2. The influence of all other patterns causes a slight shift of the total energy, stemming from fluctuations around the ground state of the system.

Let us now investigate what happens if the first n bits of the pattern are faulty. In this case the fluctuating terms are replaced by other fluctuating terms of essentially the same magnitude. However, the term with k = ℓ changes substantially: if the first n bits are incorrectly oriented (s_i = −ξ_i, i = 1, …, n), its value changes to

    ∑_{i,j=1}^{N} ξ^ℓ_i ξ^ℓ_j s_i s_j = ( ∑_{i=1}^{N} ξ^ℓ_i s_i )² = (N − 2n)² ,   (6.21)


that is, the energy of the configuration rises in leading order proportionally to the extent of the deviation from the stored pattern:

    E_s ≈ E_{ξ^ℓ} + 2n − 2n²/N .   (6.22)

This illustrates that the memorized patterns are local minima of the energy landscape.

A more detailed analysis shows that the energy functional has, in addition, infinitely many other local minima. However, all these additional minima are less pronounced than those given by the stored patterns. Therefore, the stored patterns can be interpreted as the global minima of the energy surface E_s, at least for moderate values of the storage capacity p/N.

Parallel versus sequential dynamics: For vanishing temperature T = 0, a thermodynamic system in contact with a heat bath minimizes its energy. The update rule of the neural network has to be designed in such a way that it performs this minimization as time proceeds. In fact, it is easy to see that the energy functional in Eq. (6.15) decreases monotonically with time if random sequential updates are used. To see this, let us consider the energy contribution of neuron i under the assumption of symmetric coupling constants:

    E_i(t) = −s_i(t) ∑_{j≠i} J_ij s_j(t) = −s_i(t) h_i(t) .   (6.23)

If this neuron is updated random-sequentially, the energy changes to

    E_i(t + 1) = −s_i(t + 1) ∑_{j≠i} J_ij s_j(t) = −sign[h_i(t)] h_i(t) = −|h_i(t)| ≤ −s_i(t) h_i(t) = E_i(t) .   (6.24)

In other words, the energy contribution of a single neuron subjected to random sequential updates never increases with time. Since the energy is bounded from below, this implies that the network dynamics must reach a stationary state corresponding to a minimum of the energy functional.

This does not apply to parallel updates (synchronous dynamics). Since here all neurons assume new states in parallel at the same moment, the contribution of an individual neuron to the energy functional cannot be considered in isolation. It turns out that algorithms with parallel updates are also suitable to minimize the energy, but they might end up in a cycle in which the same configuration of the entire network recurs every second time step.
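A minimal toy example of such a 2-cycle (our own illustration, with arbitrarily chosen couplings): two antiferromagnetically coupled neurons updated in parallel flip back and forth forever.

    import numpy as np

    J = np.array([[0.0, -1.0],
                  [-1.0, 0.0]])             # symmetric couplings
    s = np.array([1, 1])
    for t in range(6):
        s = np.where(J @ s >= 0, 1, -1)     # all neurons updated at once
        print(t, s)                         # alternates between (-1,-1) and (1,1)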

6.3. Hierarchical networks

The Hopfield networks we have discussed so far may be defined on an arbitrary lattice, e.g., a square lattice with nearest-neighbor couplings or a fully connected network. The high connectivity requires a certain update scheme, e.g., random sequential updates.


Otherwise, if all neurons reacted instantaneously, the behavior of such a network would be uncontrolled.

In hierarchical networks the situation is different. These networks are built in such a way that closed feedback loops of information flow are forbidden. This allows one to update all neurons instantaneously, defining the network's action as a map rather than a dynamical procedure. The most important class of models with this property is that of so-called feedforward networks, which will be discussed in the following.

Feedforward networks

Feedforward networks are characterized by a layered structure and directed information processing. Usually a feedforward network consists of an input layer, one or several intermediate hidden layers, and an output layer. In the following let us enumerate the layers by an index ℓ.

Since closed feedback loops are forbidden, neurons within the same layer are not mutually coupled. For this reason, the state of neuron i in layer ℓ is fully and exclusively determined by the configuration of the preceding layer. For example, using the update rule introduced above, the state of neuron i in layer ℓ could be given by

    s^(ℓ)_i = sign( ∑_{j=1}^{N^(ℓ−1)} J^(ℓ)_ij s^(ℓ−1)_j − θ^(ℓ)_i ) .   (6.25)

Thus, for given coupling constants J^(ℓ)_ij and firing thresholds θ^(ℓ)_i, a feedforward network establishes a direct functional relationship between input and output. The information processing within the network is instantaneous since the definition of the network does not involve a specific dynamics. Since the output layer renders a binary string of neuron states s = ±1, the feedforward network performs a binary classification of the input data. This classification, however, depends on the specific choice of the coupling constants.

The perceptron: The perceptron is a minimal model of a neuron with N input channels (dendrites), each of them reading a real number. Thus, the input data can be understood as a real-valued vector ~ξ ∈ R^N. Each input channel is weighted with an individual coupling constant, which can also be thought of as forming a vector ~J ∈ R^N.

The output of the perceptron (the axon) is a single real number. In the simplest version of the perceptron this output is binary, taking the values s = ±1. In this case the output value is given by the relation

    s = sign( ∑_j J_j ξ_j − θ ) = sign( ~J · ~ξ − θ ) .   (6.26)

To understand how a binary perceptron works, let us consider the following geometric interpretation. The vector of coupling constants ~J defines a hyperplane

    { ~ξ ∈ R^N | ~J · ~ξ = θ }   (6.27)


in the N-dimensional space. For θ = 0 this hyperplane runs through the origin, while for θ > 0 it is shifted away from it, forming an affine subspace. Since the vector ~J is oriented perpendicular to the hyperplane, a positive value of ~J · ~ξ − θ indicates that the point ~ξ lies above the hyperplane, while a negative value indicates a point below. Therefore, the hyperplane separates the vector space linearly into two parts. If the point ~ξ is located on one side of the hyperplane, the perceptron responds with +1, otherwise it responds with −1. In other words, the perceptron can classify input data with respect to a hyperplane determined by ~J and θ.
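A binary perceptron can be coded in a few lines (our own sketch of Eq. (6.26); the weights ~J, the shift θ and the example points are arbitrary choices):

    import numpy as np

    J = np.array([1.0, -2.0])     # normal vector of the hyperplane
    theta = 0.5                   # shift away from the origin

    def perceptron(xi):
        return 1 if J @ xi - theta > 0 else -1   # Eq. (6.26)

    print(perceptron(np.array([2.0, 0.0])))      # +1, point above the hyperplane
    print(perceptron(np.array([0.0, 1.0])))      # -1, point below the hyperplane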

Classifications: Let us consider p sets of input data, called patterns or examples, enumerated by ν = 1, …, p. These patterns are represented by p vectors ~ξ^ν ∈ R^N. The goal is to design a perceptron in such a way that it responds to each of the input patterns ~ξ^ν with a well-defined given answer s^ν = ±1.

A map from p input patterns onto p binary outputs is called a classification. In the following we denote a classification by the symbol C^p_N:

    C^p_N = { ~ξ^ν ∈ R^N ↦ s^ν = ±1 }_{ν=1,…,p}   (6.28)

A classification can be thought of as a set of p points in an N-dimensional vector space which are marked by two different colors, say red and green.

A classification is said to be linearly separable if the red and the green dots can be separated by a hyperplane. Obviously not all classifications are linearly separable. On the other hand, every linearly separable classification can be modeled by a perceptron. Therefore, two important questions arise:

(a) Is a given classification C^p_N linearly separable?

(b) If so, how can we find the corresponding coupling constants ~J and the shift θ?

The perceptron storage problem: To answer the questions posed above we have to understand how generic or special linear separability is. To this end, we would like to compute the number Ω(p, N) of linearly separable classifications of p patterns in N dimensions. Equivalently, we could ask for the probability that a randomly selected classification is linearly separable.

For a given linearly separable classification one may find a whole variety of realizing hyperplanes, as illustrated in Fig. ??. In order to compute the number Ω(p + 1, N), let us now add another pattern to this situation, going from p to p + 1. Doing so we have to distinguish two cases:

(a) If the new pattern is represented by a vector ~ξ^(p+1) which does not touch any of the hyperplanes, all realizations of the classification (in terms of hyperplanes) are associated with the same output for the new pattern. This contribution to Ω(p + 1, N) is denoted as Ω1(p, N).

(b) If the new pattern touches one of the hyperplanes, the multitude of hyperplanes can be split into two parts, with the new point either on the one side or on the other. In this case the classification is ambiguous. If Ω2(p, N) denotes the number of such ambiguous classifications, their contribution to Ω(p + 1, N) will be given by 2Ω2(p, N). Hence

    Ω(p + 1, N) = Ω1(p, N) + 2Ω2(p, N) = Ω(p, N) + Ω2(p, N) .   (6.29)
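The recursion (6.29) still involves the unknown Ω2. A standard result not derived in these notes (Cover's theorem, imported here as a hedged assumption, for hyperplanes through the origin and points in general position) states that Ω2(p, N) = Ω(p, N − 1), which closes the recursion. A small numerical sketch (our own addition):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def omega(p, N):
        """Eq. (6.29) combined with the assumption Omega2(p, N) = Omega(p, N-1)."""
        if p == 1 or N == 1:
            return 2
        return omega(p - 1, N) + omega(p - 1, N - 1)

    # fraction of all 2**p classifications that is linearly separable;
    # for p = 2N this fraction equals exactly 1/2:
    print(omega(10, 5) / 2 ** 10)    # prints 0.5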

will be continued


A. Mathematical details

A.1. Perron-Frobenius Theorem

Here we briefly summarize the Perron-Frobenius theorem [9] and discuss its consequences with respect to stochastic evolution operators.

Let A = (a_ij) be an n × n positive matrix: a_ij > 0 for 1 ≤ i, j ≤ n. Then the following statements hold:

• There is a positive real number r, called the “Perron root” or the “Perron-Frobenius eigenvalue”, such that r is an eigenvalue of A and the absolute value of any other eigenvalue λ (which may be complex) is strictly smaller than r. Thus, the spectral radius ρ(A) is equal to r.

• The eigenspace associated with r is one-dimensional, i.e. the Perron root is non-degenerate.

• There exists a right eigenvector ~v = (v1, v2, …, v_n) of A with eigenvalue r such that all components of ~v are positive. Similarly, there exists a left eigenvector with the same property.

• All other eigenvectors must have at least one negative or non-real component.

• lim_{k→∞} A^k / r^k is a projection operator onto ~v, i.e., in a high power of A the eigenvector ~v corresponding to the eigenvalue r is the only one to survive asymptotically.

• The row sums of the matrix A provide bounds on r, i.e.

    min_i ∑_j a_ij ≤ r ≤ max_i ∑_j a_ij .   (A.1)

The eigenvector ~v to the eigenvalue r can be normalized in such a way that its (non-negative) components add up to 1. Therefore, the components of such a normalized vector can be interpreted as probabilities.
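These statements are easy to check numerically; the following sketch (our own addition, for a random positive matrix) verifies the dominance of the Perron root, the positivity of the eigenvector, and the row-sum bounds (A.1):

    import numpy as np

    rng = np.random.default_rng(2)
    n = 5
    A = rng.random((n, n)) + 0.1            # strictly positive matrix

    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)
    r = eigvals[k].real                     # Perron root = spectral radius
    v = eigvecs[:, k].real
    v = v / v.sum()                         # components now add up to 1

    assert np.all(np.abs(np.delete(eigvals, k)) < r)   # all other |lambda| < r
    assert np.all(v > 0)                               # positive eigenvector
    rows = A.sum(axis=1)
    assert rows.min() <= r <= rows.max()               # row-sum bounds, Eq. (A.1)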

Application to Liouville operators: Let us now apply the Perron-Frobenius theorem to the Liouville operator L. According to Eq. (1.7) its matrix elements are given by

    L_{cc′} = −w_{c′→c} + δ_{c,c′} E_c ,   with the escape rates E_c = ∑_{c″} w_{c→c″} .   (A.2)

The escape rates E_c are listed as positive entries along the diagonal. The individual transition rates, on the other hand, appear as negative numbers or zeros in the off-diagonal elements. To establish a connection with the Perron-Frobenius theorem, we invert the sign of all entries and add a multiple of the unit matrix, i.e. we set

    A = s·1 − L ,   (A.3)

where the prefactor s is large enough so that all matrix elements become non-negative. In this way we obtain a real-valued (generally non-symmetric) matrix with non-negative entries.

Let us for now assume that all entries are positive, meaning that there are no vanishing rates, so that the transition network is fully connected. Then A has an upper Perron root r, hence L has a lower Perron root, i.e. there exists a real eigenvalue such that the real parts of all other eigenvalues are strictly larger. The corresponding eigenvector is the only one to have non-negative entries.

Now we use the fact that L is constructed in such a way that it preserves probability. As shown above, this implies that 〈1|L = 0, where 〈1| = (1, 1, …). Clearly, this is a left eigenvector to the eigenvalue zero, and since we know that all other eigenvectors have at least one negative component, this must be the eigenvector corresponding to the Perron root. Consequently the root eigenvalue of the Liouvillian is zero, implying s = r.

The corresponding right eigenvector generally has different components, but we know that they are also non-negative and can be normalized to add up to 1, hence allowing a stochastic interpretation. As argued in Chapter 1, this is the stationary state of the system. Moreover, in the case of a fully connected transition network, we can also conclude that the stationary state is non-degenerate and unique. Since the real parts of all other eigenvalues are positive, they describe relaxational modes which tend to zero in the long-time limit t → ∞.

Partially connected transition networks may decompose into disconnected clusters of transitions. In this case the above considerations can be applied to each of these clusters separately. Consequently, such a system has several stationary states, namely as many as there are disconnected clusters.
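For a small fully connected transition network the statements above can be verified directly (our own sketch; the rates are random and the conventions follow Eq. (A.2)):

    import numpy as np

    rng = np.random.default_rng(3)
    n = 4
    w = rng.random((n, n))
    np.fill_diagonal(w, 0.0)                  # w[c, c2] = rate of the jump c -> c2

    L = np.diag(w.sum(axis=1)) - w.T          # Liouvillian, Eq. (A.2)
    assert np.allclose(L.sum(axis=0), 0.0)    # <1| L = 0: probability conservation

    eigvals, eigvecs = np.linalg.eig(L)
    k = np.argmin(np.abs(eigvals))            # the zero mode
    stationary = eigvecs[:, k].real
    stationary /= stationary.sum()            # normalized stationary state
    assert np.allclose(L @ stationary, 0.0) and np.all(stationary > 0)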

A.2. Tensor products

Definition

Let V1 and V2 be two vector spaces. These vector spaces can be used to create a new vector space V1 ⊗ V2, called the tensor product of V1 and V2. This space can be constructed most easily by defining a suitable basis. To this end let {|e_i〉} and {|f_j〉} be bases of V1 and V2, respectively. Then the basis of V1 ⊗ V2 can be thought of as the set of all possible ordered combinations of these basis vectors. For example, if {|e1〉, |e2〉, |e3〉} is a basis of V1 = R^3 and if {|f1〉, |f2〉} is a basis of V2 = R^2, then the basis of the tensor product is given by the six vectors consisting of all possible combinations

    |e1〉|f1〉, |e1〉|f2〉, |e2〉|f1〉, |e2〉|f2〉, |e3〉|f1〉, |e3〉|f2〉 ,   (A.4)


where we used the compact notation |e_i〉|f_j〉 := |e_i〉 ⊗ |f_j〉. Therefore, the dimension of a tensor space is the product of the dimensions of its tensor factors, i.e. the dimension is multiplicative with respect to ⊗.

Usually the combinations of the basis vectors are sorted lexicographically, meaning that the index of the rightmost tensor factor is the first to be incremented.

Having defined a basis in V1 ⊗ V2 we can use bilinearity to derive the tensor product of arbitrary vectors. Bilinearity means that scalar prefactors in both tensor components can be pulled out in front of the tensor product, i.e.

    (λ|e_i〉) ⊗ (μ|f_j〉) = λμ (|e_i〉 ⊗ |f_j〉)   ∀ λ, μ ∈ R .   (A.5)

With this rule it is straightforward to show that the components of the tensor product of two vectors |a〉 ⊗ |b〉 are given by

    |c〉 = |a〉 ⊗ |b〉 = (a1, a2, a3)ᵀ ⊗ (b1, b2)ᵀ = (a1b1, a1b2, a2b1, a2b2, a3b1, a3b2)ᵀ .   (A.6)

Note that the tensor product is an ordered product, i.e. the tensor factors do not commute with each other.

Likewise it is straightforward to show that the tensor product of two linear operators A and B is given by¹

    C = A ⊗ B =
        [ a11 a12 a13 ]     [ b11 b12 ]
        [ a21 a22 a23 ]  ⊗  [ b21 b22 ]                                      (A.7)
        [ a31 a32 a33 ]

      = [ a11b11 a11b12 a12b11 a12b12 a13b11 a13b12 ]
        [ a11b21 a11b22 a12b21 a12b22 a13b21 a13b22 ]
        [ a21b11 a21b12 a22b11 a22b12 a23b11 a23b12 ]
        [ a21b21 a21b22 a22b21 a22b22 a23b21 a23b22 ]
        [ a31b11 a31b12 a32b11 a32b12 a33b11 a33b12 ]
        [ a31b21 a31b22 a32b21 a32b22 a33b21 a33b22 ]

Note that a tensor product has no direct geometric interpretation, because the smallest vector space which can be decomposed into two non-trivial tensor factors is R^4. In particular, the tensor product must not be confused with the direct sum, where the vector components are simply concatenated.

¹In the mathematical literature the tensor product of linear operators is referred to as the Kronecker product. The same applies to Mathematica®: here you should use KroneckerProduct instead of TensorProduct. Both commands yield the same matrix entries but the resulting list structure is different.


Formal properties of tensor products

Let us now summarize the most important formal properties of the tensor product. First, we can always pull out scalars μ, ν ∈ R as ordinary products in front of the tensor product, i.e.,

    (μ|a〉) ⊗ (ν|b〉) = μν (|a〉 ⊗ |b〉) ,   (μA) ⊗ (νB) = μν (A ⊗ B) .   (A.8)

In the case of complex vector spaces it is important to note that tensor products differ from scalar products insofar as the scalar belonging to the left factor must not be complex conjugated.

If a tensor product of operators is applied to a tensor product of vectors, each operator acts on its own vector separately, i.e.

    (A ⊗ B)(|a〉 ⊗ |b〉) = A|a〉 ⊗ B|b〉 .   (A.9)

Consequently, the successive application of several factorizing operators can be carried out separately in every tensor component:

    (A1 ⊗ B1)(A2 ⊗ B2) = (A1A2) ⊗ (B1B2) .   (A.10)

Performing the adjoint ‘†’ one has to conjugate all tensor components individually, keeping the order of the tensor factors:

    (A ⊗ B)† = A† ⊗ B†   (A.11)
    (|a〉 ⊗ |b〉)† = 〈a| ⊗ 〈b| .   (A.12)

This differs from ordinary operator products, where the order is reversed: (AB)† = B†A†. The same applies to operator products within the tensor components, which are reversed as well, since in this case the usual rules for matrix products apply:

    ( (A1A2) ⊗ (B1B2B3) )† = (A2†A1†) ⊗ (B3†B2†B1†) .   (A.13)

The tensor product of two scalars λ, μ ∈ C is defined in a formally consistent manner as an ordinary multiplication in C:

    λ ⊗ μ ≡ λμ .   (A.14)

For example we have

    ( 〈a1| ⊗ 〈b1| ) ( A ⊗ B ) ( |a2〉 ⊗ |b2〉 ) = 〈a1|A|a2〉 〈b1|B|b2〉 ∈ C .   (A.15)

In particular, the norm of a tensor product is simply given by the product of the norms of its factors:

    || |a〉 ⊗ |b〉 || = || |a〉 || · || |b〉 || .   (A.16)

The determinant of a tensor product is equal to the product of the determinants of its factors, each raised to the power of the dimension of the other tensor factor:

    det(A ⊗ B) = (det A)^dim(V_B) (det B)^dim(V_A) .   (A.17)


The trace factorizes:

    Tr(A ⊗ B) = Tr A · Tr B .   (A.18)

Both identities can be verified easily by writing the matrices A and B in a diagonal basis.
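In NumPy the same checks take one line each (our own sketch; np.kron plays the role of the KroneckerProduct command mentioned in the footnote):

    import numpy as np

    rng = np.random.default_rng(4)
    A, B = rng.random((3, 3)), rng.random((2, 2))
    a, b = rng.random(3), rng.random(2)

    # Eq. (A.9): factorizing operators act on each tensor component separately
    assert np.allclose(np.kron(A, B) @ np.kron(a, b), np.kron(A @ a, B @ b))
    # Eq. (A.18): the trace factorizes
    assert np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B))
    # Eq. (A.17): det(A x B) = (det A)^dim(V_B) * (det B)^dim(V_A)
    assert np.isclose(np.linalg.det(np.kron(A, B)),
                      np.linalg.det(A) ** 2 * np.linalg.det(B) ** 3)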

Tensor products may be applied successively. For example, a vector space V may be defined as a tensor product V = V1 ⊗ V2 ⊗ V3 consisting of three individual spaces V1, V2, V3, the so-called tensor components. If the number of tensor components is very large, it is common to use the notation

    V = ⊗_{j=1}^{n} V_j .   (A.19)

For N identical factors V = V0 ⊗ V0 ⊗ … ⊗ V0 one also uses the notation of tensor powers

    V = V0^{⊗N} .   (A.20)

A.2.1. The physical meaning of tensor products

In quantum mechanics and probability theory one uses tensor products whenever the system under consideration consists of several subsystems. For example, if two systems X and Y with the state spaces (Hilbert spaces) V_X and V_Y are combined into a single system, the state space V of the entire system will be given by

    V = V_X ⊗ V_Y .   (A.21)

Such systems composed of two parts are denoted as bipartite systems. Similarly, one can construct multipartite complex systems by repeated application of the tensor product.

A typical application is a chain of sites with a well-defined local configuration space. For example, in the case of the ASEP (see Chapter 1), where the local configuration space is R^2, the state space of the total system is V = (R^2)^{⊗N} ≅ R^{2^N} and not, as one might expect, R^{2N}.

A given state space V can be decomposed in infinitely many ways, provided that itsdimension is factorizable. Which of the decompositions is the correct one depends onthe specific physical situation. Usually a subsystem refers to an entity which is spatiallyseparated from the rest of the system.


Frequently used Symbols

Ω_sys    configuration space
c        individual microscopic configuration
c → c′   microscopic transition (jump)


Bibliography

[1] Presentation based on: H. Hinrichsen, Physik in unserer Zeit 43(5), 246 (2012).

[2] P. Grassberger, Phys. Lett. A 128, 369 (1988), see corrections in eprint cond-mat/0307138 (2003).

[3] see e.g. M. Tribus and E.C. McIrvine, Scientific American, 224 (1971).

[4] see e.g. chemistrytable.webs.com/enthalpyentropyandgibbs.htm

[5] G. Miller, Information Theory in Psychology II-B, ed. H. Quastler, Glencoe, Illinois; Free Press, 95 (1955); G. P. Basharin, Theory Prob. App. 4, 333 (1959).

[6] B. Derrida, M. R. Evans, V. Hakim, and V. Pasquier, Exact solution of a 1d asymmetric exclusion model using a matrix formulation, J. Phys. A 26, 1493 (1993).

[7] B. Derrida, An exactly soluble non-equilibrium system: The asymmetric simple exclusion process, Phys. Rep. 301, 65 (1998).

[8] R. A. Blythe and M. R. Evans, Nonequilibrium Steady States of Matrix Product Form: A Solver's Guide, J. Phys. A Math. Theor. 40, R333 (2007), http://arxiv.org/abs/0706.1678.

[9] see Meyer, http://www.matrixanalysis.com/Chapter8.pdf (Chapter 8, page 667), and Wikipedia.

[10] The discussion of neural networks follows closely the lecture notes and habilitation thesis written by Michael Biehl, TP3, Universität Würzburg.

[11] B. Müller and J. Reinhardt, Neural Networks – An Introduction, Springer, Berlin, 1991.

[12] J. Schnakenberg, Network theory of microscopic and macroscopic behavior of master equation systems, Rev. Mod. Phys. 48, 571 (1976).

[13] T. M. Liggett, Interacting particle systems, Springer, Berlin, 1985.

[14] G. Ódor, Universality classes in nonequilibrium lattice systems, Rev. Mod. Phys. 76, 663–724 (2004).

[15] S. Lübeck, Universal scaling behavior of non-equilibrium phase transitions, Int. J. Mod. Phys. B 18, 3977–4118 (2004).

[16] H. Hinrichsen, Non-equilibrium critical phenomena and phase transitions into absorbing states, Adv. Phys. 49, 815–958 (2000).

[17] D. Stauffer and A. Aharony, Introduction to Percolation Theory, Taylor & Francis, London, 1992.

[18] I. Jensen, Low-density series expansions for directed percolation: III. Some two-dimensional lattices, J. Phys. A 37, 6899–6915 (2004).

[19] E. Domany and W. Kinzel, Equivalence of cellular automata to Ising models and directed percolation, Phys. Rev. Lett. 53, 311–314 (1984).

[20] W. Kinzel, Phase transitions of cellular automata, Z. Phys. B 58, 229–244 (1985).

[21] S. Wolfram, Statistical mechanics of cellular automata, Rev. Mod. Phys. 55, 601–644 (1983).

[22] G. F. Zebende and T. J. P. Penna, The Domany-Kinzel cellular automaton phase diagram, J. Stat. Phys. 74, 1273–1279 (1994).

[23] I. Dornic, H. Chaté, J. Chave, and H. Hinrichsen, Critical coarsening without surface tension: the voter universality class, Phys. Rev. Lett. 87, 5701–5704 (2001).

[24] T. E. Harris, Contact interactions on a lattice, Ann. Prob. 2, 969 (1974).

[25] R. Dickman and J. K. da Silva, Moment ratios for absorbing-state phase transitions, Phys. Rev. E 58, 4266–4270 (1998).

[26] J. F. F. Mendes, R. Dickman, M. Henkel, and M. C. Marques, Generalized scaling for models with multiple absorbing states, J. Phys. A 27, 3019–3028 (1994).

[27] H. K. Janssen, On the nonequilibrium phase transition in reaction-diffusion systems with an absorbing stationary state, Z. Phys. B 42, 151–154 (1981).

[28] P. Grassberger, On phase transitions in Schlögl's second model, Z. Phys. B 47, 365–374 (1982).

[29] H. K. Janssen and U. C. Täuber, The field theory approach to percolation processes, Ann. Phys. (N.Y.) 315, 147–192 (2005).

[30] U. C. Täuber, M. J. Howard, and B. P. Vollmayr-Lee, Applications of field-theoretic renormalization group methods to reaction-diffusion problems, J. Phys. A 38, R79–R131 (2005).

[31] R. Rammal, C. Tannous, and P. Brenton, Flicker (1/f) noise in percolation networks: A new hierarchy of exponents, Phys. Rev. Lett. 54, 1718–1721 (1985).

[32] L. de Arcangelis, S. Redner, and A. Coniglio, Anomalous voltage distribution of random resistor networks and a new model for the backbone at the percolation threshold, Phys. Rev. B 31, 4725–4727 (1985).

[33] O. Stenull and H. K. Janssen, Transport on directed percolation clusters, Phys. Rev. E 63, 025103 (2001).

[34] H. Hinrichsen, O. Stenull, and H. K. Janssen, Multifractal current distribution in random-diode networks, Phys. Rev. E 65, 045104–045107 (2002).

[35] C. N. Yang and T. D. Lee, Statistical theory of equations of state and phase transitions: 1. Theory of condensation, Phys. Rev. 87, 404–409 (1952).


[36] C. N. Yang and T. D. Lee, Statistical theory of equations of state and phase transitions: 2. Lattice gas and Ising model, Phys. Rev. 87, 410–419 (1952).

[37] B. Derrida, L. de Sèze, and C. Itzykson, Fractal structure of zeros in hierarchical models, J. Stat. Phys. 33, 559–569 (1983).

[38] P. F. Arndt, Yang-Lee theory for a nonequilibrium phase transition, Phys. Rev. Lett. 84, 814–817 (2000).

[39] S. M. Dammer, S. R. Dahmen, and H. Hinrichsen, Yang-Lee zeros for a nonequilibrium phase transition, J. Phys. A 35, 4527–4539 (2002).

[40] D. Zhong and D. ben-Avraham, Universality class of two-offspring branching annihilating random walks, Phys. Lett. A 209, 333–337 (1995).

[41] J. Cardy and U. C. Täuber, Theory of branching and annihilating random walks, Phys. Rev. Lett. 77, 4780 (1996).

[42] J. Cardy and U. C. Täuber, Field theory of branching and annihilating random walks, J. Stat. Phys. 90, 1–56 (1998).

[43] J. T. Cox and D. Griffeath, Diffusive clustering in the two-dimensional voter model, Annals of Probability 14, 347–370 (1986).

[44] M. Scheucher and H. Spohn, A soluble kinetic model for spinodal decomposition, J. Stat. Phys. 53, 279–294 (1988).

[45] L. Frachebourg and P. L. Krapivsky, Exact results for kinetics of catalytic reactions, Phys. Rev. E 53, R3009–R3012 (1996).

[46] O. A. Hammal, H. Chaté, I. Dornic, and M. A. Muñoz, Langevin description of critical phenomena with two symmetric absorbing states, Phys. Rev. Lett. 94, 230601–230604 (2005).

[47] M. Rossi, R. Pastor-Satorras, and A. Vespignani, Universality class of absorbing phase transitions with a conserved field, Phys. Rev. Lett. 85, 1803–1806 (2000).

[48] S. S. Manna, Two-state model of self-organized criticality, J. Phys. A 24, L363 (1991).

[49] M. Henkel and H. Hinrichsen, The non-equilibrium phase transition of the pair-contact process with diffusion, J. Phys. A 37, R117–R159 (2004).

[50] P. Grassberger, On phase transitions in Schlögl's second model, Z. Phys. B 47, 365 (1982).

[51] M. J. Howard and U. C. Täuber, ‘Real’ versus ‘imaginary’ noise in diffusion-limited reactions, J. Phys. A 30, 7721 (1997).

[52] E. Carlon, M. Henkel, and U. Schollwöck, Critical properties of the reaction-diffusion model 2A → 3A, 2A → ∅, Phys. Rev. E 63, 036101 (2001).

[53] J. Kockelkoren and H. Chaté, Absorbing phase transition of branching-annihilating random walks, Phys. Rev. Lett. 90, 125701 (2003).

[54] G. Ódor, Critical behaviour of the one-dimensional annihilation-fission process 2A → ∅, 2A → 3A, Phys. Rev. E 62, R3027 (2000).

[55] G. Ódor, Critical behaviour of the one-dimensional diffusive pair-contact process, Phys. Rev. E 67, 016111 (2003).

[56] M. Paessens and G. M. Schütz, Phase transitions and correlations in the bosonic pair contact process with diffusion: exact results, J. Phys. A 37, 4709–4722 (2004).

[57] H. Hinrichsen, Cyclically coupled spreading and pair-annihilation, Physica A 291, 275 (2001).

[58] J. D. Noh and H. Park, Novel universality class of absorbing transitions with continuously varying exponents, Phys. Rev. E 69, 016122 (2004).

[59] R. Dickman and M. A. F. de Menezes, Nonuniversality in the pair-contact process with diffusion, Phys. Rev. E 66, 045101 (2002).

[60] H. Hinrichsen, Stochastic cellular automaton for the coagulation-fission process 2A → 3A, 2A → A, Physica A 320, 249 (2003).

[61] G. T. Barkema and E. Carlon, Universality in the pair-contact process with diffusion, Phys. Rev. E 68, 036113 (2003).

[62] H. Hinrichsen, The diffusive pair-contact process and non-equilibrium wetting, unpublished notes (2003), cond-mat/0302381.

[63] J. L. Cardy and P. Grassberger, Epidemic models and percolation, J. Phys. A 18, L267–L271 (1985).

[64] A. Jiménez-Dalmaroni and H. Hinrichsen, Epidemic spreading with immunization, Phys. Rev. E 68, 036103–036114 (2003).

[65] S. M. Dammer and H. Hinrichsen, Epidemic spreading with immunization and mutations, Phys. Rev. E 68, 016114–016121 (2003).

[66] L. Hufnagel, D. Brockmann, and T. Geisel, Forecast and control of epidemics in a globalized world, Proc. Natl. Acad. Sci. 101, 15124–15129 (2004).

[67] D. Brockmann, L. Hufnagel, and T. Geisel, Human dispersal on geographical scales, 2005, preprint, submitted.

[68] D. Mollison, Spatial contact models for ecological and epidemic spread, J. Roy. Stat. Soc. B 39, 283 (1977).

[69] J.-P. Bouchaud and A. Georges, Anomalous diffusion in disordered media: statistical mechanics, models and physical applications, Phys. Rep. 195, 127–293 (1990).

[70] H. C. Fogedby, Langevin equations for continuous Lévy flights, Phys. Rev. E 50, 1657–1660 (1994).

[71] P. Grassberger, in Fractals in Physics, edited by L. Pietronero and E. Tosatti, Elsevier, Amsterdam, 1986.

[72] M. C. Marques and A. L. Ferreira, Critical behaviour of a long-range non-equilibrium system, J. Phys. A 27, 3389–3395 (1994).


[73] E. V. Albano, Branching annihilating Lévy flights: Irreversible phase transitions with long-range interactions, Europhys. Lett. 34, 97–102 (1996).

[74] H. K. Janssen, K. Oerding, F. van Wijland, and H. J. Hilhorst, Lévy-flight spreading of epidemic processes leading to percolating clusters, Eur. Phys. J. B 7, 137–145 (1999).

[75] H. Hinrichsen and M. Howard, A model for anomalous directed percolation, Eur. Phys. J. B 7, 635–643 (1999).

[76] A. Jiménez-Dalmaroni and J. Cardy, 2005, in preparation.

[77] J. Adamek, M. Keller, A. Senftleben, and H. Hinrichsen, Epidemic spreading with long-range infections and incubation times, J. Stat. Mech., to appear (2005).

[78] S. Dietrich, Wetting phenomena, in Phase Transitions and Critical Phenomena, edited by C. Domb and J. L. Lebowitz, volume 12, Academic Press, London, 1988.

[79] M. Kardar, G. Parisi, and Y.-C. Zhang, Dynamic scaling of growing interfaces, Phys. Rev. Lett. 56, 889–892 (1986).

[80] Y. Tu, G. Grinstein, and M. A. Muñoz, Systems with multiplicative noise: critical behavior from KPZ equation and numerics, Phys. Rev. Lett. 78, 274–277 (1997).

[81] M. A. Muñoz and T. Hwa, On nonlinear diffusion with multiplicative noise, Europhys. Lett. 41, 147–152 (1998).

[82] U. Alon, M. R. Evans, H. Hinrichsen, and D. Mukamel, Roughening transition in a one-dimensional growth process, Phys. Rev. Lett. 76, 2746–2749 (1996).

[83] U. Alon, M. Evans, H. Hinrichsen, and D. Mukamel, Smooth phases, roughening transitions, and novel exponents in one-dimensional growth models, Phys. Rev. E 57, 4997–5012 (1998).

[84] H. Hinrichsen, R. Livi, D. Mukamel, and A. Politi, A model for nonequilibrium wetting transitions in two dimensions, Phys. Rev. Lett. 79, 2710–2713 (1997).

[85] H. Hinrichsen, R. Livi, D. Mukamel, and A. Politi, First order phase transition in a 1+1-dimensional nonequilibrium wetting process, Phys. Rev. E 61, R1032–R1035 (2000).

[86] H. Hinrichsen, R. Livi, D. Mukamel, and A. Politi, Wetting under non-equilibrium conditions, Phys. Rev. E 68, 041606 (2003).

[87] H. Hinrichsen, On possible experimental realizations of directed percolation, Braz. J. Phys. 30, 69–82 (2000).


Index

action potential, 120
attractor, 124
axon, 120
basis
    canonical, 6
bit, 30
burst, 121
chemical affinity, 56
chemical oscillations, 7
chemical synapse, 120
classification, 129
clock model, 52
coarse-graining, 49
configuration, 2
    absorbing, 4
    classical, 1
conjugate pair, 94
coordination number, 98
correlation length, 100
coupling constant, 92
critical phenomena, 91
decoherence, 1
dendrites, 120
detailed balance, 40, 43, 50, 51, 103
dynamics
    cluster, 104
    local spin flip, 104
    stochastic, 3
Edwards-Anderson model, 108
eigenmode decomposition, 7
energy
    configurational, 45
entropic force, 52, 53
entropy, 29
    configurational, 34
entropy production, 52
entropy-optimized coding, 33
environment, 48
equal-a-priori postulate, 37, 48
equilibration
    in economies, 43
equilibrium
    thermal, 8
equipartition postulate, 37
ergodicity, 3
escape rate, 5, 47, 60, 131
examples, 129
extensive, 30
extensivity, 31
extent of reaction, 56
feedforward networks, 128
fluctuation theorem
    integral, 59
    strong, 58
    weak, 59
fluctuation theorems, 58
free energy, 45
    configurational, 46
    Helmholtz, 45
Frobenius theorem, 131
Gibbs postulate, 37, 48
Glauber dynamics, 105
Hamming distance, 123
heat, 62
Hebb's learning rule, 124
hidden layers, 128
Hubbard-Stratonovich identity, 113
hyperscaling relation
    Ising model, 100
information, 29
initial state, 6
input layer, 128
intensity matrices, 7
internal energy, 46
Ising model, 91
Kronecker product, 133
lattice models
    bosonic, 11
    fermionic, 11
lattice sites, 10
Liouville operator, 6
Liouvillian, 6
magnetization, 93
Markov assumption, 3
Markov processes, 3
master equation, 5
matrix
    intensity, 7
matrix algebra, 25
matrix exponential function, 6
mean field limit, 95
microcanonical ensemble, 42
microstate, 2
molecular beam epitaxy, 1
nearest neighbors, 92
neurons, 120
neurotransmitters, 121
nonequilibrium steady state, 22, 52
nonequilibrium system, 52
operators
    stochastic, 7
order parameter, 93
output layer, 128
path integral, 62
patterns, 123, 129
Perron-Frobenius theorem, 8, 131
phase transition
    first-order, 98
phase transitions, 91
plasticity, 121
Poisson distribution, 9
power law, 99
probability current, 38
product state, 18
rate, 3
    escape, 131
replica, 111
replica symmetry, 116
replica trick, 110
saddle point method, 114
scaling regime, 99
scaling relations, 99
    Ising model, 100
Schnakenberg formula, 55
second law of thermodynamics, 48
sector
    dynamical, 4
separability
    linear, 129
Sherrington-Kirkpatrick model, 113
shot noise, 9
simple exclusion process, 11
spike train, 121
spin glass, 107
spin glass phase, 108
spin-spin correlation function, 99
spontaneous symmetry breaking, 97
state, 5
    factorizable, 18
    stationary, 8, 48
state space, 5
state vector, 5
states
    absorbing, 4
stochastic path, 51
stochastic trajectory, 5, 51
susceptibility, 94
synapses, 120
synaptic cleft, 121
system
    bipartite, 135
    isolated, 37, 48
temperature, 43
tensor components, 135
tensor powers, 135
tensor product, 132
tensor products, 14
thermal death hypothesis, 40
thermodynamic equilibrium, 44
thermodynamic limit, 4
transition
    microscopic, 3
transition network, 3
universality class
    Ising, 100
upper critical dimension, 98, 102
work, 63