
Chapter 3

Predicting the uncertain

©2013 by Alessandro Codello

All epistemological value of the theory of probability is based on this: the large-scale random phenomena in their collective action create strict, non-random regularity.

B.V. Gnedenko and A.N. Kolmogorov [1]

3.1 Random variables

Everything around us is random, from the temperature inside our room to the height of the next person who comes in through the door. The recognition and description of randomness is the first step in the direction of understanding: it is a way to parametrize the uncertain.

A random variable is anything that can be measured an arbitrary number of times, the outcome of the measurement being random (this outcome can be an integer or a real number).

A random variable X can be described by the probability density function (pdf) p_X(x), defined by the relation:

P(a ≤ X ≤ b) = ∫_a^b dx p_X(x) .   (3.1)


The probability that X takes some value is one, P(−∞ ≤ X ≤ ∞) = 1; thus we have the normalization condition on the pdf:

∫_{−∞}^{∞} dx p_X(x) = 1 .   (3.2)

A discrete random variable can be described by a pdf of the form p(x) = Σ_i p_i δ(x − x_i). The cumulative distribution function F_X(x) of the random variable X is defined by

F_X(x) = P(X ≥ x)   (3.3)

and we have F′_X(x) = −p_X(x). Expectations are defined as:

⟨f(X)⟩ = ∫_{−∞}^{∞} dx p_X(x) f(x)

and can be evaluated if the pdf is known.

The moments of the random variable X are given by the expectation values of the powers of X:

m_n = ⟨X^n⟩ .   (3.4)

The zeroth moment reflects the normalization of the pdf, m_0 = 1, while the first moment is just the mean, m_1 = m. Higher moments can be infinite. Moments can be conveniently calculated from the moment generating function, defined as:

Z_X(t) = ⟨e^{tX}⟩ ,   m_n = Z_X^{(n)}(0) ,   (3.5)

which is just the Laplace transform of the pdf¹. Much more important than the moments are the cumulants, which are generated by

W_X(t) = log Z_X(t) ,   c_n = W_X^{(n)}(0) .   (3.6)

The first two cumulants are c_0 = 0 and c_1 = m; the next is the variance²

c_2 = m_2 − m² = ⟨(x − m)²⟩ ≡ σ²   (3.7)

¹We will use field-theory conventions.
²The positive square root of the variance, σ, is called the standard deviation.


and measures the deviations from the mean. The variance sets the scale of the pdf and quantifies the resolution at which we are "observing" the pdf. The next few cumulants are

c_3 = m_3 − 3 m_2 m + 2 m³ = ⟨(x − m)³⟩
c_4 = m_4 − 4 m_1 m_3 − 3 m_2² + 12 m_1² m_2 − 6 m_1⁴ = ⟨(x − m)⁴⟩ − 3σ⁴
c_5 = ... ,   (3.8)

and so on. In general the cumulant c_n is a polynomial in the moments m_p of order p ≤ n.
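As a concrete check of the relations (3.7)-(3.8), here is a minimal Monte Carlo sketch (the exponential test distribution and the sample size are arbitrary choices) that estimates the raw moments and assembles the cumulants from them:

```python
import numpy as np

# Monte Carlo check of the moment-cumulant relations (3.7)-(3.8) for an
# exponential random variable of unit mean, whose exact cumulants are
# c_n = (n-1)!.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=10_000_000)

m = [np.mean(x**n) for n in range(5)]   # raw moments m_0 .. m_4

c2 = m[2] - m[1]**2
c3 = m[3] - 3*m[2]*m[1] + 2*m[1]**3
c4 = m[4] - 4*m[1]*m[3] - 3*m[2]**2 + 12*m[1]**2*m[2] - 6*m[1]**4

print(c2, c3, c4)   # should be close to the exact values 1, 2, 6
```

For an exponential variable of unit mean the printed values should be close to 1, 2 and 6.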

A problem with the moment generating function is that it is not always finite; for this reason it is useful to introduce the characteristic function, which is the Fourier transform³ of the pdf:

p̂_X(t) = ⟨e^{itX}⟩ .   (3.9)

The nice thing about the characteristic function is that it is always finite, since:

|p̂_X(t)| = |⟨e^{itX}⟩| ≤ ⟨|e^{itX}|⟩ = ⟨1⟩ = 1 ,

where we used the inequality |⟨e^A⟩| ≤ ⟨|e^A|⟩. Once we have calculated the characteristic function we can extract the moments from the relation

m_n = (−i)^n p̂_X^{(n)}(0) .   (3.10)

Again, from the normalization of the pdf we have p̂_X(0) = 1. Cumulants can also be extracted from the characteristic function as follows:

c_n = (−i)^n (d^n/dt^n) log p̂_X(t) |_{t=0} .   (3.11)

³Our Fourier-transform conventions are: f̂(t) = ∫ dx f(x) e^{itx} and f(x) = ∫ (dt/2π) f̂(t) e^{−itx}.

It is useful to define the normalized or "dimensionless" cumulants (remember that the standard deviation σ sets the scale of the pdf):

c̃_n = c_n / σ^n .   (3.12)

The first two nontrivial dimensionless cumulants are called the skewness

ς ≡ c̃_3 = ⟨(x − m)³⟩ / σ³   (3.13)

and the kurtosis

κ ≡ c̃_4 = ⟨(x − m)⁴⟩ / σ⁴ − 3 .   (3.14)

The kurtosis is bounded from below: it is possible to prove that κ ≥ −2 for any pdf.
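A short numerical illustration (a sketch; the two test distributions are arbitrary choices): the exponential distribution has ς = 2 and κ = 6, while a fair-coin random variable saturates the bound at κ = −2.

```python
import numpy as np

rng = np.random.default_rng(1)

def skew_kurt(x):
    """Normalized cumulants (3.13)-(3.14) estimated from samples."""
    d = x - x.mean()
    s2 = np.mean(d**2)
    return np.mean(d**3) / s2**1.5, np.mean(d**4) / s2**2 - 3.0

print(skew_kurt(rng.exponential(size=1_000_000)))              # ~ (2, 6)
print(skew_kurt(rng.integers(0, 2, 1_000_000).astype(float)))  # ~ (0, -2)
```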

We will see that the Gaussian pdf has only m and σ nonzero, so higher cumulants, like the skewness (for asymmetrical pdfs) and the kurtosis (for symmetrical pdfs), measure the first deviations from gaussianity. The cumulants play the role of the connected autocorrelation functions.

3.2 Sums of iid random variables

We now consider independent identically distributed (iid) random variables of mean m and standard deviation σ. The pdf of the sum of two iid random variables X₁ + X₂ is given by the sum of the products p_{X₁}(x₁) p_{X₂}(x₂) over all values of x₁ and x₂ such that x₁ + x₂ = x:

p_{X₁+X₂}(x) = ∫ dx₁ dx₂ p_{X₁}(x₁) p_{X₂}(x₂) δ(x − x₁ − x₂) = ∫ dx₁ p_{X₁}(x₁) p_{X₂}(x − x₁) .   (3.15)

In other words, the pdf of the sum of iid random variables is given by convolution. In terms of the characteristic functions we simply have

p̂_{X₁+X₂}(t) = p̂_{X₁}(t) p̂_{X₂}(t) ,   (3.16)


since the Fourier transform of a convolution is just the product of the Fourier transforms. We also have

p_{aX+b}(x) = (1/a) p_X((x − b)/a)   (3.17)

or, in terms of the characteristic function,

p̂_{aX+b}(t) = e^{itb} p̂_X(at) .

From (3.16) we can deduce that cumulants sum up: c_{X₁+X₂} = c_{X₁} + c_{X₂}.
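Equation (3.15) can be checked directly on a grid (a sketch; the uniform test pdf and the step size are arbitrary): the convolution of two uniform pdfs on [0, 1] must give the triangular pdf on [0, 2].

```python
import numpy as np

# pdf of X1 + X2 by discretized convolution, eq. (3.15): two uniform
# pdfs on [0,1] convolve to the triangular pdf min(x, 2-x) on [0,2].
dx = 0.001
x = np.arange(0.0, 1.0, dx)
p = np.ones_like(x)                # uniform pdf on [0,1]

p_sum = np.convolve(p, p) * dx     # grid version of the integral in (3.15)
x_sum = np.arange(len(p_sum)) * dx

# maximum deviation from the exact triangular pdf is O(dx)
print(np.max(np.abs(p_sum - np.minimum(x_sum, 2.0 - x_sum))))
```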

3.2.1 Exact RG transformations

We now develop the RG theory for iid random variables [2, 3]. The random variables X_i are independent and thus non-interacting⁴. The coarse-graining is done by grouping the X_i in groups of two (the number b of grouped random variables is a scheme freedom) and summing them. The aim of the rescaling is to keep the linear size of the system constant, i.e. to keep σ² unchanged. This means that we have to rescale x so as to have constant variance, x → x/2^ν, where ν is a scaling exponent to be determined self-consistently.

The functional space of the pdf is the space of positive functions on the real line which satisfy

∫ dx p(x) = 1 ,   ∫ dx x p(x) = 0 ,   ∫ dx x² p(x) = σ² .   (3.18)

We will consider pdfs with finite moments. The requirements (3.18) and p(x) ≥ 0 fix theory space.

Using the relations (3.15) and (3.17) we can write the exact RG transformation as

Figure 3.1: Coarse-graining by convolution of neighboring iid random variables.

Rp(x) = 2^ν ∫ dy p(y) p(2^ν x − y) .   (3.19)

⁴What really makes the "structure of space" is the "structure of the interactions". We can imagine the iid random variables on a one-dimensional lattice, but this is only fictitious: what we are actually doing is field theory in zero dimensions.

We can determine the value of the exponent ν by requiring that the RG transformation respects the properties (3.18). We have

∫ dx Rp(x) = 2^ν ∫ dx dy p(y) p(2^ν x − y) ;

changing variable to u = 2^ν x − y (so that du = 2^ν dx) gives

∫ dx Rp(x) = ∫ du dy p(y) p(u) = 1 .

By the same steps we find

∫ dx x Rp(x) = 0 .

Since it is the variance that sets the scale, it is the condition on the variance that fixes the value of ν:

∫ dx x² Rp(x) = 2^ν ∫ dx dy x² p(y) p(2^ν x − y) = ∫ dx dy ((x + y)/2^ν)² p(y) p(x) = 2^{1−2ν} σ²   ⇒   ν = 1/2 .   (3.20)

This last relation tells us that Rσ² = 2σ², which could also have been derived from the fact that cumulants sum up. The exact RG transformation is

Rp(x) = √2 ∫ dy p(y) p(√2 x − y) .   (3.21)

If we were grouping b variables instead of two we would have found √2 → √b.

In terms of the characteristic function the exact RG transformation becomes

Rp̂(t) = [p̂(t/√2)]² .   (3.22)
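The transformation (3.21) is easy to iterate numerically (a sketch on a grid; the centered-exponential starting pdf and the grid parameters are arbitrary choices). Watching the third cumulant shrink by a factor √2 per step anticipates the eigenvalue λ₃ = 2^{−1/2} found below.

```python
import numpy as np

# Iterate the exact RG transformation (3.21) on a grid:
# Rp(x) = sqrt(2) * ∫ dy p(y) p(sqrt(2) x - y).
L, n = 20.0, 4001
x = np.linspace(-L, L, n)
dx = x[1] - x[0]

def R(p):
    conv = np.convolve(p, p, mode="same") * dx                   # (p * p)(x)
    return np.sqrt(2.0) * np.interp(np.sqrt(2.0) * x, x, conv)   # x -> sqrt(2) x

# start from a centered exponential: zero mean, unit variance, c3 = 2
p = np.where(x > -1.0, np.exp(-(x + 1.0)), 0.0)
for step in range(6):
    c3 = np.sum(x**3 * p) * dx   # third cumulant (zero mean, unit variance)
    print(step, c3)              # shrinks by sqrt(2) per step: lambda_3 = 2**(-1/2)
    p = R(p)
```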

3.2.2 Fixed point: Gaussian pdf

The first thing to do is to find the fixed-point pdf p∗(x), i.e. the solution of

Rp∗(x) = p∗(x) .   (3.23)

To solve (3.23) we use the Fourier representation (3.22) to obtain:

p̂∗(t) = [p̂∗(t/√2)]² .   (3.24)

Taking the logarithm of (3.24) and defining f(t) = log p̂∗(t) gives

f(t) = 2 f(t/√2) .   (3.25)

Equation (3.25) shows that f(t) is a homogeneous function with α = 2 and λ = 1/√2; thus f(t) = C t² and we find:

p̂∗(t) = e^{C t²} .   (3.26)

  • CHAPTER 3. PREDICTING THE UNCERTAIN 34

    The constant C can be fixed by imposing (3.18). Using (3.11) this is equiva-lent to impose p̂∗(0) = 1, p̂′∗(0) = 0 and p̂

    ′′∗(0) = −σ2, from this last relation

    we find:

    C = −1

    2σ2 . (3.27)

    If we reintroduce the mean we finally find:

    p̂G(t) = eiµt− 12 t

    2σ2 . (3.28)

In (3.28) we called the fixed-point solution Gaussian, since this is the name of the pdf we have found. Using the Gaussian integration formula

∫_{−∞}^{∞} dx e^{−ax²/2 + bx} = √(2π/a) e^{b²/2a} ,   (3.29)

we can Fourier-transform back (3.28):

p_G(x) = ∫ (dt/2π) p̂_G(t) e^{−itx} = ∫ (dt/2π) e^{−it(x−µ) − σ²t²/2} ,

to obtain:

p_G(x) = (1/√(2πσ²)) e^{−(x−µ)²/2σ²} .   (3.30)

Using (3.29) we can easily prove that ∫ dx p_G(x) = 1. The cumulative distribution function of the Gaussian distribution is:

F_G(x) = (1/√(2πσ²)) ∫_x^∞ du e^{−(u−µ)²/2σ²} = 1/2 − (1/2) Erf((x − µ)/√2σ) ,   (3.31)

where Erf(x) is the error function.


3.2.3 Linearizing the RG transformation: the CLT

To test the stability properties of the fixed point we linearize the RG transformation around the Gaussian pdf:

R(p_G + εh)(x) = √2 ∫ dy [p_G(y) + εh(y)][p_G(√2 x − y) + εh(√2 x − y)]
= Rp_G(x) + 2√2 ε ∫ dy p_G(y) h(√2 x − y) + O(ε²)
= p_G(x) + ε L_G h(x) + O(ε²) ,   (3.32)

where the linear RG operator L_G at the Gaussian pdf, defined in the last line, is the following:

L_G h(x) = (2/√π σ) ∫ dy e^{−y²/2σ²} h(√2 x − y) .   (3.33)

We need to study the RG eigenvalue problem:

L_G h_n(x) = λ_n h_n(x) ;   (3.34)

to do this we switch to Fourier space, where the stability operator acts as

L_G ĥ(t) = 2 e^{−σ²t²/4} ĥ(t/√2)   (3.35)

and the eigenvalue problem becomes:

λ_n ĥ_n(t) = 2 e^{−σ²t²/4} ĥ_n(t/√2) .   (3.36)

It is easy to check, by following the same steps used to solve the fixed-point equation, that the functions

ĥ_n(t) = e^{−σ²t²/2} (it)^n   (3.37)

solve equation (3.36) if we fix the eigenvalues to

λ_n = 2^{1−n/2} ,   (3.38)

for n = 0, 1, 2, 3, ....
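The eigenvalue relation (3.36)-(3.38) can be verified pointwise (a minimal sketch with σ = 1; the grid of t values is an arbitrary choice):

```python
import numpy as np

# Pointwise check of (3.36) with sigma = 1:
# 2 exp(-t^2/4) h_n(t/sqrt(2)) = 2^(1 - n/2) h_n(t),  h_n(t) = exp(-t^2/2) (it)^n.
t = np.linspace(-3.0, 3.0, 7)
for n in range(5):
    h = lambda t: np.exp(-t**2 / 2) * (1j * t)**n
    lhs = 2 * np.exp(-t**2 / 4) * h(t / np.sqrt(2))
    rhs = 2**(1 - n / 2) * h(t)
    print(n, np.allclose(lhs, rhs))   # True for every n
```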


The perturbations ĥ_0(t) and ĥ_1(t) are amplified by an RG transformation, since the respective eigenvalues λ_0 = 2 and λ_1 = √2 are bigger than one, and are called relevant; the direction ĥ_2(t) is marginal (λ_2 = 1), while all the others ĥ_3(t), ĥ_4(t), ... are suppressed (λ_3 = 1/√2, λ_4 = 1/2, ...) and are termed irrelevant.

In coordinate space the eigenfunctions (3.37) are given by the Chebyshev-Hermite polynomials, h_n(x) = σ^{−n} p_G(x) H_n(x/σ). The first few are:

H_0(x) = 1
H_1(x) = x
H_2(x) = x² − 1
H_3(x) = x³ − 3x
H_4(x) = x⁴ − 6x² + 3 ,   (3.39)

and in general we have:

H_n(x) = (−1)^n e^{x²/2} (d^n/dx^n) e^{−x²/2} .   (3.40)

These eigenfunctions describe the "tangent space" to the fixed-point Gaussian pdf, around which we can write:

p(x) = p_G(x) [1 + (ε₃/σ³) H₃(x/σ) + (ε₄/σ⁴) H₄(x/σ) + ...] ,

which is the first example of a perturbative expansion around a fixed point. What is the relation between the "couplings" ε_i and the cumulants c_i?

Not all eigen-perturbations are within our theory space. In fact one has:

∫ dx [p_G(x) + ε₀ h₀(x)] = ∫ dx p_G(x) + ε₀ ∫ dx h₀(x) = 1 + ε₀   ⇒   ε₀ = 0

and

∫ dx x [p_G(x) + ε₁ h₁(x)] = ∫ dx x p_G(x) + (ε₁/σ²) ∫ dx x² p_G(x) = ε₁   ⇒   ε₁ = 0 .

The marginal direction does not contribute either: a perturbation along h₂ would shift the variance, which is held fixed at σ² by (3.18), so ε₂ = 0. Thus the only two relevant directions are orthogonal to our theory space and the Gaussian fixed-point pdf attracts all other directions: the long-range collective behavior of any collection of iid random variables is described by a Gaussian! This is the central limit theorem and a manifestation of universality.

3.2.4 Convergence to the Gaussian

The CLT is valid only in the limit N → ∞, where N = 2^n is the number of random variables summed and n is the number of RG transformations performed. To find the finite-N corrections we consider a general pdf in the basin of attraction of the Gaussian pdf, with zero mean and with all the other cumulants finite. This has a characteristic function of the following general "cumulant expansion" form:

p̂(t) = exp[ Σ_{k=2}^∞ (c_k/k!) (it)^k ] ,

since c_0 = c_1 = 0. To study the convergence of a general pdf to the Gaussian pdf, we iterate the RG transformation n times:

R^n p̂(t) = [p̂(t/√(2^n))]^{2^n} = [p̂(t/√N)]^N
= exp[ N Σ_{k=2}^∞ (c_k/k!) (it/√N)^k ]
= exp[ −σ²t²/2 + Σ_{k=3}^∞ (c_k/k!) N^{1−k/2} (it)^k ]
= exp[ −σ²t²/2 + Σ_{k=3}^∞ (c_k^N/k!) (it)^k ] ,   (3.41)

where c_k^N = N^{1−k/2} c_k are the scale-dependent (running) cumulants (or couplings). We can write these "beta functions" as c_k^N = λ_k^n c_k, where the λ_k are the RG eigenvalues (3.38). We see again from (3.41) that in the limit N → ∞ the pdf p̂_N(t) = R^n p̂(t) converges to the Gaussian pdf.


Generally we are more interested in the behavior for large but finite N, the situation that we encounter in practice. In this regime we are already converging to the Gaussian pdf and thus we can assume the cumulants to be small (we are assuming they are all finite). If we fix σ = 1 and have c̃₃ ≪ 1 and c̃₄ ≪ 1, we can expand the exponential in (3.41). The terms can be arranged in powers of N^{−1/2}:

p̂_N(t) = e^{−t²/2} { 1 + (c̃₃/6√N) (it)³ + (c̃₄/24N) (it)⁴ + (c̃₃²/72N) (it)⁶ + O(N^{−3/2}) } .   (3.42)

In terms of the coordinate-space pdf we find:

p_N(x) = (e^{−x²/2}/√2π) { 1 + (1/√N) q_{1/2}(x) + (1/N) q_1(x) + O(N^{−3/2}) } ,   (3.43)

where the q_k(x) are polynomials depending on the normalized cumulants. Using (3.43) we can also calculate the cumulative distribution function (with F as defined in (3.3)):

F(x) = F_G(x) + (e^{−x²/2}/√2π) { (1/√N) Q_{1/2}(x) + (1/N) Q_1(x) + O(N^{−3/2}) } ,   (3.44)

where F_G(x) is given in (3.31) and the first two Q_k(x) polynomials are:

Q_{1/2}(x) = (ς/6)(x² − 1)
Q_1(x) = (ς²/72) x⁵ + (κ/24 − 5ς²/36) x³ + (5ς²/24 − κ/8) x ,   (3.45)

with ς ≡ c̃₃ and κ ≡ c̃₄ the skewness and the kurtosis as defined in (3.13) and (3.14).

Note that the convergence to the Gaussian takes place in the central part of the pdf; the tails converge only for N = ∞.
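As an example, here is a sketch (N, the threshold x and the exponential test distribution are arbitrary choices) comparing the empirical tail probability of a standardized sum of N centered exponentials with the plain Gaussian tail (3.31) and with the first correction in (3.44):

```python
import numpy as np
from scipy.special import erf

# Finite-N check of (3.44): tail probability of a standardized sum of N
# centered exponentials (skewness 2) vs. Gaussian tail plus the Q_1/2 term.
rng = np.random.default_rng(2)
N, samples, x = 16, 1_000_000, 1.5
s = (rng.exponential(size=(samples, N)).sum(axis=1) - N) / np.sqrt(N)

FG = 0.5 - 0.5 * erf(x / np.sqrt(2))       # Gaussian tail, eq. (3.31)
Q12 = (2.0 / 6.0) * (x**2 - 1)             # Q_1/2 with skewness = 2
corrected = FG + np.exp(-x**2 / 2) / np.sqrt(2 * np.pi) * Q12 / np.sqrt(N)

print(np.mean(s > x), FG, corrected)   # the correction moves FG towards the data
```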


3.2.5 Law of large numbers

The law of large numbers is obtained instead by imposing

∫ dx p(x) = 1 ,   ∫ dx x p(x) = m ,

and working in the corresponding functional space. In this case the direction h₁(x) belongs to theory space and a pdf is attracted towards p_m(x) = δ(x − m). You can work out the details as an exercise.

3.2.6 Stable distributions

If we drop the requirement that the pdf has finite moments, in particular finite variance, then the exponent ν in the RG transformation (3.19) is not determined. In fact we are considering a different theory space for each different value of ν, and in each of these spaces we are interested in finding fixed points and in studying the RG flow around them.

In Fourier space the RG transformation (3.19) becomes:

Rp̂(t) = [p̂(t/2^ν)]² ,   (3.46)

and the fixed-point equation is now:

p̂∗(t) = [p̂∗(t/2^ν)]² .   (3.47)

Proceeding as we did before we find the general form:

p̂∗(t) = e^{−c|t|^α} ,   α = 1/ν .   (3.48)

To have an everywhere positive pdf we must demand 0 < α ≤ 2. This general class of fixed-point pdfs are called stable distributions; they are described by the following characteristic function:

p̂_{Lα}(t) = e^{iµt − c|t|^α} ,   (3.49)


Figure 3.2: Lévy probability density functions for (from top) α = 1/2, 1, 3/2, 2. Note that for smaller values of α the pdf is more peaked around zero but has progressively thicker tails.

with also c ≥ 0 and µ real. These distributions are also called Lévy distributions, of which the Gaussian distribution is the particular case α = 2 and c = σ²/2. For a generic value of α it is not usually possible to calculate the inverse Fourier transform analytically.

A case where this is possible is α = 1, where we recover the Cauchy (or Lorentzian) distribution:

p_{L1}(x) = A / (π²A² + (x − µ)²) ,   (3.50)

where c = πA.

Note that all the Lévy distributions with 0 < α < 2 have infinite variance: they are scale-free distributions, in the sense that there is no "characteristic scale" like the one set by a finite variance. For the distributions with 0 < α < 1 not even the mean is defined.

One can prove a generalized central limit theorem for Lévy distributions along the lines of our RG proof of the standard CLT.
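Stability is easy to see numerically (a sketch; the sample sizes are arbitrary): the sample mean of N standard Cauchy variables is again standard Cauchy for every N, so it never concentrates.

```python
import numpy as np

# Stability of the Cauchy (alpha = 1) fixed point: the sample mean of N
# standard Cauchy variables is again standard Cauchy, for any N.
rng = np.random.default_rng(3)
for N in (1, 10, 10_000):
    means = rng.standard_cauchy(size=(100_000, N)).mean(axis=1)
    # for a standard Cauchy the median of |X| is exactly 1
    print(N, np.median(np.abs(means)))   # stays near 1: no law of large numbers
```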


3.3 Entropy and information

The entropy, or information, associated to a random variable described by a pdf p(x) is the following:

I[p] = −∫ dx p(x) log p(x) .   (3.51)

The coarse-graining procedure burns information: the RG flow drives towards the pdf which maximizes the functional (3.51) within the functional space specified by (3.18).

In particular, a fixed-point pdf must be an extremum of (3.51) subject to the constraints (3.18). We can implement these constraints by employing Lagrange multipliers:

δ{ I[p] + α ∫ dx p(x) + β ∫ dx x p(x) + γ ∫ dx x² p(x) } = 0 .   (3.52)

Equation (3.52) is solved by:

p(x) = e^{−1+α+βx+γx²} ,   (3.53)

where

e^{−1+α} = 1/√(2πσ²) ,   β = 0 ,   γ = −1/2σ² ,

which gives back the Gaussian pdf (3.30) as expected. It is easy to calculate the information of the Gaussian:

I[p_G] = 1/2 + (1/2) log 2π + log σ = 1.41894... + log σ .

Can we prove that I[Rp] ≥ I[p]?
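A numerical experiment suggests the answer (a sketch reusing the grid RG step of section 3.2.1; the centered-exponential starting pdf and the grid are arbitrary choices):

```python
import numpy as np

# Check that the entropy/information (3.51) grows under the exact RG
# step (3.21), approaching the Gaussian value 1.41894... for sigma = 1.
x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]

def info(p):
    q = np.where(p > 0, p, 1.0)        # p log p -> 0 where p = 0
    return -np.sum(p * np.log(q)) * dx

def R(p):                              # exact RG step, eq. (3.21)
    conv = np.convolve(p, p, mode="same") * dx
    return np.sqrt(2.0) * np.interp(np.sqrt(2.0) * x, x, conv)

p = np.where(x > -1.0, np.exp(-(x + 1.0)), 0.0)   # centered exponential, I[p] = 1
for step in range(5):
    print(step, info(p))               # increases towards 1.41894...
    p = R(p)
```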

3.4 The effective action and Cramér's theorem

The moment generating function of a random variable φ is⁵:

Z(J) = ⟨e^{Jφ}⟩ ,   Z(0) = 1 ,   Z^{(n)}(0) = ⟨φ^n⟩ ;

⁵We switch notation X → φ and t → J.

  • CHAPTER 3. PREDICTING THE UNCERTAIN 42

    the cumulant distribution function is the logarithm of the moment generatingfunction:

    W (J) = logZ(J) W (0) = 0

    W ′(0) = Z ′(0) W ′′(0) = Z ′′(0)− Z ′(0)2 .We can prove that W (J) is convex:

    W ′′(J) =Z(J)Z ′′(J)− Z ′(J)2

    Z(J)2=

    eJφ〉 〈

    φ2eJφ〉

    −〈

    φeJφ〉2

    〈eJφ〉2,

then using the Cauchy-Schwarz inequality ⟨φψ⟩² ≤ ⟨φ²⟩⟨ψ²⟩ in the form ⟨φe^{Jφ}⟩² ≤ ⟨e^{Jφ}⟩⟨φ²e^{Jφ}⟩ gives W″(J) ≥ 0 for all J. Thus one can define the so-called Cramér function, or rate function, as the Legendre transform of W(J):

Γ(ϕ) = sup_J [Jϕ − W(J)] .

Γ(ϕ) is also convex, and we call J_ϕ the solution of W′(J) = ϕ, so that Γ(ϕ) = J_ϕϕ − W(J_ϕ). We will call Γ(ϕ) the effective action.

The fundamental result of the theory of large deviations, due to Cramér, states that the probability of a deviation from the law of large numbers is given by:

P( (1/N) Σ_{i=1}^N φ_i > ϕ ) → e^{−NΓ(ϕ)}   for N → ∞ ,

and similarly for (1/N) Σ_{i=1}^N φ_i < ϕ. For a proof see [4]. We now look at two examples.

It is easy to find the rate function for a Gaussian random variable. Starting from the moment generating function (with m = 0):

Z(J) = (1/√(2πσ²)) ∫ dφ e^{−φ²/2σ²} e^{Jφ} = e^{σ²J²/2} ,

which gives W(J) = σ²J²/2. The average field is ϕ_J = W′(J) = Jσ², from which we obtain the current J_ϕ = ϕ/σ². The rate function is then

Γ(ϕ) = J_ϕϕ − W(J_ϕ) = ϕ²/σ² − (1/2) ϕ²/σ² = (1/2σ²)(ϕ − m)² .


Figure 3.3: Cramér's function, i.e. the effective action, for a discrete Bernoulli random variable, compared to a simulation.

Note that for a Gaussian pdf Γ(ϕ) = −log p(ϕ) − (1/2) log(2πσ²). What is the effective action of Lévy random variables?

The Bernoulli discrete random variable φ assumes the values 0, 1 with probability p = 1 − p = 1/2. The moment generating function is simply:

Z(J) = (1 + e^J)/2 ,

while the cumulant generating function is:

W(J) = −log 2 + log(1 + e^J) .

To perform the Legendre transform we solve W′(J) = ϕ:

W′(J) = e^J/(1 + e^J) = ϕ_J   ⇒   J_ϕ = log(ϕ/(1 − ϕ)) .

The effective action is:

Γ(ϕ) = J_ϕϕ − W(J_ϕ) = ϕ log(ϕ/(1 − ϕ)) + log 2 − log(1 + ϕ/(1 − ϕ)) = log 2 + ϕ log ϕ + (1 − ϕ) log(1 − ϕ) .


Cramér's theorem predicts:

(1/N) log P( (1/N) Σ_{i=1}^N φ_i > ϕ ) → −log 2 − ϕ log ϕ − (1 − ϕ) log(1 − ϕ) .

It is nice to compare this relation with a simulation; this is shown in Figure 3.3. For ϕ close to the mean we recover the CLT:

Γ(ϕ) = 2(ϕ − 1/2)² + O((ϕ − 1/2)⁴) = (1/2σ²)(ϕ − m)² + O((ϕ − m)⁴) ,

where we used m = p = 1/2 and σ² = p(1 − p) = 1/4, valid for a Bernoulli variable.
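A simulation of the kind behind Figure 3.3 can be set up in a few lines (a sketch; N, ϕ and the sample size are arbitrary choices, and the subexponential 1/√N prefactor of e^{−NΓ} makes the empirical rate approach Γ(ϕ) only slowly):

```python
import numpy as np

# Empirical large-deviation rate -log P(mean > phi) / N for fair Bernoulli
# variables, compared with Cramer's prediction Gamma(phi).
rng = np.random.default_rng(4)
phi = 0.6
gamma = np.log(2) + phi*np.log(phi) + (1 - phi)*np.log(1 - phi)  # = 0.0201...

for N in (50, 100, 200, 400):
    means = rng.binomial(N, 0.5, size=10_000_000) / N
    rate = -np.log(np.mean(means > phi)) / N
    print(N, rate)                     # decreases slowly towards gamma
print("Gamma(0.6) =", gamma)
```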

Bibliography

[1] B.V. Gnedenko and A.N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables, Addison-Wesley, Cambridge, MA, 1954.

[2] G. Jona-Lasinio, Renormalization group and probability theory, Physics Reports (2001) 1-31.

[3] P. Castiglione, M. Falcioni, A. Lesne and A. Vulpiani, Chaos and Coarse Graining in Statistical Mechanics, Cambridge University Press, 2008.

[4] R. Ellis, Entropy, Large Deviations, and Statistical Mechanics, Springer-Verlag, 2000.