summer school connections between probability and ...ledoux/jerusalem.pdf · ous mathematical...

Summer School

Connections betweenProbability and Geometric

Functional Analysis

Jerusalem, 14-19 June, 2005

DEVIATION INEQUALITIES

ON LARGEST EIGENVALUES

M. Ledoux

University of Toulouse, France

In these notes, we survey developments on the asymptotic behaviorof the largest eigenvalues of random matrix and random growth mod-els, and describe the corresponding known non-asymptotic exponentialbounds. We then discuss some elementary and accessible tools frommeasure concentration and functional analysis to reach some of thesequantitative inequalities at the correct small deviation rate of the fluc-tuation theorems. Results in this direction are rather fragmentary. Forsimplicity, we mostly restrict ourselves to Gaussian models.

1

INTRODUCTION

In the recent years, important developments took place in the analysis of the spectrumof large random matrices and of various random growth models. In particular, univer-sality questions at the edge of the spectrum has been conjectured, and settled, for anumber of apparently disconnected examples.

Let XN = (XNij )1≤i,j≤N be a complex Hermitian matrix such that the entries on

and above the diagonal are independent complex (real on the diagonal) centered Gaus-sian random variables with variance σ2. Denote by λN1 , . . . , λ

NN the real eigenvalues of

XN . Under the normalization σ2 = 14N of the variance, the famous Wigner theorem

indicates that the spectral measure 1N

∑Ni=1 δλN

iconverges to the semicircle law, sup-

ported on (−1,+1). Furthermore, the largest eigenvalue λNmax converges almost surelyto 1, the right-end point of the support of the semicircle law. As one main achievementin the recent developments of random matrix theory, it has been proved in the earlynineties by P. Forrester [Fo1] and C. Tracy and H. Widom [T-W1] that the fluctuationsof the largest eigenvalue are given by

N2/3(λNmax − 1) → F

where F is the so-called Tracy-Widom distribution. A similar conclusion holds for realGaussian matrices, and the result has been extended by A. Soshnikov [So1] to classes ofreal or complex matrices with independent entries under suitable moment assumptions.

In the striking contribution [B-D-J], J. Baik, P. Deift and K. Johansson proved in1999 that the Tracy-Widom distribution governs the fluctuation of an apparently com-pletely disconnected model, namely the length of the longest increasing subsequencein a random partition. Denote indeed by Ln the length of the longest increasing sub-sequence in a random permutation chosen uniformly in the symmetric group over nelements. Then, as shown in [B-D-J],

12n1/6

(Ln − 2

√n

)→ F

weakly, with F the Tracy-Widom distribution. (Note that the normalization is givenby the third power of the mean order 2

√n, as it would be the case if we replace λNmax

by NλNmax in the random matrix model.)

2

Since then, universality of the Tracy-Widom distribution is conjectured for a num-ber of models, and has been settled recently for some specific ones, including cor-ner growth models, last-passage times in directed percolation, exclusion processes,Plancherel measure, random Young tableaux... For example, let w(i, j), i, j ∈ N, beindependent exponential or geometric random variables. For M ≥ N ≥ 1, set

W = W (M,N) = maxπ

∑(i,j)∈π

w(i, j)

where the maximum runs over all up/right paths π in N2 from (1, 1) to (M,N). Therandom growth function W may be interpreted as a directed last-passage time in per-colation. K. Johansson [Joha1] showed that, for every c ≥ 1, up to some normalizationfactor,

1N1/3

(W

([cN ], N

)− ωN

)→ F

weakly, where again F is the Tracy-Widom distribution (and ω the mean parameter).

These attractive results, and the numerous recent developments around them (cf.the review papers [Baik2], [Joha4], [T-W4]...) emphasize the unusual rate (mean)1/3

and the central role of the new type of distribution F in the fluctuations of largesteigenvalues and random growth models. The analysis of these models is actually madepossible by a common determinantal point process structure and asymptotics of orthog-onal polynomials for which sophisticated tools from combinatorics, complex analysis,integrable systems and probability theory have been developed. This determinantalstructure is also the key to the study of the spacings between the eigenvalues, a topicof major interest in the recent developments of random matrix theory which led inparticular to striking conjectures in connection with the Riemann zeta function (cf.[De], [Fo2], [Fy], [Ko], [Meh]...).

In these notes, we will be concerned with the simple question of non-asymptoticexponential deviation inequalities at the correct fluctuation rate in some of the preced-ing limit theorems. We only concentrate on the order of growth and do not discuss thelimiting distributions. For example, in the preceding setting of the largest eigenvalueλNmax of random matrices, we would be interested to find, for fixed N ≥ 1 and ε > 0,(upper-) estimates on

P({λNmax ≥ 1 + ε}

)and P

({λNmax ≤ 1− ε}

)which fit the weak convergence rate towards the Tracy-Widom distribution. In a sense,this purpose is similar to the Gaussian tail inequalities for sums of independent ran-dom variables in the context of the classical central limit theorem. Several results,usually concerned with large and moderate deviation asymptotics and convergence ofmoments, deal with this question in the literature. However, not all of them are eas-ily accessible, and usually require a rather heavy analysis, connected with stationary

3

phase asymptotics of contour integrals or non-classical analytical schemes of the the-ory of integrable systems such as Riemann-Hilbert asymptotic methods. In any case,the conclusions so far only deal with rather restricted classes of models. For example,in the random matrix context, only (complex) Gaussian entries allow at this point forsatisfactory deviation inequalities at the appropriate rate. Directed percolation modelshave been answered only for geometric or exponential weights.

The aim of these notes is to provide a few elementary tools, some of them offunctional analytic flavour, to reach some of these deviation inequalities. (We willonly be concerned with upper bounds.) A first attempt in this direction deals withthe modern tools of measure concentration. Measure concentration typically producesGaussian bounds of the type

P({∣∣λNmax − E(λNmax)

∣∣ ≥ r})

≤ C e−Nr2/C , r ≥ 0,

for some C > 0 independent of N . These inequalities are rather robust and holdfor large families of distributions. While they describe the correct large deviations,they however do not reflect the small deviations at the rate (mean)1/3 of the Tracy-Widom theorem. Further functional tools (if any) would thus be necessary, and sucha program was actually advertised by S. Szarek in [Da-S]. We present here a fewarguments of possible usefulness to this task, relying on Markov operator ideas such ashypercontractivity and integration by parts. In particular, we try to avoid saddle pointanalysis on Laplace integrals for orthogonal polynomials which are at the root of theasymptotic results. We however still rely on determinantal and orthogonal polynomialrepresentations of the random matrix models. Certainly, suitable bounds on orthogonalpolynomials might supply for most of what is necessary to our purpose. Our first wishwas actually to try a few abstract and (hopefully) general arguments to tackle someof these questions in the hope of extending some conclusions to more general models.The various conclusions from this particular viewpoint are however far from complete,not always optimal, and do not really extend to new examples of interest. It is thehope of the future research that new tools may answer in a more satisfactory way someof these questions.

The first chapter describes, in the particular example of the Gaussian UnitaryEnsemble, the fundamental determinantal structure of the eigenvalue distribution andthe orthogonal polynomial method which allow for the fluctuation and large deviationasymptotics of the top eigenvalues of random matrix and random growth models. Inthe second chapter, we present the known exponential deviation inequalities which maybe drawn from the asymptotic theory and technology. Chapter 3 addresses the measureconcentration tools in this setting, and discusses both their usefulness and limitations.In Chapter 4, the tool of hypercontractivity of Markov operators is introduced to thetask of deviation and variance inequalities at the Tracy-Widom rate. The last chapterpresents some moment recurrence equations which may be obtained from integrationby parts for Markov operators, and discusses their interest in deviation inequalitiesboth above and below the limiting expected mean.

4

These notes are only educational and do not present any new result. They moreoverfocus on a very particular aspect of random matrix theory, ignoring some main develop-ments and achievements. In particular, references are far from exhaustive. Instead, wetry to refer to some general references where more complete expositions and pointersto the literature may be found. We apologize for all the omissions and inaccuracies inthis respect. In connection with these notes, let us thus mention, among others, thebook [Meh] by M. L. Mehta which is a classical reference on the main random ma-trix ensembles from the mathematical physics point of view. It contains in particularnumerous formulas on the eigenvalue densities, their correlation functions etc. Thevery recent third edition presents in addition some of the latest developments on theasymptotic behaviors of eigenvalues of random matrices. The monograph [De] by P.Deift discusses orthogonal polynomial ensembles and presents an introduction to theRiemann-Hilbert asymptotic method. P. Forrester [Fo] extensively describes the vari-ous mathematical physics models of random matrices and their relations to integrablesystems. The survey paper by Z. D. Bai [Bai] offers a complete account on the spectralanalysis of large dimensional random matrices for general classes of Wigner matrices bythe moment method and the Stieltjes transform. The short reviews [Joha4], [T-W4],[Baik2] provide concise presentations of some main recent achievements. The lectures[Fy] by Y. Fyodorov are an introduction to the statistical properties of eigenvalues oflarge random Hermitian matrices, and treat in particular the paradigmatic example ofthe Gaussian Unitary Ensemble (much in the spirit of these notes). Finally, the re-cent nice and complete survey on orthogonal polynomial ensembles by W. Konig [Ko]achieves an accessible and inspiring account to some of these important developments.More references may be downloaded from the preceding ones.

Thanks are due to G. Schechtman and T. Szankowski for their invitation to thissummer school, and to all the participants for their interest in these lectures.

5

1. ASYMPTOTIC BEHAVIORS

In this first chapter, we briefly present some basic facts about the asymptotic analysisof the largest eigenvalues of random matrix and random growth models. For simplic-ity, we restrict ourselves to some specific models (mostly the Hermite and MeixnerEnsembles) for which complete descriptions are available. We follow the recent litera-ture on the subject. In the particular example of the Gaussian Unitary Ensemble, wefully examine the basic determinantal structure of the correlation functions and theorthogonal polynomial method. We further discuss Coulomb gas and random growthfunctions, as well as large deviation asymptotics.

1.1. The largest eigenvalue of the Gaussian Unitary Ensemble

One main example of interest throughout these notes will be the so-called GaussianUnitary Ensemble (GUE). This example is actually representative of a whole family ofmodels. Consider, for each integerN ≥ 1, X = XN = (XN

ij )1≤i,j≤N a N×N selfadjointcentered Gaussian random matrix with variance σ2. By this, we mean thatX is aN×NHermitian matrix such that the entries above the diagonal are independent complex(real on the diagonal) Gaussian random variables with mean zero and variance σ2 (thereal and imaginary parts are independent centered Gaussian variables with varianceσ2/2). Equivalently, the random matrix X is distributed according to the probabilitydistribution

P(dX) =1Z

exp(− Tr(X2)/2σ2

)dX (1.1)

on the space HN∼= RN

2of N ×N Hermitian matrices where

dX =∏

1≤i≤N

dXii

∏1≤i<j≤N

dRe (Xij)d Im (Xij)

is Lebesgue measure on HN and Z = ZN the normalizing constant. This probabilitymeasure is invariant under the action of the unitary group on HN in the sense thatUXU∗ has the same law as X for each unitary element U of HN . The random matrix

6

X is then said to be element of the Gaussian Unitary Ensemble (GUE) (“ensemble”for probability distribution).

The real case is known as the Gaussian Orthogonal Ensemble (GOE) defined bya real symmetric random matrix X = XN = (XN

ij )1≤i,j≤N such that the entries XNij ,

1 ≤ i ≤ j ≤ N , are independent centered real-valued Gaussian random variables withvariance σ2 (2σ2 on the diagonal). Equivalently, the distribution of X on the space SNof N ×N symmetric matrices is given by

P(dX) =1Z

exp(− Tr(X2)/4σ2

)dX (1.2)

(where now dX is Lebesgue measure on SN ). This distribution is invariant by theorthogonal group.

For such a symmetric or Hermitian random matrix X = XN , denote by λN1 , . . . , λNN

its (real) eigenvalues.

It is a classical result due to E. Wigner [Wig] that, almost surely,

1N

N∑i=1

δλNi→ ν (1.3)

in distribution as σ2 ∼ 14N , N → ∞, where ν is the semicircle law with density

2π (1 − x2)1/2 with respect to Lebesgue measure on (−1,+1). This result has beenextended, on the one hand, to large classes of both real (symmetric) and complex(Hermitian) random matrices with non-Gaussian independent (subject to the symmetrycondition) entries, called Wigner matrices, under the variance normalization σ2 =E(|Xij |2) ∼ 1

4N , i < j. The basic techniques include moment methods, to show theconvergence of

1N

E(Tr

((XN )

p))to the p-moment (p ∈ N) of the semicircle law, or the Stieltjes transform (a kind ofmoment generating function) method. Another point of view on the Stieltjes transformis provided by the free probability calculus ([Vo], [V-D-N], [H-P], [Bi]...). In the partic-ular example of the GUE, simple orthogonal polynomial properties may be used (seebelow). Actually, all these arguments first establish convergence of the mean spectralmeasure

µN = E(

1N

N∑i=1

δλNi

). (1.4)

This convergence has been improved to the almost sure statement (1.3) in [Ar]. Werefer to the paper [Bai] by Z. D. Bai for a complete account on spectral distributions oflarge Wigner matrices not addressed here. On the other hand, Wigner’s theorem hasbeen extended to orthogonal or unitary invariant ensembles of the type (1.1) or (1.2)where X2 is replaced (by the functional calculus) by v(X) for some suitable function

7

v : R → R. The main tool in this case is the Stieltjes transform, and the limitingspectral distribution (or equilibrium measure, cf. [De], [H-P], [S-T]...) then dependson (the potential) v. The quadratic potential is the only one leading to independententries Xij , 1 ≤ i ≤ j ≤ N , in the matrix X with law (1.1) or (1.2).

It is also well-known that in the GUE and GOE models, as well as in the moregeneral setting of Wigner matrices (cf. [Bai]), under suitable moment hypotheses, thelargest eigenvalue λNmax = max1≤i≤N λ

Ni converges almost surely, as σ2 = 1

4N , to theright-end point of the support of the semicircle law, that is 1 in the normalization chosenhere. (By symmetry, the smallest eigenvalue converges to −1. The result extends to thek-th extremal eigenvalues for every fixed k.) For the orthogonal and unitary invariantensembles, the convergence is towards the right-end point of the compact support ofthe limiting spectral distribution (cf. [De]).

As one of the main recent achievements of the theory of random matrices, it hasbeen shown by P. Forrester [Fo1] (in a mathematical physics language) and C. Tracyand H. Widom [T-W1] that the fluctuations of the largest eigenvalue λNmax of a GUErandom matrix X = XN with σ2 = 1

4N around its expected value 1 takes place at therate N2/3. More precisely,

N2/3(λNmax − 1) → FGUE (1.5)

weakly where FGUE is the so-called (GUE) Tracy-Widom distribution. Note that thenormalization N2/3 may be somehow guessed from the Wigner theorem since, for ε > 0small,

Card {1 ≤ i ≤ N ;λNi > 1− ε} ∼ N ν((1− ε, 1]

)∼ Nε3/2

so that for ε of the order of N−2/3 the probability P({λNmax ≤ 1 − ε}) should bestabilized. The new distribution FGUE occurs as a Fredholm determinant

FGUE(s) = det([Id−KAi]L2(s,∞)

), s ∈ R, (1.6)

of the integral operator associated to the Airy kernel KAi, as a limit in this regime ofthe Hermite kernel using Plancherel-Rotach orthogonal polynomial asymptotics (seebelow). C. Tracy and H. Widom [T-W1] were actually able to provide an alternatedescription of this new distribution FGUE in terms of some differential equation as

FGUE(s) = exp(−

∫ ∞

2s

(x− 2s)u(x)2dx), s ∈ R, (1.7)

where u(x) is the solution of the Painleve II equation u′′ = 2u3 + xu with the asymp-totics u(x) ∼ 1

2√πx1/4 e−

23x

3/2as x → ∞. Similar conclusions hold for the Gaussian

Orthogonal Ensemble (GOE) with a related limiting distribution FGOE of the Tracy-Widom type [T-W2]. Random matrix theory is also concerned sometimes with quater-nionic entries leading to the Gaussian Simplectic Ensemble (GSE), cf. [Meh], [T-W2].

8

A few caracteristics of the distribution FGUE are known. It is non-centered, with amean around −.879, and its respective behaviors at ±∞ are given by

C−1 e−Cs3≤ FGUE(−s) ≤ C e−s

3/C (1.8)

andC−1 e−Cs

3/2≤ 1− FGUE(s) ≤ C e−s

3/2/C (1.9)

for s large and C numerical (cf. e.g. [Au], [John], [L-M-R]...)

As already emphasized in the introduction, the Tracy-Widom distributions actuallyappeared recently in a number of apparently disconnected problems, from the length ofthe longest increasing subsequence in a random permutation, to corner growth models,last-passage times in oriented percolation, exclusion processes, Plancherel measure,random Young tableaux etc, cf. [Joha4], [T-W4], [Baik2], [Ko]... The Tracy-Widomdistributions are conjectured to be the universal limiting laws for this type of models,with a common rate (mean)1/3 (in contrast with the (mean)1/2 rate of the classicalcentral limit theorem).

The fluctuation result (1.5) has been extended by A. Soshnikov in the strickingcontribution [So1] to Wigner matrices X = XN with real or complex non-Gaussianindependent entries with variance σ2 = E(|Xij |2) = 1

4N and a Gaussian control of themoments E(|Xij |2p) ≤ (Cp)p, p ∈ N, 1 ≤ i < j ≤ N . In particular, the assumptionscover the case of matrices X = (Xij/2

√N)1≤i,j≤N where the Xij ’s, i ≤ j, are indepen-

dent symmetric Bernoulli variables. This is one extremely rare case so far for whichuniversality of the Tracy-Widom distributions has been fully justified. Interestinglyenough, one important aspect of Soshnikov’s remarkable proof is that it is actuallydeduced from the GUE or GOE cases by a moment approximation argument (andnot directly from the initial matrix distribution). In another direction, asymptoticsof orthogonal polynomials have been deeply investigated to extend the GUE fluctua-tions to large classes of unitary invariant ensembles. Depending on the structure ofthe underlying orthogonal polynomials, the proofs can require rather deep argumentsinvolving the steepest descent/stationary phase method for Riemann-Hilbert problems(cf. [De], [Ku], [Baik2], [B-K-ML-M]...). For a strategy based on 1/n-expansion inunitary invariant random matrix ensembles avoiding the Riemann-Hilbert analysis, see[P-S]. Further developments are still in progress.

1.2 Determinantal representations

The analysis of the GUE, and more general unitary invariant ensembles, is made pos-sible by the determinantal representation of the eigenvalue distribution as a Coulombgas and the use of orthogonal polynomials. This determinantal point process repre-sentation is the key towards the asymptotics results on eigenvalues of large randommatrices, both inside the bulk (spacing between the eigenvalues) and at the edge of the

9

spectrum. We follow below the classical literature on the subject [Meh], [De], [Fo2],[P-L]... to which we refer for further details.

Keeping the GUE example, by unitary invariance of the ensemble (1.1) and theJacobian change of variables formula, the distribution of the eigenvalues λN1 ≤ · · · ≤ λNNof X = XN on the Weyl chamber E = {x ∈ RN ;x1 < · · · < xN} may be shown to begiven by

1Z

∆N (x)2N∏i=1

dµ(xi/σ) (1.10)

where∆N = ∆N (x) =

∏1≤i<j≤N

(xj − xi)

is the Vandermonde determinant, dµ(x) = e−x2/2 dx√

2πthe standard normal distribu-

tion on R and Z = ZN the normalization factor. We actually extend the probabilitydistribution (1.10) to the whole of RN by symmetry under permutation of the coor-dinates, and thus speak, with some abuse, of the joint distribution of the eigenvalues(λN1 , . . . , λ

NN ) as a random vector in RN .

It is on the basis of the representation (1.10) that the so-called orthogonal poly-nomial method may be developed. Denote by P`, ` ∈ N, the normalized Hermitepolynomials with respect to µ, which form an orthonormal basis of L2(µ). Since, foreach `, P` is a polynomial function of degree `, elementary manipulations on rows orcolumns show that the Vandermonde determinant ∆N (x) is equal, up to a constantdepending on N , to

DN = DN (x) = det(P`−1(xk)

)1≤k,`≤N .

The following lemma is then a useful tool in the study of the correlation functions.It is a simple consequence of the definition of the determinant together with Fubini’stheorem.

Lemma 1.1. On some measure space (S,S,m), let ϕi, ψj , i, j = 1, . . . , N , be square

integrable functions. Then

∫SN

det(ϕi(xj)

)1≤i,j≤N det

(ψi(xj)

)1≤i,j≤N

N∏k=1

dm(xk)

= N ! det( ∫

S

ϕiψjdm

)1≤i,j≤N

.

Replacing thus ∆N by DN in (1.10), a first consequence of Lemma 1.1 applied to

10

ϕi = ψj = P`−1 and dm = 1(−∞,t/σ]dµ is that, for every t ∈ R,

P({λNmax ≤ t}

)=

1Z ′

∫RN

DN (x)2N∏k=1

dm(xk)

= det(〈P`−1, Pk−1〉L2(]−∞,t/σ],dµ)

)1≤k,`≤N

= det(Id− 〈P`−1, Pk−1〉L2((t/σ,∞),dµ)

)1≤k,`≤N

(1.11)

where Z ′ =∫

RN DN (x)2∏Nk=1 dµ(xk) and 〈· , ·〉L2(A,dµ) is the scalar product in the

Hilbert space L2(A, dµ), A ⊂ R.

On the basis of Lemma 1.1 and the orthogonality properties of the polynomials P`,the eigenvalue vector (λN1 , . . . , λ

NN ) may be shown to have determinantal correlation

functions in terms of the (Hermite) kernel

KN (x, y) =N−1∑`=0

P`(x)P`(y), x, y ∈ R. (1.12)

The following statement provides such a description.

Proposition 1.2. For any bounded measurable function f : R → R,

E( N∏i=1

[1 + f(λNi )

])

=N∑r=0

1r!

∫Rr

r∏i=1

f(σxi) det(KN (xi, xj)

)1≤i,j≤rdµ(x1) · · · dµ(xr).

Proof. Starting from the eigenvalue distribution (1.10), we have

E( N∏i=1

[1 + f(λNi )

])=

1Z ′

∫RN

N∏i=1

[1 + f(σxi)

]DN (x)2dµ(x1) · · · dµ(xN )

where, as above, Z ′ =∫

RN DN (x)2dµ(x1) · · · dµ(xN ). By Lemma 1.1, Z ′ = N ! whilesimilarly∫

RN

N∏i=1

[1 + f(σxi)

]DN (x)2dµ(x1) · · · dµ(xN )

= N ! det(〈(1 + g)P`−1, Pk−1〉L2(µ)

)1≤k,`≤N

= N ! det(Id + 〈P`−1, Pk−1〉L2(gdµ)

)1≤k,`≤N

11

where we set g(x) = f(σx), x ∈ R. Hence,

E( N∏i=1

[1 + f(λNi )

])= det

(Id + 〈P`−1, Pk−1〉L2(gdµ)

)1≤k,`≤N

.

Now, the latter is equal to

N∑r=0

1r!

N∑`1,...,`r=1

det(〈Pì−1, P`j−1〉L2(gdµ)

)1≤i,j≤r

,

and thus, by Lemma 1.1 again, also to

N∑r=0

1r!

N∑`1,...,`r=1

1r!

∫Rr

det(f(σxj)Pì−1(xj)

)1≤i,j≤r

× det(Pì−1(xj)

)1≤i,j≤rdµ(x1) · · · dµ(xr).

By the Cauchy-Binet formula, this amounts to

N∑r=0

1r!

∫Rr

r∏i=1

f(σxi)det(KN (xi, xj)

)1≤i,j≤rdµ(x1) · · · dµ(xr)

which is the announced claim.

What Proposition 1.2 (more precisely its immediate extension to the computationof E

( ∏Ni=1[1 + fi(λNi )]

)for bounded measurable functions fi : R → R, i = 1, . . . , N)

puts forward is the fact that the distribution of the eigenvalues, and its marginals, arecompletely determined by the kernel KN of (1.12) In particular, replacing f by εf inProposition 1.2 and letting ε→ 0, the mean spectral measure µN of (1.4) is given, forevery bounded measurable function f , by

E(

1N

N∑i=1

f(λNi ))

=∫

Rf(σx)

1N

N−1∑`=0

P 2` dµ. (1.13)

Choosing f = −1(t,∞) in Proposition 1.2 shows at the other end that the distributionof the largest eigenvalue λNmax may be expressed by

P({λNmax ≤ t}

)=

N∑r=0

(−1)r

r!

∫(t/σ,∞)r

det(KN (xi, xj)

)1≤i,j≤rdµ(x1) · · · dµ(xr), t ∈ R.

(1.14)

This identity emphasizes the distribution of the largest eigenvalue λNmax as the Fredholmdeterminant of the (finite rank) operator

ϕ 7→∫ ∞

t/σ

ϕ(y)KN (·, y)dµ(y)

12

with kernel KN . That this expression be called a determinant is justified in particularby (1.11) (cf. e.g. [Du-S] or [G-G] for generalities on Fredholm determinants).

A classical formula due to Christoffel and Darboux (cf. [Sze]) indicates that

KN (x, y) = κNPN (x)PN−1(y)− PN−1(x)PN (y)

x− y, x, y ∈ R. (1.15)

(Note, see Chapter 3, that P ′N =√N PN−1.) In the regime given by (1.5), set then

t = 1 + sN−2/3, while as usual σ2 = 14N . After a change of variables in (1.14),

P({λNmax ≤ 1 + sN−2/3}

)=

N∑r=0

(−1)r

r!

∫(s,∞)r

det(KN (xi, xj)

)1≤i,j≤r dx1 · · · dxr

where

KN (x, y)

= KN

(2√N + 2xN−1/6, 2

√N + 2yN−1/6

)√2π

1N1/6

e−[√N+xN−1/6]2e−[

√N+yN−1/6]2 .

Now, in this regime, the kernel KN (x, y) may be shown to converge to the Airy kernel

KAi(x, y) =Ai (x)Ai′ (y)−Ai′ (x)Ai (y)

x− y, x, y ∈ R,

through the appropriate asymptotics on the Hermite polynomials known as Plancherel-Rotach asymptotics (cf. [Sze], [Fo1]...). Here Ai is the special Airy function solutionof Ai′′ = xAi with the asymptotics Ai(x) ∼ 1

2√πx1/4 e−

23x

3/2as x → ∞. By further

functional arguments, the convergence may be extended at the level of Fredholm de-terminants to show that, for every s ∈ R,

limN→∞

P({λNmax ≤ 1 + sN−2/3}

)=

∞∑r=0

(−1)r

r!

∫(s,∞)r

det(KAi(xi, xj)

)1≤i,j≤r dx1 · · · dxr

= det([Id−KAi]L2(s,∞)

)= FGUE(s),

justifying thus (1.5) (cf. [Fo1], [T-W1], [De]).

13

1.3 Coulomb gas and random growth functions

Probability measures on RN of the type (1.10) may be considered in more generality.Given for example a (continuous or discrete) probability measure ρ on RN , and β > 0,let

dQ(x) =1Z

∣∣∆N (x)∣∣βdρ(x) (1.16)

where Z =∫|∆N |βdρ < ∞ is the normalization constant. As we have seen, such

probability distributions naturally occur as the joint law of the eigenvalues of matrixmodels. For example, in the GUE case (cf. (1.10)), β = 2 and ρ is a product Gaussianmeasure. (In the GOE and GSE cases, β = 1 and 4 respectively, cf. [Meh].) For moregeneral (orthogonal, unitary or simplectic) ensembles induced by the probability lawZ−1 exp(−Tr v(X))dX on matrices, ρ is the product measure of the density e−v(x).

One general idea is that among reasonable families of distributions ρ, for exam-ple product measures of identical factors, the asymptotic behavior of the probabilitylaws (1.16) is governed by the Vandermonde determinant, and thus exhibits commonfeatures. It would be of interest to describe a few general facts about these laws. Dis-tributions of the type (1.16) are called Coulomb gas in mathematical physics. Thelargest eigenvalue of the matrix models thus appears here as the rightmost point orcharge max1≤i≤N xi under (1.16).

When dρ(x) =∏Ni=1 dµ(xi) for some probability measure µ on R or Z, and β = 2,

the preceding Coulomb gas distributions may be analyzed through the orthogonalpolynomials of the underlying probability measure µ (provided they exists) as in theexample of the GUE discussed previously. In particular, the correlation functions admitdeterminantal representations. In this case, the probability measures (1.16) are thussometimes called orthogonal polynomial ensembles. Accordingly, the joint law of theeigenvalues of the GUE is called the Hermite (orthogonal polynomial) Ensemble. Inwhat follows, we only consider Coulomb gas of this sort given as orthogonal polynomialensembles (cf. [De], [Ko]...).

Following the analysis of the GUE, fluctuations of the largest eigenvalue or right-most charge of orthogonal polynomial ensembles toward the Tracy-Widom distributionmay be developed on the basis of the common Airy asymptotics of orthogonal polyno-mials (at this regime). The principle of proof extends to kernels KN properly conver-gent as N → ∞ to the Airy kernel. When the orthogonal polynomials admit suitableintegral representations, the asymptotic behaviors may generally be obtained from asaddle point analysis. For example, the `-th Hermite polynomial may be described as

P`(x) =(−1)`√`!

ex2/2 d

`

dx`(e−x

2/2),

and thus, after a standard Fourier identity,

P`(x) =(−i)`√`!

ex2/2

∫ +∞

−∞s`eisx−s

2/2 ds√2π

.

14

The asymptotic behavior as `→∞ may then be handled by the so-called saddle pointmethod (or steepest descent, or stationary phase) of asymptotic evaluation of integralsof the form ∫

Γ

φ(z)etψ(z)dz

over a contour Γ in the complex plane as the parameter t is large (cf. [Fy] for abrief introduction). While these asymptotics are available for the classical orthogonalpolynomials from suitable representation of their generating series, the study of moregeneral weights can lead to rather delicate investigations. This might require deeparguments involving steepest descent methods of highly non-trivial Riemann-Hilbertanalysis as developed by P. Deift and X. Zhou [D-Z]. We refer to the monograph [De]by P. Deift for an introduction to these methods and complete references up to 1999,including the important contribution [D-K-ML-V-Z]. A further introduction is the setof notes [Ku] including more recent developments and references. See also referencesin [Baik2]. Discrete orthogonal polynomial ensembles are deeply investigated in [B-K-ML-M]. When suitable contour integral representations of the kernels are available, thestandard saddle point method is however enough to determine the expected asymptotics(see e.g. [Joha2] for an example of regularized Wigned matrices).

The orthogonal polynomial method cannot be developed however outside the (com-plex) case β = 2. Specific arguments have to be found. The real case for example usesPfaffians and requires non-trivial modifications (cf. [Meh]). In particular, it is possibleto relate the asymptotic behavior of the largest eigenvalues of the GOE to the one ofthe GUE through a generalized two-dimensional kernel, and thus to conclude to similarfluctuation results (cf. [T-W2], [Wid1]). In particular, the limiting GOE Tracy-Widomlaw takes the form

FGOE(s) = FGUE(s)1/2 exp(− 1

2

∫ ∞

2s

u(x)dx), s ∈ R.

Coulomb gas associated to the classical orthogonal polynomial ensembles are ofparticular interest. Among these ensembles, the Laguerre and (discrete) Meixner en-sembles play a central role and exhibit some remarkable features. The Laguerre Ensem-ble represents the joint law of the eigenvalues of Wishart matrices. Let G be a complexM×N , M ≥ N , random matrix the entries of which are independent complex Gaussianrandom variables with mean zero and variance σ2, and set Y = Y N = G∗G. The lawof Y defines a unitary invariant probability measure on HN , and the distribution of theeigenvalues is given by a Coulomb gas (1.16) with β = 2 and ρ (up to the scaling param-eter σ) the product measure of the Gamma distribution dµ(x) = Γ(γ+1)−1xγe−xdx on(0,∞) with γ = M −N . The Laguerre polynomials being the orthogonal polynomialsfor the Gamma law, the corresponding joint distribution of the eigenvalues is calledthe Laguerre Ensemble. Real Wishart matrices are defined similarly (with β = 1),and Wishart matrices with non-Gaussian entries may also be considered. The limitingspectral measure of Wishart matrices, as σ2 ∼ 1

4N , and M ∼ cN , c ≥ 1, is describedby the so-called Marchenko-Pastur distribution (or free Poisson law) [M-P] (cf. [Bai]).

15

The Meixner Ensemble is associated to a discrete weight. Let µ be the so-callednegative binomial distribution on N with parameters 0 < q < 1 and γ > 0 given by

µ({x}

)=

(γ)xx!

qx(1− q)γ , x ∈ N, (1.17)

where (γ)x = γ(γ + 1) · · · (γ + x− 1), x ≥ 1, (γ)0 = 1. If γ = 1, µ is just the geometricdistribution with parameter q. The orthogonal polynomials for µ are called the Meixnerpolynomials (cf. [Sze], [Ch], [K-S]).

As already mentioned above, asymptotics of the Laguerre and Meixner polynomi-als may then be used as for the GUE, but with increased technical difficulty, to showthat the largest eigenvalue of (properly rescaled) Wishart matrices and the rightmostcharge, that is the function max1≤i≤N xi (‘largest eigenvalue’), of the Meixner orthog-onal polynomial Ensemble, fluctuate around their limiting value at the Tracy-Widomregime. This has been established by K. Johansson [Joha1] for the Meixner Ensemble.The Laguerre Ensemble appears as a limit as q → 1, and has been investigated indepen-dently by I. Johnstone [John] who also carefully analyzes the real case along the linesof [T-W2]. In [So2], A. Soshnikov extends these conclusions to Wishart matrices withnon-Gaussian entries following his previous contribution [So1] for Wigner matrices.

Further classical orthogonal polynomial ensembles may be considered. For example,fluctuations of the Jacobi Ensemble constructed over Jacobi polynomials and associatedto Beta matrices are addressed in [Co].

The Laguerre and Meixner Ensembles actually share some specific Markovian typeproperties which make them play a central role in connection with various probabilisticmodels. In the remarkable contribution [Joha1], K. Johansson indeed showed that theMeixner orthogonal polynomial Ensemble entails an extremely rich mathematical struc-ture connected with many different interpretations. In particular, its rightmost chargemay be interpreted in terms of shape functions and last-passage times. Let w(i, j),i, j ∈ N, be independent geometric random variables with parameter q, 0 < q < 1. ForM ≥ N ≥ 1, set

W = W (M,N) = maxπ

∑(i,j)∈π

w(i, j) (1.18)

where the maximum runs over all up/right paths π in N2 from (1, 1) to (M,N). Anup/right path π from (1, 1) to (M,N) is a collection of sites {(ik, jk)}1≤k≤M+N−1

such that (i1, j1) = (1, 1), (iM+N−1, jM+N−1) = (M,N) and (ik+1, jk+1) − (ik, jk) iseither (1, 0) or (0, 1). The random growth function W may be interpreted as a directedlast-passage time in percolation. Using the Robinson-Schensted-Knuth correspondencebetween permutations and Young tableaux (cf. [Fu]), K. Johansson [Joha1] provedthat, that for every t ≥ 0,

P({W ≤ t}

)= Q

({max

1≤i≤Nxi ≤ t+N − 1

})(1.19)

16

where Q is the Meixner orthogonal polynomial Ensemble with parameters q and γ =M − N + 1. As described in [Joha1], this model is also closely related to the one-dimensional totally asymmetric exclusion process. It may also be interpreted as arandomly growing Young diagram or a zero-temperature directed polymer in a randomenvironment (cf. also [Ko]).

Provided with this correspondence, the fluctuations of the rightmost charge of theMeixner Ensemble may be translated on the growth function W (M,N). As indeedshown in [Joha1], for every c ≥ 1, (some multiple of) the random variable

W([cN ], N

)− ωN

N1/3,

where

ω =(1 +

√qc )2

1− q− 1,

converges weakly to the Tracy-Widom distribution FGUE.

In the limit as q → 1, the model covers the fluctuation of the largest eigenvalue ofWishart matrices, studied independently in [John]. Namely, if w is a geometric randomvariable with parameter 0 < q < 1, as q → 1, (1 − q)w converges in distribution to aexponential random variable with parameter 1. If W = W (M,N) is then understoodas a maximum over up/right paths of independent such exponential random variables,the identity (1.19) then translates into

P({W ≤ t}

)= Q

({max

1≤i≤Nxi ≤ t

}), t ≥ 0, (1.20)

where Q is now the Coulomb gas of the Laguerre Ensemble with parameter σ = 1. (Itshould be mentioned that no direct proof of (1.20) is so far available.) This examplethus admits the double description as a largest eigenvalue of random matrices and alast-passage time.

The central role of the Meixner model covers further instances of interest. Amongthem are the Plancherel measure and the length of the longest increasing subsequencein a random permutation. (See [A-D] for a general presentation on the length of thelongest increasing subsequence in a random permutation.) It was namely observed byK. Johansson [Joha3] (see also [B-O-O]) that, as q = θ

N2 , N →∞, θ > 0, the Meixnerorthogonal polynomial Ensemble converges to the θ-Poissonization of the Plancherelmeasure on partitions. Since the Plancherel measure is the push-forward of the uniformdistribution on the symmetric group Sn by the Robinson-Schensted-Knuth correspon-dence which maps a permutation σ ∈ Sn to a pair of standard Young tableaux of thesame shape, the length of the first row is equal to the length Ln(σ) of the longestincreasing subsequence in σ. As a consequence, in this regime,

limN→∞

P({W (N,N) ≤ t

})= P

({LN ≤ t}

), t ≥ 0,

17

where N is an independent Poisson random variable with parameter θ > 0. Theorthogonal polynomial approach may then be used to produce a new proof of theimportant Baik-Deift-Johansson theorem [B-D-J] on the fluctuations of Ln statingthat

Ln − 2√n

2n1/6→ FGUE (1.21)

in distribution.

The Markovian properties of the specific geometric and exponential distributionsmake it thus possible to fully analyze the shape functions W and their asymptoticbehaviors. (If the definition of up/righ paths is modified, a few more isolated caseshave been studied [Baik1], [Joha1], [Joha3], [Se2], [T-W3]...) It would be a challeng-ing question to establish the same fluctutation results, with the same (mean)1/3 rate,for random growth functions W (M,N) (1.18) constructed on more general families ofdistributions of the w(i, j)’s, such as for example Bernoulli variables. While superad-ditivity arguments show that W ([cN ], N)/N is convergent almost surely as N → ∞under rather mild conditions, fluctuations around the (usually unknown) limit are al-most completely open so far. Even the variance growth (see Chapter 4) has not yetbeen determined.

1.4 Large deviation asymptotics

In addition to the preceding fluctuation results for the largest eigenvalues or rightmostcharges of orthogonal polynomial ensembles, some further large deviation theoremshave been investigated during the past years. The analysis again relies of the determi-nantal structure of Coulomb gas together with a careful examination of the equilibriummeasure from the logarithmic potential point of view [S-T].

For example, translated into the framework of the preceding random growth func-tion W (M,N) defined from geometric random variables, K. Johansson also proved inthe contribution [Joha1] a large deviation theorem in the form of

limN→∞

1N

log P({W

([cN ], N ]

)≥ N(ω + ε)

})= −J(ε) (1.22)

for each ε > 0, where J is an explicit function such that J(x) > 0 if x > 0 (seebelow). The result is actually due to T. Seppalainen [Se3] in the simple exclusionprocess interpretation of the model. The large deviation principle on the left of themean takes place at the speed N2 and expresses

limN→∞

1N2

log P({W

([cN ], N ]

)≤ N(ω − ε)

})= −I(ε) (1.23)

for each ε > 0, where I(x) > 0 for x > 0. As we will see it below, the rate functionsJ and I of (1.22) and (1.23) actually partly reflect the N2/3 rate of the fluctuationresults.

18

In contrast with the fluctuation theorems of the preceding section which rely onspecific orthogonal polynomial asymptotics, such large deviation principles hold forlarge classes of (both continuous or discrete) Coulomb gas (1.16)

dQ(x) =1Z

∣∣∆N (x)∣∣β N∏i=1

dµ(xi),

for arbitrary β > 0 and under mild hypotheses on µ (cf. [Joha1], [BA-D-G], [Fe]).They are closely related to the large deviation principles at the level of the spectralmeasures emphasized by D. Voiculescu [Vo] (as a microstate description) and G. BenArous and A. Guionnet [BA-G] (cf. [H-P]) (as a Sanov type theorem). These resultsexamine the large deviation principles for the empirical measures 1

N

∑Ni=1 δxi at the

speed N2 in the space of probability measures on R. The rate function is minimizedat the equilibrium measure, almost sure limit of the empirical measure (the semicirclelaw for example in case of the Hermite Ensemble).

The corresponding rate function of the large deviation principles for the right-mostcharges (largest eigenvalues) is then usually deduced from the one for the empiricalmeasures. The speed of convergence is however different on the right and on the leftof the mean. In the example of the largest eigenvalue λNmax of the GUE with σ2 = 1

4N ,it is shown in [BA-D-G] that

limN→∞

1N

log P({λNmax ≥ 1 + ε}

)= −JGUE(ε) (1.24)

where, for every ε > 0,

JGUE(ε) = 4∫ ε

0

√x(x+ 2) dx. (1.25)

Note that JGUE(ε) is of the order of ε3/2 for the small values of ε, in accordance withthe Tracy-Widom theorem (1.5). Similarly, on the left of the mean,

limN→∞

1N2

log P({λNmax ≤ 1− ε}

)= −IGUE(ε) (1.26)

for some function IGUE such that IGUE(x) > 0 for every x > 0 (cf. also [Joha1], [Fe]).The speedN2 partly indicates that the largest eigenvalues tend to accumulate below theright-end point of the support of the spectrum. The Laguerre and Meixner exampleswill be discussed in the next chapter. It is expected that for large classes of potentialsv in the driving measure dµ = e−vdx of the Coulomb gas Q, the corresponding ratefunction J on the right of the mean is such that J(ε) ∼ ε3/2 for small ε. Large deviationsfor the length of the longest increasing subsequence have been described in [Se1], [D-Ze].

19

2. KNOWN RESULTS ON

NON-ASYMPTOTIC BOUNDS

The purpose of these notes is to describe some non-asymptotic exponential deviationinequalities on the largest eigenvalues or rightmost charges of random matrix and ran-dom growth models at the order (mean)1/3 of the fluctuation results. It actually turnsout that several results are already available in the literature, motivated by convergenceof moments in Tracy-Widom type theorems or moderate deviation principles interpo-lating between fluctuations and large deviations. We thus survey in this chapter someresults developed to this aim, which however, as we will see it, usually require a ratherheavy analysis and only concern some rather specific models. In particular, Wignermatrices or random growth functions with arbitrary weights do not seem to have beenaccessed by any method so far. We treat upper deviation inequalities both above andbelow the mean, and analyze their consequences to variance inequalities.

2.1. Upper tails on the right of the mean

As presented in the first chapter, the Tracy-Widom theorem on the behavior of thelargest eigenvalue λNmax of the GUE with the scaling σ2 = 1

4N expresses that

limN→∞

P({λNmax ≤ 1 + sN−2/3}

)= FGUE(s), s ∈ R. (2.1)

In addition to this fluctuation result, the largest eigenvalue λNmax also satisfies the largedeviation theorems of Section 1.4. In order to quantify these asymptotic results, onewould be interested in finding (upper-) estimates, for each fixed N ≥ 1 and ε > 0, onP({λNmax ≥ 1 + ε}) and P({λNmax ≤ 1− ε}). Actually, from the discussion on the speedof convergence in the large deviation asymptotics (1.24) and (1.26) and the behaviors(1.8) and (1.9) of the Tracy-Widom distribution FGUE, we typically expect that forsome C > 0 and all N ≥ 1,


)≤ C e−Nε

3/2/C and P({λNmax ≤ 1− ε}

)≤ C e−N

2ε3/C

20

for ε > 0. The range of interest concerns particularly small ε > 0 to cover the valuesε = sN−2/3 in (2.1), justifying the terminology of small deviation inequalities. Boundsof this type may then be used towards convergence of moments and variance bounds,or moderation deviation results. (We do not address the question of lower estimateswhich does not seem to have been investigated in the literature.)

A first approach to such a project would be to carefully follow the proof of theTracy-Widom theorem, and to control the various Fredholm determinants by appro-priate (finite range) bounds on orthogonal polynomials. This is the route taken byG. Aubrun in [Au] which allowed him to state the following small deviation inequalityfor the largest eigenvalue λNmax of the GUE (with σ2 = 1

4N ).

Proposition 2.1. For some numerical constant C > 0, and all N ≥ 1 and ε > 0,


)≤ C e−Nε

3/2/C .

As announced, when ε = sN−2/3, the deviation inequality of Proposition 2.1 fitsthe fluctuation result (2.1). The bound is also in accordance with the tail behavior(1.9) of the Tracy-Widom distribution FGUE at +∞.

This line of reasoning can certainly be pushed similarly for the orthogonal poly-nomial ensembles for which a Tracy-Widom theorem holds, and for which asymptoticsof orthogonal polynomials together with the corresponding bounds are available. Thisissue is seemingly not clearly addressed in the literature. Results of this type seemto be discussed in particular in [G-T-W]. As already emphasized, this might howeverrequire a quite deep analysis, including steepest descent arguments of Riemann-Hilberttype (cf. [De]). An attempt relying on measure concentration and weak convergence tothe equilibrium measure is undertaken in [Bl] to yield asymptotic deviation inequalitiesof the correct order for some families of unitary invariant ensembles.

Another direction to deviation inequalities on the right of the mean may be devel-oped in the context of last-passage times, relying on superaddivity and large deviationasymptotics. Let us consider for example the random growth function W (M,N) ofChapter 1, last-passage time in directed percolation for geometric random variableswith parameter 0 < q < 1. As we have seen it, up to some multiplicative factor,

limN→∞

P({W

([cN ], N

)≤ ωN + sN1/3

})= FGUE(s), s ∈ R. (2.2)

where we recall that

ω =(1 +

√qc )2

1− q− 1.

Fix N ≥ 1 and c ≥ 1, and set W = W ([cN ], N). As for the largest eigenvalue λNmax ofthe GUE, to quantify (2.2), we may ask for exponential bounds on the probabilities

P({W ≥ N(ω + ε)}

)and P

({W ≤ N(ω − ε)}

)21

for ε > 0. One may also ask for example for bounds on the variance of W , which areexpected to be of the order of N2/3 by (2.2).

As observed by K. Johansson in [Joha1], inequalities on P({W ≥ N(ω + ε)}) maybe obtained from the large deviation asymptotics (1.22) together with a superadditivityargument (compare [Se1]). It is indeed immediate to see thatW (M,N) is superadditivein the sense that

W (M,N) +W([M + 1, 2M ], [N + 1, 2N ]

)≤W (2M, 2N)

where W ([M + 1, 2M ], [N + 1, 2N ]) is understood as the supremum over all up/rightpaths from (M + 1, N + 1) to (2M, 2N). Since W ([M + 1, 2M ], [N + 1, 2N ]) is inde-pendent with the same distribution as W (M,N), it follows that for every t ≥ 0,

P(({W (M,N) ≥ t

})2 ≤ P(({W (2M, 2N) ≥ 2t

}).

Iterating, for every integer k ≥ 1,

P(({W (M,N) ≥ t

})2k

≤ P(({W (2kM, 2kN) ≥ 2kt

}).

Together with the large deviation property (1.22), as k → ∞, for every fixed N ≥ 1and ε > 0,

P({W ≥ N(ω + ε)}

)≤ e−NJ(ε). (2.3)

Now the function J(ε) is explicitely known. It has however a rather intricate descrip-tion, based itself on the knowledge of the equilibrium measure of the Meixner Ensemble.Precisely, as shown in [Joha1],

J(ε) = JMEIX(ε) =1

1− q

∫ x

1

(x− y)[ c− q

y +B+

1− qc

y +D

] dy√y2 − 1

where

x = 1 +(1− q)ε2√qc

, B =c+ q

2√qc, D =

1 + qc

2√qc

.

One may nevertheless check that

J(ε) ≥ C−1 min(ε, ε3/2), ε > 0,

where C > 0 only depends on c and q. As a consequence, we may state the followingexponential deviation inequality. Note that this conclusion requires both the delicatelarge deviation theorem (1.22) for the Meixner Coulomb gas together with the deepcombinatorial description (1.19) (in order to make use of superaddivity of the growthfunction W (M,N)).

Proposition 2.2. For some constant C > 0 only depending on the parameter 0 < q < 1of the underlying geometric distribution and c ≥ 1, and all N ≥ 1 and ε > 0,

P({W ≥ N(ω + ε)}

)≤ C e−N min(ε,ε3/2)/C .

22

Note that in addition to the small deviation inequality at the Tracy-Widom rate,Proposition 2.2 also emphasizes the order e−Nε/C for the large values of ε due to theprecise knowledge of the rate function JMEIX. We will come back to this observationin the context of the GUE below.

The explicit knowledge of the rate function JMEIX actually allows one to make useof the non-asymptotic inequality (2.3) for several related models. For example as wealready saw it, if w is geometric with parameter 0 < q < 1, then as q → 1, (1 − q)wconverges in distribution to an exponential random variable with parameter 1. In thislimit, (2.3) turns into

P({W ≥ N(ω + ε)}

)≤ e−NJLAG(ε) (2.4)

where now W is the supremum (1.18) over up/right paths of independent exponentialrandom variables with parameter 1, ω =

(1 +

√c)2 and JLAG is the Laguerre rate

function

JLAG(ε) =∫ x

1

(x− y)(1 + c)y + 2

√c

(y +B)2dy√y2 − 1

withx = 1 +

ε

2√c, B =

1 + c

2√c.

One may check similarly that JLAG(ε) ≥ C−1 min(ε, ε3/2) so that the bound (2.4) thusprovides an analogue of Propositions 2.1 and 2.2 for the Laguerre Ensemble for boththe interpretation in terms of the last-passage time W or the largest eigenvalue ofWishart matrices.

Now one may further go from Wishart matrices to random matrices from the GUE,and actually recover in this way Proposition 2.1. Namely, recalling the Wishart matrixY = Y N = GG∗ with variance σ2, as M →∞,

σ−1√M

( YM

− σ2 Id)→ X

in distribution where X follows the GUE law (with variance σ2). In particular,

σ−1√M

( 1M

λNmax(Y )− σ2 Id)→ λNmax(X). (2.5)

Now, after the scaling σ2 = 14N , (2.4) indicates that for M = [cN ], c ≥ 1,

P({λNmax(Y ) ≥ ω+ε

4 })≤ e−NJLAG(ε)

for every N ≥ 1 and ε > 0. Change then ε into 2√c ε and take the limit (2.5) as

c → ∞. Denoting as usual by λNmax the largest eigenvalue of the GUE with varianceσ2 = 1

4N , it follows that


)≤ e−NJGUE(ε) (2.6)

23

where JGUE(ε), ε > 0, was given in (1.25) as

JGUE(ε) = 4∫ ε

0

√x(x+ 2) dx.

Since JGUE(ε) ≥ C−1 max(ε2, ε3/2), ε > 0, we thus recover in this way Proposition 2.1,and actually more precisely


)≤ C e−N max(ε2,ε3/2)/C (2.7)

for every ε > 0. As in Proposition 2.2, this exponential deviation inequality emphasizesboth the small deviations of order ε3/2 in accordance with the Tracy-Widom theoremand the large deviations of the order ε2.

It may actually be shown directly that (2.6) follows from the large deviation princi-ple (1.24) as a consequence of superadditivity, however on some related representation.It has been proved namely, via various arguments, that the largest eigenvalue λNmax

from the GUE with σ2 = 1 has the same distribution as

supN∑i=1

(Biti −Biti−1) (2.8)

where B1, . . . , BN are independent standard Brownian motions and the supremumruns over all 0 = t0 ≤ t1 ≤ · · · ≤ tN−1 ≤ 1. Proofs in [Bar], [G-T-W] are based on theRobinson-Schensted-Knuth correspondence, while in [OC-Y] advantage is taken fromnon-colliding Brownian motions and queuing theory, and include generalizations relatedto the classical Pitman theorem (see [OC]). The representation (2.8) may be thoughtof as a kind of continuous version of directed last-passage percolation for Brownianpaths. On the basis of this identification, it is not difficult to adapt the superadditivityargument developed for the random growth function (1.18) to deduce (2.6) from thelarge deviation bound (1.24). In any case, the price to pay to reach Propositions 2.1and 2.2 is rather expensive.

It should be pointed out that outside these specific models, non-asymptotic smalldeviation inequalities at the Tracy-Widom rate are so far open. Universality conjectureswould expect similar deviation inequalities for general Wigner matrices or directed lastpassage times W with general independent weights. Soshnikov’s proof [So1] only allowsfor asymptotic inequalities (cf. Section 5.2). Similarly, only the choice of geometric andexponential random variables wij gives rise so far to statements such as Proposition2.2.

The central role of the Meixner model shows, by appropriate scalings and theexplicit expression of the rate function JMEIX, that the tail (2.3) actually covers furtherinstances of interest. As discussed in Section 1.3, one such instance is the length of thelongest increasing subsequence in a random permutation and the Baik-Deift-Johansson

24

theorem (1.21). Namely, in the regime q = θN2 , N →∞, the deviation inequality (2.3)

may indeed be used to show that, for every n ≥ 1 and every ε > 0,

P({Ln ≥ 2

√n (1 + ε)

})≤ C exp

(− 1C

√n min

(ε3/2, ε

))(2.9)

where C > 0 is numerical, in accordance thus with (1.21) and the large deviationtheorem of [D-Z]. The previous bound also matches the upper tail moderate deviationtheorem of [L-M].

2.2 Upper tails on the left of the mean

We next turn to the probability that the largest eigenvalue or rightmost charge is lessthan or equal to the right-end point of the spectral measure. As already mentioned,the intuition, together with the large deviation asymptotics (1.23) and (1.26), suggeststhat it is much smaller than the probability that the largest eigenvalue exceeds theright-end point. Let us consider again the Meixner model in terms of the directedlast-passage time function W of (1.18) with geometric random variables. We thus lookfor the probability that W = W ([cN ], N) is less than or equal to N(ω − ε) for eachε > 0 and fixed N ≥ 1. Things are here much more delicate. Seemingly, only a fewresults are available, relying furthermore on delicate and quite difficult to access meth-ods and arguments. The following result has been put forward in [B-D-ML-M-Z] byrefined Riemann-Hilbert steepest descent methods in order to investigate convergenceof moments and moderate deviations. Some related estimates are developed in [B-D-R]in the context of random Young tableaux, and in [L-M-R] for the length of the longestincreasing subsequence.

Proposition 2.3. For some constant C > 0 only depending on the parameter 0 < q < 1of the underlying geometric distribution and c ≥ 1, and all N ≥ 1 and 0 < ε ≤ ω,

P({W ≤ N(ω − ε)}

)≤ C e−N

2ε3/C .

(Actually, the statement in [B-D-ML-M-Z] seems to concern only large values ofN .)

As for Proposition 2.1, the preceding inequality matches the behavior at −∞ ofthe Tracy-Widom distribution FGUE given by (1.8).

After [B-D-R] and [B-D-ML-M-Z], H. Widom [Wid2] noticed a somewhat less pre-cise estimate, replacing N2ε3 by its square root, using a more simple trace bound,however still requiring steepest descent. We will come back to this observation inChapter 5.

It is plausible that the behavior of the constant C in Proposition 2.3 allows for limitsto the Laguerre and Hermite Ensembles as in the preceding section. This is however

25

not completely obvious from the analysis in [B-D-ML-M-Z]. On the other hand, thereis no doubt that a similar Riemann-Hilbert analysis may be performed anagously forthese examples, and that the statements corresponding to Proposition 2.3 hold true.We may for example guess the following for the largest eigenvalue λNmax of the GUEwith σ2 = 1

4N .

Proposition 2.4. For some numerical constant C > 0, and all N ≥ 1 and 0 < ε ≤ 1,

P({λNmax ≤ 1− ε}

)≤ C e−N

2ε3/C .

As already mentioned, similar estimates have been obtained in [L-M-R] in the proofof the lower tail moderate deviations for longest increasing subsequences, where, basedon the investigation [B-D-J], the following speed of convergence is established: thereexists a numerical constant C > 0 such that for every n ≥ 1 and every 0 < ε ≤ 1,

P({Ln ≤ 2

√n (1− ε)

})≤ C e−n ε

3/C . (2.10)

2.3 Variance inequalities

The non-asymptotic deviation inequalities of Sections 2.1 and 2.2 allow for convergenceof moments towards the Tracy-Widom distribution [B-D-ML-M-Z], [Wid2]. In partic-ular, they may easily be combined to reach variance bounds. For example, the nextstatement on the growth function W = W ([cN ], N) follows from Propositions 2.2 and2.3.

Corollary 2.5. For some constant C > 0 (only depending on q and c), and every

N ≥ 1,

var (W ) = E([W − E(W )

]2) ≤ CN2/3.

Proof. Fix N ≥ 1. We may write

N−2 var (W ) ≤ N−2 E([W − ω]2

)≤

∫ ∞

0

P({W ≥ N(ω + t)

})dt2

+∫ ω

0

P({W ≤ N(ω − t)

})dt2.

By Proposition 2.2,∫ ∞

0

P({W ≥ N(ω + t)

})dt2 ≤ C

∫ ∞

0

e−N min(t,t3/2)/Cdt2 ≤ CN−4/3

26

where C, here and below, may vary from line to line. On the other hand, by Proposition2.3, ∫ ω

0

P({W ≤ N(ω − t)

})d(t2) ≤ C

∫ ω

0

e−N2t3/Cdt2 ≤ CN−4/3.

The proposition is established.

It is worthwhile mentioning that a weaker bound in Proposition 2.3, with N2ε3

replaced by Nε3/2 as proved in [Wid2], is sufficient for the proof of Corollary 2.5.

Taking Proposition 2.4 for granted, we get similarly for the largest eigenvalue λNmax

of the GUE with σ2 = 14N the following variance bound.

Corollary 2.6. For some numerical constant C > 0, and all N ≥ 1,

var (λNmax) ≤ CN−4/3.

As the proof of Corollary 2.5 shows, we actually have that for some C > 0 and allN ≥ 1,

E(|λNmax − 1|2

)≤ CN−4/3.

The same arguments furthermore leads to

supN

E(∣∣N2/3(λNmax − 1)

∣∣p) <∞

for any p > 0 which allow for convergence of moments in the Tracy-Widom theorem.In particular, ∣∣E(λNmax)− 1

∣∣ ≤ CN−2/3. (2.11)

We will observe in Chapter 5 below that moment recurrence equations may be used tosee that actually

E(λNmax) ≤ 1− 1CN2/3

(2.12)

for some C > 0 and all N ≥ 1. In particular thus, (2.11) means that

1− 1C ′N2/3

≤ E(λNmax) ≤ 1− 1CN2/3

for some C,C ′ > 0 and all N ≥ 1.

27

3. CONCENTRATION INEQUALITIES

In this chapter, we present a few classical measure concentration tools that may beused in the investigation of exponential deviation inequalities on the largest eigenval-ues and growth functions. These tools are of general interest, apply in rather largesettings and provide useful informations at the level of large deviation bounds for bothextremal eigenvalues and spectral distributions. While measure concentration yieldsthe appropriate (Gaussian) large deviation bounds, it however does not produce thecorrect small deviation rate (mean)1/3 of the asymptotic theorems presented in Chapter1. For simplicity, we mostly detail below the relevant inequalities in the Gaussian case.We successively present concentration inequalities for largest eigenvalues and randomgrowth functions, spectral measures, as well as Coulomb gas.

3.1 Concentration inequalities for largest eigenvalues

Let µ denote the standard Gaussian measure on Rn with density (2π)−n/2e−|x|2/2 with

respect to Lebesgue measure. One basic concentration property (cf. [Le1]) indicatesthat for every Lipschitz function F : Rn → R with ‖F‖Lip ≤ 1, and every r ≥ 0,

µ({F ≥

∫Fdµ+ r}

)≤ e−r

2/2. (3.1)

Together with the same inequality for −F , for every r ≥ 0,

µ({|F −

∫Fdµ| ≥ r}

)≤ 2 e−r

2/2. (3.2)

The same inequalities hold for a median of F instead of the mean. Independence uponthe dimension of the underlying state space is one crucial aspect of these properties.

We may apply for example these inequalities to the largest eigenvalue λNmax of theGUE. Namely, by the variational characterization,

λNmax = sup|u|=1

uXNu∗, (3.3)

28

so that λNmax is easily seen to be a 1-Lipschitz map of the N2 independent real andimaginary entries Xii, 1 ≤ i ≤ N , Re (Xij)/

√2, Im (Xij)/

√2, 1 ≤ i < j ≤ N , of

XN . Together with the scaling of the variance σ2 = 14N we thus get the following

concentration inequality on λNmax.

Proposition 3.1 For all N ≥ 1 and r ≥ 0,


∣∣ ≥ r})

≤ 2 e−2Nr2 , r ≥ 0.

As a consequence, note that var (λNmax) ≤ C N−1 that should be compared withCorollary 2.6. Actually, while Proposition 3.1 describes the Gaussian decay of λNmax forthe large values of r, it does not catch the r3/2 rate of the small deviation inequality(2.7). It actually seems that viewing the largest eigenvalue as one particular example ofLipschitz function of the entries of the matrix does not reflect enough the structure ofthe model. This comment more or less applies to all the results of this chapter deducedfrom the concentration principle.

A similar inequality holds for the GOE, and actually for more general families ofGaussian matrices. Before however going on with further applications of the generalprinciple of measure concentration, a few words are necessary at the level of the center-ings. The inequalities emphasized in the preceding chapter indeed discuss exponentialdeviation inequalities from the limiting expected value (for example 1 for the scaledlargest eigenvalue λNmax of the GUE) while the concentration principle typically pro-duces tail inequalities around some mean (or median) value of the given functional(such as E(λNmax)). A comparison thus requires proper control over E(λNmax) or similaraverage values. In the example of the GUE, (2.11) is of course enough to this task, butto make the concentration inequalities relevant by themselves, one needs independentestimates. A few remarks in this regard may be developed.

Keep again the GUE example. We may ask whether E(λNmax), or a median of λNmax,are smaller than 1, or at least suitably controlled. As emphasized in [Da-S], Gaussiancomparison principles are of some help to this task. Consider the real-valued Gaussianprocess

Gu = uXNu∗ =N∑

i,j=1

Xijuiuj , |u| = 1,

where u = (u1, . . . , uN ) ∈ CN . It is immediate to check that for every u, v ∈ CN ,

E(|Gu −Gv|2

)= σ2

N∑i,j=1

|uiuj − vivj |2.

Hence, if we define the Gaussian process indexed by u ∈ CN , |u| = 1,

Hu =N∑i=1

gi Re (ui) +N∑j=1

hj Im (uj)

29

where g1, . . . , gN , h1, . . . , hN are independent standard Gaussian variables, then, forevery u, v such that |u| = |v| = 1,

E(|Gu −Gv|2

)≤ 2σ2E

(|Hu −Hv|2

).

By the Slepian-Fernique lemma (cf. [L-T]),

E(

sup|u|=1

Gu)≤√

2σ E(

sup|u|=1

Hu

)≤ 2

√2σ E

([ N∑i=1

g2i

]1/2).

When σ2 = 14N , we thus get that

E(λNmax

)≤√

2. (3.4)

Together with the one-sided version of the inequality of Proposition 3.1, for every r ≥ 0,

P({λNmax ≥

√2 + r}

)≤ e−2Nr2

that thus agrees with (2.7) for r large.

It is worthwhile mentioning that in the real GOE case, the comparison theoremmay be sharpened into

E(λNmax

)≤ 4σ2E

([ N∑i=1

g2i

]1/2)< 1 (3.5)

(cf. [Da-S]). In particular therefore

P({λNmax ≥ 1 + r}

)≤ e−Nr

2

for every r ≥ 0, which is more directly comparable to the Tracy-Widom theorem.However (3.5) is not sharp enough to reach (2.12).

Bounds such as (3.4) or (3.5) extend to the class of sub-Gaussian distributions (cf.[L-T], [Ta3]) including thus random matrices with symmetric Bernoulli entries. Theymay then be combined as above with Proposition 3.3 below.

On the basis of the supremum representation (3.3) of the largest eigenlue or the verydefinition (1.18) of last passage time in oriented percolation, one may actually wonderwhether bounds on the supremum of Gaussian or more general processes (Zt)t∈T maybe useful in this type of investigation. Numerous developements took place in thelast decades (cf. [L-T] and the recent monograph [Ta3]) in the analysis of bounds onE(supt∈T Zt) and P({supt∈T Zt ≥ r}), r ≥ 0, with rather sophisticated chaining argu-ments involving metric entropy or majorizing measures. For real symmetric matricesX, the task would be for example to investigate processes given by

Zu = uXNu∗ =N∑

i,j=1

Xijuiuj , |u| = 1,

30

where u = (u1, . . . , uN ) ∈ RN and Xij , 1 ≤ i ≤ j ≤ N , are independent centered eitherGaussian or Bernoulli variables, and to study the size of the unit sphere |u| = 1 underthe L2-metric

E(|Zu − Zv|2

)=

N∑i,j=1

|uiuj − vivj |2, |u| = |v| = 1.

While these tools provide general, typically Gaussian, bounds, the unusual and morerefined rates from random matrix theory do not seem to have been accessed so farfrom this point of view. It might be a worthwhile project to investigate this questionin more detail.

We now come back to the application of measure concentration to general familiesof random matrices and random growth functions. This is actually the main interestin the theory. The concentration inequality of Proposition 3.1 indeed applies to largefamilies of both real and complex random matrices, the entries of which form a randomvector with a dimension free concentration property. For notational simplicity, we onlydeal below with real matrices but up to numerical factors, all the results hold similarlyin the complex case. That is, we are looking for measures µ on Rn, representing thejoint law of the entries of a given matrix, which satisfy, as (3.1) or (3.2) for Gaussianmeasures, the dimension free concentration inequality

µ({|F −

∫Fdµ| ≥ r

})≤ C e−r

2/C , r ≥ 0 (3.6)

for some C > 0 independent of n and every 1-Lipschitz function F : Rn → R. Themean may be replaced by a median of F . Actually, other tails than Gaussian maybe considered, and we refer to [Le1] for a general account on the concentration ofmeasure phenomenon and examples satisfying it. Now, it is immediate (cf. e.g. (3.3))that the singular values (resp. eigenvalues) of a N × N matrix X (resp. symmetricmatrix) are Lipschitz functions of the vector of the N2 (resp. N(N + 1)/2) entriesof X. One thus immediately concludes to concentration inequalities of the type ofProposition 3.1 for singular values or eigenvalues of matrices the joint law of the entriessatisfying a concentration inequality (3.6). This observation already yields variousconcentration inequalities for singular values and eigenvalues of families of Gaussianmatrices. Another simple example of interest consists of matrices with independentuniform entries (which may be realized as a contraction of Gaussian variables). Thefollowing proposition summarizes this conclusion. Note that if X = (Xij)1≤i,j≤N is areal symmetric N × N random matrix, then its eigenvalues are 1-Lipschitz functionsof the entries Xii, 1 ≤ i ≤ N ,

√2Xij , 1 ≤ i < j ≤ N (justifying in particular the

normalization of the variances in the GOE). For simplicity, we do not distinguish belowbetween the diagonal and non-diagonal entries, and simply use that the eigenvalues areLipschitz with a Lipschitz coefficient less than or equal to

√2 with respect to the vector

Xij , 1 ≤ i ≤ j ≤ N .

31

Proposition 3.2. Let X = (Xij)1≤i,j≤N be a real symmetric N ×N random matrix

and Y = (Yij)1≤i,j≤N be a real N ×N random matrix. Assume that the distributions

of the random vectors Xij , 1 ≤ i ≤ j ≤ N , and Yij , 1 ≤ i, j ≤ N , in respectively

RN(N+1)/2 and RN2

satisfy the dimension free concentration property (3.6). Then, if

τ is any eigenvalue of X, respectively singular value of Y , for every r ≥ 0,

P({|τ − E(τ)| ≥ r

})≤ C e−r

2/2C , resp. C e−r2/C .

We next discuss two examples of distributions satisfying concentration inequalitiesof the type (3.6) and illustrate there application to matrix models.

A first class of interest consists of measures satisfying a logarithmic Sobolev inequal-ity which form a natural extension of the Gaussian example. A probability measureµ on R or Rn is said to satisfy a logarithmic Sobolev inequality if for some constantC > 0 ∫

Rn

f2 log f2dµ ≤ 2C∫

Rn

|∇f |2dµ (3.7)

for every smooth enough function f : Rn → R such that∫f2dµ = 1. The prototype

example is the standard Gaussian measure on Rn which satifies (3.7) with C = 1.Another example consists of probability measures on Rn of the type dµ(x) = e−V (x)dx

where V − c |x|2

2 is convex for some c > 0 which satisfy (3.7) for C = 1c . An important

aspect of the logarithmic Sobolev inequality is its stability by product that yieldsdimension free constants. That is, if µ1, . . . , µn are probability measures on R satisfyingthe logarithmic Sobolev inequality (3.7) with the same constant C, then the productmeasure µ1⊗· · ·⊗µn also satisfies it (on Rn) with the same constant. The applicationof logarithmic Sobolev inequalities to measure concentration is developed by the so-called Herbst argument that indicates that if µ satisfies (3.7), then for any 1-Lipschitzfunction F : Rn → R and any λ ∈ R,∫

eλF dµ ≤ eλ∫Fdµ+Cλ2/2.

In particular, by a simple use of Markov’s exponential inequality (for both F and −F ),for any r ≥ 0,

µ({|F −

∫Fdµ| ≥ r}

)≤ 2 e−r

2/2C ,

so that the dimension free concentration property (3.6) holds. We refer to [Le1] fora complete discussion on logarithmic Sobolev inequalities and measure concentration.Related Poincare inequalities, in connection with variance bounds and exponentialconcentration, may be considered similarly in this context and in the applicationsbelow.

As a consequence of this discussion, if X = (Xij)1≤i,j≤N is a real symmetricN × N random matrix and Y = (Yij)1≤i,j≤N a real N × N random matrix such

32

that the entries Xij , 1 ≤ i ≤ j ≤ N and Yij , 1 ≤ i, j ≤ N define random vectorsin respectively RN(N+1)/2 and RN

2the law of which satisfy the logarithmic Sobolev

inequality (3.7), then the conclusion of Proposition 3.2 holds. By the product propertyof logarithmic Sobolev inequalities, this is in particular the case if the variables Xij

and Yij are independent and satisfy (3.7) with a common constant C (for example,they have a common distribution e−vdx where v′′ ≥ c = 1

C > 0). In particular thus, ifλNmax denotes the largest eigenvalue of X,


∣∣ ≥ r})

≤ 2 e−r2/4C , r ≥ 0

for every r ≥ 0.

Another family of interest are product measures. The application of measure con-centration to this class however requires an additional convexity assumption on thefunctionals. Indeed, if µ is a product measure on Rn with compactly supported fac-tors, a fundamental result of M. Talagrand [Ta2] shows that (3.2) holds for everyLipschitz convex function. More precisely, assume that µ = µ1 ⊗ · · · ⊗ µn where eachµi is supported on [a, b]. Then, for every 1-Lipschitz convex function F : Rn → R,

µ({|F −m| ≥ r}

)≤ 4 e−r

2/4(b−a)2 (3.8)

where m is a median of F for µ. (Classical arguments, cf. [Le1], allow for the replace-ment of m by the mean of F up to numerical constants.) Since, by the variationalcharacterization, the largest eigenvalue λNmax of symmetric (or Hermitian) matrices isclearly a convex function of the entries, such a statement may immediately be appliedto yield concentration inequalities similar to Propositions 3.1 and 3.2.

Proposition 3.3. Let X be a real symmetric N ×N matrix such that the entries Xij ,

1 ≤ i ≤ j ≤ N , are independent random variables with |Xij | ≤ 1. Denote by λNmax the

largest eigenvalue of X. Then, for any r ≥ 0,

P({|λNmax −M | ≥ r

})≤ 4 e−r

2/32

where M is a median of λNmax.

Up to some numerical constants, the median may be replaced by the mean. Asimilar result is expected for all the eigenvalues. A partial result in [A-K-V] yields abound of the order of 4 e−r

2/32 min(k,N−k+1)2 on the k-th largest eigenvalue which, fork far from 1 or N is much bigger than the corresponding one in the Gaussian case forexample. The analogous question for singular values (in particular the smallest one)in this context seems also to be open. Further inequalities on eigenvalues and normsfollowing this principle, together with additional material, are discussed in [Mec] and[G-P].

33

We refer to [Ta2], [Le1] for further examples of distributions with the concentrationproperty.

Similar measure concentration tools may be developed at the level of randomgrowth functions. Consider for example an array (wij)1≤i≤M,1≤j≤N of real-valuedrandom variables and let, as in the preceding sections,

W = maxπ

∑(i,j)∈π

wij

where the sup runs over all up/right paths from (1, 1) to (M,N). It is clear thatF (x) = sup

∑(i,j)∈π xij is a Lipschitz map of the MN coordinates (xij)1≤i≤M,1≤j≤N

with Lipschitz constant√M +N − 1. The following statement is thus an immediate

consequence of the basic concentration principle. It applies thus in particular to in-dependent Gaussian variables, or more general distributions satisfying a logarithmicSobolev inequality. Since F is clearly a convex function of the coordinates, the re-sult also applies to independent random variables with compact supports (such as forexample Bernoulli variables).

Proposition 3.4. Let (wij)1≤i≤M,1≤j≤N be a set of real-valued random variables such

that the distribution on RMN satisfies the concentration property (3.6) for all Lipschitz

convex functions. Then, for any r ≥ 0,

P({∣∣W − E(W )

∣∣ ≥ r})

≤ C e−r2/C(M+N−1).

While again of interest, and of rather wide applicability, this exponential boundhowever does not describe the expected rate drawn from the Meixner model as exam-ined in the previous sections. In particular, the variance growth drawn from Proposition3.4 with M = N only yields var (W ) ≤ C ′N (where C ′ > 0 only depends on C) whileit is expected to be of the order of N2/3 (cf. Corollary 2.5). Similar comments applyto the concentration inequalities for the length of the longest increasing subsequenceinvestigated in [Ta2] which do not match the Baik-Deift-Johansson theorem (1.21). In-deed, building on the general principle underlying (3.8), M. Talagrand got for examplethat

P({|Ln −m| ≥ r

})≤ 4 e−r

2/8m

for 0 ≤ r ≤ m.

3.2 Concentration inequalities for spectral distributions

The general concentration principles do not yield the correct small deviation rate at thelevel of the largest eigenvalues. They however apply to large classes of Lipschitz func-tions. In particular, as investigated by A. Guionnet and O. Zeitouni [G-Z], applications

34

to functionals of the spectral measure yield sharp exponential bounds in accordancewith the large deviation asymptotics for empirical measures (cf. Section 1.4). For ex-ample, if f : R → R is Lipschitz, it is not difficult to check that F = 1√

N

∑Ni=1 f(λNi )

is a Lipschitz function of the (real and imaginary) entries of XN . Moreover, if f isconvex on the real line, then F is convex on the space of matrices (Klein’s lemma).Therefore, the general concentration principle may be applied to functions of the spec-tral measure. For example, if X is a GUE random matrix with variance σ2 = 1

4N , andif f : R → R is 1-Lipschitz, as a consequence of (3.2), for any r ≥ 0,

P({∣∣∣∣ 1

N

N∑i=1

f(λNi )−∫fdµN

∣∣∣∣ ≥ r

})≤ 2 e−2N2r2 (3.9)

(where we recall that µN is the mean spectral measure (1.4)). Inequality (3.9) is inaccordance with the N2 speed of the large deviation principles for spectral measures.With the additional assumption of convexity on f , similar inequalities hold for real orcomplex matrices the entries of which are independent with bounded support. Thevarious examples of distributions with the measure concentration property discussedfor example in the previous sections may thus be developed similarly at the level of thespectral measures, and Propositions 3.2 and 3.3 have immediate counterparts for theLipschitz functions F as above. We may for example state the following.

Proposition 3.5. Let X = (Xij)1≤i,j≤N be a real symmetric N ×N random matrix.

Assume that the distribution of the random vector Xij , 1 ≤ i ≤ j ≤ N , in RN(N+1)/2

satisfy the dimension free concentration property (3.6) for all Lipschitz (resp. Lipschitz

and convex) functions Then, for any 1-Lipschitz (resp. 1-Lipschitz and convex) function

f : R → R,

P({∣∣∣∣ 1

N

N∑i=1

f(λNi )−∫fdµN

∣∣∣∣ ≥ r

})≤ C e−Nr

2/2C

for all r ≥ 0.

Extended inequalities have been investigated along these lines in [G-Z] to which werefer for further applications to various families of random matrices.

Interestingly enough, these concentration inequalities may be used to improve theWigner theorem from the statement on the mean spectral measure to the almost sureconclusion. For example, in the context of the GUE, as a consequence of (3.9),

1N

N∑i=1

f(λNi )−∫fdµN → 0

almost surely for every Lipschitz function f : R → R. Assuming that µN → ν, thesemicircle law, it easily follows after a density argument that, almost surely,

1N

N∑i=1

δλNi→ ν

35

weakly as probability measures on R.

3.3 Concentration inequalities for Coulomb gas

The GUE model shares both the structure of a Wigner matrix with independententries and the one of a unitary invariant ensemble. As a unitary ensemble, we have seenin Chapter 1 how the joint eigenvalue distribution may be represented as a Coulombgas (1.16). Under suitable convexity assumption on the underlying potential, Coulombgas actually also share concentration properties which follow from general convexityprinciples.

Let indeed, as in (1.16),

dQ(x) =1Z

∣∣∆N (x)∣∣βdρ(x)

where ρ is a probability measure on RN and Z = ZN =∫|∆N |βdρ < ∞ the normal-

ization constant. For particular values of β > 0 and suitable distributions ρ, Q thusrepresents the eigenvalue distribution of some random matrix model. We consider prob-ability measures ρ given by dρ = e−V dx for some symmetric (invariant by permutationof the coordinates) potential V : RN → R. Typically, in the context of eigenvalues ofrandom matrix models, V (x) =

∑Ni=1 v(xi), x = (x1, . . . , xN ) ∈ RN , where v : R → R

is the underlying potential of the matrix distribution exp(−Tr v(X))dX. Assume nowthat V (x)− c |x|

2

2 is convex for some c > 0. For example, if

V (x) =|x|2

2=

12

N∑i=1

x2i ,

we would deal with the joint eigenvalue distribution of the GUE. By exchangeability,we may describe equivalently the measure Q by

dQ(x) =N !Z

∆N (x)β 1E dρ(x) (3.10)

where E = {x ∈ RN ;x1 < · · · < xN}. Now, log ∆βN is concave on the convex set E,

so that the probability measure Q of (3.10) enters the general setting of probabilitymeasures with density e−U , U strictly convex, on a convex set in RN . The generaltheory of the Prekopa-Leindler and transportation cost inequalities as presented in[Le1] then shows that Q satisfies a Gaussian like concentration inequality for Lipschitzfunctions. The next statement describes the result.

Proposition 3.6. Let Q be defined by (3.10) with dρ = e−V dx where V is symmetric

and such that V (x)−c |x|2

2 is convex for some c > 0. Then, for any 1-Lipschitz function

F : RN → R and any r ≥ 0,

Q({|F −

∫FdQ| ≥ r

})≤ 2 e−r

2/2c.

36

Applied to the particular Lipschitz function given by max1≤i≤N xi, we recoverProposition 3.1 for the GUE, which thus applies to more general orthogonal of unitaryensembles with a strictly convex potential. Proposition 3.6 may also be used to coverthe concentration inequalities for Lipschitz functions of the spectral measure of thepreceding section. However, again, the Tracy-Widom rate does not seem to follow fromthis description. It actually appears that in distributions dQ(x) = 1

Z

∣∣∆N (x)∣∣βdρ(x),

the important factor is the Vandermonde determinant ∆N and not the underlyingprobability measure ρ, while in the concentration approach, we rather focus on ρ.

37

4. HYPERCONTRACTIVE METHODS

We presented in Chapter 2 the known asymptotic exponential deviation inequalitieson largest eigenvalues and last-passage times. As described there, these actually followfrom quite refined methods and results. The aim of this chapter and the next oneis to suggest some more accessible tools to reach some of these bounds (or parts ofthem) at the correct small deviation order. The tools developed in this chapter are offunctional analytic flavour, with a particular emphasis on hypercontractive methods.They however still rely on the orthogonal polynomial representation.

In the first part, we present an elementary approach, relying on the hypercontrac-tivity property of the Hermite semigroup, to the small deviation inequality of Proposi-tion 2.1 for the largest eigenvalue of the GUE. We then investigate, following the recentcontribution [B-K-S] by I. Benjamini, G. Kalai and O. Schramm, variance bounds fordirected last-passage percolation with the same tool of hypercontractivity.

4.1 Upper tails on the right of the mean

We first briefly describe the semigroup tools we will be using. Consider the Hermite ofOrnstein-Uhlenbeck operator

Lf = ∆f − x · ∇f

acting on smooth functions f : Rn → R. It satisfies the integration by parts formula∫f(−Lg)dµ =

∫∇f · ∇gdµ (4.1)

for smooth functions f, g on Rn with respect to the standard Gaussian measure µ onRn. The associated semigroup Pt = etL, t ≥ 0, solution of the heat equation ∂

∂t = L,is, in this case, explicitely described by the integral representation

Ptf(x) =∫

Rn

f(e−tx+ (1− e−2t)1/2y

)dµ(y), t ≥ 0, x ∈ Rn. (4.2)

Note that P0f = f and Ptf →∫fdµ (for suitable f ’s).

38

To illustrate L and Pt in the one-dimensional case, recall the generating functionof the (normalized) Hermite polynomials P`, ` ∈ N, on the real line is given by

eλx−λ2/2 =

∞∑`=0

λ`√`!P`(x), λ, x ∈ R.

Since for every t ≥ 0 and λ ∈ R,

Pt(eλx−λ2/2) = e(λe−t)x−(λe−t)2/2,

it follows that Pt(P`) = e−`tP`, ` ∈ N. Hence the Hermite polynomials are the eigen-functions of L, with eigenvalues −`, ` ∈ N (L is sometimes called the number operator).

The central tool in this section is the celebrated hypercontractivity property of theHermite semigroup first put forward by E. Nelson [Ne] in quantum field theory. Itexpresses that, for any function f (in Lp),

‖Ptf‖q ≤ ‖f‖p (4.3)

for every 1 < p < q < ∞ and t > 0 such that e2t ≥ q−1p−1 (cf. [Ba]). Lp-norms are

understood here with respect to the Gaussian measure µ.

For comparison, it might be worthwhile mentioning that hypercontractivity hasbeen shown by L. Gross [Gr] to be equivalent to the logarithmic Sobolev inequality(3.7) (with C = 1) for the standard normal distribution µ, in actually the generalsetting of Markov operators (cf. [Bak]).

We now make use of hypercontractivity to reach small deviation inequalities for thelargest eigenvalues of the GUE. We follow the note [Le2]. Recall thus X from the GUEwith σ2 = 1

4N , with eigenvalues λN1 , . . . , λNN . The starting point is the representation

(1.13) of the spectral measure µN in terms of the Hermite polynomials and the simpleunion bound

P({λNmax ≥ t}

)≤ NµN

([ t,∞)

), t ∈ R, N ≥ 1. (4.4)

Let N ≥ 1. As a consequence, for every ε > 0 (recall σ2 = 14N ),


)≤

∫ ∞

2√N(1+ε)

N−1∑`=0

P 2` dµ (4.5)

(where µ is here the standard Gaussian measure on R). Now, by Holder’s inequality,for every r > 1, and every ` = 0, . . . , N − 1,∫ ∞

2√N(1+ε)

P 2` dµ ≤ µ

([2√N(1 + ε),∞)

)1−(1/r) ‖P`‖22r

≤ e−2N(1+ε)2(1− 1r ) ‖P`‖22r

39

where we used the standard bound on the tail of the Gaussian measure µ. Since as wehave seen Pt(P`) = e−`tP`, it follows from the hypercontractivity property (4.3) thatfor every r > 1 and ` ≥ 0,

‖P`‖2r ≤ (2r − 1)`/2.

Hence, ∫ ∞

2√N(1+ε)

N−1∑`=0

P 2` dµ ≤ e−2N(1+ε)2(1− 1

r )N−1∑`=0

(2r − 1)`

≤ 12(r − 1)

e−2N(1+ε)2(1− 1r )+N log(2r−1).

Optimizing in r → 1 then shows, after a Taylor expansion of log(2r − 1) at the thirdorder, that for some numerical constant C > 0 and all 0 < ε ≤ 1,


)≤ C ε−1/2 e−Nε

3/2/C (4.6)

for C > 0 numerical. Up to some polynomial factor, we thus recover the content ofProposition 2.1. (The argument is easily extended to also include the large deviationbehavior of the order of ε2 (cf. (2.7) and Section 2.3).

The same strategy may be developed similarly for orthogonal polynomial ensembleswhich may be diagonalized by an hypercontractive operator [Le2]. This is the case forexample of the Laguerre operator, so that this approach yields exponential deviationinequalities for the largest eigenvalue of Wishart matrices or last-passage times forexponential random variables. The class of interest seems however to be restricted tothe classical examples of Hermite, Laguerre and Jacobi polynomials [Ma]. Even theapplication of the method to discrete orthogonal polynomial ensembles does not seemto be clear.

4.2 Variance bounds

This section is devoted to the question of the variance growth of last-passage time func-tions for more general distributions than geometric and exponential. We follow herea recent contribution by I. Benjamini, G. Kalai and O. Schramm [B-K-S] who provedsub-linear growth by means of hypercontractive tools. In connection with the growthmodels discussed in the preceding sections, we only investigate here the directed perco-lation model. Furthermore, for simplicity again, we restrict ourselves in the expositionof this result to a Gaussian setting. It actually holds for a variety of examples discussedat the end of the section.

Consider thus, as in Section 3.1, an array (wij)1≤i≤M,1≤j≤N of independent stan-dard Gaussian random variables and let

W = maxπ

∑(i,j)∈π

wij

40

where the maximum runs over all up/right paths from (1, 1) to (M,N). Assume fur-thermore for simplicity that M = N . We saw from the general concentration boundsin Chapter 3 that var (W ) ≤ CN (while it is expected to be of the order of N2/3 byCorollary 2.5). We provide here, following [B-K-S], a slight, but significant improve-ment.

For a suitably integrable function f : RN2→ R, denote by

varµ(f) =∫f2dµ−

( ∫fdµ

)2

its variance with respect to the standard Gaussian measure µ on RN2. When f is

smooth enough, the heat equation for the Ornstein-Uhlenbeck semigroup (P)t≥0 with

generator L on RN2

allows one to write

varµ(f) = −∫ ∞

0

dtd

dt

∫(Ptf)2dµ

= 2∫ ∞

0

dt

∫Ptf(−LPtf)dµ

= 2N∑

i,j=1

∫ ∞

0

dt

∫(∂ijPtf)2dµ.

From the integral representation (4.2) of the Ornstein-Uhlenbeck semigroup,

|∂ijPtf | ≤ e−t Pt(|∂ijf |

), i, j = 1, . . . , N, t ≥ 0. (4.7)

Hence, together with hypercontractivity (4.3), for every t ≥ 0 and every i, j = 1, . . . , N ,∫(∂ijPtf)2dµ ≤ e−2t

∫ [Pt

(|∂ijf |

)]2dµ ≤ e−2t‖∂ijf‖21+e−2t .

Setting u = e−2t, and v = u+ 1,

varµ(f) ≤N∑

i,j=1

∫ 1

0

‖∂ijf‖21+udu =N∑

i,j=1

∫ 2

1

‖∂ijf‖2v dv. (4.8)

By a simple upper-bound on the right-hand side, this inequality may also be writtenas

Varµ(f) ≤ 4N∑

i,j=1

‖∂ijf‖221 + log

(‖∂ijf‖22/‖∂ijf‖21

) . (4.9)

Inequality (4.9) is actually due to M. Talagrand [Ta1] on the discrete cube, and wasinvestigated in [B-H] in the Gaussian case as a dual version of the logarithmic Sobolev(3.7).

41

Recall now the (Lipschitz) function

F (x) = maxπ

∑(i,j)∈π

xij , x = (xij)0≤i,j≤N ∈ RN2.

The idea is to apply (4.8) or (4.9) to this function F to gain a factor logN in the varianceby showing that ‖∂ijF‖1 is small enough, at least on sufficiently many coordinates(i, j). It is not immediate how this may be achieved*. To overcome this difficulty, theauthors of [B-K-S] had to resort to an averaging lemma in the context of non-orientedpercolation for Bernoulli variables (see also [B-R] for more general distributions). Itis a natural hope that the same argument may be adapted to the present context toyield a proof of the following conjecture.

Conjecture 4.1. Let W be the directed last-passage time of an array of independent

standard Gaussian random variables on the square from (1, 1) to (N,N), N ≥ 2. Then

var (W ) ≤ CN

logN

where C > 0 is numerical.

Provided the argument develops suitably, the proof should then apply more gener-ally to examples where both the hypercontractive bound and the commutation prop-erty (4.7) may be applied. One instance would be the example of uniform randomvariables. A further example is the case of exponential variables, for which howeverthe much stronger Corollary 2.5 is available.

* The published proof of this result in the GAFA Seminar Lecture Notes is erroneoussince it assumes that all the sets Aπ are equally distributed. It is thus necessary tomore carefully follow the argument in [B-K-S]. We are grateful to R. Rossignol forpointing out this mistake to us.

42

5. MOMENT METHODS

In this chapter, we take a somewhat different route from the one of Chapter 4 and con-centrate on moments of the spectral distribution. Moment methods and combinatorialarguments are at the roots of the study of random matrix models, and for exampleare typically used in proofs of Wigner’s theorem (cf. Section 1.1). The combinatorialpart has been significantly improved by A. Soshnikov in [So1] to reach fluctuation re-sults. Here, we again make advantage of the orthogonal polynomial structure to deriverecurrence equations for moments, which may be shown of interest in non-asymptoticdeviation inequalities. The strategy is based on integration by parts for the underlyingMarkov operator of the orthogonal polynomial ensemble.

In the first paragraph, we derive in this way the moment equations of the GUEmodel using simple integration by parts arguments for the Hermite operator. Wethen emphasize their usefulness in non-asymptotic deviation inequalities on the largesteigenvalues. Below the mean, we follow an argument by H. Widom relying on a simpletrace inequality.

5.1 Moment recurrence equations

Let µ denote again the standard Gaussian measure on R. By integration by parts, forevery smooth function f on R, ∫

xfdµ =∫f ′dµ. (5.1)

(This formula is actually a particular case of the integration by parts formula (4.1)applied to g = P1 = x the first eigenvector of the one-dimensional Ornstein-Uhlenbeckoperator Lf = f ′′ − xf ′ with eigenvalue 1.)

By (1.13), moments of the mean spectral measure (1.4) amounts to moments oforthogonal polynomial measures. In this direction, we examine first a reduced case.Let

ap = aNp =∫x2pP 2

Ndµ, p ∈ N,

43

where we recall that the Hermite polynomials P`, ` ∈ N, are normalized in L2(µ). (Theodd moments are zero by symmetry.) By (5.1),

ap =∫xx2p−1P 2

Ndµ = (2p− 1)ap−1 + 2∫x2p−1PNP

′Ndµ. (5.2)

Repeating the same step,

ap − (4p− 3)ap−1 + (2p− 1)(2p− 3)ap−2

= 2∫x2p−2P ′N

2dµ+ 2

∫x2p−2PNP

′′Ndµ.

(5.3)

Now, PN is an eigenfunction of −L with eigenvalue N . Thus, by the integration byparts formula (4.1) for L,

Nap =∫x2pPN (−LPN )dµ = 2p

∫x2p−1PNP

′Ndµ+

∫x2pP ′N

2dµ.

Together with (5.2), ∫x2pP ′N

2dµ = (N − p)ap + p(2p− 1)ap−1. (5.4)

In the same way, on the basis of (5.2),

N[ap − (2p− 1)ap−1

]= 2

∫x2p−1(−LPN )P ′Ndµ

= 2(2p− 1)∫x2p−2P ′N

2dµ+ 2

∫x2p−1P ′NP

′′Ndµ.

Now, since P ′N =√N PN−1 (which may be checked from the generating function of

the Hermite polynomials) is eigenfunction of −L with eigenvalue N − 1, we also havethat

(N − 1)[ap − (2p− 1)ap−1

]= 2

∫x2p−1PN (−LP ′N )dµ

= 2(2p− 1)∫x2p−2PNP

′′Ndµ+ 2

∫x2p−1P ′NP

′′Ndµ.

Substracting to the latter,

ap − (2p− 1)ap−1 = 2(2p− 1)∫x2p−2P ′N

2dµ− 2(2p− 1)

∫x2p−2PNP

′′Ndµ,

so that, by (5.4),

2(2p− 1)∫x2p−2PNP

′′Ndµ = −ap + (2p− 1)(2N − 2p+ 1)ap−1

+ (2p− 1)(2p− 2)(2p− 3)ap−2.

(5.5)

44

Plugging (5.5) and (5.4) into (5.3) finally shows the recurrence equations

ap = (4N + 2)2p− 1

2pap−1 +

(2p− 1)(2p− 2)(2p− 3)2p

ap−2 (5.6)

(a0 = 1, a1 = 2N + 1).

We now make use of the following elementary lemma that appears as a version inthis context of the classical Christoffel-Darboux formula (1.15).

Lemma 5.1. For every integer k ≥ 1, and every N ≥ 1,

k

∫xk−1

N−1∑`=0

P 2` dµ =

√N

∫xkPNPN−1dµ.

Proof. Let A be the first order operator Af = f ′−xf acting on smooth functions f onthe real line R. The integration by parts formula for A (analogous to (4.1)) indicatesthat for smooth functions f and g,∫

g(−Af)dµ =∫g′fdµ.

Since −LPN = NPN and P ′N =√NPN−1 for every N ≥ 1, the recurrence relation for

the (normalized) Hermite polynomials PN , takes the form

xPN =√N + 1PN+1 +

√N PN−1.

Hence,A(P 2

N ) = PN[2P ′N − xPN

]=√N PNPN−1 −

√N + 1PN+1PN .

Therefore,

(−A)(N−1∑

`=0

P 2`

)=√N PNPN−1

from which the conclusion follows from the integration by parts formula for A.

Recall now the GUE random matrix X = XN with σ2 = 14N and N ≥ 1 fixed. Set,

for every integer p,

bp = bNp =1N

E(Tr (X2p)

)= E

(1N

N∑i=1

(λNi

)2p)

=∫

R

(x

2√N

)2p 1N

N−1∑`=0

P 2` dµ

45

where we used (1.13). (The odd moments are zero by symmetry.) By Lemma 5.1,

(2p− 1)∫x2p−2

N−1∑`=0

P 2` dµ =

∫x2p−1PNP

′Ndµ

so that, by (5.2),

22p−1Np(2p− 1)bp−1 = ap − (2p− 1)ap−1

for every p ≥ 1. As a consequence of (5.6), we may then deduce the following recurrenceequations on the moments of X.

Proposition 5.2. For every integer p ≥ 2,

bp =2p− 12p+ 2

bp−1 +2p− 12p+ 2

· 2p− 32p

· p(p− 1)4N2

bp−2

(b0 = 1, b1 = 14 .)

This recurrence equation, reminiscent of the three-step recurrence equation for or-thogonal polynomials, was first put forward in an algebraic context by J. Harer and D.Zagier [H-Z] (to determine the Euler characteristics of moduli spaces of curves). It isalso discussed in the book by M. L. Mehta [Meh]. The proof above is essentially due toU. Haagerup and S. Thorbjørnsen [H-T]. Similar recurrence identities may be estab-lished, with the same strategy, for the Laguerre and Jacobi orthogonal polynomials,and thus the corresponding moments of Wishart and Beta matrices [H-T], [Le3].

It should be pointed out that the equation

χp =2p− 12p+ 2

χp−1 =(2p)!

22pp!(p+ 1)!(5.7)

is the recurrence relation of the (even) moments of the semicircle law (the so-calledCatalan numbers, the number of non-crossing pair partitions of {1, 2, . . . , 2p}). Inparticular, Proposition 5.2 may then be used to produce a quick proof of the Wignertheorem, showing namely that bNp → χp for every p. Moreover, for every fixed p andevery N ≥ 1,

χp ≤ bNp ≤ χp +CpN2

where Cp > 0 only depends on p.

46

5.2 Upper tails on the right of the mean

We next make use of the recurrence equations of Proposition 5.2 to recover the sharpexponential bounds on the probability that the largest eigenvalues of the GUE matrixexceeds its limiting value discussed by other means in the preceding chapters.

We start again from (4.5). Together with Markov’s inequality, for every N ≥ 1,ε > 0 and p ≥ 0,


)≤ (1 + ε)−2pNbp

where we recall that bp = bNp are the 2p-moments of µN (or X = XN ). Now, byinduction on the recurrence formula of Proposition 5.2 for bp, it follows that, for everyp ≥ 2,

bp ≤(

1 +p(p− 1)

4N2

)pχp. (5.8)

By Stirling’s formula,

χp ≤C

p3/2, p ≥ 1. (5.9)

Hence, for 0 < ε ≤ 1 and some numerical constant C > 0 possibly changing from lineto line below,


)≤ CNp−3/2 e−εp+p

3/4N2.

Therefore, optimizing in p ∼√εN , 0 < ε ≤ 1, we recover the sharp small deviation

inequality


)≤ C e−Nε

3/2/C ,

N ≥ 1, 0 < ε ≤ 1, C > 0 numerical, of Proposition 2.1. When ε ≥ 1, the optimizationis modified to recover the large deviation rate of the order of Nε2. With respect tothe hypercontractive approach of Chapter 4, no further polynomial factors have to beadded.

As observed by S. Szarek in [Sza], the moment recurrence equation of Proposition5.2 and the preceding argument may be used to reach the sharp upper bound (2.12)on E(λNmax), and even

E(

max1≤i≤N

|λNi |)≤ 1− 1

CN2/3(5.10)

for some numerical C > 0 and all N ≥ 1. Indeed, for every p ≥ 1,

E(

max1≤i≤N

|λNi |)≤

(Nbp

)1/2p,

and thus, by (5.8),

E(

max1≤i≤N

|λNi |)≤

(Nχp ep

3/4N2)1/2p. (5.11)

47

If t > 0 and N ≥ 1 are such that p = [tN2/3] ≥ 3, then, together with (5.9),

E(

max1≤i≤N

|λNi |)≤

[(C2/3

t

)3/2t

et2/4

]1/2N2/3

.

The constant C > 0 in (5.9) may be taken to be π−1/2 so that taking for examplet = C

√e shows that the bracket in the preceding inequality is strictly less than 1.

Therefore (5.10) holds except for some few values of N which may be checked directlyon the basis of (5.11).

As a consequence of (5.8), for every t > 0,

supN≥1

NbN[tN2/3] ≤ Ct−3/2eCt3

(5.12)

for the moments of the GUE. One important step in Soshnikov’s extension [So1] of theTracy-Widom theorem to more general (real or complex) Wigner matrices amounts toestablish that lim supN→∞NbN

[N2/3]<∞, actually

lim supN→∞

NbN[tN2/3] ≤ Ct−3/2eCt3

for every t > 0. This is accomplished through delicate combinatorial arguments onmoments. It is however an open question so far whether its non-asymptotic version(5.12) also holds for these families of random matrices, which would then yield theexpected deviation inequalities for every fixed N ≥ 1. It would be of particular interestto study the case of Bernoulli entries.

Very recently, O. Khorunzhiy [Kh] developed Gaussian integration by parts meth-ods at the level of traces, together with triangle recurrence schemes, to reach momentbounds on both the GUE and the GOE. The estimates provide exact expressions forthe 1/N -corrections of the moments, but do not allow yet for the sharp deviation in-equalities on largest eigenvalues. This first step outside the orthogonal polynomialmethod might however be promising.

This strategy relying on integration by parts for Markov generators and momentequations may be used similarly for some other classical orthogonal polynomial ensem-bles of the continuous variables. For example, the Laguerre and Jacobi ensembles arestudied along these lines in [Le3] to yield deviation inequalities at the Tracy-Widomrate of the largest eigenvalue of Wishart and Beta matrices. Discrete examples mayalso be considered, although not necessarily through recurrence equations, but ratherthe explicit expression for moments (this is actually also possible in the continuousvariable). For example, integration by parts with respect to the negative binomialdistribution (1.17) with parameters q and γ reads∫

xfdµ =γq

1− q

∫f(x+ 1)dµ

48

for any, say, polynomial function f on N. It allows one to express the factorial moment(of order p) of the mean spectral measure of the Meixner Ensemble as∫

x(x− 1) · · · (x−p+ 1)1N

N−1∑`=0

P 2` dµ

=( q

1− q

)p p∑i=0

q−i(p

i

)2 1N

N−1∑`=i

(γ + `)p−i ` !(`− i)!

(5.13)

where P` are the associated normalized Meixner polynomials. Therefore, if Q is theCoulomb gas of the Meixner Ensemble, by the union bound inequality (4.4) and therepresentation (1.13) of the spectral measure, for every t ≥ 0,

Q({

max1≤i≤N

xi ≥ t})

≤∫ ∞

t

N−1∑`=0

P 2` dµ

≤ (t− p)!t!

( q

1− q

)p p∑i=0

q−i(p

i

)2 N−1∑`=i

(γ + `)p−i ` !(`− i)!

.

Stirling’s formula may then be used to control the right-hand side of the latter, andto derive exponential deviation inequalities on the rightmost charge. Together withJohansson’s combinatorial formula (1.19), the conclusion of Proposition 2.2 on therandom growth functions W may be recovered in this way, avoiding superadditivityand large deviation arguments. In the limit from the Meixner Ensemble to the length ofthe longest increasing subsequence Ln, it also covers the tail inequality (2.9) (cf. [Le4]).However, the key of the analysis still relies on the orthogonal polynomial representation.

5.3 Upper tails on the left of the mean

We next turn, with the tool of moment identities, to the probability that the largesteigenvalue of the GUE is less than or equal to 1. As discussed in Chapter 2, bounds onthis probability turn out to be much more delicate. We present here a simple inequality,in the context of the GUE, taken from the note [Wid2] by H. Widom.

We start from the determinantal description (1.11)

P({λNmax ≤ t}

)= det

(Id−K

)where, for each t ∈ R, K = Kt is the symmetric N ×N matrix

(〈P`−1, Pk−1〉L2((t/σ,∞),dµ))1≤k,`≤N .

Since for any unit vector u = (u1, . . . , uN ) ∈ RN ,

0 ≤N∑

k,`=1

uku`〈P`−1, Pk−1〉L2((t/σ,∞),dµ) ≤ 1,

49

the eigenvalues ρ1, . . . , ρN of K are all non-negative and less than or equal to 1. Hence,

P({λNmax ≤ t}

)=

N∏i=1

(1− ρi) ≤ e−ΣNi=1ρi .

Now, by (1.13),

N∑i=1

ρi =N∑`=1

〈P 2`−1〉L2((t/σ,∞),dµ)

= NµN((t,∞)

).

Therefore, for every t ∈ R,

P({λNmax ≤ t}

)≤ exp

(−NµN

((t,∞)

)). (5.14)

Note that (5.14) would be the inequality that one would deduce if the λNi ’s wereindependent. The latter are however strongly correlated so that (5.14) already missesa big deal of the interactions between the eigenvalues.

Let us now apply (5.14) to t = 1 − ε, 0 < ε ≤ 1. By Wigner’s theorem,µN ((1− ε,∞) → ν((1− ε, 1)) where we recall that ν is the semicircle law (on (−1,+1)).It is easy to evaluate

ν((1− ε, 1)

)≥ C−1ε3/2, 0 < ε ≤ 1,

where C > 0 is numerical. We expect that

µN((1− ε,∞)

)≥ C−1ε3/2, 0 < ε ≤ 1, (5.15)

at least for every N ≥ 1 such that CN−2/3 ≤ ε ≤ 1. To this task, we could invoke arecent result of F. Gotze and A. Tikhomirov [G-T] on the rate of convergence of thespectral measure of the GUE to the semicircle which implies that∣∣∣µN(

(1− ε,∞))− ν

((1− ε, 1)

)∣∣∣ ≤ C

N

for some C > 0 and all N ≥ 1, and thus (5.15). While the proof of [G-T] requires quitea bit of analysis, we provide here an independent elementary argument to reach (5.15)using the moment equations.

Fix N ≥ 1 and 0 < ε ≤ 1. For every p ≥ 1,

b2p =∫

Rx4pdµN (x) ≤ (1− ε)2p bp + 2

∫ ∞

1−εx4pdµN (x).

From the recurrence equations put forward in Proposition 5.2, for every p,

b2p ≥ χ2p

50

while (cf. (5.8))

bp ≤(

1 +p(p− 1)

4N2

)pχp ≤ ep

3/4N2χp.

By the Cauchy-Schwarz inequality,∫ ∞

1−εx4pdµN (x) ≤ µN

((1− ε,∞)

)1/2b1/24p .

Hence,

µN((1− ε,∞)

)≥ 4−1 e−16p3/N2

χ−14p

[χ2p − ep

3/4N2χp

]2

.

Choose then p = [ε−1] and assume that N−2/3 ≤ ε ≤ 1. Then

µN((1− ε,∞)

)≥ e−18 χ−1

4p

[χ2p − e1/4 χp

]2

.

Since by Stirling’s formula χp ∼ π−1/2p−3/2 as p→∞, uniform bounds show that forsome constant C > 0,

µN((1− ε,∞)

)≥ C−1ε3/2.

Together with (5.14), we thus conclude that for some C > 0, every N ≥ 1 andevery ε such that N−2/3 ≤ ε ≤ 1,

P({λNmax ≤ 1− ε}

)≤ C e−Nε

3/2/C . (5.16)

Increasing if necessary C, the inequality easily extends to all 0 < ε ≤ 1. The deviationinequality (5.16) is weaker than the one of Proposition 2.4 and does not reflect the N2

rate of the large deviation asymptotics. Its proof is however quite accessible, and givesa firm basis to N−4/3 growth rate of Corollary 2.6.

The preceding argument may be extended to more general orthogonal polynomialensembles provided the corresponding version of (5.15) can be established. In casefor example of the Meixner Ensemble, the explicit expression (5.13) for the factorialmoments of the mean spectral measure might be useful to this task.

51

REFERENCES

[A-D] D. Aldous, P. Diaconis. Longest increasing subsequences: from patience sorting to the

Baik-Deift-Johansson theorem. Bull. Amer. Math. Soc. 36, 413–432 (1999).

[A-K-V] N. Alon, M. Krivelevich, V. Vu. On the concentration of eigenvalues of random sym-

metric matrices. Israel J. Math. 131, 259–267 (2002).

[Ar] L. Arnold. On the asymptotic distribution of the eigenvalues of random matrices. J. Math.

Anal. Appl. 20, 262–268 (1967).

[Au] G. Aubrun. An inequality about the largest eigenvalue of a random matrix (2003). Seminaire

de Probabilites XXXVIII. Lecture Notes in Math. 1857, 320–337 (2005). Springer.

[Bai] Z. D. Bai. Methodologies in the spectral analysis of large dimensional random matrices: a

review. Statist. Sinica 9, 611–661 (1999).

[Baik1] J. Baik. Random vicious walks and random matrices. Comm. Pure Appl. Math. 53,

1385–1410 (2000).

[Baik2] J. Baik. Limit distribution of last passage percolation models (2003).

[B-D-K] J. Baik, P. A. Deift, K. Johansson. On the distribution of the length of the longest

increasing subsequence in a random permutation. J. Amer. Math. Soc. 12, 1119–1178

(1999).

[B-D-ML-M-Z] J. Baik, P. A. Deift, K. McLaughlin, P. Miller, X. Zhou. Optimal tail estimates for

directed last passage site percolation with geometric random variables. Adv. Theor. Math.

Phys. 5, 1207–1250 (2001).

[B-D-R] J. Baik, P. A. Deift, E. Rains. A Fredholm determinant identity and the convergence of

moments for random Young tableaux. Comm. Math. Phys. 223, 627–672 (2001).

[B-K-ML-M] J. Baik, T. Kriecherbauer, K. McLaughlin, P. Miller. Uniform Asymptotics for

Polynomials Orthogonal With Respect to a General Class of Discrete Weights and Universality

Results for Associated Ensembles. Announcements of results: Int. Math. Res. Not. 15, 821–

858 (2003). Full version: Annals of Math. Studies, to appear.

[Bak] D. Bakry. L’hypercontractivite et son utilisation en theorie des semigroupes. Ecole d’Ete

de Probabilites de St-Flour. Lecture Notes in Math. 1581, 1–114 (1994). Springer.

52

[Bar] Y. Baryshnikov. GUES and queues. Probab. Theor. Ral. Fields 119, 256–274 (2001).

[BA-G] G. Ben Arous, A. Guionnet. Large deviations for Wigner’s law and Voiculescu’s noncom-

mutative entropy. Probab. Theor. Rel. Fields 108, 517–542 (1997).

[BA-D-G] G. Ben Arous, A. Dembo, A. Guionnet. Aging of spherical spin glasses. Probab. Theor.

Relat. Fields 120, 1–67 (2001).

[B-R] M. Benaım, R. Rossignol. In preparation (2006).

[B-K-S] I. Benjamini, G. Kalai, O. Schramm. First passage percolation has sublinear distance

variance. Ann. Probab. 31, 1970–1978 (2003).

[Bi] P. Biane. Free probability for probabilists. Quantum Probability Communications, vol. XI,

55–71 (2003). World Sci. Publishing.

[B-I] P. Bleher, A. Its. Semiclassical asymptotics of orthogonal polynomials, Riemann-Hilbert

problem, and universality in the matrix model. Ann. Math. 150, 185–266 (1999).

[Bl] G. Blower. Concentration of measure for large eigenvalues of unitary ensembles, and Airy

operators (2004).

[B-H] S. Bobkov, C. Houdre. A converse Gaussian Poincare-type inequality for convex functions.

Statist. Probab. Letters 44, 281–290 (1999).

[B-O-O] A. Borodin, A. Okounkov, G. Olshanski. Asymptotics of Plancherel measures for sym-

metric groups. J. Amer. Math. Soc. 13, 481–515 (2000).

[Ch] T. S. Chihara. An introduction to orthogonal polynomials. Gordon & Breach (1978).

[Co] B. Collins. Product of random projections, Jacobi ensembles and universality problems

arising from free probability. Probab. Theory Relat. Fields 133, 315–344 (2005).

[Da-S] K. Davidson, S. Szarek. Local operator theory, random matrices and Banach spaces.

Handbook of the Geometry of Banach Spaces, vol. I, 317–366 (2001). North Holland.

[De] P. A. Deift. Orthogonal polynomials and random matrices: a Riemann-Hilbert approach.

CIMS Lecture Notes 3. Courant Institute of Mathematical Sciences (1999).

[D-Z] P. A. Deift, X. Zhou. A steepest descent method for oscillatory Riemann-Hilbert problems;

asymptotics for the MKdV equation. Ann. Math. 137, 295–368 (1993).

[D-K-ML-V-Z] P. A. Deift, T. Kriecherbauer, K. McLaughlin, S. Venakides, X. Zhou. Uniform

asymptotics for polynomials orthogonal with respect to varying exponential weights and ap-

plications to universality questions in random matrix theory. Comm. Pure Appl. Math. 52,

1335–1425 (1999).

[D-Ze] J.-D. Deuschel, O. Zeitouni. On increasing subsequences of I.I.D. samples. Combin.

Probab. Comput. 8, 247–263 (1999).

[Du-S] N. Dunford, J. T. Schwartz. Linear Operators. Part II: Spectral Theory. Wiley (1988).

[Fe] D. Feral. On large deviations for the spectral measure of discrete Coulomb gas (2004).

53

[Fo1] P. J. Forrester. The spectrum edge of random matrix ensembles. Nuclear Phys. B 402,

709–728 (1993).

[Fo2] P. J. Forrester. Log-gases and random matrices (2002).

[Fu] W. Fulton. Young tableaux. With applications to representation theory and geometry.

London Math. Soc. Student Texts, 35. Cambridge University Press (1997).

[Fy] Y. V. Fyodorov. Introduction to the random matrix theory: Gaussian Unitary Ensemble

and beyond. Recent perspectives in random matrix theory and number theory, London Math.

Soc. Lecture Note Ser., 322, 31–78 (2005). Cambridge Univ. Press.

[G-G] I. Gohberg, S. Goldberg. Basic operator theory. Birkhauser (1981).

[G-T] F. Gotze, A. Tikhomirov. The rate of convergence for spectra of GUE and LUE matrix

ensembles. Cent. Eur. J. Math. 3, 666–704 (2005).

[G-T-W] J. Gravner, G. Tracy, H. Widom. Limit theorems for height fluctuations in a class of

discrete space and time growth models. J. Statist. Phys. 102, 1085–1132 (2001).

[Gr] L. Gross. Logarithmic Sobolev inequalities. Amer. J. Math. 97, 1061–1083 (1975).

[G-P] O. Guedon, G. Paouris. Concentration of mass on the Schatten classes (2005).

[G-Z] A. Guionnet, O. Zeitouni. Concentration of the spectral measure for large matrices. Elec-

tron. Comm. Probab. 5, 119-136 (2000).

[H-T] U. Haagerup, S. Thorbjørnsen.Random matrices with complex Gaussian entries. Expo.

Math. 21, 293–337 (2003).

[H-Z] J. Harer, D. Zagier. The Euler characteristic of the moduli space of curves. Invent. math.

85, 457–485 (1986).

[H-P] F. Hiai, D. Petz. The semicircle law, free random variables and entropy. Mathematical

Surveys and Monographs 77. AMS (2000).

[Joha1] K. Johansson. Shape fluctuations and random matrices. Comm. Math. Phys. 209, 437–476

(2000).

[Joha2] K. Johansson. Universality of the local spacing distribution in certain ensembles of Hermi-

tian Wigner matrices. Comm. Math. Phys. 215, 683–705 (2001).

[Joha3] K. Johansson. Discrete orthogonal polynomial ensembles and the Plancherel measure. Ann.

Math. 259–296 (2001).

[Joha4] K. Johansson. Toeplitz determinants, random growth and determinantal processes. Pro-

ceedings of the International Congress of Mathematicians, Vol. III (Beijing, 2002), 53–62.

Higher Ed. Press, Beijing (2002).

[John] I. M. Johnstone. On the distribution of the largest principal component. Ann. Statist. 29,

295–327 (2001).

[Kh] O. Khorunzhyi. Estimates for moments of random matrices with Gaussian elements (2005).

54

[K-S] R. Koekoek, R. F. Swarttouw. The Askey scheme of hypergeometric orthogonal polyno-

mials and its q-analogue (1998). http://aw.twi.tudelft.nl/˜koekoek/askey/index.html.

[Ko] W. Konig. Orthogonal polynomial ensembles in probability theory. Probability Surveys 2,

385–447 (2005).

[Ku] A. Kuijlaars. Riemann-Hilbert analysis for orthogonal polynomials. Orthogonal polyno-

mials and special functions (Leuven, 2002). Lecture Notes in Math. 1817, 167–210 (2003).

Springer.

[Le1] M. Ledoux. The concentration of measure phenomenon. Mathematical Surveys and Mono-

graphs 89. AMS (2001).

[Le2] M. Ledoux. A remark on hypercontractivity and tail inequalities for the largest eigenvalues of

random matrices. Seminaire de Probabilites XXXVII. Lecture Notes in Math. 1832, 360–369

(2003). Springer.

[Le3] M. Ledoux. Differential operators and spectral distributions of invariant ensembles from

the classical orthogonal polynomials. The continuous case. Electron. J. Probab. 9, 177–208

(2004).

[Le4] M. Ledoux. Differential operators and spectral distributions of invariant ensembles from

the classical orthogonal polynomials. The discrete case. Electron. J. Probab. 10, 1116-1146

(2005).

[L-T] M. Ledoux, M. Talagrand. Probability in Banach spaces (Isoperimetry and processes).

Ergebnisse der Mathematik und ihrer Grenzgebiete. Springer (1991).

[L-M] M. Lowe, F. Merkl. Moderate deviations for longest increasing subsequences: the upper

tail. Comm. Pure Appl. Math. 54, 1488–1520 (2001).

[L-M-R] M. Lowe, F. Merkl, S. Rolles. Moderate deviations for longest increasing subsequences:

the lower tail. J. Theoret. Probab. 15, 1031–1047 (2002).

[M-P] V. Marchenko, L. Pastur. The distribution of eigenvalues in certain sets of random

matrices. Math. Sb. 72, 507–536 (1967).

[Ma] O. Mazet. Classification des semi-groupes de diffusion sur R associes a une famille de

polynomes orthogonaux. Seminaire de Probabilites XXXI. Lecture Notes in Math. 1655,

40–53 (1997). Springer.

[Mec] M. Meckes. Concentration of norms and eigenvalues of random matrices. J. Funct. Anal.

211, 508–524 (2004).

[Meh] M. L. Mehta. Random matrices. Academic Press (1991). Third edition Elsevier (2004).

[Ne] E. Nelson. The free Markov field. J. Funct. Anal. 12, 211–227 (1973).

[OC] N. O’Connell. Random matrices, non-colliding processes and queues. Seminaire de Proba-

bilites XXXVI. Lecture Notes in Math. 1801, 165-182 (2003). Springer.

55

[OC-Y] N. O’Connell, M. Yor. A representation for non-colliding random walks. Electron. Com-

mun. Probab. 7, 1–12 (2000).

[P-L] L. Pastur, A. Lejay. Matrices aleatoires: statistique asymptotique des valeurs propres.

Seminaire de Probabilites XXXVI. Lecture Notes in Math. 1801, 135–164 (2003). Springer.

[P-S] L. Pastur, M. Sherbina. On the edge unviversality of the local eigenvalue statistics of

matrix models (2003).

[S-T] E. B. Saff, V. Totik. Logarithmic potentials with external fields. Springer (1997).

[Se1] T. Seppalainen. Large deviations for increasing sequences on the plane. Probab. Theory

Relat. Fields 112, 221–244 (1998).

[Se2] T. Seppalainen. Exact limiting shape for a simplified model of first-passage percolation on

the plane. Ann. Probab. 26, 1232-1250 (1998).

[Se3] T. Seppalainen. Coupling the totally asymmetric simple exclusion process with a moving

interface. Markov Proc. Relat. Fields 4, 593–628 (1998).

[So1] A. Soshnikov. Universality at the edge of the spectrum in Wigner random matrices. Comm.

Math. Phys. 207, 697–733 (1999).

[So2] A. Soshnikov. Determinantal random point fields. Uspekhi Mat. Nauk 55, 107–160 (2000).

[Sza] S. Szarek. Volume of separable states is super-doubly-exponentially small in the number of

qubits. Phys. Rev. A, 72, (2005).

[Sze] G. Szego. Orthogonal polynomials. Colloquium Publications XXIII. Amer. Math Soc.

(1975).

[Ta1] M. Talagrand. On Russo’s approximate zero-one law. Ann. Probab. 22, 1576–1587 (1994).

[Ta2] M. Talagrand. Concentration of measure and isoperimetric inequalities in product spaces.

Publications Mathematiques de l’I.H.E.S. 81, 73–205 (1995).

[Ta3] M. Talagrand. The generic chaining. Springer (2005).

[T-W1] C. Tracy, H. Widom. Level-spacing distribution and the Airy kernel. Comm. Math. Phys.

159, 151–174 (1994).

[T-W2] C. Tracy, H. Widom. On Orthogonal and Symplectic Matrix Ensembles. Comm. Math.

Phys. 177, 727–754 (1996).

[T-W3] C. Tracy, H. Widom. On the distributions of the lengths of the longest monotone subse-

quences in random words. Probab. Theor. Rel. Fields 119, 350–380 (2001).

[T-W4] C. Tracy, H. Widom.Distribution functions for largest eigenvalues and their applications.

Proceedings of the International Congress of Mathematicians, Vol. I (Beijing, 2002), 587–596.

Higher Ed. Press, Beijing (2002).

[Vo1] D. Voiculescu. The analogues of entropy and of Fisher’s information measure in free prob-

ability theory, I. Comm. Math. Phys. 155, 71–92 (1993).

56

[Vo2] D. Voiculescu. Lecture on free probability theory. Ecole d’Ete de Probabilites de St-Flour.

Lecture Notes in Math. 1738, 279–349 (2000). Springer.

[V-D-N] D. Voiculescu, K. Dykema, A. Nica. Free random variables. CRM Monograph Ser. 1.

Amer. Math. Soc. (1992).

[Wid1] H. Widom. On the relation between orthogonal, symplectic and unitary matrix ensembles.

J. Statist. Phys. 94, 347–363 (1999).

[Wid2] H. Widom. On convergence of moments for random Young tableaux and a random growth

model. Int. Math. Res. Not. 9, 455–464 (2002).

[Wig] E. Wigner. Characteristic vectors of bordered matrices with infinite dimensions. Ann.

Math. 62, 548–564 (1955).

Institut de Mathematiques, Universite Paul-Sabatier, 31062 Toulouse, France

E-mail address: [email protected]

57

summer school connections between probability and ...ledoux/jerusalem.pdf · ous mathematical...

Documents