
THE DISTRIBUTION OF HEIGHT AND DIAMETER IN RANDOM NON-PLANE BINARY TREES

NICOLAS BROUTIN AND PHILIPPE FLAJOLET

ABSTRACT. This study is dedicated to precise distributional analyses of the height of non-plane unlabelled binary trees (“Otter trees”), when trees of a given size are taken with equal likelihood. The height of a rooted tree of size $n$ is proved to admit a limiting theta distribution, both in a central and local sense, and to obey moderate as well as large deviations estimates. The approximations obtained for height also yield the limiting distribution of the diameter of unrooted trees. The proofs rely on a precise analysis, in the complex plane and near singularities, of generating functions associated with trees of bounded height.

Introduction

We consider trees that are binary, non-plane, unlabelled, and rooted; that is, a tree is taken in the graph-theoretic sense and it has nodes of (out)degree two or zero only; a special node is distinguished, the root, which has degree two. In this model, the nodes are indistinguishable, and no order is assumed between the neighbours of a node. Let $\mathcal{Y}$ denote the class of such trees, and let $\mathcal{Y}_n$ be the subset consisting of trees with $n$ external nodes (i.e., nodes of degree zero). In this article, we study the (random) height $H_n$ of a tree sampled uniformly from $\mathcal{Y}_n$.

Most of the results concerning random trees of fixed size are relative to the situation where one can distinguish the neighbours of a node, either by their labels (labelled trees), or by the order induced on the progeny through an embedding in the plane (plane trees); see the reference books [14, 19] and the discussion by Aldous [3] who globally refers to these as “ordered trees”. In this range of models, Meir and Moon [33] determined that the depth of nodes is typically $O(\sqrt{n})$ for all “simple varieties” of trees, which are determined by restricting in an arbitrary way the collection of allowed node degrees. Regarding height, a few special cases were studied early: Rényi and Szekeres [38] proved in particular that the average height of labelled non-plane trees of size $n$ is asymptotic to $2\sqrt{\pi n}$; De Bruijn, Knuth, and Rice [12] dealt with plane trees and showed that the average height is equivalent to $\sqrt{\pi n}$ as $n \to \infty$. Eventually, Flajolet and Odlyzko [18] developed an approach for height that encompasses all simple varieties of trees; see also [20] for additional results.

Under such models with distinguishable neighbourhoods, trees of a fixed size $n$ may be seen as Galton–Watson processes (branching processes) conditioned on the size being $n$, see [1, 26, 28], and there are natural random walks associated to various tree traversals. Accordingly, probabilistic techniques have been successfully applied to quantify tree height and width [8, 9], based on Brownian excursion. An important probabilistic approach consists in establishing the existence of a continuous limit of suitably rescaled random trees of increasing sizes; one can then read off, to first asymptotic order at least, some of the limit parameters directly on the limiting object. The latter point of view has been adopted by Aldous [2, 3, 4] in his definition of the continuum random tree (CRT): see the survey by Le Gall [30] for a recent account of probabilistic developments along these lines.

Date: May 18, 2011.

The case of trees (as are considered here) that have indistinguishable neighbourhoods is essentially different. Such trees cannot be generated by a branching process conditioned by size and no direct random walk approach appears to be possible, due to the inherent presence of symmetries. (An analysis of such symmetries otherwise occurs in the recent article [6].) The analysis of unlabelled non-plane trees finds its origins in the works of Pólya [36] and Otter [35]. However, these authors mostly focused on enumeration; the problem of characterizing typical parameters of these random trees remained largely untouched. Recently, in an independent study, Drmota and Gittenberger [15] have examined the profile of “general” trees (where all degrees are allowed) and shown that the joint distribution of the number of nodes at a finite number of levels converges weakly to the finite dimensional distribution of Brownian excursion local times. They further extended the result to a convergence of the entire profile to the Brownian excursion local time.

The foregoing discussion suggests that, although there is no clear exact reduction of unlabelled non-plane trees to random walks, such trees largely behave like simply generated families of ordered trees. In particular, it suggests that the rescaled height $H_n/\sqrt{n}$ is likely to admit a limit distribution of the theta-function type [16, 18, 27, 38]. We shall prove that such is indeed the case for non-plane binary trees in Theorems 1 and 2 below. We also provide moderate and large deviations estimates (Theorems 4 and 5), as well as asymptotic estimates for moments (Theorem 3), see §5. Equipped with solid analytic estimates regarding height, we can then proceed to characterize the diameter of unrooted trees in §6, this both in a local and central form (Theorems 6 and 7). Some a posteriori observations that complete the picture are offered in our Conclusion section, §7.

A preliminary investigation of the distribution of height in rooted trees is reported in the extended abstract [7]. Our interest in this range of problems initially arose from questions of Jean-François Marckert and Grégory Miermont [32], in their endeavour to extend the probabilistic methods of Aldous to non-plane trees and develop corresponding continuous models; we are indebted to them for being at the origin of the present study.

1. Trees and generating functions

Tree enumeration. Our approach is entirely based on generating functions. The class $\mathcal{Y}$ of (non-plane, unlabelled, rooted) binary trees is defined to include the tree with a single external node. A tree has size $n$ if it has $n$ external nodes, hence $n - 1$ internal nodes. The cardinality of the subclass $\mathcal{Y}_n$ of trees of size $n$ is denoted by $y_n$ and the generating function (GF) of $\mathcal{Y}$ is

$$y(z) := \sum_{n \ge 1} y_n z^n = z + z^2 + z^3 + 2z^4 + 3z^5 + 6z^6 + 11z^7 + 23z^8 + \cdots,$$

the coefficients corresponding to the entry A001190 of Sloane’s On-line Encyclopedia of Integer Sequences. The trees of $\mathcal{Y}$ with size at most 6 are shown in Figure 1.

Figure 1. The binary unlabelled trees of size less than six.

A binary tree is either an external node or a root appended to an unordered pair of two (not necessarily distinct) binary trees. In the language of analytic combinatorics [19], this corresponds to the (recursive) specification

$$\mathcal{Y} = \mathcal{Z} + \mathrm{MSet}_2(\mathcal{Y}),$$

where $\mathcal{Z}$ represents a generic atom (of size 1) and $\mathrm{MSet}_2$ forms multisets of two elements. The basic functional equation

$$y(z) = z + \frac{1}{2}\, y(z)^2 + \frac{1}{2}\, y(z^2), \tag{1}$$

closely related to the early works of Pólya (1937; see [36, 37]) and first studied by Otter (1948; see [35]), follows from fundamental principles of combinatorial enumeration [19, 25]. The term $\frac{1}{2} y(z^2)$ accounts for potential symmetries; hereafter, we refer to such terms involving functions of $z^2, z^3, \ldots$ as Pólya terms. According to the general theory of analytic combinatorics, we shall operate in an essential manner with properties of generating functions in the complex plane. The following lemma is classical but we sketch a proof, as its ingredients are needed throughout our work.
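As an illustration of the functional equation (1), extracting the coefficient of $z^n$ on both sides gives, for $n \ge 2$, $y_n = \frac{1}{2}\sum_{k=1}^{n-1} y_k y_{n-k} + \frac{1}{2} y_{n/2}$, the last term being present only for even $n$. The following Python sketch (the function name `otter_tree_counts` is ours and purely illustrative) uses this recurrence to recover the first coefficients of $y(z)$.

```python
def otter_tree_counts(n_max):
    """Return [y_0, ..., y_{n_max}] with y_0 = 0, read off from equation (1)."""
    y = [0] * (n_max + 1)
    if n_max >= 1:
        y[1] = 1                      # the tree reduced to a single external node
    for n in range(2, n_max + 1):
        # coefficient of z^n in y(z)^2: ordered pairs of subtrees
        pairs = sum(y[k] * y[n - k] for k in range(1, n))
        # coefficient of z^n in y(z^2): pairs of identical subtrees (Polya term)
        twins = y[n // 2] if n % 2 == 0 else 0
        y[n] = (pairs + twins) // 2   # the division is always exact
    return y


counts = otter_tree_counts(10)
print(counts[1:])   # [1, 1, 1, 2, 3, 6, 11, 23, 46, 98]
```

The quadratic cost per coefficient is adequate for a sanity check; it is not meant as an efficient enumeration method.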

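Lemma 1 (Otter [35]). Let $\rho$ be the radius of convergence of $y(z)$. Then, one has $1/4 \le \rho < 1/2$, and $\rho$ is determined implicitly by $\rho + \frac{1}{2} y(\rho^2) = \frac{1}{2}$. As $z \to \rho^-$, the generating function $y(z)$ satisfies

$$y(z) = 1 - \lambda \sqrt{1 - z/\rho} + O(1 - z/\rho), \qquad \lambda = \sqrt{2\rho + 2\rho^2 y'(\rho^2)}. \tag{2}$$

Furthermore, the number $y_n$ of trees of size $n$ satisfies asymptotically

$$y_n = \frac{\lambda}{2\sqrt{\pi}} \cdot n^{-3/2} \rho^{-n} \left(1 + O\!\left(\frac{1}{n}\right)\right). \tag{3}$$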

Proof. The number of plane binary trees with $n$ external nodes is given by the Catalan number $C_{n-1} = \frac{1}{n}\binom{2n-2}{n-1}$. The number of symmetries in a tree of size $n$ being a priori between 1 and $2^{n-1}$, one has the bounds

$$C_{n-1}\, 2^{1-n} \le y_n \le C_{n-1}.$$

As is well known, the Catalan numbers satisfy $C_n \sim \pi^{-1/2} 4^n n^{-3/2}$, so that the radius of convergence $\rho$ satisfies the bounds $1/4 \le \rho < 1/2$. It follows that $y(z^2)$ is analytic in a disc of radius $\sqrt{\rho}$, which properly contains $|z| \le \rho$. Then, from (1), upon solving for $y(z)$, we obtain

$$y(z) = 1 - \sqrt{1 - 2z - y(z^2)}, \tag{4}$$

which can only become singular when the argument of the square root vanishes. By Pringsheim’s Theorem [19, p. 240], the value $\rho$ is then the smallest positive solution of $2z + y(z^2) = 1$, corresponding to a simple root, and, at this point, we must have $y(\rho) = 1$, given (4). This reasoning also justifies the singular expansion (2), which is seen to be valid in a $\Delta$-domain [19, §VI.3], i.e., a domain of the form

$$\{ z : |z| < \rho + \varepsilon,\ z \ne \rho,\ |\arg(z - \rho)| > \theta \}, \qquad \varepsilon, \theta > 0, \tag{5}$$

that extends beyond the disc of convergence $|z| \le \rho$.

Equation (3) constitutes Otter’s celebrated estimate: it results from translating the square root singularity of $y(z)$ by means of either Darboux’s method [25, 35, 36] or singularity analysis [19].

Numerically, one finds [17, 19, 35]:

$$\rho \doteq 0.40269\,750367, \qquad \lambda \doteq 1.13003\,37163, \qquad \frac{\lambda}{2\sqrt{\pi}} \doteq 0.31877\,66259.$$
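These numerical values can be reproduced, at least approximately, from the characterization of Lemma 1: $\rho$ is the positive root of $2z + y(z^2) = 1$ and $\lambda^2 = 2\rho + 2\rho^2 y'(\rho^2)$. The sketch below is one possible way of doing so and is not taken from the paper: it truncates the series of $y$ at 120 terms and locates the root by bisection on $[1/4, 1/2]$; the function names and both parameters are our own choices.

```python
import math


def tree_counts(n_max):
    """y[n] = number of non-plane binary trees with n external nodes."""
    y = [0] * (n_max + 1)
    y[1] = 1
    for n in range(2, n_max + 1):
        pairs = sum(y[k] * y[n - k] for k in range(1, n))
        twins = y[n // 2] if n % 2 == 0 else 0
        y[n] = (pairs + twins) // 2
    return y


def otter_constants(n_terms=120, iterations=80):
    """Approximate rho, lambda and lambda / (2 sqrt(pi)) from a truncated series."""
    y = tree_counts(n_terms)

    def series(w):            # truncated y(w); only used for 0 <= w <= 1/4
        return sum(y[n] * w ** n for n in range(1, n_terms + 1))

    def series_prime(w):      # truncated y'(w)
        return sum(n * y[n] * w ** (n - 1) for n in range(1, n_terms + 1))

    lo, hi = 0.25, 0.5        # bracket for rho given by Lemma 1
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if 2 * mid + series(mid * mid) < 1:
            lo = mid          # still to the left of the root of 2z + y(z^2) = 1
        else:
            hi = mid
    rho = (lo + hi) / 2
    lam = math.sqrt(2 * rho + 2 * rho ** 2 * series_prime(rho ** 2))
    return rho, lam, lam / (2 * math.sqrt(math.pi))


rho, lam, otter_c = otter_constants()
print(f"rho = {rho:.10f}, lambda = {lam:.10f}, lambda/(2 sqrt(pi)) = {otter_c:.10f}")
```

Only $y(z^2)$ is evaluated, at arguments at most $1/4$, well inside the disc of convergence, which is why a modest truncation order suffices.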

Height. In a tree, height is defined as the maximum number of edges along branches connecting the root to an external node. Let $y_{h,n}$ be the number of trees of size $n$ and height at most $h$ and let $y_h(z) = \sum_{n \ge 1} y_{h,n} z^n$ be the corresponding generating function. The arguments leading to (1) yield the fundamental recurrence

$$y_{h+1}(z) = z + \frac{1}{2}\, y_h(z)^2 + \frac{1}{2}\, y_h(z^2), \qquad h \ge 0, \tag{6}$$

with initial value $y_0(z) = z$, and

$$y_1(z) = z + z^2, \qquad y_2(z) = z + z^2 + z^3 + z^4, \qquad y_3(z) = z + z^2 + z^3 + 2z^4 + 2z^5 + 2z^6 + z^7 + z^8. \tag{7}$$
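The recurrence (6) is easy to iterate on coefficient lists. The following minimal sketch (the name `bounded_height_counts` is ours) computes the polynomials $y_h(z)$ truncated at a given degree. Note that the recurrence produces a term $z^8$ in $y_3(z)$, which counts the complete binary tree of height 3.

```python
def bounded_height_counts(h_max, n_max):
    """rows[h][n] = y_{h,n}: number of trees with n external nodes and height <= h."""
    row = [0] * (n_max + 1)
    row[1] = 1                                # y_0(z) = z
    rows = [row]
    for _ in range(h_max):
        nxt = [0] * (n_max + 1)
        nxt[1] = 1                            # the term z in (6)
        for n in range(2, n_max + 1):
            pairs = sum(row[k] * row[n - k] for k in range(1, n))   # [z^n] y_h(z)^2
            twins = row[n // 2] if n % 2 == 0 else 0                # [z^n] y_h(z^2)
            nxt[n] = (pairs + twins) // 2
        row = nxt
        rows.append(row)
    return rows


rows = bounded_height_counts(3, 8)
for h, coeffs in enumerate(rows):
    print(h, coeffs[1:])
```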

A central role in what follows is played by the generating function of trees with height exceeding $h$:

$$e_h(z) \equiv \sum_{n \ge 1} e_{h,n} z^n := y(z) - y_h(z).$$

Then, a routine calculation shows that the $e_h(z)$ satisfy the main recurrence

$$e_{h+1}(z) = y(z)\, e_h(z) \left(1 - \frac{e_h(z)}{2y(z)}\right) + \frac{e_h(z^2)}{2}, \qquad e_0(z) = y(z) - z, \tag{8}$$

on which our subsequent treatment of height is entirely based.

Analysis. The distribution of height is accessible by

$$\mathbb{P}\{H_n > h\} = \frac{y_n - y_{h,n}}{y_n} = \frac{e_{h,n}}{y_n}, \tag{9}$$

where $e_{h,n} = [z^n] e_h(z)$. Lemma 1 provides an estimate for $y_n$, and we shall get a handle on the asymptotic properties of $e_{h,n}$ by means of Cauchy’s coefficient formula,

$$e_{h,n} = \frac{1}{2i\pi} \int_\gamma e_h(z)\, \frac{dz}{z^{n+1}}, \tag{10}$$

upon choosing a suitable integration contour $\gamma$ in (10), of the form commonly used in singularity analysis theory [19]; see Figure 2 below. This task necessitates first developing suitable estimates of $e_h(z)$, for values of $z$ both inside and outside of the disc of convergence $|z| < \rho$. Precisely, we shall need estimates valid in a “tube” around an arc of the circle $|z| = \rho$, as well as inside a “sandclock” anchored at $\rho$ (see Figure 2).
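For small sizes, the ratio in (9) can be evaluated exactly, which gives a concrete feel for the objects being estimated. The sketch below (function name ours) iterates the recurrence (6) on integer coefficients and returns the tail probabilities of $H_n$ as exact fractions; it relies only on the fact that a tree of size $n$ has height at most $n - 1$.

```python
from fractions import Fraction


def height_tail(n):
    """Return [P(H_n > h) for h = 0, ..., n - 1] as exact fractions, following (9)."""

    def step(row):
        # one application of recurrence (6), truncated at degree n
        nxt = [0] * (n + 1)
        nxt[1] = 1
        for m in range(2, n + 1):
            pairs = sum(row[k] * row[m - k] for k in range(1, m))
            twins = row[m // 2] if m % 2 == 0 else 0
            nxt[m] = (pairs + twins) // 2
        return nxt

    rows = [[0, 1] + [0] * (n - 1)]           # y_0(z) = z
    for _ in range(n - 1):
        rows.append(step(rows[-1]))
    y_n = rows[-1][n]                         # y_{n-1,n} = y_n: every tree is counted
    return [Fraction(y_n - row[n], y_n) for row in rows]


tail = height_tail(5)
print(tail)            # tail probabilities for h = 0, ..., 4
print(sum(tail))       # E[H_5], as the sum of the tail probabilities
```

For $n = 5$ there are three trees, of heights 3, 3 and 4, so the tail is $1, 1, 1, 1/3, 0$ and the mean height is $10/3$.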

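Definition 1. The “tube” $\mathcal{T}(\mu, \eta)$ of width $\mu$ and angle $\eta$ is defined as

$$\mathcal{T}(\mu, \eta) := \{ z : -\mu < |z| - \rho < \mu,\ |\arg(z)| > \eta \}. \tag{11}$$

The “sandclock” of radius $r_0$ and angle $\theta_0$ anchored at $\rho$ is defined as

$$\mathcal{S}(r_0, \theta_0) := \{ z : |z - \rho| < r_0,\ \pi/2 - \theta_0 < |\arg(z - \rho)| < \pi/2 + \theta_0 \}. \tag{12}$$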
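The two regions of Definition 1 translate directly into membership tests. The following sketch is only meant to make the inequalities (11) and (12) concrete; the function names are ours, the constant `RHO` is the numerical value of $\rho$ quoted earlier, and the convention that the anchor point itself is excluded from the sandclock is our own choice (the argument is undefined there).

```python
import cmath
import math

RHO = 0.40269750367   # numerical value of rho quoted in the text


def in_tube(z, mu, eta, rho=RHO):
    """Membership in the tube T(mu, eta) of (11)."""
    return -mu < abs(z) - rho < mu and abs(cmath.phase(z)) > eta


def in_sandclock(z, r0, theta0, rho=RHO):
    """Membership in the sandclock S(r0, theta0) of (12)."""
    w = z - rho
    if w == 0:
        return False                  # arg(0) is undefined; exclude the anchor
    angle = abs(cmath.phase(w))       # phase takes values in (-pi, pi]
    return abs(w) < r0 and math.pi / 2 - theta0 < angle < math.pi / 2 + theta0


print(in_tube(RHO * cmath.exp(1j), 0.01, 0.5))        # a point on |z| = rho, away from rho
print(in_sandclock(RHO + 0.01j, 0.05, math.pi / 8))   # a point directly above rho
```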


Figure 2. Left: the “tube” and “sandclock” regions. Right: the Hankel contour used to estimate $e_{h,n}$ (details are given in Figure 3).

Strategy and overview of the results. Estimates of the sequence of generating functions $(e_h(z))$ within the disc of convergence and a tube, where $z$ stays away from the singularity $\rho$, are comparatively easy: they form the subject of Section 2. In particular, Proposition 1 states that we can always find thinner and thinner tubes that come arbitrarily close to the singularity $\rho$ and where the convergence $y_h \to y$, $e_h \to 0$ is ensured. The bulk of the technical work is relative to the sandclock, in Section 3, where Proposition 2 grants us the existence of a suitable sandclock for convergence. We can then develop in Section 4 our main approximation:

$$e_h(z) \equiv y(z) - y_h(z) \approx 2\, \frac{1 - y}{1 - y^h}\, y^h. \tag{13}$$

Here, the symbol “$\approx$” is to be loosely interpreted in the sense of “approximately equal”; a formal statement is postponed and summarized in Proposition 3.

The form of the approximation in (13) is similar to that in the original paper by Flajolet and Odlyzko [18] where trees are ordered. Its justification spans Sections 2–4, which closely follow the general strategy in [18]; however, nontrivial adaptations are needed, due to the presence of Pólya terms, so that the problem is no longer of a “pure” iteration type.

We then reap the crop in Section 5. There, we use (9), the approximation in (13) together with the square root singularity of $y$ at $\rho$ to prove the following theorem relative to the distribution of height $H_n$:

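Theorem 1 (Limit law of height). The height $H_n$ of a random tree taken uniformly from $\mathcal{Y}_n$ admits a limiting theta distribution: for any fixed $x > 0$, there holds

$$\lim_{n \to \infty} \mathbb{P}(H_n \ge \lambda^{-1} x \sqrt{n}) = \Theta(x), \qquad \lambda := \sqrt{2\rho + 2\rho^2 y'(\rho^2)},$$

where

$$\Theta(x) := \sum_{k \ge 1} (k^2 x^2 - 2)\, e^{-k^2 x^2/4}.$$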
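The limit function $\Theta$ is straightforward to evaluate, since its terms decay like Gaussians. The sketch below (function name and stopping rule are ours) sums the series until the terms are negligible; it can be used to tabulate the limiting tail probabilities of Theorem 1.

```python
import math


def theta_tail(x, tol=1e-18):
    """Theta(x) = sum_{k>=1} (k^2 x^2 - 2) exp(-k^2 x^2 / 4), for x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    total, k = 0.0, 1
    while True:
        t = k * k * x * x
        term = (t - 2.0) * math.exp(-t / 4.0)
        total += term
        # (t - 2) exp(-t / 4) is decreasing for t > 6, so the remaining
        # terms are smaller still once this test succeeds
        if t > 8.0 and abs(term) < tol:
            return total
        k += 1


for x in (1.0, 2.0, 3.0, 4.0, 5.0, 6.0):
    print(f"Theta({x}) = {theta_tail(x):.6f}")
```

For small $x$ the number of terms needed grows like $1/x$, and the individual terms largely cancel; the values obtained are then numerically indistinguishable from 1, as expected of a tail probability.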

Our formal version of the approximation in (13) (Proposition 3) is also strong enough to grant us access to a local limit law for the height $H_n$:


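Theorem 2 (Local limit law of height). The distribution of the height $H_n$ of a random tree taken uniformly from $\mathcal{Y}_n$ admits a local limit: for $x$ in a compact set of $\mathbb{R}_{>0}$ and $h = \lambda^{-1} x \sqrt{n}$ an integer, there holds uniformly

$$\mathbb{P}(H_n = h) \sim \frac{\lambda}{\sqrt{n}}\, \vartheta(x),$$

where

$$\vartheta(x) = -\Theta'(x) = (2x)^{-1} \sum_{k \ge 1} (k^4 x^4 - 6 k^2 x^2)\, e^{-k^2 x^2/4}.$$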
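The relation $\vartheta = -\Theta'$ can be checked numerically by comparing the closed form of $\vartheta$ with a centred finite difference of $\Theta$. The sketch below does this with a fixed number of series terms; the function names, the truncation at 400 terms and the step size are our own choices, adequate for $x$ of order 1.

```python
import math


def theta_tail(x, terms=400):
    """Theta(x), truncated series."""
    return sum((k * k * x * x - 2.0) * math.exp(-k * k * x * x / 4.0)
               for k in range(1, terms + 1))


def theta_density(x, terms=400):
    """Closed form of the local limit density, truncated series."""
    s = sum((k ** 4 * x ** 4 - 6.0 * k ** 2 * x ** 2) * math.exp(-k * k * x * x / 4.0)
            for k in range(1, terms + 1))
    return s / (2.0 * x)


def finite_difference_density(x, step=1e-5):
    """Centred finite-difference approximation of -Theta'(x)."""
    return (theta_tail(x - step) - theta_tail(x + step)) / (2.0 * step)


for x in (2.0, 3.0, 4.0, 5.0):
    print(x, theta_density(x), finite_difference_density(x))
```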

Note that the results above appear to parallel the weak limit theorem and local limit laws known in the planar case [20]. Further theorems about the asymptotics of (integer) moments of $H_n$, together with moderate and large deviations, may also be extracted from (13); we only state the one for the moments, and the others may be found in Section 5.

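Theorem 3 (Moments of height). Let $r \ge 1$. The $r$th moment of height $H_n$ satisfies

$$\mathbb{E}[H_n] \sim \frac{2}{\lambda} \sqrt{\pi n} \qquad \text{and} \qquad \mathbb{E}[H_n^r] \sim r(r-1)\, \zeta(r)\, \Gamma(r/2) \left(\frac{2}{\lambda}\right)^{r} n^{r/2}, \quad r \ge 2. \tag{14}$$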
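The constants in (14) are those of the moments of the limit law of Theorem 1: if $X$ has tail function $\Theta$, then $\mathbb{E}[X^r] = r \int_0^\infty x^{r-1} \Theta(x)\, dx$, and termwise integration gives $2\sqrt{\pi}$ for $r = 1$ and $r(r-1)\zeta(r)\Gamma(r/2)2^r$ for $r \ge 2$; dividing by $\lambda^r$ and multiplying by $n^{r/2}$ gives (14). The sketch below checks this consistency numerically. It is an illustration under our own assumptions (trapezoidal rule, integration window, and the fact that $\Theta$ is numerically equal to 1 near the origin), not a proof of Theorem 3.

```python
import math


def theta_tail(x):
    """Theta(x), summed until the Gaussian factor is far below double precision."""
    k_max = int(14.0 / x) + 2
    return sum((k * k * x * x - 2.0) * math.exp(-k * k * x * x / 4.0)
               for k in range(1, k_max + 1))


def limit_moment_numeric(r, x0=0.05, x1=20.0, step=0.005):
    """r * integral of x^(r-1) Theta(x) over (0, infinity), by the trapezoidal rule."""
    n = int(round((x1 - x0) / step))
    xs = [x0 + i * step for i in range(n + 1)]
    fs = [x ** (r - 1) * theta_tail(x) for x in xs]
    integral = step * (sum(fs) - 0.5 * (fs[0] + fs[-1]))
    # on (0, x0] we take Theta = 1, which contributes x0^r (our assumption)
    return r * integral + x0 ** r


def zeta(r, n=10000):
    """Riemann zeta at real r > 1: partial sum plus an Euler-Maclaurin tail."""
    head = sum(k ** (-r) for k in range(1, n + 1))
    return head + n ** (1 - r) / (r - 1) - 0.5 * n ** (-r)


def limit_moment_formula(r):
    """Constant of Theorem 3 with the factors lambda^(-r) n^(r/2) removed."""
    if r == 1:
        return 2.0 * math.sqrt(math.pi)
    return r * (r - 1) * zeta(r) * math.gamma(r / 2.0) * 2.0 ** r


for r in (1, 2, 3):
    print(r, limit_moment_numeric(r), limit_moment_formula(r))
```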

Finally, in Section 6, we analyse the diameter of unrooted trees using a reduction to the rooted tree case. There, we provide theorems similar to Theorems 1, 2 and 3, i.e., a weak limit theorem, a local limit law, and asymptotics for the moments. The precise definition of the model of unrooted trees and the statement of the results are postponed until Section 6.

2. Convergence away from the singularity in tubes

Our aim in this section¹ is to extend the domain where $e_h$ is analytic beyond the disc of convergence $|z| \le \rho$, when $z$ stays in a “tube” $\mathcal{T}(\mu, \eta)$ as defined in (11) and is thus away from $\rho$. The main result is summarized by Proposition 1, at the end of this section. Its proof relies on the combination of two ingredients: first, the fact, expressed by Lemma 2, that the $e_h$ converge to 0, equivalently, $y_h \to y$, in the closed disc of radius $\rho$ (this property is the consequence of the $n^{-3/2}$ subexponential factor in the asymptotic form of $y_n$, which implies convergence of $y(\rho)$); second, a general criterion for convergence of the $e_h$ to 0, which is expressed by Lemma 3. The criterion implies in essence that the convergence domain is an open set, and this fact provides the basic analytic continuation of the generating functions of interest.

Lemma 2. For all $z$ such that $|z| \le \rho$, and $h \ge 1$, one has

$$|e_h(z)| \le \frac{1}{\sqrt{h}} \left(\frac{|z|}{\rho}\right)^{h}.$$

Proof. To have height at least $h$, a tree needs at least $h + 1$ nodes, so that $|e_h(z)| \le \sum_{n > h} y_n |z|^n$. We first note an easy numerical refinement of (3), namely, $y_n \le \frac{1}{2} \rho^{-n} n^{-3/2}$, obtained by combining the first few exact values of $y_n$ with the asymptotic estimate (3). (See [22] for a detailed proof strategy in the case of a similar but harder problem.) This implies

$$|e_h(z)| \le \frac{1}{2} \left(\frac{|z|}{\rho}\right)^{h} \sum_{n > h} \frac{1}{n^{3/2}} \le \frac{1}{2} \left(\frac{|z|}{\rho}\right)^{h} \int_h^\infty \frac{dt}{t^{3/2}} = \left(\frac{|z|}{\rho}\right)^{h} \frac{1}{\sqrt{h}},$$

and the statement results.
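The numerical refinement $y_n \le \frac{1}{2} \rho^{-n} n^{-3/2}$ used in the proof can be probed on an initial range of $n$. The sketch below (names ours; `RHO` is the value of $\rho$ quoted after Lemma 1) computes the largest value of the normalised quantity $y_n \rho^n n^{3/2}$ for $n \le 300$. This is only a finite check: for large $n$ the inequality rests on the asymptotic estimate (3).

```python
RHO = 0.40269750367   # numerical value of rho quoted in the text


def tree_counts(n_max):
    """y[n] = number of non-plane binary trees with n external nodes."""
    y = [0] * (n_max + 1)
    y[1] = 1
    for n in range(2, n_max + 1):
        pairs = sum(y[k] * y[n - k] for k in range(1, n))
        twins = y[n // 2] if n % 2 == 0 else 0
        y[n] = (pairs + twins) // 2
    return y


def max_normalised_count(n_max):
    """Largest value of y_n * rho^n * n^(3/2) over 1 <= n <= n_max."""
    y = tree_counts(n_max)
    return max(y[n] * RHO ** n * n ** 1.5 for n in range(1, n_max + 1))


# The bound of the proof holds on this range if the value stays below 1/2.
print(max_normalised_count(300))
```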

¹ In what follows, we freely omit the arguments of $y(z)$, $e_h(z)$, $y_h(z)$, ..., whenever they are taken at $z$. (We reserve $h$ for height and $n$ for size, so that no ambiguity should arise: $y_h$ means $y_h(z)$, whereas $y_n$ invariably represents $[z^n] y(z)$.)


We now devise a criterion for the convergence of $e_h(z)$ to zero. This criterion, adapted from [18, Lemma 1], is crucial in obtaining extended convergence regions, both near the circle $|z| = \rho$ (in this section) and near the singularity $\rho$ (in Section 3).

Lemma 3 (Convergence criterion). Define the domain²

$$\mathcal{D} := \{ z : |y(z)| < 1 \}. \tag{15}$$

Assume that $z$ satisfies the conditions $z \in \mathcal{D}$ and $|z| < \sqrt{\rho}$. The sequence $\{|e_h(z)|,\ h \ge 0\}$ converges to 0 if and only if there exist an integer $m \ge 1$ and real numbers $\alpha, \beta \in (0, 1)$, such that the following three conditions are simultaneously met:

$$|e_m| < \alpha, \qquad |y| + \frac{\alpha}{2} < \beta, \qquad \alpha\beta + \left(\frac{|z|^2}{\rho}\right)^{m} < \alpha. \tag{16}$$

Furthermore, if (16) holds then, for some constant $C$ and $\beta_0 \in (0, 1)$, one has the geometric convergence

$$|e_h| \le C h \beta_0^h, \tag{17}$$

for all $h \ge m$.

Proof. (i) Convergence implies that (16) is satisfied, for some $m$. Assume that $z \in \mathcal{D}$, $|z| < \sqrt{\rho}$, and $e_h(z) \to 0$ as $h \to \infty$. Then choose $\beta$ such that $|y| < \beta < 1$. This gives a possible value for $\alpha$, say, $\alpha = \beta - |y|$. Choose $m_0$ such that, for all $\mu \ge m_0$, one has $|e_\mu| < \alpha$; then choose $m_1$ large enough, so that the third condition of (16) is satisfied. The three conditions of (16) are now satisfied by taking $m = \max(m_0, m_1)$.

(ii) Condition (16) implies convergence and the bound (17). Conversely, assume the three conditions in (16), for some value $m$. Then, they also hold for $m + 1$. Indeed, recalling (8), we see that, for any $h \ge 1$,

$$|e_{h+1}| \le |e_h| \left(|y| + \frac{|e_h|}{2}\right) + \frac{|e_h(z^2)|}{2} \le |e_h| \left(|y| + \frac{|e_h|}{2}\right) + \left(\frac{|z|^2}{\rho}\right)^{h}, \tag{18}$$

where the Pólya term involving $|e_h(z^2)|$ has been bounded using Lemma 2. The hypotheses of (16) together with (18) above taken at $h = m$ yield the inequality $|e_{m+1}| < \alpha$. So, once the conditions (16) hold for some $m$, they hold for $m + 1$; hence, for all $h \ge m$.

The fact that, under these conditions, there is convergence, $e_h \to 0$, now results from unfolding the recurrence (8): we find, for all $h \ge m$,

$$|e_{h+1}| \le \beta^{h-m+1} |e_m| + \sum_{i=0}^{h-m} \beta^i \left(\frac{|z|^2}{\rho}\right)^{h-i} \le \beta^{h-m+1} |e_m| + h \max\left\{\beta, \frac{|z|^2}{\rho}\right\}^{h},$$

where Lemma 2 has been used again to bound the Pólya term. The additional assertion that $|e_h| \le C h \beta_0^h$ in (17) finally follows from choosing $\beta_0 := \max(\beta, |z|^2/\rho)$.
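The criterion of Lemma 3 can be explored numerically at a given point. The sketch below is an illustration under our own assumptions, not the authors' procedure: it evaluates $y$ at $z, z^2, z^4, \ldots$ through equation (4) with the principal square root (which we only rely on for $|z| \le \rho$, $z \ne \rho$), iterates (6) on the same tower of points, and replaces $y(w)$ and $y_h(w)$ by $w$ once $|w|$ is far below double precision. It then searches for $(m, \alpha, \beta)$ following Part (i) of the proof, with $\beta = (1 + |y|)/2$ as one admissible choice. All names are ours.

```python
import cmath

RHO = 0.40269750367   # numerical value of rho quoted in the text


def e_sequence(z, h_max, depth=8):
    """Return (y(z), [e_0(z), ..., e_{h_max}(z)]) in floating point."""
    w = [z ** (2 ** k) for k in range(depth + 1)]
    y = [0j] * (depth + 1)
    y[depth] = w[depth]                            # y(w) ~ w for tiny |w| (approximation)
    for k in range(depth - 1, -1, -1):
        y[k] = 1 - cmath.sqrt(1 - 2 * w[k] - y[k + 1])     # equation (4)
    yh = list(w)                                   # y_0(w) = w
    errors = [y[0] - yh[0]]
    for _ in range(h_max):
        # recurrence (6) applied simultaneously at z, z^2, z^4, ...
        yh = [w[k] + 0.5 * yh[k] ** 2 + 0.5 * yh[k + 1] for k in range(depth)] + [w[depth]]
        errors.append(y[0] - yh[0])
    return y[0], errors


def find_criterion(z, m_max=200):
    """Search for (m, alpha, beta) satisfying the three conditions of (16)."""
    y0, e = e_sequence(z, m_max)
    if abs(y0) >= 1 or abs(z) ** 2 >= RHO:
        return None                                # outside the scope of Lemma 3
    beta = (1 + abs(y0)) / 2                       # any beta with |y| < beta < 1 would do
    alpha = beta - abs(y0)                         # then |y| + alpha / 2 < beta
    for m in range(1, m_max + 1):
        if abs(e[m]) < alpha and alpha * beta + (abs(z) ** 2 / RHO) ** m < alpha:
            return m, alpha, beta
    return None


z = RHO * cmath.exp(1j)                            # on the circle |z| = rho, away from rho
print(find_criterion(z))
```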

We can now state the main convergence result of this section:

Proposition 1 (Convergence in “tubes”). For any angle $\eta > 0$, there exists a tube $\mathcal{T}(\mu, \eta)$ with width $\mu > 0$, such that $|e_h(z)| \to 0$, as $h \to \infty$, uniformly for $z$ in $\mathcal{T}(\mu, \eta)$.

Proof. We start from a fixed $\eta$, assumed to be suitably small. If we exclude a small sector of opening angle $2\eta$ around the positive real axis, then the quantity

$$\lambda_0 := \sup\{ |y(z)| \,;\ |z| = \rho,\ |\arg(z)| \ge \eta \}$$

satisfies $\lambda_0 < 1$: this results from the strong triangle inequality (see also the “Daffodil Lemma” of [19]) and the fact that $y(\rho e^{i\theta})$ is a continuous function of $\theta$. (By the argument introduced in the proof of Lemma 1, the function $y(z)$ is analytic at all points of $|z| = \rho$, $z \ne \rho$, hence continuous.) Fix then $\varepsilon$ by $\lambda_0 = 1 - 2\varepsilon$. By continuity of $y$ again, for each $z$ on the circle of radius $\rho$ satisfying $|\arg(z)| \ge \eta$, there exists a small open disc $\delta(z)$, centred at $z$ and such that $|y(\zeta)| < 1 - \varepsilon$ for all $\zeta \in \delta(z)$. From now on, we assume that the discs $\delta(z)$ are taken small enough, so that they are entirely contained in the larger disc $\{ w \in \mathbb{C} : |w| < \sqrt{\rho} \}$.

² This domain will sometimes be referred to as the “cardioid-like” domain, as it contains the disc $|z| \le \rho$ punctured at $\rho$ (Proposition 1) and has a cusp at $z = \rho$, associated to the square root singularity of $y(z)$ at $\rho$.

We can then make use of the convergence criterion of Lemma 3, supplemented once more by a continuity argument. In the notations of (16), choose first $\alpha = \varepsilon$, then $\beta = 1 - \varepsilon/2$. For all sufficiently large $m$, say $m \ge \nu$, the last two conditions of (16) are satisfied. Then, since the $e_h(z)$ are analytic (hence continuous) at every point of the circle $|z| = \rho$ punctured at $\rho$, there exists, around each $z$ on $|z| = \rho$ with $|\arg(z)| \ge \eta$, a small open disc $\delta_1(z) \subseteq \delta(z)$ and an integer $M(z)$ such that $|e_m| < \alpha$ for all $m \ge M(z)$. We may also freely assume that $M(z) > \nu$.

Finally, by compactness of the arc $\{\rho e^{i\theta}\}$ defined by $|\theta| \ge \eta$, there exists a covering of the arc by a finite collection of small discs, say $\{\delta_1(z_j)\}_{j=1}^{r}$. The union of these small discs must then contain a tube of angle $\eta$ and width $\mu > 0$. By design, in this tube, all three conditions of the convergence criterion of Lemma 3 (Equation (16)) are now satisfied, with $m = \max_{j=1}^{r} M(z_j)$.

3. Convergence near the singularity in a sandclock

We now focus on the behaviour of $e_h(z)$ in a “sandclock” around the singularity. When $z$ approaches $\rho$, the quantity $|y|$ is no longer bounded away from 1, so that the criterion for convergence obtained earlier (Lemma 3) cannot be used directly. We then need to proceed in two stages: first, we prove in Subsection 3.1 that, in a suitable sandclock, the initial terms decay “enough”; next, in Subsection 3.2, we establish the existence of a sandclock where convergence of the $e_h$ to 0 is ensured, as expressed by the main Proposition 2 below. We shall then be able to build upon these results in the next section and derive suitable singular approximations of the $e_h$ outside of the original disc of convergence $|z| \le \rho$ of $y(z)$, when $z$ is near $\rho$.

Alternative recurrence. So far, we have operated with the main recurrence (8) relating the $e_h$, then applied some partial unfolding supplemented by simple continuity arguments. To proceed with our programme, we need to adapt a classical technique in the study of slowly convergent iterations near an indifferent fixed point [11, p. 153], which simply amounts to “taking inverses” and leads to a useful alternative form of the original recurrence.

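Lemma 4 (Alternative recurrence). Assume, for a value $z$, the conditions

$$e_i(z) \ne 0 \quad \text{and} \quad e_i(z) \left[1 - \frac{e_i(z^2)}{e_i(z)^2}\right] \ne 2y(z), \qquad \text{for } i = 0, \ldots, h - 1.$$

Then, the following two recurrence relations hold:

$$\frac{y^h}{e_h} = \frac{1}{e_0} + \frac{1}{2y} \sum_{i=0}^{h-1} y^i \left[1 - \frac{e_i(z^2)}{e_i^2}\right] \left(1 - \frac{e_i}{2y} \left[1 - \frac{e_i(z^2)}{e_i^2}\right]\right)^{-1}, \tag{19}$$

$$\frac{y^h}{e_h} = \frac{1}{e_0} + \frac{1}{2y}\, \frac{1 - y^h}{1 - y} - \sum_{i=0}^{h-1} \frac{y^{i-1} e_i(z^2)}{2 e_i^2} + \frac{1}{4y^2} \sum_{i=0}^{h-1} \frac{y^i e_i \left[1 - \frac{e_i(z^2)}{e_i^2}\right]^2}{1 - \frac{e_i}{2y} \left[1 - \frac{e_i(z^2)}{e_i^2}\right]}. \tag{20}$$

The form (19) is referred to as the simplified alternative recurrence; the form (20) is the extended alternative recurrence.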


Proof. Starting with the recurrence relation (8), rewritten as

$$\frac{e_{i+1}}{y^{i+1}} = \frac{e_i}{y^i} \left(1 - \frac{e_i}{2y} \left[1 - \frac{e_i(z^2)}{e_i^2}\right]\right),$$

the trick is to take inverses (cf. also [18]). The identity $(1 - u)^{-1} = 1 + u(1 - u)^{-1}$ implies

$$\frac{y^{i+1}}{e_{i+1}} - \frac{y^i}{e_i} = \frac{y^{i-1}}{2} \left[1 - \frac{e_i(z^2)}{e_i^2}\right] \left(1 - \frac{e_i}{2y} \left[1 - \frac{e_i(z^2)}{e_i^2}\right]\right)^{-1}.$$

Summing the terms of this equality for $i = 0, \ldots, h - 1$ then yields the first version. The extended version follows from the expansion $(1 - u)^{-1} = 1 + u + u^2 (1 - u)^{-1}$.
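Since (19) is an exact identity, it lends itself to a quick floating-point sanity check. The sketch below compares both sides at a sample point inside the disc of convergence. It is only an illustration: the evaluation scheme (values of $y$ and $y_h$ at $z, z^2, z^4, \ldots$ obtained from (4) and (6), with $y(w)$ and $y_h(w)$ replaced by $w$ when $|w|$ is negligible) and all names are our own.

```python
import cmath


def tower_values(z, h_max, depth=8):
    """Return (y, yh) with y[k] ~ y(z^(2^k)) and yh[h][k] ~ y_h(z^(2^k))."""
    w = [z ** (2 ** k) for k in range(depth + 1)]
    y = [0j] * (depth + 1)
    y[depth] = w[depth]                            # y(w) ~ w for tiny |w| (approximation)
    for k in range(depth - 1, -1, -1):
        y[k] = 1 - cmath.sqrt(1 - 2 * w[k] - y[k + 1])     # equation (4)
    yh = [list(w)]                                 # y_0(w) = w
    for _ in range(h_max):
        prev = yh[-1]
        nxt = [w[k] + 0.5 * prev[k] ** 2 + 0.5 * prev[k + 1] for k in range(depth)]
        nxt.append(w[depth])                       # same tiny-argument approximation
        yh.append(nxt)
    return y, yh


def alternative_recurrence_sides(z, h):
    """Return (left-hand side, right-hand side) of the simplified recurrence (19)."""
    y, yh = tower_values(z, h)
    y0 = y[0]
    rhs = 1 / (y0 - yh[0][0])                      # 1 / e_0
    for i in range(h):
        e_i = y0 - yh[i][0]                        # e_i(z)
        e_i_sq = y[1] - yh[i][1]                   # e_i(z^2)
        bracket = 1 - e_i_sq / e_i ** 2
        rhs += y0 ** i / (2 * y0) * bracket / (1 - e_i / (2 * y0) * bracket)
    lhs = y0 ** h / (y0 - yh[h][0])                # y^h / e_h
    return lhs, rhs


lhs, rhs = alternative_recurrence_sides(0.3 + 0.1j, 8)
print(lhs, rhs, abs(lhs - rhs))
```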

Lemma 4 is used to complete the proof of Lemma 5 below (see Equation (31)) and it serves as the starting point of the proof of Proposition 2 (see Equation (33)). It then proves central in establishing the main approximation of Proposition 3 in the next section. The interest of these alternative recurrences is that they relate the inverse $1/e_h$ to essentially polynomial forms in the previous $e_i$. In particular they serve to convert lower bounds into upper bounds, and vice versa.

3.1. Initial behaviour of $e_h$. We establish in this subsection (cf. Lemma 6) that the quantities $|e_h(z)|$ first exhibit a decreasing behaviour for $h \le N$, with some appropriate $N = N(z)$. At that point, $|e_N(z)|$ appears to be small enough to guarantee that the criterion of Lemma 3 becomes applicable, whence eventually the convergence $|e_h(z)| \to 0$ as $h \to \infty$ in a sandclock.

The following preparatory lemma serves to control the effect of Pólya terms, when $z$ is close to $\rho$, so that $z^2$ is close to $\rho^2$, well inside of the disc of radius $\rho$. It is evocative of the theory of iteration near an attractive fixed point (see, e.g., [34, Ch. 8]).

Lemma 5 (Smooth iteration for Pólya terms). Fix $z_0 \in (0, \rho)$. There exists a constant $R_0 > 0$, dependent upon $z_0$, such that, for all $h \ge 0$, and for all $z$ satisfying $|z - z_0| < R_0$, one has

$$e_h(z) = C_h(z) \cdot y(z)^h,$$

where, uniformly with respect to $z$, $C_h(z) = C(z) + o(1)$, as $h \to \infty$, and $C(z)$ is analytic at $z_0$. Furthermore, for some $K_1, K_2, c_0$ all positive, one has³, in the disc $|z - z_0| < R_0$,

$$K_1 < |C_h(z)| < K_2 \qquad \text{and} \qquad |\arg(e_h(z))| \le c_0 (h + 1) |z - z_0|.$$

Proof. Starting from the main relation (8) and unfolding only the $e_h$ that is factored, we obtain by induction

$$\frac{e_{h+1}}{y^{h+1}} = e_0 \prod_{i=0}^{h} \left(1 - \frac{e_i}{2y}\right) + \frac{e_h(z^2)}{2 y^{h+1}} + \frac{1}{2y} \sum_{i=0}^{h-1} \frac{e_i(z^2)}{y^i} \prod_{j=i+1}^{h} \left(1 - \frac{e_j}{2y}\right). \tag{21}$$

We let $C_h(z) := e_h(z)/y(z)^h$ and proceed to prove properties of these quantities.

(i) Upper bound on $C_h$ and existence of $C(z)$. When $z$ lies in a small enough neighbourhood of $z_0 \in (0, \rho)$, the convergence of $e_i$ to zero is geometric by Lemma 2, and it remains so, uniformly with respect to $z$ restricted to a small neighbourhood of $z_0$. Furthermore, the inequality $|y(z)| > |z|$, which holds at $z = z_0$, persists, by continuity, for $z$ in a suitably small neighbourhood of $z_0$. It follows that both the product and the sum in the right-hand side of (21) converge geometrically and uniformly, so that $C_h(z) \to C(z)$ as $h \to \infty$, where $C(z)$ is analytic at $z_0$. These arguments also imply that $|C_h(z)|$ remains bounded from above by an absolute constant: $|C_h(z)| < K_2$.

³ The argument of a complex number $w \ne 0$ is taken to be the number $\theta \in (-\pi, +\pi]$ such that $w = |w| e^{i\theta}$.


(ii) Lower bound on $C_h$. We next observe that, in a small enough neighbourhood of $z_0$, the quantity $|C(z)|$ must be bounded from below. Indeed, a contrario, if this was not the case, then we would need to have $C(z_0) = 0$. Now, because of the convergence of $C_h(z)$ to $C(z)$, we would have $C_h(z_0) = o(1)$, implying $e_h(z_0) = o(y(z_0)^h)$. This last fact is finally seen to contradict Equation (21), since the left hand side taken at $z_0$ would tend to 0, while the right hand side remains bounded from below by the positive quantity $e_0 \prod_{i=0}^{\infty} (1 - e_i/(2y))$ taken at $z_0$. A contradiction has been reached. Thus, we must have $|C(z)| > K_1^\star$ for some $K_1^\star > 0$; hence the claimed inequality $|C_h(z)| > K_1$ for all $h$ large enough, say $h > h_0$. (For $h \le h_0$, we can complete the argument by referring again to Equation (21), which precludes the possibility that $e_h(\zeta) = 0$ for $\zeta \in (0, \rho)$. A continuity argument then provides a small domain around $z_0$ where $C_j(z)$ is bounded from below, for all $j \in \{1, \ldots, h_0\}$.)

(iii) Bound on the argument. Finally, the argument of $e_h$ can be expressed as follows:

$$\arg e_h = \Im(\log e_h) = \Im(\log C_h) + h\, \Im(\log y) \pmod{2\pi}. \tag{22}$$

We now consider a disc $|z - z_0| < R$ and momentarily examine the effect of letting $R \to 0$. By analyticity of $y(z)$ at $z_0$ and since $y(z_0)$ is positive real, we have $\Im(\log y(z)) = O(R)$. Next, since $|C_h(z)|$ is bounded from above and below in a small enough fixed neighbourhood of $z_0$, $C_h(z_0)$ is positive real, and $C_h(z) \to C(z)$, we have, similarly, $\Im(\log C_h(z)) = O(R)$, where the implied constant in $O(\cdot)$ can be taken independent of $h$. This means that there exist constants $d_0, d_1 > 0$ such that, provided $R$ is chosen small enough, one has $|\arg(e_h(z))| < d_0 R + d_1 R h$. This last form implies the stated bound on the argument of $e_h$.

With Lemma 5 in hand, we can obtain a first set of properties of $e_h(z)$, which hold for $z$ in a sandclock $\mathcal{S}(r_0, \theta_0)$ and for $h$ “not too large”. These will be used in Proposition 2 to derive an upper bound on $|e_N|$ (for some suitably chosen $N$ depending on $z$), to the effect that $e_N$ eventually satisfies the criterion of Lemma 3. In the following, we only need to consider $z \in \mathcal{S}(r_0, \theta_0)$, with $\Im(z) \ge 0$, since we clearly have $e_h(\bar{z}) = \overline{e_h(z)}$, where $\bar{z}$ denotes the complex conjugate of $z$.

Lemma 6 (Initial behaviour of $e_h$). Suppose $\Im(z) > 0$. Define the integer

$$N(z) := \left\lfloor \frac{\arccos(1/4)}{\arg y(z)} \right\rfloor - 2. \tag{23}$$

Fix $\theta_0$ with $0 < \theta_0 \le \frac{\pi}{8}$. There exists a constant $r_0 > 0$ such that, if $z$ lies in the sandclock $\mathcal{S}(r_0, \theta_0)$, then, for all $h$ such that $1 \le h \le N(z)$, the following inequalities hold:

$$\frac{|y|^{h+1}}{2(h + 1)} < |e_h(z)| < \frac{1}{2} \qquad \text{and} \qquad 0 \le \arg(e_h) \le (h + 2) \arg(y). \tag{24}$$

Furthermore, one has $|e_h(z)| < 1/5$, for $6 \le h \le N(z)$.

Observe that we can also assume, in a small enough sandclock,

$$\frac{1}{2} < |e_0(z)| < \frac{2}{3}, \tag{25}$$

since $e_0(\rho) = 1 - \rho$ has numerical value $\doteq 0.59730$ and $e_0(z)$ is continuous at $z = \rho$.

Proof. As a preamble, we note that $N(z)$ tends to infinity as $z \to \rho$, since $y(\rho) = 1$ is real, hence has argument 0. Consider next the basic recurrence relation (8) rewritten as

$$\frac{e_{h+1}}{y} = y \cdot \frac{e_h}{y} \cdot \left(1 - \frac{e_h/y}{2}\right) + \frac{e_h(z^2)}{2y}. \tag{26}$$

The behaviour of the first term in the right-hand side of (26) is dictated by properties of the mapping

$$g : w \mapsto w(1 - w/2). \tag{27}$$

(A very similar function appeared in the analysis of Flajolet and Odlyzko [18, Lemma 3].) By a simple modification of the proof in [18], we can check elementarily the implication

$$\left\{ \begin{array}{l} |w| \le 1 \\ 0 \le \arg w \le \arccos(1/4) \end{array} \right. \quad \Longrightarrow \quad \left\{ \begin{array}{l} |g(w)| \le |w| \\ 0 \le \arg g(w) \le \arg w. \end{array} \right. \tag{28}$$
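The implication (28) can be tested on a grid of the sector $|w| \le 1$, $0 \le \arg w \le \arccos(1/4)$. The sketch below does so with a small floating-point tolerance; the grid resolution, the tolerance and the function names are our own choices, and a grid check of this kind is of course no substitute for the elementary proof.

```python
import cmath
import math


def g(w):
    """The mapping (27)."""
    return w * (1 - w / 2)


def check_implication(n_radius=200, n_angle=200, tol=1e-12):
    """Test (28) on a polar grid covering the sector of its hypothesis."""
    phi_max = math.acos(0.25)
    for a in range(n_radius + 1):
        for b in range(n_angle + 1):
            w = cmath.rect(a / n_radius, phi_max * b / n_angle)
            gw = g(w)
            if abs(gw) > abs(w) + tol:
                return False                      # modulus increased
            if gw != 0:
                arg_g = cmath.phase(gw)
                if arg_g < -tol or arg_g > cmath.phase(w) + tol:
                    return False                  # argument left [0, arg w]
    return True


print(check_implication())
# Outside the sector the modulus bound may fail, e.g. for |w| = 1 and arg w = 1.5:
print(abs(g(cmath.rect(1.0, 1.5))))
```

The angle $\arccos(1/4)$ is exactly where the modulus condition becomes tight: $|g(w)| \le |w|$ is equivalent to $|w| \le 4 \cos(\arg w)$.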

(i) Weak upper bounds on modulus and bounds on argument. We are first going to use (28) and induction on $h$ (with $1 \le h \le N(z)$), in order to establish a suitably weakened form of (24); namely,

$$|e_h/y| < 1 \qquad \text{and} \qquad 0 \le \arg(e_h/y) \le (h + 1) \arg y. \tag{29}$$

We start with the basis of the induction relative to (29), the case $h = 1$, where $e_1 = y - z - z^2$. Observe that $e_1(\rho) = 1 - \rho - \rho^2 \doteq 0.43$, so that $|e_1(z)| < 1/2$ (and, a fortiori, $|e_1/y| < 1$) is granted for $z$ close enough to $\rho$. Write next $z/\rho = 1 + r e^{i\theta}$, with $\theta$ close to $\pi/2$ and $r$ a small positive number. Then, by virtue of the singular expansion (2) of Lemma 1, we have

$$y(z) = 1 + i \lambda \sqrt{r}\, e^{i\theta/2} + O(r),$$

as $r \to 0$, hence

$$\frac{e_1}{y} = 1 - \rho - \rho^2 + i \lambda (\rho + \rho^2)\, e^{i\theta/2} \sqrt{r} + O(r).$$

Since $\theta/2$ now lies in $(\pi/4 - \pi/16,\ \pi/4 + \pi/16)$, there results from the last expansion that the argument of $e_1/y$ is essentially a small positive multiple of $\sqrt{r}$. A precise comparison of the arguments of $y$ and $e_1/y$, as provided by the last two displayed equations, confirms (routine details omitted) that we can choose a small enough $r_0$ such that, in the sandclock $\mathcal{S}(r_0, \pi/8)$, we have both $|e_1/y| < 1$ and $0 < \arg(e_1/y) \le 2 \arg(y)$.

Suppose now that (29) holds for all integers up to $h \le N(z)$. In order to determine whether it also holds for $h + 1$, we have to take into account the Pólya term, that is, the second term in the right-hand side of (26). By possibly further restricting $r_0$, we can guarantee that, for all $z \in \mathcal{S}(r_0, \pi/8)$, this second term does not contribute any increase in the argument of $e_h/y$. Indeed, observe that for $z \in \mathcal{S}(r_0, \pi/8)$, we have $\arg(y) \ge \delta r^{1/2}$, with some $\delta > 0$. In addition, by Lemma 5, Equation (22), we have $\arg e_h(z^2) \le c_0 (h + 1) |z^2 - \rho^2|$, so that $|z^2 - \rho^2| = O(r)$ is of a smaller order than $O(\sqrt{r})$. Thus, in (26), the second (Pólya) term on the right hand side of the equality has an argument which is of order $hr$, and, for $r$ small enough, may be taken to satisfy

$$0 \le \arg(e_h(z^2)/(2y)) \le \frac{h}{2} \cdot \arg(y).$$

Now, the simple geometry of parallelograms implies that two complex numbers $\zeta$ and $\zeta'$, whose arguments lie in $[0, \frac{\pi}{2}]$, satisfy $\arg(\zeta + \zeta') \le \max(\arg(\zeta), \arg(\zeta'))$. There results, from the induction hypothesis, the chain of inequalities

$$0 \le \arg(e_{h+1}/y) \le \max\{\arg(e_h/y) + \arg(y),\ \arg(e_h(z^2)/y)\} \le (h + 1) \arg y + \arg(y) \le (h + 2) \arg(y).$$

Note that the first inequality follows from the use of (28). In particular, this step requires that $\arg(e_h/y)$ be lower than $\arccos(1/4)$, which we can only guarantee as long as our upper bound $(h + 1) \arg(y)$ is itself at most $\arccos(1/4)$. This is why we proceed with the induction only as long as $h \le N(z)$. At this stage, the induction is complete and (29) is established for $h \le N(z)$.


(ii) Improved upper bound on modulus. The upper bound on the modulus provided by (29), being (slightly) weaker than the upper bound on $|e_h|$ asserted in (24), needs to be strengthened. The functions $y(z)$ and $e_h(z)$ are analytic, hence continuous, in the domain $D$ of (15) and all the sandclocks it contains. Also, we have seen that $e_1(\rho) < 1/2$, while, by definition, $e_1(\rho) > e_2(\rho) > \cdots$. So, after possibly restricting $r_0$ to a smaller value once more, for all $z \in S(r_0, \pi/8)$, the inequality $|e_h(z)| \le 1/2$ is guaranteed to hold for $h = 1, \ldots, 6$, by virtue of continuity. Next, if $h \ge 6$, the alternative recurrence and the fact that $|e_h/y| < 1$ (asserted in (29) and proved in Part (i) above) imply, via the triangle inequality,

$$\left|\frac{e_{h+1}}{y}\right| \le |y|\,|g(e_h/y)| + \left|\frac{e_h(z^2)}{2y}\right|, \qquad\text{where}\quad g(e_h/y) = \frac{e_h}{y}\cdot\left(1 - \frac{e_h}{2y}\right)$$

($g(w)$ is as defined in (27)). Now, for $h < N(z)$, the quantity $g(w)$ is taken at $w = e_h/y$, which is such that $|w| < 1$ and $\arg(w) < \arccos(1/4)$, so that, by (28),

$$|e_{h+1}/y| \le |e_6(z)/y| + \frac{1}{2}\cdot\sum_{i=6}^{h+1} |e_i(z^2)/y| \le |e_6(\rho)| + K\sqrt{r} + \frac{1}{2}\cdot\sum_{i=6}^{\infty} (\rho + 3r)^i, \tag{30}$$

for some constant $K$; here the last line makes use of the inequality $|e_i(z^2)| \le (|z|^2/\rho)^i$ granted by Lemma 2. It follows easily that, for $h \ge 6$,

$$|e_{h+1}/y| \le |e_6(\rho)| + \frac{1}{2}\cdot\frac{\rho^6}{1-\rho} + 2K\sqrt{r},$$

for all $r \le r_0$, with $r_0$ chosen small enough. In particular, for $h \in [6, N(z)]$ and $r \le r_0$ small enough, we have $|e_h/y| < 1/5$. Combined with the previous observations regarding the initial values of $e_j(z)$, this implies the inequality $|e_h/y| < 1/2$ for all $h \le N(z)$, as asserted.

(iii) Lower bound on modulus. It finally remains to establish the lower bound on $|e_h|$ in (24). We start with the recurrence relation (26). For $h \le N(z)$, the additional Polya term $e_h(z^2)$ only contributes to making $|e_{h+1}|$ larger. Indeed, for $z \in S(r_0, \theta_0)$, by Lemma 5 and the upper bound on arguments proved in Part (i), both terms are such that, for $h < N(z)$,

$$|e_{h+1}/y| \ge |y| \cdot |e_h/y| \cdot \left(1 - \frac{|e_h/y|}{2}\right).$$

Since $x \mapsto x(1 - x/2)$ is increasing on $[0, 1]$, we have $|e_h/y| \ge f_h$ for all $h \ge 0$, where the sequence $(f_h)_{h \ge 0}$ is defined by

$$f_0 = \frac{|e_0|}{|y|} = \frac{|y - z|}{|y|} \qquad\text{and}\qquad f_{h+1} = |y| \cdot f_h \cdot \left(1 - \frac{f_h}{2}\right).$$

(The latter recurrence relation is precisely the one analysed by Flajolet and Odlyzko [18] in the case of simply generated trees.) For $r_0$ small enough, a process analogous to the derivation of the alternative recurrence in Lemma 6 yields

$$\frac{|y|^h}{f_h} = \frac{1}{f_0} + \frac{1}{2}\cdot\sum_{i=0}^{h-1} |y|^i + \frac{1}{2}\cdot\sum_{i=0}^{h-1} \frac{f_i}{1 - f_i/2}\cdot |y|^i \le \frac{1}{f_0} + \frac{3}{2}\cdot\sum_{i=0}^{h-1} |y|^i \le 2 + \frac{3h}{2}. \tag{31}$$

This last bound directly implies the lower bound on $|e_h|$ asserted in (24).
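As a numerical sanity check of the comparison argument, the following sketch iterates the recurrence $f_{h+1} = |y| \cdot f_h \cdot (1 - f_h/2)$ for a real stand-in `a` for $|y|$ and tests the inequality $|y|^h/f_h \le 1/f_0 + 3h/2$ that underlies (31). The function name and the sample values of `a` and `f0` are illustrative assumptions, not taken from the paper.

```python
def check_lower_bound(a, f0, h_max):
    """Iterate f_{h+1} = a * f_h * (1 - f_h / 2) and test a**h / f_h <= 1/f0 + 3h/2.

    `a` is a hypothetical stand-in for |y| (0 < a <= 1) and `f0` for
    |e_0| / |y| (0 < f0 <= 1). Returns True when the bound holds for
    every h in 0..h_max.
    """
    f = f0
    for h in range(h_max + 1):
        lhs = a ** h / f
        rhs = 1.0 / f0 + 1.5 * h
        if lhs > rhs + 1e-12:      # small slack for floating-point rounding
            return False
        f = a * f * (1.0 - f / 2.0)
    return True
```

When $1/f_0 \le 2$, the right-hand side is in turn at most $2 + 3h/2$, which is the form used in the text.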


3.2. Existence of a convergence sandclock. We can now turn to a proof of the main result of this section, Proposition 2 stated below, which establishes the existence of a sandclock in which the $e_h$ converge to 0. This proof follows the lines of the analogous statement [18, Proposition 4], where the iteration is “pure”. In the present context, we need once more to control the effect of the Polya terms, which can be done thanks to an easy auxiliary result, Lemma 7.

Lemma 7. There exist $r_0, \theta_0 > 0$ small enough, so that, for $z \in S(r_0, \theta_0)$ and for all $h$ satisfying $0 \le h \le N$, with $N \equiv N(z)$ as specified in (23), one has

$$\frac{|e_h(z^2)|}{|e_h(z)^2|} \le \frac{1}{2}\cdot\left(\frac{4}{5}\right)^h.$$

Proof. Set $z = \rho + re^{i\theta}$, with $z \in S(r_0, \theta_0)$, for $r_0, \theta_0 > 0$ taken small enough, which will be successively constrained, as the need arises. The inequality

$$\frac{|e_h(z^2)|}{|e_h(z)^2|} \le \frac{4(h+1)^2}{|y|^2}\left|\frac{z^2}{\rho y^2}\right|^h, \qquad 0 \le h \le N, \tag{32}$$

combines the upper bound on $e_h(z^2)$ provided by Lemma 2 (with $z$ in the statement to be replaced by $z^2$) and the lower bound on $e_h(z)$ guaranteed by Lemma 6.

Now, at $z = \rho$, the upper bound (32) takes the form $4(h+1)^2\rho^h$, where $\rho \doteq 0.40269$, so that its decay is about $0.4^h$. By continuity of the exponential rate $|z^2/(\rho y^2)|$, there exists a small sandclock in which the bound is less than $4(h+1)^2\, 0.45^h$ (say). Furthermore, we verify easily that this last quantity is less than $\frac{1}{2}\cdot 0.8^h$ for all $h \ge 13$. Thus, the statement is established for $h$ large enough ($h \ge 13$). On the other hand, examination of initial values shows that the ratio $e_j(\rho^2)/e_j(\rho)$ decreases rapidly from a value of about 0.0543, at $j = 0$, to about $7.8 \cdot 10^{-10}$, at $j = 12$; furthermore, we observe numerically that $2 \cdot 0.8^{-j} e_j(\rho^2)/e_j(\rho)$ is always less than 1/9, for $j = 0, \ldots, 12$. Thus, by continuity again, in a small enough sandclock, we must have $|e_j(z^2)/e_j(z)| < \frac{1}{2}\cdot 0.8^j$ for $j = 0, \ldots, 12$.
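The threshold $h \ge 13$ quoted in the proof can be confirmed by direct evaluation. The sketch below (the function name is an illustrative choice) searches for the smallest height from which $4(h+1)^2\, 0.45^h < \frac{1}{2}\cdot 0.8^h$ holds for every larger height up to a cutoff.

```python
def first_valid_height(max_h=200):
    """Smallest h0 such that 4(h+1)^2 * 0.45^h < 0.5 * 0.8^h for all h in [h0, max_h].

    Returns None if the inequality already fails at max_h.
    """
    def ok(h):
        return 4 * (h + 1) ** 2 * 0.45 ** h < 0.5 * 0.8 ** h

    if not ok(max_h):
        return None
    h0 = max_h
    # Walk downwards while the inequality keeps holding.
    while h0 > 0 and ok(h0 - 1):
        h0 -= 1
    return h0
```

Since the ratio of the two sides is multiplied by $0.5625\,((h+2)/(h+1))^2 < 1$ at each step once $h \ge 3$, the inequality persists beyond the cutoff.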

Proposition 2 (Convergence in a “sandclock” around $\rho$). There exist constants $r_0, \theta_0 > 0$ such that the sequence $e_h(z)$, $h \ge 0$, converges to zero for all $z$ in the sandclock $S(r_0, \theta_0)$.

Proof. It suffices to verify that, for $h = N \equiv N(z)$ as specified in Equation (23), the quantity $e_N$ satisfies the convergence criterion of Lemma 3, which then grants us convergence of the $e_j$ to 0 for $j > N$. For this purpose, we appeal to the alternative recurrence stated in Lemma 4,

$$\frac{y^h}{e_h} = \underbrace{\frac{1}{2y}\,\frac{1 - y^h}{1 - y}}_{M} + \underbrace{\frac{1}{e_0} - \sum_{i=0}^{h-1} \frac{y^{i-1} e_i(z^2)}{2 e_i^2}}_{A} + \underbrace{\frac{1}{4y^2} \sum_{i=1}^{h-1} y^i e_i\, \frac{\left[1 - \dfrac{e_i(z^2)}{e_i^2}\right]^2}{1 - \dfrac{e_i}{2y}\left[1 - \dfrac{e_i(z^2)}{e_i^2}\right]}}_{B}, \tag{33}$$

and devise an asymptotic lower bound for the right-hand side. (Observe that we can indeed use the relation since, by Lemmas 6 and 7, for all $i = 0, \ldots, N$, the denominators do not vanish.)

Write $1 - y(z) = \varepsilon e^{it}$. As in Lemma 6, we assume without loss of generality that $\Im(z) > 0$. We need to establish properties of the various quantities that intervene in (33), in a small sandclock, that is, for small $\varepsilon > 0$ and $t$ close to $-\pi/4$. The following expansions are valid uniformly for $t \in [-\pi/4 - \delta, -\pi/4 + \delta]$ with $0 < \delta < \pi/4$ when $\varepsilon \to 0$:

$$1 - |y| = \varepsilon \cos t + O(\varepsilon^2), \qquad \arg(y) = -\varepsilon \sin t + O(\varepsilon^2), \tag{34}$$

and

$$N(z) = -\frac{\phi}{\varepsilon \sin t} + O(1), \qquad 1 - |y|^N = 1 - e^{\phi \cot t} + O(\varepsilon),$$

where $\phi := \arccos(1/4)$.

The first term $M$ of the right-hand side of (33) will be seen to bring the main contribution. It satisfies, as $\varepsilon \to 0$:

$$|M| = \left|\frac{1}{2y}\,\frac{1 - y^N}{1 - y}\right| = \frac{1}{2|y|}\cdot\frac{|1 - y^N|}{|1 - y|} \ge \frac{1}{2}\,\frac{1 - |y|^N}{|1 - y|} = \frac{1 - e^{\phi \cot t}}{2\varepsilon} + O(1).$$

Next, regarding the term A, we have

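$$|A| \le \frac{1}{|e_0|} + \left|\frac{1}{2y} \sum_{i=0}^{N-1} \frac{y^i e_i(z^2)}{e_i(z)^2}\right| \le \frac{1}{|e_0|} + \frac{1}{2|y|} \sum_{i=0}^{N-1} \left|\frac{e_i(z^2)}{e_i(z)^2}\right| = O(1),$$

since, by Lemma 7, the summands decrease geometrically.

It now remains to analyse $B$. We split it further: by Lemma 7, for all $i$ such that $18 \le i \le N$, we have $|e_i(z^2)|/|e_i(z)^2| \le 1/100$. Then, by Lemma 6, and for $\varepsilon$ small enough, we obtain

$$|B| \le \frac{9}{4(1-\varepsilon)}\cdot\frac{\left(\frac{3}{2}\right)^2}{1 - \frac{1/2}{2 - 2\varepsilon}\cdot\frac{3}{2}} + \frac{1/5}{4(1-\varepsilon)}\cdot\frac{\left(\frac{101}{100}\right)^2}{1 - \frac{1/5}{2 - 2\varepsilon}\cdot\frac{101}{100}}\cdot\frac{1 - |y|^N}{1 - |y|} < 22 + \frac{3}{50}\cdot\frac{1 - |y|^N}{1 - |y|}.$$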

It follows that

$$\frac{|y^N|}{|e_N|} \ge \frac{1 - e^{\phi \cot t}}{\varepsilon}\left(\frac{1}{2} - \frac{3}{50 \cos t}\right) + O(1) > \frac{2}{5}\cdot\frac{1 - e^{\phi \cot t}}{\varepsilon}, \tag{35}$$

where the last inequality holds for all $z \in D$ such that $\varepsilon < \varepsilon_0$ and $|t + \pi/4| < \delta_0$, as soon as both $\varepsilon_0$ and $\delta_0$ are small enough.

We can now return to the criterion for convergence (Lemma 3) and verify that in a small enough sandclock the conditions in (16) are satisfied for $m = N$ and some well-chosen parameters $\alpha$ and $\beta$. Equation (35) provides the required upper bound on $e_N$, which fixes our choice for $\alpha$:

$$|e_N| \le \alpha := \frac{5}{2}\cdot\varepsilon\cdot\frac{e^{\phi \cot t}}{1 - e^{\phi \cot t}}.$$

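We now focus on the second condition in Lemma 3. From (34) and (35) we have, for $\varepsilon > 0$ small enough,

$$|y| + \frac{\alpha}{2} \le 1 - \varepsilon\cdot\left(\cos t - \frac{5}{4}\cdot\frac{e^{\phi \cot t}}{1 - e^{\phi \cot t}}\right) + O(\varepsilon^2).$$

Next, one can verify that there exists $\delta_0 > 0$ such that for all $t \in [-\pi/4 - \delta_0, -\pi/4 + \delta_0]$, we have

$$\cos t - \frac{5}{4}\cdot\frac{e^{\phi \cot t}}{1 - e^{\phi \cot t}} > \frac{1}{4}.$$

Thus, for all $\varepsilon > 0$ small enough, we can choose $\beta \in (0, 1)$, so that the first two conditions in Lemma 3 are satisfied; namely,

$$|e_N| \le \alpha = \frac{5}{2}\cdot\varepsilon\cdot\frac{e^{\phi \cot t}}{1 - e^{\phi \cot t}} \qquad\text{and}\qquad |y| + \frac{\alpha}{2} < \beta := 1 - \frac{\varepsilon}{4}. \tag{36}$$

One then easily verifies that the third condition also holds for small enough $\varepsilon > 0$: here, $\alpha(1 - \beta) = \Omega(\varepsilon^2)$, and, by (34), we have $(|z|^2/\rho)^N = o(\varepsilon^2)$. So for $\varepsilon$ small enough, $(|z|^2/\rho)^N < \alpha(1 - \beta)$. This shows that the criterion for convergence of Lemma 3 is satisfied with values of $\alpha$ and $\beta$ specified in (36). As a consequence, $e_h(z) \to 0$ for all $z$ in a small enough sandclock.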
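The two numerical inequalities used in this proof, namely $\frac{1}{2} - \frac{3}{50\cos t} > \frac{2}{5}$ in (35) and $\cos t - \frac{5}{4}\cdot\frac{e^{\phi\cot t}}{1 - e^{\phi\cot t}} > \frac{1}{4}$, can be evaluated at the central angle $t = -\pi/4$. The sketch below does so; the function names are illustrative. It suggests that the second inequality holds with a rather thin margin at $t = -\pi/4$, which is consistent with $\delta_0$ having to be taken small.

```python
import math

PHI = math.acos(0.25)  # phi = arccos(1/4), as in the text


def margin_beta(t):
    """cos t - (5/4) e^{phi cot t} / (1 - e^{phi cot t}) - 1/4.

    A positive value means the inequality preceding (36) holds at angle t.
    """
    q = math.exp(PHI / math.tan(t))
    return math.cos(t) - 1.25 * q / (1.0 - q) - 0.25


def margin_35(t):
    """(1/2 - 3 / (50 cos t)) - 2/5.

    A positive value means the leading constants in (35) compare as claimed.
    """
    return 0.5 - 3.0 / (50.0 * math.cos(t)) - 0.4
```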


4. Main approximation

In this section, we develop precise quantitative estimates of $e_h(z)$ near the singularity $\rho$ and in a sandclock; these estimates serve as the main ingredient required for developing limit laws for height in the next section.

Proposition 3 (Main estimate for $e_h$ in a sandclock). There exist $r_1, \theta_1 > 0$ and $K, K' > 0$ such that for all $z \in S(r_1, \theta_1)$ and all $h \ge 1$,

$$\frac{y^h}{e_h} = \frac{1}{2}\,\frac{1 - y^h}{1 - y} + R_h(z), \qquad\text{where}\quad |R_h(z)| \le K \min\left\{\log\frac{1}{1 - |y|},\ \log(1 + h)\right\}. \tag{37}$$

Furthermore, $|R_h - R_{h+1}| < K'/h$.

In order to prove this proposition, we need a better control on error terms, which can be achieved by extending the bounds obtained in Section 3 for $h > N$, knowing now that the $e_h$ converge (Proposition 2). The proof requires the bounds to be uniform both in the distance to the singularity $|z - \rho|$ and in the height $h$, as expressed by Lemmas 8 and 9 below. The bound (38) below serves as a useful complement to the lower bound in (24), which only holds for $h \le N$.

Lemma 8 (Uniform lower bound for $|e_h|$). For any $\delta \in (0, 1)$, there exist constants $r_1, \theta_1 > 0$ such that if $z \in S(r_1, \theta_1)$ then one has

$$|e_h(z)| \ge \frac{(1 - \delta)^{h+2}}{2(h+1)}, \qquad\text{for all } h \ge 0. \tag{38}$$

Proof. Let $\delta \in (0, 1)$. We have $|y| > 1 - \delta/4$ provided $r := |z - \rho| < r_0$ with $r_0$ small enough. Then, by Lemma 6, the estimate (38) holds for $h \le N$.

We thus only need to consider now the case $h > N$. Assume further that $z \in S(r_0, \theta_0)$, as in Proposition 2. Then, $|e_h| \le \alpha$, for $\alpha$ as in (36). The recurrence relation (8) implies

$$|e_{h+1}| \ge |y|\,|e_h|\left(1 - \frac{|e_h|}{2|y|}\right) - \frac{|e_h(z^2)|}{2} \ge |y|\left(1 - \frac{\alpha}{2|y|}\right)\cdot|e_h| - \frac{|e_h(z^2)|}{2}. \tag{39}$$

However, by (36), we have $|y| + \alpha/2 < 1$ so that $|y| - \alpha/2 > 1 - \delta/2$. Lemma 2, which serves to bound the Polya term $|e_h(z^2)|$, then yields $|e_{h+1}| \ge (1 - \delta/2)|e_h| - (\rho + r)^h$. Therefore, dividing both sides of the recurrence relation by $(\rho + r)^{h+1}$, we obtain for $h \ge N$,

$$\frac{|e_{h+1}|}{(\rho + r)^{h+1}} \ge \left(\frac{1 - \delta/2}{\rho + r}\right)\frac{|e_h|}{(\rho + r)^h} - \frac{1}{\rho + r}. \tag{40}$$

The remainder of the proof then consists in extracting the desired bound (38) from the latter relation by unfolding the recurrence from $h$ down to $N$. To this effect, recall that, by Lemma 6, $|e_N| > |y|^{N+1}/(2(N+1))$ and $N < K/\sqrt{r}$, for some constant $K$. Hence, we can set $r$ to a value small enough that

$$\frac{|e_N|}{(\rho + r)^N} > \frac{2}{\delta} \qquad\text{and}\qquad \frac{1 - \delta}{\rho + r} > 1. \tag{41}$$

Then, for such $r$, using (40) and (41), it is easily verified by induction on $h$ (with $h \ge N$) that $|e_h|/(\rho + r)^h > 2/\delta$. Using this last bound in (40) gives, for $h \ge N$:

$$\frac{|e_{h+1}|}{(\rho + r)^{h+1}} \ge \left(\frac{1 - \delta}{\rho + r}\right)\frac{|e_h|}{(\rho + r)^h} \ge \left(\frac{1 - \delta}{\rho + r}\right)^{h+1-N}\frac{|e_N|}{(\rho + r)^N}.$$


We can finally recover the information on $|e_h|$ by means of the lower bound for $|e_N|$ in Lemma 6. For all $h > N$, we then have

$$|e_h| \ge |e_N| \cdot (1 - \delta)^{h-N+1} \ge \frac{(1 - \delta)^{h+2}}{2(N+1)} \ge \frac{(1 - \delta)^{h+2}}{2(h+1)},$$

and the proof is complete.

We can now develop a uniform upper bound for $|e_h|$ when $z \in S(r_1, \theta_1)$.

Lemma 9 (Uniform upper bound for $|e_h(z)|$). There exist constants $r_1, \theta_1 > 0$ and $c_1 > 0$ such that, for any $h \ge 1$, and $z \in S(r_1, \theta_1)$, we have

$$|e_h(z)| \le \frac{c_1}{h}.$$

Proof. Write $1 - y = \varepsilon e^{it}$ for some $\varepsilon > 0$ and $t$. It suffices to prove that the result holds for all such $z$ provided $\varepsilon$ is small enough and $t$ close enough to $-\pi/4$. Observe that $\varepsilon$ is of the order of $\sqrt{|z - \rho|}$.

Our starting point is again (33), which we now use to obtain an upper bound on $|e_h|$. The first term $M$ is such that

$$|M| = \left|\frac{1}{2y}\,\frac{1 - y^h}{1 - y}\right| = \frac{1}{2|y|}\cdot\frac{|1 - y^h|}{|1 - y|} \ge \frac{1 - |y|^h}{2|1 - y|} = \frac{1 - |y|^h}{2\varepsilon}. \tag{42}$$

On the other hand, for all $h \ge 0$ and $\varepsilon > 0$ small enough, by Lemmas 2 and 8, the first error term $A$ in (33) satisfies

$$|A| \le \frac{1}{|e_0|} + \frac{1}{2|y|} \sum_{i=0}^{h-1} \left|\frac{e_i(z^2)}{e_i(z)^2}\right| \le \frac{1}{|e_0|} + \frac{1}{2(1 - \varepsilon)} \sum_{i=0}^{\infty} 4(i+1)^2 \left(\frac{\rho + \varepsilon}{(1 - \varepsilon)^2}\right)^i.$$

There exists $\varepsilon_1 > 0$ such that for all $\varepsilon < \varepsilon_1$ the geometric term in the series above is at most $2\rho < 1$; together with the fact that $e_0 = y - z = 1 - \rho + O(\varepsilon)$, this implies that $|A| \le 11/(1 - 2\rho)^3$.

We now bound the second error term $B$ in (33). Note first that, for all $\varepsilon$ small enough, we have $|e_i| \le 1/2$ for all $i \ge 0$: for $h \le N$, this is implied by Lemma 6, while for $h \ge N$ we have $|e_h| < \alpha < 2(1 - |y|) = O(\varepsilon)$. Furthermore, by Lemmas 2 and 8, for all $\varepsilon < \varepsilon_1$ small enough, $|e_i(z^2)|/|e_i(z)^2| < 1/100$, for $i \ge h_0$ depending only on $\varepsilon_1$. It follows that

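$$|B| \le \frac{3}{2} h_0 + \frac{1}{8}\cdot\frac{(101/100)^2}{1 - \frac{1}{4}\cdot\frac{101}{100}}\cdot\sum_{i=h_0+1}^{h} |y|^{i-2} \le \frac{3}{2} h_0 + \frac{1}{5}\cdot\frac{1 - |y|^h}{1 - |y|}.$$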

As a consequence, using Lemma 4, and combining the bounds just obtained on $|A|$ and $|B|$ with (42), one sees that, for all $h \ge 0$,

$$\frac{|y^h|}{|e_h|} \ge \frac{1 - |y|^h}{\varepsilon}\left(\frac{1}{2} - \frac{1}{5 \cos t}\right) - h_0 - \frac{11}{(1 - 2\rho)^3} > \frac{1 - |y|^h}{5\varepsilon} - h_0 - \frac{11}{(1 - 2\rho)^3}, \tag{43}$$

for $|\pi/4 + t|$ small enough.

The relation above provides a decent upper bound on $|e_h|$ provided that $|y^h|$ is small enough. With this in mind, we now prove an upper bound on $|y^h|$ for all $h \ge 0$. First, when $h$ is not too large, $|y|^h$ should decrease at least linearly in $h$: we show that for some small enough $\delta > 0$, $|y|^h \le 1 - \delta h \varepsilon$ for all $h \le N$. For fixed $z$, the sequence $(|y|^h, h \ge 0)$ is convex; thus if $|y|^{N'} \le 1 - \delta N' \varepsilon$ for some $N' \ge N$, then $|y|^h \le 1 - \delta h \varepsilon$ for all $0 \le h \le N$. Recall that $\phi = \arccos(1/4)$; we now prove that we may take $N' := -2\phi/(\varepsilon \sin t)$. By (34), for $\varepsilon$ small enough, $|y| \le 1 - \frac{\varepsilon}{2} \cos t$ and $N < N'$ and

$$|y|^{N'} \le \left(1 - \frac{\varepsilon \cos t}{2}\right)^{N'} \le \exp\left(-\frac{N' \varepsilon \cos t}{2}\right) = e^{\phi \cot t}.$$

However, for $|t + \pi/4| < 1/100$, we have $e^{\phi \cot t} < 1/2$, so that we can pick $\delta > 0$ such that

$$e^{\phi \cot t} < 1 + \frac{2\delta\phi}{\sin t} = 1 - \delta N' \varepsilon.$$

It follows by (43) that there exists $\delta > 0$ small enough such that $|e_h| \le 10/(\delta h)$, for $0 \le h \le N'$, for all $|\pi/4 + t| < 1/100$ and $\varepsilon > 0$ small enough.

On the other hand, if $h \ge N$ and $\varepsilon > 0$ is small enough and $|\pi/4 + t| < 1/100$, then $1 - |y|^h \ge 1 - 2e^{\phi \cot t} > 1/4$ by (34). As a consequence,

$$|e_h| \le 40\varepsilon|y|^h \le 40\varepsilon(1 - \varepsilon \cos t + O(\varepsilon^2))^h \le 40\varepsilon(1 - \varepsilon/2)^h,$$

for all $\varepsilon$ small enough and $t$ close enough to $-\pi/4$. Now, seen as a function of $\varepsilon$, the maximum of the right-hand side above is obtained for $\varepsilon = 2/(h+1)$, which implies that $|e_h| \le 80/(h+1)$, for $h \ge N$. Finally, by Lemma 6, and the bounds above, the result follows by choosing $c_1 = \max\{h_0, 10/\delta, 80\}$.
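The optimisation step at the end of the proof, where $40\varepsilon(1 - \varepsilon/2)^h$ is maximised over $\varepsilon$, can be checked with a crude grid search. This is only an illustrative sketch (the function name and grid resolution are arbitrary choices): it locates the maximiser numerically, so that it can be compared with $2/(h+1)$ and the maximum with $80/(h+1)$.

```python
def argmax_eps(h, steps=20000):
    """Grid search for the maximiser of 40 * eps * (1 - eps/2)**h over eps in (0, 2).

    Returns the pair (best_eps, best_value) found on a uniform grid.
    """
    best_eps, best_val = 0.0, 0.0
    for k in range(1, steps):
        eps = 2.0 * k / steps
        val = 40.0 * eps * (1.0 - eps / 2.0) ** h
        if val > best_val:
            best_eps, best_val = eps, val
    return best_eps, best_val
```

Setting the derivative $(1 - \varepsilon/2)^{h-1}\bigl(1 - (h+1)\varepsilon/2\bigr)$ to zero gives the same maximiser in closed form.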

Proof of Proposition 3. The proof consists in using Lemma 9 above to bound the error terms in (33) for $z \in S(r_2, \theta_2)$, with $r_2 = \min\{r_0, r_1\}$ and $\theta_2 = \min\{\theta_0, \theta_1\}$. For some constants $c_2$ and $c_3$, we have

$$|A| + |B| \le \frac{11}{(1 - 2\rho)^3} + c_2\left(1 + \sum_{i=1}^{h} \frac{|y^i|}{i}\right) \le c_3 \min\left\{\log\left(\frac{1}{1 - |y|}\right),\ 1 + \log h\right\},$$

which proves the main statement of Proposition 3. Finally, since $A$ and $B$ are partial sums, we obtain

$$|R_h - R_{h+1}| = \left|\frac{y^{h-1}}{2}\,\frac{e_h(z^2)}{e_h(z)^2} + \frac{y^{h-2}}{4}\, e_h \left[1 - \frac{e_h(z^2)}{e_h(z)^2}\right]^2 \left(1 - \frac{e_h}{2y}\left[1 - \frac{e_h(z^2)}{e_h(z)^2}\right]\right)^{-1}\right|,$$

a quantity which is easily seen to be uniformly $O(1/h)$, thanks to Lemmas 8 and 9.

5. Asymptotic analysis and distribution estimates

The basis of our estimates relative to the distribution of height is the main approximation of $e_h$ in Proposition 3, which is valid in a fixed sandclock at $\rho$. Given its importance, we repeat it under the simplified form:

$$e_h(z) \equiv y(z) - y_h(z) \approx 2\,\frac{1 - y}{1 - y^h}\, y^h. \tag{44}$$

(Here, the symbol “$\approx$” is to be loosely interpreted in the sense of “approximately equal”.) This approximation acquires a precise meaning when $z$ remains fixed and $h$ tends to infinity, in which case it expresses the geometric convergence of $e_h(z)$ to 0 (since $|y| < 1$); also, when $h$ remains fixed and $z$ tends to $\rho$, it reduces to the numerical approximation $e_h(\rho) \approx 2/h$, whose accuracy increases with increasing values of $h$. In other words, the precise version of (44) provided by Proposition 3 consistently covers, in a uniform manner, the case when both $z \to \rho$ and $h \to \infty$. (Analogues of the formula (44) surface in the case of general plane trees in [12], plane binary trees [18], and labelled Cayley trees in [38].)
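To make the objects $y_h(z)$ and $e_h(z) = y(z) - y_h(z)$ concrete, the sketch below computes, with exact integer arithmetic, the number of trees with $n$ leaves and height at most $h$. It assumes the standard recurrence $y_0(z) = z$, $y_{h+1}(z) = z + \frac{1}{2} y_h(z)^2 + \frac{1}{2} y_h(z^2)$ (the recurrence (8) itself is not reproduced in this section, so this form is an assumption, consistent with $e_0 = y - z$ used above); the function name is illustrative.

```python
def trees_by_height(n_max, h_max):
    """Return a table Y with Y[h][n] = number of trees with n leaves and height <= h.

    Assumes y_0(z) = z and y_{h+1}(z) = z + y_h(z)^2 / 2 + y_h(z^2) / 2,
    with all series truncated at degree n_max.
    """
    y = [0] * (n_max + 1)
    y[1] = 1                                   # y_0(z) = z: a single leaf, height 0
    table = [y]
    for _ in range(h_max):
        nxt = [0] * (n_max + 1)
        nxt[1] = 1
        for n in range(2, n_max + 1):
            ordered_pairs = sum(y[i] * y[n - i] for i in range(1, n))
            symmetric_pairs = y[n // 2] if n % 2 == 0 else 0
            # Unordered pairs of subtrees: (ordered + symmetric) / 2, an exact division.
            nxt[n] = (ordered_pairs + symmetric_pairs) // 2
        y = nxt
        table.append(y)
    return table
```

For $h \ge n - 1$ the entries stabilise to the total counts $y_n$, and the coefficient of $z^n$ in $e_h(z)$ is the difference between the stabilised value and `Y[h][n]`.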

The exploitation of the enhanced versions of (44) relies on Cauchy’s coefficient formula (10). The contour $\gamma$ in Cauchy’s integral (10) will be comprised of several arcs and line segments⁴ that lie outside of the disc $|z| \le \rho$ and are taken in the union of a suitable sandclock (as granted by Proposition 3) and of a tube, overlapping with the sandclock (where properties of Proposition 1 are in effect). The strategy just described belongs to the general orbit of singularity analysis methods expounded in [19, Ch. VI–VII]. We propose to apply it to the height-related generating functions $e_h(z)$ (weak limit, Theorem 1) and $e_{h-1}(z) - e_h(z)$ (local limit law, Theorem 2).

Figure 3. Fine details of the Cauchy integration contour $\gamma$ in the vicinity of $\rho$.

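Before proceeding with the proof of Theorem 1, recall that we aim at showing that for any fixed $x > 0$, we have

$$\lim_{n \to \infty} \mathbb{P}(H_n \ge \lambda^{-1} x \sqrt{n}) = \Theta(x), \qquad \lambda := \sqrt{2\rho + 2\rho^2 y'(\rho^2)},$$

where $\Theta(x) := \sum_{k \ge 1} (k^2 x^2 - 2)\, e^{-k^2 x^2/4}$.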
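The limit function $\Theta(x)$ is straightforward to evaluate numerically. The sketch below (function name and stopping tolerance are illustrative choices) sums the series until the terms become negligible; it can be used to observe that $\Theta(x)$ is very close to 1 for small $x$ and behaves like its first term $(x^2 - 2)e^{-x^2/4}$ for large $x$.

```python
import math


def theta_tail(x, tol=1e-16):
    """Evaluate Theta(x) = sum_{k>=1} (k^2 x^2 - 2) * exp(-k^2 x^2 / 4) for x > 0."""
    total, k = 0.0, 1
    while True:
        u = (k * x) ** 2
        term = (u - 2.0) * math.exp(-u / 4.0)
        total += term
        # Past u = 8 the terms are positive and decreasing, so it is safe to stop
        # once they drop below the tolerance.
        if u > 8.0 and abs(term) < tol:
            return total
        k += 1
```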

Proof of Theorem 1. We aim at using Cauchy’s formula (10) with a well-chosen⁵ integration contour $\gamma$. The reader should consult Figures 2 and 3. First, we choose a priori a sandclock $S$, whose existence is granted by Proposition 2 and such that the approximation properties of Proposition 3 hold. By design, this sandclock contains in its interior a small arc of the circle $|z| = \rho$. Choose arbitrarily a point $z_0$ on this small arc, with $z_0 \ne \rho$, $\Im(z_0) > 0$, and set $z_0 = \rho + r_2 e^{i\pi/2 + i\theta_0}$. Proposition 1 guarantees the existence of a tube $T$ that has $z_0$ in its interior and for which the convergence $e_h \to 0$ is ensured. We have now determined a sandclock and a partially overlapping tube, whose union will be seen to contain the contour $\gamma$ (where $e_h \to 0$) and whose intersection contains $z_0 = \rho + r_2 e^{i\pi/2 + i\theta_0}$.

⁴ In order to have well-defined determinations of square roots, one may think of the two segments as in fact joined by an infinitesimal arc of a circle that passes to the left of the singularity $\rho$.

⁵ It might be that none of the tubes corresponding to Proposition 1 includes points to the right of the vertical line $\Re(z) = \rho$, hence the need to insert “joins” $\gamma_4$ and $\gamma_5$. (The discussion of this case was inadvertently omitted from the earlier version [7].) An alternative would be to make use of a contour that is squeezed in between the circle $|z| = \rho$ and the vertical line $\Re(z) = \rho$ (this is done in [38], where the circle itself is used); but then the near stationarity of the modulus of the Cauchy kernel, $|z|^{-n}$, makes it technically harder, or at least less transparent, to translate approximations of generating functions into coefficient estimates.


The contour $\gamma$ is essentially a Hankel contour escaping from $\rho$ along rectilinear portions $\gamma_1$ and $\gamma_2$ such that

$$\gamma_1 = \overline{\gamma_2} = \left\{\rho + \xi e^{i\pi/2 - i\theta_1} : \xi \in [0, r_2]\right\},$$

where $\theta_1$ is chosen positive and strictly less than the half-angle of the sandclock $S$. By design, the segments $\gamma_1$ and $\gamma_2$ lie entirely inside the sandclock $S$.

The component $\gamma_3$ of the contour is a subarc of the circle

$$C_n := \{z : |z| = \rho_n\}, \qquad\text{where}\quad \rho_n := \rho\left(1 + \frac{\log^2 n}{n}\right).$$

Precisely, let $z_1 \equiv z_1(n)$ be the intersection point in the upper half-plane of the circle $C_n$ and the circle $|z - \rho| = r_2$. When $n$ gets large, this point $z_1$ comes closer and closer to $z_0$, so that, for all $n$ large enough, it must belong to the intersection $S \cap T$. In other words, we can write

$$z_1 = \rho_n e^{i\eta_2} = \rho + r_2 e^{i\pi/2 + i\theta_2},$$

where $\eta_2 \equiv \eta_2(n)$ and $\theta_2 \equiv \theta_2(n)$ both depend on $n$ and tend to finite limits as $n \to +\infty$ (in particular, $\theta_2 \to \theta_0$). Then we take

$$\gamma_3 = \left\{\rho_n e^{i\theta} : \theta \in [\eta_2, 2\pi - \eta_2]\right\},$$

and for $n$ large enough, the arc $\gamma_3$ entirely lies in the tube $T$.

We can finally complete the contour to make it connected, with joining arcs $\gamma_4$ and $\gamma_5$, which are arcs of $|z - \rho| = r_2$ defined by

$$\gamma_4 = \overline{\gamma_5} = \left\{\rho + r_2 e^{i\pi/2 + i\theta} : \theta \in [-\theta_1, \theta_2]\right\},$$

so that both arcs lie inside the sandclock $S$.

Outer circular arc $\gamma_3$. By Proposition 1, we have $e_h(z) \to 0$ uniformly on $\gamma_3$ as $h \to \infty$. In particular, all moduli $|e_h(z)|$ are bounded by an absolute⁶ constant $K$. On the other hand the Cauchy kernel $z^{-n}$ is small on the contour, so that

$$\left|\int_{\gamma_3} e_h(z)\, \frac{dz}{z^{n+1}}\right| < K_1 \rho^{-n} \exp\left(-\log^2 n\right). \tag{45}$$

Join portions $\gamma_4$, $\gamma_5$. By Proposition 2, one has $e_h \to 0$ uniformly on $\gamma_4 \cup \gamma_5$ as $h \to \infty$. In particular $|e_h(z)| \le K_2$ for some absolute constant $K_2$. By definition, for all $z \in \gamma_4 \cup \gamma_5$, $|z| \ge \rho_n$ so that, for the same reasons as in (45),

$$\left|\int_{\gamma_4 \cup \gamma_5} e_h(z)\, \frac{dz}{z^{n+1}}\right| \le K_3 \rho^{-n} \exp(-\log^2 n). \tag{46}$$

Outer rectilinear parts of $\gamma_1$ and $\gamma_2$. Let $D_n := \{z : |z - \rho| \ge \delta_n\}$, with

$$\delta_n = \frac{\log^2 n}{n}.$$

Note that for $z \in \gamma_1 \cap D_n$, we have $|z| \ge \rho + \delta_n \sin\theta_1$. For the same reason as before,

$$\left|\int_{(\gamma_1 \cup \gamma_2) \cap D_n} e_h(z)\, \frac{dz}{z^{n+1}}\right| \le K_4 \rho^{-n} \exp(-K_4' \log^2 n). \tag{47}$$

The total contribution of the outer circular arc $\gamma_3$, of both join portions $\gamma_4$ and $\gamma_5$, and of the outer rectilinear parts $\gamma_1 \cap D_n$, $\gamma_2 \cap D_n$ is thus exponentially small compared to $y_n$, hence totally negligible in the present context.

⁶ In what follows, we use generically $K, K_1, \ldots$ to denote absolute positive constants, not necessarily of the same value at different occurrences.


Inner rectilinear parts of $\gamma_1$ and $\gamma_2$. This is where action takes place. From now on, we operate with the normalization

$$h = \lambda^{-1} x \sqrt{n},$$

where $x$ is taken to range over a fixed compact interval of $\mathbb{R}_{>0}$. We now focus on the portions of $\gamma_1$ and $\gamma_2$ lying outside $D_n$. We denote them by $\tilde\gamma_1$ and $\tilde\gamma_2$, respectively, and note that all their points are at a distance from $\rho$ that tends to 0, as $n \to +\infty$. Our objective is to replace $e_h$ by the simpler quantity

$$\hat e_h(z) \equiv \hat e_h := 2\,\frac{1 - y}{1 - y^h}\, y^h, \tag{48}$$

as suggested by Proposition 3. Along $\tilde\gamma_1$, $\tilde\gamma_2$, the singular expansion of $y(z)$ applies, so that $1 - y = O((\log n)/n^{1/2})$ and the error term $R_h(z)$ from Equation (37) is $O(\log n)$. It follows that $(1 - y^h)/(1 - y)$ is always at least as large in modulus as $K_5 \sqrt{n}/\log n$ (this, by a study of the variation of $|1 - e^{-h\tau}|/|1 - e^{-\tau}|$), and we have

$$\frac{y^h}{e_h} = \frac{y^h}{\hat e_h}\left(1 + O\left(\frac{\log^2 n}{\sqrt{n}}\right)\right). \tag{49}$$

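It proves convenient to define

$$E(h, n) := \frac{1}{2i\pi} \int_{\tilde\gamma_1 \cup \tilde\gamma_2} \hat e_h\, \frac{dz}{z^{n+1}}, \tag{50}$$

and to make the change of variables

$$z = \rho\left(1 - \frac{t}{n}\right), \qquad dz = -\frac{\rho}{n}\, dt. \tag{51}$$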

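With this rescaling, the point $t$ then starts from $-i\rho^{-1} n \delta_n e^{-i\theta_1}$, loops to the right of the origin, then steers away to $i\rho^{-1} n \delta_n e^{i\theta_1}$. Given the singular expansion of $y(z)$ in (2), we have on the small arcs $\tilde\gamma_1$, $\tilde\gamma_2$,

$$z^{-n} = \rho^{-n} e^{t}\left(1 + O\left(\frac{\log^4 n}{n}\right)\right), \qquad y(z) = 1 - \lambda\sqrt{\frac{t}{n}} + O\left(\frac{t}{n}\right), \tag{52}$$

and, with $h = \lambda^{-1} x \sqrt{n}$ and $|t| \le K_6 \log^2 n$, since $\delta_n = \log^2 n/n$:

$$y^h = \exp(-x\sqrt{t})\left(1 + O\left(\frac{t}{\sqrt{n}}\right)\right) = \exp(-x\sqrt{t})\left(1 + O\left(\frac{\log^2 n}{\sqrt{n}}\right)\right). \tag{53}$$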

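We also find⁷, for the range of values of $t$ corresponding to $\tilde\gamma_1$, $\tilde\gamma_2$:

$$\begin{aligned}
\frac{1 - y^h}{1 - y} &= \frac{1 - \exp(-x\sqrt{t})(1 + t/\sqrt{n})}{\lambda\sqrt{t/n}}\left(1 + O\left(\frac{\log^{\star} n}{\sqrt{n}}\right)\right) \\
&= \left[\sqrt{n}\cdot\frac{1 - \exp(-x\sqrt{t})}{\lambda\sqrt{t}} + O(\sqrt{t})\right]\left(1 + O\left(\frac{\log^{\star} n}{\sqrt{n}}\right)\right) \\
&= \left[\sqrt{n}\cdot\frac{1 - \exp(-x\sqrt{t})}{\lambda\sqrt{t}}\right]\left(1 + O\left(\frac{\log^{\star} n}{\sqrt{n}}\right)\right).
\end{aligned} \tag{54}$$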

The approximations (52), (53), and (54) motivate considering, as an approximation of $E(h, n)$ in (50), the contour integral

$$J(X) := \frac{1}{2i\pi} \int_{L} \frac{\exp(-X\sqrt{t})}{1 - \exp(-X\sqrt{t})}\, \sqrt{t}\, e^{t}\, dt = \frac{1}{2i\pi} \sum_{k \ge 1} \int_{L} \exp(-kX\sqrt{t})\, \sqrt{t}\, e^{t}\, dt, \tag{55}$$

where $L$ goes from $-\infty + i\infty$ to $-\infty - i\infty$ and winds to the right of the origin. We now make $J(X)$ explicit. Each integral on the right side of (55) can be evaluated by the change of variables $w = i\sqrt{t}$, equivalently, $t = -w^2$. By completing the square and flattening the image contour $L'$ onto the real line, we obtain:

$$J(X) = \frac{1}{4\sqrt{\pi}} \sum_{k \ge 1} e^{-k^2 X^2/4}\,(k^2 X^2 - 2). \tag{56}$$

From the chain of approximations in Equations (48) to (55), we are then led to expect the approximation

$$E(h, n) \sim 2\lambda \rho^{-n} n^{-3/2} J(x),$$

⁷ The expression $\log^{\star} n$ represents an unspecified positive power of $\log n$.

which is justified next.

Error management. In order to justify the replacement of $e_h$ by $\hat e_h$, following (49) and (50), we observe the estimate

$$\left|\int_{\tilde\gamma_1 \cup \tilde\gamma_2} |y|^h\, \frac{|1 - y|}{|1 - y^h|}\, \frac{|dz|}{|z|^{n+1}}\right| = O\left(\rho^{-n}\, \frac{\log^4 n}{n^{3/2}}\right). \tag{57}$$

This results from the discussion of the lower bound on $(1 - y^h)/(1 - y)$ that follows (48), the inequality $|y^h| \le 1$, and the fact that the length of the integration interval is $O(\log^2 n/n)$. The error in our approximation has three sources: the two successive replacements

$$e_h \mapsto \hat e_h, \qquad \frac{1 - y^h}{1 - y} \mapsto \frac{1 - \exp(-x\sqrt{t})}{\lambda\sqrt{t/n}} \tag{58}$$

and the integration on a finite contour. We have, for $z \in \tilde\gamma_1 \cup \tilde\gamma_2$:

$$e_h = \hat e_h\left(1 + O\left(\frac{\log^2 n}{\sqrt{n}}\right)\right) = 2\lambda\sqrt{\frac{t}{n}}\cdot\frac{\exp(-x\sqrt{t})}{1 - \exp(-x\sqrt{t})}\cdot\left(1 + O\left(\frac{\log^{\star} n}{\sqrt{n}}\right)\right).$$

Finally, the infinite extension of the contour only entails an additive error term of the form $O(\exp(-K \log^4 n))$, since

$$\int_{\log^2 n}^{\infty} e^{-w^2}\, dw = O(e^{-\log^4 n}).$$

This implies, for $h = \lambda^{-1} x \sqrt{n}$:

$$e_{h,n} \equiv [z^n] e_h(z) = 2\lambda \rho^{-n} n^{-3/2} J(x) + O\left(\rho^{-n}\, \frac{\log^{\star} n}{n^2}\right).$$

The explicit form of $J(X)$ in (56) and the asymptotic form of $y_n$ (Lemma 1) jointly yield the statement.

The main message of the proof of Theorem 1 is twofold: (i) for any “reasonable” expression involving $e_h$, the estimation of the Cauchy coefficient formula can be limited to a small neighbourhood of $\rho$ (parts $\gamma_1$ and $\gamma_2$), since the other parts of the contour $\gamma$ have exponentially negligible contributions; (ii) the approximation provided by Proposition 3 and Equation (37) is normally sufficient to derive first-order asymptotic estimates.

The convergence in law expressed by Theorem 1 is illustrated by Figure 4. The proof of the theorem points to an error term, in the convergence to the limit, that is of the form $O((\log^a n)/\sqrt{n})$, with an unspecified exponent $a$. As a matter of fact, the value $a = 1$ is suggested by the logarithmic character of the error term in (37) of Proposition 3. Convergence is, at any rate, somewhat slow, a fact that is perceptible from Figure 4.

Figure 4. The normalized distribution functions $\mathbb{P}(H_n \le \lambda^{-1} x \sqrt{n})$, for $n = 10, 20, 50, 100, 200, 500$, as a function of $x$, and the limit distribution function $1 - \Theta(x)$, where $\Theta(x)$ is specified in Theorem 1.

In another register, the distribution function $1 - \Theta(x)$ belongs to the category of elliptic theta functions [42, Ch. XXI], which are of the rough form⁸ $\sum q^{k^2} e^{2ikz}$ and are well known to satisfy transformation formulae [42, p. 475]. Regarding $\Theta(x)$, such formulae provide an alternative form, which we state for the density function, $\vartheta(x) := -\Theta'(x)$:

$$\vartheta(x) = \frac{8\sqrt{\pi^3}}{x^3}\, \vartheta\left(\frac{4\pi}{x}\right). \tag{59}$$

Theorem 2 states that $H_n$ indeed satisfies a local limit law with density function $\vartheta(\,\cdot\,)$: for $x$ in a compact set of $\mathbb{R}_{>0}$ and $h = \lambda^{-1} x \sqrt{n}$ an integer, there holds uniformly

$$\mathbb{P}(H_n = h) \sim \frac{\lambda}{\sqrt{n}}\, \vartheta(x),$$

where $\vartheta(x) = -\Theta'(x) = (2x)^{-1} \sum_{k \ge 1} (k^4 x^4 - 6k^2 x^2)\, e^{-k^2 x^2/4}$.
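The transformation formula (59) lends itself to a direct numerical check. The sketch below evaluates the density series on both sides, reading the prefactor in (59) as $8\pi^{3/2}/x^3$; note that $x = 2\sqrt{\pi}$ is a fixed point of $x \mapsto 4\pi/x$ at which this prefactor equals 1. Function names and the truncation `k_max` are illustrative choices.

```python
import math


def density(x, k_max=200):
    """Limit density (2x)^{-1} * sum_{k>=1} (k^4 x^4 - 6 k^2 x^2) * exp(-k^2 x^2 / 4)."""
    s = 0.0
    for k in range(1, k_max + 1):
        u = (k * x) ** 2
        s += (u * u - 6.0 * u) * math.exp(-u / 4.0)
    return s / (2.0 * x)


def transformed(x):
    """Right-hand side of (59): (8 * pi^{3/2} / x^3) * density(4 * pi / x)."""
    return 8.0 * math.pi ** 1.5 / x ** 3 * density(4.0 * math.pi / x)
```

The comparison is best made for moderate $x$ (say between 1.5 and 8): for small $x$ the series defining `density(x)` suffers from heavy cancellation in floating point, which is precisely the regime where the transformed form is preferable.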

Proof of Theorem 2. We abbreviate the discussion, since it is technically very similar to the proof of Theorem 1: only the approximations near $z = \rho$ differ. Proceeding in this way, based on Proposition 3, we can justify approximating the number of trees of height exactly $h$ and size $n$ by the integral

$$\frac{1}{2i\pi} \int_{\tilde\gamma_1 \cup \tilde\gamma_2} (\hat e_{h-1} - \hat e_h)\, \frac{dz}{z^{n+1}},$$

with $\hat e_h$ as defined in (48). We have

$$\hat e_{h-1} - \hat e_h = 2 y^{h-1}\, \frac{(1 - y)^2}{(1 - y^h)(1 - y^{h-1})}. \tag{60}$$

The approximations (53) and (54) then motivate considering the quantity

$$J_1(X) := \frac{1}{2i\pi} \int_{L} \frac{\exp(-X\sqrt{t})}{(1 - \exp(-X\sqrt{t}))^2}\, t\, e^{t}\, dt.$$

One then finds (with the auxiliary estimate $|R_h - R_{h+1}| = O((\log^{\star} n)/\sqrt{n})$ provided by Proposition 3):

$$y_{n,h} - y_{n,h+1} = 2\lambda^2 \rho^{-n} n^{-2} J_1(x) + O\left(\rho^{-n}\, \frac{\log^{\star} n}{n^{5/2}}\right).$$

On the other hand, differentiation under the integral sign yields $J_1(X) = -J'(X)$, which proves the statement.

⁸ Do $q = e^{-x^2/4}$ and $z = 0$ to recover $\theta(x)$.

Figure 5. Left: the normalized histograms of the distribution of height $\mathbb{P}(H_n = h)$ (as a function of $x$, with $h = \lfloor \lambda^{-1} x \sqrt{n} \rfloor$), for $n = 100, 200, \ldots, 500$. Right: the limit density $\vartheta(x) = -\Theta'(x)$.

Figure 5 displays the normalized histograms of the distribution of height and a plot of the corresponding limit density.

Revisiting the proof of Theorems 1 and 2 shows that one can allow $x$ to become either small or large, albeit to a limited extent. Indeed, it can be checked, for instance, that allowing $x$ to get as large as $O(\sqrt{\log n})$ only introduces extra powers of $\log n$ in error estimates. However, such extensions are limited by the fact that the main theta term eventually becomes smaller than the error term. We state (compare with [20, Th. 1.1]):

Theorem 4 (Moderate deviations). There exist constants $A, B, C > 0$ such that for $h = (x/\lambda)\sqrt{n}$ with $A/\sqrt{\log n} \le x \le A\sqrt{\log n}$ and $n$ large enough, there holds

$$\left|\mathbb{P}(H_n \ge \lambda^{-1} x \sqrt{n}) - \Theta(x)\right| \le \frac{C}{n^{B}}. \tag{61}$$

In particular, if $x \to \infty$ in such a way that $x \le A\sqrt{\log n}$, then, uniformly,

$$\mathbb{P}(H_n \ge \lambda^{-1} x \sqrt{n}) \sim x^2 e^{-x^2/4}.$$

Similar estimates hold for the local law. These estimates can furthermore be supplemented by (very) large deviation estimates in the style of [20, Th. 1.4]:

Theorem 5 (Very large deviations). There exists a continuous increasing function $I(u)$ satisfying $I(u) > 0$ for $0 < u \le 1$ and such that, given any fixed $\delta > 0$, one has for all $x \in [\delta, 1 - \delta]$ and all $n$

$$\mathbb{P}(H_n \ge xn) \le K n^{3/2} e^{-n I(x)},$$

where $K$ only depends on $\delta$.


Proof. We propose to use saddle point bounds [19, p. 246]: for any $r \in (0, \rho)$, one has

$$\mathbb{P}(H_n \ge h) = \frac{e_{h,n}}{y_n} \le \frac{1}{y_n}\left(\frac{e_h(r)}{r^n}\right). \tag{62}$$

The first step is to obtain an upper bound on $e_h(r)$, for $r \in (0, \rho)$. For such $r$, all terms in the recurrence relation (8) are non-negative and expanding the relation with the help of Lemma 2 yields, for all $h \ge 0$, the inequality

$$e_{h+1}(r) \le y(r)\, e_h(r) + \left(\frac{r^2}{\rho}\right)^h \le y(r)^h \left(\sum_{i=1}^{h} \left(\frac{r^2}{\rho\, y(r)}\right)^i + e_1(r)\right).$$

However, it is easily verified that for all $r \in (0, \rho)$, we have $y(r) > r + r^2 + r^3 \ge r^2/\rho$. As a consequence, the series above converges and there exists a universal constant $K$ such that

$$e_h(r) \le K y(r)^h, \qquad\text{for } h \ge 0 \text{ and } r \in (0, \rho).$$

The last estimate, the saddle point bound (62), and Lemma 1 yield, in the region $h = xn$,

$$\mathbb{P}(H_n \ge h) \le K' n^{3/2} \left(y(r)^x\, \frac{\rho}{r}\right)^n,$$

for some other universal constant $K'$ and for any $r \in (0, \rho)$.

The goal is now to make an optimal choice of the value of $r$. For $x$ kept fixed and regarded as a parameter, we consider

$$J(r, x) := \frac{y(r)^x}{r}$$

as a function of $r$, henceforth abbreviated as $J(r)$. We have $J(0) = +\infty$ and $J(\rho) = \rho^{-1}$. The point, to be justified shortly, is that $J(r)$ decreases from $+\infty$ to some minimal value $J(\xi)$, when $r = \xi$; then it increases again to $\rho^{-1}$ for $r \in (\xi, \rho)$. In particular, we must have $J(\xi) < \rho^{-1}$, which suffices to imply a non-trivial exponential bound on the probabilities.

The unimodality of $J(r) \equiv J(r, x)$ results from the usual convexity properties of generating functions (see [13] or [19, pp. 250 and 580]). Indeed it suffices to observe that the logarithmic derivative (all derivatives being taken with respect to $r$), namely,

$$\frac{J'(r)}{J(r)} = x\left(\frac{r y'(r)}{y(r)} - \frac{1}{x}\right),$$

varies monotonically from $x - 1 \le 0$ to $+\infty$, as $r$ varies from 0 to $\rho$. This last fact is a consequence of the positivity of

$$v := \frac{\partial}{\partial r}\left(\frac{r y'(r)}{y(r)} - \frac{1}{x}\right),$$

itself granted, since $V = rv$ is the variance of a random variable $X$ with probability generating function $\mathbb{E}[u^X] = J(ru)/J(r)$.

In summary, from the preceding considerations, the system

$$I(x) = x \log y(\xi) - \log \xi + \log \rho \qquad\text{with } \xi = \xi(x) \text{ such that}\quad x\,\xi\, y'(\xi) - y(\xi) = 0$$

uniquely determines a function $I(x)$, which precisely satisfies the properties asserted in Theorem 5.

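Finally, the approximation of $e_h$ by $\hat e_h$ in (48) is good enough to grant us access to moments (cf. also [18]) stated in Theorem 3: as $n \to \infty$, we have

$$\mathbb{E}[H_n] \sim \frac{2}{\lambda}\sqrt{\pi n} \qquad\text{and}\qquad \mathbb{E}[H_n^r] \sim r(r-1)\,\zeta(r)\,\Gamma(r/2)\left(\frac{2}{\lambda}\right)^r n^{r/2}, \quad r \ge 2.$$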


Proof of Theorem 3. The problem reduces to estimating generating functions of the form

$$M_r(z) = 2(1 - y)^2 \sum_{h \ge 1} h^r\, \frac{y^h}{(1 - y^h)^2},$$

which are accessible to the Mellin transform technology [21], upon setting $y = e^{-t}$. If we let $F_r(t) = \sum_{h \ge 1} h^r\, \frac{e^{-ht}}{(1 - e^{-ht})^2}$, then the Mellin transform $F_r^{\star}(s)$ is given by

$$F_r^{\star}(s) = \zeta(s - r)\,\zeta(s - 1)\,\Gamma(s),$$

and is valid in the fundamental strip $\Re(s) > r + 1$. The information relative to the distribution is concentrated around the singularity, hence for values of $y$ such that $y \to 1$, or equivalently $t \to 0$. The asymptotics of $F_r(t)$ as $t \to 0$ correspond to the singular expansion of its Mellin transform $F_r^{\star}(s)$ to the left of the strip.

For $r \ge 2$, the main contribution is due to the simple pole at $s = r + 1$, which has residue $\zeta(r)\Gamma(r + 1)$. It follows that

$$F_r(t) \sim \zeta(r)\,\Gamma(r + 1)\, t^{-r-1}, \qquad r \ge 2.$$

Since $1 - y \sim \lambda\sqrt{1 - z/\rho}$, and $y = e^{-t}$, we have $t \sim \lambda\sqrt{1 - z/\rho}$ and

$$M_r(z) \sim 2\,\zeta(r)\,\Gamma(r + 1)\,\lambda^{-r+1}\,(1 - z/\rho)^{-(r-1)/2}.$$

Singularity analysis theorems imply

$$[z^n] M_r(z) \sim 2\,\zeta(r)\,\lambda^{-r+1}\,\Gamma(r + 1)\, \frac{\rho^{-n}\, n^{(r-3)/2}}{\Gamma((r - 1)/2)}.$$

The duplication formula for the Gamma function, combined with the estimate for $y_n$, then yields:

$$\mathbb{E}[H_n^r] \sim \frac{[z^n] M_r(z)}{y_n} \sim \left(\frac{2}{\lambda}\right)^r \zeta(r)\, r(r-1)\,\Gamma(r/2)\, n^{r/2}, \qquad r \ge 2.$$

When $r = 1$, the Mellin transform $F_r^{\star}(s)$ has a double pole at $s = 2$ and the asymptotic form of $F_r(t)$ at zero involves logarithmic terms. We then obtain, as $n \to \infty$,

$$\mathbb{E}[H_n] \sim 2\lambda^{-1}\sqrt{\pi n}$$

using similar arguments.
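The moment formula can be cross-checked against the limit law itself. If $X$ denotes a random variable with $\mathbb{P}(X \ge x) = \Theta(x)$, then $\mathbb{E}[X^r] = r\int_0^\infty x^{r-1}\Theta(x)\,dx$, and term-by-term integration gives $r(r-1)\zeta(r)\Gamma(r/2)\,2^r$ for $r \ge 2$, which matches the theorem after rescaling by $\lambda^{-1}\sqrt{n}$. The sketch below compares a trapezoidal estimate of the integral with this value; the function names, step size and cutoffs are illustrative assumptions.

```python
import math


def theta_tail(x):
    """Theta(x) = sum_{k>=1} (k^2 x^2 - 2) exp(-k^2 x^2 / 4), truncated where k*x > 14."""
    k_max = int(14.0 / x) + 2
    return sum(((k * x) ** 2 - 2.0) * math.exp(-((k * x) ** 2) / 4.0)
               for k in range(1, k_max + 1))


def limit_moment(r, step=0.01, x_max=20.0):
    """Trapezoidal estimate of r * integral_0^x_max x^(r-1) Theta(x) dx, for r >= 2.

    The endpoint x = 0 contributes nothing because the integrand vanishes there.
    """
    n = int(round(x_max / step))
    total = 0.0
    for j in range(1, n + 1):
        x = j * step
        weight = 0.5 if j == n else 1.0
        total += weight * r * x ** (r - 1) * theta_tail(x)
    return total * step


def predicted_moment(r, zeta_r):
    """r (r - 1) zeta(r) Gamma(r/2) 2^r, with the value of zeta(r) supplied by the caller."""
    return r * (r - 1) * zeta_r * math.gamma(r / 2.0) * 2.0 ** r
```

For $r = 2$ the predicted value is $8\zeta(2) = 4\pi^2/3$. The case $r = 1$ is excluded on purpose: the term-by-term integrals then vanish individually, in line with the logarithmic terms mentioned in the proof.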

6. The diameter of unrooted trees

In this section, we put to use the approximations of Section 4 in order to quantify extreme distances in random unrooted trees. Developments parallel those of Riordan [39], as regards formal generating functions, and especially Szekeres [41], as regards asymptotic developments.

In the class $\mathcal{Y}$ of rooted binary trees, every node has total degree three or one, except for the root, which has degree two. Consider now the class $\mathcal{U}$ of unrooted ternary trees where each node has degree either three or one, without exception (no special root node is now distinguished). Let $\mathcal{U}_n$ be comprised of the elements of $\mathcal{U}$ with $n$ nodes of degree one, the leaves, which determine size, hence $(n - 2)$ nodes of degree three. Denote by $u_n$ the number of such trees. The trees of $\mathcal{U}$ of size at most 8 are displayed in Figure 6. We write the generating function of $\mathcal{U}$ as $u(z) := \sum_{n \ge 0} u_n z^n$, so that

$$u(z) = z^2 + z^3 + z^4 + z^5 + 2z^6 + 2z^7 + 4z^8 + 6z^9 + 11z^{10} + 18z^{11} + \cdots,$$

and the coefficients constitute sequence A000672 of Sloane’s On-line Encyclopedia of Integer Sequences [40].


Figure 6. The unlabelled trees of sizes from 2 to 8, with external nodes (leaves) represented by squares.

Using considerations about the dissimilarity characteristic of trees found in Otter's work [35] and developed in [5, 25], we obtain

$$u(z) = z^2 + u^\bullet(z) - \frac{1}{2}\,y(z)^2 + \frac{1}{2}\,y(z^2), \tag{63}$$

where $u^\bullet(z)$ is the generating function of unrooted trees with a distinguished node. (Note that because of the special degree condition in rooted trees $u^\bullet(z) \ne y(z)$.) The distinguished node might be a leaf or a node of degree three, which leads to

$$u^\bullet(z) = z\,y(z) + \frac{1}{6}\,y(z)^3 + \frac{1}{2}\,y(z^2)\,y(z) + \frac{1}{3}\,y(z^3). \tag{64}$$

The equations (63) and (64) fully characterize $u(z)$ and, together with Lemma 1, they determine the singular expansion of $u(z)$. The following classical lemma reduces to simple manipulations based on Lemma 1, supplemented by routine singularity analysis of the generating function.

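Lemma 10. The generating function $u(z)$ of unrooted ternary trees expands in a neighbourhood of $\rho$ as follows

$$u(z) = \mu_0 + \mu_1(1 - z/\rho) + \frac{1}{3}\lambda^3 (1 - z/\rho)^{3/2} + O\!\left((1 - z/\rho)^2\right),$$

for some constants $\mu_0, \mu_1 \in \mathbb{R}$ and $\lambda = \sqrt{2\rho + 2\rho^2 y'(\rho^2)}$. Furthermore, the number $u_n$ of unrooted trees of size $n$ satisfies the asymptotic estimate

$$u_n = \frac{\lambda^3}{4\sqrt{\pi}} \cdot n^{-5/2}\rho^{-n}\left(1 + O\!\left(\frac{1}{n}\right)\right).$$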
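The constants $\rho$ and $\lambda$ entering Lemma 10 can be approximated with a few lines of code. The following sketch is illustrative only and relies on assumptions imported from outside this section: that $y(z)$ satisfies $y(z) = z + \frac12 y(z)^2 + \frac12 y(z^2)$, and that $\rho$ is the positive root of $1 - 2z - y(z^2) = 0$ (the usual characterization of the square-root singularity, taken here as the content of Lemma 1). The function names are hypothetical.

```python
import math


def rooted_counts(n_max):
    """y[n] = number of rooted non-plane binary trees with n leaves (y[0] = 0)."""
    y = [0, 1]
    for n in range(2, n_max + 1):
        total = sum(y[k] * y[n - k] for k in range(1, n))
        if n % 2 == 0:
            total += y[n // 2]  # contribution of the term y(z^2)
        y.append(total // 2)
    return y


def otter_constants(n_max=200):
    """Approximate (rho, lambda) from a truncated series for y(z)."""
    y = rooted_counts(n_max)

    def series(w):
        return sum(c * w ** n for n, c in enumerate(y))

    def dseries(w):
        return sum(n * c * w ** (n - 1) for n, c in enumerate(y) if n > 0)

    lo, hi = 0.3, 0.5  # 1 - 2z - y(z^2) changes sign on this interval
    for _ in range(100):
        mid = (lo + hi) / 2
        if 1 - 2 * mid - series(mid * mid) > 0:
            lo = mid
        else:
            hi = mid
    rho = (lo + hi) / 2
    lam = math.sqrt(2 * rho + 2 * rho ** 2 * dseries(rho ** 2))
    return rho, lam


def unrooted_estimate(n, rho, lam):
    """Main term of Lemma 10 for u_n (the relative error is O(1/n))."""
    return lam ** 3 / (4 * math.sqrt(math.pi)) * n ** -2.5 * rho ** -n


if __name__ == "__main__":
    rho, lam = otter_constants()
    print(rho, lam, unrooted_estimate(50, rho, lam))
```

Since the error term in Lemma 10 is only $O(1/n)$, the main term should not be expected to match $u_n$ closely for small sizes.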

We now turn to the analysis of the diameter of unrooted trees. A diameter in a graph or a tree is any simple path of maximal length and we also refer to the common length of all diameters as the diameter of the tree. Let $u_{d,n}$ be the number of unrooted trees on $n$ leaves with diameter exactly equal to $d$, and let $u_d(z) = \sum_{n\ge0} u_{d,n} z^n$ denote the associated generating function.⁹ To simplify notations, we set

$$g_h(z) := e_{h-1}(z) - e_h(z),$$

which is the generating function of rooted unlabelled binary trees having height exactly $h$.

We have $u_1(z) = z^2$ and $u_2(z) = z^3$. Unrooted trees of size at least 4 may be recursively decomposed into sets of rooted trees; the decomposition depends on the parity of the diameter $d$. If $d = 2h+1$ is odd, with $d \ge 3$, all diameters share a unique edge (bicentre) that splits the tree into a pair of two rooted trees of height exactly $h$ each, so that

$$u_{2h+1}(z) = \frac{1}{2}\,g_h(z)^2 + \frac{1}{2}\,g_h(z^2). \tag{65}$$

On the other hand, trees with even diameter $d = 2h$, with $d \ge 4$, decompose into three rooted trees around a central vertex (centre), with two of the trees of height exactly $h-1$ and a third subtree of height at most $h-1$:

$$u_{2h}(z) = \frac{1}{6}\,g_{h-1}(z)^3 + \frac{1}{2}\,g_{h-1}(z^2)\,g_{h-1}(z) + \frac{1}{3}\,g_{h-1}(z^3) + \frac{1}{2}\,g_{h-1}(z)^2\,y_{h-2}(z) + \frac{1}{2}\,g_{h-1}(z^2)\,y_{h-2}(z). \tag{66}$$
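The decompositions (65) and (66) lend themselves to a direct machine check with truncated power series. The sketch below is a minimal illustration under assumptions carried over from earlier sections: $y_h(z)$ denotes the generating function of rooted trees of height at most $h$, with $y_{-1} = 0$, $y_0 = z$ and $y_{h+1}(z) = z + \frac12 y_h(z)^2 + \frac12 y_h(z^2)$, so that $g_h = e_{h-1} - e_h = y_h - y_{h-1}$. All helper names are hypothetical, and exact rational arithmetic is used so that the fractions $\frac12$, $\frac13$, $\frac16$ introduce no rounding.

```python
from fractions import Fraction

N = 11  # truncation order: coefficients of z^0 .. z^N are kept


def mul(a, b):
    """Product of two truncated power series."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(N + 1 - i):
                c[i + j] += ai * b[j]
    return c


def at_power(a, m):
    """The series a(z^m), truncated at order N."""
    c = [Fraction(0)] * (N + 1)
    for i in range(N // m + 1):
        c[i * m] = a[i]
    return c


def lin(*terms):
    """Linear combination; each term is a (coefficient, series) pair."""
    c = [Fraction(0)] * (N + 1)
    for q, a in terms:
        for i, ai in enumerate(a):
            c[i] += q * ai
    return c


def diameter_series():
    """Coefficients of the odd- and even-diameter series up to z^N."""
    half, third, sixth = Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)
    zero = [Fraction(0)] * (N + 1)
    z = list(zero)
    z[1] = Fraction(1)
    y = [zero, z]  # y[h + 1] stores y_h, so y[0] is y_{-1} = 0
    for _ in range(N):
        prev = y[-1]
        y.append(lin((1, z), (half, mul(prev, prev)), (half, at_power(prev, 2))))
    g = [lin((1, y[h + 1]), (-1, y[h])) for h in range(N + 1)]  # g[h] = y_h - y_{h-1}
    odd, even = zero, zero
    for h in range(N + 1):  # equation (65): diameter 2h + 1
        odd = lin((1, odd), (half, mul(g[h], g[h])), (half, at_power(g[h], 2)))
    for h in range(1, N + 1):  # equation (66): diameter 2h
        a, below = g[h - 1], y[h - 1]  # g_{h-1} and y_{h-2}
        a2 = mul(a, a)
        even = lin(
            (1, even),
            (sixth, mul(a2, a)),
            (half, mul(at_power(a, 2), a)),
            (third, at_power(a, 3)),
            (half, mul(a2, below)),
            (half, mul(at_power(a, 2), below)),
        )
    return [int(c) for c in odd], [int(c) for c in even]


if __name__ == "__main__":
    odd, even = diameter_series()
    print("odd :", odd)
    print("even:", even)
    print("all :", [o + e for o, e in zip(odd, even)])
```

With `N = 11`, the two lists sum to the initial coefficients of $u(z)$ (sequence A000672), which gives an independent consistency check on the two decompositions.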

In this way, one can enumerate the trees of odd and even diameter (the "bicentred" and "centred" trees), whose generating functions start, respectively, as

$$u_{\mathrm{odd}}(z) = z^2 + z^4 + z^6 + z^7 + 2z^8 + 2z^9 + 6z^{10} + 8z^{11} + \cdots$$
$$u_{\mathrm{even}}(z) = z^3 + z^5 + z^6 + z^7 + 2z^8 + 4z^9 + 5z^{10} + 10z^{11} + \cdots,$$

with coefficients forming sequences A000673 and A000675 of Sloane's Encyclopedia.

We now turn to singular asymptotics in a $\Delta$-domain¹⁰ (see [19, §VI.3], Equation (5), and Figure 3). As usual, the Pólya terms in (65), (66), which are the ones containing functional terms involving $z^2$ or $z^3$, will turn out to be of negligible effect. Indeed, Lemma 2 guarantees, for $|z| < \sqrt{\rho}$:

$$\left|g_h(z^2)\right| \le \left|e_{h-1}(z^2)\right| \le \frac{1}{\sqrt{h-1}}\left(\frac{|z|^2}{\rho}\right)^{h}. \tag{67}$$

Thus, fixing some $R$ with $\rho < R < \sqrt{\rho}$, we have for some $C > 0$ and $A \in (0, 1)$:

$$\left|g_h(z^2)\right| < C \cdot A^h, \tag{68}$$

whenever $z$ lies in a suitable $\Delta$-domain anchored at $\rho$, and the same bound on the right of (68) obviously holds for $g_h(z^3)$. In other words, the Pólya terms involving $z^2$ and $z^3$ are exponentially small. This gives us, relative to (65) and (66) and for $z$ in a $\Delta$-domain, the estimate

$$u_{2h+1}(z) = \frac{1}{2}\,g_h(z)^2 + O(A^h) \tag{69}$$

and, similarly,

$$u_{2h}(z) = \frac{1}{2}\,g_{h-1}(z)^2\,y_{h-2}(z) + \frac{1}{6}\,g_{h-1}(z)^3 + O(A^h). \tag{70}$$

⁹ We reserve $n$ for the size of trees, so that $u_n \equiv [z^n]u(z)$ is the number of trees of size $n$; we make use of indices $d$, $2h$, $2h+1$ for diameter and occasionally abbreviate $u_d(z), \ldots$ as $u_d, \ldots$, so that no ambiguity should occur.

¹⁰ To be precise, we only need to consider the part of a $\Delta$-domain that is interior to a $\gamma$-contour of the type introduced in the previous section.


Figure 7. Left: the raw histograms of the distribution of diameter in unrooted trees, for $n = 50, 100, 150, \ldots, 500$. Right: a plot of the limit density function $\vartheta(x)$ of Theorem 6.

The latter asymptotic form may be further simplified: by Lemmas 1 and 9, for $z \to \rho$ in a sandclock, we have

$$y - e_h = 1 - O\!\left(\sqrt{1 - z/\rho}\right) - e_h, \qquad |g_h| \le |e_{h-1}| = O(1/h),$$

and it follows that, in this sandclock,

$$u_{2h}(z) = \frac{1}{2}\,g_{h-1}(z)^2\left(1 + O(1/h) + O\!\left(\sqrt{1 - z/\rho}\right)\right). \tag{71}$$

(The cubic term $\frac{1}{6}\,g_{h-1}(z)^3$ essentially corresponds to trees having a centre from which there spring three trees of equal height; such configurations are still negligible, but now polynomially, rather than exponentially.) Additionally, in a tube, all terms in (69) and (70) are exponentially small, by virtue of Equation (17) of Lemma 3 and Proposition 1; the induced contributions for coefficients are thus going to be exponentially small, and we do not need to discuss these any further.

In a way similar to the asymptotic simplification (60) of $e_h - e_{h+1} \equiv g_{h+1}$, the estimates of (69) and (71) now suggest introducing the following approximation $\widehat{u}_d$ of $u_d$,

$$\widehat{u}_d := 2(1-y)^4\,\frac{y^d}{(1 - y^{d/2})^4}, \tag{72}$$

regardless of the parity of $d$: we have (in a sandclock)

$$u_d = \widehat{u}_d\left(1 + O(1/d) + O\!\left(\sqrt{1 - z/\rho}\right)\right). \tag{73}$$

Following the line of proof of Theorems 1, 2, and 3, it is now a routine matter to work out the consequences, at the level of coefficients, of the main approximations (72) and (73). Note that, since we have access to generating functions of diameter exactly h, we start with a local limit law, then proceed to estimate the distribution function by summation. Figure 7 presents supporting numerical data for the local limit law of diameter.

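Theorem 6 (Local limit law for diameter). The diameter $D_n$ of an unrooted tree sampled from $\mathcal{U}_n$ uniformly at random satisfies a local limit law: for $x$ in any compact set of $\mathbb{R}_{>0}$, uniformly, with $(x/\lambda)\sqrt{n}$ an integer, one has, as $n \to \infty$,

$$\mathbb{P}\left\{D_n = (x/\lambda)\sqrt{n}\right\} \sim \frac{\lambda}{\sqrt{n}}\,\vartheta(x),$$

where

$$\vartheta(x) = \frac{1}{768}\sum_{k\ge1} k(k^2-1)\left(k^5x^5 - 80k^3x^3 + 960kx\right)e^{-k^2x^2/16}.$$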

Proof. We start from the approximations (72) and (73), then make use of Cauchy's coefficient formula together with the contour $\gamma$ specified in the proof of Theorem 1. As noted already, the contributions of the outer circle $\gamma_3$, the joins $\gamma_4$ and $\gamma_5$, and the further portions of the rectilinear pieces $\gamma_1$ and $\gamma_2$ are exponentially small, so that we can restrict attention to what happens in a small sandclock, along $\gamma_1$ and $\gamma_2$.

The change of variable $z = \rho(1 - t/n)$ and approximations that are justified in Equations (50) to (54) of the proof of Theorem 1 lead to

$$[z^n]u_d(z) = -2\rho^{-n}n^{-3}\lambda^4 J_2(\lambda x/2) + O\!\left(\rho^{-n}n^{-7/2}\log^\star n\right),$$

where we have set

$$J_2(X) := \frac{1}{2i\pi}\int_L \frac{e^{-2X\sqrt{t}}}{(1 - e^{-X\sqrt{t}})^4}\,t^2 e^t\,dt,$$

with $L$ that goes from $-\infty + i\infty$ to $-\infty - i\infty$ and winds to the right of the origin. As in Equations (55) and (56), we can make $J_2(X)$ explicit:

$$J_2(X) = \frac{1}{2i\pi}\sum_{k\ge3}\frac{k(k-1)(k-2)}{6}\int_L e^{-X(k-1)\sqrt{t}}\,t^2 e^t\,dt = -\frac{1}{192\sqrt{\pi}}\sum_{k\ge2} k(k^2-1)\left(k^5X^5 - 20k^3X^3 + 60kX\right)e^{-k^2X^2/4}. \tag{74}$$

A normalization by un, as provided by Lemma 10, then yields the claim.

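Theorem 7 (Limit distribution of diameter). The diameter $D_n$ of an unrooted tree sampled from $\mathcal{U}_n$ uniformly at random admits a limit distribution: for $x$ in a compact set of $\mathbb{R}_{>0}$, we have

$$\lim_{n\to\infty}\mathbb{P}\left\{D_n \ge (x/\lambda)\sqrt{n}\right\} = \Theta(x),$$

where

$$\Theta(x) \equiv \int_x^\infty \vartheta(w)\,dw = \frac{1}{96}\sum_{k\ge1}(k^2-1)\left(k^4x^4 - 48k^2x^2 + 192\right)e^{-k^2x^2/16}.$$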

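Proof. The convergence of distribution functions results from earlier approximations through integration. Indeed, approximating a Riemann sum by the corresponding integral, we find, for $d = x\sqrt{n}$ and with $\widehat{u}_\ell$ the approximation (72) of $u_\ell$,

$$[z^n]\sum_{\ell\ge d} u_\ell \sim [z^n]\sum_{\ell\ge d} \widehat{u}_\ell \sim -2\lambda^4\rho^{-n}n^{-5/2}\int_x^\infty J_2(\lambda s/2)\,ds,$$

as $n \to \infty$. The integral is easily computed from (74): write $X = \lambda x/2$ to obtain

$$\int_x^\infty J_2(\lambda s/2)\,ds = \frac{1}{3\lambda}\sum_{k\ge1}(k^2-1)\,\frac{1}{2i\pi}\int_L e^{-kX\sqrt{t}}\,t^{3/2}e^t\,dt = -\frac{1}{3\cdot 2^4\,\lambda\sqrt{\pi}}\sum_{k\ge1}(k^2-1)\left(k^4X^4 - 12k^2X^2 + 12\right)e^{-k^2X^2/4}.$$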

A final normalization based on Lemma 10 yields the result.
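The relation $\Theta(x) = \int_x^\infty \vartheta(w)\,dw$ between the two series of Theorems 6 and 7 can be confirmed numerically. The sketch below is illustrative: it sums both series directly (the truncation at 60 terms is an assumption that is adequate for $x \ge 1$ only, since smaller $x$ requires more terms) and compares $\Theta(x)$ with a trapezoidal integral of $\vartheta$. Function names are hypothetical.

```python
import math


def vartheta(x, terms=60):
    """Series for the limit density of Theorem 6 (adequate for x >= 1)."""
    total = 0.0
    for k in range(2, terms + 1):  # the k = 1 term vanishes since k^2 - 1 = 0
        kx = k * x
        poly = kx ** 5 - 80 * kx ** 3 + 960 * kx
        total += k * (k * k - 1) * poly * math.exp(-kx * kx / 16)
    return total / 768


def big_theta(x, terms=60):
    """Series for the limit tail function of Theorem 7 (adequate for x >= 1)."""
    total = 0.0
    for k in range(2, terms + 1):
        kx = k * x
        poly = kx ** 4 - 48 * kx ** 2 + 192
        total += (k * k - 1) * poly * math.exp(-kx * kx / 16)
    return total / 96


def tail_integral(x, upper=30.0, step=0.005):
    """Trapezoidal approximation of the integral of vartheta over [x, upper]."""
    n = round((upper - x) / step)
    h = (upper - x) / n
    inner = sum(vartheta(x + i * h) for i in range(1, n))
    return h * (0.5 * (vartheta(x) + vartheta(upper)) + inner)


if __name__ == "__main__":
    for x in (2.0, 4.0, 5.0, 6.0):
        print(x, big_theta(x), tail_integral(x))
```

The two columns should agree to within the quadrature error, and $\Theta(x)$ should be numerically indistinguishable from 1 for small arguments such as $x = 2$.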

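Theorem 8 (Moments of diameter). The moments of the diameter $D_n$ of a random unrooted tree with $n$ leaves satisfy

$$\mathbb{E}[D_n] \sim \frac{8}{3\lambda}\sqrt{\pi n}, \qquad \mathbb{E}[D_n^2] \sim \frac{16}{3\lambda^2}\left(1 + \frac{\pi^2}{3}\right) n, \qquad \mathbb{E}[D_n^3] \sim \frac{64}{\lambda^3}\sqrt{\pi n^3},$$

and, for all $r > 3$,

$$\mathbb{E}[D_n^r] \sim \frac{2^{2r}}{3}\, r(r-1)(r-3)\,\Gamma(r/2)\,\bigl(\zeta(r-2) - \zeta(r)\bigr)\,\lambda^{-r}\, n^{r/2}.$$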

Proof. By definition, the moments of $D_n$ are given by

$$\mathbb{E}[D_n^r] = \frac{1}{u_n}\,[z^n]\sum_{d\ge1} d^r u_d(z), \tag{75}$$

and, from (72) and (73) once more, we are led to the approximation $\mathbb{E}[D_n^r] \sim ([z^n]M_r)/u_n$, where

$$M_r(z) := 2(1-y)^4\sum_{d\ge1}\frac{d^r y^d}{(1 - y^{d/2})^4}$$

results from replacing $u_d$ by its approximation $\widehat{u}_d$ of (72) in the generating function of (75).

As for the moments of height, the singular asymptotic form of $M_r(z)$ is conveniently determined by means of the Mellin transform technology. Set $y = e^{-\tau}$, so that $z \to \rho$ corresponds to $\tau \to 0$, with $\tau \sim \lambda\sqrt{1 - z/\rho}$. We then need the asymptotic estimation of $M_r(e^{-\tau})$ when $\tau \to 0$. Define

$$F_r(\tau) := \sum_{d\ge1}\frac{d^r e^{-d\tau}}{(1 - e^{-d\tau/2})^4},$$

which is such that $M_r(z) \sim 2\tau^4 F_r(\tau)$, since $1 - y \sim \tau$. By the "harmonic sum rule" [21], the Mellin transform $F_r^\star(s)$ of $F_r(\tau)$ satisfies, for $\Re(s) > \max\{1 + r, 4\}$,

$$F_r^\star(s) = \frac{2^s}{6}\,\zeta(s-r)\,\bigl(\zeta(s-3) - \zeta(s-1)\bigr)\,\Gamma(s).$$
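The factor $\frac{2^s}{6}(\zeta(s-3) - \zeta(s-1))\Gamma(s)$ comes from expanding the base function of the harmonic sum in exponentials. As a hedged illustration (the function names are hypothetical), the sketch below checks the expansion $\frac{e^{-x}}{(1 - e^{-x/2})^4} = \sum_{m\ge2}\binom{m+1}{3} e^{-mx/2}$ numerically, together with the identity $\binom{m+1}{3} = m(m^2-1)/6$; applying the Mellin transform term by term, with $e^{-mx/2} \mapsto (m/2)^{-s}\Gamma(s)$, then yields the stated Dirichlet series.

```python
import math


def base_function(x):
    """Base function of the harmonic sum F_r, in closed form."""
    return math.exp(-x) / (1 - math.exp(-x / 2)) ** 4


def base_series(x, terms=400):
    """The same function as a sum of exponentials exp(-m x / 2), m >= 2."""
    q = math.exp(-x / 2)
    return sum(math.comb(m + 1, 3) * q ** m for m in range(2, terms + 1))


def amplitude(m):
    """Coefficient of exp(-m x / 2), written as m (m^2 - 1) / 6."""
    return m * (m * m - 1) // 6


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print(x, base_function(x), base_series(x))
```

The truncation at 400 terms is an arbitrary choice that is more than sufficient for $x \ge 0.5$; smaller arguments would need more terms.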

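The singularities in a right half-plane are known to dictate the asymptotic expansion of $F_r(\tau)$, as $\tau \to 0$. For $r > 3$, the main contribution comes from a simple pole at $s = r+1$ (due to the factor $\zeta(s-r)$), and we find

$$F_r(\tau) \sim \frac{2^{r}}{3}\,\bigl(\zeta(r-2) - \zeta(r)\bigr)\,\Gamma(r+1)\,\tau^{-r-1}, \qquad \tau \to 0.$$

Since $M_r(z) = 2(1-y)^4 F_r(\tau)$ with $1 - y \sim \tau \sim \lambda\sqrt{1 - z/\rho}$, this provides in turn the main term in the expansion of $M_r(z)$ as $z \to \rho$:

$$M_r(z) \sim \frac{2^{r+1}}{3}\,\lambda^{-r+3}\,\bigl(\zeta(r-2) - \zeta(r)\bigr)\,\Gamma(r+1)\,(1 - z/\rho)^{-(r-3)/2}, \qquad z \to \rho.$$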

Singularity analysis combined with the estimate of $u_n$ in Lemma 10 and the duplication formula for the Gamma function then automatically yields the asymptotic form of $\mathbb{E}[D_n^r]$, in the case $r > 3$.

For $r \le 3$, the approach is similar, but a little more care is required. For $r = 1, 2$ one needs to consider the second terms of the singular expansion of $F_r^\star(s)$, at $s = 2$ and $s = 3$, respectively. Also, the cases $r = 1$ and $r = 3$ involve logarithmic terms due to double poles of $F_1^\star(s)$ and $F_3^\star(s)$ at $s = 2$ and $s = 4$. The claim follows by routine Mellin technology and singularity analysis.
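The constants of Theorem 8 can be cross-checked against the limit density of Theorem 6. The sketch below is an illustration under a stated assumption: convergence of moments means that $\lambda^r\,\mathbb{E}[D_n^r]/n^{r/2}$ tends to $\int_0^\infty x^r\vartheta(x)\,dx$. It integrates the series for $\vartheta$ numerically (starting at $x = 1$, below which the density is negligible) and compares with the closed forms for $r = 1, 2, 3$ and with the general formula at $r = 4$, where $\zeta(2) = \pi^2/6$ and $\zeta(4) = \pi^4/90$. Function names are hypothetical.

```python
import math


def vartheta(x, terms=50):
    """Series for the limit density of Theorem 6 (adequate for x >= 1)."""
    total = 0.0
    for k in range(2, terms + 1):  # the k = 1 term vanishes
        kx = k * x
        poly = kx ** 5 - 80 * kx ** 3 + 960 * kx
        total += k * (k * k - 1) * poly * math.exp(-kx * kx / 16)
    return total / 768


def scaled_moments(orders, lower=1.0, upper=30.0, step=0.01):
    """Trapezoidal approximations of the integrals of x^r * vartheta(x)."""
    n = round((upper - lower) / step)
    xs = [lower + i * step for i in range(n + 1)]
    ws = [vartheta(x) for x in xs]
    ws[0] *= 0.5
    ws[-1] *= 0.5
    return {r: step * sum(w * x ** r for x, w in zip(xs, ws)) for r in orders}


def predicted_constants():
    """Constants of Theorem 8 with the factors lambda^(-r) n^(r/2) removed."""
    pi = math.pi
    return {
        1: 8 * math.sqrt(pi) / 3,
        2: 16 / 3 * (1 + pi ** 2 / 3),
        3: 64 * math.sqrt(pi),
        4: 1024 * (pi ** 2 / 6 - pi ** 4 / 90),  # general formula at r = 4
    }


if __name__ == "__main__":
    numeric = scaled_moments([0, 1, 2, 3, 4])
    print("total mass:", numeric[0])
    for r, c in predicted_constants().items():
        print(r, numeric[r], c)
```

The ratio of the first constant, $8\sqrt{\pi}/3$, to the corresponding constant $2\sqrt{\pi}$ for height is $4/3$.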

7. Conclusion

We finally conclude with two corollaries and a general comment. First, as a byproduct of (72) and (73), via summation and singularity analysis, we can estimate the proportion of centred and bicentred trees.


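                  Cayley trees                  Otter trees

mean depth        $\sqrt{\pi n/2}$              $\frac{1}{\lambda}\sqrt{\pi n}$

mean height       $\sqrt{2\pi n}$               $\frac{2}{\lambda}\sqrt{\pi n}$

mean diameter     $\frac{4}{3}\sqrt{2\pi n}$    $\frac{8}{3\lambda}\sqrt{\pi n}$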

Figure 8. A table comparing the asymptotic forms of the expectations of several parameters of trees, for the two models of Cayley trees (non-plane labelled trees) and Otter trees (non-plane unlabelled binary trees), based on [33, 38, 41] and the present paper. Depth refers to the depth of a randomly chosen node in the tree; height is the maximum distance of any node from the root; diameter is relative to the unrooted version of the trees under consideration.

Corollary 1. There are asymptotically as many centred trees (trees of even diameter) as bicentred trees (trees of odd diameter):

$$[z^n]u_{\mathrm{odd}}(z) \sim [z^n]u_{\mathrm{even}}(z) \sim \frac{1}{2}\,u_n.$$

This perhaps unsurprising observation parallels one made by Szekeres [41, p. 394] in the case of labelled trees, where all degrees are allowed.

Next, a comparison of expectations of height and diameter in random non-plane trees shows the following.

Corollary 2. The ratio of the expected diameter of an unrooted tree and the expected height of a rooted tree of the same size satisfies asymptotically

$$\lim_{n\to\infty}\frac{\mathbb{E}[D_n]}{\mathbb{E}[H_n]} = \frac{4}{3}.$$

Again, a similar observation was made by Szekeres [41, p. 396] regarding labelled trees, and the same property, with a "universal" factor of $\frac{4}{3}$, is expected to hold for any "ordered" tree family (i.e., trees whose nodes have neighbours that are distinguishable; cf. our Introduction), as argued heuristically by Aldous in [3].

The fact, established rigorously in the present paper (Theorems 1 to 8 and Corollaries 1, 2), is that, up to scaling, height and diameter behave for some non-plane unlabelled trees similarly to what is known for ordered trees: see Figure 8 for some striking data. This brings further evidence for the hypothesis that probabilistic models, such as the Continuum Random Tree, may be applicable to unordered trees; this has indeed been recently confirmed, in the binary case at least, by Marckert and Miermont [32]. It is piquant to note that the probabilistic approach of [32] relies in part on large deviation estimates for height, which were developed analytically by us in the earlier conference version [7] of the present paper. (Recently, Haas and Miermont [24] have developed an alternative approach that further allows them to prove the convergence of a large class of trees towards continuum limits. This encompasses a self-contained proof of the result in [32] and further examples with stable tree limits.) An analytic treatment of the height of unordered trees with all degrees allowed has been given recently by Drmota and Gittenberger (see [15] and the account in [14]). Together with the present study, it confirms, among unordered trees, the existence of universal phenomena regarding height and profile, which parallel what has been known for a long time regarding their ordered counterparts. As usual, the analytic approach advocated in the present paper has the advantage of providing precise estimates, with speed of convergence estimates, local limit laws, and convergence of moments.

Finally, the fact that, up to a possible linear change of scale, some of the main characteristics of trees, such as height and diameter, are not sensitive to whether trees are planar (ordered) or not, is also of some relevance to the emerging field of "probabilistic logic" [29, 31]. For instance, there is interest there in determining the probability of satisfiability of random Boolean formulae obeying various randomness models (see, e.g., [10, 23]). In this context, our results suggest that the commutativity of logical conjunction and disjunction (reflected by the non-planarity of associated expression trees) should not, in many cases, have a major effect on complexity properties of random Boolean expressions.

Acknowledgements. Thanks to Jean-François Marckert and Grégory Miermont for inciting us to investigate in detail the distribution of height. We also express our gratitude to Alexis Darrasse and Carine Pivoteau for designing and programming for us efficient Boltzmann samplers of binary trees and providing detailed statistical data that guided our first analyses of this problem. This work was supported in part by the French ANR Project BOOLE dedicated to Boolean frameworks.

REFERENCES

[1] D. J. Aldous. The random walk construction of uniform spanning trees and uniform labelled trees. SIAM Journal on Discrete Mathematics, 3(4):450–465, 1990.

[2] D. J. Aldous. The continuum random tree I. The Annals of Probability, 19:1–28, 1991.

[3] D. J. Aldous. The continuum random tree II: an overview. In M. T. Barlow and N. H. Bingham, editors, Proceedings of the 1990 Durham Symposium on Stochastic Analysis, volume 21 of London Math. Soc. Lecture Note Ser., pages 23–70. Cambridge University Press, 1991.

[4] D. J. Aldous. The continuum random tree III. The Annals of Probability, 21:248–289, 1993.

[5] F. Bergeron, G. Labelle, and P. Leroux. Combinatorial species and tree-like structures. Cambridge University Press, 1998. ISBN 0-521-57323-8.

[6] M. Bóna and P. Flajolet. Isomorphism and symmetries in random phylogenetic trees. Journal of Applied Probability, 46:1005–1019, 2009.

[7] N. Broutin and P. Flajolet. The height of random binary unlabelled trees. In U. Rösler, editor, Proceedings of Fifth Colloquium on Mathematics and Computer Science: Algorithms, Trees, Combinatorics and Probabilities, volume AI of Discrete Mathematics and Theoretical Computer Science Proceedings, pages 121–134, Blaubeuren, 2008.

[8] P. Chassaing and J.-F. Marckert. Parking functions, empirical processes, and the width of rooted labeled trees. Electronic Journal of Combinatorics, 8(1):Research Paper 14, 19 pp. (electronic), 2001. ISSN 1077-8926.

[9] P. Chassaing, J.-F. Marckert, and M. Yor. The height and width of simple trees. In Mathematics and computer science (Versailles, 2000), Trends Math., pages 17–30. Birkhäuser Verlag, 2000.


[10] B. Chauvin, P. Flajolet, D. Gardy, and B. Gittenberger. And/Or Trees Revisited. Combinatorics, Probability and Computing, 13(4–5):501–513, 2004. Special issue on Analysis of Algorithms.

[11] N. G. de Bruijn. Asymptotic Methods in Analysis. Dover, 1981. A reprint of the third North Holland edition, 1970 (first edition, 1958).

[12] N. G. de Bruijn, D. E. Knuth, and S. O. Rice. The average height of planted plane trees. In R. C. Read, editor, Graph Theory and Computing, pages 15–22. Academic Press, 1972.

[13] A. Dembo and O. Zeitouni. Large Deviations Techniques and Applications. Jones and Bartlett Publishers, Boston and London, 1993.

[14] M. Drmota. Random Trees. Springer Verlag, 2009.

[15] M. Drmota and B. Gittenberger. The shape of unlabeled rooted random trees. Technical report, Technical University of Vienna, 2008. Revised version available as Arxiv preprint arXiv:1003.1322, 2010.

[16] R. T. Durrett and D. L. Iglehart. Functionals of Brownian meander and Brownian excursion. The Annals of Probability, 5(1):130–135, 1977.

[17] S. Finch. Mathematical Constants. Cambridge University Press, 2003.

[18] P. Flajolet and A. M. Odlyzko. The average height of binary trees and other simple trees. Journal of Computer and System Sciences, 25:171–213, 1982.

[19] P. Flajolet and R. Sedgewick. Analytic Combinatorics. Cambridge University Press, 2009. URL http://algo.inria.fr/flajolet. 824 pages. Also available electronically from the authors' home pages.

[20] P. Flajolet, Z. Gao, A. Odlyzko, and B. Richmond. The distribution of heights of binary trees and other simple trees. Combinatorics, Probability and Computing, 2:145–156, 1993.

[21] P. Flajolet, X. Gourdon, and P. Dumas. Mellin transforms and asymptotics: Harmonic sums. Theoretical Computer Science, 144(1–2):3–58, June 1995.

[22] P. Flajolet, P. Grabner, P. Kirschenhofer, and H. Prodinger. On Ramanujan's Q-function. Journal of Computational and Applied Mathematics, 58(1):103–116, Mar. 1995.

[23] D. Gardy. Random Boolean expressions. In Computational Logic and Applications (CLA'05), volume AF of Discrete Mathematics and Theoretical Computer Science Proceedings, pages 1–36, 2005.

[24] B. Haas and G. Miermont. Scaling limits of Markov branching trees, with applications to Galton–Watson and random unordered trees. Technical Report 1003.3632, arXiv, 2010.

[25] F. Harary and E. M. Palmer. Graphical Enumeration. Academic Press, 1973.

[26] D. P. Kennedy. The Galton–Watson process conditioned on the total progeny. Journal of Applied Probability, 12(4):800–806, Dec. 1975.

[27] D. P. Kennedy. The distribution of the maximum Brownian excursion. Journal of Applied Probability, 13(2):371–376, June 1976.

[28] V. F. Kolchin. Random Mappings. Optimization Software Inc., 1986. Translated from Slucajnye Otobrazenija, Nauka, Moscow, 1984.



[29] Z. Kostrzycka and M. Zaionc. Asymptotic densities in logic and type theory. Studia Logica, 88(3):385–403, 2008.

[30] J. F. Le Gall. Random trees and applications. Probability Surveys, 2:245–311, 2005.

[31] H. Lefmann and P. Savický. Some typical properties of large AND/OR Boolean formulas. Random Structures & Algorithms, 10:337–351, 1997.

[32] J.-F. Marckert and G. Miermont. The CRT is the scaling limit of unordered binary trees. Random Structures & Algorithms, 2010. In press. 35 pages.

[33] A. Meir and J. W. Moon. On the altitude of nodes in random trees. Canadian Journal of Mathematics, 30:997–1015, 1978.

[34] J. Milnor. Dynamics in one complex variable. Friedr. Vieweg & Sohn, 1999. ISBN 3-528-03130-1.

[35] R. Otter. The number of trees. Annals of Mathematics, 49(3):583–599, 1948.

[36] G. Pólya. Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen. Acta Mathematica, 68:145–254, 1937.

[37] G. Pólya and R. C. Read. Combinatorial Enumeration of Groups, Graphs and Chemical Compounds. Springer Verlag, 1987.

[38] A. Rényi and G. Szekeres. On the height of trees. Australian Journal of Mathematics, 7:497–507, 1967.

[39] J. Riordan. Enumeration of trees by height and diameter. IBM Journal of Research and Development, 4:473–478, 1960.

[40] N. J. A. Sloane. The On-Line Encyclopedia of Integer Sequences. 2008. Published electronically at www.research.att.com/~njas/sequences/.

[41] G. Szekeres. Distribution of labelled trees by diameter. In Combinatorial Mathematics X, Lecture Notes in Mathematics, pages 392–397. Springer, 1982.

[42] E. T. Whittaker and G. N. Watson. A Course of Modern Analysis. Cambridge University Press, fourth edition, 1927. Reprinted 1973.

P.F., N.B.: ALGORITHMS PROJECT, INRIA-ROCQUENCOURT, F-78153 LE CHESNAY (FRANCE)
