from infinite ergodic theory to number theory (and possibly back)

Chaos, Solitons & Fractals 44 (2011) 467–479


From infinite ergodic theory to number theory (and possibly back)

Stefano Isola

Dipartimento di Matematica e Informatica, Università di Camerino, via Madonna delle Carceri, I-62032 Camerino, Italy

Article history: Received 12 July 2010; accepted 31 January 2011

0960-0779/$ - see front matter © 2011 Elsevier Ltd. doi:10.1016/j.chaos.2011.01.015

Abstract

Some basic facts of infinite ergodic theory are reviewed in a form suitable to be applied to interval maps with number theoretic significance such as the Farey map. This is an enlarged version of the lecture notes accompanying a short course on Infinite Ergodic Theory at the First meeting of the (mostly) young italian hyperbolicians (Corinaldo, Italy, June 8–12, 2009).

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

In rough terms, ergodic theory is the study of the long-term average behaviour of systems evolving in time. In particular, one considers deterministic dynamical systems, which (restricting to discrete time) are mathematical objects that arise as soon as one is given a specific way to associate one point of a phase space $X$ to another, that is, a transformation $T : X \to X$. Such a transformation can be iterated to get sequences of points in $X$ called orbits. If, for example, a point $x \in X$ represents the actual state of some physical system, the orbit of $x$ under $T$, namely the set of images $(T^k x)_{k \ge 0}$, yields the set of states of the system at later times. In particular, the fact that some event occurs for the system at time $n$ is expressed by $T^n x \in E$ for some specified subset $E \subseteq X$. On the other hand, since the work of Poincaré it has become more and more evident that even a transformation $T$ which appears very simple may produce very complicated orbits so that, for example, predicting the precise occurrence of some event we are interested in can be an extremely hard task, if not impossible. Hence, one has to reformulate what the "good questions" are for such a system. For example, instead of asking when a given event will take place along the orbit of a point $x$, one may investigate the frequency of its occurrences along 'typical' orbits of the system. The canonical mathematical framework where such questions are formulated is that of measure theory, so that the family of possible events will be a $\sigma$-algebra $\mathcal{B}$ of measurable subsets of the phase space $X$, and precise quantitative results can be obtained whenever the system possesses a measure $\mu : \mathcal{B} \to [0,\infty]$ which is $T$-invariant, namely such that $\mu(E) = \mu(T^{-1}E)$ for each $E \in \mathcal{B}$. In particular, if we regard $T$ as modelling the time evolution of some concrete physical system, we shall be interested in invariant measures which are meaningful w.r.t. such a model. For example, if $X$ is a portion of a Euclidean space then $\mu$ should have a density $h$, such that $\mu(E) = \int_E h(x)\,dx$. If, moreover, the measure of the whole space is finite, so that it can be normalized to give $\mu(X) = 1$, then a very rich theory has been developed with many connections to other domains of mathematics, notably probability theory (for a good modern account of (finite) ergodic theory see, e.g., [19]). The first mathematical result in ergodic theory was obtained by Poincaré in 1890 and says that almost every point in a set $E \in \mathcal{B}$ with positive measure will return to $E$ infinitely many times [21]. As a next step, one may ask whether a given $x \in E$, besides coming back to $E$ infinitely often, does so with a definite frequency. In other words, one asks about the existence of the limit

$$ \lim_{n\to\infty} \frac{S_n(E,x)}{n} \tag{1} $$

where $S_n(E,x) := \sum_{k=0}^{n-1} 1_E(T^k x)$ is the occupation time of the set $E$ at finite $n$. This is answered by the Birkhoff ergodic theorem (1931), according to which, under mild hypotheses on the quadruple $(X,T,\mathcal{B},\mu)$, the asymptotic frequency exists almost everywhere w.r.t. $\mu$ (abbreviated $\mu$-a.e.). On the other hand, the analogue of the strong law of large numbers of probability theory would be satisfied whenever


the value of the above limit is $\mu$-a.e. equal to the constant $\mu(E)$, for each $E \in \mathcal{B}$. In this case one says that $(X,T,\mathcal{B},\mu)$ is ergodic. Thus, for an ergodic dynamical system $(X,T,\mathcal{B},\mu)$ with $\mu(X) = 1$, we have seen that if $\mu(E) > 0$ then $S_n(E,x)$ is of order $n$ with probability one. A simple consequence of this property is the following theorem, proved in [15]: let $\{n_k\}$ be the set of occurrence times such that $T^{n_k}(x) \in E$, sorted in increasing order. The differences between consecutive occurrence times $r_k = n_k - n_{k-1}$ are called the return times in $E$. Then, assuming $n_0 = 0$ (that is, $x \in E$), the average return time in $E$ is inversely proportional to the measure of $E$,

$$ \lim_{k\to\infty} \frac{r_1 + \cdots + r_k}{k} = \frac{1}{\mu(E)}, \quad \mu\text{-a.e.} \tag{2} $$

That is, the smaller $E$ is, the longer it takes to return to it.

A further ergodic property that can be tested on $(X,T,\mathcal{B},\mu)$ when $\mu(X) = 1$ is that of mixing, which amounts to the fact that the probability of entering a set $F$ (for the first time or not), conditioned on having started in $E$ $n$ iterates before, tends to the (simple) probability of being in $F$ as $n$ goes to infinity, i.e.

$$ \lim_{n\to\infty} \mu(E \cap T^{-n}F) = \mu(E)\,\mu(F), \quad \forall\, E, F \in \mathcal{B} \tag{3} $$

One readily sees that ergodicity can be rephrased as the fact that the above holds on the average, that is

$$ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} \mu(E \cap T^{-k}F) = \mu(E)\,\mu(F), \quad \forall\, E, F \in \mathcal{B} \tag{4} $$

so that mixing is a somewhat stronger property.

So far, we have been reasoning under the assumption that $\mu(X) = 1$, which is the standing assumption of most textbooks in ergodic theory. On the other hand, there exist several interesting systems which happen to have an infinite invariant measure, $\mu(X) = \infty$. They are precisely the objects that infinite ergodic theory deals with. As we shall see, these objects are somehow wild creatures for which new classification tools have to be introduced and new mathematical problems naturally arise. There are also some assumptions, which in the finite case are automatically satisfied, that have to be explicitly stated in the infinite case. First, to exclude pathological situations, all invariant measures $\mu$ will have to be $\sigma$-finite, that is, s.t. $X$ can be decomposed into a countable disjoint union of subsets of finite measure. Another standing assumption we shall make is that of conservativity. This means that if $W \in \mathcal{B}$ is a wandering set, i.e. $\sum_{n \ge 0} 1_W \circ T^n \le 1$, then $\mu(W) = 0$. An example of a non-conservative system is given by the map $T : \mathbb{R} \to \mathbb{R}$, $Tx := x + 1$, which preserves the Lebesgue measure on $\mathbb{R}$, but no point of $W = (0,1]$ will ever return to this set. Moreover, as far as ergodicity is concerned, a definition which covers both cases, finite and infinite, is the following: we say that $(X,T,\mathcal{B},\mu)$ is ergodic if for any set $E \in \mathcal{B}$ which is invariant, i.e. $T^{-1}E = E$, we have $\mu(E) \cdot \mu(X \setminus E) = 0$.
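As a numerical illustration of the finite-measure statement (2), the following minimal sketch estimates the average return time for an irrational rotation of the circle, which preserves Lebesgue measure and is ergodic. The choice of map, the set $E = [0, 0.1)$ and the helper name `mean_return_time` are assumptions made only for this example; they do not come from the text.

```python
import math

def mean_return_time(alpha, a, b, n_returns):
    """Average of the first n_returns return times to E = [a, b)
    for the rotation x -> x + alpha (mod 1), started at x = a."""
    x = a                       # start inside E, so n_0 = 0
    steps, returns, total = 0, 0, 0
    while returns < n_returns:
        x = (x + alpha) % 1.0
        steps += 1
        if a <= x < b:          # a return to E: record the elapsed time r_k
            total += steps
            steps = 0
            returns += 1
    return total / n_returns

golden = (math.sqrt(5) - 1) / 2          # irrational rotation number
avg = mean_return_time(golden, 0.0, 0.1, 20000)
print(avg, 1 / 0.1)                      # empirical average vs. 1 / mu(E)
```

The empirical average should be close to $1/\mu(E) = 10$; for an infinite invariant measure no such finite limit is available, which is the starting point of the sections below.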

Examples of systems preserving an infinite invariant measure which satisfies these assumptions are not too weird. Instances are interval maps with neutral fixed points (Section 6), which form a rich family of nontrivial transformations that are often used as models for the physical phenomenon of intermittency (see, e.g. [6,14,22,25]) but also appear in problems related to geometry and number theory (Section 8).

For such systems, the asymptotic behaviour of the occupation time $S_n(E,x)$ of a set $E$ of finite positive measure satisfies $S_n(E,x) = o(n)$ for all $x \in X$ outside a set of zero $\mu$-measure (Section 4). In particular, this means that instead of (4) we have

$$ \sum_{k=0}^{n-1} \mu(E \cap T^{-k}F) = o(n), \quad \forall\, E, F \in \mathcal{B} \ \text{s.t.}\ \mu(E)\,\mu(F) < \infty \tag{5} $$

and we shall see below (Sections 2 and 3) how one can reformulate the recurrence property (2) so as to be able to rescale the sum in (5) to get more informative results (Sections 4 and 5). As far as mixing is concerned, in the infinite case, instead of (3), we have

$$ \mu(E \cap T^{-n}F) = o(1), \quad \forall\, E, F \in \mathcal{B} \ \text{s.t.}\ \mu(E)\,\mu(F) < \infty \tag{6} $$

and we shall see in Section 5 how one can introduce a notion of local mixing property, through the scaling rate. The latter is then explicitly computed for the Farey map in Section 8.5. More general notions of mixing in the context of infinite ergodic theory have been recently introduced in [13].

To end this introduction, let us point out that in these notes we shall discuss only some features of infinite measure preserving systems, notably those which can be directly translated into corresponding properties of interval maps having a number theoretical significance, such as the Farey map (Section 8). For more comprehensive treatments of the subject we refer to the works [1,24,26], and references therein.

2. Preliminaries

Let $(X,T,\mathcal{B},\mu)$ be a conservative ergodic measure preserving dynamical system where $\mu$ is an infinite $\sigma$-finite measure. First, we say that a given set $E \in \mathcal{B}$ is a good set if a.e. orbit visits it, i.e. if $\bigcup_{n \ge 0} T^{-n}E = X \pmod{\mu}$. Conservativity and ergodicity imply that any measurable set s.t. $0 < \mu(E) < \infty$ has this property. We now show that the mean return time to such a set is infinite. More specifically, define the return time $R : E \to \mathbb{N}$ as

$$ R(x) := \inf\{n \ge 1 : T^n(x) \in E\} \tag{7} $$

and let $E_n := \{x \in E : R(x) = n\}$ be its $n$th level set. This defines a countable partition $\{A_n\}$ of $X \pmod{\mu}$ into the sets

$$ A_n = T^{-(n-1)}E \setminus \left( \bigcup_{k=0}^{n-2} T^{-k}E \right) = \bigcup_{k \ge n} T^{k-n+1}E_k, \quad n \ge 1 \tag{8} $$

and, $\mu$ being $T$-invariant,

$$ \mu(A_n) = \sum_{k \ge n} \mu(E_k) \tag{9} $$

Therefore

$$ \mu(R) = \sum_{n} n\,\mu(E_n) = \sum_{n} \mu(A_n) = \mu(X) = \infty \tag{10} $$


To quantify the degree of 'infiniteness' of the invariant measure $\mu$ one may use the notion of wandering rate of $(X,T,\mu,E)$: this is the sequence $(w_n(E))_{n \ge 1}$ with

$$ w_n(E) := \mu\left( \bigcup_{k=0}^{n-1} T^{-k}E \right) = \sum_{k=0}^{n-1} \mu(\{R > k\}) \tag{11} $$

Note that $\mu(\{R > k\})/\mu(E)$ is the probability of seeing an excursion outside $E$ of length larger than $k$, and nonintegrability of $R$ means nothing but

$$ \sum_{k=0}^{\infty} \mu(\{R > k\}) = \infty $$

Thus, information about how fast this series diverges or, equivalently, how slowly $\mu(\{R > k\})$ decreases to zero quantifies in a way how large $X$ is relative to $E$.

EXAMPLE. For the standard symmetric random walk on $\mathbb{Z}$ with $E = \{0\}$ it is well known that the conditional probability for an excursion away from the origin to last longer than $n$ (when starting at the origin) decreases as $n^{-1/2}$.
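The $n^{-1/2}$ decay in this example can be checked exactly. The sketch below relies on the classical identity $\Pr(R > 2m) = \binom{2m}{m} 4^{-m}$ for the simple symmetric walk, which is not stated in the text and is used here as an outside fact; the function name `excursion_tail` is hypothetical.

```python
import math

def excursion_tail(m):
    """P(first return to 0 takes more than 2m steps) for the simple symmetric walk on Z.
    Uses the classical identity P(R > 2m) = P(S_{2m} = 0) = C(2m, m) / 4^m."""
    return math.comb(2 * m, m) / 4 ** m

# Compare with the asymptotic form 1 / sqrt(pi * m), i.e. a constant times n^{-1/2} with n = 2m.
ratios = {m: excursion_tail(m) * math.sqrt(math.pi * m) for m in (10, 100, 1000)}
print(ratios)
```

The printed ratios approach 1 as $m$ grows, so the tail is not summable and the mean excursion length is infinite, in line with (10).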

3. Inducing

The classical idea of inducing makes use of a fixed reference set $E$ as above to accelerate the dynamics in such a way that the long (infinite mean) excursions outside $E$ are squeezed to one step (keeping track of their lengths). We shall give here a version of this idea which is well suited for the case in which $E$ is s.t. $TE = X$, a property which shall be assumed throughout the paper and turns out to be satisfied in all examples discussed below. To be precise, we let $p : X \to \mathbb{N}$ be the first passage time in $E$ defined as

$$ p(x) := 1 + \inf\{n \ge 0 : T^n(x) \in E\} \tag{12} $$

so that $R = p \circ T$, and define the induced map $T_E : X \to X$ of $T$ w.r.t. $E$ as

$$ T_E(x) := T^{p(x)}(x) \tag{13} $$

Note that the partition sets $A_n$ introduced above are the level sets of the first passage time: $A_n = \{x \in X : p(x) = n\}$, so that $T_E = T^n$ on $A_n$.

A basic way of using this device is as follows: given a map $T$ we are interested in, find a good subset $E$ w.r.t. which it induces a map $T_E$ which we can understand more easily (i.e. it belongs to a class of maps which have been studied earlier). Then go back to $T$ using the following.

Lemma 3.1. Assume that $T_E$ preserves a finite measure $\rho$. Then the measure $\mu$ defined for any Borel set $B \subseteq X$ by

$$ \mu(B) = \sum_{n \ge 0} \rho(T^{-n}B \cap \{p > n\}) $$

is $T$-invariant.

Remark. Setting $B = E$ in the formula and noting that $T^{-n}E \cap \{p > n\} = \{p = n+1\}$ we get

$$ \mu(E) = \sum_{n \ge 1} \rho(\{p = n\}) = \rho(X) $$

so in order that $\rho$ be a probability measure we have to 'normalize' $\mu$ so that $\mu(E) = 1$.

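Proof of the lemma. Splitting $\{p > n\} = \{p > n+1\} \cup \{p = n+1\}$, we have

$$ \mu(T^{-1}B) = \sum_{n \ge 0} \rho(T^{-n}(T^{-1}B) \cap \{p > n+1\}) + \sum_{n \ge 0} \rho(T^{-n}(T^{-1}B) \cap \{p = n+1\}) $$

$$ = \sum_{n \ge 1} \rho(T^{-n}B \cap \{p > n\}) + \sum_{n \ge 1} \rho(T_E^{-1}B \cap \{p = n\}) $$

$$ = \sum_{n \ge 0} \rho(T^{-n}B \cap \{p > n\}) = \mu(B) \qquad \square $$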

Easy corollary. For any function $f : X \to \mathbb{R}$ define its induced version $f^E : X \to \mathbb{R}$ as

$$ f^E(x) := \sum_{k=0}^{p(x)-1} f(T^k x) \tag{14} $$

Then, if $f \in L^1(\mu)$ then $f^E \in L^1(\rho)$ and

$$ \mu(f) = \rho(f^E) \tag{15} $$

Examples

• Setting $f = 1_{A_n}$ we get $f^E = 1_{\{p \ge n\}}$ and thus $\mu(\{p = n\}) = \rho(\{p \ge n\})$.
• Setting $f = 1_X$ so that $f^E = p$ we reobtain Kac's formula $\mu(X) = \mu(R) = \rho(p)$.

4. Ergodic averages

Assuming that the probability measure preserving system $(X,T_E,\rho)$ is ergodic we have

$$ \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} f(T_E^k x) = \int_X f\,d\rho \quad \rho\text{-a.e.}, \ \forall f \in L^1(\rho) \tag{16} $$

On the other hand, for the infinite measure preserving system $(X,T,\mu)$ it holds

$$ \sum_{k=0}^{n-1} f(T^k x) = o(n) \quad \mu\text{-a.e.}, \ \forall f \in L^1(\mu) \tag{17} $$

A natural question is then the following: is it possible to identify the proper rate, i.e. a sequence $a_n \nearrow \infty$ s.t. $\sum_{k=0}^{n-1} f(T^k x) \sim a_n \int_X f\,d\mu$ a.e.? Well, it is not worth trying too hard, because according to a theorem of Aaronson [1] this is just not possible: given any positive sequence $a_n$, either

$$ \liminf_{n\to\infty} \frac{1}{a_n} \sum_{k=0}^{n-1} f(T^k x) = 0 \quad \mu\text{-a.e.}, \ \forall f \in L^1(\mu),\ f > 0, $$

or

$$ \limsup_{n\to\infty} \frac{1}{a_n} \sum_{k=0}^{n-1} f(T^k x) = \infty \quad \mu\text{-a.e.}, \ \forall f \in L^1(\mu),\ f > 0. $$

Thus, the pointwise behaviour of ergodic averages for an infinite measure preserving system is so complicated that any normalizing sequence $a_n$ either over- or underestimates their actual size infinitely often.

However, under some universality and regularity conditions (existence of very good sets (see below) and regular variation$^1$ of their wandering rate $w_n$) the same author proved the existence of a suitable sequence $a_n \nearrow \infty$ s.t. the ergodic averages rescaled with $a_n$ have a definite asymptotic distribution (see [1]).

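Theorem 4.1. Assume there is some very good set $E$ such that $w_n(E)$ is regularly varying with exponent $1 - \alpha$, $\alpha \in [0,1]$. Then there is a sequence $a_n$ regularly varying with exponent $\alpha$ and satisfying

$$ a_n \sim \frac{1}{\Gamma(2-\alpha)\,\Gamma(1+\alpha)} \cdot \frac{n}{w_n(E)} $$

s.t. for every probability measure $P$ on $X$ a.c. w.r.t. $\mu$, for all $f \in L^1(\mu)$ and for all $t > 0$

$$ P\left( \frac{1}{a_n} \sum_{k=0}^{n-1} f(T^k x) \le t \right) \to \Pr\left( \int_X f\,d\mu \cdot \xi_\alpha \le t \right) \quad (n \to \infty) $$

Here $\xi_\alpha$ denotes a non-negative real random variable distributed according to the (normalized) Mittag–Leffler distribution of order $\alpha$, which can be characterized by its moments

$$ \mathbb{E}[\xi_\alpha^\ell] = \ell!\,\frac{(\Gamma(1+\alpha))^\ell}{\Gamma(1+\ell\alpha)}, \quad \ell = 0, 1, 2, \ldots $$

For specific $\alpha$-values it has a more explicit description: $\xi_1 = 1$ (a constant r.v.), $\xi_{1/2} = |\mathcal{N}|$ (the absolute value of a standard Gaussian r.v., up to the factor $\sqrt{\pi/2}$ required by the normalization $\mathbb{E}[\xi_\alpha] = 1$) and $\xi_0 = \mathcal{E}$ (an exponentially distributed r.v. with mean one).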
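The moment formula can be evaluated directly to check the three special cases. The sketch below is only an illustration; the function names are hypothetical, and the half-normal moment formula $\mathbb{E}|\mathcal{N}|^\ell = 2^{\ell/2}\,\Gamma((\ell+1)/2)/\sqrt{\pi}$ is a standard fact not taken from the text.

```python
import math

def ml_moment(alpha, ell):
    """ell-th moment of the normalized Mittag-Leffler law of order alpha."""
    return math.factorial(ell) * math.gamma(1 + alpha) ** ell / math.gamma(1 + ell * alpha)

def scaled_half_normal_moment(ell):
    """ell-th moment of sqrt(pi/2) * |N|, with N a standard Gaussian."""
    abs_gauss = 2 ** (ell / 2) * math.gamma((ell + 1) / 2) / math.sqrt(math.pi)
    return (math.pi / 2) ** (ell / 2) * abs_gauss

orders = range(6)
const_case = [ml_moment(1.0, l) for l in orders]   # alpha = 1: all moments equal 1
expo_case = [ml_moment(0.0, l) for l in orders]    # alpha = 0: moments l! (exponential, mean 1)
half_case = [ml_moment(0.5, l) for l in orders]    # alpha = 1/2: rescaled half-normal
print(const_case, expo_case, half_case)
```

For $\alpha = 1/2$ the moments agree with those of $\sqrt{\pi/2}\,|\mathcal{N}|$, which has mean one.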

5. The asymptotic renewal equation

In this section we give a proof of the first part of Theorem 4.1. The starting point is to observe that if one attempts to define the (wild) asymptotic size of occupation times of a good set $E$ by just averaging, i.e. setting

$$ a_n(E) := \int_E \sum_{k=0}^{n-1} 1_E \circ T^k \, d\mu_E = \sum_{k=0}^{n-1} \mu_E(T^{-k}E) \tag{18} $$

where $\mu_E$ is the conditional probability measure $d\mu|_E / \mu(E)$, then one may still have non-universality, in that there might be another good set $F$ so that $a_n(E) = o(a_n(F))$!

1 We say that $c(n)$ is a regularly varying sequence if $c([\lambda n])/c(n) \to \lambda^\alpha$ for some $\alpha \in \mathbb{R}$ (if $\alpha = 0$ we say that $c(n)$ is slowly varying). In this case it admits an asymptotic inverse $d(n)$ s.t. $c(d(n)) \sim d(c(n)) \sim n$ which is regularly varying with exponent $1/\alpha$ and unique up to asymptotic equivalences if $\alpha > 0$ (see [5], p. 28).

One is then led to ask for the existence of a very good set (also called Darling–Kac set), which is a set $E$ such that for some universal $a_n \nearrow \infty$ it holds$^2$

$$ \frac{1}{a_n} \sum_{k=0}^{n-1} \widehat{T}^k 1_E \to \mu(E) \quad \text{uniformly (mod } \mu) \text{ on } E \tag{19} $$

where $\widehat{T} : L^1(\mu) \to L^1(\mu)$ is the transfer operator of $T$ w.r.t. the invariant measure $\mu$, which describes the evolution of probability densities under the action of $T$: if $u$ is the density of some probability measure $\nu$ w.r.t. $\mu$, then $\widehat{T}u$ is the density of the image measure $\nu \circ T^{-1}$, which is reflected in the duality relation

$$ \int_X f \cdot \widehat{T}u \, d\mu = \int_X (f \circ T) \cdot u \, d\mu, \quad f \in L^\infty(\mu),\ u \in L^1(\mu) \tag{20} $$

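Setting $u = 1_E/\mu(E)$ and $f = 1_E$ this gives, upon multiplying (19) by $f$ and integrating over $X$

$$ a_n \sim \frac{1}{\mu(E)} \int_X 1_E \cdot \left( \sum_{k=0}^{n-1} \widehat{T}^k u \right) d\mu = \frac{1}{\mu(E)} \sum_{k=0}^{n-1} \int_X (1_E \circ T^k) \cdot u \, d\mu = \frac{a_n(E)}{\mu(E)} \tag{21} $$

The point here is that if $E$ and $F$ are both very good sets then

$$ a_n \sim \frac{a_n(E)}{\mu(E)} \sim \frac{a_n(F)}{\mu(F)} $$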

In the same way, the sequence $(s_n(E))_{n \ge 0}$ given by

$$ s_n(E) := \frac{\mu_E(T^{-n}E)}{\mu(E)} = \frac{\mu(T^{-n}E \cap E)}{(\mu(E))^2} \tag{22} $$

is asymptotically universal whenever $E$ is a very good set, and we can thus define the scaling rate $(s_n)_{n \ge 0}$ of $(X,T,\mu)$ (see [11,12]) as the sequence

$$ s_n \sim s_n(E), \quad E \text{ a very good set} \tag{23} $$

It satisfies

$$ a_n \sim \sum_{k=0}^{n-1} s_k \tag{24} $$

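Now, let us consider the set $C_n := \bigcup_{k=0}^{n} T^{-k}E$ of points which enter $E$ not later than time $n$, and decompose it as follows

$$ C_n = \bigcup_{k=0}^{n} T^{-k}\{R > n-k\} $$

Since the sets $T^{-k}\{R > n-k\}$, $0 \le k \le n$, are disjoint we have

$$ \mu_E(C_n) = \int_E \sum_{k=0}^{n} \widehat{T}^k u \cdot 1_{\{R > n-k\}} \, d\mu, \quad n \ge 0 $$

with $u = 1_E/\mu(E)$. The right-hand side is a convolution in $n$, so taking the Laplace–Stieltjes transform turns it into a product and we get

$$ \int_E \left( \sum_{n=0}^{\infty} \widehat{T}^n u \, e^{-ns} \right) \left( \sum_{n=0}^{\infty} 1_{\{R > n\}} \, e^{-ns} \right) d\mu = \sum_{n=0}^{\infty} \mu_E(C_n) \, e^{-ns}, \quad s > 0 \tag{25} $$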

2 Uniform convergence (mod $\mu$) on $E$ means uniform convergence on a set $E'$ such that the symmetric difference $E' \triangle E$ has $\mu$-measure zero.


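On the other hand, putting together (19) and (24) we get

$$ \sum_{k=0}^{n} \widehat{T}^k u \sim \sum_{k=0}^{n} s_k \quad (n \to \infty) $$

so that by a classical Abelian theorem we have

$$ \sum_{n=0}^{\infty} \widehat{T}^n u \, e^{-ns} \sim P(s) := \sum_{n=0}^{\infty} s_n e^{-ns} \quad (s \to 0) $$

Moreover, since $\lim_{n\to\infty} \mu_E(C_n) = 1$ we have

$$ \sum_{n=0}^{\infty} \mu_E(C_n) \, e^{-ns} \sim \frac{1}{1 - e^{-s}} \sim \frac{1}{s} \quad (s \to 0) $$

Thus, upon setting

$$ Q(s) := \sum_{n=0}^{\infty} \mu(\{R > n\}) \, e^{-ns}, \quad s > 0 \tag{26} $$

the above yields the following asymptotic renewal equation

$$ P(s) \cdot Q(s) \sim \frac{1}{s} \quad (s \to 0) \tag{27} $$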

We now recall the following.

Lemma 5.1 (Karamata's Tauberian Theorem for power series [5, p. 37]). Let $u_n \ge 0$ ($n \ge 0$) and suppose that $U(s) = \sum_{n \ge 0} u_n e^{-ns}$ converges for $s > 0$. If $L$ is slowly varying and $0 \le \rho < \infty$, the following are equivalent:

$$ \sum_{k=0}^{n-1} u_k \sim n^\rho \cdot L(n)/\Gamma(1+\rho) \quad (n \to \infty) $$

and

$$ U(s) \sim (1/s)^\rho \cdot L(1/s) \quad (s \to 0) $$

Finally, the hypothesis of Theorem 4.1 is that $w_n(E) = \sum_{k=0}^{n-1} \mu(\{R > k\}) = n^{1-\alpha} L(n)$, so that putting together (11), (24), (27) and the above lemma we get the claim.

Remark 5.2. In order to justify the name given to Eq. (27) let us recall some basic notions of renewal theory. The sequence of return times in $E$, i.e. $(R \circ T_E^k)_{k \ge 0}$, is a stationary and ergodic process on the probability space $(E, \mu_E)$. Moreover the quantity

$$ e_n := \mu_E(T^{-n}E) = \mu(E) \cdot s_n(E) \tag{28} $$

is the probability to observe a return in $E$ after $n$ iterations of $T$ (for the first time or not), and can be interpreted as the probability to observe a renewal at time $n$ [23]. Setting $p_n \equiv \mu_E(E_n)$ we can write $e_n$ in the form

$$ e_n = \sum_{k=1}^{n} p_k \, \mu_E(T^{-n}E \mid E_k) \tag{29} $$

Suppose for a moment that the iteration process $x_n = T^n(x_0)$, $x_0 \in X$, 'starts afresh' at each passage (renewal) in $E$, namely that

$$ \mu_E(T^{-n}E \mid E_k) \equiv \mu_E(T^n x \in E \mid R(x) = k) = \mu_E(T^n x \in E \mid T^k x \in E) = \mu_E(T^{-n+k}E) = e_{n-k} $$

so that the sequence $e_0, e_1, \ldots$ satisfies the recurrence:

$$ e_0 = 1, \quad e_n = p_n + e_1 p_{n-1} + \cdots + e_{n-1} p_1 \quad (n \ge 1), \tag{30} $$

and we say that $(e_n)$ is the renewal sequence associated to the probability distribution $(p_n)$.

Now set $q_n = \sum_{k > n} p_k$. Denote by $P(s)$, $Q(s)$, $F(s)$ the Laplace–Stieltjes transforms of the sequences $(e_n)$, $(q_n)$, $(p_n)$ ($s > 0$), respectively, i.e., $P(s) = \sum_{n=0}^{\infty} e_n e^{-ns}$, $Q(s) = \sum_{n=0}^{\infty} q_n e^{-ns}$, $F(s) = \sum_{n=0}^{\infty} p_n e^{-ns}$. The recursion (30) is equivalent to $P(s)(1 - F(s)) = 1$ and since $1 - F(s) = (1 - e^{-s})Q(s)$ this is the same as $P(s) \cdot Q(s) = (1 - e^{-s})^{-1}$, which is the 'exact' version of (27).
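The exact renewal identity can be checked coefficient by coefficient: expanding both sides of $P(s)\,Q(s) = (1 - e^{-s})^{-1}$ in powers of $e^{-s}$ gives $\sum_{k=0}^{n} e_k\, q_{n-k} = 1$ for every $n \ge 0$. The following sketch verifies this in exact arithmetic for a made-up return-time distribution; the distribution and the function name `renewal_sequence` are assumptions for illustration only.

```python
from fractions import Fraction

def renewal_sequence(p, n_max):
    """e_0 = 1 and e_n = sum_{k=1}^{n} p_k e_{n-k}, i.e. recursion (30); p[0] must be 0."""
    e = [Fraction(1)]
    for n in range(1, n_max + 1):
        e.append(sum(p[k] * e[n - k] for k in range(1, n + 1) if k < len(p)))
    return e

# Hypothetical return-time distribution (p_1, p_2, p_3) = (1/2, 1/3, 1/6).
p = [Fraction(0), Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n_max = 12
e = renewal_sequence(p, n_max)

# Tails q_n = sum_{k > n} p_k.
q = [sum(p[n + 1:], Fraction(0)) for n in range(n_max + 1)]

# Coefficient form of P(s) Q(s) = 1 / (1 - e^{-s}).
checks = [sum(e[k] * q[n - k] for k in range(n + 1)) for n in range(n_max + 1)]
print(checks)
```

Using `Fraction` avoids rounding, so each entry of `checks` can be compared with 1 exactly.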

Remark 5.3. Let $N_n(x) = \sum_{k=1}^{n} 1_E(T^k(x))$ be the number of returns in $E$ up to time $n$ (with $N_0 = 0$). Using the definition (28) one can show that in general (even without (30))

$$ e_n = \mu_E(N_n) - \mu_E(N_{n-1}) $$

This gives $e_n$ the interpretation of mean number of returns in $E$ per iteration of $T$, or else of mean density of returns in $E$.

6. The main example

Take $X = [0,1]$ and $T : [0,1] \to [0,1]$ a Markov map of the following type: there exists a finite family of pairwise disjoint subintervals $\{Z_k : k \in I\}$ s.t. $\lambda(\bigcup_{k \in I} Z_k) = 1$ and

1. For each $k \in I$, $T$ extends to a monotone $C^2$ function $T_k$ on the closure of $Z_k$ which is onto $[0,1]$.

2. There exists a non-empty finite set $J \subseteq I$ s.t. each $Z_j$, $j \in J$, contains an indifferent fixed point $x_j$ where $T'(x_j) = 1$ (regular source).

3. For each $\epsilon > 0$ we have $|T'| \ge \rho(\epsilon) > 1$ on $\bigcup_{k \in I} Z_k \setminus \bigcup_{j \in J} (x_j - \epsilon, x_j + \epsilon)$.

4. $|T''(x)/(T'(x))^2|$ is uniformly bounded on $\bigcup_{k \in I} Z_k$ (Adler's condition).

5. In a neighbourhood of each indifferent fixed point $x_j$ we have

$$ T(x) = x \pm c_j |x - x_j|^{b_j + 1} + o(|x - x_j|^{b_j + 1}), \quad c_j > 0,\ b_j \ge 1 $$

A map satisfying these assumptions is depicted in the picture below, borrowed from [24].

For this example the assumptions of Theorem 4.1 are fulfilled and setting $b = \max\{b_j : j \in J\}$ we have $\alpha = 1/b$ and (see [24])

$$ a_n \sim \text{const} \cdot \begin{cases} n/\log n, & b = 1 \\ n^{1/b}, & b > 1 \end{cases} \tag{31} $$


7. The barely infinite invariant measure situation

In what follows, we shall discuss in some detail only the situation in which the measure $\mu$ is barely infinite ($\alpha = 1$). In view of the last example this amounts to restricting to the case $b = 1$. In this case the Aaronson distributional result reduces to a 'weak law of large numbers' (see also [7]).

Theorem 7.1. Let $T : [0,1] \to [0,1]$ be a map satisfying the assumptions listed in the above example with $b = 1$. For every probability measure $P$ on $X$ a.c. w.r.t. $\mu$, for all $f \in L^1(\mu)$ and for all $\epsilon > 0$,

$$ P\left( \left| \frac{1}{a_n} \sum_{k=0}^{n-1} f(T^k x) - \int_X f\,d\mu \right| \ge \epsilon \right) \to 0 \quad \text{as } n \to \infty $$

where $a_n \sim c\,n/\log n$ for some constant $c > 0$.

To see how things work, let us consider an orbit $\{T^k x\}_{k=0}^{n-1}$ for some $x \in [0,1]$, fix a very good set $E \subset [0,1]$ and denote by $N_n = N_n(E,x)$ the number of its passages in $E$. Namely

$$ N_n(E,x) := \sum_{k=0}^{n-1} 1_E(T^k x) \tag{32} $$

We can write

$$ \sum_{k=0}^{n-1} f(T^k x) = \sum_{k=0}^{N_n - 1} f^E(T_E^k x) + R(n,x,f) \tag{33} $$

with remainder

$$ R(n,x,f) = \sum_{k=S_N(E,x)}^{n-1} f(T^k x) \tag{34} $$

where

$$ S_N(E,x) := \sum_{k=0}^{N_n - 1} p(T_E^k x) \tag{35} $$

is the total number of iterates of $T$ needed to observe $N_n$ passages in $E$. Now $N_n(E,x) \to \infty$ as $n \to \infty$ for all $x \in [0,1]$. Moreover if $f \in L^1(\mu)$ then $f^E \in L^1(\rho)$ by (15) and hence by (16)

$$ \lim_{n\to\infty} \frac{1}{N_n(E,x)} \sum_{k=0}^{N_n(E,x) - 1} f^E(T_E^k x) = \int_X f^E \, d\rho \quad \text{a.e.} $$

Since $\rho(f^E) = \mu(f)$ by (15), provided $|R(n,x,f)|/N_n(E,x) \to 0$ (in a suitable sense: note that as a function of the number of iterates $n$ we have $S_N \sim n$ in probability) from the above we get

$$ \lim_{n\to\infty} \frac{1}{N_n(E,x)} \sum_{k=0}^{n-1} f(T^k x) = \int_X f\,d\mu \quad \text{a.e.} \tag{36} $$

To finish the sketch of the proof we need two more steps:

• As a function of the number of passages $N$ the total excursion time $S_N$ obeys a definite asymptotic law in probability: there exists a sequence $b_N \sim c^{-1} N \log N$ for some constant $c > 0$ s.t.

$$ \lim_{N\to\infty} \frac{S_N(E,x)}{b_N} = 1 \quad \text{in probability}; \tag{37} $$

• The total excursion time $S_N$ and the number of passages $N_n$ satisfy the duality rule:

$$ N_n(E,x) \le m \iff S_m(E,x) \ge n \tag{38} $$

That is, the number of passages to $E$ before time $n$ does not exceed $m$ iff the $m$-th passage does not take place before time $n$. Therefore, knowing the asymptotic behaviour $b_m$ of $S_m(E,x)$ we can obtain that of $N_n(E,x)$ (that is, $a_n$). In particular, since $b_m$ is regularly varying, $a_n$ is its asymptotic inverse, namely $a_n \sim c\,n/\log n$, and vice versa.

It thus remains to prove (37). The first ingredient is an estimate of the decay of the tail distribution of the first passage time. Under the hypotheses of Theorem 7.1, a result of Thaler says that the measure $\mu$ is s.t. $\mu(dx) = e(x)\,dx$ with density $e(x) = g(x) \prod_{j \in J} |x - x_j|^{-1}$ with $g$ continuous and positive on $[0,1]$. Then, considering the random variables $r_i := p \circ T_E^{i-1}$ ($i \ge 1$) on the probability space $([0,1], \rho)$, one gets for $n$ large enough $\rho(r_i = n) \sim C_1 n^{-2}$ and thus the estimate $\rho(r_i > n) \ge C_2 n^{-1}$. The second ingredient is the uniform mixing property of the random variables $r_i$. To state it precisely, for $r$ and $k_1, k_2, \ldots, k_r$ positive integers we let $Q_r = \{x : r_1 = k_1, \ldots, r_r = k_r\}$ be an $r$-dimensional cylinder. Then, if $r$ and $s$ are positive integers, $B$ is any Borel set and $Q_r$ is as above, it holds

$$ \rho(Q_r \cap T_E^{-r-s}B) = \rho(Q_r)\,\rho(B)\,(1 + O(q^s)) \tag{39} $$

uniformly in $r$, $s$, $B$ and $Q_r$. Here $q$ is some number in $(0,1)$. We now prove the following.

Lemma 7.2. For all $\epsilon > 0$ and fixed $N \in \mathbb{N}$ we can find a constant $C > 0$ so that

$$ \rho\left( \left| \frac{S_N}{b_N} - 1 \right| \ge \epsilon \right) < \frac{C}{\epsilon \log N} $$

where, as above, $b_N \sim c^{-1} N \log N$ for some constant $c > 0$.


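Proof. Set $S'_N(E,x) = \sum_{i \le N} r'_i$ where $r'_i = r_i$ if $r_i < N' := \epsilon N \log N$ and $r'_i = 0$ otherwise, and moreover

$$ M'_N := \rho(S'_N), \quad V'_N := \rho((S'_N - M'_N)^2) $$

We have

$$ M'_N = N \rho(r'_1) = N \sum_{\ell \le N'} \ell\,\rho(r_1 = \ell) =: b_N \sim c^{-1} N \log N' \sim c^{-1} N \log N $$

for some $c > 0$. Moreover $\rho((S'_N)^2) = \sum_{n,m=1}^{N} \rho_{mn}$ where, for $m < n$,

$$ \rho_{mn} := \rho(r'_m r'_n) = \sum_{\ell, k \le N'} \ell \cdot k\,\rho(\{r'_m = \ell, r'_n = k\}) = \sum_{\ell, k \le N'} \ell \cdot k\,\rho(\{r'_1 = \ell\})\,\rho(\{r'_1 = k\})\,(1 + O(q^{n-m})) = (M'_N)^2 N^{-2} (1 + O(q^{n-m})) $$

In particular $\rho_{nn} = \sum_{\ell \le N'} \ell^2 \rho(\{r'_1 = \ell\}) \sim N'$, and therefore

$$ V'_N = \sum_{n,m=1}^{N} \rho_{mn} - (M'_N)^2 = (M'_N)^2 N^{-2} \sum_{m < n \le N} O(q^{n-m}) + N N' \lesssim (M'_N)^2 N^{-1} + N N' \sim N N' $$

Thus, applying the Chebyshev inequality we get

$$ \rho(|S'_N - M'_N| \ge \epsilon M'_N) < \frac{C_3}{\epsilon \log N} $$

and moreover, for each $i \le N$,

$$ \rho(r_i \ge \epsilon N \log N) < \frac{C_2}{\epsilon N \log N} $$

and the estimate follows. $\square$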

We end this section by illustrating a simple argument which shows that Theorem 7.1 cannot be sharpened (yielding the Aaronson negative result for this case).

Considering (37) one may wonder if it holds in a strong sense, i.e. if

$$ \rho\left( \lim_{N\to\infty} \frac{S_N}{b_N} = 1 \right) = 1 $$

We now show that it does not and in fact

$$ \rho\left( \lim_{N\to\infty} \frac{S_N}{b_N} = 1 \right) = 0 \tag{40} $$

Indeed, since $\rho(r_N > n) \ge C_3/n$, for any number $\ell > 1$ and for $N$ large enough we have

$$ \rho(r_N > \ell b_N) \ge \frac{C_3}{\ell b_N} \ge \frac{C_4}{\ell c N \log N} $$

since $b_N \sim c^{-1} N \log N$. Therefore

$$ \sum_{N \ge 1} \rho(r_N > \ell b_N) = \infty $$

From the extension of the Borel–Cantelli lemma to dependent events it follows that

$$ \rho\left( \frac{r_N}{b_N} > \ell \ \text{infinitely often} \right) = 1 $$

hence

$$ \rho\left( \frac{S_N}{b_N} > \ell \ \text{infinitely often} \right) = 1 $$

and finally

$$ \rho\left( \limsup_{N\to\infty} \frac{S_N}{b_N} = \infty \right) = 1 $$

which implies the claim (40). But we can actually say more: since (37) implies the convergence a.e. on a subsequence, (40) is valid for every sequence of constants $b_N$.

8. The Farey and Gauss maps

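We now choose the map $T : [0,1] \to [0,1]$ to be the Farey map $F$, given by

$$ F(x) := \begin{cases} F_0(x), & 0 \le x \le 1/2 \\ F_1(x), & 1/2 < x \le 1 \end{cases} \tag{41} $$

where

$$ F_0(x) = \frac{x}{1-x} \quad \text{and} \quad F_1(x) = F_0(1-x) = \frac{1-x}{x} \tag{42} $$

Their iterates are explicitly given as

$$ F_0^n(x) = \frac{x}{1 - nx} \quad \text{and} \quad F_1^n(x) = \frac{f_{n+1}x - f_n}{f_{n-1} - f_n x}, \quad n \ge 1 \tag{43} $$

where $f_0 = 0$, $f_1 = 1$ and $f_n = f_{n-1} + f_{n-2}$, $n \ge 2$, are the Fibonacci numbers. The inverse branches are

$$ \Psi_0(x) = F_0^{-1}(x) = \frac{x}{1+x} = \frac{1}{2}\left( 1 - \frac{1-x}{1+x} \right), \qquad \Psi_1(x) = F_1^{-1}(x) = \frac{1}{1+x} = \frac{1}{2}\left( 1 + \frac{1-x}{1+x} \right) $$

Moreover $F$ preserves the a.c. (barely) infinite measure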

$$ \mu(dx) = e(x)\,dx, \quad e(x) = \left( \frac{1}{\log 2} \right) \frac{1}{x} \tag{44} $$

where the multiplying factor ensures that $\mu([1/2,1)) = 1$, the set $E = [1/2,1)$ being a very good set for this map. To see this it suffices to verify that $Pe = e$ where $P$ is the transfer operator of $F$ w.r.t. the Lebesgue measure, which acts on $f : [0,1] \to \mathbb{C}$ as

$$ (Pf)(x) = \sum_{y : F(y) = x} \frac{f(y)}{|F'(y)|} = \frac{1}{(1+x)^2} \left[ f\left( \frac{x}{1+x} \right) + f\left( \frac{1}{1+x} \right) \right] \tag{45} $$
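The fixed-point relation $Pe = e$ is easy to confirm numerically from (44) and (45). The sketch below is a plain transcription of those two formulas; the function names `transfer` and `e` and the sample points are choices made for this illustration.

```python
import math

def transfer(f):
    """Transfer operator P of the Farey map w.r.t. Lebesgue measure, as in (45)."""
    return lambda x: (f(x / (1 + x)) + f(1 / (1 + x))) / (1 + x) ** 2

def e(x):
    """Invariant density (44): e(x) = 1 / (x log 2)."""
    return 1 / (math.log(2) * x)

Pe = transfer(e)
samples = (0.1, 0.37, 0.5, 0.9)
for x in samples:
    print(x, Pe(x), e(x))   # the two values should agree up to rounding
```

Algebraically, $e(x/(1+x)) + e(1/(1+x)) = (1+x)^2/(x \log 2)$, which is why the two columns coincide.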

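Note that $P$ is related to the Markov operator $\widehat{T}$ introduced previously by $\widehat{T}f = e^{-1} P(e \cdot f)$, so that

$$ (\widehat{T}f)(x) = \left( \frac{1}{1+x} \right) f\left( \frac{x}{1+x} \right) + \left( \frac{x}{1+x} \right) f\left( \frac{1}{1+x} \right) \tag{46} $$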

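Referring to the notation introduced in the previous sections we have the following identifications according to the above choice for $E$ (up to endpoints, which are immaterial mod $\mu$):

$$ \{p = n\} \equiv A_n = \left( \frac{1}{n+1}, \frac{1}{n} \right], \qquad \{R = n\} \equiv E_n = \left[ \frac{n}{n+1}, \frac{n+1}{n+2} \right), \quad n \ge 1 $$

and setting $A_0 = [0,1]$ we have

$$ F(E_n) = A_n, \quad F(A_n) = A_{n-1}, \quad \forall n \ge 1 $$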

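Therefore

$$ \mu(\{R > k\}) = \sum_{l > k} \mu(E_l) = \sum_{l > k} \frac{\log\left( 1 + \frac{1}{l(l+2)} \right)}{\log 2} = \frac{\log\left( 1 + \frac{1}{k+1} \right)}{\log 2} $$

and the (slowly varying) wandering rate is

$$ w_n = \sum_{k=0}^{n} \mu(\{R > k\}) = \log_2(2+n) \sim \log_2 n \tag{47} $$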
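Both identities above come from telescoping products, and they can be checked numerically. In the sketch below `mu_E`, `tail` and `wandering_rate` are hypothetical helper names, and the cutoff used to truncate the infinite sum is an arbitrary choice.

```python
import math

def mu_E(l):
    """mu(E_l) = log2(1 + 1/(l(l+2))) for the Farey map with E = [1/2, 1)."""
    return math.log2(1 + 1 / (l * (l + 2)))

def tail(k, cutoff=200000):
    """mu({R > k}) as the sum of mu(E_l) over l > k, truncated at `cutoff`."""
    return sum(mu_E(l) for l in range(k + 1, cutoff))

def wandering_rate(n):
    """Sum of the closed-form tails log2(1 + 1/(k+1)) over 0 <= k <= n, as in (47)."""
    return sum(math.log2(1 + 1 / (k + 1)) for k in range(n + 1))

print(tail(3), math.log2(1 + 1 / 4))           # truncated sum vs. closed form
print(wandering_rate(1000), math.log2(1002))   # telescoping sum vs. log2(2 + n)
```

The truncated sum differs from the closed form only by the discarded tail, of size about $1/(\text{cutoff} \cdot \log 2)$.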

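According to Theorem 4.1 we thus have

$$ a_n \sim \frac{n}{w_n} \sim \frac{n}{\log_2 n} \tag{48} $$

as expected. Theorem 7.1 applied to $f = 1_E$ with $c = \log 2$ yields (cf. (32))

$$ \lim_{n\to\infty} \frac{\log_2 n}{n} \cdot N_n(E,x) = 1 \quad \text{in probability} \tag{49} $$

Dually to this we get (cf. (37) and (38))

$$ \lim_{N\to\infty} \frac{S_N(E,x)}{N \log_2 N} = 1 \quad \text{in probability} \tag{50} $$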

Now note that $p(x) = \left[ \frac{1}{x} \right]$ where $[\,\cdot\,]$ denotes the integer part and the induced map $T_E \equiv G : [0,1] \to [0,1]$ acts as

$$ G(x) = F_1 \circ F_0^{n-1}(x) = \frac{1}{x} - n, \quad x \in A_n $$

Namely $G$ is the celebrated Gauss map

$$ G(x) := \frac{1}{x} \pmod 1, \quad x \ne 0, \quad G(0) = 0 \tag{51} $$

which is ergodic w.r.t. the invariant a.c. probability measure $\rho(dx) = h(x)\,dx$ obtained by pushing forward $\mu$ with the right branch of $F$, whose density $h$ satisfies

$$ h = |\Psi_1'| \, e \circ \Psi_1 \iff h(x) = \left( \frac{1}{\log 2} \right) \frac{1}{1+x} \tag{52} $$

Note that the converse relation is (cf. Lemma 3.1)

$$ e = \sum_{k=0}^{\infty} |(\Psi_0^k)'| \, h \circ \Psi_0^k \tag{53} $$
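Relation (53) can be verified by hand. The short derivation below is a sketch that uses only the inverse branch $\Psi_0(x) = x/(1+x)$ and the density (52); the closed form of $\Psi_0^k$ follows by induction.

```latex
% Iterates of the left inverse branch and their derivatives:
\Psi_0^k(x) = \frac{x}{1+kx}, \qquad (\Psi_0^k)'(x) = \frac{1}{(1+kx)^2}, \qquad
h\big(\Psi_0^k(x)\big) = \frac{1}{\log 2}\,\frac{1+kx}{1+(k+1)x}.
% Each term of the series in (53) is a telescoping difference:
|(\Psi_0^k)'(x)|\, h\big(\Psi_0^k(x)\big)
  = \frac{1}{\log 2}\,\frac{1}{(1+kx)\,(1+(k+1)x)}
  = \frac{1}{x\log 2}\left( \frac{1}{1+kx} - \frac{1}{1+(k+1)x} \right).
% Summing over k >= 0 leaves only the first term:
\sum_{k=0}^{\infty} |(\Psi_0^k)'(x)|\, h\big(\Psi_0^k(x)\big) = \frac{1}{x\log 2} = e(x).
```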

Remark 8.1. It is well known that the system $([0,1],\rho,G)$ is ergodic and, in fact, exact$^3$ (see, e.g., [4]). It is not difficult to see (for example using (15)) that $([0,1],\mu,F)$ is also exact (and thus ergodic). On the other hand, a result in [20] says that this is equivalent to the fact that

$$ \lim_{n\to\infty} \|P^n f\|_1 = 0, \quad \forall f \in L^1(\mu) \ \text{s.t.}\ \mu(f) = 0 \tag{54} $$

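3 A system $(X,T,\mathcal{B},\mu)$ is exact if and only if for each element $E$ of the tail $\sigma$-algebra $\mathcal{B}_\infty := \bigcap_{n \ge 0} T^{-n}(\mathcal{B})$ we have $\mu(E) \cdot \mu(X \setminus E) = 0$. Note that any $T$-invariant set $E$ is an element of $\mathcal{B}_\infty$, since $E = T^{-n}E \in T^{-n}(\mathcal{B})$ for all $n \ge 0$, but $\mathcal{B}_\infty$ may also contain sets which are not $T$-invariant.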

8.1. Relation with the continued fractions

An interesting way to look at the action of the maps $F$ and $G$ makes use of the continued fraction expansion [18]. We start by recalling that every real number $x \in [0,1]$ has a unique expansion of the type

$$ x = \cfrac{1}{r_1 + \cfrac{1}{r_2 + \cfrac{1}{\ddots}}} \equiv [r_1, r_2, \ldots], \quad r_k \in \mathbb{N} \tag{55} $$

The integers $r_k$ are called partial quotients or CF-digits. The following result can be readily established by noting that $p(x) = r_1$ and more generally $p(G^{k-1}(x)) = r_k$ for all $k \ge 1$.

Proposition 8.2. In terms of CF-digits we have

$$ F : [r_1, r_2, \ldots] \mapsto [r_1 - 1, r_2, \ldots] \tag{56} $$

and

$$ G : [r_1, r_2, \ldots] \mapsto [r_2, r_3, \ldots] \tag{57} $$

Therefore, the function $S_N(E,x)$ introduced in the previous section is given by

$$ x = [r_1, r_2, \ldots] \iff S_N(E,x) = \sum_{k=1}^{N} r_k(x) \tag{58} $$
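The relation between the two maps can be illustrated on a rational point, where exact arithmetic is available and the expansion is finite. The sketch below takes $x = 3/11$ as an arbitrary test point and checks that $r_1$ Farey steps reproduce one Gauss step, so that the number of Farey iterates needed for $N$ Gauss steps is $\sum_{k \le N} r_k$, as in (58). The function names are hypothetical.

```python
import math
from fractions import Fraction

def farey(x):
    """Farey map (41)-(42)."""
    return x / (1 - x) if x <= Fraction(1, 2) else (1 - x) / x

def gauss(x):
    """Gauss map (51)."""
    return Fraction(0) if x == 0 else 1 / x - math.floor(1 / x)

def cf_digits(x):
    """CF-digits of a rational x in (0, 1], read off as r_k = [1 / G^{k-1}(x)]."""
    digits = []
    while x != 0:
        digits.append(math.floor(1 / x))
        x = gauss(x)
    return digits

x = Fraction(3, 11)
digits = cf_digits(x)

# Apply r_k Farey steps for each digit and compare with one Gauss step.
matches, farey_steps, y = [], 0, x
for r in digits:
    z = y
    for _ in range(r):
        z = farey(z)
    matches.append(z == gauss(y))
    y, farey_steps = z, farey_steps + r

print(digits, matches, farey_steps)
```

For irrational $x$ the same bookkeeping applies, but floating-point iteration of either map loses precision after a few dozen digits, which is why the sketch uses `Fraction`.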

Hence we have $\rho(r_1) = \infty$ and (50) stated in terms of CF-digits yields the following classical result.

Proposition 8.3 (Khinchin's weak law). The CF-digits $(r_k)$ satisfy

$$ \lim_{N\to\infty} \frac{\sum_{k=1}^{N} r_k}{N \log_2 N} = 1 \quad \text{in probability} $$

As we have seen, this result cannot be sharpened. On the other hand the following result by Diamond and Vaaler [8] shows that the obstacle to a.e. convergence is the occurrence of a single large value of $r_i$.

Proposition 8.4. For almost all $x \in [0,1]$ there exists $N_0 = N_0(x)$ s.t. for all $N \ge N_0$

$$ \sum_{k=1}^{N} r_k = (1 + o(1))\,N \log_2 N + \vartheta \max_{1 \le k \le N} r_k(x) $$

with $\vartheta = \vartheta(N,x) \in [0,1]$.

The proof of this result relies on a simple but interesting lemma.

Lemma 8.5. Let $\delta > 1/2$. Under the above assumptions for almost all $x \in [0,1]$ we can find a number $N_0 = N_0(x)$ s.t. $\forall N \ge N_0$ there is at most one integer $k$ among $\{1, \ldots, N\}$ so that $r_k > N'' \equiv N (\log N)^\delta$.

Proof. Fix $m < n$. A weak form of the mixing property yields

$$ \rho(r_m > N'', r_n > N'') \lesssim \rho(r_m > N'') \cdot \rho(r_n > N'') = (\rho(r_1 > N''))^2 \sim (N'')^{-2} $$

From this, it follows that the measure of the set in which $r_n > N''$ and $r_m > N''$ for some distinct indices $m, n \le 2N$ is of order at most $(\log N)^{-2\delta}$. For $K = 1, 2, \ldots$ let

$$ U_K = \bigcup_{k \ge K} \{ r_m > (2^k)'',\ r_n > (2^k)'' \ \text{for some distinct } m, n \le 2^{k+1} \} $$

Then

$$ \rho(U_K) \lesssim \sum_{k \ge K} k^{-2\delta} \to 0 \quad \text{as } K \to \infty $$

Finally, for $x \notin U_K$ and $N \ge 2^K$ there is at most one index $k \le N$ s.t. $r_k > N''$. $\square$

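Proof of Proposition 8.4. Set $S''_N(E,x) = \sum_{i \le N} r''_i$ where $r''_i = r_i$ if $r_i \le N''$ and $r''_i = 0$ otherwise, and moreover

$$ M''_N := \rho(S''_N), \quad V''_N := \rho((S''_N - M''_N)^2) $$

Reasoning as in the proof of Lemma 7.2 we get that

$$ M''_N \sim N \log N \quad \text{and} \quad V''_N \lesssim N N'' = N^2 (\log N)^{\delta} $$

Let $0 < a < 1$ and $b > 1$ be two numbers to be chosen later and define the sequences $N_k := \exp(k^a)$ and $k^{-b}$. From the above we have

$$ \rho\left( \sum_{k \ge 1} \frac{(S''_{N_k} - M''_{N_k})^2}{N_k N''_k} \, k^{-b} \right) \lesssim \sum_{k \ge 1} k^{-b} < \infty $$

and hence

$$ S''_{N_k} - M''_{N_k} = o\left( \left( \frac{N_k N''_k}{k^{-b}} \right)^{1/2} \right) \quad \text{a.e.} $$

On the other hand we have $N_k N''_k / k^{-b} = o((M''_{N_k})^2)$ since $N_k N''_k / k^{-b} = (N_k \log N_k)^2 \, r_N$ with $r_N = (\log N_k)^{\delta - 2} k^b = o(1)$ provided $a(2 - \delta) > b$. Therefore $S''_{N_k} = (1 + o(1)) M''_{N_k}$ for almost all $x \in [0,1]$. It is moreover easy to see that $M''_{N_{k-1}} / M''_{N_k} \to 1$ as $k \to \infty$ and, for large $N$,

$$ S''_N = (1 + o(1)) M''_N \quad \text{a.e.} $$

The assertion now follows putting together the above and Lemma 8.5, which says that $0 \le S_N - S''_N \le \max_{1 \le i \le N} r_i$. $\square$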

Remark 8.6. It would be of some interest to investigate to what extent the above result can be extended to more general infinite measure preserving dynamical systems (not necessarily with a number theoretical significance) by ergodic theoretical methods.

8.2. Further properties of the CF-digits

A simple consequence of the ergodicity of $([0,1], \rho, G)$ is that, unlike the arithmetic mean of the partial quotients, their geometric as well as harmonic means are well defined $\rho$-a.e.

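Proposition 8.7. Both functions $\log r_1(x)$ and $1/r_1(x)$ belong to $L^1(\rho)$ and we have

\[ \lim_{n \to \infty} \left( \prod_{k=1}^{n} r_k \right)^{1/n} = e^{\rho(\log r_1)} = e^{K_1} \quad \rho\text{-a.e.} \tag{59} \]

and

\[ \lim_{n \to \infty} \frac{n}{\sum_{k=1}^{n} \frac{1}{r_k}} = \frac{1}{\rho(1/r_1)} = \frac{1}{K_2} \quad \rho\text{-a.e.} \tag{60} \]

where the constants $K_1$ and $K_2$ are given by

\[ K_1 = \sum_{k=1}^{\infty} \log k \cdot \log_2\left( 1 + \frac{1}{k(k+2)} \right) \simeq 0.987849 \tag{61} \]

and

\[ K_2 = \sum_{k=1}^{\infty} \frac{1}{k} \cdot \log_2\left( 1 + \frac{1}{k(k+2)} \right) \simeq 0.572935 \tag{62} \]

respectively.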

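Proof. We have

\[
\begin{aligned}
\rho(\log r_1) &= \sum_{k=1}^{\infty} \log k \cdot \rho(r_1 = k) \\
&= \sum_{k=1}^{\infty} \frac{\log k}{\log 2} \cdot \log\left( \Big(1 + \frac{1}{k}\Big)\Big(1 + \frac{1}{k+1}\Big)^{-1} \right) \\
&= \sum_{k=1}^{\infty} \log k \cdot \log_2\left( 1 + \frac{1}{k(k+2)} \right) = K_1 < \infty.
\end{aligned}
\]

This computation shows at the same time that $\log r_1 \in L^1(\rho)$ and the last identity of (59). The first identity of (59) follows from $r_k(x) = r_1(G^{k-1}(x))$ along with the ergodic theorem and the ergodicity of $([0,1], \rho, G)$. In a similar way one proves the second property, noting that

\[ \rho(1/r_1) = \sum_{k=1}^{\infty} \frac{\rho(r_1 = k)}{k} = \sum_{k=1}^{\infty} \frac{1}{k} \cdot \log_2\left( 1 + \frac{1}{k(k+2)} \right) = K_2 < \infty. \qquad \square \]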
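The two series (61) and (62) are easy to evaluate by truncation. The sketch below is only an illustration: the cut-off `terms` is an arbitrary choice of this example, and the first series converges slowly (its tail is of order $\log N / N$), so only a few decimals should be trusted. The value $e^{K_1}$ is the classical Khinchin constant, approximately 2.68545.

```python
import math

def digit_probability(k):
    """Gauss measure of {r_1 = k}: log2(1 + 1/(k(k+2)))."""
    return math.log1p(1.0 / (k * (k + 2))) / math.log(2.0)

def khinchin_sums(terms=200_000):
    """Truncated versions of the series (61) and (62)."""
    k1 = sum(math.log(k) * digit_probability(k) for k in range(1, terms + 1))
    k2 = sum(digit_probability(k) / k for k in range(1, terms + 1))
    return k1, k2

if __name__ == "__main__":
    k1, k2 = khinchin_sums()
    print("K1 ~", k1, " geometric mean ~", math.exp(k1))
    print("K2 ~", k2, " harmonic mean  ~", 1.0 / k2)
```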

8.3. Fast and slow convergents

Let us briefly recall some well known facts about continued fractions (see [9] or [18] for more information).

For $x = [r_1, r_2, \dots]$ irrational one can construct recursively a sequence $p_n/q_n$ of rational approximants of $x$ as

\[ \frac{p_0}{q_0} = \frac{0}{1}, \qquad \frac{p_1}{q_1} = \frac{1}{r_1} \qquad \text{and} \qquad \frac{p_n}{q_n} = \frac{r_n p_{n-1} + p_{n-2}}{r_n q_{n-1} + q_{n-2}}, \quad n \ge 2. \tag{63} \]

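One can write this recursion in matrix form as follows: set

\[ A := \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \quad \text{and} \quad B := \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \tag{64} \]

and note that $B A^{k-1} = \begin{pmatrix} k & 1 \\ 1 & 0 \end{pmatrix}$. Then

\[ \begin{pmatrix} p_1 & p_0 \\ q_1 & q_0 \end{pmatrix} = A^{r_1} \quad \text{and} \quad \begin{pmatrix} p_n & p_{n-1} \\ q_n & q_{n-1} \end{pmatrix} = A^{r_1} B A^{r_2 - 1} \cdots B A^{r_n - 1}, \quad n \ge 2. \tag{65} \]

Moreover, a short manipulation of (63) gives $q_{n+1} p_n - q_n p_{n+1} = -(q_n p_{n-1} - q_{n-1} p_n)$. Since $q_1 p_0 - q_0 p_1 = -1$ one obtains inductively the Lagrange formula

\[ q_n p_{n-1} - q_{n-1} p_n = (-1)^n, \quad n \ge 1. \tag{66} \]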
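The recursion (63), its matrix form (65) and the Lagrange formula (66) can be cross-checked on a finite list of digits. The following is a minimal sketch with hypothetical helper names; the test digits $[1,2,1,1,4]$ are the first partial quotients of $e - 2$.

```python
def convergents(digits):
    """[(p_0, q_0), (p_1, q_1), ..., (p_n, q_n)] from recursion (63)."""
    p_prev, q_prev = 0, 1              # p_0/q_0 = 0/1
    p, q = 1, digits[0]                # p_1/q_1 = 1/r_1
    out = [(p_prev, q_prev), (p, q)]
    for r in digits[1:]:
        p_prev, p = p, r * p + p_prev
        q_prev, q = q, r * q + q_prev
        out.append((p, q))
    return out

def mat_mul(x, y):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def mat_pow(x, e):
    """x**e for a 2x2 matrix and an integer e >= 0."""
    result = ((1, 0), (0, 1))
    for _ in range(e):
        result = mat_mul(result, x)
    return result

A = ((1, 0), (1, 1))
B = ((1, 1), (1, 0))

def matrix_form(digits):
    """Right-hand side of (65): A^{r_1} B A^{r_2 - 1} ... B A^{r_n - 1}."""
    m = mat_pow(A, digits[0])
    for r in digits[1:]:
        m = mat_mul(m, mat_mul(B, mat_pow(A, r - 1)))
    return m

if __name__ == "__main__":
    digits = [1, 2, 1, 1, 4]
    print(convergents(digits))
    print(matrix_form(digits))         # ((p_n, p_{n-1}), (q_n, q_{n-1}))
```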

Another useful formula which can be easily obtained from (63) is the following:

\[ \Big[ r_1, r_2, \dots, r_{n-1} + \frac{1}{r} \Big] = \frac{r p_{n-1} + p_{n-2}}{r q_{n-1} + q_{n-2}}, \quad n \ge 2, \ r \ge 1. \tag{67} \]

In particular, for $r = r_n$ one gets

\[ [r_1, r_2, \dots, r_n] = \frac{p_n}{q_n}, \quad n \ge 1. \tag{68} \]

The numbers $p_n/q_n$ are called fast convergents (FC) of $x$ and it turns out that the $n$th FC $p_n/q_n$ is the best rational approximation to $x$ whose denominator does not exceed $q_n$ (see, e.g., [9], Ch. X). One also sees that

\[ \frac{p_{2n}}{q_{2n}} < x < \frac{p_{2n-1}}{q_{2n-1}}, \quad \forall n > 0. \tag{69} \]

On the other hand, letting $r$ range as $1 \le r \le r_n$ we get the numbers

\[ \frac{t_{1,r}}{s_{1,r}} := \frac{1}{r}, \qquad \frac{t_{n,r}}{s_{n,r}} := \frac{r p_{n-1} + p_{n-2}}{r q_{n-1} + q_{n-2}}, \quad n \ge 2, \tag{70} \]

which are called the slow convergents (SC) for the real number $x \in [0,1)$.

In matrix notation, the SC's can be expressed in terms of intermediate products in (65) for $n \ge 1$ as

\[ \begin{pmatrix} t_{n,r} & p_{n-1} \\ s_{n,r} & q_{n-1} \end{pmatrix} = A^{r_1} B A^{r_2 - 1} \cdots B A^{r_{n-1} - 1} B A^{r - 1}, \quad 1 \le r \le r_n. \tag{71} \]

The algorithm which produces the sequence of SC's of a given real number is called slow continued fraction algorithm (see, e.g., [3]).

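Example. Let $x = e - 2 = [1,2,1,1,4,1,1,6,\dots]$. The first five FC's are

\[
\begin{aligned}
n = 1: &\quad \frac{p_1}{q_1} = \frac{1}{1} \\
n = 2: &\quad \frac{p_2}{q_2} = \cfrac{1}{1 + \cfrac{1}{2}} = \frac{2}{3} \\
n = 3: &\quad \frac{p_3}{q_3} = \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1}}} = \frac{3}{4} \\
n = 4: &\quad \frac{p_4}{q_4} = \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{1}}}} = \frac{5}{7} \\
n = 5: &\quad \frac{p_5}{q_5} = \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{4}}}}} = \frac{23}{32}
\end{aligned}
\]

On the other hand, within the same accuracy, there are $1 + 2 + 1 + 1 + 4 = 9$ SC's. With the indexing of (70) they are

\[
\begin{aligned}
n = 1,\ r = 1: &\quad \frac{t_{1,1}}{s_{1,1}} = \frac{1}{1} \\
n = 2,\ r = 1: &\quad \frac{t_{2,1}}{s_{2,1}} = \frac{p_1 + p_0}{q_1 + q_0} = \frac{1}{2} \\
n = 2,\ r = 2: &\quad \frac{t_{2,2}}{s_{2,2}} = \frac{2p_1 + p_0}{2q_1 + q_0} = \frac{2}{3} \\
n = 3,\ r = 1: &\quad \frac{t_{3,1}}{s_{3,1}} = \frac{p_2 + p_1}{q_2 + q_1} = \frac{3}{4} \\
n = 4,\ r = 1: &\quad \frac{t_{4,1}}{s_{4,1}} = \frac{p_3 + p_2}{q_3 + q_2} = \frac{5}{7} \\
n = 5,\ r = 1: &\quad \frac{t_{5,1}}{s_{5,1}} = \frac{p_4 + p_3}{q_4 + q_3} = \frac{8}{11} \\
n = 5,\ r = 2: &\quad \frac{t_{5,2}}{s_{5,2}} = \frac{2p_4 + p_3}{2q_4 + q_3} = \frac{13}{18} \\
n = 5,\ r = 3: &\quad \frac{t_{5,3}}{s_{5,3}} = \frac{3p_4 + p_3}{3q_4 + q_3} = \frac{18}{25} \\
n = 5,\ r = 4: &\quad \frac{t_{5,4}}{s_{5,4}} = \frac{4p_4 + p_3}{4q_4 + q_3} = \frac{23}{32}
\end{aligned}
\]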
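The lists in this example can be reproduced mechanically. The sketch below is one possible implementation, not the algorithm of [3]: it uses the bookkeeping convention $(p_{-1}, q_{-1}) = (1, 0)$, which is an assumption of this illustration chosen so that the case $n = 1$ of (70) needs no special treatment.

```python
from fractions import Fraction

def fast_and_slow_convergents(digits):
    """Return (FC list, SC list) for the finite digit list [r_1, ..., r_n]."""
    p_old, q_old = 1, 0        # (p_{-1}, q_{-1}): convention of this sketch
    p, q = 0, 1                # (p_0, q_0)
    fast, slow = [], []
    for r_n in digits:
        # slow convergents (70): r runs from 1 to r_n
        for r in range(1, r_n + 1):
            slow.append(Fraction(r * p + p_old, r * q + q_old))
        # fast convergent (63): the last slow convergent of the block
        p_old, p = p, r_n * p + p_old
        q_old, q = q, r_n * q + q_old
        fast.append(Fraction(p, q))
    return fast, slow

if __name__ == "__main__":
    fast, slow = fast_and_slow_convergents([1, 2, 1, 1, 4])
    print("FC:", [str(f) for f in fast])
    print("SC:", [str(s) for s in slow])
```

Each block of slow convergents ends at a fast convergent, which is why every FC also appears in the SC list.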

We now need some notions.

Definition 8.8. The Farey sum over two rationals $\frac{a}{b}$ and $\frac{a'}{b'}$ is the mediant operation given by

\[ \frac{a}{b} \oplus \frac{a'}{b'} := \frac{a + a'}{b + b'} = \frac{a''}{b''}. \tag{72} \]

It is easy to see that $\frac{a''}{b''}$ falls in the interval $\big(\frac{a}{b}, \frac{a'}{b'}\big)$. We say that $\frac{a}{b}$ and $\frac{a'}{b'}$ are Farey neighbours if $ab' - a'b = \pm 1$. Two Farey neighbours define a Farey interval and each Farey interval can be labelled uniquely according to the mediant (child) $\frac{a''}{b''} = \frac{a + a'}{b + b'}$ of the neighbours.

Observe that given a pair of consecutive SC's, say

\[ \frac{t_{n,r}}{s_{n,r}} = \frac{r p_{n-1} + p_{n-2}}{r q_{n-1} + q_{n-2}} \quad \text{and} \quad \frac{t_{n,r+1}}{s_{n,r+1}} = \frac{(r+1) p_{n-1} + p_{n-2}}{(r+1) q_{n-1} + q_{n-2}} \]

for some $n \ge 2$ and $1 \le r < r_n$, we have

\[ \frac{t_{n,r+1}}{s_{n,r+1}} = \frac{t_{n,r}}{s_{n,r}} \oplus \frac{p_{n-1}}{q_{n-1}}. \tag{73} \]

Moreover

\[ q_{n-1} t_{n,r} - p_{n-1} s_{n,r} = q_{n-1} p_{n-2} - p_{n-1} q_{n-2} = (-1)^{n-1} \tag{74} \]

by Lagrange's formula. Therefore, for every $n \ge 1$, each SC $\frac{t_{n,r}}{s_{n,r}}$ for $r = 1, \dots, r_n$ is a Farey neighbour of $\frac{p_{n-1}}{q_{n-1}}$, the corresponding Farey interval getting smaller and smaller as $r$ increases. More precisely, using again Lagrange's formula, one easily obtains

\[ \left| \frac{p_{n-1}}{q_{n-1}} - \frac{r p_{n-1} + p_{n-2}}{r q_{n-1} + q_{n-2}} \right| = \frac{1}{q_{n-1}(r q_{n-1} + q_{n-2})}. \tag{75} \]

We therefore see that the SC $\frac{t_{n,r}}{s_{n,r}}$ is the best one-sided rational approximation to $x$ whose denominator does not exceed $s_{n,r}$ (although, if $r < r_n$, there might be a FC with denominator less than $s_{n,r}$ and closer to $x$ on the other side of $x$). Increasing $r$, once we arrive at $r = r_n$ we hit a new FC on the current side of $x$, closer than the previous FC.
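Relations (73), (74) and (75) can be checked on the last block of slow convergents of $e - 2$, namely $8/11, 13/18, 18/25, 23/32$, all of which should be Farey neighbours of $p_4/q_4 = 5/7$. This is a small sketch with hypothetical helper names.

```python
from fractions import Fraction

def mediant(x, y):
    """Farey sum of two fractions given in lowest terms."""
    return Fraction(x.numerator + y.numerator, x.denominator + y.denominator)

def are_farey_neighbours(x, y):
    """True when a*b' - a'*b = +1 or -1 for x = a/b and y = a'/b'."""
    return abs(x.numerator * y.denominator - y.numerator * x.denominator) == 1

if __name__ == "__main__":
    pivot = Fraction(5, 7)                       # p_4 / q_4 for e - 2
    block = [Fraction(8, 11), Fraction(13, 18), Fraction(18, 25), Fraction(23, 32)]
    for current, following in zip(block, block[1:]):
        print(current, "(+)", pivot, "=", mediant(current, pivot), "expected", following)
    for sc in block:
        print(sc, are_farey_neighbours(sc, pivot), abs(pivot - sc))
```

Note that `Fraction` always stores reduced fractions, so `mediant` as written is only meaningful for inputs in lowest terms, which is the case for convergents.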


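Remark 8.9. The set $\mathcal{F}_\ell$ of Farey fractions of order $\ell$ is the set of irreducible fractions in $[0,1]$ with denominator $\le \ell$, listed in order of magnitude (see [2]). Thus,

\[ \mathcal{F}_1 = \Big\{ \frac{0}{1}, \frac{1}{1} \Big\}, \quad \mathcal{F}_2 = \Big\{ \frac{0}{1}, \frac{1}{2}, \frac{1}{1} \Big\}, \quad \mathcal{F}_3 = \Big\{ \frac{0}{1}, \frac{1}{3}, \frac{1}{2}, \frac{2}{3}, \frac{1}{1} \Big\}, \quad \mathcal{F}_4 = \Big\{ \frac{0}{1}, \frac{1}{4}, \frac{1}{3}, \frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \frac{1}{1} \Big\} \]

and so on. In particular $|\mathcal{F}_\ell| - 2 = \sum_{k=2}^{\ell} \varphi(k) \sim \frac{3\ell^2}{\pi^2}$ with Euler totient function $\varphi(k) = |\{0 < i \le k : \gcd(i,k) = 1\}|$. Then we see that each $\frac{t_{n,r}}{s_{n,r}}$ for $r = 1, \dots, r_n$ is consecutive to $\frac{p_{n-1}}{q_{n-1}}$ in $\mathcal{F}_\ell$ for $s_{n,r} \le \ell < s_{n,r+1}$.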
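The Farey sets of this remark are small enough to enumerate directly. The following brute-force sketch (hypothetical function names, no attempt at efficiency) lists $\mathcal{F}_\ell$, counts its interior fractions against the totient sum taken from $k = 2$, and compares with the asymptotic value $3\ell^2/\pi^2$.

```python
import math
from fractions import Fraction

def farey(order):
    """Sorted list of all reduced fractions in [0, 1] with denominator <= order."""
    return sorted({Fraction(a, b) for b in range(1, order + 1) for a in range(b + 1)})

def totient(k):
    """Euler totient: number of 0 < i <= k with gcd(i, k) = 1."""
    return sum(1 for i in range(1, k + 1) if math.gcd(i, k) == 1)

if __name__ == "__main__":
    print([str(f) for f in farey(4)])
    order = 100
    interior = len(farey(order)) - 2
    print(interior, sum(totient(k) for k in range(2, order + 1)),
          3 * order ** 2 / math.pi ** 2)
```

Consecutive elements of each $\mathcal{F}_\ell$ are Farey neighbours in the sense of Definition 8.8, which the enumeration also allows one to verify.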

8.4. Growth of denominators

From the recursion (63) one readily realizes that the FC's denominators grow at least exponentially:

\[ q_n \ge 2^{(n-1)/2}. \tag{76} \]

On the other hand we may expect the growth of the SC's denominators to be subexponential. To understand this better we can reason as follows.

First, using Proposition 8.2 we can write $x = [r_1, r_2, \dots, r_n + G^n(x)]$ or else

\[ x = \frac{(G^n(x))^{-1} p_n + p_{n-1}}{(G^n(x))^{-1} q_n + q_{n-1}}. \tag{77} \]

From this we obtain at once

\[ G^n(x) = -\frac{q_n x - p_n}{q_{n-1} x - p_{n-1}} = \frac{f_n}{f_{n-1}} \tag{78} \]

with $f_n := (-1)^n (q_n x - p_n) > 0$, so that these numbers satisfy

\[ f_n = \prod_{k=0}^{n} G^k(x). \tag{79} \]

Thus, by the ergodic theorem we have that for $\rho$-almost all $x \in [0,1]$, and then almost everywhere,

\[ \lim_{n \to \infty} \frac{1}{n} \log f_n = \int_0^1 \log x \, \rho(dx) = -\frac{\pi^2}{12 \log 2}. \tag{80} \]

Since $[(G^n(x))^{-1}] = r_{n+1}$, so that $r_{n+1} < (G^n(x))^{-1} < r_{n+1} + 1$, another consequence of (77) is that

\[ \frac{1}{r_{n+1} + 2} < \frac{q_n}{q_n + q_{n+1}} < q_n f_n < \frac{q_n}{q_{n+1}} < \frac{1}{r_{n+1}} \tag{81} \]

and therefore using (78)

\[ \frac{1}{2} < q_n f_{n-1} < 1. \tag{82} \]

Putting together (80) and (82) we get a classical theorem of Lévy (see [18])

\[ \frac{\log q_n}{n} \to \frac{\pi^2}{12 \log 2} \quad \text{a.e.} \tag{83} \]
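Lévy's theorem (83) and the lower bound (76) lend themselves to a quick numerical experiment. The sketch below makes the same assumption as any such experiment must: a "typical" $x$ is replaced by a pseudo-random rational with a very large denominator (20000 bits here, an arbitrary choice), whose first 2000 digits are computed exactly by Euclid's algorithm. The function names and the convention $(q_{-1}, q_0) = (0, 1)$ are choices of this illustration.

```python
import math
import random

LEVY = math.pi ** 2 / (12 * math.log(2))   # limit value in (83)

def denominators(p, q, n):
    """Denominators q_1, ..., q_n of the fast convergents of p/q in (0, 1)."""
    q_old, q_cur = 0, 1                    # (q_{-1}, q_0): bookkeeping convention
    out = []
    for _ in range(n):
        if p == 0:
            raise ValueError("expansion ended before n digits were produced")
        r, rem = divmod(q, p)              # next partial quotient
        p, q = rem, p                      # apply the Gauss map exactly
        q_old, q_cur = q_cur, r * q_cur + q_old
        out.append(q_cur)
    return out

def levy_ratio(n=2000, bits=20000, seed=1):
    """Return (log q_n / n, [q_1, ..., q_n]) for a pseudo-random point."""
    rng = random.Random(seed)
    p = rng.getrandbits(bits) | 1
    qs = denominators(p, 1 << bits, n)
    return math.log(qs[-1]) / n, qs

if __name__ == "__main__":
    ratio, _ = levy_ratio()
    print("log(q_n)/n =", ratio, " Levy constant =", LEVY)
```

Fluctuations of $\log q_n / n$ around the limit are of order $n^{-1/2}$, so agreement to one or two decimals is all that should be expected at this sample size.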

Let moreover

\[ \frac{t_m}{s_m} \equiv \frac{t_{n,r}}{s_{n,r}} \quad \text{with} \quad m = \sum_{i=1}^{n-1} r_i + r \]

be the $m$th SC. Its denominator satisfies $q_{n-1} < s_m \le q_n$. Combining the above with Proposition 8.3 one gets the following

Proposition 8.10.

\[ \frac{\log s_m}{m} \sim \frac{\pi^2}{12 \log m} \quad \text{in probability.} \tag{84} \]

Of course there are special behaviours: take $x = (\sqrt{5} - 1)/2 = [1,1,1,\dots]$, then $s_n = q_n$ and both are equal to the $n$th Fibonacci number. Hence $n^{-1} \log q_n$ converges to $\log x^{-1}$.

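Remark 8.11. Recall that the number $\pi^2/(6 \log 2) =: h_\rho(G)$ is but the entropy of $([0,1], \rho, G)$, which satisfies

\[
\begin{aligned}
h_\rho(G) &= \int_0^1 \log |G'(x)| \, \rho(dx) = \int_0^1 \log |G'(x)| \, h(x) \, dx \\
&= \sum_{n=1}^{\infty} \int_{A_n} \log |G_n'(x)| \, h(x) \, dx \\
&= \sum_{n=1}^{\infty} \int_{A_n} \sum_{j=0}^{n-1} \log |F'(F_0^j(x))| \, h(x) \, dx \\
&= \sum_{k=0}^{\infty} \int_0^{1/(k+1)} \log |F'(F_0^k(x))| \, h(x) \, dx \\
&= \int_0^1 \log |F'(x)| \sum_{k=0}^{\infty} h(W_0^k(x)) \cdot (W_0^k)'(x) \, dx \\
&= \int_0^1 \log |F'(x)| \, e(x) \, dx = \int_0^1 \log |F'(x)| \, \mu(dx).
\end{aligned}
\]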

8.5. The scaling rate of the Farey map

We now consider the scaling rate $s_n(E)$ as defined in (22) for $([0,1], \mu, F)$ and $E \in \mathcal{B}$ belonging to the family of very good sets

\[ \mathcal{B}_+ := \bigcup_{\epsilon > 0} \{ E \in \mathcal{B} : \mu(E) > 0, \ E \subseteq [0,1] \setminus (0, \epsilon) \}. \tag{85} \]

From (24) and (48) we have that for all $E \in \mathcal{B}_+$

\[ \sum_{k=0}^{n-1} s_k(E) \sim \frac{n}{\log_2 n}. \tag{86} \]

To obtain more information one can proceed as in [12] by constructing a Markov approximation of $([0,1], \mu, F)$, to which renewal theory can be applied, to get ([12], Theorem 10.9)

Theorem 8.12. There is a constant $C > 0$ s.t. for all $E \in \mathcal{B}_+$ we have

\[ s_n(E) := \frac{\mu(F^{-n}E \cap E)}{(\mu(E))^2} \sim \frac{C}{\log n} \quad \text{as } n \to \infty. \tag{87} \]

In what follows we shall obtain this result for the inducing set $E = [1/2, 1)$, following a more direct argument inspired by [11] (see also [16,17] for related results). Since $\mu(E) = 1$ we can write

\[ s_n(E) = \int_E 1_{F^{-n}E}(x) \, \mu(dx) = \int_E (\widehat{T}^n 1_E)(x) \, \mu(dx) \tag{88} \]

where $\widehat{T}$ is the operator defined in (46). Setting

\[ \phi_n(x) := (\widehat{T}^n 1_E)(x), \quad n \ge 0, \tag{89} \]

we have

Lemma 8.13. For all $n \ge 1$ the function $\phi_n : [0,1] \to [0,1/2]$ is positive, strictly increasing and concave. Moreover $\phi_{n+1}(x) < \phi_n(x)$ for all $x \in E$ and $n \ge 0$.

[Figure: $\phi_n(x)$ for $x \in E$ and $n = 1, \dots, 5$.]

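Proof. We have the recursion

\[ \phi_{n+1}(x) = \frac{1}{1+x} \, \phi_n\Big( \frac{x}{1+x} \Big) + \frac{x}{1+x} \, \phi_n\Big( \frac{1}{1+x} \Big) \tag{90} \]

from which we see that $\phi_{n+1}(1) = \phi_n(1/2)$. Differentiating twice we get

\[ \phi'_{n+1}(x) = \frac{\phi'_n\big(\frac{x}{1+x}\big) - x \, \phi'_n\big(\frac{1}{1+x}\big)}{(x+1)^3} + \frac{\phi_n\big(\frac{1}{1+x}\big) - \phi_n\big(\frac{x}{1+x}\big)}{(x+1)^2} \]

and

\[ \phi''_{n+1}(x) = \frac{\phi''_n\big(\frac{x}{1+x}\big) + x \, \phi''_n\big(\frac{1}{1+x}\big)}{(x+1)^5} - 2 \, \frac{(1-x) \, \phi'_n\big(\frac{1}{1+x}\big) + 2 \, \phi'_n\big(\frac{x}{1+x}\big)}{(x+1)^4} + 2 \, \frac{\phi_n\big(\frac{x}{1+x}\big) - \phi_n\big(\frac{1}{1+x}\big)}{(x+1)^3}. \]

The first assertion now follows easily by induction, since for $n = 1$ we have

\[ \phi_1(x) = \frac{x}{x+1} > 0, \qquad \phi'_1(x) = \frac{1}{(x+1)^2} > 0, \qquad \phi''_1(x) = -\frac{2}{(x+1)^3} < 0. \]

Also note that for $n > 1$ we have $\phi'_n(1) = 0$. The strict monotonicity of $\phi_n$ for $n = 0, 1, \dots$ follows by observing that by (90) $\phi_{n+1}(x)$ is a convex combination of $\phi_n\big(\frac{x}{1+x}\big)$ and $\phi_n\big(\frac{1}{1+x}\big)$, $\phi_n(x)$ being strictly increasing and concave. Therefore $\phi_{n+1}(x) \le \phi_n\big(2x/(1+x)^2\big) < \phi_n(x)$ provided $x > \sqrt{2} - 1$. □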

The above result yields that for $E = [1/2, 1)$ the sequence $s_n(E)$ is strictly decreasing. Now one can either use directly (86) or apply a Tauberian theorem for power series (see, e.g., [5], p. 40) to the function $P(s)$ introduced in Section 4 (with $z = e^{-s}$), to get

\[ s_n(E) \sim \frac{1}{\log_2 n} \quad \text{as } n \to \infty. \tag{91} \]
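The recursion (90) gives a direct, if exponentially expensive, way to evaluate $s_n(E)$ for small $n$. The sketch below starts from $\phi_1(x) = x/(x+1)$ and integrates over $E = [1/2, 1)$ with a midpoint rule. It assumes that $\mu$ has density $1/(x \log 2)$ on $E$, which is the normalisation compatible with $\mu(E) = 1$ for the density $1/x$ usually associated with the Farey map; this density is an assumption of the illustration, as are the function names and the number of quadrature nodes.

```python
import math

def phi(n, x):
    """phi_n(x) from recursion (90), started at phi_1(x) = x / (1 + x)."""
    if n == 1:
        return x / (1.0 + x)
    u = x / (1.0 + x)
    v = 1.0 / (1.0 + x)
    return (phi(n - 1, u) + x * phi(n - 1, v)) / (1.0 + x)

def scaling_rate(n, nodes=400):
    """Midpoint-rule estimate of s_n(E), the integral of phi_n over E against mu."""
    h = 0.5 / nodes
    total = 0.0
    for i in range(nodes):
        x = 0.5 + (i + 0.5) * h
        total += phi(n, x) / (x * math.log(2.0))   # assumed density of mu on E
    return total * h

if __name__ == "__main__":
    for n in range(1, 9):
        s_n = scaling_rate(n)
        print(n, s_n, s_n * math.log2(n) if n > 1 else None)
```

Under this assumption the first two values can be computed in closed form, $s_1(E) = \log_2(4/3)$ and $s_2(E) = \log_2(5/4)$, which gives a check on the quadrature. The cost doubles with each $n$, so the slow logarithmic decay in (91) cannot be observed convincingly this way.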

Finally, let us briefly dwell on the number theoretical significance of this result. A short reflection using Proposition 8.2 shows that

\[ B_n := F^{-n}E \cap E = \Big\{ [1, r_2, \dots] \in [0,1] : \sum_{i=2}^{k} r_i = n \text{ for some } k \in \mathbb{N} \Big\}. \tag{92} \]

Now note that

\[ F^{-r}B_n \cap \{p > r\} = \Big\{ [r+1, r_2, \dots] \in [0,1] : \sum_{i=2}^{k} r_i = n \text{ for some } k \in \mathbb{N} \Big\}. \]

Hence, using Lemma 3.1 and the $G$-invariance of the probability measure $\rho$, we have

\[
\begin{aligned}
\mu(B_n) &= \sum_{r=1}^{\infty} \rho\Big( \Big\{ [r, r_2, \dots] \in [0,1] : \sum_{i=2}^{k} r_i = n \text{ for some } k \in \mathbb{N} \Big\} \Big) \\
&= \rho\Big( \Big\{ [r_1, r_2, \dots] \in [0,1] : \sum_{i=2}^{k} r_i = n \text{ for some } k \in \mathbb{N} \Big\} \Big) \\
&= \rho\Big( \Big\{ [r_1, r_2, \dots] \in [0,1] : \sum_{i=1}^{k} r_i = n \text{ for some } k \in \mathbb{N} \Big\} \Big).
\end{aligned}
\]

In other words, the scaling rate $s_n(E)$ is but the $\rho$-probability of the sum-level sets:

\[ C_n := \Big\{ [r_1, r_2, \dots] \in [0,1] : \sum_{i=1}^{k} r_i = n \text{ for some } k \in \mathbb{N} \Big\}. \tag{93} \]

Direct inspection shows that $\liminf_n C_n$ is equal to the set of all noble numbers, i.e. those whose infinite continued fraction expansion terminates with an infinite block of 1's. On the other hand, $\limsup_n C_n$ is the set of all irrational numbers in $[0,1]$ (see [17]). For further results on the statistics of the continued fraction digit sum see [10,16]. Finally, from Remark 5.3, it follows that $s_n(E)$ can also be interpreted as the mean density of returns in $E$ with the map $F$.
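For small $n$ the Gauss measure of the sum-level set $C_n$ in (93) can be computed exactly up to floating-point rounding, since $C_n$ is a disjoint union of the $2^{n-1}$ cylinders $[r_1, \dots, r_k]$ with $r_1 + \dots + r_k = n$. The sketch below enumerates these compositions; it uses the standard facts that such a cylinder is the interval with endpoints $p_k/q_k$ and $(p_k + p_{k-1})/(q_k + q_{k-1})$ and that $\rho((a,b)) = \log_2\frac{1+b}{1+a}$. Function names and the convention $(p_{-1}, q_{-1}) = (1, 0)$ are choices of this illustration.

```python
import math
from fractions import Fraction

def compositions(n):
    """Yield all ordered tuples of positive integers with sum n."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

def cylinder(digits):
    """Endpoints (lo, hi) of the set of x whose expansion starts with `digits`."""
    p_old, q_old, p, q = 1, 0, 0, 1        # (p_{-1}, q_{-1}) and (p_0, q_0)
    for r in digits:
        p_old, p = p, r * p + p_old
        q_old, q = q, r * q + q_old
    a, b = Fraction(p, q), Fraction(p + p_old, q + q_old)
    return min(a, b), max(a, b)

def gauss_measure(lo, hi):
    """Gauss measure of the interval (lo, hi)."""
    return math.log2(float((1 + hi) / (1 + lo)))

def rho_sum_level(n):
    """Gauss measure of the sum-level set C_n."""
    return sum(gauss_measure(*cylinder(c)) for c in compositions(n))

if __name__ == "__main__":
    for n in range(1, 13):
        print(n, rho_sum_level(n))
```

According to the identity derived above, these numbers should coincide with the scaling rates $s_n(E)$ for $E = [1/2, 1)$; the first three are $\log_2(4/3)$, $\log_2(5/4)$ and $\log_2(49/40)$.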

References

[1] Aaronson J. An introduction to infinite ergodic theory. Mathematics surveys and monographs, vol. 50. Providence, RI: AMS; 1997.

[2] Apostol T. Modular functions and Dirichlet series in number theory. Graduate texts in mathematics, vol. 41. New York: Springer; 1976.

[3] Appelgate H, Onishi H. The slow continued fraction algorithm via 2 × 2 integer matrices. Amer Math Monthly 1983;90:443–55.

[4] Billingsley P. Ergodic theory and information. New York: Wiley; 1965.

[5] Bingham NH, Goldie CM, Teugels JL. Regular variation. Encyclopedia of mathematics and its applications, vol. 27. Cambridge University Press; 1987.

[6] Campanino M, Isola S. Statistical properties of long return times in type I intermittency. Forum Math 1995;7:331–48.

[7] Campanino M, Isola S. Infinite invariant measures for non-uniformly expanding transformations of [0,1]: weak law of large numbers with anomalous scaling. Forum Math 1996;8:71–92.

[8] Diamond HG, Vaaler JD. Estimates for partial sums of continued fraction partial quotients. Pacific J Math 1986;122:73–82.

[9] Hardy GH, Wright EM. An introduction to the theory of numbers. Oxford: Oxford University Press; 1980.

[10] Hensley D. The statistic of the continued fraction digit sum. Pacific J Math 2000;192:103–20.

[11] Isola S. Renewal sequences and intermittency. J Stat Phys 1999;97:263–80.

[12] Isola S. On systems with finite ergodic degree. Far East J Dyn Syst 2003;5:1–62.

[13] Lenci M. On infinite-volume mixing. Commun Math Phys 2010;298(2):485–514.

[14] Manneville P. Intermittency, self similarity and 1/f spectrum in dissipative dynamical systems. J Phys 1980;41:1235–43.

[15] Kac M. On the notion of recurrence in discrete stochastic processes. Bull AMS 1947;53:1002–10.

[16] Kesseböhmer M, Slassi M. A distributional limit law for the continued fraction digit sum. Math Nachr 2008;281(9):1294–306.

[17] Kesseböhmer M, Stratmann BO. On the Lebesgue measure of sum-level sets for continued fractions. arXiv:org/pdf.

[18] Khinchin AYa. Continued fractions. The University of Chicago Press; 1964.

[19] Petersen K. Ergodic theory. Cambridge University Press; 1991.

[20] Lin M. Mixing for Markov operators. Z Wahrsc Verw Geb 1971;19:231–42.

[21] Poincaré H. Sur le problème des trois corps et les équations de la Dynamique. Acta Math 1890;13:1–270.

[22] Prellberg T, Slawny J. Maps of intervals with indifferent fixed points: thermodynamic formalism and phase transitions. J Stat Phys 1992;66:503–14.

[23] Sevast'yanov BA. Renewal theory. J Soviet Math 1975;4(3).

[24] Thaler M. Infinite ergodic theory. Lecture notes, CIRM residential session. The dynamic odyssey; 2001. Available at <http://www.sbg.ac.at/mat/staff/thaler>.

[25] Thaler M. Transformations on [0,1] with infinite invariant measures. Isr J Math 1983;46:67–96.

[26] Zweimueller R. Surrey notes on infinite ergodic theory. Notes accompanying a course on Infinite Ergodic Theory at the LMS Graduate School on Ergodic Theory at the University of Surrey, UK; 16th–19th March 2009. Available at <http://homepage.univie.ac.at/roland.zweimueller>.
