steven n. evans david steinsaltz kenneth w. wachter...chapter 4. mutation-selection-recombination in...

A mutation-selection model with recombination

for general genotypes

Steven N. Evans

David Steinsaltz

Kenneth W. Wachter

Author address:

Steven N. Evans, Department of Statistics, 367 Evans Hall, Uni-versity of California, Berkeley, CA 94720-3860, U.S.A.

E-mail address: [email protected]

David Steinsaltz, Department of Statistics, 1 South Parks Road,Oxford, OX1 3TG, UNITED KINGDOM


Kenneth W. Wachter, Department of Demography, 2232 PiedmontAvenue, University of California, Berkeley, CA 94720-2120, U.S.A.


Contents

Chapter 1. Introduction 11.1. Informal description of the limit model 31.2. Example I: Mutation counting 71.3. Example II: Polynomial selective costs 91.4. Example III: Demographic selective costs 101.5. Comments on the literature 111.6. Overview of the remainder of the work 12

Chapter 2. Definition, existence, and uniqueness of the dynamical system 132.1. Spaces of measures 132.2. Definition of the dynamical system 142.3. Existence and uniqueness of solutions 152.4. Lemmas used in the proof of existence and uniqueness 16

Chapter 3. Equilibria 213.1. Introductory example: One-dimensional systems 223.2. Introductory example: Multiplicative selective cost 233.3. Frechet derivatives 263.4. Existence of equilibria via perturbation 273.5. Concave selective costs 283.6. Iterative computation of the minimal equilibrium 333.7. Stable equilibria in the concave setting via perturbation 333.8. Equilibria for demographic selective costs 35

Chapter 4. Mutation-selection-recombination in discrete time 394.1. Mutation and selection in discrete-time 394.2. Recombination in discrete-time 414.3. The discrete-time model 42

Chapter 5. Hypotheses and statement of the convergence result 435.1. Shattering of point processes 435.2. Statement of the main convergence result 44

Chapter 6. Convergence with complete Poissonization 47

Chapter 7. Preparatory lemmas for the main convergence result 517.1. Consequences of shattering 517.2. Estimates for Radon-Nikodym derivatives 537.3. Palm measures 587.4. Comparisons with complete Poissonization 59

v

vi CONTENTS

Chapter 8. Proof of convergence of the discrete-generation systems 69

Chapter 9. Convergence to Poisson of iterated recombination 75

Appendix A. An expectation approximation 79

Bibliography 81

Abstract

We investigate a continuous time, probability measure valued dynamical sys-tem that describes the process of mutation-selection balance in a context wherethe population is infinite, there may be infinitely many loci, and there are weakassumptions on selective costs. Our model arises when we incorporate very generalrecombination mechanisms into a previous model of mutation and selection fromSteinsaltz, Evans and Wachter (2005) and take the relative strength of mutationand selection to be sufficiently small. The resulting dynamical system is a flowof measures on the space of loci. Each such measure is the intensity measure ofa Poisson random measure on the space of loci: the points of a realization of therandom measure record the set of loci at which the genotype of a uniformly chosenindividual differs from a reference wild type due to an accumulation of ancestralmutations. Our main motivation for working in such a general setting is to provide abasis for understanding mutation-driven changes in age-specific demographic sched-ules that arise from the complex interaction of many genes, and hence to developa framework for understanding the evolution of aging.

Received by the editor October 16, 2009.2000 Mathematics Subject Classification. Primary 60G57, 92D15; Secondary 37N25, 60G55,

92D10.Key words and phrases. measure-valued, dynamical system, population genetics, quasi-

linkage equilibrium, Poisson random measure, Wasserstein metric, Palm measure, shadowing,

stability, attractivity.SNE supported in part by grants DMS-04-05778 and DMS-09-07630 from the National Science

Foundation (U.S.A.).

DR supported by grant K12-AG00981 from the National Institute on Aging (U.S.A.) and aDiscovery Grant from the National Science and Engineering Research Council (Canada).

KWW supported by grant P01-008454 from the National Institute on Aging (U.S.A.) and by

the Miller Institute for Basic Research in Science at U.C. Berkeley.

vii

CHAPTER 1

Introduction

A principal goal of population genetics is to understand how the mechanisms ofmutation, selection and recombination determine the manner in which the distribu-tion of genotypes in a population changes over time. The distribution of genotypesdictates, to a large degree, the distribution of phenotypes, an object that is bothmore readily observable and of more immediate practical interest. Conversely, thedifferential survival of genotypes is mediated almost exclusively by their phenotypicexpression.

More specifically, mutation accumulation models that incorporate the counter-vailing forces of recurring, slightly deleterious mutations and persistent selectionhave been staples of evolutionary theory [Bur00]. In order to use these models toexplain phenomena such as aging that are presumed to result from the combinationof many small mutational impacts, it is necessary to adopt a multilocus perspec-tive. However, as Pletcher and Curtsinger [PC98] point out, early progress in thisarea has relied on simplifying assumptions that are severely limiting and possi-bly unfounded: equal impacts from all mutations, additive effects of mutations onage-specific survival, and the existence of mutations that impact a specific, narrowrange of ages.

Several general multilocus formalisms have been proposed. Notable amongthese is that of [KJB02], which is designed to allow almost any conceivable regimeof selection, mating, linkage, mutation, and phenotypic effects. Such frameworkshave not been exploited for studies of aging, in part because so much detail iscounterproductive when what concerns us is not the fate of any individual allelebut rather the mass of overlapping, age-varying phenotypic effects that are centralto standard theories of aging.

K. Dawson [Daw99] applied a variant of the Kimura-Maruyama rare-allele ap-proximation [KM66, Bur00] (see also [Kon82]) to aging. While this less detailedview of the genome is more amenable to theoretical analysis, it is also not suitedto describing the interacting phenotypic contribution of multiple loci.

In a previous paper [SEW05], we proposed a model, which we also describebriefly here in the development leading to (1.4), that overcame many of these limita-tions. That model leads to computable solutions for mutation-selection equilibria,the hazard functions which such equilibria imply, and the time evolution of thepopulation distribution of genotypes.

In this work, we define and analyze a parallel model that, in essence, incorpo-rates recombination into the model of [SEW05]. The key assumptions behind themodel in this work are:

• the population is infinite,• the genome may consist of infinitely many or even a continuum of loci,• reproduction is sexual, in that each individual has two parents,

1

2 1. INTRODUCTION

• the mechanism of genetic recombination randomly shuffles together thegenomes of the parents in order to obtain the genome of the offspring,

• mating is random and individuals are identical except for their genotypes,so the population at any time can be completely described by a prob-ability measure on the space of possible genotypes — heuristically, theempirical distribution of genotypes in the population or, equivalently, theprobability distribution of the genotype of a randomly sampled individual,

• individuals are haploid — an individual has only one copy of each generather than copies from each of two parents,

• down a lineage, mutant alleles only accumulate — there is no back-mutation to cancel out the alleles introduced by earlier mutation events,

• fitness is calculated for individuals rather than for mating pairs,• a genotype becomes less fit when it accumulates additional mutant alleles,

but otherwise selective costs are arbitrary,• recombination acts on a faster time scale than mutation or selection —

the common quasi-linkage equilibrium (QLE) assumption.

Our general picture of the genome and the processes of mutation, selection andrecombination is similar to that of [BT91, KJB02]. Whereas [BT91, KJB02]invoke the QLE assumption to justify treating the effects of alleles at different locias nearly independent, we present a detailed asymptotic treatment in a standard,explicit scaling regime.

More specifically, we establish conditions under which a discrete-time processwith the above features converges to a continuous-time, deterministic, dynamicalsystem that has its state space the probability measures on the set of possiblegenotypes. At any point in time, the genotype distribution is in the complete linkageequilibrium represented by the distribution of a Poisson point process. The pointsof a realization of the Poisson process represent loci at which a randomly sampledindividual’s genotype has accumulated ancestral mutations away from an original“reference” genotype which we will call the wild type. Because a Poisson processis defined by its intensity measure, which in our case is a finite measure on theset of loci, the asymptotic model can be more simply described by a deterministicdynamical system that moves about in the space of such measures.

In order to establish such a convergence result it is, of course, not enough simplyto show that the distribution of the genotype of a randomly chosen individual ata fixed time converges to the distribution of a Poisson random measure. Rather,we must keep track of the accumulated perturbations that arise over time from theeffect of mutation, selection and recombination, and demonstrate the convergenceof the entire time evolution of the genotype distribution to a dynamical system ofPoisson distributions.

Moreover, we establish conditions under which the continuous-time dynamicalsystem has equilibria, investigate when the system converges to an equilibrium fromthe pure wild type genotype, and obtain results about the stability and attractivityof that particular equilibrium when it is present.

Our models provide a basis for the rigorous study of a number of questions inthe biodemography of longevity, as surveyed in [Wac03]:

• the adequacy of Charlesworth’s [Cha01] proposed explanation of Gom-pertz hazards and mortality plateaus as a consequence of mutation accu-mulation,

1.1. INFORMAL DESCRIPTION OF THE LIMIT MODEL 3

• the causes of the hyperexponential hazards described by Horiuchi [Hor03],• extensions to age-structured settings of Haldane’s principle that, in its

original form, equates the mutation rate to population decline of fitness,• contrasts between proportional and additive mutation effects on hazards

as discussed in [Bau05, Bau08],• the possibility of mortality rates diverging to infinity at ages before the

end of reproduction, which have been termed “Walls of Death”.We pursue these matters in the companion papers [WES08, WSE08].

1.1. Informal description of the limit model

In this section we describe the asymptotic model and motivate why it is rea-sonable that such a dynamical system should arise as the limit of a sequence ofdiscrete-time systems with the features listed above.

Denote by M the collection of loci in the portion of the genome that is ofinterest to us. There is a distinguished reference wild type genotype, and eachlocus represents a “position” at which the genotype of an individual may differfrom that of the wild genotype. We allow the set M to be quite general and donot necessarily think of it as a finite collection of physical DNA base positions or afinite collection of genes. For example, the proposed explanation for the Gompertzmortality curve and mortality plateaus at extreme ages in Charlesworth [Cha01]suggest taking M to be a class of functions from R+ to R+: the value of such afunction at time t ≥ 0 represents an additional increment to the mortality hazardrate at age t conferred by a mutation away from the wild type at this locus. Somestructure on M is necessary to accommodate rigorous probability theory, so wetake M to be a complete, separable metric space.

The genotype of an individual is specified by the set of loci at which there hasbeen a mutation somewhere along the ancestral lineage leading to that individual.More precisely, a genotype is an element of the space G of integer–valued finiteBorel measures onM. An element of G is a finite sum

∑i δmi , where δm is the unit

point mass at the locus m ∈ M. The measure∑i δmi corresponds to a genotype

that has ancestral mutations at loci m1,m2, . . .. The wild type genotype is thusthe null measure. We do not require that the loci mi ∈ M be distinct. We thusallow several copies of a mutation. This is reasonable, since we are not identifyingmutations with changes in nucleotide sequences in a one-to-one manner.

For example, ifM is finite, so we might as well takeM = {1, 2, . . . , N} for somepositive integer N , then G is essentially the Cartesian product NN of N copies ofthe nonnegative integers: A genotype is of the form

∑Nj=1 njδj , indicating that an

ancestral mutation is present nj times at locus j, and we identify such a genotypewith the nonnegative integer vector (n1, n2, . . . , nN ).

Recall that the population is infinite and all that matters about an individualis the individual’s genotype, so that the dynamics of the population are describedby the proportions of individuals with genotypes that belong to the various subsetsof G. We are thus led to consider a family of probability measures Pt, t ≥ 0, onG, where Pt(G) for some subset G ⊆ G represents the proportion of individualsin the population at time t that have genotypes belonging to G. Note that Pt(G)also may be thought of as the probability that an individual chosen uniformly atrandom from the population will have genotype belonging to the set G. In otherwords, Pt is the distribution of a random finite integer-valued measure on M. For

4 1. INTRODUCTION

example, if M = {1, 2, . . . , N} and we identify G with the Cartesian product NNas above, then Pt({(n1, n2, . . . , nN )}) represents the probability that an individualchosen uniformly at random from the population will have nj ancestral mutationsat locus j for j = 1, 2, . . . , N .

We next indicate how we model mutation, selection and recombination to obtainthe evolution dynamics for Pt, t ≥ 0.

Mutation alone. Suppose that there is only mutation and no selection or recom-bination. In this case all individuals present in the population at a give time dieand reproduce at the same rate because differing genotypes do not confer differingselective costs and mutations accumulate down lineages because they cannot bereplaced by recombination.

We describe the mutation process using a finite measure ν on the space of lociM, where ν(B) for B ⊆ M gives the rate at which mutations from the ancestralwild type belonging to the set B accumulate along a given lineage.

Write PtΦ =∫G Φ(g) dPt(g) for some test function Φ : G → R. That is, PtΦ

is the expected value of the real-valued random variable obtained by applying thefunction Φ to the genotype of an individual chosen uniformly at random from thepopulation. The content of our assumptions is that when there is only mutationPt, t ≥ 0, evolves according to

(1.1)d

dtPtΦ = Pt

(∫M

[Φ(·+ δm)− Φ(·)] ν(dm)).

For example, when M = {1, 2, . . . , N} we have the system of ordinary differ-ential equations

d

dtPt({n}) =

N∑j=1

ν({j}) [Pt({n− ej})− Pt({n})] ,

where ej is the jth coordinate vector. This equation is, of course, just a specialcase of the usual equation (see, for example, Section III.1.2 of [Bur00]) describingevolution due to mutation of type frequencies in a population where the set of typesis NN and mutation from type n to type n + ej occurs at rate ν({j}).

The evolution equation (1.1) is of the formd

dtPt = APt

for a certain linear operator A. We recognize that A is the the infinitesimal gener-ator of a G-valued Levy process, and hence (1.1) has the following explicit proba-bilistic solution. Let Π denote a Poisson random measure onM×R+ with intensitymeasure ν ⊗ λ, where λ is Lebesgue measure; that is, Π is a random integer-valuedBorel measure such that:

(1) The random variable Π(A) is Poisson with expectation (ν⊗λ)(A) for anyBorel subset A of M× R+.

(2) If A1, A2, . . . , An are disjoint Borel subsets of M× R+, then the randomvariables Π(Ak) are independent.

Define a G-valued random variable Zt (that is, Zt is a random finite integer-valuedmeasure on M) by

Zt :=∫M×[0,t]

δm dΠ((m,u)).

1.1. INFORMAL DESCRIPTION OF THE LIMIT MODEL 5

Then,PtΦ = E [Φ(W + Zt)] ,

where W is a random measure onM that has distribution P0 and is independent ofΠ. In particular, if P0 is itself the distribution of a Poisson random measure, thenPt will also be the distribution of a Poisson random measure. If we write ρt forthe intensity measure associated with Pt (that is, ρt is the measure on M definedby ρt(M) :=

∫G g(M) dPt(g) for M ⊆M), then ρt evolves according to the simple

dynamicsρt(M) = ρ0(M) + t ν(M).

Selection alone. Now suppose there is only selection and no mutation or recom-bination. We specify the fitnesses of different genotypes by a selective cost functionS : G → R+. The difference S(g′) − S(g′′) for g′, g′′ ∈ G is the difference inthe rate of sub-population growth between the sub-population of individuals withgenotype g′′ and the sub-population of individuals with genotype g′. We make thenormalizing assumption S(0) = 0 and suppose that

(1.2) S(g + h) ≥ S(h), g, h ∈ G,

in line with our assumption above that genotypes with more accumulated mutationsare less fit.

It follows that at time t ≥ 0 the per individual rate of increase of the proportionof the population of individuals with genotype g′ is PtS − S(g′). More formally,

d

dtPtΦ = −Pt(Φ · [S − PtS])

= −∫G

Φ(g′)[S(g′)−

∫GS(g′′) dPt(g′′)

]dPt(g′).

(1.3)

For example, when M = {1, 2, . . . , N} we have

d

dtPt({n′}) = −

[S(n′)−

∑n′′

Pt({n′′})S(n′′)

]Pt({n′}).

If S is non-epistatic, that is, S has the additive property

S(∑i

δmi) =∑i

S(δmi),

then the selective effects of different mutations don’t interact. In particular, if P0 isthe distribution of a Poisson random measure, then Pt will also be the distributionof a Poisson random measure and, writing ρt for the intensity measure associatedwith Pt as before, we have

ρt(dm′) = ρ0(dm′)−∫ t

0

[S(δm′)−

∫MS(δm′′)ρs(dm′′)

]ρs(dm′) ds.

However, if S is epistatic (that is, non-additive), then Pt will, in general, not bethe distribution of a Poisson random measure — even when P0 is.

Combining mutation and selection. If there is mutation and selection, but norecombination, then the appropriate evolution equation for Pt comes from simply

6 1. INTRODUCTION

combining equations (1.1) and (1.3):

d

dtPtΦ = Pt

(∫M

[Φ(·+ δm)− Φ(·)] ν(dm))

− Pt(Φ · [S − PtS]).(1.4)

This is the model introduced and analyzed at length in [SEW05] using the Feynman-Kac formula. WhenM = {1, 2, . . . , N}, (1.4) is a special case of the classical systemof ordinary differential equations for mutation and selection in continuous-time –see Section III.1.2 of [Bur00] for the derivation of an analytic solution that agreeswith the one that arises from a Feynman-Kac analysis.

Recombination alone. The effect of recombination is to choose an individualuniformly at random from the population at some rate and replace the individual’sgenotype g′ with a genotype of the form g′(· ∩M) + g′′(· ∩M c), where g′′ is thegenotype of another randomly chosen individual, M is a subset of M chosen ac-cording to a suitable random mechanism, and M c is the complement of M . Thus,recombination randomly shuffles together two different genotypes drawn from thepopulation. In order to specify the recombination mechanism fully, we would needto specify the recombination rate and the distribution of the segregating set M .Suppose, however, that we are in the following regime:

• recombination acts alone (that is, there is no mutation or selection),• the initial population P0 has the property that there does not exist anm ∈ M with P0({g ∈ G : g({m}) > 0}) > 0 (that is, no single mutationfrom the ancestral wild type is possessed by a positive proportion of theinitial population),

• the mechanism for choosing the segregating set M is such that, looselyspeaking, if m′ and m′′ are two loci then there is positive probability thatm′ ∈ M and m′′ ∈ M c (that is, no region of the genome M is immunefrom the shuffling effect of recombination).

Then, under these condition the probability measure Pt will converge as t→∞ tothe distribution of a Poisson random measure onM with the same intensity measureas P0. Moreover, the speed of this convergence increases with the recombinationrate and so if we take the the recombination rate to be effectively infinite, thenthe distribution Pt will be essentially Poisson for all t > 0 with the same intensitymeasure as P0, irrespective of the details of the recombination mechanism.

Combining mutation, selection and recombination. We have seen that ifP0 is the distribution of a Poisson random measure, then mutation preserves thisproperty. On the other hand epistatic selection drives the population distributionaway from Poisson, while increasing rates of recombination push it towards Poisson.Thus, when all three processes operate and we consider a limiting regime whererecombination acts on a much faster time scale than selection and recombination,we expect asymptotically that if the initial condition P0 is the distribution of aPoisson random measure on M, then Pt will also be the distribution of a Poissonrandom measure for all t > 0. Let ρt denote the intensity measure of Pt (so thatρt is a finite measure on the space M of loci). If we write Xπ for a Poissonrandom measure on M with intensity measure π, then we expect from combining

1.2. EXAMPLE I: MUTATION COUNTING 7

the observations above that ρt should satisfy the evolution equation

(1.5) ρt(dm) = ρ0(dm) + t ν(dm)−∫ t

0

E [S(Xρs + δm)− S(Xρs)] ρs(dm) ds.

We define the rigorous counterpart of (1.5) in Chapter 2 and establish theexistence and uniqueness of solutions. Furthermore, we show in Chapter 4 that ourdynamical equation is indeed a limit of a sequence of standard discrete generation,mutation-selection-recombination models.

Given that it involves computing an expected value for a quite general Poissonprocess, equation (1.5) may look rather forbidding. However, for certain reasonablechoices of selective costs the infinite sum can be evaluated explicitly, leading to asimpler and more intuitive looking system. We will consider three such cases in thefollowing three sections. First, though, we make a useful observation about howloci may be “clumped together” in (1.5).

Remark 1.1. Consider two instances (ρ′t)t≥0 and (ρ′′t )t≥0 of the dynamicalsystem (1.5) with respective locus spaces M′ and M′′, associated genotype spacesG′ and G′′, mutation intensity measures ν′ and ν′′, and selective costs S′ and S′′.Suppose that there is a Borel measurable map T from M′ onto M′′ such that:

• the initial measure ρ′′0 on M′′ is the push-forward of the initial measureρ′ on M′ by the map T ,

• the mutation intensity measure ν′′ on M′′ is the push-forward of themutation intensity measure ν′ on M′ by the map T ,

• the selective cost S′ on G′ has the property that S′(g′) = S′(h′) wheneverthe push-forwards of g′, h′ ∈ G′ by T are the same,

• the selective cost S′′(g′′) on G′′ is given by the common value of S′(g′) forall g′ ∈ G′ that have push-forward by T equal to g′′.

Then, for each t > 0, the measure ρ′′t on M′′ is the push-forward of the measureρ′t on M′ by the map T . Intuitively, we have a situation in which for a given forg′′ ∈ G′′ any two genotypes in the set T−1(g′′) ⊆ G′ are indistinguishable in termsof their associated selective cost, and so we may identify any two such genotypesas being the same. Of course, we cannot recover the finer description ρ′t from thecoarser one ρ′′t in general, but if we let P ′t and P ′′t be the population distributionsof genotypes corresponding to ρ′t and ρ′′t (that is, P ′t and P ′′t are the distributionsof Poisson random measures on M′ and M′′ with intensity measures ρ′t and ρ′′t ),then the push-forward of P ′t by S′ is the same as the push-forward of P ′′t by S′′:the population distributions of selective costs agree whether we use the fine or thecoarse description of genotypes.

1.2. Example I: Mutation counting

The simplest special case of our framework occurs when there are many locibut the selective cost of a genotype g only depends on the total number of locig(M) at which there have been ancestral mutations away from the wild type.

For example, suppose that the spaceM of loci is the unit interval [0, 1] and theselective cost S is of the form S(g) = s(g(M)) for some non-decreasing functions : N0 → R+ with s(0) = 0.

We may apply Remark 1.1 and “replace” the locus space M by a single point.Let q := ν(M) be the total rate at which mutations occur and write rt := ρt(M)for the expected number of ancestral mutations in the genotype of an individual

8 1. INTRODUCTION

chosen at random from the population at time t ≥ 0. It follows from Remark 1.1that the function r evolves autonomously according to the (non-linear) ordinarydifferential equation

r = q − ψ(r),

where

ψ(x) := x

∞∑k=0

(s(k + 1)− s(k))e−xxk

k!

= e−x∞∑k=1

s(k)xk

k!(k − x).

One example of this simplified model, which lines up with models familiarfrom earlier literature, assumes the cost per mutation to be a constant s, so thats(k) = sk and ψ(x) = sx, and the starting point to be the null genotype. In thiscase, we readily compute that the intensity at time t is rt = (q/s)

(1 − exp(−st)

).

The intensity converges monotonically to the equilibrium value q/s, the elementaryexpression for mutation-selection equilibrium going back to Haldane. Thus, thelimiting distribution for the number of mutations from the ancestral wild type isPoisson with mean q/s. Our assumption about recombination leads to simpleranswers than Kimura and Maruyama’s treatment of mutation counting withoutrecombination (see [KM66]).

When costs are multiplicative, in the sense that s(k) := 1 − exp(−kσ) forsome constant σ > 0, we have ψ(x) = ax exp(−ax), where a := 1 − exp(−σ). Anequilibrium exists only if the mutation rate q is below the maximum of r exp(−r),namely e−1, and in that case rt converges monotonically to the smallest positivesolution of the equation q − ψ(x) = 0. The solution is x = W (−q)/a, whereW (z) :=

∑∞n=1(−n)n−1 zn

n! is Lambert’s W function – the power series expansion isa simple application of the technique known variously as reversion of series or theLagrange inversion formula. These properties for multiplicative costs are general-ized in Section 3.2. Note also that when σ is small the equilibrium is approximatelyq/σ, as one would expect from the observation that in this case s(k) is approxi-mately σk for small k and hence this model is approximately the additive one ofthe previous paragraph with s = σ.

If we are interested only in how the population distribution of selective costsevolves, then we need consider only (rt)t≥0, rather than (ρt)t≥0. However, weshould be somewhat careful about how we interpret the biological import of thissimplification. We justified the dynamical system (1.5) as describing the evolvingpopulation distribution of genotypes defined in terms of the locus space M in apopulation undergoing mutation, selection and recombination. Mathematically, wesee that in this example we can replace the locus space [0, 1] by a single point forthe purposes of studying the dynamics of the distribution of selective costs, but thisdoes not mean that biologically the multilocus model is identical with a single locusmodel. As we show later, our instance of (1.5) with locus space [0, 1] arises as a limitof discrete generation models in which Poisson random measures appear becauseof the manner in which recombination breaks up and shuffles together genotypesfrom different individuals. Even though our instance of (1.5) with locus space asingle point is mathematically well-defined, it can’t arise as a limit of such discrete

1.3. EXAMPLE II: POLYNOMIAL SELECTIVE COSTS 9

generation models because with a single locus there is no way that recombinationcan drive genotype distributions towards Poisson.

1.3. Example II: Polynomial selective costs

Recall that when M = {1, 2, . . . , N} we can encode genotypes as orderedN−tuples of nonnegative integers, where the entry in the kth coordinate is thenumber of ancestral mutations present at locus k. Then, (1.5) becomes the systemof ordinary differential equations

(1.6)d

dtρt({j}) = ν({j})− ρt({j})

∑n

[S(n + ej)− S(n)]N∏k=1

e−ρt({k})ρt({k})nk

nk!.

As in the previous example, we should not think of {1, 2, . . . , N} as beingthe “real” locus space. Rather, we should imagine that there is something likea continuum of loci which may be partitioned into N sub-classes in such a waythat the loci in each sub-class have indistinguishable selective effects, and (1.6) isa reduced description that comes from applying the observation in Remark 1.1.

A natural family of selective costs is given by those of the polynomial form

S(g) =∑I

αIgI ,

where the sum is taken over all nonempty subsets I ⊆ {1, . . . , N} and we adoptthe usual multi-index convention that for a vector v the notation vI denotes theproduct

∏i∈I vi. The constant α{i} for 1 ≤ i ≤ N measures the selective cost of

mutation i alone, whereas the constant αI for a subset I ⊆ {1, . . . , N} of cardinalitygreater than one measures the selective cost attributable to interactions betweenall of the mutations in I over and above that attributable to interactions betweenmutations in proper subsets of I.

The system of ordinary differential equations (1.6) becomes

(1.7) ρk = νk −∑I∈Ik

αIρI , 1 ≤ k ≤ N,

where we write ρk := ρ({k}) and νk := ν({k}), and where Ik denotes the collectionof subsets of {1, . . . , N} that contain k, see [CE09].

It is shown in [CE09] that if νk > 0 for all k (that is, mutations may occur at allloci), α{i} > 0 for all i (that is, the individual effect of any mutation is deleterious,in keeping with our general assumption on the selective cost), and αI ≥ 0 forall subsets I (that is, the synergistic effects of individually deleterious mutationsare never beneficial), then the system of equations (1.7) has a unique equilibriumpoint in the positive orthant RN+ . Moreover, this equilibrium is globally attractive;that is, the system converges to the equilibrium from any initial conditions inRN+ . The condition αI ≥ 0 for all I certainly implies our standing assumption thatS(g+h) ≥ S(g) for all g, h ∈ G, but it is strictly stronger. It is also shown in [CE09]that the analogue of this result for general M holds with a suitable definition ofpolynomial selective costs in terms of sums of integrals against products of themeasure g with itself.

10 1. INTRODUCTION

1.4. Example III: Demographic selective costs

The following model is discussed in [WES08, WSE08], where there is a moreextensive discussion of the demographic assumptions.

For the moment, suppose that the space of loci M is general. Write `x(g) forthe probability that an individual with genotype g ∈ G lives beyond age x ∈ R+. Atage x, the corresponding cumulative hazard and hazard function are thus − log `x(g)and d/dx(− log `x(g)), respectively. Suppose that the infinitesimal rate that anindividual at age x ∈ R+ has offspring is fx, independently of the individual’sgenotype. For individuals with genotype g, the size of the next generation relativeto the current one is

∫∞0fx`x(g) dx.

Suppose further that there is a background hazard λ and that an ancestralmutation at locus m ∈M contributes an increment θ(m,x) to the hazard functionat age x. Thus, the probability that an individual with genotype g ∈ G lives beyondage x ∈ R+ is

`x(g) = exp(−λx−

∫Mθ(m,x) dg(m)

).

The corresponding selective cost is

S(g) =∫ ∞

0

fx `x(0) dx−∫ ∞

0

fx `x(g) dx

(recall that selective costs represent relative rates of increase, and we have adoptedthe normalizing convention that S(0) = 0).

Note that

S(g + δm)− S(g) =∫ ∞

0

(1− e−θ(m,x)

)fx `x(g) dx

and

E [`x(Xπ)] = exp(−λx−

∫M

(1− e−θ(m,x)

)π(dm)

)for any finite measure π onM (recall that Xπ is a Poisson random measure onMwith intensity measure π).

An important issue for demographic applications is whether the solution ρt,t ≥ 0, of (1.5) converges to an equilibrium ρ∞ as t → ∞ and, if so, what are thefeatures of that equilibrium – in particular, what we can say about E[`x(Xρ∞)],the probability that a randomly chosen individual from the equilibrium populationlives beyond age x. It is not hard to show that if the limit ρ∞ exists, then it mustbe absolutely continuous with respect to the mutation rate measure ν and have aRadon-Nikodym derivative r∞ that satisfies

1 = r∞(m)∫ ∞

0

(1− e−θ(m,x)) fx E[`x(Xρ∞)] dx.

We study such equilibria further in Section 3.8 in and in [WES08, WSE08].In particular, we show in [WES08] that if M = R+, the mutation rate measureν is absolutely continuous with respect to Lebesgue measure, and θ is of the formθ(m,x) = η(m)1x≥m, so that we consider mutations which each provide a pointmass increment to the hazard at a specific age, then the equilibrium equation turnsout to be equivalent to a second-order, non-linear, ordinary differential equation inone variable that can be solved explicitly.

1.5. COMMENTS ON THE LITERATURE 11

1.5. Comments on the literature

We make some brief remarks about the substantial literature on multilocusdeterministic models in population genetics and its relation to our work.

A very comprehensive recent reference is Reinhard Burger’s book [Bur00] (seealso Burger’s review paper [Bur98]). As well as giving an overview of the classi-cal models for finitely many alleles at each of a finite number of loci, these worksconsider at length deterministic haploid continuum-of-alleles models in which in-dividuals have a type that is thought of as the contribution of a gene to a givenquantitative trait. The type belongs to a general state space that represents some-thing like the trait value (in which case the state space is a subset of R) and isoften thought of as the combined effect of a multilocus genotype. Each type hasan associated fitness, which is some fairly arbitrary function from the type spaceto (0, 1]. However, the models do not explicitly incorporate a family of loci, theconfiguration of alleles present at those loci, or a function describing the fitness ofa configuration. Rather, everything is cast in terms of how fit each type is and howlikely one type is to mutate into another.

Certain classes of mutation-selection models without recombination are solvedexplicitly in [WBG98, BW01] using ideas from statistical mechanics. Such modelsmay be thought of as either multilocus systems with complete linkage or structuredsingle locus systems. We also mention the constellation of papers [BB03, Baa05,Baa07, Baa01] presenting a deterministic model of population change due torecombination alone.

Finally, we make two comments about our model in order to distinguish it fromothers in the literature that superficially might seem to have similar features.

• We do not present a Markovian stochastic model such as the one in[SH92], where the population is described in terms of an evolving Pois-son random measure that keeps track of the proportion of alleles at eachsite that are mutant (with a typical site being purely wild type and onlyexceptional sites having a positive proportion of mutants present). Forus, the evolution of the proportions of different genotypes in the popula-tion is described by a deterministic dynamical system living on a space ofprobability measures. If we sample from the evolving probability measureat some fixed time, then the resulting individual’s genotype is a Poissonrandom measure.

• Linkage arises in our discrete-time approximation models as the depen-dence between loci, a natural consequence of non-additive selection costs,is only partially broken up in any finite number of rounds of recombina-tion. However, linkage does not appear in the limit model. That is, ifwe sample from the limit population at some time, then the fact that theresulting individual’s genotype is described by a Poisson random mea-sure means that the presence of ancestral mutations from wild type inone part of the genotype is independent of the presence of mutations inanother part. Our limit theorem may therefore be thought of as delineat-ing the relative strengths of mutation, selection and recombination thatlead asymptotically to a situation in which the Poissonizing effect of re-combination wins out over the interactions introduced by non-additiveselection. Understanding how these two forces counteract each other isfar from trivial and is the most demanding technical task of the present

12 1. INTRODUCTION

work. We cannot stress too strongly that we have not a priori assumedthat linkage is not present.

1.6. Overview of the remainder of the work

We define the measure-valued dynamical system (1.5) rigorously in Chapter 2and establish the existence and uniqueness of solutions.

We investigate in Chapter 3 whether the dynamical system has equilibria andwhether these equilibria are stable and attractive.

We devote Chapters 4 to 8 to showing that the dynamical system is a limit ofdiscrete-generation, infinite-population models. We define the discrete-generationmodels in Chapter 4 and we state the convergence theorem and its hypotheses inChapter 5. As a first step towards the proof of the convergence result, we showin Chapter 6 that an analogous convergence result holds when the recombinationmechanism is replaced by a complete Poissonization operation that destroys alllinkage between loci in a single step. The proof of the actual convergence result isquite involved and requires a number of technical estimates that, loosely speaking,bound the extent to which selection reintroduces linkage that has been partiallyremoved by recombination. We present these preliminary results in Chapter 7. Wecomplete the proof of the convergence theorem in Chapter 8.

The passage from the discrete-generation mutation-selection-recombination modelto its continuous-time limit involves the “linearization” of operators that describemutation and selection. This linearization is essentially a consequence of the factthat log E

[e−T

]is approximately −E[T ] when the nonnegative random variable T

is close to the being constant. A simple quantitative version of this observation isused several times in this work. We state the result and prove it in Appendix A.

CHAPTER 2

Definition, existence, and uniqueness of thedynamical system

2.1. Spaces of measures

Recall from the Introduction that among the fundamental ingredients in ourmodel are:

• the complete, separable metric space M of loci;• a finite Borel measure ν onM called the “mutation measure”, because it

describes the rate at which mutations occur in regions of the genome;• the space G of integer–valued finite Borel measures onM – an element ofG represents a genotype thought of as the set of loci at which there havebeen ancestral mutations away from the reference wild type, so that, inparticular, the null measure represents the wild genotype.

Recall also that the state of the population at time t ≥ 0 in our model is aprobability measure Pt on G that is the distribution of a Poisson random finitemeasure on M. The distribution of such a Poisson random measure is determinedby its intensity measure, which is a finite Borel measure on M. The followingnotation will be useful:

Notation 2.1. Denote by H the space of finite signed Borel measures on M.Let H+ be the subset of H consisting of nonnegative measures.

For π ∈ H, write π+, π− ∈ H+ for the positive and negative parts of π appearingin the Hahn-Jordan decomposition. Thus, π = π+ − π−.

We next recall the definition of the Wasserstein metric (also transliterated asVasershtein, or otherwise). This metric provides a unified way of topologizing thevarious spaces of measures that we use.

Notation 2.2. Given a metric space (E, d), let Lip be the space of functionsf : E → R such that

(2.1) ‖f‖Lip := supx|f(x)|+ sup

x 6=y

|f(x)− f(y)|d(x, y)

<∞.

Define a norm ‖ · ‖Was on the space of finite signed Borel measures on (E, d) by

(2.2) ‖π‖Was := sup {|π[f ]| : ‖f‖Lip ≤ 1} ,where

(2.3) π[f ] :=∫fdπ.

An extensive account may be found in [Rac91, RR98], while the propertiesused here are described in Problem 3.11.2 of [EK86].

13

14 2. DEFINITION, EXISTENCE, AND UNIQUENESS OF THE DYNAMICAL SYSTEM

We may build Wasserstein metrics on top of Wasserstein metrics. We start bytaking M for the metric space X. Then G, as a space of measures on M, has aWasserstein metric. Next we take G with its Wasserstein metric for the metric spaceX, and obtain a Wasserstein metric on the finite signed measures on G, includingthe measures Pt. Also, withM again playing the role of X, we obtain a Wassersteinmetric on H.

We note that the designation “Wasserstein metric” is often applied to the anal-ogous definition where the constraining Lipschitz norm ‖ · ‖Lip does not includethe supremum norm term. This latter distance is always greater than or equal tothe Wasserstein metric as we are defining it. It is equivalent to the Kantorovich-Rubinstein distance, which is a member of the class of Monge-Kantorovich dis-tances, a class defined by a single parameter p; the Kantorovich-Rubinstein dis-tance is the element of this class corresponding to p = 1. Details may be found in[Vil03, Section 7.1], [Vil09, Chapter 6] or [AGS05, Section 7.1].

2.2. Definition of the dynamical system

Recall that the state of the population at time t ≥ 0 is given by a probabilitymeasure Pt on G that is the distribution of a Poisson random measure with intensitymeasure ρt. Our informal description of the evolution of ρt, and hence of Pt, is (seeequation (1.5))

(2.4) ρt(dm) = ρ0(dm) + t ν(dm)−∫ t

0

E [S(Xρs + δm)− S(Xρs)] ρs(dm) ds,

where we write Xπ for a Poisson random measure on M with intensity measureπ ∈ H+.

In order to formalize this definition, it will be convenient to introduce thefollowing two objects.

Definition 2.3.

• Define F :M×H+ → R+ by

(2.5) Fπ(x) := E[S(Xπ + δx)− S(Xπ)

]for x ∈M and π ∈ H+.

• Define the operator D : H+ → H+ by

(2.6)d(Dπ)dπ

(x) := Fπ(x).

That is, for any bounded f :M→ R,

(2.7)∫Mf(x) (Dπ)(dx) =

∫Mf(x)Fπ(x)π(dx).

Formally, a solution to (1.5) is anH+-valued function ρ that is continuous (withrespect to the topology of weak converge of measures) and satisfies

(2.8) ρt = ρ0 + tν −∫ t

0

Dρs ds

for all t ≥ 0.Equation (2.8) involves the integration of a measure-valued function, and such

an integral can have a number of different meanings. We require only the following

2.3. EXISTENCE AND UNIQUENESS OF SOLUTIONS 15

notion: If η : R+ → H is a a Borel function, then for t ≥ 0 the integral It =∫ t

0ηs ds

is the element of H satisfying

(2.9) It(A) =∫ t

0

ηs(A) ds for every Borel A ⊆M.

This integral certainly exists (and is unique) if the function η is continuous withrespect to the topology of weak convergence of measures. For more informationabout integration on infinite dimensional spaces, see chapter 2 of [DU77].

2.3. Existence and uniqueness of solutions

Theorem 2.4. Fix a mutation measure ν ∈ H+ and a selective cost S : G →R+, that satisfies the conditions

• S(0) = 0,• S(g) ≤ S(g + h) for all g, h ∈ G,• for some constant K,

∣∣S(g)− S(h)∣∣ ≤ K∥∥g − h∥∥

Was, for all g, h ∈ G.

Then, equation (2.8) has a unique solution for any ρ0 ∈ H+.

Proof. Fix a time horizon T > 0 and let c > 0 be a constant that willbe chosen later. Write C([0, T ],H) for the Banach space of continuous H-valuedfunctions on [0, T ], equipped with the norm

‖α‖c = sup0≤t≤T

e−ct‖αt‖Was.

Denote by Γ the closed subset of C([0, T ],H) consisting of H-valued functionsα with α0 = ρ0 and

α+t (M) ≤ ρ0(M) + tν(M)

for 0 ≤ t ≤ T (recall from Notation 2.1 that the measure α+t is the positive part in

the Hahn-Jordan decomposition of the signed measure αt).Define a map ∆ : H → H by

(∆α)t = ρ0 −∫ t

0

Dα+s ds+ tν.

Note that ∆ maps Γ into itself. Moreover, for α, β ∈ Γ,

‖∆α−∆β‖c ≤ sup0≤t≤T

e−ct∫ t

0

∥∥∆α+s −∆β+

s

∥∥Was

ds

≤ sup0≤t≤T

e−ct∫ t

0

K(2 + 6

{α+s (M) ∧ β+

s (M)})∥∥αs − βs∥∥Was

ds

by Lemma 2.9 below

≤ sup0≤t≤T

e−ct∫ t

0

K(2 + 6

{ρ0(M) + sν(M)

})ecs∥∥α− β∥∥

cds

≤ sup0≤t≤T

e−ct[K(2 + 6ρ0(M))

ect − 1c

+ 6Kν(M)(ct− 1)ect + 1

c2

]‖α− β‖c.

Thus, ∆ : Γ→ Γ is a contraction, provided c is chosen sufficiently large.It follows from the Contraction Mapping Theorem that the equation

(2.10) ρt = ∆ρt = ρ0 −∫ t

0

Dρ+s ds+ tν


has a unique solution in Γ. Furthermore, any function in H that is a solution to(2.10) must automatically be in Γ. Therefore, the solution is unique.

This unique solution is a function taking values in the set of signed measures H,and it remains to show that it actually takes values in the subset H+ of nonnegativemeasures. For any Borel set A ⊆M,

ρt(A) = ρ0(A)−∫ t

0

∫A

Fρ+s (x) dρ+s (x) ds+ tν(A).

In particular, t 7→ ρt(A) is continuous. For any Borel set A ⊂M we have

ρ+t (A) ≤ ρ0(A) + tν(A).

Since Fπ(x) ≤ K for all x,

ρt(A) ≥ ρ0(A)−K∫ t

0

(ρ0(A) + sν(A)

)ds+ tν(A)

= (1−Kt)ρ0(A) + t

(1− Kt

2

)ν(A).

Hence, ρt(A) ≥ 0 for 0 ≤ t ≤ 1/K. Because this is true for all A, we have ρt ∈ H+

for 0 ≤ t ≤ 1/K. Iterating this argument, with the time 0 replaced successively bythe times 1/K, 2/K, . . . gives the result. �

2.4. Lemmas used in the proof of existence and uniqueness

We assume in this section that the hypotheses of Theorem 2.4 hold.

Lemma 2.5. The function Fπ is in Lip for each finite measure π ∈ H+, and

supπ∈H+

‖Fπ(·)‖Lip ≤ 2K.

Proof. By definition,

‖Fπ(·)‖Lip = supx|E[S(Xπ + δx)− S(Xπ)]|

+ supx 6=y|E[S(Xπ + δx)− S(Xπ)]− E[S(Xπ + δy)− S(Xπ)]| /d(x, y)

≤ supxK‖δx‖Was + sup

x 6=yK‖δx − δy‖Was/d(x, y)

≤ 2K.

�

Notation 2.6. Given a measure q ∈ H+, define Πq to be the distributionof the Poisson random measure with intensity q. Thus, if P is the distributionof a Poisson random finite measure on M (that is, a Poisson distributed randomelement of G), then P = ΠµP , where µP ∈ H+ is the intensity measure associatedwith P (that is, µP [F ] =

∫G g[F ] dP (g)).

The next result probably already exists in some form in the literature, but wehave been unable to find a reference. It says that if the intensities of two Poissonrandom measures are close in the Wasserstein sense, then the same is true of theirdistributions. In the statement of the result, the Wasserstein distance on the left-hand side of the inequality is between probability measures on the space G, whilethe Wasserstein distance on the right-hand side is between finite measures on M.

2.4. LEMMAS USED IN THE PROOF OF EXISTENCE AND UNIQUENESS 17

Lemma 2.7. For two finite measures π′, π′′ ∈ H+

∥∥Ππ′ −Ππ′′∥∥

Was≤ 4∥∥π′ − π′′∥∥

Was.

Proof. Fix f : G → Lip with ‖f‖Lip ≤ 1. Fix π′, π′′ ∈ H+. If π′ = π′′ = 0there is nothing to prove. Therefore, suppose without loss of generality that π′ 6= 0and π′(M) ≥ π′′(M). Set π∗ = (π′′(M)/π′(M))π′ ∈ H+. We have

∣∣Ππ′ [f ]−Ππ′′ [f ]∣∣ ≤ ∣∣Ππ′ [f ]−Ππ∗ [f ]

∣∣+∣∣Ππ∗ [f ]−Ππ′′ [f ]

∣∣.

Note that if Xπ′ and Xπ∗ are any two Poisson random measures on the sameprobability space with distributions Ππ′ and Ππ∗ , respectively, then

∣∣Ππ′ [f ]−Ππ∗ [f ]∣∣ =

∣∣E [f(Xπ′)− f(Xπ∗)]∣∣

≤ 2P{Xπ′ 6= Xπ∗

}

because |f(g)| ≤ 1 for all g ∈ G. In particular, if we first build Xπ′ and then con-struct Xπ∗ by the usual “thinning” procedure of independently keeping each pointof Xπ′ with probability π′′(M)/π′(M) and discarding it with the complementaryprobability, we have

∣∣Ππ′ [f ]−Ππ∗ [f ]∣∣ ≤ 2P

{Xπ′(M) 6= Xπ∗(M)

}≤ 2

∞∑k=0

e−π′(M)π

′(M)k

k!

[1−

(π′′(M)π′(M)

)k]

≤ 2∞∑k=0

e−π′(M)π

′(M)k

k!k

[1− π′′(M)

π′(M)

]= 2π′(M)

[1− π′′(M)

π′(M)

]= 2∣∣π′(M)− π′′(M)

∣∣≤ 2∥∥π′(M)− π′′(M)

∥∥Was

.


Setting r = π∗(M) = π′′(M),

∣∣Ππ∗ [f ]−Ππ′′ [f ]∣∣

≤∞∑k=0

e−r

k!

∣∣∣∣∣∫. . .

∫f

(k∑`=1

δy`

)π∗(dy1) · · ·π∗(dyk)

−∫. . .

∫f

(k∑`=1

δy`

)π′′(dy1) · · ·π′′(dyk)

∣∣∣∣∣≤∞∑k=0

e−r

k!

k−1∑m=0

∣∣∣∣∣∫. . .

∫f

(k∑`=1

δy`

)π∗(dy1) · · ·π∗(dym)π′′(dym+1) · · ·π′′(dyk)

−∫. . .

∫f

(k∑`=1

δy`

)π∗(dy1) · · ·π∗(dym+1)π′′(dym+2) · · ·π′′(dyk)

∣∣∣∣∣≤∞∑k=0

e−r

k!rk−1k sup

g∈G

∣∣∣∣∫ f(g + δz)π∗(dz)−∫f(g + δz)π′′(dz)

∣∣∣∣≤ ‖π∗ − π′′‖Was

≤ ‖π′ − π′′‖Was +∣∣π′(M)− π′′(M)

∣∣≤ 2‖π∗ − π′′‖Was.

Putting these bounds together yields

∣∣Ππ′ [f ]−Ππ′′ [f ]∣∣ ≤ 4‖π′ − π′′‖Was,

as required. �

Lemma 2.8. For two finite measures π′, π′′ ∈ H+,

supx∈M

|Fπ′(x)− Fπ′′(x)| ≤ 8K‖π′ − π′′‖Was.

Proof. Fix x ∈ M. Define G : G → R+ by G(g) := (S(g + δx) − S(g))/2K.Then, ‖G‖Lip ≤ 1 by the Lipschitz assumption on S. Note that Fπ(x) = 2KΠπ[G].By definition of the Wasserstein distance,

|Fπ′(x)− Fπ′′(x)| = 2K∣∣Ππ′ [G]−Ππ′′ [G]

∣∣ ≤ 2K∥∥Ππ′ [G]−Ππ′′ [G]

∥∥Was

.

The lemma now follows from Lemma 2.7. �

Lemma 2.9. For two finite signed measures σ, τ ∈ H,

‖Dσ+ −Dτ+‖Was ≤ K(2 + 8{σ+(M) ∧ τ+(M)}) ‖σ − τ‖Was.

2.4. LEMMAS USED IN THE PROOF OF EXISTENCE AND UNIQUENESS 19

Proof. Suppose without loss of generality that σ+(M) ≤ τ+(M). By Lem-mas 2.5 and 2.8,∣∣∣∫ f(x)Dσ+(dx)−

∫f(x)Dτ+(dx)

∣∣∣≤∣∣∣∣∫ F (x, σ+)f(x)σ+(dx)−

∫F (x, τ+)f(x)σ+(dx)

∣∣∣∣+∣∣∣∣∫ F (x, τ+)f(x)σ+(dx)−

∫F (x, τ+)f(x) τ+(dx)

∣∣∣∣≤ 8K‖σ+ − τ+‖Was‖f‖∞σ+(M) + 2K‖f‖Lip‖σ+ − τ+‖Was

≤ (2K + 8Kσ+(M))‖f‖Lip‖σ − τ‖Was,

where we have used the fact that ‖f ′f ′′‖Lip ≤ ‖f ′‖Lip‖f ′′‖Lip for f ′, f ′′ ∈ Lip. �

CHAPTER 3

Equilibria

One of the primary problems concerning our dynamical systems is to under-stand their asymptotic behavior. We begin the analysis of these asymptotics byidentifying when fixed points of the motion exist, and then examining whether thereis convergence to these fixed points from suitable initial conditions.

We assume throughout this chapter that the assumptions of Theo-rem 2.4 always hold.

Definition 3.1. A finite fixed point or equilibrium for the motion (the termswill be used interchangeably) is a measure ρ∗ ∈ H+ at which the driving vectorfield vanishes. That is, ρ∗ is absolutely continuous with respect to ν, with Radon-Nikodym derivative satisfying

(3.1) Fρ∗dρ∗dν

= 1.

The equilibrium ρ∗ is called stable if for every neighborhood V of ρ∗ thereis a neighborhood U ⊂ V , such that ρt ∈ V for all times t if ρ0 ∈ U . It iscalled attractive if it is stable and there is a neighborhood U0 of ρ∗ such thatlimt→∞ ρt = ρ∗ whenever ρ0 ∈ U0.

We introduce the terms box-stable and box-attractive when the above definitionshold if “neighborhoods” in the above definitions are replaced by sets of the form

(3.2) B(ρ, ρ′) :={ρ : ρ < ρ < ρ′

},

where ρ ≤ ρ∗ ≤ ρ′, and both measures ρ− ρ∗ and ρ′ − ρ∗ are mutually absolutelycontinuous with respect to ν.

Remark 3.2. Note that box-stability (respectively, box-attractivity) is a weakercondition than stability (respectively, attractivity), because boxes do not containopen neighborhoods in the topology induced by the Wasserstein metric (that is,the topology of weak convergence), but open neighborhoods do contain boxes.

Of course, it is not obvious that the motion needs to have fixed points, since νcould dominate all fitness costs. This possibility is easy to see in the one-dimensionalcase (when M is a single point), which we described in Section 1.2 and which werevisit in Section 3.1. However, we show in Section 3.4 that at least one fixedpoint exists when the mutation measure ν is small enough. In order to go further,we need to impose additional assumptions. For the case of multiplicative selectioncosts, Section 3.2 gives a complete description of the fixed points, of which there canbe 0, 1 or 2. In the remainder of this chapter we then impose a weaker assumption,namely that the selective cost is concave, in the sense that the marginal cost ofan additional mutation decreases as more mutations are added to the genotype.(The formal definition is given as the last condition in Theorem 3.11.) Underthis assumption, we show in Section 3.5 that trajectories starting from 0 increase

21

22 3. EQUILIBRIA

monotonically, and we give a sufficient condition for them to converge to a finiteequilibrium. This equilibrium is dominated by any other equilibrium. If the drivingvector field at nearby points above this minimal equilibrium are in the negative“orthant”, then trajectories starting above the equilibrium will converge down toit, so the minimal equilibrium will be box attractive. We give an iterative procedurefor computing the minimal equilibrium that avoids following the dynamical systemto large times in Section 3.6. Moreover, the iteration is a useful theoretical tool inSection 3.7, where we establish that the equilibria shown to exist for suitably smallν in Section 3.4 are in fact the same as the minimal equilibria in the concave settingand that these equilibria are then box stable. Finally, in Section 3.8 we apply theabove results to the demographic selection cost introduced in Section 1.4

These results cover many cases of substantive interest, although they do notsettle all relevant questions. Cases with very small ν(M) do not wholly put on dis-play the rich array of differences between the full non-linear model and non-epistaticadditive models. However, the conditions of Corollary 3.14 and Theorem 3.15 canoften be verified in specific cases, and we have found that they can hold when ν(M)is only moderately small.

The example discussed in [WSE08] provides numerical evidence that there canbe convergence to an equilibrium even without uniformly monotone growth.

3.1. Introductory example: One-dimensional systems

Suppose as in Section 1.2 that the space of loci M consists of a single point.Then, the space of genotypes G may be identified with the natural numbers N0, theselective cost S is simply an increasing function from N0 to R+, and the mutationmeasure ν is a positive constant. The space H+ of finite measures measures on Gmay also be identified with R+. The dynamical system (ρt)t≥0 is R+-valued andsatisfies

(3.3) ρt = ν − Fρtρt,

where

(3.4) Fρρ = e−ρ∞∑k=1

ρk(k − ρ)k!

S(k).

The system has equilibria at solutions to the equation ρFρ = ν and we discussedsome special cases in Section 1.2. In general, it is possible to construct selectivecosts for which the number of equilibria is arbitrarily large for a given mutationrate. For example, suppose the selective cost has magnitude 1 for 1 to 5 mutationsand magnitude 2 for 6 or more. The resulting function −ρFρ is shown in Figure 1.The number of equilibria may be 0, 1, 2, 3, or 4, depending on the value of ν.

However, there are a few things that we can say quite generally about one-dimensional systems:

• As long as S is not identically 0, the same will be true for Fρ, so therewill be at least one equilibrium for ν sufficiently small.

• The smallest equilibrium will be attractive, unless it corresponds to alocal minimum of −Fρρ, in which case it will attract only trajectoriescoming from below; trajectories starting above the equilibrium will thenbe repelled.

3.2. INTRODUCTORY EXAMPLE: MULTIPLICATIVE SELECTIVE COST 23

0 5 10 15 20

-1.0

-0.8

-0.6

-0.4

-0.2

0.0

ρ

−ρF

ρ

Figure 1. Plot of the function −ρFρ when S(k) = 1{k≥1} + 1{k≥7}.

• If S is bounded above, then −Fρρ is bounded below, and so there is noequilibrium for ν sufficiently large.

3.2. Introductory example: Multiplicative selective cost

It is probably apparent to the reader by now that one cannot hope to getanything like a closed-form solution for the dynamical system (2.8) for generalselective costs. However, we first show in this section that it is possible to solve(2.8) explicitly when the selective cost has the form

(3.5) S(g) = 1− exp(−∫Mσ(m) dg(m)

)for some σ : M → R+. Moreover, it is possible to analyze in this case whetherlimt→∞ ρt exists, and we will couple this analysis with a comparison argumentin later sections to give sufficient conditions for the existence of a limit for moregeneral selective costs.

24 3. EQUILIBRIA

Suppose that ρ0 is absolutely continuous with respect to ν. It follows from(2.8) that ρt will have a density φt against ν, for all t ≥ 0. These densities willsolve the equation

dφt(m)dt

= 1− E[S(Xρt + δm)− S(Xρt)]φt(m).

Put bk(t) :=∫

exp(−kσ(m))φt(m) dν(m) for k ∈ N, then

(3.6)dφt(m)dt

= 1− (1− exp(−σ(m))) exp(b1(t)− b0(t))φt(m).

This is an ordinary differential equation for each m ∈M that has the solution

φt(m) = exp(−∫ t

0

(1− exp(−σ(m))) exp(b1(s)− b0(s)) ds)

×[φ0(m) +

∫ t

0

exp(−∫ s

0

(1− exp(−σ(m))) exp(b1(r)− b0(r)) dr)ds

].

Thus, we have reduced the (in general) infinite collection of coupled ordinary dif-ferential equations to the problem of finding two functions, b0 and b1.

Of course, this will benefit us only if we have an autonomous system of equationsin just the functions b0 and b1. To this end, set ak :=

∫exp(−kσ(m)) dν(m) for

k ∈ N. Then, from (3.6) we have

(3.7)dbk(t)dt

= ak + exp(b1(t)− b0(t))(bk+1(t)− bk(t)).

Introduce the generating functions

A(z) :=∞∑k=0

akzk

k!

and

B(z, t) :=∞∑k=0

bk(t)zk

k!.

The system of ordinary differential equations (3.7) then becomes the first order,linear partial differential equation

∂B(z, t)∂t

= A(z) + exp(b1(t)− b0(t))[∂B(z, t)∂z

−B(z, t)].

We may find the general solution of this PDE via the method of characteristiccurves. Once we get the general solution (which will involve the unknown functionsb0 and b1), we have to impose the conditions

B(0, t) = b0(t)

and∂B(0, t)∂z

= b1(t)

to get the solution we are after.Using Mathematica, the PDE has the solution

B(z, t) =∫ t

0

exp(−∫ t

u

exp(b1(s)− b0(s)) ds)A

(z +

∫ t

u

exp(b1(s)− b0(s)) ds)du

+ exp(−∫ t

0

exp(b1(s)− b0(s)) ds)C

(z +

∫ t

0

exp(b1(s)− b0(s)) ds)

3.2. INTRODUCTORY EXAMPLE: MULTIPLICATIVE SELECTIVE COST 25

where C(z) :=∑∞k=0 bk(0) z

k

k! .Therefore, the functions b0 and b1 solve the system of equations

b0(t) =∫ t

0

exp(−∫ t

u

exp(b1(s)− b0(s)) ds)A

(∫ t

u


+ exp(−∫ t

0

exp(b1(s)− b0(s)) ds)C

(∫ t

0

exp(b1(s)− b0(s)) ds)

and

b1(t) =∫ t

0

exp(−∫ t

u

exp(b1(s)− b0(s)) ds)A′(∫ t

u


+ exp(−∫ t

0

exp(b1(s)− b0(s)) ds)C ′(∫ t

0

exp(b1(s)− b0(s)) ds).

Note that this system of ordinary differential equations it autonomous: it dependsonly on the unknown functions b0 and b1 themselves, together with the knownfunctions A (determined by the given mutation and selection) and C (determined bythe initial conditions). Hence, these equations could at least be solved numerically.(Of course, we started with a system of as many equations as there are points inM. This reduction to a system of two coupled ordinary differential equations isonly advantageous if the space M has more than two points.)

We can now obtain a necessary and sufficient condition for the existence of anequilibrium for (2.8) when the selection cost is the multiplicative one of (3.5). Notethat if ρ∗ is an equilibrium, then we have

(3.8) ν(dm)− E[S(Xρ∗ + δm)− S(Xρ∗)] ρ∗(dm) = 0,

and so ρ∗ has a density φ∗ against ν that satisfies

φ∗(m) =exp(b∗0 − b∗1)

1− exp(−σ(m)),

where b∗k :=∫

exp(−kσ(m))φ∗(m) dν(m) for k ∈ {0, 1} (cf. (3.6)). Therefore, anequilibrium exists if and only if∫

11− exp(−σ(m))

dν(m) <∞

and there is a constant c > 0 such that

c =∫

exp(c)1− exp(−σ(m))

dν(m)−∫

exp(−σ(m))exp(c)

1− exp(−σ(m))dν(m)

= exp(c) ν(M),

in which case c = b∗0 − b∗1. Such a constant exists if and only if

ν(M) ≤ supx≥0

x exp(−x) = e−1.

Note that there are three possible cases:• If ν(M) < e−1, then there are two equilibria, corresponding to the two

distinct solutions of ce−c = ν(M). The equilibrium corresponding to thesmaller c is attractive, while that corresponding to the larger c is unstable.

• If ν(M) > e−1, then there is no equilibrium.• If ν(M) = e−1, then there is a single equilibrium, which is unstable.

26 3. EQUILIBRIA

3.3. Frechet derivatives

We need to introduce some machinery on derivatives of curves and vector fieldsin order to analyze equilibria in more generality. Consider a Banach space (X, ‖·‖X)and a closed convex cone X+ ⊆ X. For x ∈ X+, let Ux be the closed convex cone{t(x′−x) : t ≥ 0, x′ ∈ X+}. Consider another Banach spaces.(Y, ‖·‖Y ). Extendingthe usual definition slightly, we say that a map Φ : X+ → Y , is Frechet differentiableat x ∈ X+, if there is map DxΦ : Ux → Y with the properties

limx′→x, x′∈X+

‖x′ − x‖−1X

(Φ(x′)− Φ(x)−DxΦ[x′ − x]

)= 0,

DxΦ[tz] = tDxΦ[z] for t ≥ 0 and z ∈ Ux, DxΦ[z′ + z′′] = DxΦ[z′] + DxΦ[z′′] forz′, z′′ ∈ Ux, and, for some constant C, ‖Dx[z]‖Y ≤ C‖z‖X for z ∈ Ux. It is notdifficult to show that if Φ is differentiable at x, then DxΦ is uniquely defined.

As usual, we say that a curve ψ : I → X, where I ⊆ R is an interval, isdifferentiable at t ∈ I if the limit

ψt = limt′→t

(t′ − t)−1(ψt′ − ψt)

exists.For the sake of completeness, we record the following standard fundamental

theorem of calculus and chain rule.

Lemma 3.3. Consider an interval I ⊆ R and a Banach spaces (X, ‖ · ‖X).Suppose that a curve ψ : I → X is differentiable at every t ∈ I and the curvet 7→ ψt is continuous. Then,

ψb − ψa =∫ b

a

ψt dt

for all a, b ∈ I with a < b.

Lemma 3.4. Consider an interval I ⊆ R, two Banach spaces (X, ‖ · ‖X) and(Y, ‖·‖Y ), and a closed convex cone X+ ⊆ X. Suppose for some t ∈ I that the curveψ : I → X+ is differentiable at t ∈ I and the map Φ : X+ → Y is differentiable atψt. Then, the curve Φ ◦ ψ : I → Y is differentiable at t with derivative DψtΦ[ψt].

We also have a particular analogue of the product rule. Write Cb(M,R) forthe Banach space of bounded continuous functions fromM to R equipped with thesupremum norm.

Lemma 3.5. Consider an interval I ⊆ R and two curves γ : I → H+ andf : I → Cb(M,R). Suppose that γ and f are differentiable at t ∈ I. Define acurve β : I → H+ by βu := fu · γu, u ∈ I; that is, βu is the element of H that hasRadon-Nikodym derivative fu with respect to γu. Then, β is differentiable at t with

βt = ft · γt + ft · γt.

Proof. The follows from Lemma 3.4 with X = Cb(M,R) × H+, Y = H,Φ(e, η) = e · η, and ψ = (f, γ) upon showing that the map Φ is differentiable at any(e, η) ∈ Cb(M,R)×H with

De,ηΦ[(e′, η′)] = e′ · η + e · η′

and the curve ψ is differentiable at t with ψt = (ft, γt). Both proofs are straight-forward and we leave them to the reader. �

3.4. EXISTENCE OF EQUILIBRIA VIA PERTURBATION 27

For η ∈ H+ and m′,m′′ ∈M, set

Kη(m′,m′′) := E[S(Xη + δm′ + δm′′)− S(Xη + δm′′)− S(Xη + δm′) + S(Xη)

].

By our standing assumption the conditions of Theorem 2.4 are in place, the map(η,m′,m′′) 7→ Kη(m′,m′′) is bounded. Ideas similar to those behind Lemma 2.7and Lemma 2.8 establish that this kernel gives the Frechet derivative of the mapη 7→ Fη(·).

Lemma 3.6. The mapping η 7→ Fη(·) from H+ to Cb(M,R) is Frechet differ-entiable at every point η′ ∈ H+ with derivative Dη′F given by

Dη′F [η′′](m′) =∫MKη′(m′,m′′) dη′′(m′′).

Moreover, straightforward coupling arguments establish the following bounds,where we recall that the constant K is such that |S(g)− S(h)| ≤ K‖g − h‖Was forall g, h ∈ G.

Lemma 3.7. For any ρ, ρ′, η ∈ H+ with ρ ≤ ρ′,∥∥DρF [η]∥∥∞ ≤ 2Kη(M),

and ∥∥DρF [η]−Dρ′F [η]∥∥∞ ≤ 16K

(ρ′(M)− ρ(M)

)η(M).

3.4. Existence of equilibria via perturbation

We define a family of dynamical systems for u ∈ R+ by

(3.9) ρ(u)t = ρ

(u)0 + utν −

∫ t

0

Dρ(u)s ds.

That is, we apply the equation (2.8) with the mutation measure ν replaced by themultiple uν.

Theorem 3.8. Suppose the selective cost of a nonzero genotype is boundedbelow. That is,

inf{S(δm) : m ∈M} = inf{S(g) : g ∈ G, g 6= 0} > 0.

Then, there exists U > 0 such that there is a finite equilibrium for the equation(3.9) for all u ∈ [0, U ]. That is, there exists a measure ρ(u)

∗ ∈ H+ such that

uν = Fρ(u) · ρ(u).

Proof. If it exists, ρ(u)∗ has a density with respect to ν, which we write as

p(u) ∈ Cb(M,R+). The equilibrium equation then becomes

(3.10) Fp(u) · p(u) = u,

where we adopt the convention that Fr for a function r is Fρ where ρ(dm) =r(m)ν(dm). The theorem is proved if we can show that there is a differentiablefamily of functions p(u) that satisfy p(0) = 0 and (3.10) in a neighborhood of u = 0.

Differentiating with respect to u and applying Lemma 3.6, we get

(3.11)[∫MKp(u)(m′,m′′)

dp(u)

du(m′′) dν(m′′)

]p(u)(m′) +Fp(u)(m′)

dp(u)

du(m′) = 1.

28 3. EQUILIBRIA

For each p ∈ Cb(M,R) we may define the bounded linear operator Tp : Cb(M,R)→Cb(M,R) by

(3.12) Tp(q) :=[∫MKp+(m′,m′′) q(m′′) dν(m′′)

]p+(m′) + Fp+(m′)q(m′),

where p+ ∈ Cb(M,R+) is defined by p+(m) = (p(m))+, m ∈ M. Let D be thesubset of p ∈ Cb(M,R) such that Tp is invertible. It follows that 0 ∈ D, becauseT0(q)(m′) = S(δm′)q(m′) and inf{S(δm) : m ∈M} > 0 by assumption. A standardresult in operator theory (see Lemma VII.6.1 of [DS88]) tells us that the invertibleoperators form an open set in the operator norm topology, so that D includes all psuch that ‖Tp−T0‖ is sufficiently small, where ‖ · ‖ denotes the operator norm. ByLemma 2.8 we can bound ‖Tp − T0‖ by 10K‖p‖∞, from which we conclude that Dincludes an open ball around 0.

Define a map L : D → Cb(M,R+) by

L(p) := T−1p (1),

where 1 ∈ Cb(M,R+) is the function with constant value 1. The map L is contin-uous, and

0 = Tp(L(p)

)− T0

(L(0)

)=(T0 + (Tp − T0)

)(L(0) + (L(p)− L(0))

)− T0

(L(0)

)= (Tp − T0)L(0) + T0(L(p)− L(0)) + (Tp − T0)(L(p)− L(0)).

Thus,L(p) = L(0)− T−1

0

((Tp − T0)L(0)− (Tp − T0)(L(p)− L(0))

).

Since L(0)(m) is bounded away from 0, it follows that there is a neighborhood V of0 such that 0 < infp∈V infm∈M L(p)(m). Hence, by standard results on existenceand uniqueness of solutions to ordinary differential equations in a Banach space,the ordinary differential equation dp(u)

du = L(p(u)) with initial condition p(0) = 0 hasa solution on an interval [0, U ] and this solution takes values in Cb(M,R+). Thus,(p(u))0≤u≤U satisfies (3.11) and hence also (3.10). �

3.5. Concave selective costs

For an important class of examples — including the demographic example ofSection 1.4 — the selective costs are concave, in the sense that the marginal costof adding a given mutation becomes smaller, the more other mutations are alreadypresent. Formally, this is stated in Definition 3.9. Under a few mild constraints, weshow in Theorem 3.11 that concave selective costs yield monotonic solutions (ρt)t≥0

when started from the pure wild type population 0, and hence such systems musteither diverge to a measure with infinite total mass or converge to an element ofH+.Corollary 3.14 gives a further condition that is sufficient to ensure the limit is anelement of H+. Our conditions for monotone increase in ρ over time turn out to besatisfied quite generally for the applications we have investigated. The conditionsfor the existence of a limit in H+, on the other hand, are not always satisfied,and there are important cases (discussed in [WES08]) for which ρt increases to ameasure with infinite total mass.

Definition 3.9. A selective cost function S is concave if

(3.13) S(g + h+ k)− S(g + h) ≤ S(g + k)− S(g) for all g, h, k ∈ G.

3.5. CONCAVE SELECTIVE COSTS 29

Lemma 3.10. A selective cost S is concave if and only if

S(g + δm + δm′)− S(g + δm) ≤ S(g + δm′)− S(g)

for all g ∈ G and m,m′ ∈M.

Proof. Since elements of G have finite integer mass, we can prove (3.13) byinduction on n := max{h(M), k(M)}. Our assumption is equivalent to (3.13) in thecase n = 1. Assume now that (3.13) holds whenever max{h(M), k(M)} ≤ n − 1.Suppose h(M) = n and k(M) ≤ n − 1. Let m be in the support of h, and leth = h+ δm. Then

S(g+h+ k)− S(g + h)− S(g + k) + S(g)

= S(g + k + h+ δm)− S(g + h+ δm)− S(g + k) + S(g)

=[S(g + δm + k + h)− S(g + δm + h)− S(g + δm + k) + S(g + δm)

]+[S(g + δm + k)− S(g + δm)− S(g + k) + S(g)

].

Since h and k both have mass smaller than n, each of the terms in brackets is ≤ 0by the induction hypothesis. To complete the induction, we need only address thecase when h(M) = k(M) = n; this proceeds exactly as above. �

Theorem 3.11. Fix a mutation measure ν ∈ H+ and a selective cost S : G →R+, that satisfies the conditions

• S(0) = 0,• S(g) ≤ S(g + h) for all g, h ∈ G,• for some constant K,

∣∣S(g)− S(h)∣∣ ≤ K∥∥g − h∥∥

Was, for all g, h ∈ G,

• supg∈G S(g) <∞,• S(g + h+ k)− S(g + h) ≤ S(h+ k)− S(h) for all g, h, k ∈ G.

If ρ0 ≥ 0 (≤ 0), then the solution of equation (2.8) guaranteed by Theorem 2.4satisfies ρs ≤ ρt (ρs ≥ ρt) for all 0 ≤ s ≤ t <∞.

Proof. Recall that Fη(m) = E[S(Xη + δm)−S(Xη)] for η ∈ H+ and m ∈M.It is clear from the assumptions that η 7→ Fη(·) is a map from H+ to Cb(M,R).

The curve ρ is differentiable at each t ≥ 0 and satisfies

(3.14) ρt = ν − Fρt · ρt.

The right-hand side is continuous in t. By Lemma 3.3, it then suffices to show thatρt ≥ 0 for all t ≥ 0.

By Lemma 3.4 and Lemma 3.6, the curve t 7→ Fρt , t ∈ R+, is differentiable,with

d

dtFρt = DρtF [ρt],

and, by Lemma 3.5,

ρt :=d

dtρt = −

(DρtF [ρt]

)· ρt − Fρt · ρt.

Define

γt := − exp{∫ t

0

Fρsds

}ρt.

30 3. EQUILIBRIA

Then,

dγtdt

= − exp{∫ t

0

Fρsds

}(Fρt ρt + ρt)

= − exp{∫ t

0

Fρsds

}(Fρt ρt −

(DρtF [ρt]

)· ρt − Fρt · ρt

)= exp

{∫ t

0

Fρsds

}(−DρtF

[exp

{−∫ t

0

Fρsds

}· γt])· ρt.

(3.15)

Suppose now that γ0 = −ρ0 ≤ 0. For any Borel set B ⊆Mγt(B) ≤ γt(B)− γ0(B)

=∫ t

0

(∫B

[∫M−Kρs(m

′,m′′) exp{−∫ s

0

Fρr (m′′) dr

}dγs(m′′)

]× exp

{∫ s

0

Fρr (m′) dr

}dρs(m′)

)ds.

Put βt := supB⊂M γt(B), where the supremum is taken over Borel sets. Because−K and F are both nonnegative and bounded and since ρs(M) ≤ sν(M), for anypositive T there is a positive CT such that

βt ≤ CT∫ t

0

βsds

for 0 ≤ t ≤ T . By Gronwall’s Inequality [SY02, D.1], this implies that βt ≤ 0 forall t. It follows that the measure γt is nonpositive. Thus, ρt (which differs from γtby a strictly negative Radon-Nikodym factor) is nonnegative.

If ρ0 ≤ 0, then we define γt to be + exp{∫ t

0Fρsds

}ρt, and then the rest of the

proof carries through as above. �

Corollary 3.12. Suppose the conditions of Theorem 3.11 hold, and thereexists ρ∗∗ ∈ H+ satisfying the equilibrium condition

ν(dm) = E [S(Xρ∗∗ + δm)− S(Xρ∗∗)] ρ∗∗(dm)

for equation (2.8). For the process started at ρ0 = 0, ρt ↑ ρ∗ ∈ H+, where ρ∗ ≤ ρ∗∗and ρ∗ is also an equilibrium for (2.8).

The proof of the following comparison result is straightforward and is left tothe reader.

Lemma 3.13. Consider two selective cost functions S′ and S′′ that satisfy theconditions of Theorem 3.11. Let ρ′ and ρ′′ be the corresponding solutions of (2.8).Suppose that S′(g + δm) − S′(g) ≤ S′′(g + δm) − S′′(g) for all g ∈ G and m ∈ Mand that ρ′0 ≥ ρ′′0 . Then, ρ′t ≥ ρ′′t for all t ≥ 0.

If the conditions of Theorem 3.11 hold, trajectories starting from 0 will eitherconverge as time goes to infinity to an equilibrium state in H+ or diverge to ameasure with infinite total mass. We therefore wish to consider conditions that willensure the existence of an equilibrium with finite total mass. One approach is tocompare the concave selective cost to a multiplicative selective cost. This producesthe small benefit over the general existence result of Theorem 3.8 of providing anexplicit value of ν(M) that is “small enough” to guarantee the existence of finiteequilibria.

3.5. CONCAVE SELECTIVE COSTS 31

Corollary 3.14. Suppose that the selective cost S : G → R satisfies the con-ditions of Theorem 3.11 and also satisfies the bound

(3.16) S(g + δm′)− S(g) ≥ ξ [1− exp(−τ(m′))] exp(−∫Mτ(m′′) dg(m′′)

)for all m′ ∈M for some constant ξ > 0 and function τ :M→ R+ such that∫

M

11− exp(−τ(m))

dν(m) <∞.

Suppose also that ρ0 is the null measure 0 and ν(M) ≤ e−1ξ. Then, there existsρ∗ ∈ H+ such that ρt ↑ ρ∗ as t→∞.

On the other hand, if the reverse inequality

(3.17) S(g + δm′)− S(g) ≤ ξ [1− exp(−τ(m′))] exp(−∫Mτ(m′′) dg(m′′)

)holds and ν(M) > e−1ξ, then limt→∞ ρt(M) =∞.

Proof. Suppose first that (3.16) holds. Let ρ′′ be the solution of 2.8 withselective cost S′′(g) = ξ

∫M(1− exp(−τ(m)) dg(m) and initial condition ρ′′0 = ρ∗∗,

where

ρ∗∗(dm) =exp(c)

1− exp(−τ(m))ν(dm)

with cξ = exp(c)ν(M) (such a c exists by the assumption that ν(M) ≤ e−1ξ). Itfollows from the results of Section 3.2 that ρ′′t = ρ∗∗ for all t ≥ 0.

Apply Lemma 3.13 with S′ = S and and ρ′0 = 0 to conclude that ρt ≤ ρ∗∗ forall t ≥ 0. It follows from Theorem 3.11 that ρt ↑ ρ∗ as t → ∞ for some ρ∗ ∈ H+

with ρ∗ ≤ ρ∗∗.Now suppose instead that the upper bound (3.17) holds. We define ρ′′ as before,

with selective cost S′′, but with initial condition ρ′′0 = 0. Lemma 3.13 implies thenthat ρt ≥ ρ′′t for all t ≥ 0. We know that ρ′′t is increasing in t, and there is nofinite equilibrium. Suppose R := limt→∞ ρt(M) < ∞. Then for any Borel set A,the quantity ρ′′t (A) is increasing in t and bounded by R, so it converges to a limitρ′′∗(A). It is easy to check that A 7→ ρ′′∗(A) is a measure H+ and that ρ′′t convergesto ρ′′∗ in the Wasserstein metric (that is, in the topology of weak convergence).From equation (2.8) we know that for any Borel set A, and any s < t,

0 ≤ ρ′′t (A)− ρ′′s (A) =∫ t

s

(ν(A)−D′′ρ′′u(A)

)du,

where the operator D′′ is the analogue of the operator D when the selective cost Sis replaced by the selective cost S′′. We conclude that ν(A)−D′′ρ′′u(A) ≥ 0 for allu ≥ 0, and so∣∣ρ′′∗(A)− ρ′′t (A)

∣∣ =∣∣∣∣∫ ∞t

(ν(A)−D′′ρ′′u(A)

)du

∣∣∣∣ ↓ 0 as t→∞.

Since D′′ρ is continuous in ρ by Lemma 2.5, it follows that the integrand on theright-hand side converges to ν(A) − D′′ρ′′∗(A), which must then be 0. Since thisis true for all Borel sets A, it would follow that ρ′′∗ would be an equilibrium forthe dynamical system with selective cost S′′, contradicting the fact that no suchequilibrium exists. �

32 3. EQUILIBRIA

The monotone growth of systems with concave fitness cost functions allows usto derive a simple sufficient condition for stability of the “minimum equilibrium”ρ∗ — the fixed point to which the process converges when started from 0. Weknow that all trajectories that start strictly below ρ∗ converge asymptotically toρ∗. Since the vector field vanishes at ρ∗, this implies that the derivative of thevector field −Dρ∗F [η] · ρ∗ − Fρ∗ · η is nonpositive for all positive directions η. Theequilibrium is box stable if this nonpositivity extends to a neighborhood of ρ∗, whichcan be guaranteed if the derivative is actually bounded away from 0, measured byits Radon-Nikodym derivative with respect to ν.

Theorem 3.15. (i) Suppose the selective cost function satisfies the conditionsof Theorem 3.11, and the curve (ρt)t≥0, started at ρ0 = 0, converges to a finitefixed point ρ∗. Suppose further that

k := infm∈M

Dρ∗F [ν](m)dρ∗dν

(m) + Fρ∗(m) > 0.

Then the fixed point is box stable.(ii) Moreover, if M is compact and the equation

Dρ∗F [η] · ρ∗ + Fρ∗ · η = 0

has no solution η ∈ H+ that is absolutely continuous with respect to ν, then ρ∗ isbox attractive.

Proof. Consider part (i). Let ψs := ρ∗+ sν, and let φs = ν −Fψs ·ψs, be thedriving vector field evaluated at the point ψs. Applying Lemma 3.7, we see that

infm∈M

(DρF [ν](m)

dρ

dν(m) + Fρ(m)

)> k − 10Ksν(M)− 16Ksν(M)2

(supm′∈M

dρ∗dν

(m) + s

).

For positive s small enough this bound is positive. Then, φ0 = 0 and

φs =∫ s

0

φrdr

=∫ s

0

(−DφrF [ν]− Fφr · ν

)dr

< 0 for s sufficiently small.

Applying Theorem 3.11, we see that when ρ0 = ψs, the trajectory ρt is monotoni-cally decreasing. Thus, for all times t ≥ 0 we have ρt ∈ B(ρ∗, ψs).

Now consider part (ii). Suppose that there is no η ∈ H+ that is absolutelycontinuous with respect to ν and satisfies Dρ∗F [η] · ρ∗ + Fρ∗ · η = 0. For all ssufficiently small, the trajectory starting from ψs is monotonically decreasing, andconverges to an equilibrium ρ

(s)∗ . If ρ∗ were not box attractive, then these equilibria

would be distinct from ρ∗ = ρ(0)∗ . Since the ρ(s)

∗ are all between ρ∗ and ρ∗ + sν,we know that ρ(s)

∗ − ρ∗ is absolutely continuous with respect to ν. Consider themeasures π(s), defined by

π(s) =ρ

(s)∗ − ρ∗

ρ(s)∗ (M)− ρ∗(M)

.

3.7. STABLE EQUILIBRIA IN THE CONCAVE SETTING VIA PERTURBATION 33

These are probability measures onM. SinceM is compact, the space of probabilitymeasures on M is also compact (recall that the Wasserstein norm induces thetopology of weak convergence), so there is an accumulation point. Let s1, s2, . . .be a sequence converging to u, with π(si) → η. Since the vector field vanishes atall of these points, the derivative in direction η vanishes as well, contradicting ourassumption, and hence proving that ρ∗ is, in fact, box attractive. �

3.6. Iterative computation of the minimal equilibrium

The measure π ∈ H+ is an equilibrium for the dynamic equation (2.8) if Fπ ·π =ν, where we recall that Fπ · π is the measure that has Radon-Nikodym derivativeFπ with respect to π. This implies that π is absolutely continuous with respect toν with a Radon-Nikodym derivative p that satisfies

(3.18) Fp(m)p(m) = 1, m ∈M,

where, with a slight abuse of notation, we write Fq for Fκ when κ ∈ H+ is a measurethat has Radon-Nikodym derivative q with respect to ν. Under the conditions ofTheorem 3.11, if (3.18) has a solution, then it has a minimal solution that arisesas the Radon-Nikodym derivative with respect to ν of limt→∞ ρt when ρ is thesolution of (2.8) with ρ0 = 0.

If one is only interested in finding the minimal equilibrium numerically, then itwould be desirable to be able to do so without having to solve (2.8). An obviousapproach to that problem is to define a sequence of functions pn : M → R+

inductively by p0 = 0 and

pn+1 =1Fpn

, n ≥ 0.

It is clear that if S is concave, then Fp′′ ≤ Fp′ for p′ ≤ p′′. Because p0 =0 ≤ p1, it follows that p0 ≤ p1 ≤ . . .. Moreover, if p∗∗ is a solution of (3.18) suchthat

∫M p∗∗(m) ν(dm) < ∞, then pn ↑ p∗ ≤ p∗∗ as n → ∞ for some function

p∗ :M→ R+ and the measure p∗ · ν is the minimal equilibrium of the dynamicalequation (2.8).

Remark 3.16. If π ∈ H+ is an equilibrium for (2.8) with Radon-Nikodymderivative p against ν, then, from (3.18),

π(M) =∫Mp(m) dν(m) =

∫M

1Fp(m)

dν(m)

≥∫M

1supg∈G S(g + δm)− S(g)

dν(m) =∫M

1S(δm)

dν(m)

under the concavity assumption. Thus, a necessary condition for the existence ofan equilibrium in H+ is that the last integral is finite.

3.7. Stable equilibria in the concave setting via perturbation

Suppose that the selective cost is concave. In Section 3.4 we constructed equi-libria for sufficiently small mutation measures. We know from Section 3.5 that alltrajectories starting below an equilibrium ρ converge asymptotically to an equilib-rium that is also dominated by ρ. This leaves open the question of whether theequilibrium constructed by perturbing the dynamical system away from ν ≡ 0 isthe same as the minimal equilibrium ρ∗ to which the system converges when startedfrom 0.

34 3. EQUILIBRIA

Theorem 3.17. (i) Suppose the selective cost function satisfies the conditionsof Theorem 3.11. For U > 0 sufficiently small, there is a unique p : [0, U ] →Cb(M,R+) solving the equation[∫

MKp(u)(m′,m′′) ˙p(u)(m′′) dν(m′′)

]p(u)(m′) + Fp(u)(m′) ˙p(u)(m′) = 1,

with p(0) ≡ 0. The measure p(u)ν ∈ H+ is the minimal equilibrium for the systemwith mutation measure uν for all u ∈ [0, U ]. Furthermore, the minimal equilibriaso realized for u < U are box stable.

(ii) Moreover, if M is compact and the equation

Dρ∗F [η] · ρ∗ + Fρ∗ · η = 0

has no solution η ∈ H+ with η absolutely continuous with respect to ν, then ρ∗ isbox attractive.

Proof. Following the method of Section 3.6, we may compute the minimalequilibrium as the limit of the iteration

pn+1 =u

Fpn,

with p0 ≡ 0. Let qn be the corresponding iterates for vν, where 0 ≤ u < v.The concavity of S implies that pn ≤ qn for all n and

qn+1 − pn+1 =v

Fqn− u

Fpn

=vFpn − uFqnFpnFqn

=(v − u)FpnFpnFqn

+u(Fpn − Fqn)FpnFqn

≤ a|v − u|+ b‖qn − pn‖∞for appropriate constants a and b, if pn and qn both converge to finite equilibria,by Lemma 2.8.

Thus,‖qn+1 − pn+1‖∞ ≤ a|v − u|+ b‖qn − pn‖∞.

Moreover, we can make the constant b less than 1 by taking u and v sufficientlysmall. Iterating, we see that

‖qn+1 − pn+1‖∞ ≤ c|v − u|+ bn‖q0 − p0‖∞,

where c = a/(1 − b). Hence, the corresponding minimal equilibria, say p∗ and q∗,satisfy

‖q∗ − p∗‖∞ ≤ c|v − u|.Therefore, if we write u 7→ p(u) for the curve of densities of minimal equilibria,

then0 ≤ p(v)(m)− p(u)(m) ≤ c(v − u)

for all m ∈ M for all 0 ≤ u < v sufficiently small. It follows that u 7→ p(u)(m)is Lebesgue almost everywhere differentiable, and is the integral of its derivative.Since p(u) satisfies the relation p(u)Fp(u) = u for every u, we see by differentiatingwith respect to u that p(u) satisfies the differential equation in the statement. By

3.8. EQUILIBRIA FOR DEMOGRAPHIC SELECTIVE COSTS 35

standard uniqueness results for ordinary differential equations, it is the uniquesolution.

Fix any u ∈ [0, U). The dynamical system with mutation measure Uν convergesto a finite equilibrium p(U)ν ∈ H+, where p(U) ≥ p(u). Let (ρt)t≥0 be the dynamicalsystem with mutation measure Uν started from ρ0 = p(u) · ν and let (ρ′t)t≥0 be thereverse – the dynamical system with mutation measure uν started from ρ′0 = p(U) ·ν.Thus, (ρt)t≥0 starts below its equilibrium, and (ρ′t)t≥0 starts above its equilibrium.We have

ρ′0 = uν − Fp(U)p(U) · ν = (u− U)ν ≤ 0,

ρ0 = uν − Fp(u)p(u) · ν = (U − u)ν ≥ 0.

Therefore, p(U) > p(u), and we can conclude from Theorem 3.11 that the measureρt stays bounded between the measures p(u) ·ν and p(U) ·ν for all times t, and hencethe minimal equilibrium p(u) · ν is box stable.

Now consider part (ii). We know that the system with mutation measure uνconverges monotonically downward to an equilibrium when started from p(v)ν. Thefinal part of the proof, showing that this equilibrium is in fact p(u) · ν, proceedsexactly as in the previous proof of Theorem 3.15. �

3.8. Equilibria for demographic selective costs

As an example of Corollary 3.14, suppose that S is the demographic selectivecost of Section 1.4, so that

S(g + δm′)− S(g) =∫ ∞

0

(1− e−θ(m

′,x))fx exp

(−λx−

∫Mθ(m′′, x) dg(m′′)

)dx.

This selection cost is concave, allowing us to apply the results of Sections 3.5 and3.7. If supm,x θ(m,x) < ∞ and infm infx∈B θ(m,x) > 0 for some set B such that∫Bfx dx > 0, then the conditions of Corollary 3.14 hold, with the function τ a

suitable constant and the constant ξ sufficiently small. In other words, if there isa range of fertile ages over which all deleterious mutations reduce survival by atleast some minimal amount, then selection keeps the total intensity from going toinfinity provided ν(M) ≤ e−1ξ.

On the other hand, when supm,x θ(m,x) < ∞ holds (the lower bound is thenirrelevant) there is also a (larger) constant τ and ξ sufficiently large such that thereverse inequality (3.17) holds. We then conclude that there is no convergence whenν(M) is too large.

For example, suppose that S is the demographic selective cost of Section 1.4,with M = [α, β] for 0 < α < β <∞, ν is a constant multiple of Lebesgue measureon M, fx is constant, and θ(m,x) = η1[m,β](x) for some constant η. Such asimplified model has featureless fertility between two ages that represent the onsetand end of reproduction, mutations associated with effects at specific ages, constantmutation rate during the reproductive span, and equal increments to the hazardfrom all mutations. Then, for a suitable constant ζ,

S(g + δm)− S(g) ≤ ζ(1− e−η

)[exp (−λm)− exp (−λβ)] ≤ ζ(m− β).

Because ∫ β

α

1m− β

dm =∞,

36 3. EQUILIBRIA

an equilibrium with finite total mass does not exist. Of course, (3.18) may have asolution p such that p · ν has infinite total mass and so does not belong to H+. Inthe setting of Theorem 3.11, the measure π = p ·ν could still arise as the increasinglimit of the solution ρt to the dynamic equation (2.8) and be such that populationaverage selective cost E[S(Xπ)] would be finite. These more delicate possibilitiesare treated in [WES08].

Observe, now, that Theorem 3.15 immediately tells us that the minimal equi-librium is box stable when ν is sufficiently small. In order to show that it is boxattractive, though, we need to apply Theorem 3.15, which requires that we showthe derivative to be nonsingular in a neighborhood of the equilibrium. An equilib-rium π for the demographic selective cost is a measure π(dm) = p(m)ν(dm) wherep satisfies[∫ ∞

0

(1− e−θ(m′,x))fx exp

(−λx−

∫M

(1− e−θ(m′′,x)) p(m′′) dν(m′′)

)dx

]p(m) = 1.

If we differentiate the left-hand side at p in the direction of the function q,(giving us the derivative of the vector field) we obtain

−[∫ ∞

0


(−λx−

∫M

(1− e−θ(m′′,x)) p(m′′) dν(m′′)

)×(−∫M

(1− e−θ(m′′′,x)) q(m′′′) dν(m′′′)

)dx

]p(m′)

−[∫ ∞

0


(−λx−

∫M

(1− e−θ(m′′,x)) p(m′′) dν(m′′)

)dx

]q(m′)

=1

p(m′)

[p(m′)2

∫ ∞0


(−λx−

∫M

(1− e−θ(m′′,x)) p(m′′) dν(m′′)

)×(∫M

(1− e−θ(m′′′,x)) q(m′′′) dν(m′′′)

)dx − q(m′)

].

(3.19)

As we have already observed, if supm,x θ(m,x) <∞ and infm infx∈B θ(m,x) >0 for some set B such that

∫Bfx dx > 0, then

S(g + δm′)− S(g) ≥ ξ [1− exp(−τ)] exp (−τ g(M))

for all m′ ∈M for suitable constants ξ > 0 and τ > 0. If ν(M) ≤ e−1ξ, then thereis a minimal equilibrium π with

π ≤ γ(ν(M))ξν(M)(1− exp(−τ))

ν,

where γ(z) is the solution of cξ = exp(c)z for z ≤ e−1ξ. Hence π = p · ν with

supm∈M

p(m) ≤ γ(ν(M))ξν(M)(1− exp(−τ))

.

Observe that

limz↓0

γ(z)z

=1ξ,

so we can vary ν (but keep the selective cost fixed) and conclude that

sup{p(m) : m ∈M, ν(M) ≤ e−1ξ} =: p <∞.

3.8. EQUILIBRIA FOR DEMOGRAPHIC SELECTIVE COSTS 37

The supremum over m′ ∈M of the quantity in brackets on the right-hand sideof (3.19) is at most

supm∈M

q(m)[−1 + ν(M) p2

∫ ∞0

fxe−λx dx

],

which is negative for ν(M) sufficiently small. Thus, Theorem 3.15 implies that theminimal equilibrium is box attractive when ν(M) is sufficiently small.

CHAPTER 4

Mutation-selection-recombination in discrete time

We devote the remainder of this work to establishing rigorously the claim thatwe argued heuristically in the Introduction: that the dynamical system (1.5), whichwe defined more formally in (2.8), should be the limit of a sequence of infinitepopulation mutation-selection-recombination models in the standard asymptoticregime where selection and mutation are weak relative to recombination and bothscale at the same infinitesimal rate in the limit.

More specifically we show that our continuous time model is indeed a limit ofthe sort of discrete generation, infinite population mutation-selection-recombinationmodels considered in [BT91, KJB02], once such models have been extended toincorporate our more general definition of genotypes. Such a result not only justifiesour model as a tractable approximation to more familiar models in the literature,but, as we remarked in the Introduction, it also marks out the range of relativestrengths for mutation, selection, and recombination where this approximation canbe expected to be satisfactory, and thus where the resulting conclusions that underlyapplications to the demographic study of longevity can be trusted.

4.1. Mutation and selection in discrete-time

Consider first the situation without recombination.Recall that the distribution of genotypes in a population is described by a

probability measure on G, the space of finite integer-valued measures on the setMof loci.

We consider a sequence of models indexed by the positive integers. In thenth model each new birth gets a random set of extra mutations away from theancestral wild type added to those it inherited from its parents. The loci at whichthese mutations occur are distributed as the Poisson random measure Xν/n withintensity ν/n. The fitness of a genotype g ∈ G is e−S(g)/n. We speed up time by afactor of n so that n generations pass in 1 unit of time. The net result is that in afinite, non-zero amount of time the population is asymptotically subject to finite,non-zero “amounts” of mutation and selection.

We now define operators Mn and Sn acting on the space of probabilities onG that describe the transformation in genotype distribution by, respectively, oneround of mutation and one round of selection. The intuition behind these definitionsis described in the Introduction.

Notation 4.1. Given a probability measure P on G, define new probabilitymeasures MnP and SnP by

(4.1) MnP [F ] :=∫G

E[F (g +Xν/n)] dP (g)

39

40 4. MUTATION-SELECTION-RECOMBINATION IN DISCRETE TIME

and

(4.2) SnP [F ] :=

∫G e−S(g)/nF (g) dP (g)∫G e−S(g)/n dP (g)

for a Borel function F : G → R+.

Note that when n is large,

P{Xν/n = 0} = exp(−ν(M)/n) ≈ 1− ν(M)/n,

and conditional on the event {Xν/n 6= 0} the random measure Xν/n is approxi-mately a unit point mass with location in M chosen according to the probabilitymeasure ν(·)/ν(M). Hence, for any probability measure P on G

(4.3) limn→∞

n(MnP [F ]− P [F ]

)=∫G

(∫MF (g + δm)− F (g) ν(dm)

)dP (g).

In particular, if F is of the form F (g) := e−g[f ] for some Borel function f :M→ R+,then

(4.4) MnP [F ] = P[F · eν[e−f−1]/n

]and so

(4.5) limn→∞

n(MnP [F ]− P [F ]

)= ν

[e−f − 1

]P [F ].

Note also that e−S(g)/n ≈ 1− S(g)/n when n is large and so, when P [S] is finite,

(4.6) limn→∞

n(SnP [F ]− P [F ]

)= P [S · F ]− P [S]P [F ].

When we start with a population genotype distribution P , the population geno-type distribution after one generation of mutation and selection is MnSnP (assum-ing that selection precedes mutation). A trajectory of the resulting discrete-timemodel is defined by iteration: Given an initial population genotype distribution P ,the population genotype distribution after m generations is

(MnSn

)mP .

If, after speeding up time in the nth model, the resulting sequence of trajectorieshas a continuously differentiable limit Pt, this limit should satisfy the equation

limε↓0

ε−1(Pt+ε[F ]− Pt[F ]

)= limn→∞

n(MnSnP [F ]− P [F ]

)= limn→∞

n(SnP [F ]− P [F ]

)+ limn→∞

n(MnSnP [F ]−SnP [F ]

)= P [SF ]− P [S]P [F ] + ν

[e−f − 1

]P [F ]

for test functions F : G → R+ of the form F (g) = e−g[f ] for some Borel functionf :M→ R+. For this special choice of test function, this is precisely the dynamicalequation defining the recombination-free process that we introduced in [SEW05]and derived heuristically in equation (1.4). In fact, these test functions are enoughto consider, since they determine probability distributions on G. Formally, a proofthat the discrete-time process converges to this continuous-time process would re-quire that we prove the existence of the continuously differentiable limit, a fact thatwe assumed above.

4.2. RECOMBINATION IN DISCRETE-TIME 41

4.2. Recombination in discrete-time

We now introduce recombination. Imitating [BT91], we think of a recom-bination event as taking two genotypes g′, g′′ ∈ G from the population and re-placing the genotype g′ in the population by the genotype g defined by g(A) :=g′(A ∩ R) + g′′(A ∩ Rc), where R ⊆ M is the particular “segregating set” for therecombination event. That is, the new individual with genotype g has the sameaccumulated mutations as the individual with genotype g′ (respectively, g′′) for“loci” in the set R (respectively, Rc). (Recall that we work in an abstract frame-work in which loci are just places at which mutations from wild type can occurrather than concrete physical loci per se). From a biological point of view, therecombination process would more legitimately be defined on a linear sequence ofloci, as in [BT91], so that one can picture recombination caused by crossovers dur-ing meiosis. Working with a somewhat abstract space of loci is perhaps a weakness,but also a useful simplification, of our approach.)

We will, of course, think of g′ and g′′ as being chosen independently at randomaccording to the particular probability measure describing the distribution of geno-types in the population. We will also imagine that the segregating set is chosen atrandom via some suitable mechanism. In order to discuss random sets rigorously,we follow formalism described in [Ken74] and define a σ-algebra on sets of Borelsubsets ofM by the requirement that all incidence functions with Borel subsets aremeasurable. A consequence of this definition is that if Ξ is a random Borel set andκ is a finite measure, then κ(Ξ) is a real-valued random variable. We suppose thereis a probability distribution R that describes the distribution of the random setof loci that segregate together. We will always assume, without loss of generality,that R is symmetric in the sense that

(4.7) R(A) = R({Rc : R ∈ A}

),

where Rc denotes the complement of the set R.

Notation 4.2. For any Borel measure g on M and Borel subset R of M,define the Borel measure gR on M by

gR(A) := g(A ∩R)

for Borel subsets A ⊂M. Given the distributionR of a random subset ofM, definethe corresponding recombination operator that maps the space of Borel probabilitymeasures on G into itself by

(4.8) RP [f ] :=∫B(M)

∫G

∫Gf(g′R + g′′Rc) dP (g′) dP (g′′) dR(R),

where P is a Borel probability measure on G, f : G → R is a bounded Borel function,and B(M) is the collection of Borel subsets of M.

Thus, if P describes the distribution of genotypes in the population, then RPdescribes the distribution of a genotype that is obtained by picking two genotypesg′ and g′′ independently according to P and then forming a composite genotypethat agrees with g′ on the set R and g′′ on the set Rc, where the segregating setR is picked according to the distribution R. Recall that µP denotes the intensitymeasure of P ; that is,

(4.9) µP (A) :=∫Gg(A) dP (g).

42 4. MUTATION-SELECTION-RECOMBINATION IN DISCRETE TIME

Remark 4.3. Note that µRP = µP ; that is, the intensity measure of the pointprocess describing the genotype of a randomly chosen individual is left unchangedby recombination. Also, note that if P is the distribution of a Poisson randommeasure, then RP = P .

4.3. The discrete-time model

Notation 4.4. Given a probability measure P0 on G, set Qk := (RMnSn)kP0.That is, if P0 describes the population at generation 0, then Qk describes thepopulation at generation k, where we suppose that each generation is produced fromthe previous one by the successive action of selection, mutation, and recombination.

CHAPTER 5

Hypotheses and statement of the convergenceresult

5.1. Shattering of point processes

We expect recombination to break up dependencies between different parts ofthe genome, so that RkP should be approximately ΠµP when k is large, where werecall that ΠµP is the distribution of the Poisson random measure with intensityµP . In order that this happens for a given distribution P , it must generically be thecase that there is a positive probability that the segregating set and its complementwill both intersect any set with positive µP mass in two sets that each have positiveµP mass. The following condition (with λ = µP ) will be useful for establishingquantitative bounds on the rate with which RkP converges to ΠµP .

Definition 5.1. Given a recombination measure R and a finite measure λ onM, we say that the pair (R, λ) is shattering if there is a positive constant α suchthat for any Borel set A,

λ(A)3 ≤ α[λ(A)2 − 2

∫λ(A ∩R)2 dR(R)

]= 2α

∫λ(A ∩R)λ(A ∩Rc) dR(R)

(5.1)

For example, suppose that N is a finite simple point process on M = (0, 1].We think ofM in this case as a physical chromosome and N as the set of crossoverpoints formed during meiosis. Write 0 < T1 < . . . < TK < 1 for the successivepoints of M. Set T0 = 0 and TK+1 = 1. Let Z be a {0, 1}-valued random variablethat is independent of N with P{Z = 0} = 1

2 . Define R to be the distribution ofthe random set given by

(T0, T1] ∪ (T2, T3] ∪ . . . , if Z = 0,

(T1, T2] ∪ (T3, T4] ∪ . . . , if Z = 1.

Take λ to be any diffuse probability measure (that is, λ has no atoms). Supposethat there is a constant c such that P{N(u,w] = 1} ≥ cλ((u,w]) for 0 < u < w ≤ 1.This will be the case, for example, if N is Poisson process with intensity measurebounded below by a positive multiple of λ or N consists of a single point withdistribution bounded below by a positive multiple of λ. Also, for most “reasonable”simple point processes with intensity Cλ for some constant C, it will be the casethat P{N(u,w] = 1} ≈ E[N(u,w]] = Cλ((u,w]) when |u − w| is small, where thenotation ≈ indicates that the ratio of the two sides is close to 1.

43

44 5. HYPOTHESES AND STATEMENT OF THE CONVERGENCE RESULT

Then,

λ(A)2 − 2∫λ(A ∩R)2 dR(R)

= λ(A)2 − 212

∫ 1

0

∫ 1

0

1A(u)1A(w)P{N((u,w]) = 0 mod 2} dλ(u) dλ(w)

=∫ 1

0

∫ 1

0

1A(u)1A(w)P{N((u,w]) = 1 mod 2} dλ(u) dλ(w)

≥∫ 1

0

∫ 1

0

1A(u)1A(w)P{N((u,w]) = 1} dλ(u) dλ(w)

≥ c∫ 1

0

∫ 1

0

1A(u)1A(w)λ((u,w]) dλ(u) dλ(w).

Observe that

λ(A)3 = 3!∫∫∫

{0<u<v<w≤1}1A(u)1A(v)1A(w) dλ(u) dλ(v) dλ(w)

≤ 3!∫∫∫

{0<u<v<w≤1}1A(u)1A(w) dλ(u) dλ(v) dλ(w)

= 3!∫∫{0<u<w≤1}

1A(u)1A(w)λ((u,w]) dλ(u) dλ(w)

= 3∫ 1

0

∫ 1

0

1A(u)1A(w)λ((u,w]) dλ(u) dλ(w).

Thus, the pair (R, λ) is shattering with constant α = 3/c.

Remark 5.2. Note that if the pair (R, λ) is shattering with some constant α,then it follows from (5.1) that λ({x}) = 0 for all x ∈ M, and so the measure λ isdiffuse.

Remark 5.3. It can be shown that if the pair (R, µP ) is shattering and thereis a constant β such that∫

g(A)1{g(A)≥2} dP (g) ≤ βµP (A)2

for any Borel set A ⊆M, then RkP converges to ΠµP as k →∞.

5.2. Statement of the main convergence result

We can now state our main result that justifies the continuous-time model (2.8)as a limit of discrete-time models in which recombination acts on a faster time scalethan mutation and selection. We use the notation bxc to denote the greatest integerless than or equal to the real number x.

Theorem 5.4. Let (ρt)t≥0 be the measure-valued dynamical system of (2.8)whose existence is guaranteed by Theorem 2.4. Suppose in addition to the hypothesesof Theorem 2.4 that the pair (R, ν) (respectively, (R, ρ0)) consisting of the recombi-nation measure and the mutation intensity measure (respectively, the recombinationmeasure and the initial intensity) is shattering, the initial measure P0 is Poisson

5.2. STATEMENT OF THE MAIN CONVERGENCE RESULT 45

(with intensity ρ0), and the selective cost S is bounded with σ = supg∈G S(g). Then,for any T > 0,

limn→∞

sup0≤t≤T

∥∥Πρt −Qbtnc∥∥

Was= 0.

This result is proved in Chapter 8 as a consequence of a similar result in Chap-ter 6 in which the partially Poissonizing action of recombination is replaced bycomplete Poissonization in each generation. Although it is intuitively reasonablethat the effect of recombination should be asymptotically indistinguishable fromthat of complete Poissonization when the per-generation effect of selection is suffi-ciently small, the technical difficulties are substantial, and we collect some necessarypreliminary results in Chapter 7.

Remark 5.5. We have assumed throughout, for notational convenience, a par-ticular order of operations: In each generation there is first selection, then muta-tion, then recombination. This order has no special significance, and it is easyto see that the proofs would hold equally well for another order, or if the sametotal amounts of mutation and selection were split up into multiple bouts within ageneration, whether before or after recombination.

Remark 5.6. Because of Remark 5.2, the hypotheses of Theorem 5.4 implythat the mutation intensity measure ν and the initial intensity ρ0 are both diffuse.It follows easily from this that each probability measure Qk assigns all of its massto the set of elements of G that have atoms of mass one; that is, every Qk is thedistribution of a simple point process.

From now on, we assume without further comment that all the prob-ability measures on G we consider are distributions of simple point pro-cesses.

CHAPTER 6

Convergence with complete Poissonization

Recall from Notation 4.4 that Qk := (RMnSn)kP0. Theorem 5.4 says that,under suitable conditions, Qbtnc is close to Πρt when n is large. The intuitionbehind this result is that R drives distributions on G towards Poisson faster thanSn drive ones away. As a first step in the proof of this theorem, we prove a simplerresult in which the operator R is replaced by the operator that immediately replacesa distribution on G by the distribution of a Poisson random measure with the sameintensity measure. This latter proposition is essentially a shadowing result, but noneof the standard shadowing theorems, such as those in [CKP95], can be applied asstated to this setting. In any case, the direct proof is not difficult.

Notation 6.1. The complete Poissonization operator P acts on a probabilitymeasure P on G by PP := ΠµP . That is, PP is the distribution of the Poissonrandom measure with the same intensity measure µP as P .

Notation 6.2. SetQ′k := (PMnSn)kP0

andπk := µQ′k.

That is, Q′k the distribution of the Poisson random measure resulting from k itera-tions of mutation and selection of intensity 1/n, with complete Poissonization afterevery round, and πk is the corresponding intensity measure.

Proposition 6.3. There are constants A,B,C, depending on K, ν(M), andρ0(M), such that for every T > 0

sup0≤t≤T

‖πbtnc − ρt‖Was ≤ eAT (BT + C)n−1.

In particular,

limn→∞

sup0≤t≤T

∥∥∥Πρt −Q′btnc∥∥∥

Was= 0.

Proof. For any 0 ≤ s ≤ t,∥∥ρt − ρs∥∥Was≤ ‖(t− s)ν‖Was +

∥∥∥∥∫ t

s

Fρuρu du

∥∥∥∥Was

.

The first term on the right is simply (t− s)‖ν‖Was = (t− s)ν(M). For the secondterm we use Lemma 2.5 and the triangle inequality, together with the universalbound ‖ρu‖Was ≤ ‖ρ0‖Was + u‖ν‖Was, to obtain the bound

2K∫ t

s

‖ρu‖Was du ≤ 2K(t− s)(ρ0(M) + tν(M)

).

47

48 6. CONVERGENCE WITH COMPLETE POISSONIZATION

Putting these bounds together,

(6.1)∥∥ρt − ρs∥∥Was

≤ (t− s)ν(M) + 2K(t− s)(ρ0(M) + tν(M)

).

Note that

(6.2)d(πm+1 − ν/n)

dπm(x) =

E[e−S(Xπm+δx)/n]E[e−S(Xπm )/n]

,

so that ∥∥∥∥πm+1 −ν

n−πm

(1− 1

nFπm(·)

)∥∥∥∥Was

≤ n−2 · 2e2K/n(E[S(Xπm)2] +K2) exp {E[S(Xπm)/n]} .(6.3)

Next, ∥∥∥∥πm(1− 1nFπm

)− πm

(1− 1

nFρm/n

)∥∥∥∥Was

≤ 1n

supx∈M

∣∣Fπm(x)− Fρm/n(x)∣∣

≤ 8Kn

∥∥πm − ρm/n∥∥Was

(6.4)

by Lemma 2.8, and

(6.5)∥∥∥∥πm(1− 1

nFρm/n

)− ρm/n

(1− 1

nFρm/n

)∥∥∥∥Was

≤∥∥πm − ρm/n∥∥Was

since Fρ(x) is always nonnegative. Finally,∥∥∥∥ρ(m+1)/n −ν

n− ρm/n

(1− 1

nFρm/n

)∥∥∥∥Was

≤

∥∥∥∥∥∫ 1/n

0

Fρm/nρm/n − Fρs+m/nρs+m/n ds

∥∥∥∥∥Was

≤∫ 1/n

0

supx∈M

∣∣Fρm/n(x)− Fρs+m/n(x)∣∣∥∥ ρm/n+s

∥∥Was

ds

+ supx∈M

Fρm/n(x)∫ 1/n

0

∥∥ρm/n − ρs+m/n∥∥Wasds

≤∫ 1/n

0

8Ks(m+ 1n

ν(M) + ρ0(M))ds

+K

∫ 1/n

0

∥∥∥∥sν − ∫ s

0

Dρu+m/n du

∥∥∥∥Was

ds

≤ Kn−2

(4 +

12

)m+ 1n

(ν(M) + ρ0(M)

)+Kν(M)n−2.

(6.6)

Thus, there are constants a and b such that

‖πm+1 − ρ(m+1)/n‖Was ≤a

n‖πm − ρm/n‖Was +

bT + c

n2.

Hence,

‖πm − ρm/n‖Was ≤ n−1eaTbT + c

a.

6. CONVERGENCE WITH COMPLETE POISSONIZATION 49

Combining this inequality with the inequality (6.1), the proposition follows imme-diately. �

CHAPTER 7

Preparatory lemmas for the main convergenceresult

7.1. Consequences of shattering

We introduced the recombination probability measureR as the distribution of arandom subset ofM with the property that the subset and its complement have thesame distribution, and we think of the subset and its complement as segregating setof loci. These two sets form a partition of M and, with a slight abuse of notation,we will use also use the notation R for the distribution of this random partitionof M into two sets. In the same vein, we define the probability measure Rk onpartitions ofM to be the distribution of the coarsest random partition that is finerthan each of the independent partitions {Rj , Rcj}, 1 ≤ j ≤ k, where each partition{Rj , Rcj} has distribution R. That is, Rk is the distribution of the partition of Mconsisting of the non-empty sets of the form R1 ∩ · · · Rk, where Rj is either Rj orRcj . Using standard notation, we write this partition as {R1, R

c1} ∧ · · · ∧ {Rk, Rck}.

Note that the number of sets in this partition is at most 2k, but may be smaller.With this notation, the product of k copies of the recombination operator R

can be written as

(7.1) Rk =∫

RAdRk(A),

where, for a finite partition A = {A1, . . . , AK} of M into Borel sets, the annealedrecombination operator RA is the operator on probability measure on G defined by

(7.2) RAQ[F ] :=∫· · ·∫F(g

(1)A1

+ · · ·+ g(K)AK

)dQ(g(1)) · · · dQ(g(K))

for a probability measure Q on G and a function F : G → R. The annealedrecombination operator should be thought of as representing the combined effect ofsome number k of successive recombination events involving specified segregatingsets R1, . . . , Rk such that A = {R1, R

c1} ∧ · · · ∧ {Rk, Rck}. Note that if A and A′

are partitions, then RARA′ = RA∧A′ .Given some probability measure P on G, the relation (7.1) indicates that in

order for RkP to be approximately Poisson with intensity µP for large k it will, ingeneral, be necessary that a partition of M distributed according to Rk typicallyconsists of sets that all have small µP mass, so that a sample from RkP consists of amosaic of many small pieces each taken from genomes sampled independently fromthe population described by P . The following is a convenient way of measuring theextent to which a partition ofM is made up of sets that each have small mass withrespect to some reference measure.

51

52 7. PREPARATORY LEMMAS FOR THE MAIN CONVERGENCE RESULT

Notation 7.1. Suppose that A = (A1, . . . , Ak) is a partition of M, and λ ameasure on M. For r > 0 set

|A|(λ)r (t) :=

k∑i=1

λ(Ai)r.

Recall the definition of a shattering pair from Definition 5.1.

Lemma 7.2. If the pair (R, λ) is shattering with constant α, then for all k ≥ 1,∫|A|(λ)

2 dRk(A) ≤ α∗

k + 1,

whereα∗ = λ(M)2 ∨ 2αλ(M).

Proof. Define a sequence of random partitions A0,A1, . . . of M by settingA0 = {M} and Ak = {R1, R

c1} ∧ · · · ∧ {Rk, Rck} for k ≥ 1, where the partitions

{Rj , Rcj}, j ≥ 1, are independent and identically distributed with common distri-

bution R. Set Xk = |Ak|(λ)2 . Then,∫

|A|(λ)2 dRk(A) = E[Xk].

By the symmetry of R,

E[Xk+1

∣∣A1, . . . ,Ak]

=∫ ( ∑

A∈Ak

λ(A ∩R)2 + λ(A ∩Rc)2

)dR(R)

=∑A∈Ak

2∫λ(A ∩R)2 dR(R)

≤∑A∈Ak

λ(A)2(1− αλ(A))

= |Ak|(λ)2 − α|Ak|(λ)

3 .

It is always the case for any partition A of M that

|A|(λ)3 ≥

(|A|(λ)

1

)−1(|A|(λ)2

)2 = λ(M)−1(|A|(λ)

2

)2.

Thus, setting c = 2/αλ(M),

E[Xk+1

∣∣A1, . . . ,Ak]≤ Xk(1− cXk).

Applying Jensen’s inequality to the concave function x(1− cx), we see that

(7.3) E[Xk+1] ≤ E[Xk(1− cXk)

]≤ (E[Xk])

(1− cE[Xk]

).

We complete the proof by induction. We have X0 = λ(M)2. Suppose thatE[Xk] ≤ α∗/(k + 1). Since α∗ ≥ 1/c,

E[Xk+1] ≤ α∗

k + 1

(1− cα∗

k + 1

)≤ α∗

k + 1· k

k + 1

≤ α∗

k + 2.

�

7.2. ESTIMATES FOR RADON-NIKODYM DERIVATIVES 53

The following corollary is immediate.

Corollary 7.3. Suppose that the pairs (R, ρ0) and (R, ν) are both shatteringwith the same constant α. Let νt = tν + ρ0. Then, for all k ≥ 1,∫

|A|(νt)2 dRk(A) ≤ α∗

k + 1,

whereα∗ := (2νt(M)2) ∨ (4ανt(M)).

7.2. Estimates for Radon-Nikodym derivatives

Let X be the canonical random variable on G; that is, X is defined by X(g) = g.Given a probability measure Q on G, a Borel subset A ofM, and a measure g ∈ G,we may define a regular conditional distribution Q[·

∣∣XA = gA].Recall the definition of the annealed recombination operator RA from (7.2).

Lemma 7.4. For a finite partition A of M into Borel subsets and a probabilitymeasure Q on G, the Radon–Nikodym derivative of the probability measure RASnQwith respect to the probability measure RAQ is given by

dRASnQ

dRAQ= e−SQ,A/n,

where

SQ,A(g) :=

− n∑A∈A

(logQ

[e−S(X)/n

∣∣XA = gA

]− logQ

[e−S(X)/n

]).

Proof. Write A = {A1, . . . , AK}. By definition, for any measurable F : G →[0, 1],

(RASnQ)[F ] =∫· · ·∫F(g

(1)A1

+ · · ·+ g(K)AK

)dSnQ(g(1)) · · · dSnQ(g(K))

= Q[e−S(X)/n

]−K ∫· · ·∫F

(K∑i=1

g(i)Ai

)

× exp

{−

K∑i=1

S(g(i))/n

}dQ(g(1)) · · · dQ(g(K))

= Q[e−S(X)/n

]−K ∫· · ·∫F (g)

×

(K∏i=1

∫Ge−S(X)/ndQ(X

∣∣∣XAi = gAi)

)dRAQ(g)

= (RAQ)[e−SQ,A/nF ].

�

Lemma 7.5. For a probability measure Q on G, a partition A of M into Borelsets, and measures g, g′ ∈ G,∣∣∣SQ,A(g)− SQ,A(g′)

∣∣∣ ≤ σ(g(M) + g′(M)).


Proof. We may write

SQ,A(g) = −n∑A∈A

(logQ

[e−S(X)/n

∣∣XA = gA

]− logQ

[e−S(X)/n

∣∣XA = 0])

− n∑A∈A

(logQ

[e−S(X)/n

∣∣XA = 0]− logQ

[e−S(X)/n

]).

Consider the summand in the first sum corresponding to the set A in the partitionA. If gA = 0, then the summand is 0. If gA > 0, then the summand is boundedabove by

logQ[e−S(X)/n1{XA = gA}] + logQ[1{XA = 0}]

− logQ[e−S(X)/n1{XA = 0}]− logQ[1{XA = gA}]≤ logQ[e01{XA = gA}] + logQ[1{XA = 0}]

− logQ[e−σ/n1{XA = 0}]− logQ[1{XA = gA}]≤ σ/n,

and, by a similar argument, it is bounded below by −σ/n. Thus, the first sum isbounded in absolute value by σ#{A ∈ A : g(A) > 0} ≤ σg(M).

Note that SQ,A(g) − SQ,A(g′) is the difference of the first sum and the corre-sponding quantity with g replaced by g′; that is, the second sum and its counterpartfor g′ coincide and hence cancel each other in the difference. �

For some purposes, it will be useful to keep track of the temporal order in whichmutations appeared in genotype. We denote by G∗ the space of finite sequences ofgenotypes. That is,

(7.4) G∗ := G t G2 t · · · ,

where t denotes disjoint union. We think of an element of Gi as recording themutations from the ancestral wild type that appear in each of i consecutive gen-erations. There are natural “projection” operations Σ : G∗ → G defined forg = (gi, . . . , g2, g0) ∈ Gi+1 (it will be convenient for us later to index sequencesin reverse order) by

Σ(g) := g0 + g2 + · · ·+ gi

and

Ψj(g) =

{gj , 0 ≤ j ≤ i,0, j > i.

In essence, Σ removes the “labels” that record in which generation the variousmutations from the ancestral wild type occurred, while Ψj isolates mutations thatoccurred in generation j. Note that Σ =

∑∞j=0 Ψj .

A probability measure Q on G∗ may be thought of as a sequence of sub-probability measures (Q(i))∞i=0, where Q(i) is the portion of the Q concentratedon i+ 1-tuples. Thus,

∑∞i=0Q

(i)(Gi+1) = 1. We interpret Q as the distribution ofa random sequence (XI , . . . , X2, X0), with Q(i)(Gi+1) being the probability of theevent {I = i} that the sequence has length i + 1 and Q(i)(·)/Q(i)(Gi+1) being thejoint distribution of (Xi, . . . , X2, X0) conditional on this event.


Given a probability measure Q on G∗, we define, with a slight abuse of notation,probability measures ΣQ and ΨjQ on G by

(ΣQ)[F ] := Q[F ◦ Σ] and (ΨjQ)[F ] := Q[F ◦Ψj ]

for a Borel function F : G → R+.We now define analogues M∗n, S∗n, R∗, and P∗ of the mutation, selection,

recombination and complete Poissonization operators Mn, Sn, R, and P with Greplaced by G∗.

Notation 7.6. Consider a probability measure Q on G∗ thought of as thedistribution of a finite sequence (XI , . . . , X0) of random measures.

Define the probability measure M∗nQ on G∗ to be the distribution of the fi-nite sequence (XI+1, XI , . . . , X0), where XI+1 is a Poisson random measure withintensity measure ν/n that is independent of (XI , . . . , X0).

Similarly, define a new probability measure S∗nQ on G∗ by setting

(S∗nQ)(i)[F ] :=

∫Gi exp{−S(Σ(g))/n}F (g) dQ(i)(g)∫G∗ exp{−S(Σ(g))/n} dQ(g)

=Q[exp{−S(XI + · · ·+X0)/n}F (XI , . . . , X0)1{I = i}]

Q[exp{−S(XI + · · ·+X0)/n}]for a Borel function F : G∗ → R+.

The probability measure R∗Q is the distribution of

((X ′J)R + (X ′′J )Rc , . . . , (X ′0)R + (X ′′0 )Rc),

where the segregating set R is a pick from R, J is independent of R and distributedas I, and, conditional on J = i, (X ′J , . . . , X

′0) and (X ′′J , . . . , X

′′0 ) are independent

picks from the distribution of (XI , . . . , X0) under Q conditional on I = i.The probability measure P∗Q is the distribution of (YJ , . . . , Y0), where J is

distributed as I and conditional on J = i the random measures YJ , . . . , Y0 areindependent and Poisson, with the intensity measure associated with the conditionaldistribution of Yk being the same as the intensity measure associated with theconditional distribution of Xk given I = i for 0 ≤ k ≤ i.

Observe that we have the four intertwining relations

ΣM∗n = MnΣ, ΣS∗n = SnΣ, ΣR∗ = RΣ,ΣP∗ = PΣ.

These equalities confirm that “starred” operators agree with the “unstarred” onesonce the labeling that identified in which generation a mutation occurred is re-moved.

We will think of our initial condition P0 as a probability measure on G∗ that isconcentrated on sequences of length one.

Notation 7.7. Put

(7.5) Pk := (Mn)k P0.

Set

Q∗k := (R∗M∗nS∗n)k P0,

P ∗k := (M∗n)k P0.(7.6)


Note that both probability measures Q∗k and P ∗k are concentrated on sequencesof length (k+1). Note also that if (Xk, . . . , X0) is distributed according to P ∗k , thenX1, . . . , Xk are independent Poisson random measures, each with intensity measureν/n, and X0 is independent with distribution P0. As we expect, Qk = ΣQ∗k andPk = ΣP ∗k , where we recall that Qk = (RMnSn)kP0 and Pk := (Mn)kP0. IfP0 = Πρ0 (that is, P0 is the distribution of a Poisson random measure with inten-sity measure µP0 = ρ0), then P ∗k is the distribution of a (k+ 1)-tuple (Xk, . . . , X0)of independent Poisson random measures on M, where Xj has intensity measureµΨjP

∗k given by ρ0 when j = 0, and ν/n otherwise. In this case, Pk is the distri-

bution of a Poisson random measure with intensity knν + ρ0.

Lemma 7.8. For a positive integer k, and sequence of measures (gk, . . . , g0) ∈G∗,

dQ∗kdP ∗k

(gk, . . . , g0)

=∫

exp

{− 1n

k−1∑i=0

SQi,Ai(gi + · · ·+ g0)

}dR(A1) · · · dR(Ak),

where the partition Ai in the integral is {Ai+1, Aci+1} ∧ · · · ∧ {Ak, Ack}.

Proof. This follows by repeated application of Lemma 7.4, and the fact thatR∗ commutes with M∗n. �

Lemma 7.9. Suppose that P0 = Πρ0 . For P ∗k -almost every g ∈ G∗,

exp{−σkn

Σg(M)−(eσk/n − 1

)µPk(M)

}≤ dQ∗kdP ∗k

(g)

≤ exp{σk

nΣg(M) +

(1− e−σk/n

)µPk(M)

}.

Proof. Recall that µPk = ρ0 + knν. By Lemmas 7.5 and 7.8, for any g ∈ G∗,∣∣∣∣log

dQ∗k/dP∗k (g)

dQ∗k/dP∗k (0)

∣∣∣∣ ≤ k

nsupA

sup0≤i≤k

∣∣∣SQi,A(g0 + · · ·+ gi)∣∣∣

≤ σkΣg(M)n

.

Thus,

dQ∗kdP ∗k

(g) =dQ∗k/dP

∗k (g)

dQ∗k/dP∗k (0)∫

GdQ∗k/dP

∗k (h)

dQ∗k/dP∗k (0) dP

∗k (h)

≤ eσkΣg(M)/n∫G∗ e

−σkΣh(M)/n dP ∗k (h)

= exp{σk

nΣg(M)

}exp

{(1− e−σk/n

)µPk(M)

}.

The lower bound is similar. �


For the next result, we recall Campbell’s Theorem [DVJ88, (6.4.11)], whichsays that ∫

GF (g)g[f ] dΠπ(g) =

∫M

Ππ[F (·+ δx)]f(x) dπ(x)

for a finite measure π on M. In particular,∫Gg[h] exp{−cg(M)} dΠπ(g) = exp{−(1− e−c)π(M)}

∫M

exp{−c}h(x) dπ(x)

for any constant c ≥ 0.

Corollary 7.10. Suppose that P0 = Πρ0 . For 0 ≤ j ≤ k <∞ and µPk-almostevery x ∈M,

exp{−(eσk/n − e−σk/n

)µPk(M)− σk/n

}≤ dµΨjQ

∗k

dµΨjP ∗k(x)

≤ exp{(eσk/n − e−σk/n

)µPk(M) + σk/n

}.

Proof. By the first inequality of Lemma 7.9, we have for a Borel functionh :M→ R+ that

µΨjQ∗k[h] =

∫G∗

Ψjg[h] dQ∗k(g)

=∫G∗

Ψjg[h]dQ∗kdP ∗k

(g) dP ∗k (g)

≥∫G∗

Ψjg[h] exp{−σkn

Σg(M)−(eσk/n − 1

)µPk(M)

}dP ∗k (g)

= exp{−(eσk/n − 1

)µPk(M)

}×∫G∗

Ψjg[h] exp{−σkn

Ψjg(M)}

exp{−σkn

∑i 6=j

Ψig(M)}dP ∗k (g)

= exp{−(eσk/n − 1

)µPk(M)

}× exp

{−(1− e−σk/n

)µΨjP

∗k (M)

}∫M

exp{−σkn

}h(x) dµΨjP

∗k (x)

× exp{−(1− e−σk/n

)∑i 6=j

µΨjP∗k (M)

}= exp

{−(eσk/n − e−σk/n

)µPk(M)− σk/n

}µΨjP

∗k [h].

The first inequality of the result follows. The proof of the second inequality issimilar. �

The next result is an elementary consequence of Corollary 7.10.


Corollary 7.11. Suppose that P0 = Πρ0 . For 0 ≤ k < ∞ and µPk-almostevery x ∈M,

exp{−(eσk/n − e−σk/n

)µPk(M)− σk/n

}≤ dµQkdµPk

(x)

≤ exp{(eσk/n − e−σk/n

)µPk(M) + σk/n

}.

7.3. Palm measures

For various probability measures Q on G, we will need to understand somethingabout the probability measure that corresponds to conditioning a pick from Q toplace a point at the locus x ∈ M. When the intensity measure µQ is finite, theobject that makes rigorous sense of such conditioning is the corresponding Palmmeasure P(x)

Q . For a Borel set A ⊆ G, P(x)Q (A) is the Radon–Nikodym derivative

dµAQ

dµQ(x),

where µAQ is the finite measure on M defined by

µAQ[f ] :=∫A

g[f ] dQ(g)

(see, for example, [Kal84, (1.7)]). The Palm measure is defined for µQ-almost everyx ∈M. If Q is the distribution Ππ of a Poisson random random measure with finiteintensity measure π, then P(x)

Q [F ] = Q[F (·+δx)] for π-almost every x ∈M (see, forexample, [DVJ88, Example 12.1(b)]). It follows from the definition of the Palmmeasure that if F : G → R+ and f :M→ R+ are Borel functions, then∫

GF (g)g[f ] dQ(g) =

∫MP(x)Q [F ]f(x) dµQ(x).

Note that Campbell’s theorem follows from applying this observation when Q isthe distribution of a Poisson random measure.

Corollary 7.12. Suppose that P0 = Πρ0 . For every positive integer k, µQk-almost every x ∈ M (equivalently, µPk-almost every x ∈ M or (ρ0 + ν)-almostevery x ∈M), and every Borel function F : G → R+,

P(x)Qk

[F ] ≤ exp{

2σkn

+(eσk/n + 1− 2e−σk/n

)µPk(M)

}×∫GF (g + δx) exp

{σk

ng(M)

}dPk(g).

Proof. By Corollary 7.11,

(7.7)dµPkdµQk

(x) ≤ exp{(eσk/n − e−σk/n

)µPk(M) + σk/n

}for (ρ0 + ν)-almost every x.

Consider Borel functions F : G → R+ and f : M → R+. From the secondinequality of Lemma 7.9, the inequality (7.7), and the characterization of the Palm

7.4. COMPARISONS WITH COMPLETE POISSONIZATION 59

measure of a Poisson random measure we have∫MP(x)Qk

[F ]f(x) dµQk(x)

=∫GF (g)g[f ] dQk(g)

=∫GF (g)g[f ]

dQkdPk

(g) dPk(g)

=∫M

(∫GF (g)

dQkdPk

(g) dP(x)Pk

(g))f(x) dµPk(x)

≤∫M

(∫GF (g + δx) exp

{σk

n(g + δx)(M) +

(1− e−σk/n

)µPk(M)

}dPk(g)

)× f(x) dµPk(x)

= exp{σk

n+(1− e−σk/n

)µPk(M)

}×∫M


{σk

n(g)(M)

}dPk(g)

)f(x)

dµPkdµQk

(x) dµQk(x)

≤ exp{σk

n+(1− e−σk/n

)µPk(M)

}exp{(eσk/n − e−σk/n

)µPk(M) +

σk

n

}×∫M


{σk

n(g)(M)

}dPk(g)

)f(x) dµQk(x)

= exp{

2σkn


)µPk(M)

}×∫M


{σk

n(g)(M)

}dPk(g)

)f(x) dµQk(x),

and the result follows. �

7.4. Comparisons with complete Poissonization

Recall from Notation 6.1 that the complete Poissonization operator P acts ona probability measure P on G by PP := ΠµP . Also, recall from Notation 6.2that Q′k := (PSnMn)kP0 is the analogue of the sequence of probability measuresQk := (RSnMn)kP0 from Notation 4.4 that is of primary interest to us in Theo-rem 5.4, with the recombination operator R replaced by the complete Poissonizationoperator P. Finally, recall from Notation 7.7 that the counterpart of Qk for G∗(that is, for the setting in which we keep track of the generation in which mutationsfrom the ancestral wild type occurred) is Q∗k = (R∗S∗nM

∗)kP0.The next result is a consequence of Corollary 7.10 and the observation that

if π′ and π′′ are two finite measures on M and π′ is absolutely continuous withrespect to π′′, then

dΠπ′

dΠπ′′(g) = exp

{g

[log(dπ′

dπ′′

)]− π′(M) + π′′(M)

}.


Corollary 7.13. Suppose that P0 = Πρ0 . For P ∗k -almost every g ∈ G∗,

exp{(−(eσk/n − e−σk/n

)µPk(M)− σk/n

)Σg(M)− µQk(M) + µPk(M)

}≤ dP∗Q∗k

dP ∗k(g)

≤ exp{((

eσk/n − e−σk/n)µPk(M) + σk/n


}.

Lemma 7.14. Suppose that Q = Ππ for some finite measure π on M. Theprobability measure PSnQ on G is absolutely continuous with respect to Q withRadon-Nikodym derivative

(7.8)dPSnQ

dQ(g) = e−S

′Q(g)/n,

where

(7.9) S′Q(g) := g[sQ]− n∫G h(M)e−S(h)/n dQ(h)∫G e−S(h)/n dQ(h)

+ n

∫Gh(M) dQ(h),

with

sQ(x) : = −n log∫Ge−S(h)/n dP(x)

Q (h) + n log∫Ge−S(h)/n dQ(h)

= −n log∫Ge−S(h+δx)/n dQ(h) + n log

∫Ge−S(h)/n dQ(h)

= −n log(dµ(PSnQ)

dπ(x)).

(7.10)

Proof. Consider (7.10). The first equality is simply the characterization ofthe Palm distribution for Poisson random measures that we have already noted.For the second equality, take a Borel function φ :M→ R+, and define Φ : G → R+

by Φ(g) = g[f ]. Then,

µPSn[φ] = µSnQ[φ]

= SnQ[Φ]

=Q[e−S/nΦ]Q[e−S/n]

=∫ P(x)

Q [e−S/n]

Q[e−S/n]φ(x) dπ(x)

=∫e−sQ(x)/nφ(x) dπ(x)

by Campbell’s Theorem, and the equality follows.Next consider (7.8). By (7.10),

dPSQ

dQ(g) =

e−g[sQ/n]∫e−h[sQ/n] dQ(h)

.

Now, ∫e−h[sQ/n] dQ(h) = exp

{−∫ (

e−sQ(x)/n − 1)dπ(x)

},


and, by another application of Campbell’s Theorem,∫ (e−sQ(x)/n − 1

)dπ(x) =

(∫ P(x)Q [e−S/n]

Q[e−S/n]dπ(x)

)e−π(M)

=∫e−S(h)/nh(M)dQ(h)∫e−S(h)/n dQ(h)

e−∫h(M) dQ(h).

Putting these equations together proves (7.8). �

The analogue of Q∗k with the complete Poissonization operator P∗ replacingthe recombination operator R∗ (equivalently, the counterpart of Q′k for G∗) is

Q′∗k := (P∗S∗nM

∗n)kP0.

As expected, we have the intertwining relation ΣQ′∗k = Q′k.

Recall from Definition 2.3 that if π is a finite measure on M and x ∈M, thenFπ(x) := E[S(Xπ + δx)− S(Xπ)], where, as usual, Xπ is Poisson random measureon M with intensity measure π (that is, Xπ has distribution Ππ).

Proof. Observe that∣∣sQ(x)− Fπ(x)∣∣ ≤ n ∣∣∣∣log

∫Ge−S(h+δx)/n dQ(h)−

∫GS(h+ δx)/n dQ(h)

∣∣∣∣+ n

∣∣∣∣log∫Ge−S(h)/n dQ(h)−

∫GS(h)/n dQ(h)

∣∣∣∣ ,and the result follows from Lemma A.1. �

Recall from Lemma 7.4 that dRASnQ/dRAQ = exp{−SQ,A/n}, where Qis any probability measure on G, A is a partition of M into a finite number ofBorel sets, RA is the annealed recombination operator of (7.2), SQ,A is defined inLemma 7.4. In particular, SQ,A depends implicitly on n.

Corollary 7.15. Consider a finite Borel partition A of M. Suppose that Qa probability measure on G under which the masses of sets in the partition A areindependent. Let Q be defined to have dQ/dQ = exp{−SQ,A/n}. Then,

dPQ

dPQ= e−S′

Q/n.

Proof. By assumption, Q = RAQ. By Lemma 7.4 we may write

Q = RASnQ

Thus,

dPQ

dPQ=dPSnQ

dPQ= e−S′

Q/n

by Lemma 7.14. �

Lemma 7.16. There is a constant κ depending on σ := supg∈G S(g) but not onn such that for a probability measure Q on G, a finite Borel partition A of M, and


a point g ∈ G,∣∣∣SQ,A(g)− S′Q(g)∣∣∣

≤ κ(

1ng(M) +

1nQ[X(M)]

+∑A∈A

Q{X(A) > 0

}2 +∑A∈A

Q[X(A)1{X(A)≥2}]

+∫MQ{X(A(x)) > 0} dg(x) +

∫MP(x)Q

{X(A(x)) ≥ 2

}dg(x)

+ g

{x ∈M : Q{X(A(x)) > 0} > 1

2

}+ #{A ∈ A : g(A) ≥ 2}

),

where A(x) ⊆M is the unique set in the partition A that contains the point x ∈M.

Proof. We begin by establishing a bound on the difference between SQ,A(g)and the quantity ∑

A∈A

(Q[S∣∣XA = gA

]−Q

[S∣∣XA = 0

])− ΓQ,

whereΓQ := Q[S(X)X(M)]−Q[S(X)]−Q[X(M)].

By definition (see Lemma 7.4),

SQ,A(g)

= −n∑A∈A

(logQ

[e−S(X)/n

∣∣XA = gA

]− logQ

[e−S(X)/n

∣∣X(A) = 0])

− n∑A∈A

(logQ

[e−S(X)/n

∣∣X(A) = 0]− logQ

[e−S(X)/n

]).

In the first sum, a summand can only be nonzero when g(A) > 0. By Lemma A.1,the first term is is thus approximately∑

A∈A : g(A)>0

(Q[S(X)

∣∣XA = gA]−Q

[S(X)

∣∣X(A) = 0]),

with an error bounded by

2n12σ2

n2eσ/ng(M) =

σ2

neσ/ng(M).

The second term requires more care, because the sum extends over all (thepotentially very large number of) sets of the partition A. Fix one of these sets A.Then,

logQ[e−S(X)/n

∣∣X(A) = 0]− logQ

[e−S(X)/n

]= log

Q[e−S(X)/n1{X(A)=0}

]Q[e−S(X)/n

]Q {X(A) = 0}

.(7.11)

Note for y ≥ 1 that

(7.12) 0 ≤ (y − 1)− log y =∫ y

1

1− 1zdz ≤

(1− 1

y

)(y − 1) =

(y − 1)2

y,


and a similar argument shows that the same bound holds for 0 < y ≤ 1. That is, wehave the approximation log(y) ≈ y−1, y > 0, with error bounds y−1−(y−1)2/y ≤log(y) ≤ y − 1.

Applying this approximation to the right-hand side of (7.11), the first-orderterm is

Q[e−S(X)/n1{X(A)=0}

]Q[e−S(X)/n

]Q {X(A) = 0}

− 1 =CovQ[e−S(X)/n,1{X(A)=0}]Q[e−S(X)/n

]Q {X(A) = 0}

=CovQ[1− e−S(X)/n,1{X(A)>0}]Q[e−S(X)/n

]Q {X(A) = 0}

≈ 1n

CovQ[S(X),1{X(A)>0}].

(7.13)

The difference between the second and third quantities in (7.13) is

Q[(

1− e−S(X)/n − S(X)/n−Q[1− e−S(X)/n − S(X)/n

])1{X(A)>0}

]Q[e−S(X)/n

]Q {X(A) = 0}

+Q[(S(X)/n−Q [S(X)/n]) 1{X(A)>0}

]×(

1Q[e−S(X)/n]Q{XA = 0}

− 1),

(7.14)

From the inequalities −y2/2 ≤ 1 − e−y − y ≤ 0, y ≥ 0, the absolute value of(7.14) is bounded by

(7.15) eσ/nσ2

2n2

Q{X(A) > 0}Q{X(A) = 0}

+ eσ/nσ

nQ{X(A) > 0}

(σ

n+Q{X(A) > 0

}Q{X(A) = 0}

).

This bound is only useful when Q{X(A) > 0} is small, so we further note that theabsolute value of (7.14) is also bounded by∣∣∣∣∣CovQ[1− e−S(X)/n,1{X(A)=0}]

Q[e−S(X)/n

]Q {X(A) = 0}

∣∣∣∣∣+1n

∣∣CovQ[S(X),1{X(A)>0}]∣∣

≤ eσ/nσn

+σ

n.

(7.16)

The contribution coming from the bound (y − 1)2/y on the error in the firstorder approximation log(y) ≈ y − 1 is bounded by(

CovQ[e−S(X)/n,1{X(A)=0}]Q[e−S(X)/n]Q{X(A) = 0}

)2Q[e−S(X)/n

]Q {X(A) = 0}

Q[e−S(X)/n1{X(A)=0}

]=

CovQ[1− e−S(X)/n,1{X(A)>0}]2

Q[e−S(X)/n]Q{X(A) = 0}Q[e−S(X)/n1{X(A)=0}

]≤ e2σ/nσ

2

n2

Q{X(A) > 0}2

Q{X(A) = 0}2.

(7.17)

Once again, this bound is only useful when Q{X(A) > 0} is small, so we furthernote that the left-hand side of (7.17) is also bounded by

(7.18) e2σ/nσ2

n2.


Putting these bounds together, we see that

− n(

logQ[e−S(X)/n

∣∣X(A) = 0]− logQ

[e−S(X)/n

])≈ CovQ[S,1{X(A)>0}],

(7.19)

with error bounded by

eσ/nσ2

nQ{X(A) > 0}

+ eσ/nσQ{X(A) > 0}(σn

+ 2Q{X(A) > 0

})+ 4e2σ/nσ

2

nQ{X(A) > 0}2,

(7.20)

when Q{X(A) = 0} ≥ 1/2, and by

(7.21) eσ/nσ + σ + e2σ/nσ2

n,

otherwise.Note that

∣∣∣∣∣ΓQ −∑A∈A

CovQ[S(X),1{X(A)>0}]

∣∣∣∣∣ =

∣∣∣∣∣∑A∈A

CovQ[S(X), X(A)− 1{X(A)>0}]

∣∣∣∣∣≤ σ

∑A∈A

Q[X(A)1{X(A)≥2}]

(7.22)

Summing (7.19), (7.20), and (7.21) over A ∈ A and combining the result with(7.22), we see that

(7.23) SQ,A(g) ≈∑A∈A

(Q[S∣∣XA = gA

]−Q

[S∣∣XA = 0

])− ΓQ

with an error bounded by

eσ/nσ2

ng(M) + 2eσ/n

σ2

nQ[X(M)]

+(

2eσ/nσ + 4e2σ/nσ2

n+ 4

(eσ/nσ + σ + e2σ/nσ

2

n

)) ∑A∈A

Q{X(A) > 0

}2

+ σ∑A∈A

Q[X(A)1{X(A)≥2}],

(7.24)

where we used the inequality

1{Q{X(A) = 0} ≤ 1

2

}≤ 4Q{X(A) > 0}2.

We now bound the difference between S′Q(g) and the approximation to SQ,A(g)in (7.23). We stress that here we are not assuming that Q is the distribution of aPoisson random measure, and so, in the definition of S′Q(g) from (7.9), namely

S′Q(g) = g[sQ]− nQ[X(M)e−S(X)/n]Q[e−S(X)/n]

+ nQ[X(M)],


the definition of the function sQ is the one appearing in the first line of (7.10),namely

sQ(x) = −n logP(x)Q [e−S(X)/n] + n logQ[e−S(X)/n],

rather than the definitions on the other lines of (7.10) that are only equivalent inthe Poisson case. The absolute value of the difference is at most

∑A∈A:g(A)=1

∣∣gA[sQ]−(Q[S∣∣XA = gA

]−Q

[S∣∣XA = 0

])∣∣+

∑A∈A:g(A)>1

∣∣gA(sQ)−(Q[S∣∣XA = gA

]−Q

[S∣∣XA = 0

])∣∣+∣∣∣∣−nQ[X(M)e−S(X)/n]

Q[e−S(X)/n]+ nQ[X(M)] + ΓQ

∣∣∣∣ .(7.25)

We will consider each of the terms in (7.25) in turn.Note first of all that, by Lemma A.1,

∣∣∣n logQ[e−S(X)/n] +Q[S(X)

∣∣XA = 0]∣∣∣

≤∣∣∣∣Q[S(X)]−

Q[S(X)1{X(A)=0}]Q{X(A) = 0}

∣∣∣∣+ eσ/nσ2

2n

≤ σQ{X(A) > 0}Q{X(A) = 0}

+ eσ/nσ2

2n

≤ 2σQ{X(A) > 0}+ eσ/nσ2

2n

(7.26)

if Q{X(A) > 0} ≤ 12 , and otherwise the left-hand side is bounded by σ.

Note next, that, by definition of the Palm measure, for any Borel subset B ⊆ Gwe have

∣∣∣P(x)Q (B)−Q

(B∣∣XA = δx

)∣∣∣ ≤ P(x)Q {X(A) ≥ 2}

for µQ-almost every x ∈M. Combining this observation with Lemma A.1 gives

∣∣∣n logP(x)Q [e−S(X)/n] +Q

[S(X)

∣∣XA = δx]∣∣∣

≤ eσ/n σ2

2n+ σP(x)

Q

{X(A) ≥ 2

}.

(7.27)


Integrating the inequalities (7.26) and (7.27) against the measure gA and sum-ming over A yields that the first sum in (7.25) is at most∑

A∈A:g(A)=1

∫M

∣∣sQ(x)−(Q[S∣∣XA = δx

]−Q

[S∣∣XA = 0

])∣∣ dgA(x)

≤∑

A∈A:g(A)=1

∫A

2σQ{X(A) > 0}+ eσ/nσ2

2n+ σ1

{Q{X(A) > 0} > 1

2

}

+ eσ/nσ2

2n+ σP(x)

Q

{X(A) ≥ 2

}dgA(x)

≤ eσ/nσ2

ng(M)

+∫M

2σQ{X(A(x)) > 0}+ σP(x)Q

{X(A(x)) ≥ 2

}dg(x)

+ σg

{x ∈M : Q{X(A(x)) > 0} > 1

2

}.

(7.28)

Turning to the second sum in (7.25), we have

(7.29)∣∣∣n logP(x)

Q [e−S(X)/n] +Q[S(X)

∣∣XA = gA]∣∣∣ ≤ σ.

Combining this with (7.26), integrating with respect to gA, and summing over Ayields that the second sum in (7.25) is at most

eσ/nσ2

2ng(M) +

∫M

2σQ{X(A(x)) > 0} dg(x)

+ σg

{x ∈M : Q{X(A(x)) > 0} > 1

2

}+ σ#{A ∈ A : g(A) ≥ 2}.

(7.30)

The third term in (7.25) is

∣∣∣∣−nQ[X(M)e−S(X)/n]Q[e−S(X)/n]

+ nQ[X(M)] +Q[X(M)S(X)]−Q[X(M)]Q[S(X)]∣∣∣∣

≤∣∣∣∣−n(Q[X(M)e−S(X)/n]−Q[X(M)]Q[e−S(X)/n]

)( 1Q[e−S(X)/n]

− 1)∣∣∣∣

+∣∣∣−n(Q[X(M)e−S(X)/n]−Q[X(M)]Q[e−S(X)/n]

)− (Q[X(M)S(X)]−Q[X(M)]Q[S(X)])

∣∣∣=∣∣∣∣nQ [(X(M)−Q[X(M)])(1− e−S(X)/n)

] 1−Q[e−S(X)/n]Q[e−S(X)/n]

∣∣∣∣+∣∣∣nQ [(X(M)−Q[X(M)])(1− e−S(X)/n − S(x)/n)

]∣∣∣≤ Q[X(M)]

(eσ/n

σ2

n+

12σ2

n

).

(7.31)

It follows from substituting the bounds (7.28), (7.30), and (7.31) into (7.25)that the difference between S′Q(g) and the approximation to SQ,A(g) in (7.23) is


bounded in absolute value by

eσ/n3σ2

2ng(M)

+∫M

4σQ{X(A(x)) > 0}+ σP(x)Q

{X(A(x)) ≥ 2

}dg(x)

+ 2σg{x ∈M : Q{X(A(x)) > 0} > 1

2

}+ σ#{A ∈ A : g(A) ≥ 2}

+(eσ/n

σ2

n+

12σ2

n

)Q[X(M)].

(7.32)

The result follows upon combining this last bound with the bound (7.24) on theabsolute value of the difference between SQ,A(g) and the approximation to SQ,A(g)in (7.23). �

CHAPTER 8

Proof of convergence of the discrete-generationsystems

With Proposition 6.3 in hand, it will suffice to show for a number of generationsk with 0 ≤ k ≤ Tn thatQk = (RMnSn)kP0, the probability measure that describesa population after k rounds of selection, mutation, and recombination, is close toQ′k = (PMnSn)kP0, the probability measure that describes a population after krounds of selection, mutation, and complete Poissonization.

By Lemma 2.7,

∥∥Qk −Q′k∥∥Was≤∥∥Qk −PQk

∥∥Was

+∥∥PQk −Q′k∥∥Was

≤∥∥Qk −PQk

∥∥Was

+ 4∥∥µQk − µQ′k∥∥Was

,(8.1)

where we recall that µQk and µQ′k are the intensity measures of Qk and Q′k.Set

ak :=∥∥Qk −PQk

∥∥Was

and

bk :=∥∥µQk − µQ′k∥∥Was

.

Given a Borel function f : M→ [−1, 1], define the function F : G → R by F (g) :=g[f ]. Note that

Qk+1[F ] = µQk+1[f ] = µ(RMnSn)Qk[f ] =ν[f ]n

+ SnQk[f ],

where we use the facts that, for any probability measure Q on G, µRQ = µQ andµMnQ = µQ+ ν/n. Thus, from the inequalities −y2/2 ≤ 1− e−y − y ≤ 0, y ≥ 0,

bk+1 ≤ sup{∣∣SnQk[f ]−SnQ

′k[f ]

∣∣ : ‖f‖Lip ≤ 1}

≤ eσ/n∣∣∣Qk [fe−S/n]−Q′k [fe−S/n]∣∣∣

≤ eσ/n(|Qk[f ]−Q′k[f ]|+ 1

n|Qk [S · f ]−Q′k [S · f ]|+ 2σ2

n2

)≤ eσ/n

(bk +

2σnak +

2σ2

n2

).

(8.2)

69

70 8. PROOF OF CONVERGENCE OF THE DISCRETE-GENERATION SYSTEMS

We now bound ak. It follows from the relations ΣQ∗k = Qk, ΣP∗ = PΣ, andΣP ∗k = Pk that∥∥Qk −PQk

∥∥Was

= sup{|Qk[φ]−PQk[φ]| : ‖φ‖Lip ≤ 1}≤ sup{|Qk[φ]−PQk[φ]| : ‖φ‖∞ ≤ 1}= sup{|Q∗k[φ ◦ Σ]−P∗Q∗k[φ ◦ Σ]| : ‖φ‖∞ ≤ 1}≤ sup{|Q∗k[φ∗]−P∗Q∗k[φ∗]| : ‖φ∗‖∞ ≤ 1}

=∫ ∣∣∣∣dQ∗kdP ∗k

(g)− dP∗Q∗kdP ∗k

(g)∣∣∣∣ dP ∗k (g)

So, from the inequality log y ≤ y − 1,∥∥Qk −PQk∥∥

Was≤∫ ∣∣∣∣log

dQ∗kdP ∗k

(g)− logdP∗Q∗kdP ∗k

(g)∣∣∣∣

× dQ∗kdP ∗k

(g) ∨ dP∗Q∗k

dP ∗k(g) dP ∗k (g)

By Lemma 7.8,dQ∗kdP ∗k

(gk, . . . , g0)

=∫

exp

{− 1n

k−1∑i=0

SQi,Ai(gi + · · ·+ g0)

}dR(A1) · · · dR(Ak),

where the partition Ai in the integral is {Ai+1, Aci+1} ∧ · · · ∧ {Ak, Ack}, and, by

Lemma 7.9,

dQ∗kdP ∗k

(g) ≤ exp{σk

nΣg(M) +

(1− e−σk/n

)µPk(M)

}.

By repeated application of Corollary 7.15,

dP∗Q∗kdP ∗k

(gk, . . . , g0) = exp

{− 1n

k−1∑i=0

S′Qi(g0 + · · ·+ gi)

},

and, by Corollary 7.13,dP∗Q∗kdP ∗k

(g)

≤ exp{((



}.

Observe that{σk

nΣg(M) +

(1− e−σk/n

)µPk(M)

}∨{((



}≤((eσk/n − e−σk/n

)µPk(M) + σk/n

)Σg(M) + µPk(M)

≤((eσk/n − e−σk/n

)(ρ0(M) + ν(M)k/n

)+ σk/n

)Σg(M)

+(ρ0(M) + ν(M)k/n

)≤ cΣg(M) + c,

8. PROOF OF CONVERGENCE OF THE DISCRETE-GENERATION SYSTEMS 71

where c and c are constants that depend on σ, T , ρ0(M), and ν(M). Therefore,∥∥Qk −PQk∥∥

Was

≤∫ ∣∣∣∣∣ 1n

k−1∑i=0

S′Qi(g0 + · · ·+ gi)−1n

k−1∑i=0

SQi,Ai(g0 + · · ·+ gi)

∣∣∣∣∣× exp{cΣg(M) + c} dP ∗k (g) dR(A1) · · · dR(Ak)

≤ 1n

k−1∑i=0

∫ ∣∣∣∣∣S′Qi(g)− SQi,A(g)

∣∣∣∣∣ exp{cg(M) + c} dPi(g) dRk−i(A),

(8.3)

where c is a constant that depends on σ, T , ρ0(M), and ν(M) such that∫exp{c(gk−i+1 + · · ·+ gk)(M) + c} dΠ⊗k−iν/n (gk, . . . , gk−i+1) ≤ exp{c},

and we recall from Section 7.1 that Rk−i is the probability measure on parti-tions of M that is the push-forward of the product measure R⊗(k−i) by the map(Ai+1, . . . , Ak) 7→ {Ai+1, A

ci+1} ∧ · · · ∧ {Ak, Ack}.

Recall from Chapter 7.3 that P(x)Q denotes the Palm measure at x ∈ M of a

probability measure Q on G. Recall also for a finite measure π onM and boundedfunctions f :M→ R+ and h :M→ R+ that∫

exp{g[f ]} dΠπ(g) = exp{π[ef − 1]}∫g[h] exp{g[f ]} dΠπ(g) = π[hef ] exp{π[ef − 1]}

(8.4)

From Lemma 7.16,∫ ∣∣∣S′Qi(g)− SQi,A(g)∣∣∣ exp {cg(M) + c} dPi(g)

≤∫κ

(1ng(M) +

1nQi[X(M)]

+∑A∈A

Qi{X(A) > 0

}2 +∑A∈A

Qi[X(A)1{X(A)≥2}]

+∫MQi{X(A(x)) > 0} dg(x) +

∫MP(x)Qi

{X(A(x)) ≥ 2

}dg(x)

+ g

{x ∈M : Qi{X(A(x)) > 0} > 1

2

}+ #{A ∈ A : g(A) ≥ 2}

),

× exp {cg(M) + c} dPi(g),

(8.5)

for any Borel partition A of M, where A(x) is the set in A that contains x.We next break (8.5) up into a sum of integrals and, ignoring multiplicative con-

stants that depend on σ, T , ρ0(M) and ν(M), estimate each integral individually.We will apply (8.4) repeatedly without explicit mention. We will write c1, c2, . . . forconstants that depend on σ, T , ρ0(M) and ν(M), and sometimes use expectationrather than integral notation.

To begin,1nPi [X(M) exp {cX(M)}] ≤ c1

n.


From Lemma 7.9,

1nQi[X(M)]Pi [exp {cX(M)}]

≤ 1nPi

[exp

{σi

nX(M) +

(1− e−σi/n

)µPi(M)

}]Pi [exp {cX(M)}]

≤ c2n.

Again from Lemma 7.9,

Qi{X(A) > 0

}≤ Qi[X(A)]

≤ Pi[X(A) exp

{σi

nX(M) +

(1− e−σi/n

)µPi(M)

}]≤ c3µPi(A),

and so ∑A∈A

Qi{X(A) > 0

}2Pi [exp {cX(M)}] ≤ c4

∑A∈A

(µPi(A))2.

Similarly,

Qi[X(A)1{X(A)≥2}]

≤ Pi[X(A)1{X(A)≥2} exp

{σi

nX(M) +

(1− e−σi/n

)µPi(M)

}]= Pi

[X(A)(1− 1{X(A)≤1} exp

{σi

nX(M) +

(1− e−σi/n

)µPi(M)

}]= exp

{(1− e−σi/n

)µPi(M)

}(Pi

[X(A) exp

{σi

nX(M)

}]− Pi

[exp

{σi

nX(M\A)

}]Pi{X(A) = 1}

)≤ c5

(µPi(A) exp

{σi

n

}exp

{µPi(M)

(σi

n− 1)}

− exp{µPi(M\A)

(σi

n− 1)}

µPi(A) exp{−µPi(A)})

≤ c6(µPi(A))2,

and so

∑A∈A

Qi[X(A)1{X(A)≥2}]Pi [exp {cX(M)}] ≤ c7∑A∈A

(µPi(A))2.

8. PROOF OF CONVERGENCE OF THE DISCRETE-GENERATION SYSTEMS 73

Also,

Pi

[∫MQi{X(A(x)) > 0} dX(x) exp {cX(M)}

]= Pi

[∑A∈A

∫MQi{X(A) > 0}1{x∈A} dX(x) exp {cX(M)}

]

= Pi

[∑A∈A

Qi{X(A) > 0}X(A) exp {cX(M)}

]≤ c8

∑A∈A

(µPi(A))2.

by arguments along the lines of those above.By Corollary 7.12,

Pi

[∫MP(x)Qi

{X(A(x)) ≥ 2

}dX(x) exp {cX(M)}

]≤ exp

{2σkn


)µPi(M)

}× Pi

[∫M

(∫1{(g+δx)(A(x))≥2} exp

{σk

ng(M)

}dPi(g)

)dX(x)

× exp {cX(M)}]

≤ c9∑A∈A

Pi

[∫M

(∫1{g(A)≥1} exp

{σk

ng(M)

}dPi(g)

)1{x∈A} dX(x)

× exp {cX(M)}]

= c9∑A∈A

Pi

[1{X(A)≥1} exp

{σk

nX(M)

}]Pi [X(A) exp {cX(M)}]

≤ c10

∑A∈A

(µPi(A))2,

via arguments we have used before.Observe that

g

{x ∈M : Qi{X(A(x)) > 0} > 1

2

}≤ 2

∫MQi{X(A(x)) > 0} dg(x),

and so

Pi

[X

{x ∈M : Qi{X(A(x)) > 0} > 1

2

}exp {cX(M)}

]≤ c11

∑A∈A

(µPi(A))2,

from above.


Lastly, ∑A∈A

Pi[1{X(A)≥2} exp {cX(M)}

]=∑A∈A

Pi[1{X(A)≥2} exp {cX(A)}

]Pi [exp {cX(M\A)}]

≤ c12

∑A∈A

(µPi(A))2.

Substituting these estimates into (8.5) and recalling (8.3) gives∥∥Qk −PQk∥∥

Was

≤ 1n

k−1∑i=0

(c

n+ c

∫ ∑A∈A

(µPi(A))2 dRk−i(A)

),

where c and c are constants that depend on σ, T , ρ0(M) and ν(M).It follows from Corollary 7.3 that

ak =∥∥Qk −PQk

∥∥Was≤ β′

n

k−1∑i=0

1k − i+ 1

≤ β′′ log nn

,

where β′ and β′′ are constants that depend on σ, T , ρ0(M), ν(M), and the con-stants that appear in the shattering condition for the measures ρ0 and ν withrespect to the recombination operation R.

It is clear that ak tends to 0 as n→∞. Moreover, from (8.2),

bk+1 ≤ eσ/n(bk +

2σnβ′′

log nn

+2σ2

n2

)≤ eσ/nbk + γ

log nn2

for a suitable constant γ that does not depend on n, so that bk also tends to 0 asn→∞, completing the proof.

CHAPTER 9

Convergence to Poisson of iterated recombination

Our main result, Theorem 5.4, states that when n is large the discrete gen-eration mutation-selection process, when started in a probability measure on thegenotype space G that is the distribution of Poisson random measure on the lo-cus space M, stays close to the set of such probability measures out to numbersof generations that are of order n. In particular, this conclusion holds when thediscrete generation starts with the entire population being completely wild type atevery locus, so that the initial condition is the distribution of the Poisson randommeasure with zero intensity.

In this section we provide a complement to this observation by showing thatthe recombination process acting alone rapidly shuffles the distribution of a non-Poisson random measure onM to produce a probability measure on G that is closeto the distribution of a Poisson random measure on M.

Definition 9.1. A distribution P on G is dispersive if there is a constant βsuch that for any Borel A ⊂M,∫

g(A)1{g(A)≥2}dP (g) ≤ βµP (A)2.

Of course, the distribution of a Poisson random measure is always dispersive.

Proposition 9.2. For all n and k, µRkP = µP . If the recombination measureis shattering with respect to µP , and P is dispersive, then RkP converges to PP ,as k →∞, with

(9.1) ‖RkP −PP‖Was ≤ (3β + 2)(ν(M)2 ∨ 2αν(M)

)(k + 1)−1.

Proof. We note first that

µRk+1P (B) =∫ ∫

G

∫G

(g1

∣∣R

+g2

∣∣Rc

)(B) dµRkP (g1) dµRkP (g2) dR(R) = µn(B).

Let A = (A1, . . . , AN ) be a partition of M. Then, RAP defines a randomgenotype that has independent components on each Ai, with corresponding distri-butions P

∣∣Ai

. The distance between two probabilities on genotypes is the sum ofthe distances between the restrictions to a partition of M. Thus,

∥∥RkAP −ΠµP

∥∥Was

=N∑i=1

∥∥RkAP∣∣Ai−ΠµP

∣∣Ai

∥∥Was

≤N∑i=1

∥∥P ∣∣Ai−ΠµP

∣∣Ai

∥∥Was

.

(9.2)

75

76 9. CONVERGENCE TO POISSON OF ITERATED RECOMBINATION

Pick any Borel subset A of X . Define µA to be the measure on A defined for Borelsubsets B ⊆ A by

µA(B) =∫Gg(B)1{g(A)=1}dP (g).

Clearly µA ≤ µP . Also define the sub-probability measure PA on genotypes re-stricted to A by

PA[f ] =∫Gf(g∣∣A

)1{g(A)≤1}dP (g).

Then, µA is the first-moment measure of PA. Note also that

(9.3) µP (A) = µA(A) +∫Gg(A)1{g(A)≥2}.

We have∥∥P ∣∣A−ΠµP

∣∣A

∥∥Was

≤∥∥P ∣∣

A−PA

∥∥Was

+∥∥PA −ΠµA

∥∥Was

+∥∥ΠµA −ΠµP

∣∣A

∥∥Was

.(9.4)

We bound first

(9.5)∥∥P ∣∣

A− PA

∥∥Was≤ P

{g : g(A) ≥ 2

}≤ βµP (A)2.

Next, for any f : G → R with ‖f‖Lip ≤ 1,∣∣PAf −ΠµAf∣∣ ≤ ∣∣P{g : g(A) = 0

}−ΠµA

{g : g(A) = 0

}∣∣+∣∣PAf1{g:g(A)=1} −ΠµAf1{g:g(A)=1}

∣∣ + ΠµA

{g : g(A) ≥ 2

}≤∣∣∣1− µA(A) + P

{g : g(A) ≥ 2

}− e−µA(A)

∣∣∣+∣∣∣µA(f)

(1− e−µA(A)

)∣∣∣ +µA(A)2

2

≤ P{g : g(A) ≥ 2

}+µP (A)2

2+ µP (A)2 +

µP (A)2

2.

Thus,

(9.6)∥∥PA −ΠµA

∥∥Was≤(β + 2

)µP (A)2.

Finally, by Lemma 2.7,∥∥ΠµA −ΠµP

∣∣A

∥∥Was≤∥∥µA−µP ∣∣A∥∥Was

+∣∣µA(A)−µP (A)

∣∣+ 12(µA(A)2 +µP (A)2

)For any f : A → R with ‖f‖Lip ≤ 1, since µA is the first-moment measure of PA,we have by Campbell’s Theorem [DVJ88, (6.4.11)],∣∣µAf − µPf ∣∣ =

∣∣∣∣∫Gg[f ]dPA(g)−

∫Gg[f ]dP (g)

∣∣∣∣=∫Gg[f ]1{g(A)≥2}dP (g).

Thus,

(9.7)∥∥ΠµA −ΠµP

∣∣A

∥∥Was≤ βµP (A)2.

Putting (9.5), (9.6) and (9.7) into (9.4), we get

(9.8)∥∥P ∣∣

A−ΠµP

∣∣A

∥∥Was≤ (3β + 2)µP (A)2

9. CONVERGENCE TO POISSON OF ITERATED RECOMBINATION 77

By (9.2), then,

(9.9)∥∥RkP −ΠµP

∥∥Was≤ (3β + 2)E|π|2,

where the expectation is taken with respect to partitions π chosen from the distri-bution Rk. Applying Lemma 7.2 completes the proof of (9.1). �

Note that Proposition 9.2 is essentially a point-process version of Le Cam’sPoisson convergence result of [LC60].

APPENDIX A

An expectation approximation

The following lemma, which gives error bounds for this approximation, wasused several times in this work.

Lemma A.1. Let T be a nonnegative random variable with finite second mo-ment. Then,

−E[T ] +12

Var(T )eE[T ] ≥ log E[e−T

]≥ −E[T ].

and so0 ≤ log E

[e−T

]≥ +E[T ] ≤ 1

2Var(T )eE[T ].

In particular, if T is bounded by a constant τ , then

0 ≤ log E[e−T

]≥ +E[T ] ≤ 1

2τ2eτ .

Proof. The second inequality holds directly from Jensen’s inequality, by theconvexity of x 7→ e−x.

The function x 7→ e−x− x2

2 is concave for x ≥ 0, so that by Jensen’s inequality,

E[e−T − T 2

2

]≤ e−E[T ] − E[T ]2

2.

Consequently,

E[e−T

]≤ e−E[T ] +

12

Var(T ) = e−E[T ]

(1 +

12

Var[T ]eE[T ]

).

Taking logarithms of both sides, and using the bound log(1 +x) ≤ x completes theproof. �

79

Bibliography

[AGS05] Luigi Ambrosio, Nicola Gigli, and Giuseppe Savare, Gradient flows in metric spaces and

in the space of probability measures, Lectures in Mathematics ETH Zurich, BirkhauserVerlag, Basel, 2005.

[Baa01] Ellen Baake, Mutation and recombination with tight linkage, J. Math. Biol. 42 (2001),

no. 5, 455–488.[Baa05] Michael Baake, Recombination semigroups on measure spaces, Monatsh. Math. 146

(2005), no. 4, 267–278.

[Baa07] , Addendum to: “Recombination semigroups on measure spaces” [Monatsh.Math. 146 (2005), no. 4, 267–278; mr2191729], Monatsh. Math. 150 (2007), no. 1,

83–84.[Bau05] Annette Baudisch, Hamilton’s indicators of the force of selection, Proceedings of the

National Academy of Sciences 102 (2005), no. 23, 8263–8.

[Bau08] , Inevitable aging? : contributions to evolutionary-demographic theory,Springer, Berlin, 2008.

[BB03] Michael Baake and Ellen Baake, An exactly solved model for mutation, recombination

and selection, Canad. J. Math. 55 (2003), no. 1, 3–41.[BT91] N. H. Barton and M. Turelli, Natural and sexual selection on many loci, Genetics 127

(1991), 229–55.

[Bur98] Reinhard Burger, Mathematical properties of mutation selection models, Genetica102/103 (1998), 279—298.

[Bur00] Reinhard Burger, The mathematical theory of selection, recombination, and mutation,

John Wiley & Sons, Chichester, New York, 2000.[BW01] Ellen Baake and Holger Wagner, Mutation—selection models solved exactly with meth-

ods of statistical mechanics, Genet. Res., Camb. 78 (2001), 93—117.[CE09] Aubrey Clayton and Steven N. Evans, Mutation-selection balance with recombination:

convergence to equilibrium for polynomial selection costs, SIAM J. Appl. Math. 69

(2009), no. 6, 1772–1792.[Cha01] Brian Charlesworth, Patterns of age-specific means and genetic variances of mortality

rates predicted by the mutation-accumulation theory of ageing, J. Theor. Bio. 210

(2001), no. 1, 47–65.[CKP95] Brian A. Coomes, Huseyin Kocak, and Kenneth J. Palmer, A shadowing theorem for

ordinary differential equations, Z. Angew. Math. Phys. 46 (1995), no. 1, 85–106.

[Daw99] Kevin J. Dawson, The dynamics of infinitesimally rare alleles, applied to the evolutionof mutation rates and the expression of deleterious mutations, Theor. Popul. Biol. 55(1999), 1–22.

[DS88] Nelson Dunford and Jacob T. Schwartz, Linear operators. Part I, Wiley Classics Li-brary, John Wiley & Sons Inc., New York, 1988.

[DU77] J. Diestel and J. J. Uhl, Jr., Vector measures, American Mathematical Society, Provi-dence, R.I., 1977, With a foreword by B. J. Pettis, Mathematical Surveys, No. 15.

[DVJ88] D. J. Daley and D. Vere-Jones, An introduction to the theory of point processes, SpringerSeries in Statistics, Springer-Verlag, New York, 1988.

[EK86] Stewart Ethier and Thomas Kurtz, Markov processes: Characterization and conver-gence, John Wiley & Sons, 1986.

[Hor03] Shiro Horiuchi, Interspecies differences in the lifespan distribution: Humans ver-sus invertebrates, Life Span: Evolutionary, Ecological, and demographic perspectives(James R. Carey and Shripad Tuljapurkar, eds.), Population and Development Review,

vol. 29 (Supplement), The Population Council, New York, 2003, pp. 127–151.

81

82 BIBLIOGRAPHY

[Kal84] Olav Kallenberg, An informal guide to the theory of conditioning in point processes,

International Statistical Review 52 (1984), no. 2, 151–64.

[Ken74] D. G. Kendall, Foundations of a theory of random sets, Stochastic geometry (a tributeto the memory of Rollo Davidson), Wiley, London, 1974, pp. 322–376.

[KJB02] Mark Kirkpatrick, Toby Johnson, and Nick Barton, General models of multilocus evo-

lution, Genetics 161 (2002), 1727–50.[KM66] Motoo Kimura and Takeo Maruyama, The mutational load with epistatic gene inter-

actions in fitness, Genetics 54 (1966), 1337–1351.

[Kon82] A.S. Kondrashov, Selection against harmful mutations in large sexual and asexual pop-ulations, Genet. Res. 40 (1982), 325–332.

[LC60] Lucien Le Cam, An approximation theorem for the Poisson binomial distribution, Pa-

cific Journal of Mathematics 10 (1960), 1181–1197.[PC98] Scott D. Pletcher and James W. Curtsinger, Mortality plateaus and the evolution of

senescence: Why are old-age mortality rates so low?, Evolution 52 (1998), no. 2, 454–64.

[Rac91] Svetlozar T. Rachev, Probability metrics and the stability of stochastic models, Wiley

Series in Probability and Mathematical Statistics: Applied Probability and Statistics,John Wiley & Sons Ltd., Chichester, 1991. MR MR1105086 (93b:60012)

[RR98] Svetlozar T. Rachev and Ludger Ruschendorf, Mass transportation problems. Vol. I,

Probability and its Applications (New York), Springer-Verlag, New York, 1998, Theory.[SEW05] David Steinsaltz, Steven N. Evans, and Kenneth W. Wachter, A generalized model of

mutation-selection balance with applications to aging, Adv. Appl. Math. 35 (2005),

no. 1, 16–33.[SH92] Stanley A. Sawyer and Daniel L. Hartl, Population genetics of polymorphism and di-

vergence, Genetics 132 (1992), 1161–1176.

[SY02] George R. Sell and Yuncheng You, Dynamics of evolutionary equations, Applied Math-ematical Sciences, vol. 143, Springer Verlag, 2002.

[Vil03] Cedric Villani, Topics in optimal transportation, Graduate Studies in Mathematics,vol. 58, American Mathematical Society, Providence, RI, 2003.

[Vil09] , Optimal transport, Grundlehren der Mathematischen Wissenschaften [Funda-

mental Principles of Mathematical Sciences], vol. 338, Springer-Verlag, Berlin, 2009,Old and new.

[Wac03] Kenneth W. Wachter, Hazard curves and life span prospects, Life Span: Evolutionary,

Ecological, and demographic perspectives (James R. Carey and Shripad Tuljapurkar,eds.), Population and Development Review, vol. 29 (Supplement), The Population

Council, New York, 2003, pp. 270–291.

[WBG98] Holger Wagner, Ellen Baake, and Thomas Gerisch, Ising quantum chain and sequenceevolution, J. Statist. Phys. 92 (1998), no. 5-6, 1017–1052.

[WES08] Kenneth W. Wachter, Steven N. Evans, and David R. Steinsaltz, The age-specific forcesof natural selection and walls of death, U.C. Berkeley Department of Statistics Technical

Report No. 757, http://arxiv.org/abs/0807.0483, 2008.

[WSE08] Kenneth W. Wachter, David R. Steinsaltz, and Steven N. Evans, Vital rates from theaction of mutation accumulation, U.C. Berkeley Department of Statistics Technical

Report No. 762, http://arxiv.org/abs/0808.3622, 2008.

steven n. evans david steinsaltz kenneth w. wachter...chapter 4. mutation-selection-recombination in...

Documents