applied functional analysis lecture notes spring, 2010 · 1 introduction to functional analysis 1.1...

Applied Functional AnalysisLecture NotesSpring, 2010

Dr. H. T. BanksCenter for Research in Scientific Computation

Department of MathematicsN. C. State University

February 23, 2010

1 Introduction to Functional Analysis

1.1 Goals of this Course

In these lectures, we shall present functional analysis for partial differentialequations (PDE’s) or distributed parameter systems (DPS) as the basis ofmodern PDE techniques. This is in contrast to classical PDE techniquessuch as separation of variables, Fourier transforms, Sturm-Louiville prob-lems, etc. It is also somewhat different from the emphasis in usual functionalanalysis courses where one learns functional analysis as a subdiscipline inits own right. Here we treat functional analysis as a tool to be used inunderstanding and treating distributed parameter systems. We shall alsomotivate our discussions with numerous application examples from biology,electromagnetics and materials/mechanics.

1.2 Uses of Functional Analysis for PDEs

As we shall see, functional analysis techniques can often provide powerfultools for insight into a number of areas including:

∙ Modeling

∙ Qualitative analysis

∙ Inverse problems

∙ Control

∙ Engineering analysis

∙ Computation (such as finite element and spectral methods)

1.3 Example 1: Heat Equation

∂y

∂t(t, �) =

∂

∂�(D(�)

∂y

∂�(t, �)) + f(t, �) 0 < � < l, t > 0 (1)

where y(t, �) denotes the temperature in the rod at time t and position �and f(t, �) is the input from a source, e.g. heat lamp, laser, etc.B.C.

� = 0 : y(t, 0) = 0 Dirichlet B.C. This indicates temperatureis held constant.

� = l : D(l)∂y∂� (t, l) = 0 Neumann B.C. This indicates an insulated

1

end and results in heat flux being zero.

Here j(t, �) = D(�)∂y∂� (t, �) is called the heat flux.

I.C.

y(0, �) = Φ(�) 0 < � < l

Here Φ denotes the initial temperature distribution in the rod.

1.4 Some Preliminary Operator Theory

Let X,Y be normed linear spaces. Then

ℒ(X,Y ) = {T : X → Y ∣bounded, linear}

is also a normed linear space, and

∣T ∣ℒ(X,Y ) = sup∣x∣X=1

∣Tx∣Y .

Recall that linear spaces are closed under addition and scalar multiplication.A normed linear space X is complete if for any Cauchy sequence {xn} thereexists x ∈ X such that limn→∞ xn = x. Note: ℒ(X,Y ) is complete if Y iscomplete.

If X is a Hilbert or Banach space (complex), then T is a bounded linearoperator from X1 → X2, i.e., T ∈ ℒ(X1, X2) ⇐⇒ T is continuous.

Differentiation in Normed Linear Spaces [HP]

Suppose f : X → Y is a (nonlinear) transformation (mapping).

Definition 1 If lim�→0+f(x0+�z)−f(x0)

� exists, we say f has a directionalderivative at x0 in direction z. This is denoted �f(x0; z), and is called theGateaux differential at x0 in direction z.

If the limit exists for any direction z, we say f is Gateaux differentiable atx0 and z → �f(x0; z) is Gateaux derivative. Note that z → �f(x0; z) is notnecessarily linear, however it is homogeneous of degree one (i.e., �f(x0;�z) =��f(x0; z) for scalars �). Moreover, it need not be continuous in z.

The following definition of o(∣z∣) will be useful in defining the Frechetderivative:

2

Definition 2 g(z) is o(∣z∣) if ∣g(z)∣∣z∣ → 0 as z → 0.

Definition 3 If there exists df(x0; ⋅) ∈ ℒ(X,Y ) such that ∣f(x0 + z) −f(x0) − df(x0; z)∣Y = o(∣z∣X), then df(x0; ⋅) is the Frechet derivative of f

at x0. Equivalently, lim�→0f(x0+�z)−f(x0)

� = df(x0; z) for every z ∈ X withz → df(x0; z) ∈ ℒ(X,Y ). We write df(x0; z) = f ′(x0)z.

Results

1. If f has a Frechet derivative, it is unique.

2. If f has a Frechet derivative at x0, then it also has a Gateaux derivativeat x0 and they are the same.

3. If f : Dopen ⊂ X → Y has a Frechet derivative at x0, then f iscontinuous at x0.

Examples

1. X = ℝn, Y = ℝm, and f : X → Y such that each component of f has

partials at x0 in the form ∂f i

∂xj(x0). Then

f ′(x0) =

(∂f i

∂xj(x0)

)and hence

df(x0; z) =

(∂f i

∂xj(x0)

)z.

2. Now we consider the case where X = ℝn and Y = ℝ1. Then

df(x0; z) = ▽f(x0) ⋅ z.

3. We also look at the case where X = Y = ℝ1. Here,

df(x0; z) = f ′(x0)z.

Normally z = 1 because that is the only direction in ℝ1. Then f ′(x0)is called the derivative, but in actuality z → f ′(x0)z is the derivative.Note that f ′(x0) ∈ ℒ(ℝ1,ℝ1), but the elements of ℒ(ℝ1,ℝ1) are justnumbers.

3

Homework Exercises

∙ Ex. 1 : Consider f : ℝ2 → ℝ1.

f(x1, x2) =

{x1x22x21+x42

, if x ∕= 0

0, if x = 0

Show f is Gateaux differentiable at x = (x1, x2) = 0, but not contin-uous at x = 0, (and hence cannot be Frechet differentiable).

1.5 Transforming the Initial Boundary Value Problem

If we can transform the initial boundary value problem (IBVP) for equation(1) into something of the form{

x(t) = Ax(t) + F (t)x(0) = x0

(2)

then conceptually it might be an easier problem with which to work. Thatis, formally it looks like an ordinary differential equation problem for whicheAt is a solution operator.

To rigorously make this transformation and develop a corresponding con-ceptual framework, we need to undertake several tasks:

1. Find a space X of functions and an operator A : D(A) ⊂ X → X suchthat the IBVP can be written in the form of equation (2). We mayfind X = L2(0, l) or C(0, l).

We want a solution x(t) = y(t, ⋅) to (2), but in what sense - mild,weak, strong, classical? This will answer questions about regularity(i.e., smoothness) of solutions.

2. We also want solution operators or “semigroups” T (t) which play therole of eAt. But what does “eAt” mean in this case? If A ∈ Rn × Rnis a constant matrix, then

eAt ≡∞∑n=0

(At)n

n!= I +At+

A2t2

2!+ . . .

where the series has nice convergent properties. In this case, by the“variation of constants” or “variation of parameters” representation,we can then write

x(t) = eAtx0 +

∫ t

0eA(t−s)F (s)ds.

4

We want a similar variation of constants representation in X withoperators T (t) ∈ ℒ(X) = ℒ(X,X) such that T (t) ∼ eAt, so that

x(t) = T (t)x0 +

∫ t

0T (t− s)F (s)ds.

holds and represents the PDE solutions in some appropriate sense andcan be used for qualitative (stability, asymptotic behavior, control)and quantitative (approximation and numerics) analyses.

5

2 Semigroups and Infinitesimal Generators

2.1 Basic Principles of Semigroups [HP, Pa, Sh, T]

Definition 4 A semigroup is a one parameter set of operators {T (t) :T (t) ∈ ℒ(X)}, where X is a Banach or Hilbert space, such that T (t) satisfies

1. T (t+ s) = T (t)T (s) (semigroup or Markov or translation property)

2. T (0) = I. (identity property)

Classification of semigroups by continuity

∙ T (t) is uniformly continuous if limt→0+

∣T (t)− I∣ = 0.

This is not of interest to us, because T (t) is uniformly continuous ifand only if T (t) = eAt where A is a bounded linear operator.

∙ T (t) is strongly continuous, denoted C0, if for each x ∈ X, t → T (t)xis continuous on [0, �) for some positive �.

Note 1: All continuity statements are in terms of continuity from theright at zero. For fixed t

T (t+ ℎ)− T (t) = T (t)[T (ℎ)− T (0)]

= T (t)[T (ℎ)− I]

andT (t)− T (t− �) = T (t− �)[T (�)− I]

so that continuity from the right at zero is equivalent to continuity at any tfor operators that are uniformly bounded on compact intervals.

Note 2: T (t) uniformly continuous implies T (t) strongly continuous(∣T (t)x− x∣ ≤ ∣T (t)− I∣ ∣x∣), but not conversely.

2.2 Return to Example 1 : Heat Equation

To begin the process of writing the system of Example 1 in the form ofequation (2), take X = L2(0, l) and define

D(A) = {' ∈ H2(0, l)∣'(0) = 0, '′(l) = 0}

6

to be the domain of A. (Assume D is smooth for now, e.g., D is at leastH1.) Then we can define A : D(A) ⊂ X → X by

A['](�) = (D(�)'′(�))′. (3)

Notation: In general, W k,p(a, b) = {' ∈ Lp∣'′, '′′, . . . , '(k) ∈ Lp}.If p = 2, we write W k,2(a, b) = Hk(a, b). AC(a, b) = {' ∈ L2(a, b)∣' isabsolutely continuous (AC) on [a, b]}.

Homework Exercises

∙ Ex. 2 : Show that A : D(A) ⊂ X → X of the heat example (Example1) is not bounded X → X.

∙ Ex. 3 : Give an example when X is a Hilbert space, but ℒ(X) is nota Hilbert space.

∙ Ex. 4 : Consider H1(a, b) = W 1,2(a, b),W 1,1(a, b) and AC(a, b). Findthe relationships for each pair of spaces in terms of ⊂. That is, estab-lish X ⊂ Y or X ⊊ Y , etc.

2.3 Example 2 : General Transport Equation

Problems in which the transport equation is used

∙ Insect dispersion

∙ Growth and decline of population-a special case-see Example 4 below

∙ Flow problems (convective/diffusive transport)

Specific applications as described in [BKa, BK] include the “cat brainproblem” (which involves in vivo labeled transport) and population dispersalproblems.

Description of various fluxes

The quantity y(t, �) is the population or amount of a substance at location� in the habitat.

∙ j1(t, �) is the flux due to dispersion (random foraging or moments;random molecular collisions).

j1(t, �) = D(�)∂y

∂�(t, �)

7

∙ j2(t, �) is the advective/convective/directed bulk movement.

j2(t, �) = �(�)y(t, �)

∙ j3(t, �) is the net loss due to “birth/death”; immigration/emigration,label decay.

j3(t, �) = �(�)y(t, �)

General transport equation for a 1-dimensional habitat

∂y

∂t(t, �) +

∂

∂�(�(�)y(t, �)) =

∂

∂�(D(�)

∂y

∂�(t, �))− �(�)y(t, �) (4)

0 < � < l, t > 0

B.C.

� = 0 : y(t, 0) = 0 (essential)

� = l :[D(�)∂y∂� (t, �)− �(�)y(t, �)

]�=l

= 0 (natural)

I.C. y(0, �) = Φ(�)

To write this example in operator form (2), we first let X = L2(0, l)be the usual complex Hilbert space of square integrable functions, and thedomain of A be defined by D(A) = {' ∈ H2(0, l)∣'(0) = 0, [D'′ − �'] (l) =0} where

A'(�) = (D(�)'′(�))′ − (�(�)'(�))′ − �'(�). (5)

Note: We will assume D and � are smooth for now.

2.4 Example 3 : Delay Systems–Insect/Insecticide Models

Delay systems have been of interest for the past 70 or more years, arisingin applications ranging from aerospace engineering to biology (biochemicalpathways, etc.), population models, ecology, HIV and other disease pro-gression models to viscoelastic and smart hysteretic materials, as well asnetwork models [BBH, BBJ, BBPP, BKurW1, BKurW2, BRS, Hutch, KP,Ma, Vis, Warga, Wright]. We describe here one arising in insect/insecrticideinvestigations [BBJ].

Mathematical models that are suitable for field data with mixed pop-ulations should consider reproductive effects and should also account for

8

multiple generations, containing neonates (juveniles) and adults and theirinterconnectedness. This suggests the need at the minimum for a coupledsystem of equations describing two separate age classes. Additionally, dueto individual differences within the insect population, it is biologically un-realistic to assume that all neonate aphids born on the same day reach theadult age class at the same time. In fact, the age at which the insects reachadulthood varies from as few as five to as many as seven days. Hence onemust include a term in any model to account for this variability, leading oneto develop a coupled delay differential equation model for the insect pop-ulation dynamics. We consider the delay between birth and adulthood forneonate pea aphids and present a first mathematical model that treats thisdelay as a random variable. For a careful derivation of models with similarstructure in HIV progression at the cellular level, see [BBH].

Let a(t) and n(t) denote the number of adults and neonates, respectively,in the population at time t. We lump the mortality due to insecticide intoone parameter pa for the adults, pn for the neonates, and denote by da and dnthe background or natural mortalities for adults and neonates, respectively.We let b be the rate at which neonates are born into the population.

We suppose that there is a time delay for maturation of a neonate toadult life stage. We further assume that this time delay varies across the in-sect population according to a probability distribution P (�) for � ∈ [−Tn, 0]

with corresponding density m(�) = dP (�)d� . Here we tacitly assume an up-

per bound on Tn for the maturation period of neonates into adults. Thus,we have that m(�), � < 0, is the probability per unit time that a neonatewho has been in the population −� time units becomes an adult. Thenthe rate at which such neonates become adults is n(t + �)m(�). Summingover all such � ’s, we obtain that the rate at which neonates become adultsis∫ 0−Tn n(t + �)m(�)d� . Using the biological knowledge that the matu-

ration process varies between five and seven days (i.e., m vanishes out-side [−7,−5]), we obtain the functional differential equation (FDE) (see[JKH1, JKH2, JKH3] for the widespread interest and use of such systems)system

dadt (t) =

∫ −5

−7n(t+ �)m(�)d� − (da + pa) a(t)

dndt (t) = ba(t)− (dn + pn)n(t)−

∫ −5

−7n(t+ �)m(�)d�

a(�) = Φ(�), n(�) = Ψ(�), � ∈ [−7, 0)a(0) = a0, n(0) = n0,

(6)

where m is now a probability density kernel which we have assumed has the

9

property m(�) ≥ 0 for � ∈ [−7,−5] and m(�) = 0 for � ∈ (−∞,−7)∪(−5, 0].The system of functional differential equations described in (6) can be

written in terms of semigroups [BBu1, BBu2, BKa].Let

x(t) = (a(t), n(t))T

andxt(�) = x(t+ �),−7 ≤ � ≤ 0. (7)

We define the Hilbert space Z ≡ ℝ2 × L2(−7, 0;ℝ2) with inner product

∣(�, ')∣Z =

(∣�∣2 +

∫ 0

−7∣'(�)∣2d�

)1/2

, (�, ') ∈ Z,

and let z(t) = (x(t), xt) ∈ Z. Then our system (6) can be written as

dxdt (t) = L (x(t), xt) for 0 ≤ t ≤ T,(x(0), x0) = (Φ(0),Φ) ∈ Z, Φ ∈ C(−7, 0;ℝ2),

(8)

where T <∞ and for � = ( 0, �0)T ∈ ℝ2 and ' = ( , �)T ∈ C(−7, 0;ℝ2)

L(�, ') =

[−da − pa 0

b −dn − pn

]� +

[0 10 −1

] ∫ −5

−7'(�)m(�)d�. (9)

We now define a linear operator A : D(A) ⊂ Z → Z with domain

D(A) ={

(�, ') ∈ Z ∣' ∈ H1(−7, 0;ℝ2) and � = '(0)}

(10)

byA(�, ') = (L(�, '), ') . (11)

Then the delay system (6) can be formulated as

z(t) = Az(t)z(0) = z0,

(12)

where z0 =((a0, n0)T , (Φ,Ψ)T

).

10

2.5 Example 4 : Probability Measure Dependent Systems -Maxwell’s Equations

We consider Maxwell’s equations in a complex, heterogenerous material (see[BBL] for details):

▽× E =−∂B∂t

▽×H =∂D

∂t+ J

▽ ⋅D = �

▽ ⋅B = 0

where E is the electric field (force), H is the magnetic field (force), D is theelectric flux density (also called the electric displacement), B is the magneticflux density, and � is the density of charges in the medium.

To complete this system, we need constituitive (material dependent)relations:

D = �0E + P

B = �0H +M

J = Jc + Js

where P is electric polarization, M is magnetization, Jc is the conductioncurrent density, Js is the source current density, �0 is the dielectric permit-tivity, and �0 is the magnetic permeability.

General Polarization:

P (t, x) = [g ∗ E](t, x) =

∫ t

0g(t− s, x)E(s, x)ds

Here, g is the polarization susceptibility kernel, or dielectric response func-tion (DRF).

Several examples of polarization are widely used:

∙ Debye Model for PolarizationThis describes reasonably well a polar material. This is also calleddipolar or orientational polarization. The DRF is defined by

g(t− s, x, �) = e−(t−s)�

(�0(�s − �∞)

�

)11

and corresponds to

P +1

�P = �0(�s − �∞)E.

It is important to note that P represents macroscopic polarization,and when we refer to microscopic polarization we will instead use p.

∙ Lorentz PolarizationThis is also called electronic polarization or the electronic cloud model.The DRF is given by

g(t− s, x, �) =�0!

2p

�0e−(t−s)

2� sin(�0(t− s)),

and corresponds to

P +1

�P + !2

0P = �0!2pE

where !p = !0√�s − �∞ and �0 =

√!2

0 − 14�2

.

Note: When we allow for instantaneous polarization, we find that D =�0�rE + P where �r = 1 + � ≥ 1 is a relative permittivity.

For complex composite materials, the standard Debye or Lorentz polar-ization model is not adequate, e.g., we need multiple relaxation times � ’sin some kind of distribution [BBo, BG1, BG2]. Then the multiple Debye

model becomes

P (t, x;F ) =

∫Tp(t, x; �)dF (�)

where T is a set of possible relaxation parameters � and

F ∈ ℱ(T ) = {F : T → ℝ1∣F is a probability distribution T }.

Note: Here the microscopic polarization is given by

p(t, x; �) =

∫ t

0g(t− s, x; �)E(s, x)ds

where g(t− s, x; �) = �0(�s−�∞)� e

−(t−s)� which corresponds to

p+1

�p = �0(�s − �∞)E.

12

Here, E is the total electric field. Thus,

P (t, x;F ) =

∫T

∫ t

0g(t− s, x; �)E(s, x)dsdF (�)

=

∫ t

0

[∫Tg(t− s, x; �)dF (�)

]E(s, x)ds

=

∫ t

0G(t− s, x;F )E(s, x)ds

Assuming M = 0 (nonmagnetic materials), we find this system becomes

▽× E = − ∂

∂t(�0H)

▽×H =∂

∂t

[�0�rE +

∫ t

0G(t− s, x;F )E(s, x)ds

]+ J

▽ ⋅D = 0 (13)

▽ ⋅H = 0

where F ∈ ℱ(T ) and J = Jc + Js. Note: Jc is usually also a convolutionon E although Ohm’s law uses Jc = �E where � is the conductivity of thematerial. In general, one should treat Jc as a convolution, e.g.,

Jc = �c ∗ E =

∫ t

0�c(t− s, x)E(s, x)ds.

For the measure dependent system (13), existence and uniqueness via asemigroup formulation have not been established. Comparison of solutionsvia semigroups versus weak solution have not been done. Nothing has yetbeen done in two or three dimensions. For the one-dimensional case only,existence, uniqueness, and continuous dependence have been established viaa weak formulation (see [BG1]). For continuous dependence of solutions onF , a metric is needed on ℱ(T ). More generally, we may need to treat othermaterial parameters q = (�, �s, �∞, �), where q ∈ Q ⊂ ℝ4 and we look forF ∈ ℱ(Q).

∙ Special Case

We can consider a physically meaningful special case of the Maxwellsystem in a dielectric material which has a general polarization convolutionrelationship. Detailed derivations given in Section 2.3 of [BBL] lead to a

13

one-dimensional version we next present and will use for an example in oursubsequent discussions. Among the assumptions are some homogeneity inthe medium (in planes parallel to that of an interrogating polarized planarwave) and a polarized sheet antenna source Js. The resulting model leadsto only nontrivial E fields in the x direction, and H fields in the y direction,each depending only on t and z. Assuming Ohm’s law for Jc, we find thesystem reduces to

∂E

∂z= −�0

∂H

∂t∂H

∂z=

∂D

∂t+ �E + Js (14)

D = �E + P

or in second order form for E(t, z):

�0�∂2E

∂t2+ �0

∂2P

∂t2+ �0�

∂E

∂t− ∂2E

∂z2= −�0

∂Js∂t

or

�r∂2E

∂t2+

1

�0

∂2P

∂t2+�

�0

∂E

∂t− c2∂

2E

∂z2=−1

�0

∂Js∂t

,

where c2 = 1�0�0

and �r is a relative permittivity that is material and geom-etry dependent. Typical boundary conditions (say on Ω = [0, 1]) are[

1

c

∂E

∂t− ∂E

∂z

]z=0

= 0 (absorbing B.C. at z = 0)

and E∣z=1 = 0 (supraconductive B.C. at z = 1).

If we assume a general polarization relationship

P (t, z) =

∫ t

0g(t− s, z)E(s, z)ds

and initial conditions

E(0, z) = Φ(z),∂E(0, z)

∂t= Ψ(z),

then it can be argued that the system becomes

∂2E

∂t2+

∂E

∂t+ �E +

∫ 0

−∞g(s)E(t+ s)ds− c2∂

2E

∂z2= J (t),

14

(without loss of generality we may take �r = 1 for theoretical conditions)where we have tacitly assumed E(t, z) = 0 for t < 0. If we approximatethe “memory” term by assuming only a finite past history is significant, weobtain the integro-partial differential equation

∂2E

∂t2+

∂E

∂t+ �E +

∫ 0

−rg(s)E(t+ s)ds− c2∂

2E

∂z2= J (t). (15)

As with the first two examples, this example can also be written asdxdt = Ax + F in an appropriate function space setting. In this directionwe define an operator A in appropriate state spaces. In this example onemight choose the electric field E and the magnetic field (H in (14)) as“natural” state spaces along with some type of hysteresis state to accountfor the memory in (15). However, in second order (in time) systems, it isalso sometimes natural to choose the state E and velocity ∂E

∂t as states.We do that in this example. We first define an auxiliary variable w(t) inW ≡ L2

g(−r, 0;L2(Ω)) by

w(t)(�) = E(t)− E(t+ �), −r ≤ � ≤ 0.

For the inner product in W , we choose the weighted inner product

⟨�, w⟩W ≡∫ 0

−rg(�)⟨�(�), w(�)⟩L2(Ω)d�

and then (15) can be written as

∂2E

∂t2+

∂E

∂t+ (� + g11)E −

∫ 0

−rg(s)w(t)(s)ds− c2∂

2E

∂z2= J (t),

where g11 ≡∫ 0−r g(s)ds and w(t)(s) = E(t) − E(t + s), −r ≤ s ≤ 0. For a

semigroup formulation we consider the state space

Z = V ×H ×W = H1R(Ω)× L2(Ω)× L2

g(−r, 0;H)

(here H1R(Ω) = {� ∈ H1∣�(1) = 0}) with states

(�, , �) = (E(t),∂E(t)

∂t, w(t)) = (E(t),

∂E(t)

∂t, E(t)− E(t+ ⋅)).

To define what we will later see is an infinitesimal generator of a C0

semigroup, we first define several component operators. Let A ∈ ℒ(V, V ★)be defined by

A� ≡ c2�′′ − (� + g11)�+ c2�′(0)�0

15

where �0 is the Dirac operator �0 = (0). Here V ★ is the dual space ofV (e.g. the space of continuous linear functionals on V , to be discussed indetail later). We also define B ∈ ℒ(V, V ★) and K ∈ ℒ(W,H) by

B� = − �− c�(0)�0

and, for � ∈W ,

K�(z) =

∫ 0

−rg(s)�(s, z)ds, z ∈ Ω.

Moreover, we introduce the operator C : D(C) ⊂W →W defined on

D(C) = {� ∈ H1(−r, 0;L2(Ω)) ∣ �(0) = 0}

by

C�(�) =∂

∂��(�).

We then define the operator A on

D(A) = {(�, , �) ∈ Z∣ ∈ V, � ∈ D(C), A�+B ∈ H}

by

A =

⎛⎝ 0 I 0

A B K0 I C

⎞⎠ .

That is,AΦ = ( , A�+B + K�, + C�)

for Φ = (�, , �) ∈ D(A).

2.6 Example 5: Structured Population Models

We consider the special case of “transport” models or Example 2 withD = 0 but with so-called renewal or recruitment boundary conditions.The “spatial” variable � is actually “size” in place of spatial location (see[BT]) and such models have been effectively used to model marine popula-tions such as mosquitofish [BBKW, BF, BFPZ] and, more recently, shrimp[GRD-FP, BDEHAD, GRD-FP2]. Such models have also been the basisof labeled cell proliferation models in which � represents label intensity[BSTBRSM]. The early versions of these size structured population mod-els were first proposed by Sinko and Streifer [SS] in 1967. Cell populationversions were proposed by Bell and Anderson [BA] almost simultaneously.

16

Other versions of these models called “age structured models”, where agecan be “physiological age” as well as chronical age, are discussed in [MetzD].

The Sinko-Streifer model (SS) for size-structured mosquitofish popula-tions is given by

∂v

∂t+

∂

∂�(gv) = −�v, �0 < � < �1, t > 0 (16)

v(0, �) = Φ(�)

g(t, �0)v(t, �0) =

∫ �1

�0

K(t, s)v(t, s)ds

g(t, �1) = 0.

Here v(t, �) represents number density (given in numbers per unit length)or population density, where t represents time and � represents the lengthof the mosquitofish. The growth rate of individual mosquitofish is assumedgiven by g(t, �), where

d�

dt= g(t, �) (17)

for each individual (all mosquitofish of a given size have the same growthrate).

In the SS �(t, �) represents the mortality rate of mosquitofish, and thefunction Φ(�) represents initial size density of the population, while K repre-sents the fecundity kernel. The boundary condition at � = �0 is recruitment,or birth rate, while the boundary condition at � = �1 = �max ensures themaximum size of the mosquitofish is �1. The SS model cannot be used asformulated above to model the mosquitofish population because it does notpredict dispersion or bifurcation of the population in time under biologi-cally reasonable assumptions [BBKW, BF, BFPZ]. As we shall see, we willsubsequently replace the growth rate g by a family G of growth rates andreconsider the model with a probability distribution P on this family. Thepopulation density is then given by summing “cohorts” of subpopulationswhere individuals belong to the same subpopulation if they have the samegrowth rate [BD, BDTR, GRD-FP, BDEHAD, GRD-FP2]. Thus, in theso-called Growth Rate Distribution (GRD) model, the population densityu(t, �;P ), first discussed in [BBKW] and developed in [BF], is actually givenby

u(t, �;P ) =

∫Gv(t, �; g)dP (g), (18)

17

where G is a collection of admissible growth rates, P is a probability mea-sure on G, and v(t, �; g) is the solution of the (SS) with growth rate g. Thismodel assumes the population is made up of collections of subpopulationswith individuals in the same subpopulation having the same size dependentgrowth rate. This example can also be formulated in terms of semigroups[BKa, BKW1, BKW2], but the details are somewhat more difficult thanthose for the first three examples. In some cases it is advantageous to use aweak formulation (to be developed below) instead of a semigroup formula-tion.

2.7 Infinitesimal Generators

Definition 5 Assume {T (t) ∈ ℒ(X); 0 ≤ t < ∞} is a C0 semigroup in Xor on X, then the infinitesimal generator A of the semigroup T (t) is definedby

D(A) = {x ∈ X : limt→0+

T (t)x− xt

exists in X}

and

Ax = limt→0+

T (t)x− xt

with A : D(A) ⊂ X → X.

Theorem 1 Let T (t) be a C0 semigroup in X. Then there exist constants! ≥ 0 and M ≥ 1 such that

∣T (t)∣ ≤Me!t t ≥ 0.

See [Pa]: Theorem 2.2.

Outline of proof: We want to show there exists � > 0 and M > 0 suchthat ∣T (t)∣ ≤ M on [0, �]. If not, then there exists tn → 0+ such that∣T (tn)∣ ≥ n, which implies ∣T (tn)x∣ is unbounded for some x ∈ X (by theuniform boundedness principle). This contradicts the fact that t → T (t)xcontinuously on [0, �] for each x ∈ X. Then define ! ≡ 1

� ln(M). Then

e!t = M t/� for any t > 0.Let t = � + k� for some integer k and � ∈ [0, �]. Then ∣T (t)∣ =

∣T (�)T (�)k∣ ≤MMk ≤MM t/� = Me!t.

Notation: Write A ∈ G(M,!). For M = 1 and ! = 0, A ∈ G(1, 0) isthe infinitesimal generator of the contraction semigroup: ∣T (t)x− T (t)y∣ ≤∣x− y∣.

18

Theorem 2 Let T (t) be a C0 semigroup and let A be its infinitesimal gen-erator. Then

1. For x ∈ X,

limℎ→0

1

ℎ

∫ t+ℎ

tT (s)xds = T (t)x.

2. For x ∈ X,∫ t

0T (s)xds ∈ D(A) and A

(∫ t

0T (s)xds

)= T (t)x− x.

3. If x ∈ D(A), then T (t)x ∈ D(A) and

d

dtT (t)x = AT (t)x = T (t)Ax.

In other words, D(A) is invariant under T (t), and on D(A) at least,

T (t)x0 is a solution of

{x(t) = Ax(t)x(0) = x0.

4. For x ∈ D(A),

T (t)x− T (s)x =

∫ t

sT (�)Axd� =

∫ t

sAT (�)xd�.


Corollary 1 A ∈ G(M,!)⇒ D(A) is dense in X and A is a closed linearoperator.

See [Pa]: Corollary 2.5.

Recall: By definition, a linear operator A is closed is equivalent to theproperty that A has a closed graph in X ×X.

That is, Gr(A) = {(x, y) ∈ X × X∣x ∈ D(A), y = Ax} is closed or forany (xn, yn) ∈ Gr(A), if xn → x and yn → y, then x ∈ D(A) and y = Ax.

Question: Does there exist a one-to-one relationship between a semigrouupand its infinitesimal generator?

Theorem 3 Let T (t) and S(t) be C0 semigroups on X with infinitesimalgenerators A and B respectively. If A = B, then T (t) = S(t), i.e., theinfinitesimal generator uniquely determines the semigroup on all of X.


19

3 Generators

3.1 Introduction to Generation Theorems

How do we tell when A, derived from a PDE (recall Example 1: the heatequation), is actually a generator of a C0 semigroup? This is important,because it leads to the idea of well-posedness and continuous dependence ofsolutions for an IBVPDE.

Well-posedness of a PDE is equivalent to saying that a unique solutionexists in some sense and is continuous with respect to data. In other words,{

x(t) = Ax(t) + F (t)x(0) = x0,

is satisfied in some sense and the corresponding semigroup generated solutionx(t) = T (t)x0 +

∫ t0 T (t−s)F (s)ds, yields the map (x0, F )→ x(⋅;x0, F ), that

is then continuous in some sense (depending on the spaces used).Probably the best known generation theorem is the Hille-Yosida theo-

rem. First, however, we review resolvents. In the study of C0-semigroups,one frequently encounters the operators �I − A, for � ∈ ℂ (also denotedby � − A), and their inverse R�(A) = (� − A)−1. Here ℂ is the field ofcomplex scalars. For a linear operator A in X, we denote the resolvent setby �(A) = {� ∈ ℂ∣� − A has range ℛ(� − A) dense in X and � − A hasa continuous inverse on ℛ(� − A)}. We recall that any continuous denselydefined linear operator in X can be extended continuously to all of X. For� ∈ �(A), we denote the resolvent operator in ℒ(X) by R�(A) = (�−A)−1.The spectrum �(A) of a linear operator A is the complement in ℂ of theresolvent set �(A).

3.2 Hille-Yosida Theorems [HP, Pa, Sh, T]

Theorem 4 (Hille - Yosida) For M ≥ 1, ! ∈ ℝ, we have A ∈ G(M,!) ifand only if

1. A is closed and densely defined (i.e., A closed and D(A) = X).

2. For real � > !, we have � ∈ �(A) and R�(A) satisfies

∣R�(A)n∣ ≤ M

(�− !)n, n = 1, 2, . . . .

Note: The equivalent version of Hille-Yosida in [Pa] is stated differently:

20

Theorem 5 (Hille - Yosida) A ∈ G(1, 0) ⇐⇒

1. A is closed and densely defined.

2. R+ ⊂ �(A) and for every � > 0, ∣R�(A)∣ ≤ 1� .


Homework Exercises

∙ Ex. 5 :

a) Study the proof of Theorem 3.1 in [Pa] and read Section 1.3carefully.

b) Show that the version of Hille-Yosida in [Pa] is completely equiv-alent to the version stated previously. (Hint: This is not anexercise in proving Hille-Yosida. It is an exercise in comparingA ∈ G(M,!) and A ∈ G(1, 0) with regard to exponential ratesand norms.)

3.3 Results from the Hille-Yosida proof

Results of the necessity portion of the proof

Out of the necessity portion of the proof of Hille-Yosida we find the repre-sentation:

R�(A)x = R(�,A)x ≡∫ ∞

0e−�tT (t)xdt for x ∈ X, � > !

This says that the Laplace transform of the semigroup is the resolvent op-erator.

We now ask the question: can we recover the semigroup T (t) from theresolvent R�(A) through some kind of inverse Laplace transform? Yes -later!!

Homework Exercises

∙ Ex. 6 : Show that the heat equation operator of Example 1 generatesa C0 semigroup in X = L2(0, l).

(That is, show thatA' = (D(�)'′(�))′ onD(A) = {' ∈ H2(0, l)∣'(0) =0, '′(l) = 0} is the infinitesimal generator of a C0 semigroup in X.)

Note: If you want to, you can take D(�) = D constant for this exercise.

21

Results of the sufficiency portion of the proof

In the sufficiency proof of Hille-Yosida, we encounter the Yosida approxi-mation:

For � > !,A� ≡ �AR�(A) = �2R�(A)− �I.

From this definition, we can see that A� is bounded and defined on all ofX. To see the above relationship, note that

�AR�(A) = �[(A− �I)R�(A) + �R�(A)].

We also found that for A satisfying the hypothesis of Hille-Yosida, we have

lim�→∞

A�x = Ax for all x ∈ D(A).

This says that for large �, A� acts like A. Moreover, under this hypothesis,A� is the infinitesimal generator of the uniformly continuous semigroup ofcontraction operators : etA� . Then we can argue that T (t)x ≡ lim

�→∞etA�x,

for all x ∈ X, is the desired semigroup that A generates. (The proof isconstructive.) See Corollary 3.5 in [Pa].

3.4 Corollaries to Hille-Yosida

Corollary 2 Let A be the infinitesimal generator of a C0 semigroup in X,and A� be the Yosida approximation, then

T (t)x = lim�→∞

etA�x for all x ∈ X.

Corollary 3 If A is the infinitesimal generator of a C0 semigroup in X,then we actually have {�∣Re(�) > !} ⊂ �(A) and for such �

∣R�(A)n∣ ≤ M

(Re�− !)n.

Corollary 4 A ∈ G(M,!) implies D(A) is dense in X, and moreover, we

find∞∩n=1D(An) is also dense in X.

22

4 Dissipative Operators

Definition 6 Let X be a Hilbert space, then a linear operator A : D(A) ⊂X → X is dissipative if

Re ⟨Ax, x⟩ ≤ 0 for all x ∈ D(A).

Theorem 6 A linear operator A is dissipative if and only if ∣(�I −A)x∣ ≥�∣x∣ for all x ∈ D(A) and � > 0.

This result yields that for dissipative operators we have �I − A is con-tinuously invertible on ℛ(�−A) when � > 0. Thus, � > 0 implies � ∈ �(A)whenever ℛ(�−A) is dense in X. Note: Proof will come later.

Theorem 7 (Lumer-Phillips) Suppose A is a linear operator in a Hilbertspace X.

1. If A is densely defined, A − !I is dissipative for some real !, andℛ(�0 −A) = X for some �0 with Re (�0) > !, then A ∈ G(1, !).

2. If A ∈ G(1, !), then A is densely defined, A − !I is dissipative andℛ(�0 −A) = X for all �0 with Re (�0) > !.

Proof of Theorem 6

A is dissipative means Re ⟨Ax, x⟩ ≤ 0 for all x ∈ D(A). This impliesRe⟨−Ax, x⟩ ≥ 0. So we have:

∣(�−A)x∣∣x∣ ≥ ∣⟨(�−A)x, x⟩∣≥ Re⟨(�−A)x, x⟩≥ �⟨x, x⟩= �∣x∣2.

Conversely, suppose ∣(�I −A)x∣ ≥ �∣x∣ for all x ∈ D(A) and � > 0.Let x ∈ D(A).

Define y� = (�−A)x and z� = y�∣y�∣ . Then we have ∣z�∣ = 1 and

�∣x∣ ≤ ∣(�−A)x∣= ⟨(�−A)x, z�⟩= �Re⟨x, z�⟩ − Re⟨Ax, z�⟩ (19)

≤ �∣x∣∣z�∣ − Re⟨Ax, z�⟩= �∣x∣ − Re⟨Ax, z�⟩.

23

This implies

Re⟨Ax, z�⟩ ≤ 0. (20)

We always have the relationship

−Re⟨Ax, z�⟩ ≤ ∣Ax∣. (21)

Combining equations (19) and (21), we obtain

�∣x∣ ≤ �Re⟨x, z�⟩ − Re⟨Ax, z�⟩≤ �Re⟨x, z�⟩+ ∣Ax∣. (22)

Rearranging equation (22), we have

�Re⟨x, z�⟩ ≥ �∣x∣ − ∣Ax∣

or

Re⟨x, z�⟩ ≥ ∣x∣ −1

�∣Ax∣. (23)

Observe that we have ∣z�∣ = 1 for � > 0. This implies that as �k →∞, z�k ⇀ z with ∣z∣ ≤ 1. (Note: The norm is weakly lower semi-continuous;therefore, z�k ⇀ z implies ∣z∣ = ∣w lim

kz�k ∣ ≤ lim∣z�k ∣ = 1. This is discussed

in detail in Section 5.2.)Taking the limit of equation (42), we obtain

Re⟨Ax, z⟩ ≤ 0. (24)

Similarly, taking the limit of equation (23), we have

Re⟨x, z⟩ ≥ ∣x∣. (25)

Thus because ∣z∣ ≤ 1 we can make the arguments

∣x∣ ≤ Re⟨x, z⟩≤ ∣⟨x, z⟩∣≤ ∣x∣∣z∣≤ ∣x∣.

Therefore, we actually have the equality

Re⟨x, z⟩ = ∣⟨x, z⟩∣ = ⟨x, z⟩ = ∣x∣. (26)

24

Define x = z∣x∣; then ⟨x, x⟩ = ∣x∣2 by (26) as ⟨x, z⟩ = ∣x∣ implies ⟨x, z∣x∣⟩ =∣x∣⟨x, z⟩ = ∣x∣2. From equation (24),

Re⟨Ax, z⟩ ≤ 0implies Re⟨Ax, z∣x∣⟩ ≤ 0or Re⟨Ax, x⟩ ≤ 0.

We also have ∣x∣ ≤ ∣x∣ because ∣x∣ = ∣z∣∣x∣ ≤ ∣x∣.

We claim that x = x. If this is true, then (27) would imply Re⟨Ax, x⟩ ≤0 for all x ∈ D(A) and we are finished. We need the following lemma toprove this.

Lemma 1 (Duality Set Lemma) Let x ∈ X be arbitrary in a Hilbert spaceX and x ∈ X such that ⟨x, x⟩ = ∣x∣2 and ∣x∣ ≤ ∣x∣; then x = x.

Proof0 ≤ ⟨x− x, x− x⟩

= ∣x∣2 − ⟨x, x⟩ − ⟨x, x⟩+ ∣x∣2≤ ∣x∣2 − ∣x∣2 − ∣x∣2 + ∣x∣2= 0

Therefore, ∣x− x∣ = 0 implies x = x; i.e., in a Hilbert space, F (x) ≡ dualityset = {x}. For x ∈ X,F (x) = {x∗ ∈ X∣⟨x, x∗⟩ = ∣x∣2 = ∣x∗∣2}.

As we shall see below, for a general Banach space X, the duality set isa subset of the dual space X∗. Because Hilbert spaces are self-dual, in aHilbert space X the concept reduces to one involving subsets of the Hilbertspace X itself.

Definition 7 Let X be a Hilbert space with A : D(A) ⊂ X → X withD(A) dense in X. . Then the adjoint of A, denoted A∗, is defined by

D(A∗) = {y ∈ X∣ there exists w ∈ X satisfying ⟨Ax, y⟩ = ⟨x,w⟩ for all x ∈ D(A)}

andA∗y = w.

Corollary 5 (to Lumer-Phillips) Let X be a Hilbert space and A : D(A) ⊂X → X with D(A) dense and A closed in X. If both A and A∗ are dissipa-tive, then A is an infinitesimal generator of a C0 semigroup of contractionson X.

25

Proof By Lumer-Phillips, it suffices to argue that ℛ(� − A) = X for some� > 0.Take � = 1:

Recall that in a Hilbert space we have

⟨Ax, y⟩ = ⟨x,A∗y⟩ for x ∈ D(A), y ∈ D(A∗).

But by assumption A is dissipative and closed, which means

{(x,Ax)∣x ∈ D(A)} is closed in X ×X.

Therefore, ℛ(I −A) is a closed subspace of X.

Claim: ℛ(I −A) = X.Suppose ℛ(I−A) ∕= X. Then by the Hahn-Banach theorem in a Hilbert

space, there exists x ∈ X, x ∕= 0, such that

⟨x, (I −A)x⟩ = 0.

In other words

⟨x, x⟩ = ⟨x, Ax⟩ for all x ∈ D(A).

Therefore

x ∈ D(A∗) and ⟨A∗x, x⟩ = ⟨x, x⟩,

or

⟨x−A∗x, x⟩ = 0 for all x ∈ D(A).

However, D(A) is dense, which implies

x−A∗x = 0 or x = A∗x. (27)

By assumption, A∗ is also dissipative. Therefore

∣(I −A∗)x∣ ≥ ∣x∣ for all x ∈ D(A∗). (28)

Combining (27) and (28) we have

∣x∣ ≤ ∣x−A∗x∣∣x∣ ≤ ∣x− x∣∣x∣ ≤ 0.

This is a contradiction since x ∕= 0. Therefore, ℛ(I −A) = X.

26

Theorem 8 In a Hilbert space X, if A is dissipative and ℛ(� − A) = Xfor some � > 0, then A is densely defined.

See [Pa]: Theorem 4.6.Note: This also holds in a reflective Banach space (more on this later).

Important practical results:

1. In a Hilbert space, we only need A to be dissipative andℛ(�I−A) = Xfor some � > 0 in order to use Lumer-Phillips theorem.

2. In a real Hilbert space (such as real L2(0, l)), we can get by with⟨Ax, x⟩ ≤ 0.

Lemma 2 (A Practical Lemma) Let X be a complex Hilbert space and sup-pose that for u ∈ D(A), Au is real whenever u is real (for example, anoperator A coming from a PDE defined by real coefficients), then A is dis-sipative, i.e.,

Re⟨Ax, x⟩ ≤ 0 for all x ∈ D(A)

if and only if

⟨Au, u⟩ ≤ 0 for all real valued u ∈ D(A).

Proof:We have for x = u+ iv, with u, v, real:

⟨Ax, x⟩ = ⟨A(u+ iv), u+ iv⟩= ⟨Au, u⟩+ ⟨Av, v⟩+ i[⟨Av, u⟩ − ⟨Au, v⟩].

We want to argue that ⟨Av, u⟩ − ⟨Au, v⟩ is real; however, for any complexHilbert space we have the polarization identity [GP, p. 168]

⟨u, v⟩ =1

4{∣u+ v∣2 + i∣u+ iv∣2 − i∣u− iv∣2 − ∣u− v∣2}.

But if u and v are real-valued we have ∣u+ iv∣ = ∣u− iv∣ and hence

⟨u, v⟩ =1

4{∣u+ v∣2 − ∣u− v∣2}

(note that this is the polarization identity in a real Hilbert space) which isreal valued. Since u and v are real, then by assumption Au and Av are real.Therefore, ⟨Av, u⟩ − ⟨Au, v⟩ is real-valued and hence

27

Re ⟨Ax, x⟩ = ⟨Au, u⟩+ ⟨Av, v⟩.

Thus to have Re ⟨Ax, x⟩ ≤ 0, it suffices to argue

⟨Au, u⟩ ≤ 0 for all real valued u in D(A).

Remark: Any Banach space is a Hilbert space if and only if the paral-lelogram law

∣x+ y∣2 + ∣x− y∣2 = 2∣x∣2 + 2∣y∣2

holds. If this holds in a Banach space, the polarization identity can be usedto define a norm compatible inner product.

To illustrate the use of the Lumer-Phillips theorem, we return to someof the previous examples.

Examples using Lumer-Phillips Theorem

Return to Example 1: The Heat Equation

We return to the heat diffusion example where we found the PDE

∂y

∂t(t, �) =

∂

∂�

(D(�)

∂y

∂�

)+ f(t, �)

y(t, 0) = 0, D(l)∂y

∂�(t, l) = 0

gave rise to an operator A in X = L2(0, l) given by

D(A) = {' ∈ H2(0, l)∣(D'′)′ ∈ L2(0, l), '(0) = 0, '′(l) = 0},

andA' = (D'′)′.

Remark: Notice that if D is smooth, we have

D(A) = {' ∈ H2(0, l)∣'(0) = 0, '′(l) = 0}.

In view of the remarks following the Lumer-Phillips theorem, to showthat A generates a C0 semigroup in X, it suffices to show that A is dissipativeand that ℛ(�−A) = X; i.e. (�−A)D(A) = X for some � > 0.

28

We have for ' ∈ D(A), ' real-valued:

⟨A','⟩ = ⟨(D'′)′, '⟩=

∫ l0(D'′)′'

= −⟨D'′, '′⟩+D'′'∣l0= −

∫ 10 D('′)2

≤ 0

for D ≥ 0 (a natural assumption since D is the coefficient of thermal diffu-sion).

To argue that ℛ(� − A) = L2(0, l) for some � > 0, we note that this isequivalent to arguing that for some � > 0, the boundary value problem

�'− (D'′)′ = , 0 < x < l,

'(0) = 0, '′(l) = 0,

has a solution ' ∈ H2(0, l) for any ∈ L2(0, l). But one can answer thislatter question in the affirmative (assuming D > 0 is sufficiently regular,e.g., D ∈W 1

∞) using classical Sturm-Liouville theory. (See [BR, CH, D, S].)Thus, we readily see that A is a densely-defined generator of a C0 solutionsemigroup for the system above.

Homework Exercises

∙ Ex. 7 : Consider the general transport example, Example 2, and showthat it generates a C0 semigroup.

Return to Example 2 : The General Transport Equation

We next reconsider the population model given by

∂y

∂t(t, �) +

∂

∂�(�(�)y(t, �)) =

∂

∂�

(D(�)

∂y

∂�(t, �)

)− �(�)y(t, �)

y(t, 0) = 0,

[D(�)

∂y

∂�(t, �)− �(�)y(t, �)

]�=l

= 0.

The corresponding operator equation in X = L2(0, l) is in terms of theoperator A given by

D(A) =

{' ∈ H2(0, l) ∣(D'′)′ − (�')′ ∈ L2(0, l), '(0) = 0,[D'′ − �'

](l) = 0}

29

A' = (D'′)′ − (�')′ − �'.

Again we show that A is dissipative in L2(0, l) (actually we shall argue thata translation of A is dissipative). For real valued ' ∈ D(A), we have

⟨A','⟩ = ⟨(D'′ − �')′, '⟩ − ⟨�', '⟩,

= −⟨(D'′ − �'), '′⟩+ (D'′ − �')'∣l0 − ⟨�', '⟩,

= −⟨D'′, '′⟩ − ⟨�', '⟩+ ⟨�', '′⟩.

Hence, using ab = 1√2�a√

2�b ≤ 14�a

2 + �b2, we find

⟨A','⟩ ≤ −⟨D'′, '′⟩ − ⟨�', '⟩+ 14� ∣�'∣

2 + �∣'′∣2

= ⟨(−D + �)'′, '′⟩+ ⟨(−�+ �2

4� )','⟩

≤ k4� ∣'∣

2

for � sufficiently small so that D(�) ≥ � and k chosen so that −4��+�2(�) ≤k. Thus, we have that A− ( k4�)I is dissipative in L2(0, l).

The range statement ℛ(�+ k4� −A) = X again reduces to a question for

Sturm-Liouville problems. For any ∈ L2(0, l), we seek an L2(0, l) solutionof

(�+k

4�+ �)'+ (�')′ − (D'′)′ = , 0 < � < l,

'(0) = 0, [D'′ − �'](l) = 0.

For � sufficiently large, the Sturm-Liouville theory guarantees existence ofsolutions. Thus A − k

4� is the generator of a C0-semigroup S(t) so that

T (t) = ek4�tS(t) is the solution semigroup generated by A.

30

5 Adjoint Operators and Dual Spaces

5.1 Adjoint Operators

Recall that A∗ : D(A∗) ⊂ X → X is defined in Definition 7 and we requiredthat A be densely defined in X. We argue that this defines A∗ uniquely ifD(A) is dense in X. To show this, suppose there exist w1 and w2 such that

⟨Ax, y⟩ = ⟨x,w1⟩ for all x ∈ D(A)

and⟨Ax, y⟩ = ⟨x,w2⟩ for all x ∈ D(A).

Then,⟨x,w1 − w2⟩ = 0 for all x ∈ D(A).

If D(A) is dense in X, then w1 − w2 must equal zero. Therefore, w1 = w2,and the adjoint of A is uniquely defined.

5.1.1 Computation of A∗: an Example from the Heat Equation

The computation of A∗ can be easy or impossible. Let X = L2(0, l), thenthe operator A from the heat equation is defined by

A' = (D'′)′

onD(A) = {' ∈ H2(0, l) ∣ (D'′)′ ∈ L2(0, l), '(0) = '′(l) = 0}.

For ' ∈ D(A), D real, and ∈ L2(0, l) = X, we have

⟨A', ⟩ =∫ l

0(D'′)′

= −∫ l

0 D'′ ′ +D'′ ∣l0

=∫ l

0 '(D ′)′ − 'D ′∣l0 +D'′ ∣l0= ⟨', (D ′)′⟩ − 'D ′∣l0 +D'′ ∣l0= ⟨',w⟩ for all ' ∈ D(A)

if and only if ′(l) = 0 and (0) = 0. (We must, of course, have ∈ H2(0, l)and (D ′)′ ∈ L2(0, l) to carry out the operations.) Hence

D(A∗) = { ∈ H2(0, l) ∣ (D ′)′ ∈ L2(0, l), (0) = ′(l) = 0}.

andA∗ = (D ′)′.

In this case, ⟨A', ⟩ = ⟨',A∗ ⟩; therefore A = A∗, D(A) = D(A∗), andhence A is a self adjoint operator.

31

5.1.2 Self Adjoint Operators and Dissipativeness

For self adjoint linear operators, if ' ∈ D(A) = D(A∗), we have

Re ⟨A','⟩ = Re ⟨',A∗'⟩ = Re ⟨A∗','⟩.

Therefore, whenever A is self adjoint, A is dissipative if and only if A∗ isdissipative. Thus, for any closed, densely-defined, dissipative self adjointoperator in X, both A and A∗ are infinitesimal generators (see Corollary5 to the Lumer-Phillips theorem) of C0 semigroups T (t) ∼ eAt and S(t) ∼eA∗t. From a fundamental Hilbert space result, given below, we know that

S(t) = T (t)∗.

Theorem 9 If A is an infinitesimal generator of a C0 semigroup T (t) on aHilbert space X, then T (t)∗ = S(t) is a C0 semigroup on X with infinitesimalgenerator A∗.

Homework Exercises

∙ Ex. 8 (Project 1) : Consider D(A∗) in a Hilbert space X where A isdensely defined in X.

– What can you say about D(A∗) with respect to closed, dense,nonempty, all of X, relationship with A−1 and (A∗)−1 when theyexist, and symmetric versus self adjoint?

– Now consider all the same questions in a Banach space X. (Note:In a Banach space D(A∗) ⊂ X∗.)

Give complete examples and counter examples. Also, give completereferences.

For relevant material, see [DS, HP, K, RN, Y].

5.2 Dual Spaces and Strong, Weak, and Weak* Topologies

Before discussing dissipative operators in a Banach space X, we must dis-cuss topological dual spaces X∗. We talk about strong, weak, and weak*topologies in a Banach space X. For relevant material, see [DS, K, Y].

Let X be a Banach space. Then X∗, called the dual space or “topologi-cal” dual space, is a Banach space whenever X is a normed linear space andthe scalar field is complete. It is defined as the Banach space of continuous(conjugate or anti-) linear functionals on X, where a continuous linear func-tional satisfies ∣f(x)∣ ≤ k∣x∣ for all x ∈ X and a conjugate linear functional

32

satisfies f(�x + �y) = �f(x) + �f(y) for all x, y ∈ X. The elements of X∗

are denoted by x∗ where ∣x∗∣∗ = sup∣x∣≤1

∣x∗(x)∣, where ∣ ⋅ ∣∗ denotes the norm

in X∗. This of course gives rise to a strong or normed topology of X∗.If X is a complex Hilbert space, then X ≈ X∗, and we have the Riesz

theorem.

Theorem 10 (Riesz theorem) If X is a Hilbert space, then for every f ∈X∗, there is a corresponding f ∈ X such that

f(x) = ⟨f , x⟩

and

f(�x+ �y) = ⟨f , �x+ �y⟩ = �⟨f , x⟩+ �⟨f , y⟩

for all x ∈ X.

Definition 8 Weak convergence of a sequence {xn} ⊂ X, xn ⇀ x, is de-fined by

xn ⇀ x if and only if x∗(xn)→ x∗(x) for all x∗ ∈ X∗.

We write x = wlim xn.

Definition 9 X is said to be weakly sequentially complete if and only ifevery weakly Cauchy sequence in X converges weakly to an element x ∈ X.

Result Section 1

There are a number of useful results related to weak convergence.

1. Weak limits are unique.

To see this, suppose xn ⇀ x and xn ⇀ y, then x∗(x) = x∗(y), i.e,x∗(x − y) = 0 for all x∗ ∈ X∗. However, the Hahn-Banach theoremsays if z ∕= 0, then there exists an x∗ ∈ X∗ such that x∗(z) ∕= 0.Therefore, x = y.

2. (a) xn → x implies xn ⇀ x, but the converse is not true.

(b) xn ⇀ x implies {xn} is bounded in X and ∣wlim xn∣ = ∣x∣ ≤ liminf ∣xn∣.Note: This says that the norm is weakly lower semi-continuous.Recall if f is continuous, then ∣f(x) − f(x0)∣ < � or f(x0) − � <

33

f(x) < f(x0) + � for x near x0 . Lower semi-continuous meanswe just have f(x0)− � < f(x) for x sufficiently near x0. This hasimportant applications in optimization (see the remark below).

(c) xn → x if and only if x∗(xn) converges uniformly for ∣x∗∣∗ ≤ 1.

Remark: Application of Semi-continuity in Optimization.Weak lower semi-continuity plays a fundamental role in many optimiza-

tion settings. Suppose we’re in a Hilbert space X and we want to minimize afunction J(x) over K ⊂ X. Further suppose we have a minimizing sequence{xn} such that J(xn) ↘ inf

x∈KJ(x) and {xn} is bounded, i.e., ∣xn∣ ≤ M .

As we shall see below, bounded sequences possess weakly convergent sub-sequences. Therefore we can choose a subsequence converging weakly, i.e,xnk ⇀ x. If J is weakly lower semi-continuous, then we have J(wlim xnk) ≤lim inf J(xnk) or J(x) ≤ inf

x∈KJ(x). Whenever K is weakly closed, this

yields existence of a minimum (not just an infimum) when minimizing aweakly lower semi-continuous J . These facts are often used in constructivearguments to develop computational algorithms.

Definition 10 Weak convergence generates a topology in X (use gener-alized sequences or nets) called the weak topology of X or X∗ topology ofX.

Note: Weakly closed implies strongly closed, but not conversely in general.However, we do have an important case for which weak and strong closureare equivalent.

Theorem 11 (Mazur - 1933) A convex set E ⊂ X is X∗ closed if and onlyif X is closed.

Thus, for convex sets, a set is weakly closed if and only if it is stronglyclosed. See [DS] for relevant information.

Definition 11 Since X∗ is a Banach space, we can define its dual, X∗∗.There is a natural embedding X ↪→ X∗∗ defined by J : X → X∗∗ where

J(x)(x∗) = x∗(x)

for all x∗ ∈ X∗.

In general, J(X) ⊂ X∗∗, but J(X) ∕= X∗∗.

Definition 12 If J(X) = X∗∗, then X is called a reflexive Banach space.

34

Result Section 2

The following elementary results can be argued.

1. Hilbert spaces are reflexive.

2. X is a reflexive Banach space implies X is weakly sequentially com-plete.

3. X is a reflexive Banach space if and only if {x ∈ X∣ ∣x∣ ≤ 1} is compactin the weak topology. Therefore, in a Banach space, bounded sets areweakly, sequentially pre-compact.

4. Bounded sets in a Hilbert space are weakly, sequentially pre-compact.(We have this from Results 1. and 3.)

5. Let X be a Hilbert space. Then

xn ⇀ x and ∣xn∣ → ∣x∣ implies xn → x.

Conversely,

xn ⇀ x and xn → x implies ∣xn∣ → ∣x∣.

Therefore,

xn ⇀ x implies xn → x if and only if ∣xn∣ → ∣x∣.

This is easy to see as

∣xn − x∣2 = ⟨xn − x, xn − x⟩ = ∣xn∣2 − ⟨xn, x⟩ − ⟨x, xn⟩+ ∣x∣2.

Definition 13 For a Banach spaceX, X∗ is itself a Banach space; therefore,the weak topology of X∗ can be defined and is called the X∗∗ topology.

We can now consider weak convergence in X∗ which can be characterizedby

x∗n ⇀ x∗ ∈ X∗ if and only if x∗∗(x∗n)→ x∗∗(x∗) for all x∗∗ ∈ X∗∗.

Definition 14 We can put an even weaker topology on X∗. We can define

a weak* convergence of {x∗n} ⊂ X∗, denoted as x∗nw∗⇀ x∗, by

x∗nw∗⇀ x∗ ≡ x∗n(x)→ x∗(x) for all x ∈ X.

(i.e., the pointwise convergence of (anti) linear functionals.) Note that thisis in general a weaker convergence than weak convergence since J(X) ⊂ X∗∗with J(X) ∕= X∗∗ in general.

35

Result Section 3

We have the following results.

1. x∗nw∗⇀ x∗ implies ∣x∗n∣∗ ≤M .

2. x∗n → x∗ in X∗ implies x∗nw∗⇀ x∗ in X∗ (i.e., strong convergence implies

weak* convergence).

The converse does not hold.

3. If X is a Banach space, then X∗ is weak* sequentially complete andthe norm ∣ ⋅ ∣∗ is weak* lower semi-continuous. In other words,

∣w*lim inf x∗n∣∗ ≤ lim inf ∣x∗n∣∗.

4. x∗n ⇀ x∗ in X∗ implies x∗nw∗⇀ x∗ in X∗ (i.e., weak convergence implies

weak* convergence).

The weak* convergence in X∗ can be used to define a topology for X∗,called the X topology of X∗. In general, the X topology of X∗ is weakerthan the X∗∗ topology of X∗. In other words, in X∗

strong convergence implies weak convergence implies weak* convergence;

however, the converses do not hold.

5.2.1 Summary of topologies on X and X∗

X X∗

strong strong

weak weak( X∗ topology of X) ( X∗∗ topology of X∗)

– weak *( X topology of X∗)

Note:For any Banach space Y , we can talk about weak convergence, but we canonly talk about weak* convergence in the case that Y = X∗ = the dual ofsome space X !!

Theorem 12 (Alaoglu’s Theorem) Let X be a Banach space. Then the unitball (closed unit sphere) in X∗ is compact in the weak* topology (i.e., the Xtopology of X∗).

Corollary 6 Bounded sets in X∗ are weak* sequentially compact.

36

Result Section 4

We can summarize additional findings that are consequences of the aboveresults.

1. Let J : X → X∗∗ be the natural embedding so that for a reflex-ive Banach space, J(X) = X∗∗. Hence we have that the weak andweak* topologies of X∗ are the same. To see this we argue x∗∗(x∗n) =J(x)(x∗n) = x∗n(x) and hence

x∗∗(x∗n)→ x∗∗(x∗)

is weak sense convergence with

x∗∗(x∗) = J(x)(x∗) = x∗(x).

Since x∗n(x)→ x∗(x) is weak* convergence, these are thus equivalent.

Conversely, if X is not reflexive, then J(X) ∕= X∗∗ and the X topologyof X∗ is weaker than the X∗∗ topology of X∗. Thus in a reflexiveBanach space (which includes Hilbert spaces), bounded sets are weaklysequentially compact. (This is a special case of Alaoglu’s theorem.)

2. A Banach space is reflexive if and only if the unit ball is compact inweak topology.

5.3 Examples of Spaces and Their Duals

∙ We first consider Lp(0, 1) and recall

L∗p(0, 1) = Lq(0, 1), 1 ≤ p <∞, 1

p+

1

q= 1.

In other words, L∞ is the dual of a space, L∗1 = L∞; however, we donot have a satisfactory representation for the dual L∗∞ of L∞. In otherwords, we cannot write L1 as the dual of L∞ or any other known space.As a result, we can consider the weak* topology in Lp for 1 < p ≤ ∞,but we cannot talk about the weak* topology in L1.

∙ Lp(0, T ;X) are important spaces because they are very useful in study-ing the system x(t) = Ax(t) +F (t), 0 < t < T, in X. We have (to bediscussed later)

Lp(0, T ;X)∗ ∼= Lq(0, T ;X∗), 1 ≤ p <∞.

See [E] for relevant details on Banach space valued functions of timeon intervals (0, T ).

37

∙ The Banach space of (bounded) continuous functions on [a, b] is de-noted by C[a, b]. It has a norm defined by ∣'∣ = sup

�∈[a,b]∣'(�)∣ for

' ∈ C[a, b]. Then we have the following theorem from [DS, Chap-ter IV].

Theorem 13 The following equivalences hold:

C∗[a, b] ∼= rba[a, b] ∼= NBV[a, b]

where NBV[a, b] stands for normalized bounded variation functions andrba[a, b] stands for regular bounded additive set functions.

Therefore we can consider the weak* topology for NBV[a, b]; however,C[a, b] is not the dual of another space, and specifically NBV[a, b]∗ ∕=C[a, b] (i.e., we do not have satisfactory characterizations or representations–there is no known space X such C[a, b] = X∗). Thus we cannot talkabout the weak* topology in C[a, b].

Discussion of rba[a, b]: Suppose � is a regular set function on Borelsubsets of [a, b]. Then by definition of regular, given a set F ⊂ [a, b]and � > 0, there exists a closed set E and an open set O such thatE ⊂ F ⊂ O and �(O − E) < �. Additive means that for any twoBorel subsets A and B contained in [a, b] such that A∩B = ∅, we have�(A ∪B) = �(A) + �(B).

Discussion of NBV[a, b]: BV[a, b] has a weak norm ∣f ∣ = ∣f(a+)∣+var(f) where var(f) is the total variation

var(f) = sup�[a,b]

N∑i=1

∣f(xi)− f(xi−1)∣,

and �[a, b] denotes the partitions of [a, b]. Note that var(f) provides asemi-norm on BV and BV can be written as BVo ⊕ N where BVo ={f ∈ BV∣f(a+) = 0 and N is the 1-D space of constant functions}.NBV[a, b] is normalized by making f right-continuous at the interiorpoints and f(a+) = 0 with ∣f ∣ = var(f).

Suppose f ∈ NBV[a, b]; then we can write [Ru1, p. 101], [Ru2, p. 163]

f(x) = f(a) + pf (x)− nf (x)

v(x) = pf (x) + nf (x),

38

where pf , nf and v are nondecreasing monotone functions defined by

pf (x) = sup�[a,x]

N∑i=1

(f(xi)− f(xi−1))+,

nf (x) = sup�[a,x]

N∑i=1

∣(f(xi)− f(xi−1))−∣, and

v(x) = sup�[a,x]

N∑i=1

∣f(xi)− f(xi−1∣,

where (⋅)+ and (⋅)− are the positive and negative parts of (⋅), respec-tively. Using the monotone functions pf and nf , one can generate(positive) Lebesgue-Stieltjes measures �pf and �nf [DS, Hal, HeSt,McS, Ru1, Ru2, Smir] and subsequently a signed measure

�f = �pf − �nf ,

where �f is regular. From the above theorem we have C∗[a, b] ∼=rba(a, b), which means there is a one-to-one correspondence

x∗ ↔ �,

given by

x∗(') =

∫'d�.

We also have C∗[a, b] ∼= NBV[a, b], which means we have

x∗ ↔ �f ,

given by

x∗(') =

∫'d�f

(=

∫'df

)where the integrals are Lebesque-Stieltjes integrals.

∙ We also need to consider Sobolev spaces, W k,p(Ω). Recall W k,p(Ω) isreflexive for 1 < p <∞ and, moreover, W k,2(Ω) is a Hilbert space. We

can define W k,p0 (Ω) as Sobolev spaces with compact support (“func-

tions vanish at the boundary”), or we can think of W k,p0 (Ω) as the

closure of C∞0 (Ω) in W k,p. We can consider their dual spaces and find

W k,p0 (Ω)∗ ∼= W−k,p(Ω) 1 < p <∞, 1

p+

1

q= 1.

For relevant material, see [A, DS].

39

5.4 Return to Dissipativeness for General Banach Spaces

Definition 15 Let X be a Banach space and X∗ be its conjugate dual.Then

x∗(x) = ⟨x∗, x⟩is called the duality product.

Definition 16 For each x in a Banach space X, we define the duality setF (x) ⊂ X∗ by

F (x) = {x∗ ∈ X∗∣ ⟨x∗, x⟩ = ∣x∣2X = ∣x∗∣2X∗}.

We know F (x) ∕= ∅ for each x ∈ X by the Hahn-Banach theorem.

Result

If X is a Hilbert space, one can show F (x) = {x} under the usual isomor-phism X ≃ X∗. (This is the Duality Set Lemma, Section 4)

Definition 17 Let A : D(A) ⊂ X → X for a Banach space X. Then A iscalled dissipative if for every x ∈ D(A) there exists some x∗ ∈ F (x) suchthat

Re⟨x∗, Ax⟩ ≤ 0.

Result

The theorems stated previously for Hilbert spaces are still true in a Banachspace with the definition of dissipativeness given above. (Lumer-Phillips,characterizations, etc.)See [Pa, p. 13-16].

5.5 More on Adjoint Operators

Let X and Y be normed linear spaces with X∗ and Y ∗ as their conjugateduals, respectively. For D(A) dense in X, suppose we have the operatorA : D(A) ⊂ X → Y . Then we can define the adjoint operator A∗ : D(A∗) ⊂Y ∗ → X∗ with

D(A∗) = {y∗ ∈ Y ∗∣ x→ y∗(Ax) is continuous on D(A)}.

Given y∗ ∈ Y ∗, we see that y∗ ∈ D(A∗) if and only if there exists x∗ ∈ X∗such that y∗(Ax) = x∗(x) for all x ∈ D(A). Therefore, we define A∗ by

A∗y∗ = x∗.

40

In other words,(A∗y∗)(x) = x∗(x).

5.5.1 Adjoint Operator in a Hilbert Space

In a Hilbert space, X∗ ∼ X and Y ∗ ∼ Y ; therefore,

⟨Ax, y⟩Y = ⟨x,A∗y⟩X .

5.5.2 Special Case

In general, we do not need to have T ∈ ℒ(X,Y ) for T ∗ to be defined;however, if we do have T ∈ ℒ(X,Y ), then T ∗ ∈ ℒ(Y ∗, X∗).

Theorem 14 Let X and Y be reflexive Banach spaces. If T is a closedlinear operator T : X → Y with D(T ) dense, then D(T ∗) is dense in Y ∗(∼Y ) and T ∗∗ = T .

Note: This does not say anything about the relationship between D(T ) andD(T ∗).

5.6 Examples of Computing Adjoints

In this series of examples, we illustrate how the domain, and in particular,boundary conditions, can affect the definition of an operator. In this casewe study operators which are all essentially the operations of second orderdifferentiation, i.e., A' = '′′, arising in the heat equation operator of Ex-ample 1. As we shall see, the boundary conditions affect dramatically thenature of the adjoint A∗.

We define H20 (0, 1) = {' ∈ H2(0, 1)∣'(0) = '(1) = '′(0) = '′(1) = 0}.

1. Let V = H20 (0, 1) and H = L2(0, 1). We define the operator A by

A : D(A) = V ⊂ H → H

byA' = '′′.

In order to define the adjoint operator A∗, we need to find, for each , an element w such that

⟨A', ⟩ = ⟨',w⟩ for all ' ∈ H20 (0, 1).

41

Therefore, we may carry out the calculations

⟨A', ⟩ =∫ 1

0 '′′

= −∫ 1

0 '′ ′ + '′ ∣10

=∫ 1

0 ' ′′ − ' ′∣10 + '′ ∣10

=∫ 1

0 ' ′′ + 0 + 0

= ⟨', ′′⟩

which hold if and only if ∈ H2.

ThereforeD(A∗) = H2(0, 1)

withA∗ = ′′.

Observe that D(A∗) ∕= D(A); therefore, A is not self-adjoint. However,A∗ = A on their common domain H2

0 . Thus we see that there areboundary conditions which immediately render the heat operator non-self-adjoint.

2. Let H = L2(0, 1), V = H20 (0, 1) and recall that V ∗ = H−2(0, 1) ≡

{ ∣x →∫ x

0

∫ s0 (t)dtds is in L2(0, 1)} (see [A]). We define the opera-

tor A byA : H → V ∗

by(A') = ⟨', ′′⟩H

for ' ∈ H and ∈ V . We claim that A ∈ ℒ(H,V ∗) and this shows thatdifferentiation can be bounded (continuous) if the spaces are chosenin a certain way. To verify our claim show ∣A'∣V ∗ ≤ k∣'∣H . We have

∣(A')( )∣ = ∣⟨', ′′⟩∣≤ ∣'∣L2 ∣ ′′∣L2

≤ ∣'∣H ∣ ∣V .

Therefore∣(A')( )∣∣ ∣V

≤ ∣'∣H .

However

∣A'∣V ∗ = sup ∕=0

∣(A')( )∣∣ ∣V

.

42

Thus∣A'∣V ∗ ≤ ∣'∣H

which implies that A ∈ ℒ(H,V ∗). From this we know A∗ ∈ ℒ(V,H).Next we seek to find A∗ by using

(A∗ )(') = (A')( ) = ⟨', ′′⟩

for ' ∈ H. Thus A∗ is defined on all of V by

A∗ = ′′.

As in the previous example, A is not self adjoint.

3. Let H = L2(0, 1) and define the operator A1 by

A1 : D(A1) ⊂ H → H

whereD(A1) = H2(0, 1) ∩H1

0 (0, 1)

withA1' = '′′.

We know A1 /∈ ℒ(H) from a previous assignment. In order to find theadjoint operator A∗1, we want for each to find a w such that

⟨A1', ⟩ = ⟨',w⟩ for all ' ∈ D(A1).

However⟨A1', ⟩ =

∫ 10 '′′

=∫ 1

0 ' ′′ − ' ′∣10 + '′ ∣10

=∫ 1

0 ' ′′ + 0 + '′ ∣10

=∫ 1

0 ' ′′

if and only if ∈ H2 ∩H10 . Hence

A∗1 = ′′

withD(A∗1) = H2(0, 1) ∩H1

0 (0, 1) = D(A1).

Thus A1 is self-adjoint; however, it’s not a bounded linear operator onH.

43

4. Let V = H10 (0, 1). Then V ∗ = H−1(0, 1). We can define the operator

A1 byA1 : V → V ∗

with(A1')( ) = −⟨'′, ′⟩H .

We argue A1 ∈ ℒ(V, V ∗). To see this we must establish ∣A1'∣V ∗ ≤k∣'∣V for some k. We find

∣(A1')( )∣ = ∣⟨'′, ′⟩∣≤ ∣'′∣L2 ∣ ′∣L2

≤ ∣'∣V ∣ ∣V ,

and hence∣(A1')( )∣∣ ∣V

≤ ∣'∣V .

However

∣A1'∣V ∗ = sup ∕=0

∣(A1')( )∣∣ ∣V

.

Therefore∣A1'∣V ∗ ≤ ∣'∣V

which yields that A1 ∈ ℒ(V, V ∗). We know that A1∗ ∈ ℒ(V, V ∗). Next

we would like to find A∗1 which must satisfy

(A1∗ )(') = (A1')( ) = −⟨'′, ′⟩

for ' ∈ D(A1). Hence A1∗

is defined on all of V by

(A1∗ )' = −⟨'′, ′⟩.

Thus we find that A1 is a bounded self adjoint linear operator.

44

6 Weak* Convergence, the Prohorov Metric, In-verse Problems and Optimization

6.1 Weak* Convergence and the Prohorov Metric

We turn next to discuss functional analysis concepts that are integral to sev-eral types of inverse problems. These problems entail parameter estimationusing output observations where the parameters to be estimated are proba-bility distributions. Such problems can be readily found in certain areas ofapplications [BBi, BBH, BBKW, BBPP, BD, BDTR, GRD-FP, BDEHAD,GRD-FP2, BF, BFPZ, BG1, BG2, BGIT]. Here we consider two types ofmeasure dependent problems: individual dynamics and aggregate dynamics.

6.1.1 Type I: Individual Dynamics/Aggregate Data

Recall, from Example 5 (the size-structured population model for mosquitofishwhere there is a family G of individual growth rates g) we have the expectedsize density u(t, �;P ) = ℰ [v(t, �; ⋅)∣P ] described for a general probabilitydistribution P , we have

u(t, �;P ) =

∫Gv(t, �; g)dP (g),

as the expected value of the density v(t, �; g) which satisfies for a given gthe Sinko-Streiffer.

Since in typical inverse problems, we must use aggregate population datadij to estimate P itself, we are therefore required to understand the qualita-tive properties (continuity, sensitivity, etc.) of u(t, �;P ) as a function of P .Thus, we see that the data for parametric estimation or inverse problems isdij ∼ u(t, �;P ).

6.1.2 Type II: Aggregate Dynamics

We return to the electromagnetics example of Example 4 which entails po-larization of inhomogeneous dielectric materials. In this case, individual(particle) dynamics are not available. Instead, the dynamics (13) them-selves depend on a probability measure F , e.g.,

∂u

∂t+

∂2

∂x2u = f(u, F ).

To describe the behavior of the electric polarization P , we begin withthe general formulation of Chapter 2 of [BBL] by employing a polarization

45

kernel g in the convolution expression

P (t, z) =

∫ t

0g(t− s, z; �)E(s, z)ds. (29)

As explained in Section 2.5 and [BBL], this general formulation includes asspecial cases the well known orientational or Debye polarization model, theelectronic or Lorentz polarization model, and linear combinations thereof,as well as other higher order models. In the Debye case the kernel is givenby

g(t; �) = �0(�s − �∞)/� e−t/� ,

while in the Lorentz model (again see [BBL]), it takes the form

g(t; �) = �0!2p/�0 e

−t/2� sin(�0t).

However, use of these kernels presupposes that the material may be suf-ficiently defined by a single relaxation parameter � , which is generally notthe case. In order to account for multiple relaxation parameters in the po-larization mechanisms, we allow for a distribution of relaxation parameterswhich is conveniently described in terms of a probability measure F . Thus,we define our polarization model in terms of a convolution operator

P (t, z) =

∫ t

0G(t− s, z;F )E(s, z)ds

where G is determined by various polarization mechanisms each describedby a different parameter � , and therefore is given by

G(t, z;F ) =

∫Tg(t, z; �)dF (�)

where T ⊂ [�1, �2].To obtain the macroscopic polarization, we sum over all the parameters.

We cannot separate dynamics to obtain individual dynamics, and thereforewe have (13) in Section 2.5 as an example where the dynamics for the Eand H fields depend explicitly on the probability measure F .

For inverse problems, data dij ≈ u(ti, �j) = u(ti, �j ;P ) in Type I, whereasdij ≈ E(ti, xj ;F ) in Type II. Observe that P or F ∈ P(Q), which is thespace of probability measures, where Q consists of “parameters”, and P(Q)is a set of probability distributions over Q.

For example, in ordinary least square problems, we have

J(P ) =∑ij

∣dij − u(ti, �j ;P )∣2 or J(F ) =∑ij

∣dij − E(ti, xj ;F )∣2,

46

to be minimized over P or F ∈ P(Q) or over some subset of P(Q). We need(for computational and theoretical considerations) a sense of closeness of twoprobability distributions P1 and P2. The ideas of local minima, continuity,“gradient”, etc., depend on a metric (or topology) for P1 and P2.

More fundamentally for Type II systems, the “well-posedness of sys-tems”, including existence and continuous dependence of solutions on “pa-rameters”, depends on the concept of F1 and F2 being close, i.e., F →E(t, x;F ) is continuous.

47

6.2 Metrics on Probability Spaces

We consider P(Q) as the space of probability measures or distributions onQ. Let P1 and P2 be probability distributions. If working on the real line(−∞,∞) or (0,∞) we will sometimes not distinguish between the probabil-ity measure P and its cumulative distribution function F .

We need to understand the concept of a metric or distance �(P1, P2)between P1 and P2. There are several metrics that metrize (i.e.,generate anequivalent topology for) the weak* topology on P including

i) Levy−− denoted here by �L(P1, P2),

ii) Prohorov−−denoted by �PR(P1, P2),

iii) Bounded Lipschitz−−denoted by �BL(P1, P2),

and also some that do not metrize the weak* topology on P including

iv) Total variation−− denoted by �TV (P1, P2),

v) Kolmogorov−− denoted by �K(P1, P2).

For relevant material, see [B, H, P].Previous developments in probability theory provide helpful results in

the pursuit of a possible complete computational methodology. One of themost important tools in probability theory is the Prohorov metric, which wewill now formally define. Let Q be a complete metric space with metric d.Given a closed subset A of Q, define the �-neighborhood of A as

A� = {q ∈ Q : d(q, q) ≤ � for some q ∈ A}= {q ∈ Q : inf

q∈Ad(q, q) ≤ �}.

We define the Prohorov metric � : P(Q)× P(Q)→ R+ by

�(P1, P2) ≡ inf{� > 0 : P1(A) ≤ P2(A�) + �, A closed ⊂ Q}.

This can be shown to be a metric on P(Q) and has many properties including

(a.) If Q is complete, then (P(Q), �) is a complete metric space;

(b.) If Q is compact, then (P(Q), �) is a compact metric space.

Note that the definition of � is not intuitive. For example, we do not nec-essarily know what Pk → P in � means. We have the following importantcharacterizations [B]. Given Pk, P ∈ P(Q), the following convergence state-ments are equivalent:

48

1. �(Pk, P )→ 0;

2.∫Q fdPk(q)→

∫Q fdP (q) for all bounded, continuous f : Q→ R1;

3. Pk[A]→ P [A] for all Borel sets A ⊂ Q with P [∂A] = 0.

Thus, we immediately obtain the following results:

∙ Convergence in the � metric is equivalent to convergence in distri-bution or so-called “weak” convergence of measures encountered inprobability theory.

∙ Let C∗B(Q) denote the topological dual of CB(Q), where CB(Q) is theusual space of bounded continuous functions on Q with the supremumnorm. If we view P(Q) ⊂ C∗B(Q), convergence in the � topology isequivalent to weak* convergence in P(Q). Note the misnomer “weakconvergence of measures” used by probabilists which is actually weak*convergence in the functional analysis sense defined here and in almostall standard texts.

More importantly,

�(Pk, P )→ 0 is equivalent to

∫Qx(ti; q)dPk(q)→

∫Qx(ti; q)dP (q),

for any continuous functions q → x(ti; q). Thus Pk → P in � metric isequivalent to

ℰ [x(ti; q)∣Pk]→ ℰ [x(ti; q)∣P ]

or “convergence in expectation”. This yields that

P → J(P ) =

n∑i=1

∣ℰ [x(ti; q)∣P ]− di∣2

is continuous in the � topology. Continuity of P → J(P ) allows us to assertthe existence of a solution to min J(P ) over P ∈ P(Q).

These results give us the following theorem.

Theorem 15 The Prohorov metric metrizes the weak* topology of P(Q).

Now we may consider whether there are other (equivalent) metrics thatmetrize the weak* topology on P. But first we point out an application instatistics where the need for metrics on distributions arises.

49

Robust Statistics

Prohorov (also sometimes found as Prokhorov in translations) was interestedin finding a useful metric for the frequently used “convergence in distribu-tion” or “convergence in measure” in probability theory. Later theoreti-cal statisticians became interested in (and concerned with) the underlyingfoundations for statistical testing and in particular inference procedures [H].Specifically, attention was given to the “stability” of assumptions underwhich statistical asymptotic theories may be valid. This led to “robust-ness” (insensitivity to small deviations in underlying assumptions) or RobustStatistics. These ideas focus on “distributional robustness” as embodied ininsensitivity to “small” derivations in distributions (often from the normalor Gaussian distribution assumptions found in many formulations) and alsohow lack of this robustness might affect the validity of statistical tests. Thisled to a renewed interest in the use of distributional metrics such as that ofProhorov. As formulated by Prohorov this led to attempts to find a metricfor the “weak” topology on P defined as the weakest topology on P suchthat for every bounded continuous ( ∈ CB(Q)) the map

P →∫Q dP

is continuous, i.e. consider P ⊂ C∗B(Q).To describe the various metrics that have found use by statisticians and

probabilitists, it is desirable to recall the concept of a Polish space which isdefined as a complete, separable, metrizable topological space. We turn toa brief summary of other metrics of interest.

The Levy Metric

The Levy metric is defined in the special case for Q = R1, where one usuallydoes not distinguish between the probability measure or distribution P1 andits cumulative distribution function (cdf) F1. The metric is given by

�L(P1, P2) = inf{�∣ for all x, P1(x− �)− � ≤ P2(x) ≤ P1(x+ �) + �}.

One can argue that this is a symmetric function and indeed defines a metric.Remarks

∙√

2�L(P1, P2) is maximum distance between graphs of P1, P2 measuredalong 45o direction. It is a theorem that the Levy metric metrizes theweak* topology of P.

50

∙ The Prohorov metric is more difficult to visualize but is applicablewhen Q is any complete separable metric space (Polish space), notjust the real line (Levy case). For example, when Q is a function spacesuch as growth rates or mortality rates in Sinko-Streiffer (mosquito fishproblem), the Prohorov metric is applicable whereas the Levy metricis not.

We have already noted that both the Levy and Prohorov metrics metrizethe weak* topology on P, the former only when Q = R1. Moreover, asnoted above, we can argue that for Q a complete separable metric space,then (P(Q), �PR) is a complete separable metric space.

Recall separability implies the existence of a subsetQ0 that is a countabledense subset of Q. One can then obtain a countable dense subset of P bydefining (see also Theorem 17 below)

P0 = {measures with finite support in Q0 with rational masses}= {P =

∑finite

pi�qi ∣pi rational, {qi} ⊂ Q0}

The Bounded Lipschitz Metric:

Assume without loss of generality that distance function d on Q is boundedby 1. (If necessary, one can replace the metric by an equivalent metric

d(q1, q2) = d(q1,q2)1+d(q1,q2) .) Then define

�BL(P1, P2) = sup ∈Ψ

∣∣∣∣∫Q dP1 −

∫Q dP2

∣∣∣∣where Ψ = { ∈ C(Q) : ∣ (q1)− (q2)∣ ≤ d(q1, q2)}

A fundamental result of practical importance is

Theorem 16 For all P1, P2 ∈ P(Q),

�PR(P1, P2)2 ≤ �BL(P1, P2) ≤ 2�PR(P1, P2).

Thus, �PR and �BL define the same topology and hence �BL also metrizesthe weak topology on P(Q).

Other metrics

Other metrics of interest can be found in the literature. These include

51

∙ Total variation:

�TV (P1, P2) = supA∈ℬ∣P1(A)− P2(A)∣,

and

∙ Kolmogorov: (Q = R1)

�K(P1, P2) = supx∈R′∣P1(x)− P2(x)∣.

These metrics do not metrize the weak topology, but do satisfy

�L ≤ �PR ≤ �TV�L ≤ �K ≤ �TV .

Why is the knowledge about �BL versus �PR useful? We have that

Δk ≡∣∣∣∣∫ G(g)dPk(g)−

∫G(g)dP (g)

∣∣∣∣=

∣∣∣∣∫ G(g)fk(g)dg −∫G(g)f(g)dg

∣∣∣∣=

∣∣∣∣∫ G(g)[fk(g)− f(g)]dg

∣∣∣∣≤ �BL(Pk, P )

≤ 2�PR(Pk, P ).

Therefore, if we know Pk → P in �PR, it may be useful in direct esti-mates. But Δk → 0 is Prohorov metric convergence itself if G ∈ CB(Q).

52

6.3 Populations with Aggregate Data and Uncertainty

We consider approximation methods in estimation or inverse problems–quantity of interest is a probability distribution assume we have parameter(q ∈ Q) dependent system with model responses x(t, q) describing popula-tion of interest For data or observations, we are given a set of values {yl}for the expected values

ℰ [xl(q)∣P ] =

∫Qxl(q)dP (q)

for model xl(q) = x(tl, q) wrt unknown probability distribution P describingdistribution of parameters q over population

Use data to choose from a given family P(Q) the distribution P ∗ thatgives best fit of underlying model to data

Formulate ordinary least squares (OLS) problem–not essential–couldequally well use a WLS, MLE, etc., approach

Seek to minimize

J(P ) =∑l

∣ℰ [xl(q)∣P ]− yl∣2

over P ∈ P(Q)Even for simple dynamics for xl is an infinite dimensional optimiza-

tion problem–need approximations that lead to computationally tractableschemes

That is, it is useful to formulate methods to yield finite dimensional setsPM (Q) over which to minimize J(P ) Of course, we wish to choose thesemethods so that “PM (Q)→ P(Q)” in some sense

In this case we shall use Prohorov metric [BBPP, BBi] of weak* conver-gence of measures to assure the desired approximation results

A general theoretical framework is given in [BBPP] with specific resultson the approximations we use here given in [BBi, BPi]. Briefly, ideas for theunderlying theory are as follows:

1. One argues continuity of P → J(P ) on P(Q) with the Prohorov metric

2. If Q is compact then P(Q) is a complete metric space–also compact–when taken with Prohorov metric

3. Approximation families PM (Q) are chosen so that elements PM ∈PM (Q) can be found to approximate elements P ∈ P(Q) in Prohorovmetric

53

4. Well-posedness (existence, continuous dependence of estimates on data,etc.) obtained along with feasible computational methods

The data {yl} available (which, in general, will involve longitudinal ortime evolution data) determines the nature of the problem.

∙ Type I: In what we have referred to above as Type I problems one hasonly aggregate or population level longitudinal data available. This iscommon in marine, insect, etc., catch and release experiments [BK]where one samples at different times from the same population butcannot be guaranteed of observing the same set of individuals at eachsample time. This type of data is also typical in experiments wherethe organism or population member being studied is sacrificed in theprocess of making a single observation (e.g., certain physiologicallybased pharmacokinetic (PBPK) modeling [BPo, Po] and whole organ-ism transport models [BK]). In this case one may still have dynamic(i.e., time course) models for individuals, but no individual data isavailable.

∙ Type II: The second class of problems which above we called Type IIproblems involves dynamics which depend explicitly on the probabilitydistribution P itself. In this case one only has dynamics (aggregatedynamics) for the expected value

x(t) =

∫Qx(t, q)dP (q)

of the state variable. No dynamics are available for individual trajecto-ries x(t, q) for a given q ∈ Q. The electromagnetic example of Example4 is precisely this situation. Such problems also arise in viscoelasticityas well as biology (the HIV cellular models of Banks, Bortz and Holte[BBH]) – see also [BBPP, BG1, BG2, BPi].

While the approximations we discuss below are applicable to both typesof problems, we shall illustrate the computational results in the context ofsize-structured marine populations (mosquitofish, shrimp) and PBPK prob-lems (TCE) where the inverse problems are of Type I. Finally, we notethat in the problems considered here, one can not sample directly from theprobability distribution being estimated and this again is somewhat differentfrom the usual case treated in some of the statistical literature, e.g., see[Wahba1, Wahba2] and the references cited therein.

54

Example 5: The Growth Rate Distribution Model and InverseProblem in Marine Populations

Our motivating application entails estimation of growth rate distributionsfor size-structured mosquitofish and shrimp populations. Mosquitofish areused in place of pesticides to control mosquito populations in rice fields. Ma-rine biologists desire to correctly predict growth and decline of mosquitofishpopulation in order to determine the optimal densities of mosquitofish to useto control mosquito populations. A mathematical model that accurately de-scribes the mosquitofish population would be beneficial in this application,as well as in other problems involving population dynamics and age/size-structured data.

Based on data collected from rice fields, a reasonable mathematicalmodel would have to predict two key features that are exhibited in thedata: dispersion and bifurcation (i.e., a unimodal density becomes a bi-modal density) of the population density over time [BBKW, BF, BFPZ].As mentioned above in Example 5, the Sinko-Streifer model does exhibit ei-ther of these under reasonable biological assumptions. However, the Growthrate distribution (GRD) model, developed in [BBKW] and [BF], capturesboth of these features in its solutions.

The model is a modification of the Sinko-Streifer model (used for mod-eling age/size-structured populations) which allows individuals to have dif-ferent characteristics or behaviors.

Figure 1: Mosquitofish data.

55

Recall that the Sinko-Streifer model (SS) for size-structured mosquitofishpopulations is given by

∂v

∂t+

∂

∂�(gv) = −�v, �0 < � < �1, t > 0 (30)

v(0, �) = Φ(�)

g(t, �0)v(t, �0) =

∫ �1

�0

K(t, �)v(t, �)∂�

g(t, �1) = 0.

Here v(t, �) represents size (given in numbers per unit length) or populationdensity, where t represents time and x represents length of mosquitofish–growth rate of individual mosquitofish given by g(t, �), where

d�

dt= g(t, �) (31)

for each individual (all mosquitofish of given size have same growth rate)As we have noted, the SS model cannot be used as is to model the

mosquitofish population because it does not predict dispersion or bifurcationof the population in time under biologically reasonable assumptions [BBKW,BF]. But by modifying the SS model so that the individual growth rates ofthe mosquitofish vary across the population (instead of being the same forall individuals in the population), one obtains a model, known as the growthrate distribution (GRD) model which does in fact exhibit both dispersal intime and development of a bimodal density from a unimodal density (see[BF, BFPZ]).

As noted in Example 5, in the (GRD) model, the population densityu(t, �;P ) is actually given by

u(t, �;P ) =

∫Gv(t, �; g)dP (g). (32)

where G is collection of admissible growth rates, P is probability measureon G, and v(t, �; g) is solution of (SS) with g. This model assumes that thepopulation is made up of collections of subpopulations –individuals in samesubpopulation have same growth rate

As demonstrated in [BF], solutions to GRD model exhibit both disper-sion and bifurcation of the population density in time. Here we assume thatthe admissible growth rates g have the form

g(�; b, ) = b( − �)

56

for �0 ≤ � ≤ and zero otherwise, where b is the intrinsic growth rate of themosquitofish and = �1 is the maximum size. This choice based on work in[BBKW], where idea of other properties related to the growth rates varyingamong the mosquitofish is discussed.

Under assumption of varying intrinsic growth rates and maximum sizes,assume that b and are random variables taking values in the compact setsB and Γ, respectively. A reasonable assumption is that both are boundedclosed intervals. Thus we take

G = {g(⋅; b, )∣b ∈ B, ∈ Γ}

so that G is also compact in, for example, C[�0, �1] where �1 = max(Γ). ThenP(G) is compact in the Prohorov metric and we are in framework outlinedabove. In illustrative examples, one may choose a growth rate parameterizedby the intrinsic growth rate b with = 1 , leading to a one parameter familyof varying growth rates g among the individuals in the population. We alsoassume here that � = 0 and K = 0 in order to focus on only the distributionof growth rates; however, distributions could just as well be placed on � andK.

Next, introduce two different approaches that can be used in inverseproblem for estimation of distribution of growth rates of the mosquitofish

The first approach, which has been discussed and used in [BF] and[BFPZ], involves the use of delta distributions. We assume that probabil-ity distributions PM placed on growth rates are discrete corresponding toa collection GM with the form GM = {gk}Mk=1 where gk(x) = bk(1 − �), fork = 1, . . . ,M . Here the {bk} are a discretization of B. For each subpopu-lation with growth rate gk, there is a corresponding probability pk that anindividual is in subpopulation k. The population density u(t, x;P ) in (32)is then approximated by

u(t, �; {pk}) =∑k

v(t, �; gk)pk,

where v(t, �; gk) is the subpopulation density from (SS) with growth rate gk.We denote this delta function approximation method as DEL(M), where Mis number of elements used in this approximation.

While it has been shown that DEL(M) provides a reasonable approxima-tion to (32), a better approach might involve techniques that will provide asmoother approximation of (32) in the case of continuous probability distri-butions on the growth rates. Thus, as a second approach, we chose to use anapproximation scheme based on piecewise linear splines. Here we have as-sumed that P is a continuous probability distribution on the intrinsic growth

57

rates. We approximate the density P ′ = dPdb = p(b) using piecewise linear

splines, which leads to the following approximation for u(t, �;P ) in (32):

u(t, �; {ak}) ≈∑k

ak

∫Bv(t, �; g)lk(b)db,

where g(�; b) = b(1−�), pk(b) = aklk(b) is the probability density for an indi-vidual in subpopulation k and lk represents the piecewise linear spline func-tions. This spline based approximation method is denoted by SPL(M,N),where M is the number of basis elements used to approximate the growthrate probability distribution and N is the number of quadrature nodes usedto approximate the integral in the formula above.

One can use the approximation methods DEL(M) and SPL(M,N) in theinverse problem for the estimation of the growth rate distributions. Theleast squares inverse problem to be solved is

minP∈PM (G)

J(P ) =∑i,j

∣u(ti, �j ;P )− uij ∣2 (33)

=∑i,j

(u(ti, �j ;P ))2 − 2u(ti, �j ;P )uij + (uij)2,

where {uij} is the data and PM (G) is the finite dimensional approximationto P(G). When using DEL(M), the finite dimensional approximation PM (G)to the probability measure space P(G) is given by

PM (G) =

{P ∈ P(G)∣ P ′ =

∑k

pk�gk ,∑k

pk = 1

},

where �gk is the delta function with an atom at gk. However, when usingSPL(M,N), the finite dimensional approximation PM (G) is given by

PM (G) =

{P ∈ P(G)∣ P ′ =

∑k

aklk(b),∑k

ak

∫Blk(b)db = 1

}.

Furthermore, we note that this least squares inverse problem (33) be-comes a quadratic programming problem [BF, BFPZ]. Letting p be thevector that contains pk, 1 ≤ k ≤M, when using DEL(M) or ak, 1 ≤ k ≤M,when using SPL(M,N), we let A be the matrix with entries given by

Akm =∑i,j

v(ti, �j ; gk)v(ti, �j ; gm),

58

let b the vector with entries given by

bk = −∑i,j

uijv(ti, �j ; gk),

andc =

∑i,j

(uij)2,

where 1 ≤ k,m ≤M. In the place of (33), we now minimize

F (p) ≡ pTAp + 2pTb + c (34)

over PM (G). We note when using DEL(M) we also had to include the con-straint ∑

k

pk = 1,

while when using SPL(M,N) we had to include the constraint∑k

ak

∫Blk(b)db = 1.

However, in both cases, we were able to include these constraints along withnon-negativity constraints on the {pk} and {ak} in the programming of thesetwo inverse problems.

Other Size-Structured Population Models

The Sinko-Streifer (SS) model [SS] and its variations have been widely usedto describe numerous age and size-structured populations (see [BBDS1,BBDS2, BBKW, BF, BFPZ, BA, Kot, MetzD] for example).

One recent example involves the shrimp growth models of [Shrimp, BDEHAD].Dispersion in size observed in experimental data for early growth of shrimphas been observed in data from different raceways at Shrimp MaricultureResearch Facility, Texas Agricultural Experiment Station in Corpus Christi,TX. Initial sizes were very similar but variability is observed in aggregatetype longitudinal data. A reasonable model for these populations must ac-count for variability in size distribution data which perhaps is a result ofvariability in individual growth rates across population.

In summary, the GRD model (32) represents one approach to account-ing for variability in growth rates by imposing a probability distribution onthe growth rates in the SS model (30). Individuals in the population grow

59

according to a deterministic growth model (31), but different individuals inthe population may have different parameter dependent growth rates in theGRD model. The population is assumed to consist of subpopulations withindividuals in the same subpopulation having the same growth rate. Thegrowth uncertainty of individuals in the population is the result of variabilityin growth rates among the subpopulations. This modeling approach, whichentails a stationary probabilistic structure on a family of deterministic dy-namic systems, may be most applicable when the growth of individuals isassumed to be the result of genetic variability.

60

6.4 Two Player Min-Max Games with Uncertainty

We consider electromagnetic evasion-interrogation games wherein the evadercan use ferroelectric material coatings to attempt to avoid detection whilethe interrogator can manipulate the interrogating frequencies (wave num-bers) and angles of incidence of the interrogating inputs to enhance detec-tion and identification. The resulting problems are formulated as two playergames in which one player wishes to minimize the reflected signal while theother wishes to maximize it. Simple deterministic strategies are easily de-feated and hence the players must introduce uncertainty to disguise theirintentions and confuse their opponent. Mathematically, the resulting gameis carried out over spaces of probability measures which in many cases areappropriately metrized using the Prohorov metric. Details of the resultsdiscussed in this section can be found in [BGIT].

In [BIKT] the authors demonstrated that it is possible to design fer-roelectric materials with appropriate dielectric permittivity and magneticpermeability to significantly attenuate reflections of electromagnetic inter-rogation signals from highly conductive targets such as airfoils and missiles.This was done under assumptions that the interrogating input signal is uni-formly likely to come from a sector of interrogating angles � ∈ [�0, �1](� = �

2 − � where � is the angle of incidence of the input signal) but thatthe evader has knowledge of the interrogator’s input frequency or frequencies(denoted in the presentations below as the interrogator design frequenciesID). These results were further sharpened and illustrated in [BIT] wherea series of different material designs were considered to minimize over agiven set of input design frequencies ID the maximum reflected field frominput signals. In addition, a second critical finding was obtained in thatit was shown that if the evader employed a simple counter interrogationdesign based on a fixed set (assumed known) of interrogating frequenciesID, then by a rather simple counter-counter interrogation strategy (use aninterrogating frequency little more than 10% different from the assumed de-sign frequencies), the interrogator can easily defeat the evader’s materialcoatings counter interrogation strategy to obtain strong reflected signals.

From the combined results of [BIKT, BIT] it is thus rather easily con-cluded that the evader and the interrogator must each try to confuse theother by introducing significant uncertainty in their design and interro-gating strategies, respectively. This concept, which we refer to as mixedstrategies in recognition of previous contributions to the literature on games(von Neumann’s finite mixed strategies to be explained below), leads totwo player non-cooperative games with probabilistic strategy formulations.

61

These can be mathematically formulated as two sided optimization prob-lems over spaces of probability measures, i.e., minmax games over sets ofprobability measures.

6.4.1 Problem Formulation

We consider electromagnetic interrogation of objects in the context of min-max evader-interrogator games where each player has uncertain informationabout the adversary’s capabilities. The minmax cost functional is basedon reflected fields from an object such as an airfoil or missile and can becomputed in one of several ways [BIKT, BIT].

Dielectric Coating

z

y

z=0

z=d

Ambient

Perfectly conducting half plane

φ

Figure 2: Interrogating high frequency wave impinging (angle of incidence�) on coated (thickness d) perfectly conducting surface

The simplest is the reflection coefficient based on a simple planar geome-try (e.g., see Figure 2) using Fresnel’s formula for a perfectly conducting halfplane which has a coating layer of thickness d with dielectric permittivity �and magnetic permeability �. A normally incident (� = 0) electromagneticwave with the frequency f is assumed to impinge the half plane. Then thecorresponding wavelength � in air is � = c/f , where the speed of light isc = 0.3× 109. The reflection coefficient R for the wave is given by

R =a+ b

1 + ab, (35)

where

a =�−√��+√��

and b = e4i�√��fd/c. (36)

62

This expression can be derived directly from Maxwell’s equation by con-sidering the ratio of reflected to incident wave for example in the case ofparallel polarized (TEx) incident wave (see [BIKT, Jackson]).

An alternative and much more computationally intensive approach (whichmay be necessitated by some target geometries) employs the far field pat-tern for reflected waves computed directly using Maxwell’s equations (seeExample 4). In two dimensions, for a reflecting body Ω with coating layerΩ1 and computational domain Π with an interrogating plane wave E(i), thescattered field E(s) satisfies the Helmholtz equation [CK1, CK2]

∇ ⋅(

1

�∇E(s)

)+ �!2E(s) = −∇ ⋅

(1

�∇E(i)

)− �!2E(i) in Π ∖ Ω

E(s) = −E(i) on ∂Ω[1

�

∂E

∂n

]= [E] = 0 on ∂Ω1 ∖ ∂Ω

∂E(s)

∂n− ikE(s) − i

2k

∂2E(s)

∂s2= 0 on ∂Π

∂E(s)

∂s− ik3

2E(s) = 0 at C,

(37)

where the Silver-Muller radiation condition has been approximated by asecond-order absorbing boundary condition on ∂Π as described in [BJR,HKNT, HRT]. The vectors n and s denote the normal and tangential di-rections on the boundary ∂Π, respectively, and C is the set of the cornerpoints of Π. Here [ ⋅ ] denotes the jumps at interfaces. The incident field inair is given by

E(i) = eik(x1 cos�+x2 sin�),

where � is the interrogation angle (� = �2 − �) and k = 2�f/c = !/c is the

interrogation wave number corresponding to the interrogating frequency f .The corresponding far field pattern is given by [BIKT, CK1, CK2]

F�(�+ �; �, �, �, f) =

limr→∞

(√8�kr e−i(kr+�/4) E(s)(r cos(�+ �), r sin(�+ �); �, �, �, f)

),

(38)

where E(s)(x1, x2; �, �, �, f) is the scattered electromagnetic field. This canbe used as a measure of the reflected field intensity instead of the reflectioncoefficient R of (35)–(36).

63

The evader and the interrogator are each subject to uncertainties as tothe actions of the other. The evader wants to choose a best coating design(i.e., best �’s and �’s ) while the interrogator wants to choose best anglesof interrogation � and interrogating frequencies f . Each player must actin the presence of incomplete information about the other’s action. Partialinformation regarding capabilities and tendencies of the adversary can beembodied in probability distributions for the choices to be made. That is,we may formalize this by assuming the evader may choose (with an as yetto be determined set of probabilities) dielectric permittivity and magneticpermeability parameters (�, �) from admissible sets ℰ ×ℳ while the inter-rogator chooses angles of interrogation and interrogating frequencies (�, f)from sets A×ℱ . The formulation here is based on the mixed strategies pro-posals of von Neumann [Aubin, VN, VNM] and the ideas can be summarizedas follows. The evader does not choose a single coating, but rather has a setof possibilities available for choice. He only chooses the probabilities withwhich he will employ the materials on a target. This, in effect, disguises hisintentions from his adversary. By choosing his coatings randomly (accordingto a best strategy to be determined in, for example, a minmax game), heprevents adversaries from discovering which coating he will use–indeed, evenhe does not know which coating will be chosen for a given target. The inter-rogator, in a similar approach, determines best probabilities for choices offrequency and angle in the interrogating signals. Note that such a formula-tion tacitly assumes that the adversarial relationship persists with multipleattempts at evasion and detection.

The associated minmax problem consists of the evader choosing distribu-tions Pe(�, �) over ℰ×ℳ to minimize the reflected field while the interrogatorchooses distributions Pi(�, f) over A×ℱ to maximize the reflected field. IfBR(�, �, �, f) is the chosen measure of reflected field and Pe = P(ℰ ×ℳ)and Pi = P(A × ℱ) are the corresponding sets of probability distributionsor measures over ℰ ×ℳ and A × ℱ , respectively, then the cost functionalfor the minmax problem can be defined by

J(Pe, Pi) =

∫ℰ×ℳ

∫A×ℱ

∣BR(�, �, �, f)∣2dPe(�, �)dPi(�, f). (39)

The problems thus formulated are special cases of classical static zero–sum two player non-cooperative games [Aubin, Basar] where the evader min-imizes over Pe ∈ Pe and the interrogator maximizes over Pi ∈ Pi. In suchgames one defines upper and lower values for the game by

J = infPe∈Pe

supPi∈Pi

J(Pe, Pi)

64

andJ = sup

Pi∈Piinf

Pe∈PeJ(Pe, Pi).

The first represents a security level (worst case scenario) for the evader whilethe latter is a security level for the interrogator. It is readily argued thatJ ≤ J and if the equality J∗ = J = J holds, then J∗ is called the value ofthe game. Moreover, if there exist P ∗e ∈ Pe and P ∗i ∈ Pi such that

J∗ = J(P ∗e , P∗i ) = min

Pe∈PeJ(Pe, P

∗i ) = max

Pi∈PiJ(P ∗e , Pi),

then (P ∗e , P∗i ) is a saddle point solution or non-cooperative equilibrium of the

game.To investigate theoretical, computational and approximation issues for

these problems, it is necessary to put a topology on the space of probabilitymeasures: a natural choice for P(Q) is the Prohorov metric topology Usingits properties and arguments similar to those in [BBi, BG1, BG2, BK, BPi],one can develop well-posedness and approximation results for the minmaxproblems defined above. Efficient computational methods that correspondto von Neumann’s finite mixed strategies [Aubin] can readily be presentedin this context. These can be based on several approximation theories thathave been recently developed and used. The first, developed in [BBi] andbased on Dirac delta measures, is summarized in the following theorem.

Theorem 17 Let Q be a complete, separable metric space, ℬ the class of allBorel subsets of Q and P(Q) the space of probability measures on (Q,ℬ). LetQ0 = {qj}∞j=1 be a countable, dense subset of Q. Then the set of P ∈ P(Q)such that P has finite support in Q0 and rational masses is dense in P(Q)in the � metric. That is,

P0(Q) ≡ {P ∈ P(Q) : P =

k∑j=1

pjΔqj , k ∈ N+, qj ∈ Q0, pj rational,

k∑j=1

pj = 1}

is dense in P(Q) taken with the � metric, where Δqj is the Dirac measurewith atom at qj .

It is rather easy to use the ideas and results associated with this theoremto develop computationally efficient schemes. Given Qd =

∪∞M=1QM with

QM = {qMj }Mj=1 (a “partition” of Q) chosen so that Qd is dense in Q, define

PM (Q) = {P ∈ P(Q) : P =M∑j=1

pjΔqMj, qMj ∈ QM , pj rational,

M∑j=1

pj = 1}.

65

Then we find

(i) PM (Q) is a compact subset of P(Q) in the � metric,

(ii)PM (Q) ⊂ PM+1(Q) whenever QM+1 is a refinement of QM ,

(iii)“PM (Q) → P(Q)” in the � topology; that is, for M sufficiently large,elements in P(Q) can be approximated in the � metric by elements ofPM .

A second class of approximations, based on linear splines, was devel-oped and used in [BPi] for problems where one assumes that the probabilitydistributions to be approximated possess densities in L2.

Theorem 18 Let ℱ be a weakly compact subset of L2(Q), Q compact andlet Pℱ (Q) ≡ {P ∈ P(Q) : P ′ = p, p ∈ ℱ}. Then Pℱ (Q) is compact in P(Q)in the � metric. Moreover, if we define {ℓMj } to be the linear splines on Qcorresponding to the partition QM , where

∪M QM is dense Q, define

PM ≡ {pM : pM =∑j

bMj ℓMj , b

Mj rational }

and if

PℱM ≡ {PM ∈ P(Q) : PM =

∫pM , pM ∈ PM},

we have∪M PℱM is dense in Pℱ (Q) taken with the � metric.

A study comparing the relative strengths and weaknesses of these twoclasses of approximation schemes in the context of inverse problems is givenin [BD].

6.4.2 Theoretical Results

To establish existence of a saddle point solution for the evasion-interrogationproblems formulated above, one can employ a fundamental result of vonNeumann [VN, VNM] as stated by Aubin (see [Aubin, p. 126]).

Theorem 19 (von Neumann) Suppose X0, Y0 are compact, convex subsetsof metric linear spaces X,Y respectively. Further suppose that

(i) for all y ∈ Y0, x→ f(x, y) is convex and lower semi-continuous;

(ii) for all x ∈ X0, y → f(x, y) is concave and upper semi-continuous.

66

Then there exists a saddle point (x∗, y∗) such that

f(x∗, y∗) = minX0

maxY0

f(x, y) = maxY0

minX0

f(x, y).

A straight forward application of these results to the problems underdiscussion lead immediately to desired well-posedness results.

Theorem 20 Suppose ℰ ,ℳ,Φ,ℱ are compact and the spaces X0 = P(ℰ ×ℳ), Y0 = P(Φ × ℱ) are taken with the Prohorov metric. Then X0, Y0

are compact, convex subsets of X = C∗B(ℰ × ℳ) and Y = C∗B(Φ × ℱ),respectively. Moreover, there exists (P ∗e , P

∗i ) ∈ P(ℰ ×ℳ)× P(Φ×ℱ) such

that

J(P ∗e , P∗i ) = min

P(ℰ×ℳ)maxP(Φ×ℱ)

J(Pe, Pi) = maxP(Φ×ℱ)

minP(ℰ×ℳ)

J(Pe, Pi).

For computations, we may use the “delta” approximations of Theorem17 and obtain

dPe(�, �) ≈∑

pje�(�j ,�j)(�, �)d�d�,

dPi(�, f) ≈∑

pji �(�j ,fj)(�, f)d�df.

As noted above, a convergence theory can be found in [BBi]. This cor-responds precisely to von Neumann’s finite mixed strategies framework forprotection by disguising intentions from opponents, i.e., by introducing un-certainty in the players’ choices.

To illustrate the computational framework based on the delta measureapproximations, we take

dPMe (�, �) =M∑j=1

pMj �(�Mj ,�Mj )d�d� and dPNi (�, f) =

N∑k=1

qNk �(�Nk ,fNk )d�df,

which can be represented respectively by

pM ={pMj}Mj=1∈ PM ≡ {p ∈ ℝM ∣ pj ≥ 0,

M∑j=1

pj = 1} (40)

and

qN ={qNk}Nk=1∈ QN ≡ {q ∈ ℝN ∣ qk ≥ 0,

N∑k=1

qk = 1}. (41)

67

We note that in this case the pMj , qNk are the probabilities associated with

the use of the material parameters (�Mj , �Mj ) and interrogating parameters

(�Nk , fNk ) in the mixed strategies of the evader and interrogator, respectively.

Then J(PMe , PNi ) reduces to

J (pM , qN ) =

N∑k=1

M∑j=1

pMj∣∣BR(�Mj , �

Mj , �

Nk , f

Nk )∣∣2 qNk

where BR is a measure of the reflected field (either the reflection coefficientor the far field scattering intensity). Since PM , QN are compact, convexsubsets of ℝM ,ℝN , respectively, we have the following theorem.

Theorem 21 For fixed M,N there exists (pM∗ , qN∗ ) in PM ×QN such that

J ∗ = J (pM∗ , qN∗ ) = min

pM∈PMmaxqN∈QN

J (pM , qN ) = maxqN∈QN

minpM∈PM

J (pM , qN ).

Assume further that (�, �, �, f)→ BR(�, �, �, f) is continuous on ℰ×ℳ×Φ × ℱ which is assumed compact. Then there exists a sequence (pM∗ , q

N∗ )

of elements from PM × QN respectively with corresponding (PMe , PNi ) inP(ℰ ×ℳ)×P(Φ×ℱ) converging in the Prohorov metric to (P ∗e , P

∗i ) which

is a saddle point for the original minmax problem.

We give the arguments for these results since they are in some sense rep-resentative of the use of functional analysis in such problems. The hypothesisthat the map (�, �, �, f)→ BR(�, �, �, f) be continuous on ℰ ×ℳ× Φ× ℱimplies continuity of the map (Pe, Pi)→ J(Pe, Pi) on X0×Y0 → ℝ1

+, whereX0 = P(ℰ ×ℳ) and Y0 = P(Φ×ℱ) are compact convex in X = C∗B(ℰ×ℳ)and Y = C∗B(Φ×ℱ), respectively. This is due to the compactness of ℰ ×ℳand Φ×ℱ , and properties of the Prohorov metric.

Let PM , QN be defined as in (40),(41), and observe that these are com-pact convex subsets of ℝM ,ℝN , respectively. Now let Pe ∈ X0 and Pi ∈ Y0

be arbitrary. Then by Theorem 17 and (iii) above (see also [BBi]), thereexists a sequence pM , qN ∈ PM , QN with associated measures PMe , PNi , re-spectively, such that

PMe → Pe in X0 as M →∞ and PNi → Pi in Y0 as N →∞.

Let PM∗e , PN∗i (guaranteed to exist by continuity, compactness, and thevon Neumann theorem) with coordinates pM∗ , q

N∗ , respectively, satisfy

J(PM∗e , PNi ) ≤ J(PM∗e , PN∗i ) ≤ J(PMe , PN∗i )

68

for all (PMe , PNi ) in X0 × Y0 with coordinate representations in PM , QN ,respectively. Now X0 × Y0 compact implies that there exists a subsequence(PMk∗

e , PNk∗i ) and (Pe, Pi) in X0 × Y0 such that (PMk∗e , PNk∗i )→ (Pe, Pi) in

X0 × Y0.Consider (42) for indices Mk, Nk, so that

J(PMk∗e , PNki ) ≤ J(PMk∗

e , PNk∗i ) ≤ J(PMke , PNk∗i ).

Then given the continuity of J and taking the limit we find

J(Pe, Pi) ≤ J(Pe, Pi) ≤ J(Pe, Pi).

Since Pe, Pi are arbitrary in X0, Y0 respectively, we have (Pe, Pi) is a saddlepoint for the original minmax problem.

Further note that since these arguments hold for any subsequence of(PM∗e , PN∗i ) , then

J(PM∗e , PNi ) ≤ J(PM∗e , PN∗i ) ≤ J(PMe , PN∗i )

and PM∗e → Pe as well as PN∗i → Pi. That is, any subsequence of PM∗e , PN∗i

has a convergent subsequence to Pe, Pi so that the sequence itself mustconverge to Pe, Pi.

69

References

[A] R. A. Adams. Sobolev Spaces. Academic Press, 1975.

[Aubin] J.-P. Aubin, Optima and Equilbria: An Introduction to NonlinearAnalysis, Spring-Verlag, Berlin Heidelberg, 1993.

[BJR] A. Bamberger, P. Joly and J. E. Roberts, Second-order absorbingboundary conditions for the wave equation: a solution for the cornerproblem, SIAM J. Numer. Anal., 27 (1990), 323–352.

[BBDS1] H.T. Banks, J.E. Banks, L.K. Dick, and J.D. Stark, Estimationof dynamic rate parameters in insect populations undergoing sublethalexposure to pesticides, CRSC-TR05-22, NCSU, May, 2005; Bulletin ofMathematical Biology, 69 (2007), 2139–2180.

[BBDS2] H.T. Banks, J.E. Banks, L.K. Dick, and J.D. Stark, Time-varyingvital rates in ecotoxicology: selective pesticides and aphid populationdynamics, Ecological Modelling, 210 (2008), 155–160.

[BBJ] H. T. Banks, J. E. Banks and S. L. Joyner, Estimation in time-delay modeling of insecticide-induced mortality, CRSC-TR08-15, Octo-ber, 2008; J. Inverse and Ill-posed Problems, 17 (2009), 101–125.

[BBi] H. T. Banks and K. L. Bihari, Modeling and estimating uncertaintyin parameter estimation, Inverse Problems, 17 (2001) 1–17.

[BBo] H.T. Banks and V.A. Bokil, Parameter identification for dispersivedielectrics using pulsed microwave interrogating signals and acousticwave induced reflections in two and three dimensions, CRSC-TR04-27,July, 2004; Revised version appeared as A computational and statisticalframework for multidimensional domain acoustoptic material interroga-tion, Quart. Applied Math., 63 (2005), 156–200.

[Shrimp] H.T.Banks, V.A. Bokil, S. Hu, F.C.T. Allnutt, R. Bullis, A.K.Dhar and C.L. Browdy, Shrimp biomass and viral infection for produc-tion of biological countermeasures, CRSC-TR05-45, NCSU, December,2005; Mathematical Biosciences and Engineering, 3 (2006), 635–660.

[BBH] H. T. Banks, D. M. Bortz and S. E. Holte, Incorporation of vari-ability into the mathematical modeling of viral delays in HIV infectiondynamics, Math. Biosciences, 183 (2003), 63–91.

70

[BBPP] H.T. Banks, D.M. Bortz, G.A. Pinter and L.K. Potter, Modelingand imaging techniques with potential for application in bioterrorism,CRSC-TR03-02, January, 2003; Chapter 6 in Bioterrorism: Mathe-matical Modeling Applications in Homeland Security, (H.T. Banks andC. Castillo-Chavez, eds.), Frontiers in Applied Math, FR28, SIAM,Philadelphia, PA, 2003, 129–154.

[BBKW] H.T. Banks, L.W. Botsford, F. Kappel, and C. Wang, Modelingand estimation in size structured population models, LCDS-CCS Re-port 87-13, Brown University; Proceedings 2nd Course on MathematicalEcology, (Trieste, December 8-12, 1986) World Press (1988), Singapore,521–541.

[BBL] H.T. Banks, M.W. Buksas, T. Lin. Electromagnetic Material Inter-rogation Using Conductive Interfaces and Acoustic Wavefronts. SIAMFR 21, Philadelphia, 2002.

[BBu1] H.T. Banks and J.A. Burns, An abstract framework for approx-imate solutions to optimal control problems governed by hereditarysystems, International Conference on Differential Equations (H. An-tosiewicz, Ed.), Academic Press (1975), 10–25.

[BBu2] H.T. Banks and J.A. Burns, Hereditary control problems: numericalmethods based on averaging approximations, SIAM J. Control & Opt.,16 (1978), 169–208.

[BD] H.T. Banks and J.L. Davis, A comparison of approximation methodsfor the estimation of probability distributions on parameters, CRSC-TR05-38, NCSU, October, 2005; Applied Numerical Mathematics, 57(2007), 753–777.

[BDTR] H.T. Banks and J.L. Davis, Quantifying uncertainty in the estima-tion of probability distributions with confidence bands, CRSC-TR07-21, NCSU, December, 2007; Mathematical Biosciences and Engineer-ing, 5 (2008), 647–667.

[GRD-FP] H.T. Banks, J.L. Davis, S.L. Ernstberger, S. Hu, E. Artimovich,A.K. Dhar and C.L. Browdy, A comparison of probabilistic and stochas-tic formulations in modeling growth uncertainty and variability, CRSC-TR08-03, NCSU, February, 2008; Journal of Biological Dynamics, 3(2009) 130–148.

71

[BDEHAD] H.T. Banks, J.L. Davis, S.L. Ernstberger, S. Hu, E. Artimovichand A.K. Dhar, Experimental design and estimation of growth ratedistributions in size-structured shrimp populations, CRSC-TR08-20,NCSU, November, 2008; Inverse Problems, to appear.

[GRD-FP2] H.T. Banks, J.L. Davis, and S. Hu, A computational compar-ison of alternatives to including uncertainty in structured populationmodels, CRSC-TR09-14, June, 2009; in Three Decades of Progress inSystems and Control, Springer, 2009, to appear.

[BDEK] H.T. Banks, S. Dediu, S. L. Ernstberger and F. Kappel, Gener-alized sensitivities and optimal experimental design, CRSC-TR08-12,September, 2008, Revised Nov, 2009; J Inverse and Ill-Posed Problems,submitted.

[BF] H.T. Banks and B.G. Fitzpatrick, Estimation of growth rate distri-butions in size-structured population models, CAMS Tech. Rep. 90-2,University of Southern California, January, 1990; Quarterly of AppliedMathematics, 49 (1991), 215–235.

[BFPZ] H.T. Banks, B.G. Fitzpatrick, L.K. Potter, and Y. Zhang, Es-timation of probability distributions for individual parameters usingaggregate population data, CRSC-TR98-6, NCSU, January, 1998; InStochastic Analysis, Control, Optimization and Applications, (Editedby W. McEneaney, G. Yin and Q. Zhang), Birkhauser, Boston, 1989.

[BG1] H.T. Banks and N.L. Gibson. Well-posedness in Maxwell systemswith distributions of polarization relaxation parameters. CRSC, TechReport TR04-01, North Carolina State University, January, 2004; Ap-plied Math. Letters, 18 (2005), 423–430.

[BG2] H.T. Banks and N.L. Gibson, Electromagnetic inverse problems in-volving distributions of dielectric mechanisms and parameters, CRSC-TR05-29, August, 2005; Quarterly of Applied Mathematics, 64 (2006),749–795.

[BGIT] H.T. Banks, S.L. Grove, K. Ito and J.A. Toivanen, Static two-playerevasion-interrogation games with uncertainty, CRSC-TR06-16, June,2006; Comp. and Applied Math, 25 (2006), 289–306.

[BI] H.T. Banks and K. Ito. A unified framework for approximation andinverse problems for distributed parameter systems. Control: Theoryand Adv. Tech. 4 (1988), 73-90.

72

[BIKT] H. T. Banks, K. Ito, G. M. Kepler and J. A. Toivanen, Materialsurface design to counter electromagnetic interrogation of targets, Tech.Report CRSC-TR04-37, Center for Research in Scientific Computation,N.C. State University, Raleigh, NC, November, 2004; SIAM J. Appl.Math., 66 (2006), 1027–1049.

[BIT] H. T. Banks, K. Ito and J. A. Toivanen, Determination of interrogat-ing frequencies to maximize electromagnetic backscatter from objectswith material coatings, Tech. Report CRSC-TR05-30, Center for Re-search in Scientific Computation, N.C. State University, Raleigh, NC,August, 2005; Communications in Comp. Physics, 1 (2006), 357–377.

[BKap] H.T. Banks and F. Kappel, Transformation semigroups and L1 ap-proximation for size structured population models (with ), LCDS/CCSRep. 88-20, July, 1988; Semigroup Forum, 38 (1989), 141–155.

[BKW1] H.T. Banks, F. Kappel and C. Wang, Weak solutions and differ-entiability for size structured population models (with ), CAMS Tech.Rep. 90-11, August, 1990, University of Southern California; in DPSControl and Applications, Birkhauser, Intl. Ser. Num. Math., Vol.100(1991), 35–50.

[BKW2] H.T. Banks, F. Kappel and C. Wang, A semigroup formulationof nonlinear size-structured distributed rate population model, CRSC-TR94-4, March 1994; in Control and Estimation of Distributed Param-eter Systems: Nonlinear Phenomena, Birkhauser ISNM, 118 (1994),1–19.

[BKa] H.T. Banks and P. Kareiva, Parameter estimation techniques fortransport equations with application to population dispersal and tis-sue bulk flow models, LCDS Report #82-13, Brown University, July1982; J. Math. Biol., 17 (1983), 253–273.

[BK] H.T. Banks and K. Kunisch. Estimation Techniques for DistributedParameter Systems. Birkhausen, Bosten, 1989.

[BKurW1] H. T. Banks, A. J. Kurdila and G. Webb, Identification of hys-teretic control influence operators representing smart actuators, Part I:Formulation, Mathematical Problems in Engineering 3 (1997), 287–328.

[BKurW2] H. T. Banks, A. J. Kurdila and G. Webb, Identification of hys-teretic control influence operators representing smart actuators, Part

73

II. Convergent approximations, J. of Intelligent Material Systems andStructures 8 (1997, 536–550.

[BPi] H.T. Banks and G.A. Pinter, A probabilistic multiscale approach tohysteresis in shear wave propagation in biotissue, CRSC-TR04-03, Jan-uary, 2004; SIAM J. Multiscale Modeling and Simulation, 3 (2005),395–412.

[BPo] H.T. Banks and L.K. Potter, Probabilistic methods for addressinguncertainty and variability in biological models: Application to a tox-icokinetic model, CRSC-TR02-27, N.C. State University, September,2002; Math. Biosci., 192 (2004), 193–225.

[BRS] H.T. Banks, K.L. Rehm and K.L. Sutton, Dynamic social networkmodels incorporating stochasticity and delays, CRSC-TR09-11, May,2009; Quarterly Applied Math., to appear.

[BSW] H.T. Banks, R.C. Smith and Y. Wang, Smart Material Struc-tures: Modeling, Estimation and Control, Masson/John Wiley,Paris/Chichester, 1996.

[BSTBRSM] H.T. Banks, Karyn L. Sutton, W. Clayton Thompson, Gen-nady Bocharov, Dirk Roose, Tim Schenkel, and Andreas Meyerhans,Estimation of cell proliferation dynamics using CFSE data, CRSC-TR09-17, August, 2009; Bull. Math. Biology, submitted.

[BT] H.T. Banks and H.T. Tran, Mathematical and Experimental Modelingof Physical and Biological Processes, CRC Press, Boca Raton, FL, 2009.

[Basar] T. Basar and G. J. Olsder, Dynamic Noncooperative Game Theory,SIAM Publ., Philadelphia, PA, 2nd edition, 1999.

[BA] G. Bell and E. Anderson, Cell growth and division I. A mathemati-cal model with applications to cell volume distributions in mammaliansuspension cultures, Biophysical Journal, 7 (1967), 329–351.

[B] P. Billingsley. Convergence of Probability Measures. Wiley, New York,1968.

[BR] G. Birkhoff and G.C. Rota. Ordinary Differential Equations, 1962.

[ButBer] P.L. Butzer and H. Berens, Semi-Groups of Opeerators and Ap-proximation, Springerr-Verlag, New York, 1967.

74

[CK1] D. Colton and R. Kress. Integral Equation Methods in ScatteringTheory. Wiley, New York, 1983.

[CK2] D. Colton and R. Kress. Inverse Acoustic and Electromagnetic Scat-tering Theory. Springer-Verlag, New York, 1992.

[CH] R. Courant and D. Hilbert. Methods of Mathematical Physics, Vol.1,p. 291.

[D] Dettman. Mathematical Methods in Physics and Engineering, p. 201.

[DS] N. Dunford and J.T. Schwartz. Linear Operators, Vol. I, 1966.

[E] R. E. Edwards. Functional Analysis: Theory and Applications. Holt,Rinehart and Winston. New York, 1965.

[GP] C. Goffman and G. Pedrick, First Course in Functional Analysis,Prentice-Hall, Englewood Cliffs, NJ, 1965.

[JKH1] J. K. Hale, Functional Differential Equations, Springer-Verlag, NewYork, NY, 1971.

[JKH2] J. K. Hale, Theory of Functional Differential Equations, Springer-Verlag, New York, NY, 1977.

[JKH3] J. K. Hale and S. M. Verduyn Lunel, Introduction to FunctionalDifferential Equations, Springer-Verlag, New York, NY, 1993.

[Hal] P. R. Halmos, Measure Theory Springer-Verlag, Berlin-New York, 1974

[HKNT] E. Heikkola, Y. A. Kuznetsov, P. Neittaanmaki and J. Toiva-nen, Fictitious domain methods for the numerical solution of two-dimensional scattering problems, J. Comput. Phys., 145 (1998), 89–109.

[HRT] E. Heikkola, T. Rossi and J. Toivanen, Fast direct solution of theHelmholtz equation with a perfectly matched layer or an absorbingboundary condition, Internat. J. Numer. Methods Engr., 57 (2003),2007–2025.

[HeSt] E. Hewitt and K.R. Stromberg, Real and Abstract Analysis, Springer-Verlag, New York, 1965.

[HP] E. Hille and R. Phillips. Functional Analysis and Semigroups. AMSColloquium Publications, Vol. 31, 1957.

75

[Hutch] G.E. Hutchinson, Circular causal systems in ecology, Ann. N.Y.Acad. Sci., 50 (1948), 221–246.

[H] P. Huber, Robust Statistics. Wiley, New York, (1981).

[Jackson] J. D. Jackson, Classical Electrodynamics, Wiley & Sons, NewYork, 1975.

[K] T. Kato, Perturbations Theory for Linear Operators, 1966.

[Kot] M. Kot, Elements of Mathematical Ecology, Cambridge UniversityPress, Cambridge, 2001.

[KP] M. A. Krasnoselskii and A. V. Pokrovskii, Systems withHysteresis,Springer-Verlag, New York, NY, 1989.

[Kr] R. Kress. Linear Integral Equations. Springer-Verlag, New York, 1989.

[Ma] I. D. Mayergoyz, Mathematical Models of Hysteresis, Springer-Verlag,New York, NY, 1991.

[McS] E.J. McShane, Integration, Princeton Unoversity Press, Princeton,NJ, 1944.

[MetzD] J. A. J. Metz and E.O. Diekmann, The Dynamics of Physiolog-ically Structured Populations, Lecture Notes in Biomathematics. Vol.68, Springer, Heidelberg, 1986.

[NS] A.W. Naylor and G.R. Sell, Linear Operator Theory in Engineeringand Science, Springer Verlag, New York, 1982.

[Pa] A. Pazy. Semigroups of Linear Operators and Applications to PartialDifferential Equations. Springer-Verlag, New York, 1983.

[P] Yu. V. Prohorov, Convergence of random processes and limit theoremsin probability theory, Theor. Prob. Appl., 1 (1956), 157–214.

[Po] L.K. Potter, Physiologically Based Pharmacokinetic Models for the Sys-temic Transport of Trichloroethylene, Ph.D. Thesis, N.C. State Univer-sity, August, 2001; (http://www.lib.ncsu.edu).

[RM] R.D. Richtmyer and K.W. Morton. Finite Differences Methods forInitial Value Problems, 2d. Edt. Intersciences Publishers, New Tork,1967.

76

[RN] F. Riesz and B.S. Nagy. Functional Analysis. Ungar Press, 1955.

[Ru1] W. Rudin, Principles of Mathematical Analysis, McGraw-Hill, NewYork, 1953.

[Ru2] W. Rudin, Real and Complex Analysis, McGraw-Hill, New York,1966.

[SS] J. Sinko and W. Streifer, A new model for age-size structure of a pop-ulation, Ecology, 48 (1967), 910–918.

[Sh] R. E. Showalter. Hilbert Space Methods for Partial Differential Equa-tions. Pitman, London, 1977.

[Smir] V.I. Smirnov, Integration and Functional Analysis, Vol 5 of A Courseof Higher Mathematics,Adison-Wesley, Reading, MA, 1964.

[S] I. Stakgold. Boundary Value Problems of Mathematical Physics, Vol. 1,1967.

[T] H. Tanabe. Equations of Evolution. Pitman, London, 1979.

[Wahba1] G. Wahba, Bayesian “confidence intervals” for the cross-validatedsmoothing spline, J. Roy. Statist. Soc. B, 45 (1983), 133–150.

[Wahba2] G. Wahba, (Smoothing) splines in nonparametric regression, Deptof Statistics-TR 1024, Univ. Wisconsin, September, 2000; In Encyclope-dia of Environmetrics, (A. El-Shaarawi and W. Piegorsch, eds.), Wiley,Vol. 4, pp. 2099–2112, 2001.

[W] J. Wloka. Partial Differential Equations. Cambridge University Press,Cambridge, 1987.

[Wright] E. M. Wright, A non-linear difference-differential equation, J.Reine Angew. Math., 494 (1955), 66–87.

[Vis] A. Visintin, Differential Models of Hystersis, Springer-Verlag, NewYork, NY, 1994.

[VN] J. von Neumann, Zur theorie der gesellschaftsspiele, Math. Ann., 100(1928), 295–320.

[VNM] J. von Neumann and O. Morgenstern, Theory of Games and Eco-nomic Behavior, Princeton University Press, Princeton, NJ, 1944.

77

[Warga] J. Warga, Optimal Control of Differential and Functional Equa-tions, Academic Press, New York, NY, 1972.

[Y] K. Yosida. Functional Analysis, 1971.

78

applied functional analysis lecture notes spring, 2010 · 1 introduction to functional analysis 1.1...

Documents