
Ordinary Differential Equations: A Constructive Approach

M. Gameiro, J.-P. Lessard, J. Mireles James, K. Mischaikow

January 8, 2017


Contents

1 Motivation
  1.1 Exercises

2 Existence and Uniqueness, Flows and Definitions
  2.1 Contraction Mapping Theorem
  2.2 Existence and Uniqueness of Solutions to ODEs
  2.3 Regularity of Solutions
  2.4 Dynamical Aspects of ODEs
  2.5 Exercises

3 Equilibria and Radii Polynomials in Finite Dimension
  3.1 Newton's Method
  3.2 Radii Polynomial Approach in Finite Dimension
  3.3 Exercises

4 Linear Theory and Stability of Equilibria
  4.1 Preliminaries
  4.2 Homogeneous Linear Systems
  4.3 Constant Coefficient Linear Systems
  4.4 Hyperbolic Linear Systems
  4.5 Linear Approximations of Nonlinear Systems
  4.6 Rigorous Computation of Eigenvalues and Eigenvectors
  4.7 Exercises

5 Continuation of Equilibria
  5.1 Parameterized Families of Equilibria
  5.2 Computing Branches of Equilibria
  5.3 Saddle-Node Bifurcation


Chapter 1

    Motivation

    1.1 Exercises

Exercise 1.1.1. Recall that Hooke's law for the force exerted on a mass by a stretched spring is

F(x) = −Kx,

where K is the "stiffness" constant of the spring and x is the signed displacement from equilibrium. Derive the equation of motion for a mass connected to a spring on a frictionless table (perhaps a surface of ice, or maybe the mass has well-oiled wheels).

    Suppose that the mass is

    Exercise 1.1.2....


Chapter 2

Existence and Uniqueness, Flows and Definitions

In this chapter we provide fundamental results concerning the existence, uniqueness, and continuity of solutions to ODEs. The results of this section are classical and can be found in any graduate level text on ordinary differential equations. We have chosen not to present the most general results, but rather the minimal results necessary for this book. This is done because we can directly use the contraction mapping theorem, which is suggestive of many of the proofs in this class. See [1, Section 1.12] or [4, Chapter 1] for alternative and/or more general proofs of existence and uniqueness. Our presentation follows that of [6, Section V.5.3].

2.1 Contraction Mapping Theorem

Consider a function T : X → X where X is a topological space. An element x̃ ∈ X is a fixed point of T if T(x̃) = x̃. A fixed point is globally attracting if lim_{n→∞} T^n(x) = x̃ for all x ∈ X.

Definition 2.1.1. Let (X, d) denote a metric space. A function T : X → X is a contraction if there is a number κ ∈ [0, 1), called a contraction constant, such that

d(T(x), T(y)) ≤ κ d(x, y)

for all x, y ∈ X.

Theorem 2.1.2 (Contraction Mapping Theorem). Let (X, d) be a complete metric space. Assume that T : X → X is a contraction with contraction constant κ. Then there exists a unique, globally attracting fixed point x̃ ∈ X. Furthermore, for any x ∈ X,

d(T^n(x), x̃) ≤ (κ^n/(1 − κ)) d(T(x), x).   (2.1)


Proof. Choose x_0 ∈ X. Recursively define

x_{n+1} := T(x_n).

By assumption

d(x_{n+1}, x_n) = d(T(x_n), T(x_{n−1})) ≤ κ d(x_n, x_{n−1}).

Thus, by induction,

d(x_{n+1}, x_n) ≤ κ^n d(x_1, x_0).

By the triangle inequality, for n < m,

d(x_n, x_m) ≤ ∑_{j=n}^{m−1} d(x_{j+1}, x_j)
           ≤ ∑_{j=n}^{m−1} κ^j d(x_1, x_0)
           ≤ κ^n (∑_{k=0}^{∞} κ^k) d(x_1, x_0)
           ≤ κ^n (1/(1 − κ)) d(x_1, x_0).

This implies that {x_n} is a Cauchy sequence. Since X is complete there exists x̃ ∈ X such that

lim_{n→∞} x_n = x̃.

Hence, by continuity of T,

x̃ = lim_{n→∞} x_n = lim_{n→∞} T(x_{n−1}) = T(lim_{n→∞} x_{n−1}) = T(x̃).

This establishes the existence of a fixed point.

To prove (2.1), observe that again by the triangle inequality

d(x_n, x̃) ≤ d(x_n, x_m) + d(x_m, x̃) ≤ (κ^n/(1 − κ)) d(x_1, x_0) + d(x_m, x̃)

for all m > n. Taking the limit as m → ∞ gives the desired inequality.

Assume now that there is another fixed point ỹ of T, that is, T(ỹ) = ỹ. Setting x = ỹ in (2.1) and using the fact that T^n(ỹ) = ỹ for all n implies that

d(ỹ, x̃) ≤ (κ^n/(1 − κ)) d(ỹ, ỹ) = 0.

Hence x̃ = ỹ, and therefore the fixed point is unique.

Finally, (2.1) and the fact that κ ∈ [0, 1) prove that x̃ is a globally attracting fixed point.
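The a priori error bound (2.1) is easy to check numerically. A minimal sketch in Python, using the hypothetical contraction T(x) = cos(x)/2 on R (a contraction with κ = 1/2, since |T′(x)| = |sin(x)|/2 ≤ 1/2):

```python
import math

# T(x) = cos(x)/2 is a contraction on R with contraction constant 1/2,
# since |T'(x)| = |sin(x)|/2 <= 1/2.
T = lambda x: math.cos(x) / 2
kappa = 0.5

def iterate(T, x, n):
    """Return T^n(x)."""
    for _ in range(n):
        x = T(x)
    return x

x0 = 3.0
xtilde = iterate(T, x0, 200)   # numerically indistinguishable from the fixed point

# Theorem 2.1.2: d(T^n(x0), xtilde) <= kappa^n / (1 - kappa) * d(T(x0), x0)
for n in (1, 5, 10, 20):
    err = abs(iterate(T, x0, n) - xtilde)
    bound = kappa**n / (1 - kappa) * abs(T(x0) - x0)
    assert err <= bound
```

The actual error typically decays much faster than the bound, since (2.1) only uses the worst-case contraction constant.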


    2.2 Existence and Uniqueness of Solutions to ODEs

Definition 2.2.1. Let f : U → R^n be a continuous function defined on an open set U ⊂ R^n. A solution to the differential equation

ẋ := dx/dt = f(x)   (2.2)

on an interval J ⊂ R is a differentiable function ϕ : J → U such that ϕ(t) ∈ U and

dϕ/dt(t) = f(ϕ(t))

for all t ∈ J.

    In this section we focus on solutions to the initial value problem (IVP)

    ẋ = f(x), x(t0) = x0,

    that is, the existence of a solution x : J → U such that t0 ∈ J and x(t0) = x0.

Definition 2.2.2. Consider metric spaces (X, d_X) and (Y, d_Y). A function f : X → Y is Lipschitz if there exists a real constant K ≥ 0 such that, for all x_1, x_2 ∈ X,

d_Y(f(x_1), f(x_2)) ≤ K d_X(x_1, x_2).

The smallest K satisfying this inequality is denoted by Lip(f) := K and is called the Lipschitz constant of f.

More generally, f is locally Lipschitz if every point in X has a neighborhood such that f restricted to that neighborhood is Lipschitz.

The first goal of this section is the proof of the following theorem, which guarantees local existence and uniqueness of solutions.

Theorem 2.2.3. Assume f : U → R^n is a locally Lipschitz continuous function defined on an open set U ⊂ R^n. If x_0 ∈ U, then there exists an open interval J ⊂ R, containing t_0, over which a solution to the initial value problem

ẋ = f(x), x(t_0) = x_0   (2.3)

is defined. Furthermore, any two solutions to the initial value problem agree on the intersection of their domains of definition.

The proof of Theorem 2.2.3 is obtained via a series of propositions and lemmas. The first step in the proof is the observation that by the fundamental theorem of calculus a solution to an ODE can be recast as a solution to an integral equation.


Lemma 2.2.4. Assume f : U → R^n is a continuous function defined on an open set U ⊂ R^n. Let x_0 ∈ U. A continuous function ϕ : J → R^n is a solution to the initial value problem ẋ = f(x), x(t_0) = x_0 if and only if

ϕ(t) = x_0 + ∫_{t_0}^{t} f(ϕ(s)) ds.

Observe that since f is independent of t, there is no loss of generality in assuming that t_0 = 0, which for the sake of simplicity of expression will be done henceforth. The following proposition provides for existence of solutions.

Proposition 2.2.5. If f : U → R^n is a locally Lipschitz continuous function defined on an open set U ⊂ R^n and x_0 ∈ U, then there exists a solution ϕ : (−a, a) → R^n, for some a > 0, to the initial value problem

ẋ = f(x), x(0) = x_0.   (2.4)

Proof. Choose ε > 0 such that

B_ε(x_0) = {x ∈ R^n | ‖x − x_0‖ ≤ ε} ⊂ U,

where ‖·‖ is a norm for R^n. For ε sufficiently small f is Lipschitz on B_ε(x_0), and hence there exist positive constants K and M such that

‖f(x) − f(y)‖ ≤ K‖x − y‖ and ‖f(x)‖ ≤ M

for all x, y ∈ B_ε(x_0). By Lemma 2.2.4 it is sufficient to prove the existence of ϕ satisfying

ϕ(t) = x_0 + ∫_0^t f(ϕ(s)) ds.   (2.5)

The strategy is to define a function space X within which we expect to find a solution and define a contraction T : X → X such that the fixed point of T is a solution to (2.5).

Since solutions are functions of time, the domain of elements of the function space should be an interval J ⊂ R. Choose a > 0 such that

a < min{ε/M, 1/K}   (2.6)

and set J := (−a, a). Define

X := {α : J → B_ε(x_0) | α ∈ C^0(J) and Lip(α) ≤ M}.


We endow X with the C^0 norm

‖α‖_{C^0(J)} := sup{‖α(t)‖ | t ∈ J}

and leave it as an exercise to show that this implies that X is a complete metric space.

Given α ∈ X define an operator T : X → C^0(J) by

T(α)(t) := x_0 + ∫_0^t f(α(s)) ds.

Observe that by definition T(α) = α if and only if ϕ = α satisfies (2.5). To apply the contraction mapping theorem (Theorem 2.1.2) we need to prove two things: (i) T : X → X, and (ii) T is a contraction.

We first prove (i). Observe that

T(α)(0) = x_0 + ∫_0^0 f(α(s)) ds = x_0.   (2.7)

Furthermore, given t_1, t_2 ∈ J,

‖T(α)(t_2) − T(α)(t_1)‖ = ‖∫_0^{t_2} f(α(s)) ds − ∫_0^{t_1} f(α(s)) ds‖
                        = ‖∫_{t_1}^{t_2} f(α(s)) ds‖
                        ≤ |∫_{t_1}^{t_2} M ds|
                        = M|t_2 − t_1|   (2.8)

and hence Lip(T(α)) ≤ M. Setting t_2 = t, t_1 = 0 and applying (2.7) and (2.8) we obtain

‖T(α)(t) − x_0‖ ≤ M|t| < Ma < ε

for all t ∈ J, and hence T(α) : J → B_ε(x_0). Therefore, T(α) ∈ X.


We now prove (ii). Observe that given α, β ∈ X,

‖T(α) − T(β)‖_{C^0(J)} = sup_{t∈J} ‖T(α)(t) − T(β)(t)‖
  = sup_{t∈J} ‖∫_0^t f(α(s)) ds − ∫_0^t f(β(s)) ds‖
  ≤ sup_{t∈J} |∫_0^t ‖f(α(s)) − f(β(s))‖ ds|
  ≤ sup_{t∈J} |∫_0^t K‖α(s) − β(s)‖ ds|
  ≤ K‖α − β‖_{C^0(J)} sup_{t∈J} |∫_0^t ds|
  ≤ Ka‖α − β‖_{C^0(J)}.

By (2.6), Ka < 1, and therefore T : X → X is a contraction with contraction constant κ = Ka < 1. Denote by ϕ : J → R^n the unique fixed point of T within X and observe that this implies that ϕ is a solution of the initial value problem (2.4).
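The operator T from the proof can be iterated numerically (Picard iteration). A sketch under stated assumptions: f(x) = x with x_0 = 1, whose exact solution is e^t; functions α are represented by their values on a uniform grid over J, and the integral is approximated by a cumulative trapezoid rule. These discretization choices are implementation assumptions of the sketch, not part of the proof.

```python
import numpy as np

# Picard iteration for x' = f(x), x(0) = x0, with f(x) = x and x0 = 1.
# Here K = Lip(f) = 1, so J = (-a, a) with a = 0.5 < 1/K gives the
# contraction constant Ka = 0.5.
f = lambda x: x
x0, a = 1.0, 0.5
t = np.linspace(-a, a, 2001)          # uniform grid on J; t[1000] == 0

def cumint_from_zero(y):
    """Cumulative trapezoid approximation of s -> int_0^s y."""
    dt = t[1] - t[0]
    c = np.concatenate(([0.0], np.cumsum((y[1:] + y[:-1]) / 2 * dt)))
    return c - c[len(t) // 2]         # shift so the integral vanishes at t = 0

def T(alpha):
    """Picard operator: T(alpha)(t) = x0 + int_0^t f(alpha(s)) ds."""
    return x0 + cumint_from_zero(f(alpha))

alpha = np.full_like(t, x0)           # initial guess: the constant function x0
for _ in range(30):
    alpha = T(alpha)

# after 30 iterations the Picard error (~0.5^30) is negligible next to
# the quadrature error, and both are far below the tolerance
assert np.max(np.abs(alpha - np.exp(t))) < 1e-4
```

Each iteration halves the C^0 distance to the fixed point, exactly as the contraction estimate Ka = 1/2 predicts.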

Returning to the details of the proof of Proposition 2.2.5, observe that if δ = ε/2, then for any x ∈ B_δ(x_0),

B_δ(x) ⊂ U, ‖f(z) − f(y)‖ ≤ K‖z − y‖, and ‖f(z)‖ ≤ M

for all z, y ∈ B_δ(x). This leads to the following corollary.

Corollary 2.2.6. Let f : U → R^n be a locally Lipschitz continuous function defined on an open set U ⊂ R^n. For every x_0 ∈ U there exist a neighborhood V of x_0 and a constant a = a(V) such that for every y ∈ V there exists a solution ϕ(·, y) : (−a, a) → R^n to the initial value problem

ẋ = f(x), x(0) = y.

Remark 2.2.7. It is worth noting that while we have proven the existence of a solution, it is limited to a time interval of length 2a, where a < min{ε/M, 1/K}, which could be very small. Theorem 2.2.18 provides information concerning existence over longer intervals of time.

Observe that we have only proven uniqueness of solutions within the family of functions with Lip(α) ≤ M, as opposed to among all differentiable functions. Thus the proof of Theorem 2.2.3 remains to be completed. For the moment we turn our attention to a different question and demonstrate that solutions to an IVP are Lipschitz continuous as functions of the initial value. The following inequality will be used to prove this statement and is fundamental to the study of differential equations.


Theorem 2.2.8 (Gronwall's Inequality). Let α, β : (a, b) → [0, ∞) be continuous functions. Assume

α(t) ≤ C + |∫_{t_0}^{t} α(s)β(s) ds|, t_0, t ∈ (a, b),

for some constant C ≥ 0. Then

α(t) ≤ C exp(|∫_{t_0}^{t} β(s) ds|).

Proof. The proof is done in several cases.

Assume a < t_0 ≤ t < b. Define

G(t) := C + ∫_{t_0}^{t} α(s)β(s) ds.

Then

G′(t) = dG/dt(t) = α(t)β(t) ≤ G(t)β(t).

Now assume C > 0. Then G(t) > 0 and hence

G′(t)/G(t) ≤ β(t),
∫_{t_0}^{t} G′(s)/G(s) ds ≤ ∫_{t_0}^{t} β(s) ds,
log(G(t)/G(t_0)) ≤ ∫_{t_0}^{t} β(s) ds,
G(t) ≤ G(t_0) exp(∫_{t_0}^{t} β(s) ds) = C exp(∫_{t_0}^{t} β(s) ds),

and since α(t) ≤ G(t),

α(t) ≤ C exp(∫_{t_0}^{t} β(s) ds).

Now assume C > 0 and a < t ≤ t_0 < b. Define

G(t) := C + ∫_{t}^{t_0} α(s)β(s) ds

and repeat the argument.

Finally, assume that C = 0 and consider a sequence C_n > 0 converging to 0. By the previous argument and the fact that α is non-negative,

0 ≤ α(t) ≤ C_n exp(|∫_{t_0}^{t} β(s) ds|)

for all C_n, and hence α(t) ≡ 0.


We now use Gronwall's Inequality to show that solutions to an IVP are Lipschitz continuous with respect to initial conditions.

Proposition 2.2.9. Let U ⊂ R^n be an open set and assume f : U → R^n is a Lipschitz continuous function with Lip(f) = K. If ϕ(·, x_0) : J_{x_0} → R^n and ψ(·, y_0) : J_{y_0} → R^n are solutions to the initial value problem ẋ = f(x) with x(0) = x_0 and x(0) = y_0, respectively, then

‖ϕ(t, x_0) − ψ(t, y_0)‖ ≤ ‖x_0 − y_0‖ e^{K|t|}   (2.9)

for all t ∈ J_{x_0} ∩ J_{y_0}. Furthermore, with respect to the C^0 norm, solutions to the IVP are locally Lipschitz continuous with respect to the initial value.

Proof. Let ϕ(·, x_0) : J_{x_0} → R^n and ψ(·, y_0) : J_{y_0} → R^n be solutions to the initial value problems with x(0) = x_0 and x(0) = y_0, respectively. By Proposition 2.2.5 there exists a > 0 such that J = (−a, a) ⊂ J_{x_0} ∩ J_{y_0}. By Lemma 2.2.4,

ϕ(t, x_0) − ψ(t, y_0) = x_0 − y_0 + ∫_0^t f(ϕ(s, x_0)) − f(ψ(s, y_0)) ds.

Let α(t) := ‖ϕ(t, x_0) − ψ(t, y_0)‖. Then

α(t) ≤ ‖x_0 − y_0‖ + ‖∫_0^t f(ϕ(s, x_0)) − f(ψ(s, y_0)) ds‖
     ≤ ‖x_0 − y_0‖ + |∫_0^t ‖f(ϕ(s, x_0)) − f(ψ(s, y_0))‖ ds|
     ≤ ‖x_0 − y_0‖ + |∫_0^t K‖ϕ(s, x_0) − ψ(s, y_0)‖ ds|
     ≤ ‖x_0 − y_0‖ + |∫_0^t K α(s) ds|

since Lip(f) = K. Applying Gronwall's Inequality with C = ‖x_0 − y_0‖ and β(t) = K, we obtain

α(t) ≤ ‖x_0 − y_0‖ exp(|∫_0^t K ds|) ≤ ‖x_0 − y_0‖ e^{K|t|},

that is,

‖ϕ(t, x_0) − ψ(t, y_0)‖ ≤ ‖x_0 − y_0‖ e^{K|t|}.

To see that solutions are locally Lipschitz continuous with respect to initial conditions, observe that given initial values x_0 and y_0 and the interval J = (−a, a) as defined above,

‖ϕ(·, x_0) − ψ(·, y_0)‖_{C^0(J)} = sup_{t∈J} ‖ϕ(t, x_0) − ψ(t, y_0)‖
                                 ≤ ‖x_0 − y_0‖ sup_{t∈J} e^{K|t|}
                                 = ‖x_0 − y_0‖ e^{Ka}.


Proof of Theorem 2.2.3. As remarked above, Proposition 2.2.5 guarantees the existence of solutions. To show that two solutions to the same IVP agree on the intersection of their domains of definition, let ϕ : J_0 → R^n and ψ : J_1 → R^n denote two solutions to the initial value problem ẋ = f(x), x(0) = x_0. By (2.9), for all t ∈ J_0 ∩ J_1,

‖ϕ(t) − ψ(t)‖ = 0.

The following corollary provides a summary of the results derived in the proof of Theorem 2.2.3.

Corollary 2.2.10. Assume f : U → R^n is a locally Lipschitz continuous function defined on an open set U ⊂ R^n. For every x_0 ∈ U and t_0 ∈ R there exist a neighborhood V_0 ⊂ U of x_0, an interval J_0 ⊂ R containing t_0, and a Lipschitz continuous function ϕ : J_0 × V_0 → R^n such that ϕ(·, x_0) is a solution to the IVP ẋ = f(x), x(t_0) = x_0.

The following proposition guarantees existence and uniqueness of solutions for every C^1 vector field. The proof is left to the reader.

Proposition 2.2.11. Let U ⊂ R^n be an open set and f : U → R^n. If f ∈ C^1(U), then f is locally Lipschitz.

Note that since ϕ(·, x_0) is a solution to the IVP it is differentiable in t. One can also prove differentiability with respect to the initial conditions [1, Theorem 1.261]. This question is considered in Exercise 2.5.9 of this chapter.

Models that arise in applications typically depend on a set of parameters Λ and are often time dependent. Thus we are interested in solutions to differential equations of a more general form. Let J ⊂ R, U ⊂ R^n and Λ ⊂ R^m be open sets and let f : J × U × Λ → R^n be a continuous function. For fixed λ ∈ Λ, a solution to the differential equation

ẋ = f(t, x, λ)   (2.10)

is a differentiable function ϕ : J_0 → U defined on an open interval J_0 ⊂ J such that

dϕ/dt(t) = f(t, ϕ(t), λ)

for all t ∈ J_0. For t_0 ∈ J, x_0 ∈ U and λ_0 ∈ Λ, the initial value problem (IVP) associated with (2.10) requires finding a solution ϕ(t) = ϕ(t; t_0, x_0, λ_0) to ẋ = f(t, x, λ_0) satisfying ϕ(t_0) = x_0.

The corresponding existence and uniqueness theorem is as follows.

Theorem 2.2.12. Let J ⊂ R, U ⊂ R^n and Λ ⊂ R^m be open sets, and assume f : J × U × Λ → R^n is a Lipschitz function. If (t_0, x_0, λ_0) ∈ J × U × Λ, then there exist an open neighborhood of the form J_0 × U_0 × Λ_0 of (t_0, x_0, λ_0) and a Lipschitz continuous function ϕ : J_0 × J_0 × U_0 × Λ_0 → R^n such that for every (t_1, x_1, λ_1) ∈ J_0 × U_0 × Λ_0,

ϕ(·, t_1, x_1, λ_1) : J_0 → R^n

is a solution to the initial value problem

ẋ = f(t, x, λ_1), x(t_1) = x_1.   (2.11)

Furthermore, if ψ(·, t_1, x_1, λ_1) is another solution to the initial value problem, then ψ(t) = ϕ(t) on the intersection of their domains of definition.

Proof. The proof follows from the realization that the result can be viewed as a special case of Theorem 2.2.3. Define F : U × J × Λ → R^{n+1+m} by

F(x, s, λ) = (f(s, x, λ), 1, 0).

By Theorem 2.2.3, there exists a function ϕ : J_0 → R^{n+1+m} which satisfies the initial value problem

ẋ = f(s, x, λ)
ṡ = 1
λ̇ = 0
(x(t_0), s(t_0), λ(t_0)) = (x_0, s_0, λ_0) ∈ U × J × Λ.

Furthermore, if ψ : J_1 → R^{n+1+m} is another solution to this initial value problem, then they agree on the domain J_0 ∩ J_1. It is left to the reader to check that the first n components of ϕ form a solution to (2.11).

The most significant restriction in the assumptions of Theorem 2.2.3 is that f be Lipschitz. As the following example indicates, existence is possible in more general settings; however, uniqueness of the solution can no longer be assumed.

Example 2.2.13. Consider the initial value problem

ẋ = 3x^{2/3}, x(0) = 0.   (2.12)

Observe that

ϕ(t) ≡ 0 and ψ(t) := { t^3 if t ≥ 0, 0 if t ≤ 0 }

are two distinct solutions to (2.12) on R.
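Both claimed solutions can be verified by direct substitution; a quick finite-difference check in Python (the step size h and the sample times are arbitrary choices of this sketch):

```python
rhs = lambda x: 3.0 * x ** (2.0 / 3.0)   # right-hand side of (2.12)
psi = lambda t: t ** 3                   # candidate solution for t >= 0

h = 1e-6
for t in (0.5, 1.0, 2.0):
    dpsi = (psi(t + h) - psi(t - h)) / (2 * h)   # central difference
    assert abs(dpsi - rhs(psi(t))) < 1e-6        # psi' = 3 psi^(2/3)

# phi(t) = 0 is also a solution, since rhs(0) = 0
assert rhs(0.0) == 0.0
assert psi(0.0) == 0.0                   # both satisfy x(0) = 0
```

The failure of uniqueness is consistent with Theorem 2.2.3: x ↦ 3x^{2/3} is continuous but not Lipschitz at x = 0.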

Having established existence and uniqueness, we turn to the question of the maximal time interval on which a solution is defined.


Example 2.2.14. Consider the initial value problem

ẋ = x^2, x(0) = x_0,   (2.13)

where we assume that x_0 > 0. By Theorem 2.2.3 we know that this IVP has a locally unique solution. It is straightforward to check that

ϕ(t, x_0) := x_0/(1 − t x_0)

is a solution and that the maximal interval in time over which ϕ is defined is −∞ < t < 1/x_0. As motivation for the next theorem it is also worth observing that

lim_{t→1/x_0} ϕ(t, x_0) = ∞.
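The blow-up in Example 2.2.14 is easy to check numerically; a sketch with the arbitrary choice x_0 = 2, so that the blow-up time is 1/x_0 = 0.5:

```python
x0 = 2.0
phi = lambda t: x0 / (1 - t * x0)        # candidate solution of (2.13)

h = 1e-7
for t in (0.0, 0.2, 0.4):                # stay inside (-inf, 1/x0) = (-inf, 0.5)
    d = (phi(t + h) - phi(t - h)) / (2 * h)
    assert abs(d - phi(t) ** 2) < 1e-3   # checks phi' = phi^2

assert phi(0.0) == x0                    # initial condition
assert phi(0.4999) > 1e3                 # phi(t) -> infinity as t -> 1/x0
```

Even though the vector field x^2 is smooth, the solution leaves every compact set as t approaches 1/x_0, which is the phenomenon addressed by Theorem 2.2.18.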

Definition 2.2.15. Let ϕ : I → R^n and ψ : J → R^n be solutions to ẋ = f(x), where f : U → R^n is a continuous function defined on an open set U ⊂ R^n. We say that ψ is an extension of ϕ if I ⊂ J and ϕ(t) = ψ(t) for all t ∈ I. If in addition I is a proper subset of J, then we say that ψ is a proper extension of ϕ.

Definition 2.2.16. A solution ϕ : J → R^n to ẋ = f(x) is called a maximal solution if it has no proper extensions. In this case the interval J is called the maximal interval of existence of ϕ.

Theorem 2.2.17. If f : U → R^n is a Lipschitz continuous function defined on an open set U ⊂ R^n and x_0 ∈ U, then the initial value problem

ẋ = f(x), x(0) = x_0

has a unique maximal solution ϕ : J → R^n, and the maximal interval of existence J is open.

Proof. Let J be the union of all intervals I for which there is a solution ψ : I → R^n to the initial value problem ẋ = f(x), x(0) = x_0. Since all the intervals I contain t_0 = 0, J is an interval. Define ϕ : J → R^n by ϕ(t) := ψ(t) whenever t ∈ I and ψ : I → R^n is a solution. Since any two solutions agree on the intersection of their domains of definition, ϕ is well defined. The function ϕ : J → R^n just defined is clearly the unique maximal solution to ẋ = f(x), x(0) = x_0. It remains to show that the interval J is open. Let t_− and t_+ be the left and right end points of J, respectively. Assume that J is closed at t_+; then by Theorem 2.2.3 we can extend the solution to an interval about t_+, which contradicts the fact that J is the maximal interval. The same argument applies if J is closed at t_−. Therefore J is the open interval (t_−, t_+).


Theorem 2.2.18. Let f : U → R^n be a Lipschitz continuous function defined on an open set U ⊂ R^n. Consider the initial value problem

ẋ = f(x), x(0) = x_0

with maximal solution ϕ(t, x_0). Let J = (t_−, t_+) be the maximal interval of existence of ϕ.

(i) If t_+ < ∞, then given any compact set C ⊂ U there is a time t_C^+ ∈ (0, t_+) such that ϕ(t_C^+, x_0) ∉ C. Similarly, if t_− > −∞, then given any compact set C ⊂ U there is a time t_C^− ∈ (t_−, 0) such that ϕ(t_C^−, x_0) ∉ C.

(ii) If U = R^n and ‖f(x)‖ is bounded, then (t_−, t_+) = R.

Proof. (i) Let C ⊂ U be compact and assume ϕ([0, t_+), x_0) ⊂ C. Since C is compact and f is continuous, there are positive constants K_1 and K_2 such that

‖f(x) − f(y)‖ ≤ K_1‖x − y‖ and ‖f(x)‖ ≤ K_2

for all x, y ∈ C. Thus the proof of Theorem 2.2.3 implies that the solution ϕ satisfies Lip(ϕ) ≤ K_2 on [0, t_+). Because ϕ is Lipschitz with respect to t, the limit x_+ = lim_{t→t_+} ϕ(t, x_0) exists, and the compactness of C implies that x_+ ∈ C. Theorem 2.2.3 guarantees the existence of δ > 0 and a solution ψ : (t_+ − δ, t_+ + δ) → R^n to the initial value problem ẋ = f(x), x(t_+) = x_+. Furthermore, ψ = ϕ on (t_+ − δ, t_+). This contradicts the assumption that J = (t_−, t_+) is the maximal interval of existence of ϕ. The argument for t_− is similar.

(ii) Letting K := sup_{x∈R^n} ‖f(x)‖, any solution satisfies ‖ϕ(t, x_0) − x_0‖ ≤ K|t|, so for any finite T the solution remains in the compact ball of radius ‖x_0‖ + KT. By part (i) this forces t_+ = ∞ and t_− = −∞, and therefore (t_−, t_+) = R.


2.3 Regularity of Solutions

One can show that if f ∈ C^r, then ϕ ∈ C^{r+1}. In particular, this shows that if f ∈ C^ω, then ϕ ∈ C^∞. However, to obtain the analyticity of ϕ requires a different line of reasoning. The argument we present here is based on the idea of majorants. It does not provide the most efficient proof, but has the advantage that it can be extended to obtain the Cauchy-Kovalevskaya theorem (also known as the Cauchy-Kowalevski theorem) for partial differential equations [2, Chapter 4, Theorem 2], [3].

    For the sake of presentation we first discuss the one-dimensional case before presentingthe general case.

Theorem 2.3.1 (Cauchy-Kovalevskaya Theorem: ODE version I). Let f : U → R be an analytic function defined on an open interval U ⊂ R containing the origin. If x : J → R is the solution to the IVP

ẋ = f(x), x(0) = 0,   (2.15)

then x is an analytic function in a neighborhood of 0.

    As is made clear shortly the following example is of particular interest in this context.

Example 2.3.2. Direct substitution shows that the unique solution to the IVP

ẏ = g(y) := Cr/(r − y) = ∑_{k=0}^{∞} C r^{−k} y^k, y(0) = 0,

where C, r > 0, is given by

y(t) = r − √(r^2 − 2Crt)   (2.16)

which is analytic for |t| < r/(2C).

Proof of Theorem 2.3.1. Let x : J → R be the solution to (2.15). Our goal is to show that x is analytic at t = 0, which is equivalent to showing that x is given by a convergent Taylor series at t = 0. We know that x is C^∞ and thus we can compute all of its derivatives, i.e.,

ẋ(t) = f(x(t))
ẍ(t) = f′(x(t))ẋ(t) = f′(x(t))f(x(t))
x^{(3)}(t) = f″(x(t))[f(x(t))]^2 + [f′(x(t))]^2 f(x(t))
...
x^{(k)}(t) = p_k(f(x(t)), f′(x(t)), ..., f^{(k−1)}(x(t))),

where p_k is a polynomial in k variables with all of its coefficients positive integers. By definition x is analytic at t = 0 if and only if there exists ρ > 0 such that (using x(0) = 0)

∑_{k=0}^{∞} (1/k!) p_k(f(0), ..., f^{(k−1)}(0)) t^k


converges to x(t) for |t| < ρ. Since the coefficients of p_k are positive, to show the convergence it is sufficient to show that

∑_{k=0}^{∞} (1/k!) p_k(|f(0)|, ..., |f^{(k−1)}(0)|) t^k   (2.17)

has a positive radius of convergence.

Observe that the form of the polynomial p_k is independent of f; i.e., for the ODE ẏ = g(y) we end up with the same polynomial expression y^{(k)}(t) = p_k(g(y(t)), ..., g^{(k−1)}(y(t))). Thus, again making use of the fact that p_k has positive coefficients, to prove the convergence of (2.17) it suffices to prove that there is an analytic differential equation ẏ = g(y) with an analytic solution y(t), satisfying y(0) = 0, with the property that

|f^{(k)}(0)| ≤ g^{(k)}(0) for all k ≥ 0.   (2.18)

We now employ the assumption that f is analytic at 0 and choose r > 0 such that [−r, r] ⊂ U and

∑_{k=0}^{∞} (|f^{(k)}(0)|/k!) r^k

converges, to conclude that there exists C > 0 such that

|f^{(k)}(0)| ≤ C k! r^{−k} for all k ≥ 0.   (2.19)

Observe that if we choose g(y) = Cr/(r − y), then by Example 2.3.2 we have g^{(k)}(0) = C k! r^{−k}, and from (2.19) we conclude that (2.18) is satisfied. In this case the solution (2.16) of ẏ = g(y), y(0) = 0, has Taylor expansion

∑_{k=0}^{∞} (1/k!) p_k(g(0), ..., g^{(k−1)}(0)) t^k,

which converges for |t| < ρ for some ρ > 0. We conclude from (2.18) that the series (2.17), and hence the Taylor series of x about 0, has a positive radius of convergence. To finish the proof we need to show that the solution x(t) is given by this power series. This is done by showing that the power series is a solution to (2.15). To this end let

ϕ(t) := ∑_{k=0}^{∞} (x^{(k)}(0)/k!) t^k,

and consider the functions ϕ̇(t) and f(ϕ(t)). They are both analytic at t = 0 and ϕ^{(k)}(0) = x^{(k)}(0) for all k ≥ 0. From this, differentiating f(ϕ(t)), we get

(f ∘ ϕ)^{(k)}(0) = x^{(k+1)}(0) for all k ≥ 0.


Therefore ϕ̇(t) and f(ϕ(t)) are given by the same power series about t = 0, and hence they are equal, since all of their derivatives agree at t = 0. We now have that both x(t) and ϕ(t) are solutions to (2.15), and so by the uniqueness of solutions we conclude that

x(t) = ∑_{k=0}^{∞} (x^{(k)}(0)/k!) t^k,

and hence that the solution to (2.15) is analytic.
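The majorant solution from Example 2.3.2 can likewise be checked numerically; a sketch with the arbitrary values C = 2, r = 1, valid on the domain |t| < r/(2C) = 1/4:

```python
import math

C, r = 2.0, 1.0
y = lambda t: r - math.sqrt(r * r - 2 * C * r * t)   # formula (2.16)
g = lambda v: C * r / (r - v)                        # right-hand side

h = 1e-6
for t in (0.0, 0.1, 0.2, 0.24):          # inside the domain |t| < 0.25
    dydt = (y(t + h) - y(t - h)) / (2 * h)
    assert abs(dydt - g(y(t))) < 1e-4    # checks y' = Cr/(r - y)

assert y(0.0) == 0.0                     # initial condition y(0) = 0
```

As t approaches r/(2C) the derivative grows without bound, matching the square-root singularity that limits the radius of analyticity.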

Theorem 2.3.3 (Cauchy-Kovalevskaya Theorem: ODE version II). Let f : U → R^n be an analytic function defined on an open set U ⊂ R^n containing the origin. If x : J → R^n is the solution to the IVP

ẋ = f(x), x(0) = 0,

then x is an analytic function in a neighborhood of 0.

The proof of this theorem is analogous to that of the one-dimensional case, hence we only present the main steps. We use multi-index notation (see Appendix ??). First we prove the following lemma.

Lemma 2.3.4. If h : U → R^n is an analytic function defined on an open set U ⊂ R^n containing the origin, then there exist constants C, r, ρ > 0 such that

h_k(z) ≤ Cr/(r − ∑_{j=1}^{n} z_j), k = 1, ..., n,

for all ‖z‖ < ρ.

Proof. Throughout the proof α denotes a multi-index. Let

h_{k,α} := (1/α!) ∂^α h_k(0).

The analyticity of h implies that there exists ρ > 0 such that for all ‖z‖ < ρ,

h_k(z) = ∑_{|α|≥0} h_{k,α} z^α, k = 1, ..., n.

This implies that for fixed 0 < r < ρ there exists a positive constant C such that |h_{k,α}| r^{|α|} ≤ C for all α and k, and in particular,

|h_{k,α}| ≤ C r^{−|α|} ≤ C (|α|!/α!) r^{−|α|}.


Therefore, for ‖z‖ < ρ, applying the multinomial theorem (see Appendix ??) leads to

h_k(z) ≤ ∑_{|α|≥0} C (|α|!/α!) r^{−|α|} z^α
       = C ∑_{k=0}^{∞} ∑_{|α|=k} (|α|!/α!) z^α r^{−|α|}
       = C ∑_{k=0}^{∞} ((z_1 + ··· + z_n)/r)^k
       = C · 1/(1 − (z_1 + ··· + z_n)/r)
       = Cr/(r − ∑_{j=1}^{n} z_j).

Example 2.3.5. Direct substitution shows that the unique solution to the IVP

ẏ_k = g_k(y) = Cr/(r − ∑_{j=1}^{n} y_j), y(0) = 0, k = 1, ..., n,

where C, r > 0, is given by

y_k(t) = r/n − √((r/n)^2 − (2Cr/n)t),

and this is an analytic function for |t| < r/(2Cn).

Proof of Theorem 2.3.3. Since f is analytic, we know that the solution x to the IVP is C^∞. Repeated differentiation produces

ẋ_j(t) = f_j(x(t))
ẍ_j(t) = ∑_{m=1}^{n} (∂f_j/∂x_m)(x(t)) ẋ_m(t)
...
x_j^{(k)}(t) = q_k({∂^α f_j(x(t))}_{|α|≤k−1}, {x_m^{(ℓ)}(t)}_{ℓ≤k−1}).

Applying this formula by induction on k, we can eliminate all the lower derivatives x_m^{(ℓ)}(0) from the right-hand side to get

x_j^{(k)}(0) = p_k({∂^α f_j(0)}_{|α|≤k−1}),

where, as in the one-dimensional case, p_k is a polynomial with positive integer coefficients. The remainder of the argument proceeds as in the proof of Theorem 2.3.1, with Lemma 2.3.4 and Example 2.3.5 supplying the majorant.


2.4 Dynamical Aspects of ODEs

Theorem 2.4.2 (Reparameterization of time). Let U ⊂ R^n be an open set. Assume f : U → R^n is C^k (with k = ∞ allowed). Given x_0 ∈ U, let ϕ(·, x_0) : J_{x_0} → R^n be the maximal solution for the IVP (2.20). Denote by γ(x_0) the orbit through x_0. Then there exists a C^k function g : U → (0, 1] such that R is the maximal domain of existence for any solution to

ẋ = g(x)f(x).

Moreover, for each x_0 ∈ U, denote by ψ(·, x_0) : R → R^n the solution for the initial value problem ẋ = f(x)g(x), x(0) = x_0. Then

ψ(t, x_0) = ϕ(τ(t, x_0), x_0),

where τ(·, x_0) : R → J_{x_0} satisfies τ(0, x_0) = 0 and solves the differential equation

τ̇(t, x_0) = g(ϕ(τ(t, x_0), x_0)) > 0.

Hence ψ(·, x_0) is a reparameterization of ϕ(·, x_0) with the same oriented solution curves, yielding the same orbit γ(x_0) = ϕ(J_{x_0}, x_0) = ψ(R, x_0).

Proof. If U = R^n, let g(x) := 1/(1 + ‖f(x)‖^2), with ‖·‖ the Euclidean norm (hence ‖·‖^2 is a C^∞ function from R^n to [0, ∞)). Then g : U → (0, 1] is a C^k function, and ‖g(x)f(x)‖ ≤ 1 for all x ∈ U. By Theorem 2.2.18, part (ii), any solution to ẋ = g(x)f(x) is defined on R.

In case U ≠ R^n, choose G : U → (0, 1] to be a C^∞ function such that sup_{x∈U} ‖DG(x)‖_M ≤ 1 (where ‖·‖_M denotes the matrix norm induced by ‖·‖) and such that G(x) approaches 0 as x goes to the boundary of U or as ‖x‖ goes to infinity. In this case, let

g(x) := G(x)^2/(1 + ‖f(x)‖^2),

which is a C^k function, and let F(x) := g(x)f(x). Then ‖F(x)‖ ≤ G(x)^2 ≤ 1 for all x ∈ U. Now consider x_0 ∈ U and denote by ψ(t, x_0) the unique solution of the IVP ẋ = F(x), x(0) = x_0. By Theorem 2.2.18, part (i), to show that ψ(t, x_0) is defined on R it is enough to show that G(ψ(t, x_0)) does not go to 0 in finite time, or equivalently that 1/G(ψ(t, x_0)) does not go to infinity in finite time. Now,

d/dt [1/G(ψ(t, x_0))] = −(1/G(ψ(t, x_0))^2) DG(ψ(t, x_0)) F(ψ(t, x_0)),

and therefore

1/G(ψ(t, x_0)) − 1/G(ψ(0, x_0)) = −∫_0^t (1/G(ψ(s, x_0))^2) DG(ψ(s, x_0)) F(ψ(s, x_0)) ds
                                = −∫_0^t DG(ψ(s, x_0)) f(ψ(s, x_0))/(1 + ‖f(ψ(s, x_0))‖^2) ds.

This implies that

|1/G(ψ(t, x_0))| ≤ |1/G(x_0)| + ∫_0^{|t|} ds = |1/G(x_0)| + |t|,


and therefore 1/G(ψ(t, x_0)) does not go to infinity in finite time. Hence ψ(t, x_0) is defined for all t ∈ R.

Now let τ(t, x_0) ∈ R be the unique solution of the IVP

τ̇(t, x_0) = g(ϕ(τ(t, x_0), x_0)), τ(0, x_0) = 0.

Then ϕ(τ(t, x_0), x_0) satisfies ϕ(τ(0, x_0), x_0) = ϕ(0, x_0) = x_0 and

d/dt ϕ(τ(t, x_0), x_0) = f(ϕ(τ(t, x_0), x_0)) τ̇(t, x_0)
                       = f(ϕ(τ(t, x_0), x_0)) g(ϕ(τ(t, x_0), x_0))
                       = F(ϕ(τ(t, x_0), x_0)).

By uniqueness of solutions of the IVP ẋ = F(x), x(0) = x_0, we obtain that ψ(t, x_0) = ϕ(τ(t, x_0), x_0).
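The construction in the proof can be illustrated on the blow-up field of Example 2.2.14: rescaling f(x) = x^2 by g(x) = 1/(1 + f(x)^2) gives a globally defined equation with the same orbits. A sketch using forward Euler (the discretization and the time horizon are assumptions of this sketch):

```python
f = lambda x: x * x                      # blows up in finite time (Example 2.2.14)
g = lambda x: 1.0 / (1.0 + f(x) ** 2)    # the rescaling from the proof (U = R^n case)
F = lambda x: g(x) * f(x)                # |F(x)| = x^2/(1 + x^4) <= 1/2 for all x

def euler(field, x0, t_end, dt=1e-3):
    """Forward Euler approximation of the solution at time t_end."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * field(x)
    return x

# the original field escapes to infinity before t = 1/x0 = 0.5, while the
# rescaled field grows at most linearly, since |F| <= 1/2 everywhere
x_slow = euler(F, 1.0, 50.0)
assert abs(x_slow - 1.0) <= 26.0
```

The rescaled solution traverses the same orbit as the original, only at a slower, uniformly bounded speed.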

With Theorem 2.4.2 as justification, for the remainder of this chapter we work with autonomous differential equations for which the solutions exist for all time. This allows us to encode all the dynamics in the form of a continuous map.
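The effect of the rescaling in Theorem 2.4.2 can be observed numerically. The sketch below is an illustrative aside, not part of the text: it uses the example f(x) = x², whose solutions blow up in finite time, a crude forward Euler scheme, and helper names of our choosing. Since the rescaled field g(x) = f(x)/(1 + ‖f(x)‖²) has speed bounded by 1/2, its numerically integrated solutions stay finite for arbitrarily long times.

```python
def f(x):
    # f(x) = x**2: solutions of x' = f(x) with x(0) = x0 > 0 blow up at t = 1/x0
    return x * x

def g(x):
    # rescaled field of Theorem 2.4.2 with U = R: g = f/(1 + |f|^2), so |g| <= 1/2
    return f(x) / (1.0 + f(x) ** 2)

def euler(field, x0, t_end, n_steps):
    """Crude forward Euler integration of x' = field(x) on [0, t_end]."""
    h = t_end / n_steps
    x = x0
    for _ in range(n_steps):
        x += h * field(x)
    return x
```

Because |g| ≤ 1/2, any solution of the rescaled equation satisfies |x(t) − x0| ≤ t/2, and the Euler iterates inherit the same bound step by step.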

Definition 2.4.3. Let X be a topological space. A continuous map ϕ : R × X → X is a flow if

    (i) ϕ(0, x) = x

    (ii) ϕ(t, ϕ(s, x)) = ϕ(t+ s, x)

    for all x ∈ X and t, s ∈ R. The space X is called the phase space for the flow.

For the sake of simplicity, for the remainder of this book we will always assume that the phase space X of a flow is a subset of Rn.

Observe that Theorem 2.2.18 in combination with Corollary 2.2.10 implies that if f : Rn → Rn is a bounded Lipschitz continuous function, then the solutions to the differential equation ẋ = f(x) define a flow on Rn. Using [1, Theorem 1.261] and extensions thereof one can prove that if f ∈ Cr, then the flow ϕ : R × Rn → Rn is also Cr.

    With the language of flows we can generalize the concept of an orbit.

Definition 2.4.4. A set S ⊂ X is an invariant set for the flow ϕ : R × X → X if ϕ(R, S) = S.

Observe that every orbit is an invariant set and any invariant set is the union of orbits. Furthermore, if S is an invariant set for ϕ then ϕ(t, S) = S for all t ∈ R.

    Slightly weaker, but useful notions of invariance include the following.

Definition 2.4.5. A set S ⊂ X is forward invariant if ϕ([0,∞), S) = S and is backward invariant if ϕ((−∞, 0], S) = S.


    Observe that an invariant set is both forward and backward invariant.

Definition 2.4.6. A point x ∈ X is an equilibrium point if ϕ(R, x) = x or equivalently if ϕ(t, x) = x for all t ∈ R.

Observe that if the flow ϕ is generated by an ODE ẋ = f(x), then x0 is an equilibrium point if and only if f(x0) = 0.

Example 2.4.7. The logistic equation is used in biology as a simple model for population growth that includes overcrowding effects. It takes the form

    ẋ = rx(K − x) (2.21)

where r > 0 represents the birth rate and K > 0 is the carrying capacity for the environment. Explicit solutions to this equation take the form

ϕ(t, x0) = x0 K e^{rKt} / (K − x0 + x0 e^{rKt}). (2.22)

Observe that ϕ is not a flow on the phase space R. If x0 < 0 or x0 > K, then |ϕ(t, x0)| → ∞ as t → (1/(rK)) ln((x0 − K)/x0). However, from the biological perspective, the population levels of interest lie in the interval [0, K] and the restriction ϕ : R × [0, K] → [0, K] is a flow.
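The explicit formula (2.22) can be checked numerically. The snippet below is an illustrative aside, not part of the text; the helper name phi and the parameter values are our choices. Evaluating (2.22) lets one verify the group property ϕ(t + s, x0) = ϕ(t, ϕ(s, x0)) of Definition 2.4.3 and the two equilibria 0 and K directly.

```python
import math

def phi(t, x0, r=1.0, K=1.0):
    """Explicit solution (2.22) of the logistic equation x' = r*x*(K - x)."""
    e = math.exp(r * K * t)
    return x0 * K * e / (K - x0 + x0 * e)
```

For x0 ∈ [0, K] the formula defines a flow: ϕ(0, x0) = x0, the group property holds, and ϕ(t, 0) = 0, ϕ(t, K) = K for all t.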

Figure 2.1: The phase portrait of the logistic equation (2.21) for r, K > 0.

The dynamics of the logistic equation restricted to [0, K] is particularly simple (e.g. see Figure 2.1). There are three orbits: two equilibrium points and the orbit (0, K) along which the dynamics moves from 0 to K. This suggests that it is worth naming the latter type of orbit.

Definition 2.4.8. A point x0 ∈ X is a heteroclinic point of a flow ϕ : R × X → X if

lim_{t→∞} ϕ(t, x0) = x+ and lim_{t→−∞} ϕ(t, x0) = x−

where x− ≠ x+ are equilibria. The orbit ϕ(R, x0) is called a heteroclinic orbit from x− to x+. In case x− = x+ we say that x0 is a homoclinic point and ϕ(R, x0) is called a homoclinic orbit.

    We leave the proof of the following proposition as an exercise.


Proposition 2.4.9. If f : R → R is locally Lipschitz continuous, then every bounded solution to

    ẋ = f(x)

    is either an equilibrium or a heteroclinic orbit.

Definition 2.4.10. A point x ∈ X is a periodic point if there exists T > 0 such that ϕ(T, x) = x. The associated periodic orbit is ϕ([0, T ], x).

Observe that a periodic orbit is an invariant set, since by Definition 2.4.3(ii) ϕ(R, x) = ϕ([0, T ], x).

Example 2.4.11. Consider the system of differential equations

ẋ1 = −x2 + λx1(K² − x1² − x2²)
ẋ2 = x1 + λx2(K² − x1² − x2²).

Changing to polar coordinates results in the system

θ̇ = 1
ṙ = λr(K² − r²).

Observe that the circle r = |K|, or equivalently Γ := {x ∈ R² | ‖x‖ = |K|}, is a periodic orbit. Since the equation in r is a scalar differential equation, by Proposition 2.4.9 if r0 ∈ (0, K), then r0 is a heteroclinic point. The associated heteroclinic orbit goes from 0 to K and thus solutions spiral away from the origin toward the periodic orbit.
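The radial behavior of Example 2.4.11 can also be observed numerically. The rough sketch below is an illustrative aside, not part of the text: it integrates only the scalar r-equation with forward Euler (the angle θ simply advances at unit rate), with parameters K = λ = 1 and step sizes of our choosing.

```python
def integrate_r(r0, lam=1.0, K=1.0, t_end=20.0, n=20000):
    """Forward Euler for the radial equation r' = lam*r*(K**2 - r**2)
    of Example 2.4.11; theta advances at unit rate and is omitted."""
    h = t_end / n
    r = r0
    for _ in range(n):
        r += h * lam * r * (K * K - r * r)
    return r
```

Starting either inside or outside the circle r = K, the radius approaches K, consistent with the heteroclinic behavior of the scalar r-equation; r = 0 remains fixed.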

    Figure 2.2: The phase portrait of the equation of Example 2.4.11 with K = λ = 1.


Example 2.4.11 suggests that we want to be able to discuss limits of trajectories that are not single points. We begin with a general definition.

Definition 2.4.12. Let ϕ : R × X → X be a flow and let U ⊂ X. The alpha and omega limit sets of U are defined by

α(U) = α(U, ϕ) := ⋂_{T≤0} cl(ϕ((−∞, T ], U)) and ω(U) = ω(U, ϕ) := ⋂_{T≥0} cl(ϕ([T,∞), U)),

respectively, where cl denotes closure.

In a slight abuse of notation, if x ∈ X then we will let ω(x) = ω({x}) and α(x) = α({x}). As an exercise we let the reader check that if x0 is a heteroclinic point from x− to x+, then α(x0) = x− and ω(x0) = x+.

    Proposition 2.4.13. Let ϕ : R×X → X be a flow and U, V ⊂ X. We have:

    (i) If there exists t ∈ R such that ϕ(t, U) = V , then

    ω(U) = ω(V ) and α(U) = α(V ).

(ii) A point y ∈ ω(U, ϕ) if and only if there exists a sequence of times tk → ∞ and a sequence of points xk ∈ U such that

lim_{k→∞} ϕ(tk, xk) = y.

    Furthermore, ω(U) = ω(ϕ([0,∞), U)).

    (iii) Let V be a closed, forward invariant set. If U ⊂ V , then ω(U) ⊂ V .

(iv) If U is forward invariant, then

ω(U) = ⋂_{t≥0} cl(ϕ(t, U)). (2.23)

(v) ω(U) and α(U) are closed invariant sets. Furthermore, if U is a closed invariant set, then

U = ω(U) = α(U).

(vi) If ϕ([0,∞), U) ⊂ K where K ⊂ X is compact, then ω(U) is nonempty and compact. If in addition U is connected, then ω(U) is connected.

    (vii) If V ⊂ ω(U), then ω(V ) ⊂ ω(U) and α(V ) ⊂ ω(U).

    Similar results apply to α limit sets with regard to negative time and backward invariance.


Proof. (i). This follows from the definition of a flow and the definition of the alpha and the omega limit sets.

(ii). Assume y ∈ ω(U). By definition y ∈ cl(ϕ([T,∞), U)) for all T ≥ 0. In particular, for each k ∈ Z+ there exist sk ≥ 0 and xk ∈ U with yk := ϕ(k + sk, xk) ∈ ϕ([k,∞), U) such that ‖yk − y‖ ≤ 1/k. Let tk := k + sk. Then lim_{k→∞} ϕ(tk, xk) = y.

Conversely, assume there exists a sequence tk → ∞ and points xk ∈ U such that lim_{k→∞} ϕ(tk, xk) = y. Then y ∈ cl(ϕ([t,∞), U)) for all t ≥ 0 and hence y ∈ ω(U).

The final remark follows from the observation that ϕ([t,∞), U) = ϕ(t, ϕ([0,∞), U)).

(iii). Since U ⊂ V and V is forward invariant

    ϕ([0,∞), U) ⊂ ϕ([0,∞), V ) ⊂ V.

Since V is closed, cl(ϕ([0,∞), U)) ⊂ V and hence ω(U) ⊂ V.

(iv). Since U is forward invariant

    ϕ(t+ s, U) ⊂ ϕ(t, ϕ(s, U)) ⊂ ϕ(t, U)

for all t, s ≥ 0. Therefore, ϕ([t,∞), U) = ϕ(t, U), from which (2.23) follows.

(v). By definition, alpha and omega limit sets are defined in terms of intersections of closed sets and hence are closed.

We prove that ω(U) is forward and backward invariant, which implies that it is invariant. As a preliminary step we show that if U is forward invariant, then ω(U) is forward invariant. Observe that by (iv) for all t ≥ 0,

ϕ(t, ω(U)) = ϕ(t, ⋂_{s≥0} cl(ϕ(s, U)))
           ⊂ ⋂_{s≥0} ϕ(t, cl(ϕ(s, U)))
           ⊂ ⋂_{s≥0} cl(ϕ(t, ϕ(s, U)))
           = ⋂_{s≥0} cl(ϕ(s, ϕ(t, U)))
           = ω(ϕ(t, U))
           = ω(U)

where the inclusions follow from the fact that ϕ is continuous and the last equality follows from (i). To finish the proof we note that ϕ([0,∞), U) is forward invariant. Thus

ϕ(t, ω(U)) = ϕ(t, ω(ϕ([0,∞), U))) = ω(ϕ([0,∞), U)) = ω(U)

where the last equality follows from (ii). This shows that ω(U) is forward invariant. A similar argument shows that ω(U) is backward invariant.


    If U is a closed invariant set, then U is forward invariant. Thus by (iv)

U = ⋂_{t≥0} ϕ(t, U) ⊂ ⋂_{t≥0} cl(ϕ(t, U)) = ω(U).

However, since U is a closed forward invariant set, by (iii), ω(U) ⊂ U.

(vi). Since for every T ≥ 0, ϕ([T,∞), U) ⊂ ϕ([0,∞), U) ⊂ K and K is compact, cl(ϕ([T,∞), U)) ⊂ K is compact. By definition this implies that ω(U) is the intersection of a nested collection of nonempty compact sets and so is nonempty and compact.

Now assume that U is connected. Then ϕ([T,∞), U) is connected and thus cl(ϕ([T,∞), U)) is connected. Again, since ω(U) is the intersection of a nested collection of compact connected sets, it is connected.

(vii). Since ω(U) is a closed invariant set, by (iii), ω(V) ⊂ ω(U). The corresponding result to (iii) for alpha limit sets shows α(V) ⊂ ω(U).

It is important to observe that α(U) and ω(U) only describe the asymptotic dynamics of points in U and ignore that of nearby points. Returning to Example 2.4.11, observe that ω(0) = 0, but ω(y) = {x ∈ R² | ‖x‖ = |K|} for all y ∈ R² \ {0}. To discuss sets whose asymptotic dynamics is in agreement with that of their neighbors we introduce the following concepts.

Definition 2.4.14. Let ϕ : R × X → X be a flow. A forward invariant set U is a trapping region if there exists T > 0 such that

    ϕ(T,U) ⊂ int(U).

    Dually, a backward invariant set U is a repelling region if there exists T < 0 such that

    ϕ(T,U) ⊂ int(U).

Definition 2.4.15. Let ϕ : R × X → X be a flow on a locally compact metric space. A set A ⊂ X is an attractor if there exists a trapping region U such that A = ω(U). A set R ⊂ X is a repeller if there exists a repelling region U such that R = α(U).

Definition 2.4.16. Let ϕ : R × X → X be a flow on a locally compact metric space. The maximal invariant set in a set U ⊂ X is defined by

    Inv(U,ϕ) := {x ∈ U | ϕ(R, x) ⊂ U}.

    We leave the proof of the following proposition as an exercise.

Proposition 2.4.17. If U is a trapping region and A = ω(U), then A is the maximal invariant set in U.


In applications it can be difficult to determine a trapping region, because it must be forward invariant. A seemingly weaker notion is the following.

Definition 2.4.18. Let ϕ : R × X → X be a flow on a locally compact metric space. A compact set N is called an attracting neighborhood if

    ω(N) ⊂ int(N).

Dually, a compact set N is a repelling neighborhood if α(N) ⊂ int(N).

In general attracting neighborhoods are easier to identify than trapping regions (see Exercise 4.7.3). However, as the following results indicate, they are closely related.

Proposition 2.4.19. If U is a trapping region, then U is an attracting neighborhood. If N is an attracting neighborhood, then there exists a trapping region U such that ω(N) ⊂ U ⊂ N.

The proof of Proposition 2.4.19 follows from the continuity of the flow. However, the argument is somewhat technical, thus we do not present it here.

Corollary 2.4.20. A is an attractor for a flow if and only if there exists an attracting neighborhood N such that A = ω(N).

Corresponding results hold for repellers. Attractors and repellers provide a powerful tool for decomposing dynamics.

Definition 2.4.21. Let A be an attractor for a flow ϕ : R × X → X on a compact metric space. The dual repeller to A is

    A∗ := {x ∈ X | ω(x) ∩A = ∅} .

    (A,A∗) is called an attractor-repeller pair decomposition for ϕ.

Lemma 2.4.22. If (A, A∗) is an attractor-repeller pair decomposition, then

    A ∩A∗ = ∅.

Proof. Since A is an attractor there exists a trapping region N such that A = ω(N) ⊂ int(N). Observe that this implies that for any y ∈ N, ω(y) ⊂ A. Hence A∗ ∩ N = ∅ and therefore A ∩ A∗ ⊂ int(N) ∩ A∗ ⊂ N ∩ A∗ = ∅.

Returning to the logistic equation in Example 2.4.7 and the induced flow ϕ : R × [0, K] → [0, K], observe that {K} is an attractor. Its dual repeller is {0}, thus ({K}, {0}) forms an attractor-repeller pair decomposition for ϕ. Observe that, contrary to the name, this is not a decomposition of the phase space [0, K]. The justification for calling it a decomposition is made clear in Theorem 2.4.24. It is also worth noting that {0} is a repeller. As the following result, which is left as an exercise, indicates, this is true in general.


Proposition 2.4.23. If A is an attractor for a flow ϕ : R × X → X on a compact metric space, then its dual repeller A∗ is a repeller for ϕ.

Theorem 2.4.24. Let ϕ : R × X → X be a flow on a compact metric space. Let (A, A∗) be an attractor-repeller pair decomposition for ϕ. If x ∈ X \ (A ∪ A∗), then

ω(x) ⊂ A and α(x) ⊂ A∗.

Proof. Since A is an attractor there exists a trapping region N such that A = ω(N) ⊂ int(N). Observe that this implies that for any y ∈ N, ω(y) ⊂ A.

We want to prove that ω(x) ⊂ A. By definition, if x ∉ A∗, then ω(x) ∩ A ≠ ∅. Therefore, by Proposition 2.4.13(ii) there exists t > 0 such that ϕ(t, x) = y ∈ N. By Proposition 2.4.13(i), ω(x) = ω(y) ⊂ A.

We want to prove that α(x) ⊂ A∗. We first show that α(x) ∩ A = ∅. Assume not. Then, by the analogue of Proposition 2.4.13(ii) for alpha limit sets, there exists a sequence of times tk → −∞ such that

lim_{k→∞} ϕ(tk, x) = lim_{k→∞} yk = y ∈ A.

Thus without loss of generality we can assume that yk ∈ N for all k. Since N is a trapping region, ϕ([0,∞), yk) ⊂ N. Since this is true for all tk, we can conclude that ϕ(R, x) ⊂ N. This implies that x belongs to the maximal invariant set in N and hence, by Proposition 2.4.17, that x ∈ A, contradicting the assumption that x ∉ A. Therefore, α(x) ∩ A = ∅.

By the analogue of Proposition 2.4.13(vi) for alpha limit sets, α(x) ≠ ∅. Assume y ∈ α(x). Since α(x) is invariant, ω(y) ⊂ α(x). This implies that ω(y) ∩ A = ∅ and hence by definition y ∈ A∗.

Attractor-repeller pair decompositions for the dynamics of Examples 2.4.7 and 2.4.11 are reasonably easy to describe: ({K}, {0}) and ({x ∈ R² | ‖x‖ = |K|}, {0}), respectively. A significant feature distinguishing the systems is that in the first case the attractor is an equilibrium, while in the second case it is a periodic orbit. It should also be observed that both systems depend on two parameters, r and K for the logistic equation and λ and K for Example 2.4.11, and that as one changes these parameters the solutions to the differential equations change. This is made explicit by (2.22). This raises the question of what level of refinement we want to use to distinguish between the dynamics of different systems of differential equations.

Definition 2.4.25. Two flows ϕ : R × X → X and ψ : R × Y → Y are topologically equivalent if there exists a homeomorphism h : X → Y such that orbits of ϕ are mapped onto orbits of ψ preserving the direction of time, that is, there exists a continuous and strictly increasing time-rescaling map τ : R × X → R such that

    h(ϕ(t, x)) = ψ(τ(t, x), h(x)),

    for all (t, x) ∈ R×X.


Observe that the dynamics of Example 2.4.7 and Example 2.4.11 are not topologically equivalent. Consider a flow ϕ : R × [0, K] → [0, K] (resp. ψ : R × [0, K] → [0, K]) generated by Example 2.4.7 with r, K > 0 (resp. r < 0 < K). Then, as one can see in Figure 2.3, the flows ϕ and ψ are not topologically equivalent. However, for all r > 0 and K > 0 all flows generated by Example 2.4.7 are topologically equivalent. The same is true for all λ > 0 and K ≠ 0 in Example 2.4.11. Demonstrating these last three statements can be done because of the extremely simple form of the equations.

Figure 2.3: Phase portraits of the logistic equation (2.21) for r, K > 0 (left) and for r < 0 < K (right). The flows defined on the same phase space X = [0, K] are not topologically equivalent.

In general showing that two ODEs generate topologically equivalent dynamics is extremely challenging. We leave it to the reader to check that given a differential equation ẋ = f(x), if ψi : R × Rn → Rn, i = 0, 1, are flows generated by ẋ = gi(x)f(x) for different rescaling functions gi as in Theorem 2.4.2, then ψ0 and ψ1 are topologically equivalent. Thus under this equivalence relation there is no ambiguity about discussing the flow generated by a differential equation.

We introduced the concept of an attractor by arguing that we should be interested in the asymptotic dynamics of neighborhoods about a set of initial conditions. In the context of fixed points and periodic orbits we will return to this question repeatedly and in greater generality than just that of an attractor. However, before leaving the subject we introduce some fundamental definitions. Consider an autonomous differential equation

    ẋ = f(x) (2.24)

where f : U → Rn is locally Lipschitz continuous on an open set U and let ϕ denote the associated flow.

Definition 2.4.26. An equilibrium x̃ of (2.24) is stable if for any ε > 0 there exists δ > 0 such that for all t > 0, if ‖x0 − x̃‖ < δ, then ‖ϕ(t, x0) − x̃‖ < ε. An equilibrium that is not stable is called unstable.

    A stronger notion of stability is the following.

Definition 2.4.27. An equilibrium x̃ of (2.24) is asymptotically stable if it is stable and there exists ρ > 0 such that if ‖x0 − x̃‖ < ρ, then lim_{t→∞} ϕ(t, x0) = x̃.

The relationship between the concepts of stability and attractors is subtle as the following examples and propositions indicate.


Example 2.4.28. Consider the harmonic oscillator

ẋ1 = x2
ẋ2 = −x1.    (2.25)

The unique equilibrium is the origin and the remaining orbits are periodic orbits that take the form of concentric circles. The origin is a stable equilibrium where one can choose δ = ε. Observe that the origin 0 is not an attractor; given any neighborhood N of the origin, the maximal invariant set in N will contain periodic orbits and thus ω(N) ≠ {0}.

    The proof of the following result is nontrivial.

Proposition 2.4.29. Let X ⊂ Rn be a compact neighborhood of x̃. An equilibrium x̃ is an attractor for the flow ϕ : R × X → X if and only if x̃ is an asymptotically stable equilibrium.

    2.5 Exercises

Exercise 2.5.1 (Existence via Picard's method). An alternative proof for the existence of solutions is given by Picard's method. In this exercise we use Picard's method to prove the existence of a solution to the non-autonomous IVP

    ẋ = f(x, t), x(t0) = x0.

We begin with some hypotheses. Consider an open set U ⊂ Rn and an open interval I ⊂ R which contains t0. Denote D := U × I. Assume that f : D → Rn is continuous and that for all t ∈ I, f(·, t) : U → Rn is a locally Lipschitz continuous function.

The goal is to show that there exists a solution ϕ : J = (t0 − a, t0 + a) → Rn, for some a > 0, to the integral equation

ϕ(t) = x0 + ∫_{t0}^{t} f(ϕ(s), s) ds, for all t ∈ J. (2.26)

Let ε, δ > 0 be small enough so that Dε,δ := {(x, t) | ‖x − x0‖ ≤ ε, |t − t0| ≤ δ} ⊂ D and so that

‖f(x, t) − f(y, t)‖ ≤ K‖x − y‖, for all (x, t), (y, t) ∈ Dε,δ. (2.27)

Let

Mε,δ := max_{(x,t)∈Dε,δ} ‖f(x, t)‖, (2.28)

and let a > 0 be such that

a ≤ min{δ, ε/Mε,δ}. (2.29)


Define a time interval by

J := (t0 − a, t0 + a). (2.30)

For any t ∈ J, define the Picard operator by

T(x)(t) = x0 + ∫_{t0}^{t} f(x(s), s) ds. (2.31)

(i) Show that (T(x)(t), t) ∈ Dε,a for every (x, t) ∈ Dε,a.

This allows us to define a sequence of functions {xn}n≥0 with xn : J → Rn by

x0(t) ≡ x0, xn+1(t) = x0 + ∫_{t0}^{t} f(xn(s), s) ds, n ≥ 0. (2.32)

The iterations (2.32) define what is known as the Picard iterative process.

(ii) Show that for every t ∈ J = (t0 − a, t0 + a),

‖xn(t) − xn−1(t)‖ ≤ Mε,δ K^{n−1} |t − t0|^n / n!.

(iii) Show that {xn}n≥0 is a Cauchy sequence in the space C0(J) endowed with the supremum norm.

(iv) Show that there exists a continuous function ϕ : J → Rn that is a solution of the integral equation (2.26).
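The Picard iterative process (2.32) can also be carried out numerically. The sketch below is an illustrative aside, not part of the exercise: it discretizes the integral in (2.31) with the trapezoid rule on a uniform grid (a choice of ours). For f(x, t) = x with x(t0) = 1 the exact iterates are the Taylor partial sums of e^t, so the discrete iterates converge to the exponential up to quadrature error.

```python
def picard_iterates(f, x0, t0, t1, n_iter, n_grid=1000):
    """Numerical Picard iteration for the scalar IVP x' = f(x, t), x(t0) = x0,
    approximating the integral in (2.31)-(2.32) by the trapezoid rule."""
    h = (t1 - t0) / n_grid
    ts = [t0 + i * h for i in range(n_grid + 1)]
    x = [x0] * (n_grid + 1)                      # x_0(t) = x0
    for _ in range(n_iter):
        vals = [f(xi, ti) for xi, ti in zip(x, ts)]
        new, acc = [x0], 0.0
        for i in range(n_grid):
            acc += 0.5 * h * (vals[i] + vals[i + 1])
            new.append(x0 + acc)
        x = new
    return ts, x
```

After one iteration x1(t) = 1 + t, after two x2(t) = 1 + t + t²/2, and so on, mirroring the bound of part (ii).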

    Exercise 2.5.2. Prove Proposition 2.2.11.

    Exercise 2.5.3. Prove Proposition 2.4.9.

Exercise 2.5.4. Prove that given a flow ϕ : R × X → X and a set A ⊂ X,

⋃_{x∈A} ω(x, ϕ) ⊂ ω(A, ϕ).

Provide an example for which

⋃_{x∈A} ω(x, ϕ) ≠ ω(A, ϕ).

    Exercise 2.5.5. Prove Proposition 2.4.23.

Exercise 2.5.6. Let f : Rn → Rn be C1. Show that the flows generated by

ẋ = f(x) and ẋ = f(x)/(1 + ‖f(x)‖)

are topologically equivalent.


Exercise 2.5.7. Observe that the definition of topologically equivalent in Definition 2.4.25 requires a homeomorphism h : X → Y as opposed to a diffeomorphism. Since every diffeomorphism is a homeomorphism, our definition leads to larger equivalence classes. Using scalar differential equations, provide an explanation for this choice. More explicitly, consider the family of linear scalar differential equations {ẋ = ax | a ∈ R}.

    (i) Determine the topological equivalence classes.

(ii) In the definition of topological equivalence replace the homeomorphism h : X → Y with the requirement that h be a diffeomorphism. Determine the equivalence classes.

The point of this exercise is to show that if we use a diffeomorphism, then the equivalence relation requires that the eigenvalues of the derivative at the equilibrium be the same.

Exercise 2.5.8 (Existence and uniqueness for linear differential equations). Consider an open interval J ⊂ R. Suppose that g : J → Rn and ai,j : J → R, 1 ≤ i, j ≤ n, are continuous functions. Let A(t) denote the n × n matrix whose (i, j) entry is ai,j(t). Prove that for any t0 ∈ J the initial value problem

    ẋ(t) = A(t)x+ g(t), x(t0) = x0, (2.33)

    has a unique solution x : J → Rn.

    Hint: Define an integrating factor C(t) by writing down a solution of the equation

    C ′(t) = −C(t)A(t),

that is, write down an explicit formula for such a C(t). Show directly that

    • C(t) is well defined, continuous, and differentiable on J .

    • C(t0) = Id.

    • C(t) is invertible for every t ∈ J .

Now multiply both sides of Equation (2.33) by C(t), apply the product rule on the left, integrate both sides from t0 to t with t ∈ J, and solve for x(t). This provides an explicit formula for the solution x(t). Show that the formula gives a well defined, continuous, differentiable function for each t ∈ J.


Exercise 2.5.9 (Differentiability of the flow with respect to initial conditions). Let ϕ : R × Rn → Rn be the flow generated by ẋ = f(x), where f : Rn → Rn is C1 and bounded on Rn. Proposition 2.2.9 guarantees that ϕ(t, x) is continuous with respect to initial conditions. The following approach shows that ϕ is actually differentiable with respect to x.

Observe that differentiability of ϕ with respect to x is equivalent to the statement that for any t ∈ R, x0 ∈ Rn, there exists a unique n × n matrix Dϕ(t, x0) satisfying

lim_{‖h‖→0} ‖ϕ(t, x0 + h) − ϕ(t, x0) − Dϕ(t, x0)h‖ / ‖h‖ = 0. (2.34)

Let x0 ∈ Rn and let γ(t) ≡ ϕ(t, x0) denote the orbit through x0. Define the matrix Dϕ(t, x0) to be the solution of the first variation equation,

d/dt Dϕ(t, x0) = Df(γ(t)) Dϕ(t, x0), Dϕ(0, x0) = Id.

Use the results of Exercise 2.5.8 to establish that the matrix solving this equation exists assuming only that ϕ(t, x0) ≡ γ(t) exists and is continuous. Now show that Dϕ(t, x0) satisfies (2.34).
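A concrete scalar check of this exercise can be done numerically. The sketch below is an illustrative aside, not part of the text: it integrates ẋ = f(x) together with the scalar first variation equation by forward Euler and compares the result against a finite difference of the flow; the choice f(x) = sin(x) and all helper names are ours.

```python
import math

def flow_and_derivative(f, df, x0, t_end, n=100000):
    """Jointly integrate x' = f(x) and the scalar first variation equation
    D' = Df(x(t)) * D, D(0) = 1, by forward Euler."""
    h = t_end / n
    x, D = x0, 1.0
    for _ in range(n):
        x, D = x + h * f(x), D + h * df(x) * D
    return x, D
```

Note that the Euler update for D is exactly the derivative of the Euler update for x with respect to x0, so D tracks the sensitivity of the discrete flow to its initial condition.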

Exercise 2.5.10 (Flow Box Theorem). Consider the differential equation ẋ = f(x) where f : Rn → Rn is a Cr vector field for r ≥ 1. A point x̃ ∈ Rn is called an ordinary point if f(x̃) ≠ 0. Prove that there exists a Cr change of coordinates x = g(y) defined on a neighborhood of x̃ such that

ẏ1 = 1
ẏ2 = 0
⋮
ẏn = 0.

Exercise 2.5.11. Show that X with the C0(J) norm (as defined in the proof of Proposition 2.2.5) is a complete metric space, with the metric given by

d(x, y) = ‖x − y‖C0(J) = sup_{t∈J} ‖x(t) − y(t)‖.

Exercise 2.5.12. Show that if x0 is a heteroclinic point from x− to x+, then α(x0) = x− and ω(x0) = x+.

Exercise 2.5.13. Give an example of a flow ϕ : R × R² → R² for which there exists a point x ∈ R² such that ω(x) ≠ ∅, but ω(x) is not connected.


Exercise 2.5.14 (Lyapunov's stability theorem). Let x̃ be an equilibrium for ẋ = f(x), x ∈ Rn, where f ∈ C1(U, Rn) for some open set U ⊂ Rn. Let W be a neighborhood of x̃ and let V : W → R be a continuous function that is differentiable on W \ {x̃}. Define

V̇(x) := d/dt V(x(t)) = DV(x) · ẋ(t) = DV(x) · f(x),

and assume that V satisfies the following properties:

    (i) V (x̃) = 0 and V (x) > 0 if x ∈W \ {x̃}.

    (ii) V̇ (x) ≤ 0 for all x ∈W \ {x̃}.

Prove that x̃ is stable and that, if in addition V̇(x) < 0 for all x ∈ W \ {x̃}, then x̃ is asymptotically stable.

Chapter 3

Equilibria and Radii Polynomials in Finite Dimension

Consider an open set U ⊂ Rn and a Lipschitz continuous function f : U → Rn. Theorem 2.2.12 guarantees the existence of a unique maximal solution to the initial value problem

ẋ = f(x), x(0) = x0

for any x0 ∈ U. Therefore, we can now turn our attention to studying solutions with specific properties. The simplest, and hence the starting point for our investigations, is that of an equilibrium point, also called a fixed point, steady state, or critical point, i.e., a solution that is constant in time. One of our goals is to develop a constructive method to prove the existence of equilibria. Observe that x0 ∈ Rn is an equilibrium point if and only if f(x0) = 0. Thus, it is sufficient for us to provide a constructive approach for proving the existence of zeros of a function defined on a finite dimensional space X (in our case Rn or Cn). More precisely, the goal of this chapter is twofold: prove the existence of a point x̃ ∈ X such that f(x̃) = 0 and provide bounds on the location of x̃. This is done using the radii polynomial approach, which is a variant on Newton's method. With this in mind we recall Newton's method in Section 3.1 and then introduce the radii polynomial approach in finite dimension in Section 3.2.

Throughout this book, given two vectors x, y ∈ Rn, we use the notation x ≼ y if xk ≤ yk for all k = 1, . . . , n. Moreover, given a matrix A = {aij}i,j (real or complex valued) we use the notation |A| to denote the matrix |A| = {|aij|}i,j, where | · | denotes the absolute value.

    3.1 Newton’s Method

We begin with a trivial proposition that sets the stage for our strategy for finding zeros of a function.



Proposition 3.1.1. Let X be Rn or Cn, and let U, V ⊂ X be open sets. Consider f : U → V. Assume that A : X → X is an invertible linear map. Let T : U → X be defined by

    T (x) := x−Af(x). (3.1)

    If T (x̃) = x̃, then f(x̃) = 0.

Proposition 3.1.1 allows us to replace the problem of directly finding a zero of f with that of proving the existence of a fixed point of T. As is indicated at the beginning of Chapter 2, the contraction mapping theorem (Theorem 2.1.2) provides existence and uniqueness if T is a contraction. Furthermore, it gives bounds on the location of x̃ as a function of an initial guess (recall (2.1)). Thus the problems we are trying to address are reduced to finding an injective linear map A that makes T a contraction. This leads us to Newton's method. We begin with a simple example.

Example 3.1.2. Consider f ∈ C1(R, R). Recall that in Newton's method, T : R → R applied to f is given by

T(x) := x − f(x)/f′(x)

and that T is used iteratively to find an approximate value of a root of f. More explicitly, given an initial guess x̂ ∈ R, set x0 = x̂ and inductively define xk+1 := T(xk). Assume

lim_{k→∞} xk = x̃.

Observe that if T is continuous at x̃ (a sufficient condition for this is that f′(x̃) ≠ 0), then T(x̃) = x̃ and hence f(x̃) = 0. Thus the problem of proving the existence of a zero of f is essentially reduced to finding and/or identifying whether an initial guess x̂ ∈ R leads to convergence of Newton's method.

The existence of convergence is precisely the conclusion of the contraction mapping theorem. With this in mind assume f(x̃) = 0 and f′(x̃) ≠ 0. Observe that for |h| small

|T(x̃ + h) − T(x̃)| = |x̃ + h − f(x̃ + h)/f′(x̃ + h) − (x̃ − f(x̃)/f′(x̃))|
                   = |h − f(x̃ + h)/f′(x̃ + h)|
                   ≈ |h − (f(x̃) + h f′(x̃))/f′(x̃ + h)|
                   ≈ |h| |1 − f′(x̃)/f′(x̃ + h)|.

Since h = x̃ + h − x̃, the contraction constant for Newton's method near the fixed point x̃ is |1 − f′(x̃)/f′(x̃ + h)| ≈ 0.
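The strength of this contraction is visible after only a few iterations. The snippet below is a minimal sketch, not part of the text; the concrete choice f(x) = x² − 2 and the helper names are ours.

```python
def newton(f, fprime, x0, n_iter=8):
    """Iterate T(x) = x - f(x)/f'(x) starting from the initial guess x0."""
    x = x0
    for _ in range(n_iter):
        x = x - f(x) / fprime(x)
    return x
```

Starting from x̂ = 1 the iterates converge quadratically to √2: three iterations already give about six correct digits, and eight reach machine precision.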


The intended take-away message from Example 3.1.2 is that in a sufficiently small neighborhood of a nondegenerate zero of f the associated Newton operator is an extremely strong contraction. In particular, returning to the question concerning the choice of an injective linear map A, a naive interpretation of this example suggests setting A = A(x) = (f′(x))⁻¹. There are several reasons why this choice is not appropriate. The first is that in general (f′(x))⁻¹ cannot be represented using a finite binary expansion. Hence, it does not have an exact floating point representation, and therefore a faithful expression for the associated map T on a computer becomes a nontrivial task.

Returning to the general problem of finding equilibria, let f : U → Rn be a C1 function defined on an open set U ⊂ Rn. In this case the Newton operator is given by

T(x) := x − (Df(x))⁻¹ f(x). (3.2)

An argument similar to that presented in Example 3.1.2 demonstrates that if f(x̃) = 0 and Df(x̃) is invertible, then in a small neighborhood of x̃, T is a contraction mapping with small contraction constant. Again, this suggests the choice of A(x) = (Df(x))⁻¹. However, the cost of computing the inverse of an n × n matrix is of order n³ and thus for high dimensional problems repeatedly computing the inverse is prohibitively expensive.

One final caveat on the choice of A arises from the fact that for most of this book, i.e., Chapters ?? onward, we are interested in applying these techniques to maps that are defined on infinite dimensional Banach spaces for which an explicit representation of (Df(x))⁻¹ is not possible.

Of course, as presented in (4.24) A need not equal (Df(x))⁻¹. Thus the approach we adopt is as follows. We assume that we are given an initial guess x̄ for a zero of f. In practice x̄ is obtained using a standard numerical method. We then use some, typically problem dependent, form of approximation of (Df(x̄))⁻¹ to choose A. This produces a function T. What remains is the challenge of proving that T is a contraction mapping.

    3.2 Radii Polynomial Approach in Finite Dimension

Consider f(x) = x² − 1. The associated Newton operator T : R \ {0} → R is given by T(x) = x − (2x)⁻¹(x² − 1). Since T has two distinct fixed points, T(±1) = ±1, T cannot be a contraction mapping over its entire domain. However, the analysis of Example 3.1.2 suggests that there exists a neighborhood U+ of 1 such that T : U+ → R defined by T(x) := x − ½(x² − 1) is a contraction mapping. The theorem below provides a mechanism for rigorously identifying a domain on which T is a contraction mapping.

Throughout this section we make use of the sup norm on Rn, i.e., given x = (x1, . . . , xn) ∈ Rn define

‖x‖∞ := max_{k=1,...,n} |xk|.

In this norm the closed ball of radius r centered at x is denoted by

Br(x) := {y ∈ Rn | ‖x − y‖∞ ≤ r}.


Theorem 3.2.1. Let U ⊂ Rn be an open set and let T = (T1, . . . , Tn) ∈ C1(U, Rn), where Tk : Rn → R. Let x̄ ∈ U. Assume that Y = (Y1, . . . , Yn) ∈ Rn and Z(r) = (Z1(r), . . . , Zn(r)) ∈ Rn provide the following bounds:

|Tk(x̄) − x̄k| ≤ Yk and sup_{b,c∈Br(0)} |DTk(x̄ + b)c| ≤ Zk(r) (3.3)

    for all k = 1, . . . , n. If ‖Y +Z(r)‖∞ < r, then T : Br (x̄)→ Br (x̄) is a contraction mappingwith contraction constant

    κ :=‖Z(r)‖∞

    r< 1.

In particular, there exists a unique x̃ ∈ Br(x̄) such that T(x̃) = x̃.

Proof. The mean value theorem applied to Tₖ implies that for any x, y ∈ Br(x̄) there exists z ∈ {tx + (1 − t)y | t ∈ [0, 1]} ⊂ Br(x̄) such that

Tₖ(x) − Tₖ(y) = DTₖ(z)(x − y).

Thus,

|Tₖ(x) − Tₖ(y)| = | DTₖ(z) r(x − y)/‖x − y‖∞ | · ‖x − y‖∞/r ≤ Zₖ(r) ‖x − y‖∞/r. (3.4)

Setting y = x̄ and noting that ‖x − x̄‖∞ ≤ r, (3.4) yields

|Tₖ(x) − Tₖ(x̄)| ≤ Zₖ(r).

By the triangle inequality

|Tₖ(x) − x̄ₖ| ≤ |Tₖ(x) − Tₖ(x̄)| + |Tₖ(x̄) − x̄ₖ| ≤ Zₖ(r) + Yₖ ≤ ‖Y + Z(r)‖∞ < r.

This proves that T(Br(x̄)) ⊆ Br(x̄). From (3.4), it follows that

‖T(x) − T(y)‖∞ ≤ ‖Z(r)‖∞ ‖x − y‖∞ / r.

By assumption ‖Z(r)‖∞ ≤ ‖Y + Z(r)‖∞ < r. Therefore T is a contraction on Br(x̄) with contraction constant κ = ‖Z(r)‖∞/r < 1, and hence, by the contraction mapping theorem, there exists a unique x̃ ∈ Br(x̄) such that T(x̃) = x̃.

Observe that Theorem 3.2.1 does not prescribe a specific value of r. In fact, to emphasize the freedom to choose r we introduce the following concept.

Definition 3.2.2. Given T ∈ C¹(U, Rⁿ), U ⊂ Rⁿ open, and vectors Y, Z(r) ∈ Rⁿ satisfying (3.3), the associated radii polynomials pₖ(r), k = 1, . . . , n, are given by

pₖ(r) := Yₖ + Zₖ(r) − r.


Using the radii polynomials we restate Theorem 3.2.1 in the form in which we make primary use of it.

Corollary 3.2.3. Let U ⊂ Rⁿ be open, f ∈ C¹(U, Rⁿ), and let A : Rⁿ → Rⁿ be an invertible linear map. Define T : U → Rⁿ by

T(x) := x − Af(x).

Let x̄ ∈ U, let Y, Z(r) ∈ Rⁿ satisfy (3.3), and let pₖ(r), k = 1, . . . , n, be the associated radii polynomials. If there exists r > 0 such that pₖ(r) < 0 for all k = 1, . . . , n, then there exists a unique x̃ ∈ Br(x̄) such that f(x̃) = 0.

Proof. Suppose that r > 0 is such that pₖ(r) < 0 for all k = 1, . . . , n. Hence,

‖Y + Z(r)‖∞ = max_{k=1,...,n} (Y + Z(r))ₖ < r.

From Theorem 3.2.1 there exists a unique x̃ ∈ Br(x̄) such that T(x̃) = x̃, and therefore, by Proposition 3.1.1, such that f(x̃) = 0.
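In computations, Corollary 3.2.3 reduces to checking finitely many scalar inequalities. A minimal sketch of such a check (our own code; the function name and the sample bounds Y, Z are hypothetical, and a floating-point check is not a rigorous proof, which requires interval arithmetic):

```python
# Check the radii polynomial inequalities p_k(r) = Y_k + Z_k(r) - r < 0
# componentwise at a given radius r.

def radii_polys_negative(Y, Z, r):
    """True if Y_k + Z_k(r) - r < 0 for every component k."""
    return all(Yk + Zk(r) - r < 0.0 for Yk, Zk in zip(Y, Z))

# Hypothetical one-dimensional bounds Y = 0.1 and Z(r) = 0.5 r^2:
Y = [0.1]
Z = [lambda r: 0.5 * r**2]

print(radii_polys_negative(Y, Z, 0.5))   # True: r lies in the existence interval
print(radii_polys_negative(Y, Z, 0.05))  # False: r is below the interval
```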

Remark 3.2.4. Observe that if there exists r > 0 such that pₖ(r) < 0 for all k = 1, . . . , n, then there exists a range of values I = (r₋, r₊) ⊂ [0, ∞) over which the inequalities are satisfied. Since x̃ is the unique zero of f in Br(x̄) for all r ∈ I, r₋ provides tight bounds on the location of x̃, while r₊ provides information about the domain of isolation of x̃. The maximal such interval is called the existence interval for the radii polynomials. Observe that this allows us to rephrase Corollary 3.2.3 as follows: if the existence interval for the radii polynomials is nonempty, then one can present an explicit domain in which there exists a unique zero of f.

The reader will note that the form of T presented in Corollary 3.2.3 plays no role in the proof. However, in practice the choice of A is heavily influenced by (Df(x̄))⁻¹. The following result is used to show that if f(x̃) = 0 and Df(x̃) is invertible, then the radii polynomial approach can be used to identify x̃, provided the matrix norm of A is not too large and the initial guess x̄ is a good enough approximation of x̃.

Given a matrix A, denote by ‖A‖∞ = max_{x ∈ B1(0)} ‖Ax‖∞ the matrix norm induced by the norm ‖ · ‖∞.

Theorem 3.2.5. Consider a C¹ map f : U → Rⁿ. Let x̄ ∈ U and A ∈ Mₙ(R). Assume there exist constants α, β, r₊ > 0 for which the following inequalities are satisfied:

α + 2β < 1 (3.5)
‖I − ADf(x̄)‖∞ < α (3.6)
‖Df(x̄ + b) − Df(x̄)‖∞ ‖A‖∞ < β, ∀ b ∈ Br₊(0) (3.7)
‖A‖∞ ‖f(x̄)‖∞ < βr₊. (3.8)


Then the existence interval for the radii polynomials associated with T(x) := x − Af(x) contains the non-empty interval

(β⁻¹‖A‖∞‖f(x̄)‖∞, r₊).

Proof. Let r₋ := β⁻¹‖A‖∞‖f(x̄)‖∞. By (3.8), r₋ < r₊. Thus it is sufficient to show that (r₋, r₊) is contained in the existence interval for the radii polynomials associated with T. For the remainder of the proof we assume r ∈ (r₋, r₊).

Define

Y∞ := ‖A‖∞‖f(x̄)‖∞ and Z∞(r) := 2βr.

Observe that

‖T(x̄) − x̄‖∞ = ‖Af(x̄)‖∞ ≤ ‖A‖∞‖f(x̄)‖∞ = βr₋ < βr.

Furthermore,

sup_{b,c ∈ Br(0)} ‖DT(x̄ + b)c‖∞ = sup_{b,c ∈ Br(0)} ‖[I − ADf(x̄ + b)]c‖∞
 = sup_{b,c ∈ Br(0)} ‖[I − ADf(x̄) + ADf(x̄) − ADf(x̄ + b)]c‖∞
 ≤ sup_{b,c ∈ Br(0)} ( ‖I − ADf(x̄)‖∞‖c‖∞ + ‖A(Df(x̄ + b) − Df(x̄))‖∞‖c‖∞ )
 ≤ sup_{b ∈ Br(0)} ( ‖I − ADf(x̄)‖∞ + ‖A‖∞‖Df(x̄ + b) − Df(x̄)‖∞ ) r
 ≤ ( α + ‖A‖∞ sup_{b ∈ Br(0)} ‖Df(x̄ + b) − Df(x̄)‖∞ ) r   (by (3.6))
 ≤ ( α + ‖A‖∞ sup_{b ∈ Br₊(0)} ‖Df(x̄ + b) − Df(x̄)‖∞ ) r   (since r < r₊)
 ≤ (α + β)r   (by (3.7)).

Hence,

‖T(x̄) − x̄‖∞ + sup_{b,c ∈ Br(0)} ‖DT(x̄ + b)c‖∞ < βr + (α + β)r < r,

where the last inequality follows from (3.5). Therefore, for all r ∈ (r₋, r₊) the radii polynomials for T are negative.


Remark 3.2.6. Theorem 3.2.5 provides considerable insight into the analytic issues associated with the radii polynomials. First, observe that by (3.5), α < 1. Therefore, (3.6) implies that A and Df(x̄) must be invertible matrices. Since A is invertible, if the hypotheses of Theorem 3.2.5 are satisfied, then by Corollary 3.2.3 there exists a unique solution x̃ ∈ Br₊(x̄) to f(x) = 0.

Observe that if x̄ = x̃, then f(x̄) = 0 and hence the existence interval contains (0, r₊). Furthermore, (3.8) is satisfied for all β, r₊ > 0. If in addition A = (Df(x̄))⁻¹, then I − ADf(x̄) = 0 and hence (3.6) is satisfied for all α > 0. The assumption that f ∈ C¹ implies that β can be chosen arbitrarily small by choosing r₊ sufficiently small. Therefore, given x̃ ∈ Rⁿ for which f(x̃) = 0 and Df(x̃) is invertible, it is, at least theoretically, always possible to use the radii polynomial approach to prove the existence of and provide bounds on the location of x̃; a sufficient condition is a good initial approximation x̄ ≈ x̃ and a good approximation of (Df(x̃))⁻¹.

However, Theorem 3.2.5 also indicates that this is not a necessary condition. The initial approximation x̄ may differ significantly from x̃ if ‖f(x̄)‖∞ and/or ‖A‖∞ is sufficiently small. In addition, as (3.6) indicates, A ≈ (Df(x̄))⁻¹ is not a necessary condition. Thus, the radii polynomials provide considerable freedom in terms of their application in concrete problems.

To demonstrate how the radii polynomials are used in practice we consider several simple examples.

Example 3.2.7 (The radii polynomial approach for a one-dimensional example). Consider the simplest nonlinear function f(x) = x² − 2. Following Corollary 3.2.3, the first step is to choose an initial guess x̄ for the zero of f. In a typical application x̄ is taken as the output of a standard numerical procedure for finding zeros of a function. The next step is to fix a value for A ∈ R. Example 3.1.2 suggests that A be chosen based on Df(x̄)⁻¹. We leave it general for now. To prove that

T(x) := x − A(x² − 2)

is a contraction mapping we need to determine bounds Y and Z(r) that satisfy (3.3). Consider any Y such that

|T(x̄) − x̄| = |A(x̄² − 2)| ≤ Y.

Obtaining Z(r) is slightly more complicated. It is convenient to write b, c ∈ Br(0) as b = ru and c = rv, where u, v ∈ B1(0). Using this notation

DT(x̄ + b)c = (1 − 2A(x̄ + b))c
 = (1 − 2A(x̄ + ru))rv
 = [(1 − 2x̄A)v]r + [−2Auv]r².


Thus

sup_{b,c ∈ Br(0)} |DT(x̄ + b)c| = sup_{u,v ∈ B1(0)} |[(1 − 2x̄A)v]r + [−2Auv]r²|
 ≤ sup_{u,v ∈ B1(0)} |(1 − 2x̄A)v| r + |[2Auv]r²|
 ≤ |1 − 2x̄A|r + 2|A|r²,

and hence we choose Z(r) := |1 − 2x̄A|r + 2|A|r². For the choice Y = |A(x̄² − 2)| and Z(r) as above, the associated radii polynomial is given by

p(r) = Y + Z(r) − r
 = |A(x̄² − 2)| + |1 − 2x̄A|r + 2|A|r² − r
 = 2|A|r² + (|1 − 2x̄A| − 1) r + |A(x̄² − 2)|.

    Let us now make some explicit choices.

(a) The best possible approximations

Let x̄ = √2 (this is an exact solution) and A = Df(x̄)⁻¹ = 1/(2x̄) (the exact inverse). In this case, the radii polynomial is p(r) = 2|A|r² + (|1 − 2x̄A| − 1) r + |A(x̄² − 2)| = (√2/2)r² − r. Then the existence interval for the radii polynomial is given by I = (0, √2).

(b) Not the best approximations

For the purpose of applications it is not practical to assume that x̄ is the exact solution, nor is it reasonable to assume that A = Df(x̄)⁻¹. However, the proposed approach works well even with coarse approximations.

Choose x̄ = 1.3. Since Df(x̄)⁻¹ = (2x̄)⁻¹ ≈ 0.384615, we set the approximate inverse to be A = 0.38. To prove that

T(x) := x − 0.38(x² − 2)

is a contraction mapping we need to determine bounds Y and Z(r) that satisfy (3.3). To obtain Y observe that

|T(x̄) − x̄| = |−0.38(1.3² − 2)| = 0.1178.

Since we only need a bound we choose Y := 0.12. As above, we have Z(r) = |1 − 2x̄A|r + 2|A|r² = 0.012r + 0.76r².

For this choice of Y and Z(r) the associated radii polynomial is given by

p(r) = Y + Z(r) − r = 0.12 + 0.012r + 0.76r² − r = 0.76r² − 0.988r + 0.12.


Figure 3.1: The radii polynomial p(r) = 0.76r² − 0.988r + 0.12 of Example 3.2.7 and I = [0.136, 1.164], where the radii polynomial is strictly negative.

See Figure 3.1 for a geometrical interpretation of the radii polynomial p(r). Using the quadratic formula we see that p(r) < 0 for all r ∈ I = [0.136, 1.164]. By Corollary 3.2.3 we can conclude that x̃ ∈ B0.136(1.3) = [1.164, 1.436]. Given that the actual root contained in this interval is √2 ≈ 1.414, we see that the relative error for this bound is

(1.414 − 1.164)/1.414 ≈ 0.1768.

We can also conclude that there is a unique root in the interval x̃ ∈ B1.164(1.3) = [0.136, 2.464]. Given that −√2 ≈ −1.414 is also a root, this is a reasonably accurate statement.
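The arithmetic in part (b) is easy to reproduce. A short sketch (ours, using floating-point rather than the rigorous interval arithmetic a proof would require):

```python
import math

# Bounds from x_bar = 1.3 and the approximate inverse A = 0.38 for f(x) = x^2 - 2.
x_bar, A = 1.3, 0.38

Y = abs(A * (x_bar**2 - 2.0))      # 0.1178, rounded up to 0.12 in the text
z1 = abs(1.0 - 2.0 * x_bar * A)    # linear coefficient of Z(r): 0.012
z2 = 2.0 * A                       # quadratic coefficient of Z(r): 0.76

# Roots of p(r) = z2 r^2 + (z1 - 1) r + 0.12 via the quadratic formula.
a, b, c = z2, z1 - 1.0, 0.12
disc = math.sqrt(b * b - 4.0 * a * c)
r_minus = (-b - disc) / (2.0 * a)
r_plus = (-b + disc) / (2.0 * a)

print(round(r_minus, 3), round(r_plus, 3))  # 0.136 1.164
```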

It is reasonable to ask about optimal results involving the radii polynomial approach. For example, over how large a domain can one hope to show the unique existence of a root? Let x̄ = x̃ = √2 and let A = Df(x̄)⁻¹ = 1/(2√2). As is discussed in Remark 3.2.6, this choice of x̄ and A essentially allows us to assume that α = 0 and hence, by (3.5), that β = 1/2. Thus the only constraint remaining for Theorem 3.2.5 is (3.7), which for this particular example gives rise to

‖Df(x̄ + b) − Df(x̄)‖∞‖A‖∞ = |2(√2 + b) − 2√2| · 1/(2√2) = |b|/√2 < 1/2 = β.

Thus r₊ = √2/2.


Therefore, by Theorem 3.2.5 we can conclude that there is a unique root in the interval

(√2/2, 3√2/2) ≈ (0.707, 2.121) ⊂ [0.136, 2.464],

where the last interval comes from the computations at the beginning of this example. This demonstrates that the bounds presented in the hypothesis of Theorem 3.2.5 are sufficient, but far from necessary.

Figure 3.2: For different values of r > 0, we can have different scenarios. On the left, the radius r is too small and p(r) ≥ 0. Hence the method based on the radii polynomials cannot conclude about the existence of a unique solution x∗ ∈ Br(x̄). In the context of Example 3.2.7, that would correspond for instance to r = 0.1 ∉ I = [0.136, 1.164]. In the middle, the radius r ∈ I is chosen so that p(r) < 0, which implies that the ball Br(x̄) contains a unique solution. In the context of Example 3.2.7, that would correspond for instance to r = 0.15 ∈ I. On the right, the radius r is too large and p(r) ≥ 0. Hence, it could happen that there exists more than one solution in the ball Br(x̄), that is, there exist x∗, x∗₁ ∈ Br(x̄), both solutions of f = 0. In the context of Example 3.2.7, that would correspond for instance to r = 3, where x∗ = √2 and x∗₁ = −√2.

Example 3.2.7 demonstrates how the radii polynomials can be employed. It also makes clear that the choice of A plays a central role in the values of Y and Z(r), which in turn determine whether the radii polynomial inequalities can be satisfied. With this in mind we return to this example to gain some insight into how broad a range of choices of A is possible.

Example 3.2.8. As a slight generalization of Example 3.2.7, let f(x) = x² − λ. Recall that the philosophy of the approach we are taking is that we begin with a guess x̄ of the location of a zero of f and from that attempt to prove that a zero exists. Using Newton's method as a guide we set A = (f′(x̄))⁻¹ = (2x̄)⁻¹ and hence

T(x) = x − (x² − λ)/(2x̄).

Applying the same analysis as in Example 3.2.7, set

Y = |x̄² − λ|/(2x̄) = |T(x̄) − x̄|

and

Z(r) = r²/x̄ = sup_{u,v ∈ B1(0)} |−r²uv|/x̄ = sup_{b,c ∈ Br(0)} |DT(x̄ + b)c|.


Thus, after multiplying through by x̄ > 0, the radii polynomial inequality takes the form

p(r) = r² − x̄r + |x̄² − λ|/2 < 0.

This in turn implies that the interval of convergence I = (r₋, r₊) for λ ≥ 0 is given by

r± = ( x̄ ± √(x̄² − 2|x̄² − λ|) ) / 2.

This in turn tells us that if

x̄ ∈ ( √(2λ/3), √(2λ) ),

then the radii polynomials will provide a positive answer to the question of existence of a zero of f.
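The radicand x̄² − 2|x̄² − λ| is positive exactly when the guess x̄ lies in this interval, which is easy to check numerically. A small sketch (ours; the choice λ = 2 is only illustrative):

```python
# For f(x) = x^2 - lambda with A = (2 x_bar)^{-1}, the radii polynomial has two
# positive roots exactly when x_bar^2 - 2|x_bar^2 - lambda| > 0.
lam = 2.0

def radicand(x_bar):
    return x_bar**2 - 2.0 * abs(x_bar**2 - lam)

# sqrt(2*lam/3) ~ 1.155 and sqrt(2*lam) = 2.0 bracket the admissible guesses.
print(radicand(1.5) > 0)   # True: 1.5 lies inside the interval
print(radicand(1.0) > 0)   # False: 1.0 is below sqrt(2*lam/3)
print(radicand(2.5) > 0)   # False: 2.5 is above sqrt(2*lam)
```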

Example 3.2.9 (The radii polynomial approach for a two-dimensional example). Consider the problem of looking for equilibria of

ẋ₁ = x₂ + 4x₁² − λ
ẋ₂ = x₁ + x₂² − 1, (3.9)

where λ ∈ R is a parameter. An equilibrium solution x = (x₁, x₂) is a solution of f(x) = 0, where f : R² → R² is given by the right-hand side of (3.9). Depending on the parameter value λ ∈ R, there are up to four real solutions.

Given an initial guess x̄,

Df(x̄) = [ 8x̄₁  1 ; 1  2x̄₂ ],

and the exact formula for the inverse is

Df(x̄)⁻¹ = 1/(16x̄₁x̄₂ − 1) [ 2x̄₂  −1 ; −1  8x̄₁ ].

Set A := Df(x̄)⁻¹, and let T(x) := x − Af(x). To apply the radii polynomial approach we compute the bounds Y, Z(r) ∈ R² satisfying (3.3).

To obtain Y, observe that

T(x̄) − x̄ = −Af(x̄) = −Df(x̄)⁻¹f(x̄) = −1/(16x̄₁x̄₂ − 1) [ 2x̄₂  −1 ; −1  8x̄₁ ] [ x̄₂ + 4x̄₁² − λ ; x̄₁ + x̄₂² − 1 ].

Using this expression, we can choose Yₖ such that |[T(x̄) − x̄]ₖ| ≤ Yₖ, for k = 1, 2. The next step is to determine Zₖ(r) such that

sup_{b,c ∈ Br(0)} |DTₖ(x̄ + b)c| ≤ Zₖ(r), k = 1, 2.


In order to simplify the computation of Zₖ we rescale the variables b and c. For r > 0, let b̃ := b/r and c̃ := c/r, in which case the desired bounds become

sup_{b̃,c̃ ∈ B1(0)} |DTₖ(x̄ + b̃r)c̃| r ≤ Zₖ(r), k = 1, 2.

To improve the estimates we consider the following splitting:

DT(x̄ + b̃r)c̃r = (I − ADf(x̄ + b̃r)) c̃r
 = (I − ADf(x̄)) c̃r − A(Df(x̄ + b̃r) − Df(x̄)) c̃r
 = −A(Df(x̄ + b̃r) − Df(x̄)) c̃r,

where the last equality follows from the choice A = Df(x̄)⁻¹. It is important to realize that in general this is not the case, but it does suggest why, in more complicated examples, it is useful to be able to choose A ≈ Df(x̄)⁻¹.

To bound the second term in the splitting we note that

(Df(x̄ + b̃r) − Df(x̄)) c̃r = ( [ 8x̄₁ + 8b̃₁r  1 ; 1  2x̄₂ + 2b̃₂r ] − [ 8x̄₁  1 ; 1  2x̄₂ ] ) c̃r = [ 8b̃₁c̃₁ ; 2b̃₂c̃₂ ] r²

and hence

A(Df(x̄ + b̃r) − Df(x̄)) c̃r = 1/(16x̄₁x̄₂ − 1) [ 2x̄₂  −1 ; −1  8x̄₁ ] [ 8b̃₁c̃₁ ; 2b̃₂c̃₂ ] r².

By definition, b̃, c̃ ∈ B1(0) implies that |b̃₁|, |c̃₁|, |b̃₂|, |c̃₂| ≤ 1, thus

|( A(Df(x̄ + b̃r) − Df(x̄)) c̃r )₁| ≤ (16|x̄₂| + 2)/|16x̄₁x̄₂ − 1| r²

and

|( A(Df(x̄ + b̃r) − Df(x̄)) c̃r )₂| ≤ (16|x̄₁| + 8)/|16x̄₁x̄₂ − 1| r².

Set

Z₁(r) := (16|x̄₂| + 2)/|16x̄₁x̄₂ − 1| r² and Z₂(r) := (16|x̄₁| + 8)/|16x̄₁x̄₂ − 1| r².

Finally, the two radii polynomials are defined by

p₁(r) := (16|x̄₂| + 2)/|16x̄₁x̄₂ − 1| r² − r + Y₁ and p₂(r) := (16|x̄₁| + 8)/|16x̄₁x̄₂ − 1| r² − r + Y₂. (3.10)


Observe that the definition of the radii polynomials (3.10) is the same for any approximate solution x̄ ∈ R². This means that we have derived explicit formulas for the radii polynomials that can be applied to any initial approximation x̄. If the initial approximation is reasonable, then we expect to be able to prove the existence of a true solution x̃. As an example, set λ = 3 and, using some numerical scheme, e.g., Newton's method, find x̄ = (x̄₁, x̄₂) ∈ R² such that ‖f(x̄)‖ < tol for some fixed tolerance tol. We chose tol = 10⁻¹⁵ and computed four approximate solutions with Newton's method, obtaining

x̄(1) = (−0.6545436118927946, 1.286290640521338), x̄(2) = (0.7986333753610425, 0.4487389270377125),
x̄(3) = (0.9086121587039679, −0.3023042197787387), x̄(4) = (−1.052701922172216, −1.432725347780312). (3.11)

For each i = 1, 2, 3, 4, the following intervals I(i) are contained in the existence intervals for the associated radii polynomials:

I(1) = [1.608556563336234 × 10⁻¹⁶, 0.6402146110280825]
I(2) = [6.468926557835299 × 10⁻¹⁶, 0.2276100489022837]
I(3) = [1.218510871878330 × 10⁻¹⁵, 0.2391290678185533]
I(4) = [4.855265252055830 × 10⁻¹⁶, 0.9271769229945959]. (3.12)

Figure 3.3 shows the largest enclosures for each equilibrium of (3.9) for λ = 3. For each i = 1, 2, 3, 4, the radius around x̄(i) is the largest value of I(i).
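The whole pipeline of this example can be sketched in a few lines. The code below is ours, not the authors': it runs a floating-point Newton iteration on (3.9) for λ = 3 from a guess near x̄(2), then evaluates the radii polynomials (3.10) at a sample radius; a rigorous proof would evaluate the same quantities with interval arithmetic.

```python
lam = 3.0

def f(x1, x2):
    """Right-hand side of (3.9)."""
    return (x2 + 4.0 * x1**2 - lam, x1 + x2**2 - 1.0)

def newton_step(x1, x2):
    """x <- x - Df(x)^{-1} f(x), using the explicit 2x2 inverse."""
    f1, f2 = f(x1, x2)
    det = 16.0 * x1 * x2 - 1.0
    return (x1 - (2.0 * x2 * f1 - f2) / det,
            x2 - (-f1 + 8.0 * x1 * f2) / det)

x1, x2 = 0.8, 0.45                # rough guess near the second equilibrium
for _ in range(30):
    x1, x2 = newton_step(x1, x2)

# Evaluate the radii polynomials (3.10) at r = 0.1, with Y_k = |[Df(x)^{-1} f(x)]_k|.
f1, f2 = f(x1, x2)
det = abs(16.0 * x1 * x2 - 1.0)
Y1 = abs(2.0 * x2 * f1 - f2) / det
Y2 = abs(-f1 + 8.0 * x1 * f2) / det
r = 0.1
p1 = (16.0 * abs(x2) + 2.0) / det * r**2 - r + Y1
p2 = (16.0 * abs(x1) + 8.0) / det * r**2 - r + Y2
print(p1 < 0.0 and p2 < 0.0)  # True: r = 0.1 lies in the existence interval
```

Note that r = 0.1 falls inside I(2) from (3.12), consistent with both polynomials being negative.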

In Example 3.2.9, we defined A as the exact inverse of Df(x̄). For general finite dimensional problems this is not always possible. In fact, as the dimension of the problem grows, this becomes difficult. For infinite dimensional problems, obtaining an exact inverse is almost impossible. However, as the following example demonstrates, having the exact inverse is not necessary.

Example 3.2.10. The Lorenz system is given by

ẋ₁ = σ(x₂ − x₁)
ẋ₂ = ρx₁ − x₂ − x₁x₃
ẋ₃ = −βx₃ + x₁x₂.   (3.13)

For any β > 0 and ρ > 1, the set of equilibria of (3.13) is given by

{ (0, 0, 0), ( ±√(β(ρ − 1)), ±√(β(ρ − 1)), ρ − 1 ) },

which is obtained by solving f(x) = 0, where f : R³ → R³ is given by the right-hand side of (3.13). At the classical parameter values, σ = 10, ρ = 28 and β = 8/3,

(√72, √72, 27)
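The equilibrium formula is easy to sanity-check numerically (our own snippet, not from the text):

```python
import math

# Verify that (sqrt(72), sqrt(72), 27) annihilates the right-hand side of the
# Lorenz system (3.13) at the classical parameters sigma = 10, rho = 28, beta = 8/3.
sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0
x1 = x2 = math.sqrt(beta * (rho - 1.0))  # sqrt(72)
x3 = rho - 1.0                           # 27

f = (sigma * (x2 - x1),
     rho * x1 - x2 - x1 * x3,
     -beta * x3 + x1 * x2)
print(all(abs(v) < 1e-12 for v in f))  # True, up to floating-point roundoff
```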


Figure 3.3: Largest existence and uniqueness enclosures for each equilibrium of (3.9) for λ = 3. For each i = 1, 2, 3, 4, the radius around x̄(i) is the largest value of I(i). The quantities x̄(i) and I(i) are found in (3.11) and (3.12) respectively. The smallest enclosure is too small to represent, which imp