
Lecture Notes

Introduction to Mathematical Analysis 1

Gerald Trutnau

Department of Mathematical Sciences

Seoul National University

Uncorrected Version: March 20, 2020

These notes are ©2020 by Gerald Trutnau.

They may be used for personal or classroom purposes,

but not for commercial purposes.

Contents

Chapter 0. Introduction
0.1. Sets and functions (Prerequisites)
0.2. Additional examples

Chapter 1. The real line and the Euclidean space
1.1. Ordered fields and cardinality
1.2. Completeness and the real number system
1.3. Least upper bounds
1.4. Cauchy sequences and cluster points
1.5. The extended real number system, limit superior and inferior
1.6. Euclidean space
1.7. Norms, inner products and metrics
1.8. Construction of the real numbers
1.9. Section 1 Additional examples

Chapter 2. The topology of Euclidean space (and metric spaces)
2.1. Open sets
2.2. The interior of a set
2.3. Closed sets
2.4. Accumulation points
2.5. The closure of a set
2.6. The boundary of a set
2.7. Sequences
2.8. Completeness
2.9. Series of real numbers and vectors
2.10. Section 2 Additional Examples

Chapter 3. Compact and connected sets
3.1. Compactness
3.2. The Heine-Borel Theorem
3.3. The nested set property
3.4. Path-connected sets
3.5. Connected sets
3.6. Section 3 Additional Examples

Chapter 4. Continuous mappings
4.1. Continuity
4.2. Images of compact and connected sets
4.3. Operations on continuous mappings
4.4. Boundedness of continuous functions on compact sets
4.5. The intermediate value theorem
4.6. Uniform continuity
4.7. Differentiation of functions of one variable
4.8. The Riemann-Stieltjes Integral
4.9. Functions of bounded variation

Bibliography

CHAPTER 0

Introduction

0.1. Sets and functions (Prerequisites)

The naive approach to sets, in particular forming sets without any restrictions, leads to inconsistencies, e.g.

Cantor’s paradox: “the set of all sets” M = {X | X is a set}. Every subset of M is a set and hence an element of M, so

P(M) ⊂ M ↯ (see Theorem 1.1.7)

(↯ means “a contradiction”)

Here P(M) = {X | X ⊂ M} is the so-called power set of M.

Russell’s paradox: “the set of all sets that are not members of themselves”

M = {X | X ∉ X}

Then M ∈ M ⇐⇒ M ∉ M (for instance, “the barber shaves all people who do not shave themselves”).

These are not sets! → axiomatic set theories (still an active research area)

A rigorous and consistent definition of the notion “set” is a complicated subject and involves elements of logic (see for instance the Appendix of John L. Kelley, General Topology, Springer). Its study would lead us too far away. Perhaps the “best non-rigorous” definition is Georg Cantor’s (1845-1918) “definition”:

By a “set” we mean any collection M into a whole of definite, distinct objects m of our perception or of our thought, which are called the “elements” of M.

In short: “Any definable collection is a set”. This is not contradictory in itself, but it is too vague and so not adequate as a rigorous definition (see the paradoxes above). In this course, we presume to know intuitively what a set is!

0.1.1. Example. (i) N := {1, 2, 3, . . . } is called the set of positive integers or the set of natural numbers.

(ii) Z := {0, ±1, ±2, ±3, . . . } is called the set of all integers.

(iii) Q := {p/q | p ∈ Z, q ∈ N} is called the set of rational numbers.

Here “:=” means that the left hand side is defined through the right hand side.

Notation: Let S be a set.

x ∈ S means “x is an element of the set S”,
x ∉ S means “x is not an element of S”,
A ⊂ S means “A is a subset of S”, i.e. every element of A is an element of S. Symbolically: x ∈ A =⇒ x ∈ S (“=⇒” reads “implies”).

∅ is called the empty set, i.e. the set that contains no elements.

For two sets A, B:

A = B means A ⊂ B and B ⊂ A,
for x, y ∈ A: x = y means {x} ⊂ {y} and {y} ⊂ {x}.

Generating principle for subsets: let S be a set and E be a property that concerns the elements of S. Then

{x ∈ S | E(x)} := the set of all elements in S that have the property E.

Example: A = {x ∈ N | x is even } = {2, 4, 6, 8, . . .} ⊂ N.

Basic operations on sets. Let A, B ⊂ S.

A ∪ B := {x ∈ S | x ∈ A or x ∈ B}
is called the union of A and B. For A_1, A_2, A_3, . . . ⊂ S,

⋃_{i∈N} A_i := {x ∈ S | x ∈ A_i for some i ∈ N}

is called the union of the family {A_i}_{i∈N}.

A ∩ B := {x ∈ S | x ∈ A and x ∈ B}
is called the intersection of A and B, and

⋂_{i∈N} A_i := {x ∈ S | x ∈ A_i for all i ∈ N}

is called the intersection of the family {A_i}_{i∈N}.

B \ A := {x ∈ B | x ∉ A}
is called the complement of A relative to B or B without A. In particular,

A^c := S \ A
is called the complement of A in S.

The Cartesian product of two sets M and N is defined as

M × N := {(x, y) | x ∈ M, y ∈ N}
:= the set of all ordered pairs (x, y) with x ∈ M and y ∈ N,

where the ordered pair (x, y) is defined by

(x, y) := {{x}, {x, y}}.

Then

(x, y) = (x′, y′) ⇐⇒ x = x′ and y = y′   (“⇐⇒” reads “equivalent”).
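The set operations and the set-theoretic definition of the ordered pair can be tried out on small finite sets. The following is a minimal sketch in Python; the concrete sets S, A, B and the helper name `pair` are illustrative assumptions, not part of the notes.

```python
from itertools import product

S = set(range(1, 11))                  # ambient set S (illustrative choice)
A = {x for x in S if x % 2 == 0}       # {x in S | x is even}
B = {1, 2, 3, 4}

union = A | B                          # A ∪ B
intersection = A & B                   # A ∩ B
B_without_A = B - A                    # B \ A
A_complement = S - A                   # A^c = S \ A


def pair(x, y):
    """Ordered pair (x, y) := {{x}, {x, y}}, built from frozensets."""
    return frozenset({frozenset({x}), frozenset({x, y})})


# (x, y) = (x', y')  <=>  x = x' and y = y', checked on all 4-tuples from a small set
values = [0, 1, 2]
pairs_agree = all(
    (pair(x, y) == pair(u, v)) == (x == u and y == v)
    for x, y, u, v in product(values, repeat=4)
)
```

Note that `pair(x, x)` collapses to {{x}}, which is consistent with the definition and still distinguishes (x, x) from every other pair.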

Basics on functions: Let S, T be given sets.

A function (also called a map or mapping) f from S to T

f : S −→ T

is a rule that assigns to each x ∈ S a unique y ∈ T, denoted f(x). The rule is symbolically represented by x ↦ f(x).

Example: For S, T ⊂ Z and f(x) = x^2 := x · x one writes x ↦ x^2.

S is called the domain of f, T is called the target of f.

f(S) := {y ∈ T | ∃x ∈ S, f(x) = y} = {f(x) ∈ T | x ∈ S}
is called the range or the image of f.

f is injective or one-to-one or an injection

:⇐⇒ (x_1, x_2 ∈ S and x_1 ≠ x_2 =⇒ f(x_1) ≠ f(x_2))   (“:⇐⇒” reads “equivalence by definition”)

⇐⇒ (x_1, x_2 ∈ S and f(x_1) = f(x_2) =⇒ x_1 = x_2)   (equivalence by contraposition: A ⇒ B ⇐⇒ ¬B ⇒ ¬A)

f is surjective or onto or a surjection

:⇐⇒ ∀y ∈ T ∃x ∈ S : f(x) = y

(“for all y ∈ T there is some x ∈ S with f(x) = y”).

f is bijective or a bijection :⇐⇒ f is one-to-one and onto

Example: f : S −→ f(S) is always surjective. If S ⊂ T, then

f : S −→ T, x ↦ f(x) := x,

is injective, and

g : S −→ S, x ↦ g(x) := x

is bijective. g is called the identity map on S and is denoted by id_S, i.e. id_S := g.

For f : S −→ T, and A ⊂ S, B ⊂ T,

f(A) := {f(x) ∈ T | x ∈ A}

is called the image of A under f and

f^{-1}(B) := {x ∈ S | f(x) ∈ B}

is called the inverse image or the preimage of B under f.

Let f : S → T be a bijection. Then

∀y ∈ T ∃! x ∈ S : f(x) = y   (“∃!” reads “there exists a unique”).

Thus one can define its inverse function

f^{-1} : T −→ S, y ↦ f^{-1}(y) := x, where x is the unique element of S such that f(x) = y.

Note the slight ambiguity in notation: the preimage f^{-1}(B) exists for any function f, i.e. it is defined even if f is not invertible (has no inverse function).

The inverse function f^{-1} is again a bijection. Indeed, for any x ∈ S, there is y = f(x) ∈ T with f^{-1}(y) = x, hence f^{-1} is surjective. Given y, ỹ ∈ T with f^{-1}(y) = f^{-1}(ỹ), there exist x, x̃ ∈ S with f(x) = y, f(x̃) = ỹ, hence x = f^{-1}(y) = f^{-1}(ỹ) = x̃ and so y = f(x) = f(x̃) = ỹ. Therefore f^{-1} is injective.

If f : S −→ T and g : T −→ U are functions, then

g ◦ f : S −→ U

x ↦ g ◦ f(x) := g(f(x))   (read: g “circle” f, or g “round” f)

is called the composition of g with f. As a diagram:

S −f→ T −g→ U,   S −(g◦f)→ U.

Example: f : S −→ T is bijective =⇒ f^{-1} ◦ f = id_S and f ◦ f^{-1} = id_T.

Let f : S −→ T, A ⊂ S. The restriction of f to A is defined as

f|_A : A −→ T, f|_A(x) := f(x), x ∈ A.

f is then an extension of f|_A to S.
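For finite sets, injectivity, surjectivity, inverses and compositions can be checked mechanically. The sketch below represents a function S → T as a Python dictionary; the helper names (`is_injective`, `is_surjective`, `inverse`, `compose`) and the sample sets are assumptions made for illustration only.

```python
def is_injective(f):
    """f(x1) = f(x2) implies x1 = x2, i.e. no value is hit twice."""
    values = list(f.values())
    return len(values) == len(set(values))


def is_surjective(f, target):
    """Every y in the target is f(x) for some x (assumes f maps into target)."""
    return set(f.values()) == set(target)


def inverse(f):
    """Inverse of a bijection given as a dict: y -> the unique x with f(x) = y."""
    return {y: x for x, y in f.items()}


def compose(g, f):
    """g ∘ f as a dict: x -> g(f(x))."""
    return {x: g[f[x]] for x in f}


S = {1, 2, 3}
T = {"a", "b", "c"}
f = {1: "b", 2: "c", 3: "a"}           # a bijection S -> T
f_inv = inverse(f)
id_S = {x: x for x in S}
id_T = {y: y for y in T}
```

With these definitions, `compose(f_inv, f)` reproduces id_S and `compose(f, f_inv)` reproduces id_T, matching the example above.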

Cardinality (notion of size for sets): Let A, B be sets.

A is finite :⇐⇒ either A = ∅, or ∃n ∈ N with A = {a_1, . . . , a_n}

A is infinite :⇐⇒ A is not finite

A and B have the same cardinality :⇐⇒ ∃ f : A −→ B, f bijection.

If A has the same cardinality as N, then A is called denumerable.

If A is either finite or denumerable, then A is called countable.

uncountable := not countable

Note: If f : A −→ B and g : B −→ C are bijections, then g ◦ f : A −→ C is a bijection.

Proof. a) Show that g ◦ f is injective. Let x, y ∈ A. Suppose g(f(x)) = g(f(y)). Then f(x) = f(y) since g is injective, and f(x) = f(y) implies x = y since f is injective. Thus g ◦ f is injective.

b) Show that g ◦ f is surjective. Let z ∈ C be arbitrary. Since g is surjective, ∃y ∈ B with g(y) = z, and since f is surjective, ∃x ∈ A with f(x) = y. Therefore

(g ◦ f)(x) = g(f(x)) = g(y) = z,

and g ◦ f : A −→ C is surjective. □

Therefore,

A ∼ B :⇐⇒ A and B have same cardinality,

defines an equivalence relation ∼ on the class of all sets, i.e. for arbitrary sets A, B, C, we have

(1) A ∼ A (reflexivity).
(2) A ∼ B ⇐⇒ B ∼ A (symmetry).
(3) A ∼ B and B ∼ C ⇒ A ∼ C (transitivity).

Indeed, (1) holds since id_A is a bijection, (2) holds since the inverse of a bijection is a bijection, and (3) follows from the note above.

For a set A, we can then define its equivalence class

|A| := {C | C is a set such that C ∼ A}.

Obviously, A ∼ B ⇐⇒ |A| = |B| and the class of all sets splits into disjoint equivalence classes. |A| is also called the cardinal number of the set A. For finite sets A the cardinal number |A| can be identified with the number of elements in A. For infinite sets and more on cardinal numbers see Remark 1.1.8 below.

Sequences: Let S ≠ ∅. A sequence in S is a map f : N → S.

Common notation: (f(n))_{n∈N} or (x_n)_{n∈N}.

(y_n)_{n∈N} is a subsequence of (x_n)_{n∈N}, if

y_k = x_{n_k}, ∀k ∈ N,

where

n : N −→ N, k ↦ n_k := n(k)

satisfies n_k < n_l ∀k < l.

0.2. Additional examples

1) Let Ai ⊂ S, i ∈ N. Then

(i) (⋃_{i∈N} A_i)^c = ⋂_{i∈N} A_i^c

(ii) (⋂_{i∈N} A_i)^c = ⋃_{i∈N} A_i^c

Proof. (i)

(⋃_{i∈N} A_i)^c = S \ ⋃_{i∈N} A_i
= {x ∈ S | x ∉ ⋃_{i∈N} A_i}
= {x ∈ S | x ∉ A_i for all i ∈ N}
= {x ∈ S | x ∈ ⋂_{i∈N} A_i^c} = ⋂_{i∈N} A_i^c.

(ii)

(⋂_{i∈N} A_i)^c = S \ ⋂_{i∈N} A_i
= {x ∈ S | x ∉ ⋂_{i∈N} A_i}
= {x ∈ S | x ∈ A_i^c for some i ∈ N}
= {x ∈ S | x ∈ ⋃_{i∈N} A_i^c} = ⋃_{i∈N} A_i^c.

□

2) For A,B,C ⊂ S, we have

A ∩ (B ∪ C) = (A ∩B) ∪ (A ∩ C).

Proof.

x ∈ A ∩ (B ∪ C) =⇒ x ∈ A and x ∈ B ∪ C
=⇒ x ∈ A ∩ B or x ∈ A ∩ C
=⇒ x ∈ (A ∩ B) ∪ (A ∩ C).

This shows A ∩ (B ∪ C) ⊂ (A ∩ B) ∪ (A ∩ C). Now

x ∈ (A ∩ B) ∪ (A ∩ C) =⇒ x ∈ A ∩ B or x ∈ A ∩ C
=⇒ (x ∈ A and x ∈ B) or (x ∈ A and x ∈ C)
=⇒ x ∈ A and (x ∈ B or x ∈ C)
=⇒ x ∈ A ∩ (B ∪ C),

which shows the reverse inclusion.

□


3) A \ (B ∪ C) = (A \ B) ∩ (A \ C).

Proof.

A \ (B ∪ C) = A ∩ (B ∪ C)^c
= A ∩ (B^c ∩ C^c)   (by 1)(i))
= (A ∩ B^c) ∩ C^c
= (A ∩ B^c) ∩ (A ∩ C^c) = (A \ B) ∩ (A \ C).

□

4) A \ (B ∩ C) = (A \ B) ∪ (A \ C).

Proof.

A \ (B ∩ C) = A ∩ (B ∩ C)^c
= A ∩ (B^c ∪ C^c)   (by 1)(ii))
= (A ∩ B^c) ∪ (A ∩ C^c)   (by 2))
= (A \ B) ∪ (A \ C).

□
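The identities in 1) to 4) can be sanity-checked by brute force on a small ambient set. The sketch below only covers the case of two or three sets (not countable families), and the choice S = {0, 1, 2, 3} as well as the helper names are assumptions for illustration; such a check does not replace the proofs.

```python
from itertools import combinations, product

S = frozenset(range(4))                # small ambient set (illustrative)


def subsets(s):
    """All subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def complement(a):
    """A^c := S \\ A."""
    return S - a


all_subsets = subsets(S)

# 1) De Morgan's laws for two sets
de_morgan_ok = all(
    complement(a | b) == complement(a) & complement(b)
    and complement(a & b) == complement(a) | complement(b)
    for a, b in product(all_subsets, repeat=2)
)

# 2) distributive law, 3) and 4) laws for set differences
difference_laws_ok = all(
    a & (b | c) == (a & b) | (a & c)
    and a - (b | c) == (a - b) & (a - c)
    and a - (b & c) == (a - b) | (a - c)
    for a, b, c in product(all_subsets, repeat=3)
)
```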

5) Let M,N,K,L ⊂ S:

(i) (M × N) ∩ (K × L) = (M ∩ K) × (N ∩ L).
(ii) (M × N) ∪ (K × L) = ?, which can be simplified if M = K or N = L:

(a) (M × N) ∪ (M × L) = M × (N ∪ L).
(b) (M × N) ∪ (K × N) = (M ∪ K) × N.

Proof. Omitted (draw a picture).

□

6) Let f : S → T be a map and Ai ⊂ S, Bi ⊂ T , i ∈ N. Then

(i) f^{-1}(⋃_{i∈N} B_i) = ⋃_{i∈N} f^{-1}(B_i).

(ii) f^{-1}(⋂_{i∈N} B_i) = ⋂_{i∈N} f^{-1}(B_i).

(iii) f^{-1}(T \ B_1) = S \ f^{-1}(B_1).

(iv) f^{-1}(B_2 \ B_1) = f^{-1}(B_2) \ f^{-1}(B_1).

(v) f(⋃_{i∈N} A_i) = ⋃_{i∈N} f(A_i).

(vi) f(⋂_{i∈N} A_i) ⊂ ⋂_{i∈N} f(A_i).

(vii) A_1 ⊂ f^{-1}(f(A_1)).

(viii) f(f^{-1}(B_1)) ⊂ B_1.

Note: equality holds in (vi) and (vii) if f is injective, and equality holds in (viii) if f is surjective (see proof and exercises).

Proof. (i) We have

f^{-1}(⋃_{i∈N} B_i) = {x ∈ S | f(x) ∈ ⋃_{i∈N} B_i} = {x ∈ S | f(x) ∈ B_i for some i ∈ N}

and

⋃_{i∈N} f^{-1}(B_i) = ⋃_{i∈N} {x ∈ S | f(x) ∈ B_i} = {x ∈ S | f(x) ∈ B_i for some i ∈ N}.

(ii) The proof is similar to (i).

(iii) We have

f^{-1}(T \ B_1) = {x ∈ S | f(x) ∈ T \ B_1}

and

S \ f^{-1}(B_1) = {x ∈ S | f(x) ∉ B_1} = {x ∈ S | f(x) ∈ T \ B_1}.

(iv) follows from (ii) and (iii), since B_2 \ B_1 = B_2 ∩ (T \ B_1).

(v) We have

f(⋃_{i∈N} A_i) = {f(x) | x ∈ A_i for some i ∈ N}

and

⋃_{i∈N} f(A_i) = ⋃_{i∈N} {f(x) | x ∈ A_i} = {f(x) | x ∈ A_i for some i ∈ N}.

(vi) We have

f(⋂_{i∈N} A_i) = {f(x) | x ∈ A_i for all i ∈ N} ⊂ {f(x) | x ∈ A_i} for any i ∈ N.

Thus

f(⋂_{i∈N} A_i) ⊂ ⋂_{i∈N} {f(x) | x ∈ A_i} = ⋂_{i∈N} f(A_i).

(vii) We have f^{-1}(f(A_1)) = {x ∈ S | f(x) ∈ f(A_1)} ⊃ A_1. In particular, if f is injective, then there cannot be any x ∈ A_1^c with f(x) ∈ f(A_1). Therefore equality holds in this case.

(viii) We have f(f^{-1}(B_1)) = f({x ∈ S | f(x) ∈ B_1}) ⊂ B_1. In particular, if f is surjective, then for any y ∈ B_1 there exists x ∈ S with f(x) = y. Thus y ∈ f(f^{-1}(B_1)). Therefore equality holds in this case.

□

Counterexample to equality in (vi): Let

f : Z \ {0} −→ N, z ↦ |z|,

and

A_1 := {1, 2, 3, . . .}, A_2 := {−1, −2, −3, . . .}.

Then

f(A_1 ∩ A_2) = ∅ ⊊ f(A_1) ∩ f(A_2) = N.
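The counterexample can be reproduced on a finite truncation of Z \ {0}. The following sketch is only an illustration: the truncation level n = 5 and the helper names `image` and `preimage` are assumptions, and the finite check of (i) and (ii) does not replace the proof.

```python
def image(f, A):
    """f(A) := {f(x) | x in A}."""
    return {f(x) for x in A}


def preimage(f, domain, B):
    """f^{-1}(B) := {x in domain | f(x) in B}."""
    return {x for x in domain if f(x) in B}


n = 5                                   # truncation level (assumption)
domain = set(range(-n, n + 1)) - {0}    # finite stand-in for Z \ {0}
A1 = {x for x in domain if x > 0}
A2 = {x for x in domain if x < 0}

lhs = image(abs, A1 & A2)               # f(A1 ∩ A2)
rhs = image(abs, A1) & image(abs, A2)   # f(A1) ∩ f(A2)

# (i) and (ii): preimages commute with unions and intersections
B1, B2 = {1, 2}, {2, 3}
pre_union_ok = (preimage(abs, domain, B1 | B2)
                == preimage(abs, domain, B1) | preimage(abs, domain, B2))
pre_inter_ok = (preimage(abs, domain, B1 & B2)
                == preimage(abs, domain, B1) & preimage(abs, domain, B2))
```

Here `lhs` is empty while `rhs` is {1, . . . , 5}, so the inclusion in (vi) is strict for this non-injective map.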

Exercise: Find counterexamples to equality in (vii) and (viii).

CHAPTER 1

The real line and the Euclidean space

1.1. Ordered fields and cardinality

What are the real numbers (denoted by R)? Intuitively this is clear, but we want to carry out the definition (and construction) in a formal way. We start with the

Field Axioms: Let F be a set whose elements will be called numbers. Consider the following axioms for this set:

I. Addition axioms: There is an addition operation

+ : F × F −→ F, (x, y) ↦ x + y,

such that for all numbers x, y, z, the following holds:
(1) x + y = y + x (commutativity)
(2) x + (y + z) = (x + y) + z (associativity)
(3) ∃0 ∈ F with x + 0 = x (zero)
(4) ∀x ∈ F ∃ −x ∈ F with x + (−x) = 0 (additive inverse)

The zero is also called the “neutral element of addition” and usually one writes y − x := y + (−x). A pair (F, +) such that (1)-(4) hold is called an Abelian group.

II. Multiplication axioms: There is a multiplication operation

· : F × F −→ F, (x, y) ↦ x · y,

such that for all numbers x, y, z, the following holds:
(5) x · y = y · x (one writes xy for short) (commutativity)
(6) x(yz) = (xy)z (associativity)
(7) ∃1 ∈ F \ {0} with 1 · x = x (unity)
(8) ∀x ∈ F \ {0} ∃x^{-1} ∈ F \ {0} with x · x^{-1} = 1 (reciprocals)
(9) x(y + z) = xy + xz (distributive law)

The unity is also called the “neutral element of multiplication” and reciprocals are also called “multiplicative inverses”. (F \ {0}, ·) is an Abelian group if (5)-(8) hold.

Any set F with operations + and · obeying the rules (1)-(9) is called a field.

Example: Q is a field, Z is not (for instance, 2 has no reciprocal in Z).

From the field axioms one can deduce all the usual calculation rules that we know from school (see for instance Propositions 1.14 - 1.16 in Rudin’s book and the other references), e.g.

Claim: x ∈ F =⇒ −(−x) = x


Proof. We have

(−x) + (−(−x)) = 0 by (4)

and

(−x) + x = x+ (−x) = 0 by (1) and (4).

=⇒ −(−x) and x are additive inverses of (−x).

If additive inverses in a field are unique, then we are done! Let y_1, y_2 be additive inverses of y ∈ F. Then

y_1 = y_1 + 0   (by (3))
= y_1 + (y + y_2)   (by assumption)
= (y_1 + y) + y_2   (by (2))
= (y + y_1) + y_2   (by (1))
= 0 + y_2   (by assumption)
= y_2 + 0   (by (1))
= y_2   (by (3)).

□

Or as another example:

Claim: x, y ∈ F: xy = 0 =⇒ x = 0 or y = 0, (“no divisors of zero”).

Proof. Suppose x ≠ 0. Then

y = 1 · y   (by (7))
= (x^{-1} x) y   (by (8), (5))
= x^{-1}(xy)   (by (6))
= x^{-1} · 0   (by assumption)
= 0 · x^{-1}   (by (5)).

Thus we are done if 0 · z = 0 for any z ∈ F. But this is true, see examples. □
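For completeness, here is one possible derivation of 0 · z = 0 from the axioms (1)-(9). It is a sketch and not necessarily the argument given in the examples the notes refer to.

```latex
\begin{align*}
0 \cdot z &= (0 + 0) \cdot z && \text{by (3)} \\
          &= 0 \cdot z + 0 \cdot z && \text{by (5), (9)} \\
\intertext{Adding $-(0 \cdot z)$ to both sides gives}
0 &= 0 \cdot z + \bigl(-(0 \cdot z)\bigr) && \text{by (4)} \\
  &= (0 \cdot z + 0 \cdot z) + \bigl(-(0 \cdot z)\bigr) && \text{by the first computation} \\
  &= 0 \cdot z + \bigl(0 \cdot z + (-(0 \cdot z))\bigr) && \text{by (2)} \\
  &= 0 \cdot z + 0 = 0 \cdot z && \text{by (4), (3)}.
\end{align*}
```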

The real numbers come with a natural ordering; we visualize them arranged in a line.

Our goal is to characterize the real numbers uniquely, so we are led to introduce the

Order axioms:

III. Consider a set S with a relation “≤”, such that for all x, y, z ∈ S

(10) x ≤ x (reflexivity)

(11) If x ≤ y and y ≤ x, then x = y (antisymmetry)

(12) If x ≤ y and y ≤ z, then x ≤ z (transitivity)

(13) We have x ≤ y or y ≤ x (linear or total ordering)

(14) x ≤ y =⇒ x + z ≤ y + z (compatibility of ≤ and +)

(15) If 0 ≤ x and 0 ≤ y, then 0 ≤ xy (compatibility of ≤ and ·)

Remark. (i) A relation satisfying (10)-(12) is called a partial ordering. A set S satisfying (10)-(12) is called partially ordered.
(ii) (13) =⇒ (10)!
(iii) (10)-(15) are fulfilled by N, Z, Q, R.
(iv) A set satisfying (1)-(15) is called an ordered field.

Definition. y ≥ x :⇐⇒ x ≤ y and

y > x :⇐⇒ x < y :⇐⇒ x ≤ y and x ≠ y   (x ≠ y :⇐⇒ x = y does not hold).

1.1.1. Law of trichotomy. Let F be an ordered field, and x, y ∈ F. Then exactly one of the relations

x < y, x = y, x > y holds.

Proof. Either x ≠ y or x = y.

x ≠ y =⇒ x ≤ y and y ≤ x cannot hold at the same time (by (11)).

But one inequality must hold by (13). This together with x ≠ y gives either x < y or y < x. Therefore exactly one of x = y, x < y, y < x holds. □

Definition. The absolute value of a number x in an ordered field is defined by

|x| := x if x ≥ 0, and |x| := −x if x < 0.

|x − y| is called the distance between x and y.

For the absolute value the usual properties hold (e.g. |x| ≥ 0 ∀x, |x| = 0 ⇐⇒ x = 0, |x + y| ≤ |x| + |y|, etc., see references or examples). In an ordered field we can also do the usual manipulations of (in)equalities. We will assume these henceforth, e.g. (x + y)^2 = x^2 + 2xy + y^2 or

0 ≤ x < y =⇒ x^2 < y^2 (see references),

or

2xy ≤ x^2 + y^2,

where

x^n := x · ... · x (n times)

and

2 := 1 + 1, 3 := 2 + 1, 4 := 3 + 1, etc.

Well-ordering and induction:

The natural numbers N := {1, 2, 3, . . . } (and equally N ∪ {0} = {0, 1, 2, 3, . . . }) can formally be introduced by the Peano axioms. We will not do this and assume the natural numbers to be “naturally” given. The fifth Peano axiom corresponds to the principle of complete induction. Assuming that N ⊂ R is “naturally” given (hence satisfying in particular (10)-(15)), we show in the next theorem that the principle of complete induction implies the well-ordering property.

Principle of complete induction for N: Let S ⊂ N, such that

(i) 1 ∈ S “base case”,

(ii) k ∈ S =⇒ k + 1 ∈ S “inductive step”.

Then S = N.

Well-ordering property for N (well-ordered by the relation “≤”): Let S ⊂ N. Then

S ≠ ∅ =⇒ ∃k_0 ∈ S, k_0 ≤ k ∀k ∈ S.

(“every non-empty subset of N has a smallest element”)


We have:

1.1.1. Theorem. The principle of complete induction for N implies the well-ordering property for N.

Proof. (by contraposition of well-ordering) Suppose that S ⊂ N has no smallest element. Let T := N \ S. We have to show S = ∅, i.e. T = N. Let

T_0 := {n ∈ N | {1, . . . , n} ⊂ T} (⊂ T ⊂ N).

So, if we can show T_0 = N, then T = N as desired. We will show T_0 = N by complete induction:

(i) “Base case”: 1 ∉ S (otherwise 1 would be the smallest element of S) =⇒ 1 ∈ T =⇒ 1 ∈ T_0.

(ii) “Inductive step”: Suppose k ∈ T_0 (“induction hypothesis”), then 1, . . . , k ∈ T. But k + 1 ∉ S, otherwise k + 1 would be the smallest element of S, since 1, . . . , k ∉ S. Hence k + 1 ∈ T, and so k + 1 ∈ T_0. Thus by the principle of complete induction T_0 = N.

□

Complete induction with respect to a subset of Z:

Let n_0 ∈ Z, and A(n) be a property, given for every n ∈ S := {k ∈ Z | k ≥ n_0}. If

(i) “Base case”: A(n_0) holds,

(ii) ∀k ∈ S: A(k) holds (IH: “induction hypothesis”) =⇒ A(k + 1) holds (IS: “inductive step”),

then A(n) holds for all n ∈ S.

Proof. Let

M := {m ∈ N | A(n_0 − 1 + m) holds} ⊂ N.

Then 1 ∈ M by (i). Suppose m ∈ M, then k := n_0 − 1 + m ≥ n_0, i.e. k ∈ S and A(k) holds. Applying (ii), we then obtain that A(k + 1) = A(n_0 − 1 + (m + 1)) holds, i.e. m + 1 ∈ M. Therefore, M = N by the principle of complete induction, i.e. A(n) holds for any n ∈ S. □

Example (generalized triangle inequality in an ordered field F):

∀k ∈ N : x_0, . . . , x_k ∈ F =⇒ |x_0 + · · · + x_k| ≤ |x_0| + · · · + |x_k|

Proof. (k = 0: |x_0| ≤ |x_0| by reflexivity of “≤”.) Base case k = 1: x_0, x_1 ∈ F =⇒ |x_0 + x_1| ≤ |x_0| + |x_1| holds by the triangle inequality.

(IS) k → k + 1: Suppose that for some k ∈ N:

x_0, . . . , x_k ∈ F =⇒ |x_0 + · · · + x_k| ≤ |x_0| + · · · + |x_k| (IH).

Then for arbitrary x_{k+1} ∈ F

|x_0 + · · · + x_{k+1}| = |(x_0 + · · · + x_k) + x_{k+1}|
≤ |x_0 + · · · + x_k| + |x_{k+1}|   (triangle inequality, since x_0 + · · · + x_k ∈ F and x_{k+1} ∈ F)
≤ |x_0| + · · · + |x_k| + |x_{k+1}|   (by (IH)).

□

1.1.2. Theorem. An infinite subset A of a denumerable set B is denumerable.

Proof. By assumption ∃ bijection f : N −→ B. Consider the list

f(1), f(2), f(3), . . .

Each element of A appears exactly once in the list, since f is bijective.

Let

n_1 := smallest element of f^{-1}(A) (it exists by well-ordering, since f^{-1}(A) ≠ ∅).

Then f(n_1) is the first element of A in the above list.

For k ∈ N, k ≥ 2, let

n_k := smallest element of f^{-1}(A) \ {n_1, . . . , n_{k−1}}.

Again, n_k exists by the well-ordering property since A is infinite, and f(n_k) is the k-th element of A in the list. Then

N −→ A, k ↦ f(n_k) is a bijection.

□
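The construction in this proof is effectively an algorithm: walk through the list f(1), f(2), f(3), . . . and keep the entries that lie in A. The sketch below illustrates it for B = Z and A = the even integers; the particular bijection `f` and the name `enumerate_subset` are assumptions chosen for the example.

```python
from itertools import count, islice


def f(n):
    """A bijection N -> Z: 1, 2, 3, 4, 5, ... -> 0, 1, -1, 2, -2, ..."""
    return n // 2 if n % 2 == 0 else -(n // 2)


def enumerate_subset(in_A):
    """Yield f(n_1), f(n_2), ..., where n_k is the k-th index n with f(n) in A."""
    for n in count(1):
        if in_A(f(n)):
            yield f(n)


# First seven elements of A = {even integers} in the order induced by f
first_evens = list(islice(enumerate_subset(lambda z: z % 2 == 0), 7))
```

The generator never terminates on its own, which mirrors the assumption that A is infinite; `islice` is used to look at an initial segment.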

1.1.3. Theorem. If h : N −→ B is surjective and B is infinite, then B is denumerable.

Proof. ∀x ∈ B: A_x := h^{-1}({x}) ⊂ N and A_x ≠ ∅ since h is surjective. Hence by the well-ordering property A_x has a smallest element, say a_x. Define

i : B −→ N, x ↦ a_x.

Then i is injective, since

x ≠ y ⇒ A_x ∩ A_y = h^{-1}({x} ∩ {y}) = ∅ ⇒ a_x ≠ a_y.

Now i(B) ⊂ N and i(B) is infinite, so i(B) is denumerable by Theorem 1.1.2. Since i : B → i(B) is a bijection, B is denumerable.

□

1.1.4. Theorem. N× N is denumerable.

Proof. f : N × N −→ N, (k, n) ↦ (k + n − 2)(k + n − 1)/2 + n is a bijection (see Assignment no. 1). □
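A bijection of this kind enumerates N × N along the diagonals k + n = const. The sketch below assumes the pairing formula (k, n) ↦ (k + n − 2)(k + n − 1)/2 + n and checks on a finite triangle that it hits each of the numbers 1, 2, 3, . . . exactly once; the function name `pairing` and the bound D are illustrative choices, and the finite check is not a proof.

```python
def pairing(k, n):
    """Assumed pairing (k, n) -> (k + n - 2)(k + n - 1)/2 + n for k, n >= 1."""
    d = k + n - 1                       # index of the diagonal containing (k, n)
    return (d - 1) * d // 2 + n


D = 30                                  # number of diagonals to check (assumption)
values = sorted(
    pairing(k, n)
    for k in range(1, D + 1)
    for n in range(1, D + 2 - k)        # all pairs with k + n - 1 <= D
)
```

The first D diagonals contain D(D + 1)/2 pairs, and `values` turns out to be exactly the list 1, . . . , D(D + 1)/2.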

1.1.5. Corollary. The denumerable union of denumerable sets is denumerable.

Proof. Let (A_i)_{i∈I} be a family of denumerable sets A_i where I is denumerable. We have to show that ⋃_{i∈I} A_i is denumerable.

I denumerable ⇒ ∃ bijection f : N −→ I, hence

⋃_{i∈I} A_i = ⋃_{k∈N} A_{f(k)}.

Since each A_{f(k)}, k ∈ N, is denumerable, we can write

A_{f(k)} = {a_{kn} | n ∈ N}, k ∈ N.

Now define

g : N × N −→ ⋃_{k∈N} A_{f(k)}, (k, n) ↦ a_{kn}.

Then g is surjective. Hence by Theorem 1.1.4

∃ surjection h : N −→ ⋃_{k∈N} A_{f(k)}.

Since ⋃_{k∈N} A_{f(k)} is infinite, the statement now follows from Theorem 1.1.3. □

1.1.6. Proposition. Q is denumerable.

Proof. Clearly Z is denumerable. Hence

Q = ⋃_{q∈N} {p/q | p ∈ Z}

is denumerable by Corollary 1.1.5. □

1.1.7. Theorem (Cantor). Let M be any set. There exists no surjection from M onto P(M) := {A | A ⊂ M}.

Proof. For M = ∅ we have P(∅) = {∅} and the statement is obviously true. Let hence M ≠ ∅. Suppose there exists a surjection

ϕ : M −→ P(M), x ↦ ϕ(x) ⊂ M.

Define

A := {x ∈ M | x ∉ ϕ(x)} ⊂ M =⇒ A ∈ P(M).

Since ϕ is surjective, ∃y ∈ M with ϕ(y) = A.

Now

y ∈ A =⇒ y ∉ ϕ(y) = A, a contradiction,

and

y ∉ A =⇒ y ∈ ϕ(y) = A, a contradiction.

□
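For a small finite set M the statement can be verified exhaustively: for every map ϕ : M → P(M), the diagonal set A = {x ∈ M | x ∉ ϕ(x)} is never in the image of ϕ. The sketch below does this for a three-element M; the variable and function names are assumptions, and the check is only an illustration of the argument.

```python
from itertools import combinations, product

M = [0, 1, 2]                           # small finite set (illustrative)
power_set = [frozenset(c) for r in range(len(M) + 1)
             for c in combinations(M, r)]


def diagonal_set(phi):
    """A := {x in M | x not in phi(x)} for a map phi given as a dict."""
    return frozenset(x for x in M if x not in phi[x])


# Enumerate all 8^3 maps phi: M -> P(M); `images` lists phi(0), phi(1), phi(2)
missed_every_time = all(
    diagonal_set(dict(zip(M, images))) not in images
    for images in product(power_set, repeat=len(M))
)
```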

1.1.8. Remark. It follows directly from Theorem 1.1.7 that P(N) is uncountable. Actually, P(N) has the same cardinality as R, i.e. |P(N)| = |R|. Theorem 1.1.7 also tells us that we can always build a strictly bigger infinite set than any given one, so there is no limit for the notion of infinity of sets. For a finite set its cardinal number is given by the number of its elements (see Section 0.1). In general, we define for two sets A, B and their cardinal numbers |A|, |B|:

|A| ≤ |B| :⇐⇒ there is an injection f : A → B,

and

|A| < |B| :⇐⇒ |A| ≤ |B| and |A| ≠ |B|.

Using the axiom of choice one can show that for any two sets A, B, we have

|A| ≤ |B| or |B| ≤ |A|.

The first (smallest) infinite cardinal number is denoted by ℵ_0 (aleph-naught), where ℵ_0 := |N|. Indeed, if A is any infinite set then there exists an injection f : N → A, hence |N| ≤ |A|. One says N has ℵ_0 elements.

The cardinal number of R is denoted by c (as in continuum) and it holds c = |P(N)|. The second infinite cardinal number is ℵ_1 (aleph-one), i.e. ℵ_0 < ℵ_1 and for any three sets A, B, C such that A ⊂ B ⊂ C and ℵ_0 = |A|, ℵ_1 = |C| it must hold |B| = ℵ_0 or |B| = ℵ_1.

The Continuum Hypothesis states that if N ⊂ S ⊂ R, then S has either the cardinality of N or the one of R. In other words ℵ_1 = |P(N)|. The continuum hypothesis is still a “mystery” in mathematics in the sense that it can neither be proved nor disproved on the basis of the usual axiomatic set theories. Correspondingly, there is a Generalized Continuum Hypothesis stating that

ℵ_n = |P(. . . P(N))| (n-fold power set), n ≥ 1.

For later purpose we show the following:

1.1.9. Lemma. √2 ∉ Q, i.e. there is no r ∈ Q with r^2 = 2. In particular, R ≠ Q.

Proof. Suppose ∃r = p/q, p, q ∈ N, with r^2 = 2. If p and q are both even, we may simplify the fraction p/q until either p or q is odd. Thus

r^2 = 2 =⇒ p^2 = 2q^2 =⇒ p^2 is even =⇒ p is even.   (⋆)

The last implication (⋆) follows by contraposition, since p odd =⇒ p^2 odd, because (2n + 1)^2 = (4n^2 + 4n) + 1 is odd (4n^2 + 4n being even).

Thus p = 2p′ for some p′ ∈ N and so 2q^2 = 4(p′)^2. Therefore q^2 is even, hence q is even. This contradicts the fact that p or q is odd. □

1.2. Completeness and the real number system

1.2.1. Definition. Let (x_n)_{n∈N} be a sequence in an ordered field F. (x_n)_{n∈N} is said to converge to a limit x ∈ F, if

∀ε > 0 ∃N = N(ε) ∈ N : |x_n − x| < ε, ∀n ≥ N.

In this case we say that (x_n)_{n∈N} is convergent, and write

lim_{n→∞} x_n = x or x_n −→ x as n → ∞, or simply lim x_n = x or x_n −→ x.

Note: N = N(ε) means that N depends on ε.

Remark. |x_n − x| < ε ⇐⇒ x − ε < x_n < x + ε (“x_n is in an ε-neighborhood of x”)

1.2.2. Proposition. If the limit of a sequence in an ordered field exists, then it is unique, i.e.

x_n −→ x and x_n −→ y =⇒ x = y.

Proof. Suppose |x − y| ≠ 0. For ε := |x − y|/2 choose N_1(ε), N_2(ε) such that

|x_n − x| < ε, ∀n ≥ N_1(ε),

|x_n − y| < ε, ∀n ≥ N_2(ε).

Let N := max(N_1(ε), N_2(ε)). Then ∀n ≥ N,

|x − y| = |x − x_n + x_n − y| ≤ |x − x_n| + |x_n − y| < 2ε = |x − y|, a contradiction.

□


Definition. A sequence (x_n)_{n∈N} in an ordered field F is called

bounded :⇐⇒ ∃M ∈ F, M ≥ 0 : |x_n| ≤ M ∀n ∈ N,
bounded above :⇐⇒ ∃M ∈ F : x_n ≤ M ∀n ∈ N,
bounded below :⇐⇒ ∃M ∈ F : x_n ≥ M ∀n ∈ N.

1.2.3. Proposition. A convergent sequence is bounded.

Proof. Let x := lim x_n. For ε = 1, ∃N(ε) ∈ N with |x_n − x| < 1, ∀n ≥ N(ε). Thus

|x_n| = |x_n − x + x| ≤ |x_n − x| + |x| < 1 + |x|, ∀n ≥ N(ε).

Therefore

|x_n| ≤ max(|x_1|, . . . , |x_{N(ε)}|, 1 + |x|), ∀n ∈ N.

□

1.2.4. Theorem (Limit theorem for sequences). Suppose x_n −→ x and y_n −→ y. Then:

(1) x_n + y_n −→ x + y,
(2) x_n y_n −→ xy,
(3) If y ≠ 0, then ∃n_0 ∈ N: y_n ≠ 0 ∀n ≥ n_0 and

lim_{n→∞, n≥n_0} x_n/y_n = x/y,

(4) x_n ≤ y_n ∀n ≥ n_0 for some n_0 ∈ N =⇒ x ≤ y.

1.2.5. Remark. Choosing first x_n ≡ a, ∀n ≥ n_0, and then y_n ≡ b, ∀n ≥ n_0, we immediately get from (4):

a, b ∈ F, a ≤ z_n ≤ b, ∀n ≥ n_0 =⇒ a ≤ lim z_n ≤ b.

Proof of Theorem 1.2.4. (1) Let ε > 0 be arbitrary. Then ∃N_1(ε), N_2(ε) ∈ N with |x_n − x| < ε/2, ∀n ≥ N_1(ε), and |y_n − y| < ε/2, ∀n ≥ N_2(ε). Thus ∀n ≥ max(N_1(ε), N_2(ε)) =: N(ε),

|x_n + y_n − (x + y)| ≤ |x_n − x| + |y_n − y| < ε.

(2) Proposition 1.2.3 =⇒ ∃K > 0 with |x_n| ≤ K, ∀n ∈ N. For arbitrary ε > 0, ∃N_1, N_2 ∈ N with

|x_n − x| < ε/(K + |y|), ∀n ≥ N_1, and |y_n − y| < ε/(K + |y|), ∀n ≥ N_2.

Hence ∀n ≥ max(N_1, N_2)

|x_n y_n − xy| = |x_n(y_n − y) + (x_n − x)y| ≤ |x_n||y_n − y| + |x_n − x||y|
< K · ε/(K + |y|) + ε/(K + |y|) · |y| = ε.

(3) If y ≠ 0, then for ε_1 := |y|/2, ∃N(ε_1) such that

|y_n − y| < |y|/2, ∀n ≥ N(ε_1).

In particular,

| |y_n| − |y| | ≤ |y_n − y| < |y|/2, ∀n ≥ N(ε_1),

hence

|y_n| > |y| − |y|/2 = |y|/2 > 0, ∀n ≥ N(ε_1) =: n_0.   (1.1)

Because of (2) it is enough to show that

lim_{n→∞, n≥N(ε_1)} 1/y_n = 1/y.

By assumption: ∀ε > 0, ∃N ∈ N with |y_n − y| < ε|y|^2/2, ∀n ≥ N. Hence ∀n ≥ max(N, N(ε_1))

|1/y_n − 1/y| = |(y − y_n)/(y_n y)| = 1/(|y_n||y|) · |y − y_n| < (2/|y|^2) · (ε|y|^2/2) = ε   (using (1.1)).

(4) Suppose x > y. Then x > y + ε with ε := (x − y)/2.

Since (y_n)_{n∈N} converges to y: ∃N(ε) ≥ n_0 with y_n < y + ε, ∀n ≥ N(ε). But y + ε = (x + y)/2 = x − ε, hence x_n ≤ y_n < x − ε and so

x − x_n > ε, ∀n ≥ N(ε), i.e. (x_n)_{n∈N} does not converge to x, a contradiction.

□

1.2.6. Definition. A sequence (x_n)_{n∈N} in an ordered field is called:

increasing :⇐⇒ x_n ≤ x_{n+1}, ∀n ∈ N,
strictly increasing :⇐⇒ x_n < x_{n+1}, ∀n ∈ N,
decreasing :⇐⇒ x_n ≥ x_{n+1}, ∀n ∈ N,
strictly decreasing :⇐⇒ x_n > x_{n+1}, ∀n ∈ N.

(x_n)_{n∈N} is called monotone, if one of these conditions holds.

1.2.7 (Monotone sequence property). Let F be an ordered field. F has the monotone sequence property (MSP), if every increasing sequence in F that is bounded above converges to a limit in F.

1.2.8. Definition. An ordered field is said to be complete if it has the MSP.

Archimedean property: An ordered field F is called Archimedean, if one of the following equivalent properties holds:

(i) x ∈ F =⇒ ∃n ∈ N with x < n.
(ii) x, y ∈ F and x > 0 =⇒ ∃n ∈ N with y < nx.
(iii) x ∈ F and x > 0 =⇒ ∃n ∈ N with 0 < 1/n < x.

Remark. We (can) always assume N ⊂ F by the identification

1 + · · · + 1 (n times) ∈ F ←→ n ∈ N.

1.2.9. Example. In a complete ordered field define inductively (x_n) by:

x_0 = 0, x_n := √(2 + x_{n−1}), n ≥ 1.

Note: The positive square root √a := b ∈ F with b > 0 and b^2 = a exists ∀a > 0 (see Assignment no. 2). Show that x_n −→ 2 as n → ∞.

Proof. First show that (x_n) is increasing and bounded above.

Claim: x_{n−1} ≥ 0 and r_{n−1} := x_n − x_{n−1} ≥ 0, ∀n ∈ N.

Proof by induction: n = 1: clear.
n − 1 → n: Suppose it is true for n − 1, i.e. x_{n−1} ≥ 0 and r_{n−1} ≥ 0. Then

x_n = √(2 + x_{n−1}) ≥ √2 > 0 and

r_n = √(2 + x_n) − √(2 + x_{n−1}) = (x_n − x_{n−1}) / (√(2 + x_n) + √(2 + x_{n−1})) ≥ 0.

The claim is hence shown.

Now show x_n ≤ 2, ∀n ∈ N ∪ {0}, by induction: for n = 0, 1 it holds.
n − 1 → n: suppose x_{n−1} ≤ 2, then x_n = √(2 + x_{n−1}) ≤ √4 = 2. Thus this is also true.

Since F is complete, (x_n) is convergent, say lim x_n = x. Then

x = lim_{n→∞} x_n = lim_{n→∞} √(2 + x_{n−1}) = √(2 + x),

since by the limit theorem for sequences 1.2.4

(lim √(2 + x_{n−1}))^2 = lim (2 + x_{n−1}) = 2 + lim x_{n−1} = 2 + x.

Thus x^2 = 2 + x. The positive solution of this equation is x = 2, hence

lim_{n→∞} x_n = 2.

□
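The behaviour shown in this example can be observed numerically. The sketch below iterates the recursion in floating-point arithmetic, which is only an approximation of the real numbers; the function name `nested_sqrt_sequence` and the number of steps are assumptions for illustration.

```python
import math


def nested_sqrt_sequence(count):
    """Return [x_0, ..., x_count] with x_0 = 0 and x_n = sqrt(2 + x_{n-1})."""
    xs = [0.0]
    for _ in range(count):
        xs.append(math.sqrt(2.0 + xs[-1]))
    return xs


xs = nested_sqrt_sequence(30)

is_increasing = all(a <= b for a, b in zip(xs, xs[1:]))   # x_n <= x_{n+1}
is_bounded_by_two = all(x <= 2.0 for x in xs)             # x_n <= 2
gap = 2.0 - xs[-1]                                        # distance of x_30 from the limit 2
```

Within floating-point precision the computed values are increasing, stay below 2, and the gap to 2 becomes negligible after a few dozen steps.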

1.2.10. Proposition. A complete ordered field F is Archimedean.

Proof. Let x ∈ F. We have to show: ∃n ∈ N with x < n. Suppose this is wrong. Then x_n := n ≤ x, ∀n ∈ N. The sequence (x_n)_{n∈N} is increasing and bounded above by x, so by the MSP it converges, say y := lim_{n→∞} x_n. Then ∀ε > 0, ∃N(ε) ∈ N with

|x_n − y| < ε, ∀n ≥ N(ε).

But

1 = |N(ε) + 1 − N(ε)| ≤ |x_{N(ε)+1} − y| + |y − x_{N(ε)}| < 2ε,

where x_{N(ε)+1} = N(ε) + 1 and x_{N(ε)} = N(ε). Choosing ε = 1/2 gives a contradiction.

□

1.2.11. Example. In a complete ordered field 1/n −→ 0.

Proof. By the Archimedean property (iii): ∀ε > 0, ε ∈ F, ∃N(ε) ∈ N with

1/N(ε) < ε.

Consequently,

|1/n − 0| = 1/n ≤ 1/N(ε) < ε, ∀n ≥ N(ε).

□
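For a rational ε the index N(ε) from this proof can be computed explicitly. The following sketch uses exact rational arithmetic and takes the smallest admissible index, N(ε) = ⌊1/ε⌋ + 1; the function name `smallest_index` and the sample value of ε are assumptions.

```python
from fractions import Fraction
import math


def smallest_index(eps):
    """Smallest N in N with 1/N < eps, for a rational eps > 0."""
    return math.floor(1 / eps) + 1


eps = Fraction(3, 100)
N = smallest_index(eps)

# 1/n < eps for all n >= N (checked on a finite range only)
tail_ok = all(Fraction(1, n) < eps for n in range(N, N + 100))
```

Any larger index works as well; the definition of convergence only asks for some N(ε), not for the smallest one.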

1.2.12. Example. By the limit theorem for sequences and Example 1.2.11

lim_{n→∞} (3n^2 + 13n)/(n^2 − 2) = lim_{n→∞} (3 + 13/n)/(1 − 2/n^2) = (3 + 13 · lim 1/n)/(1 − 2 · lim 1/n · lim 1/n) = 3.
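The limit can be cross-checked with exact rational arithmetic. In the sketch below, `a(n)` denotes the n-th term of the sequence (a name introduced here for illustration); subtracting 3 gives a(n) − 3 = (13n + 6)/(n^2 − 2), which tends to 0.

```python
from fractions import Fraction


def a(n):
    """n-th term (3n^2 + 13n) / (n^2 - 2), as an exact fraction."""
    return Fraction(3 * n * n + 13 * n, n * n - 2)


# Distance to the limit 3 at n = 10, 100, ..., 10^6
errors = [abs(a(10 ** k) - 3) for k in range(1, 7)]
errors_decrease = all(e1 > e2 for e1, e2 in zip(errors, errors[1:]))
```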

1.2.13. Theorem. There is a unique complete ordered field R, called the real numbers.

1.2.14. Remark. (i) The uniqueness in 1.2.13 is meant in the following way: Given two complete ordered fields F_1, F_2, there exists a bijection f : F_1 −→ F_2 with
(a) f(x + y) = f(x) + f(y),
(b) f(xy) = f(x)f(y),
(c) x ≤ y ⇒ f(x) ≤ f(y)
for all x, y ∈ F_1.

f is called an order preserving isomorphism.

(ii) The proof of uniqueness in (i) is difficult and we will not present it here. But later, we may show the existence of a complete ordered field.

1.2.15. Proposition. Q is dense in R, that is:
(i) x, y ∈ R and x < y =⇒ ∃r ∈ Q with x < r < y.
(ii) x, ε ∈ R and ε > 0 =⇒ ∃r ∈ Q with |x − r| < ε.

Proof. (i) x < y =⇒ y − x > 0 =⇒ 0 < 1/n < y − x for some n ∈ N (by Arch. prop. (iii)).

Arch. prop. (ii) =⇒ ∃k ∈ N with k · 1/n > x. By the well-ordering property there is a smallest such k ∈ N. Assume x > 0, then

(k − 1)/n ≤ x < k/n.   (⋆)

(For k = 1 the first inequality is only true if x ≥ 0.) Thus, using (⋆) twice,

x < (k − 1)/n + 1/n =: r ≤ x + 1/n < y.   (⋆⋆)

If x ≤ 0, then by Arch. prop. (i): ∃k ∈ N, |x| < k, hence 0 < x + k < y + k and by (⋆⋆)

∃r′ ∈ Q with x + k < r′ < y + k,

hence

x < r′ − k =: r < y.

(ii) Let x, ε ∈ R, ε > 0. Then x < x + ε and so by (i): ∃r ∈ Q with

(x − ε <) x < r < x + ε =⇒ |x − r| < ε.

□
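The proof of (i) is constructive for 0 < x < y: pick n with 1/n < y − x, then the smallest k with k/n > x. The sketch below follows these two steps with exact fractions; using rational endpoints is a simplifying assumption so that the arithmetic is exact, and the function name `rational_between` is illustrative.

```python
from fractions import Fraction
import math


def rational_between(x, y):
    """Return r = k/n with x < r < y, following the proof for 0 < x < y."""
    n = math.floor(1 / (y - x)) + 1     # then 1/n < y - x
    k = math.floor(x * n) + 1           # smallest k in N with k/n > x
    return Fraction(k, n)


x = Fraction(141421, 100000)
y = Fraction(141422, 100000)
r = rational_between(x, y)
```

For x ≤ 0 one would first shift by a natural number k with |x| < k, as in the proof, and shift back afterwards.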

For x, y ∈ R, define

]x, y[:= {a ∈ R | x < a < y}.

1.2.16. Theorem. The open unit interval ]0, 1[⊂ R is uncountable.


Proof. (Cantor) By 1.2.15(i), ]0, 1[ cannot be finite (∃r_0 ∈ ]0, 1[, ∃r_1 ∈ ]0, r_0[, . . . ). Suppose ]0, 1[ were denumerable. Let

x_1, x_2, x_3, . . . be an enumeration of ]0, 1[.

We write each x_k in a decimal expansion (this is possible, see later). Repeating “9”s are preferred over terminating decimal expansions, e.g. we write

0.249999... instead of 0.25 (→ uniqueness of the representation).

By this we obtain a list with each a_{ij} ∈ {0, . . . , 9}, i.e. x_i = ∑_{j=1}^{∞} a_{ij} · 10^{−j}, i ∈ N:

x_1 = 0.a_{11} a_{12} a_{13} a_{14} . . .
x_2 = 0.a_{21} a_{22} a_{23} a_{24} . . .
x_3 = 0.a_{31} a_{32} a_{33} a_{34} . . .
...
x_k = 0.a_{k1} a_{k2} a_{k3} a_{k4} . . .
...

Let x = 0.b_1 b_2 b_3 b_4 . . . such that for any k ∈ N:

b_k = 5 if a_{kk} ≠ 5,

b_k = 7 if a_{kk} = 5.

Then x ∈ ]0, 1[, but x is not in the list, since x differs from x_k in the k-th digit for every k ∈ N and the decimal representation is unique. This is a contradiction.

□
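The diagonal construction can be illustrated on a finite list of digit strings. In the sketch below the rows stand for the first digits of x_1, . . . , x_4; the concrete digits and the function name `diagonal_digits` are made-up assumptions.

```python
def diagonal_digits(rows):
    """Return b_1...b_n with b_k = 5 if a_kk != 5, and b_k = 7 if a_kk = 5."""
    return "".join("5" if row[k] != "5" else "7" for k, row in enumerate(rows))


rows = ["2499", "4999", "3351", "1415"]   # illustrative digit strings a_k1 a_k2 ...
b = diagonal_digits(rows)

# b differs from the k-th row in the k-th digit, so it cannot equal any row
differs_on_diagonal = all(b[k] != rows[k][k] for k in range(len(rows)))
```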

1.2.17. Corollary. x, y ∈ R, x < y =⇒ ]x, y[ ∩ Q is denumerable, and ]x, y[ ∩ (R \ Q) is uncountable.

Proof. By 1.2.15(i), ]x, y[ ∩ Q is infinite, hence denumerable by Theorem 1.1.2 of Section 1.1 (as an infinite subset of the denumerable set Q). Since f : ]0, 1[ −→ ]x, y[, s ↦ f(s) := x + (y − x) · s is a bijection and ]0, 1[ is uncountable by 1.2.16, ]x, y[ is also uncountable. Now suppose ]x, y[ ∩ (R \ Q) is countable. Then

]x, y[ = (]x, y[ ∩ Q) ∪ (]x, y[ ∩ (R \ Q))

is denumerable, as the union of a denumerable set and a countable set. This is a contradiction.

□

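1.2.18. Example (Unboundedness of the harmonic series). The sequence

\[ x_1 := 1, \qquad x_n := 1 + \frac{1}{2} + \cdots + \frac{1}{n}, \quad n \in \mathbb{N},\ n \ge 2, \]

is unbounded. $(x_n)_{n\in\mathbb{N}}$ is called the harmonic series (since the sequence members are given as sums).

Proof.

\[
\begin{aligned}
x_2 &= 1 + \frac{1}{2}, \\
x_4 &= 1 + \frac{1}{2} + \underbrace{\frac{1}{3} + \frac{1}{4}}_{>\, \frac{2}{4} \,=\, \frac{1}{2}} > 1 + \frac{1}{2} + \frac{1}{2}, \\
x_8 &= 1 + \frac{1}{2} + \underbrace{\frac{1}{3} + \frac{1}{4}}_{>\, \frac{1}{2}} + \underbrace{\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}}_{>\, \frac{4}{8} \,=\, \frac{1}{2}} > 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2}, \\
&\ \ \vdots
\end{aligned}
\]

By induction:

\[ x_{2^n} \ge 1 + \frac{n}{2}, \quad \forall n \in \mathbb{N} \qquad (\text{here } 2^n := \underbrace{2 \cdot \ldots \cdot 2}_{n\text{-times}}), \]

with equality for n = 1 and strict inequality for n ≥ 2. Hence $(x_n)_{n\in\mathbb{N}}$ is unbounded. □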
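The bound used in the proof can be checked with exact rational arithmetic. The sketch below assumes nothing beyond the definition of the partial sums; the function name `harmonic` is chosen here for illustration. Note that for n = 1 the bound holds with equality, since x_2 = 1 + 1/2.

```python
from fractions import Fraction

def harmonic(n):
    """Exact partial sum x_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Check x_{2^n} >= 1 + n/2 for the first few n (equality only at n = 1).
for n in range(1, 11):
    lower = 1 + Fraction(n, 2)
    value = harmonic(2 ** n)
    assert value >= lower
    assert (value == lower) == (n == 1)
```

Since 1 + n/2 grows without bound, so do the partial sums, although very slowly: about 2^(2K) terms are needed to exceed a given K by this estimate.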


In particular ∃xn ↗↗ (in words: there exists a strictly increasing sequence (xn)n∈N) with $|x_{n+1} - x_n| < \frac{1}{n}$, ∀n ≥ 1, but (xn)n∈N does not converge.

1.3. Least upper bounds

The completeness of an ordered field can be described through several equivalent properties. One is the MSP (see 1.2.7). We will see that the MSP is equivalent to the LUBP (least upper bound property) in Theorem 1.3.6 and to the Cauchy-completeness together with the Archimedean property in Section 1.4.

Throughout Section 1.3, if not otherwise stated, we fix an ordered field F and S ⊂ F, S ≠ ∅.

1.3.1. Definition. b ∈ F is called an upper bound for S, if x ≤ b, ∀x ∈ S. In this case, we also write b ≥ S. An upper bound b for S is called least upper bound for S, if b ≤ b′ for any upper bound b′ for S.

S is called bounded above, if there exists some upper bound for S.

Remark. Least upper bounds are unique.

Expressions and notations: A least upper bound (lub) is also called supremum. For the least upper bound of S one writes

lub(S), lubS, sup(S), supS.

If S is not bounded above, then

lub(S) = sup(S) := +∞

Example. For S := {1 − 1/n | n ∈ N}: sup(S) = 1 (by 1.2.11) and 1 ∉ S. Moreover, sup(N) = +∞.

1.3.2. Proposition. Let b ∈ F. Then

lub(S) = b ⇐⇒ b is an upper bound for S, and ∀ε > 0, ∃x ∈ S with x > b − ε

(the second condition says: “∀ε > 0, b − ε is not an upper bound for S”).

Proof. “=⇒”: Let b = lub(S). Then b is an upper bound for S. Suppose ∃ε > 0 such that ∀x ∈ S we have x ≤ b − ε. Then b − ε is an upper bound for S and b − ε < b. Thus b ≠ lub(S), a contradiction.

“⇐=”: Suppose there is an upper bound b′ for S with b′ < b. Then b′ = b − ε (with ε = b − b′ > 0) is an upper bound for S. But this is impossible by assumption. Hence b′ ≥ b for any upper bound b′ for S, i.e. lub(S) = b.

□
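To see the ε-criterion at work, consider S = {1 − 1/n | n ∈ N} with candidate b = 1. The sketch below produces, for a given ε > 0, an explicit element of S above 1 − ε. The helper name `witness` is hypothetical, and exact fractions are used so that the comparisons are not affected by rounding.

```python
import math
from fractions import Fraction

def witness(eps):
    """Return some n in N with 1 - 1/n > 1 - eps, i.e. with 1/n < eps."""
    return math.floor(1 / eps) + 1

for eps in [Fraction(1, 2), Fraction(1, 10), Fraction(3, 1000)]:
    n = witness(eps)
    x = 1 - Fraction(1, n)      # an element of S
    assert x <= 1               # 1 is an upper bound for S
    assert x > 1 - eps          # 1 - eps is not an upper bound for S
```

Both conditions of the criterion hold for every tested ε, which is consistent with sup(S) = 1.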

Definition. Let b ∈ F. Then:

b is a lower bound for S :⇐⇒ b ≤ x, ∀x ∈ S.

b is the greatest lower bound for S :⇐⇒ b is a lower bound for S, and b ≥ b′ for any lower bound b′ for S.

S is bounded below :⇐⇒ ∃ lower bound for S.

Remark. Greatest lower bounds are unique.


Expressions and notations: A greatest lower bound (glb) is also called infimum. For the greatest lower bound of S one writes

glb(S), glbS, inf(S), inf S.

If S is not bounded below, then

glb(S) = inf(S) := −∞ (with the convention −∞ < x, ∀x ∈ F; see also Section 1.5).

For the glb a similar statement to 1.3.2 holds.

1.3.3. Proposition. Let S ⊂ T ⊂ F. Then

inf(T ) ≤ inf(S) ≤ sup(S) ≤ sup(T ).

Proof. Clear: Use the convention −∞ < x < +∞, ∀x ∈ F. □

1.3.4. Theorem. In R (or equivalently in any complete ordered field F), it holds:

(i) Least upper bound property (LUBP): Every non-empty subset S of R which is bounded above has a lub (its supremum) in R.

(ii) Greatest lower bound property (GLBP): Every non-empty subset T of R which is bounded below has a glb (its infimum) in R.

Proof. (i) Since S ≠ ∅, there is x ∈ S. For k ∈ N, let $N_k$ be the smallest natural number with

\[ y_k := x + \frac{N_k}{2^k} \ge S. \]

$N_k$ exists by well-ordering and the Archimedean property since S is bounded above. Now:

\[ \forall k \in \mathbb{N}: \quad x + \frac{2N_k}{2^{k+1}} = x + \frac{N_k}{2^k} \ge S \implies N_{k+1} \le 2N_k \implies y_k \searrow \]

and

\[ y_k \searrow \ \text{ and } \ y_k \ge x,\ \forall k \in \mathbb{N} \ \overset{\text{MSP}}{\implies}\ \exists\, y := \lim_{k\to\infty} y_k \in \mathbb{R}. \]

Want to show: y = lub(S) (for this, we will use 1.3.2).

Claim 1: y ≥ S (“y is an upper bound for S”).

Indeed, for any z ∈ S:

\[ y_k \ge z,\ \forall k \in \mathbb{N} \ \overset{1.2.4(4)}{\implies}\ y = \lim_{k\to\infty} y_k \ge z. \]

Since z ∈ S was arbitrary, y ≥ S.

Claim 2: “∀ε > 0, y − ε is not an upper bound for S”.

Let ε > 0. Since $\frac{1}{2^k} \to 0$ there exists k ∈ N with $\frac{1}{2^k} < \varepsilon$. Then, using $y_k \searrow y$,

\[ y - \varepsilon \le y_k - \varepsilon < y_k - \frac{1}{2^k} = x + \frac{N_k - 1}{2^k} \ \begin{cases} < z \text{ for some } z \in S \ (\text{by definition of } N_k) & \text{if } N_k \ge 2, \\ = x \in S & \text{if } N_k = 1, \end{cases} \]

=⇒ y − ε < z for some z ∈ S, i.e. y − ε is not an upper bound for S.

Claim 1 + Claim 2 =⇒ (by 1.3.2) y = lub(S).

(ii) Let −T := {−x | x ∈ T}. Then −T is a non-empty subset of R which is bounded above. Thus sup(−T) ∈ R by (i). But then also inf(T) = − sup(−T) ∈ R and (ii) is proved. □
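The dyadic construction in the proof of (i) can be followed numerically. The sketch below makes its own choice of example, which is an assumption and not part of the notes: S = {s ∈ Q | s > 0, s² < 2} with x = 1 ∈ S. For rational b, the test "b ≥ S" then reduces to b > 0 and b² ≥ 2. The function names are hypothetical.

```python
from fractions import Fraction

def is_upper_bound(b):
    # For S = {s in Q : s > 0, s*s < 2}: a rational b satisfies b >= S
    # exactly when b > 0 and b*b >= 2.
    return b > 0 and b * b >= 2

def dyadic_upper_bounds(x, k_max):
    """Return [y_1, ..., y_kmax] with y_k = x + N_k / 2^k, N_k minimal in N."""
    ys = []
    for k in range(1, k_max + 1):
        step = Fraction(1, 2 ** k)
        N = 1
        while not is_upper_bound(x + N * step):
            N += 1
        ys.append(x + N * step)
    return ys

ys = dyadic_upper_bounds(Fraction(1), 12)

# y_k is decreasing, and each y_k is an upper bound for S.
assert all(ys[i] >= ys[i + 1] for i in range(len(ys) - 1))
assert all(is_upper_bound(y) for y in ys)
print(float(ys[-1]))
```

The values y_k decrease towards the least upper bound of S; in this example that limit is not rational, which is why the construction needs the MSP in R.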


1.3.5. Remark.

\[ \underbrace{b \le x,\ \forall x \in T}_{b \text{ lower bound for } T} \iff \underbrace{-b \ge -x,\ \forall x \in T}_{-b \text{ upper bound for } -T} \]

Hence,

b+ ε not lower bound for T ⇐⇒ −b− ε not upper bound for −T ,

b = inf(T ) ⇐⇒ −b = sup(−T ).

1.3.6. Theorem. Let F be an ordered field. Then:

\[
\begin{array}{ccc}
F \text{ has the LUBP} & \iff & F \text{ has the MSP} \ \ (\iff: F \text{ is complete}) \\
\Updownarrow & & \\
F \text{ has the GLBP} & &
\end{array}
\]

Proof. The equivalence “⇕” is clear from Remark 1.3.5. “⇐” follows from 1.3.4(i). “⇒” is shown in 1.3.7 below. □

Remark. Because of the above theorem the LUBP is often also called the completeness axiom.

1.3.7. LUBP =⇒ MSP: Let (xn)n∈N be an increasing sequence in F that is bounded above. Set T := {x1, x2, . . . } and assume the LUBP. Then ∃ lim xn = sup(T) ∈ F.

Proof. Since T ≠ ∅ and T is bounded above, ∃ sup(T) ∈ F by the LUBP. Let ε > 0. By 1.3.2, sup(T) − ε is not an upper bound for T. Thus ∃N(ε) ∈ N with

sup(T )− ε < xN(ε) ≤ xn ≤ sup(T ) < sup(T ) + ε, ∀n ≥ N(ε),

i.e.

|xn − sup(T )| < ε, ∀n ≥ N(ε).

□

1.4. Cauchy sequences and cluster points

All sequences that we consider in Sections 1.4 and 1.5 are sequences in R.

1.4.1. Definition. A sequence (xn)n∈N of real numbers is called a Cauchy sequence, if for all ε > 0 there exists an N(ε) ∈ N such that

|xn − xm| < ε, ∀n,m ≥ N(ε).

1.4.2. Proposition. (xn)n∈N is convergent =⇒ (xn)n∈N is a Cauchy sequence.

Proof. (xn) convergent =⇒ ∃x ∈ R with lim xn = x. Then: ∀ε > 0, ∃N(ε) ∈ N with |xn − x| < ε/2, ∀n ≥ N(ε). Thus

|xn − xm| ≤ |xn − x| + |x − xm| < ε, ∀n, m ≥ N(ε).

□

The converse of Proposition 1.4.2 is:

1.4.3. Theorem. R is (Cauchy-)complete, i.e. every Cauchy sequence in R is convergent and has a limit in R.

For the proof of Theorem 1.4.3, we will need some preparations.

1.4.4. Proposition. (xn)n∈N converges to x ⇐⇒ every subsequence $(x_{n_k})_{k\in\mathbb{N}}$ of (xn)n∈N converges to x.

Proof. “⇒”: ∀ε > 0, ∃N(ε) ∈ N with |xn − x| < ε, ∀n ≥ N(ε). Let $(x_{n_k})_{k\in\mathbb{N}}$ be any subsequence. ∃K(ε) ∈ N with $n_k \ge N(\varepsilon)$, ∀k ≥ K(ε). Hence

\[ |x_{n_k} - x| < \varepsilon, \quad \forall k \ge K(\varepsilon). \]

“⇐”: Clear. □

1.4.5. Definition. x ∈ R is called a cluster point (or accumulation point) of the sequence (xn)n∈N, if for every ε > 0, there are infinitely many n ∈ N with |x − xn| < ε. (In other words: ∀ε > 0: xn ∈ ]x − ε, x + ε[ for infinitely many n ∈ N, or ∀ε > 0 and n0 ∈ N: ∃n ∈ N with n ≥ n0 and |xn − x| < ε.)

1.4.6. Proposition. Let (xn)n∈N be a sequence in R and x ∈ R. Then:

x is an accumulation point of (xn)n∈N ⇐⇒ ∃ subsequence $(x_{n_k})_{k\in\mathbb{N}}$ of (xn)n∈N with $x_{n_k} \to x$.

Proof. “⇐=”: ∀ε > 0, ∃N(ε) ∈ N with $|x - x_{n_k}| < \varepsilon$, ∀k ≥ N(ε). Thus xn ∈ ]x − ε, x + ε[ for infinitely many n ∈ N.

“=⇒”: x is an accumulation point of (xn)n∈N =⇒ ∃ $n_1$ ∈ N with $|x - x_{n_1}| < 1$. Assume $n_1, \ldots, n_k$ ∈ N are already chosen; then choose $n_{k+1}$ ∈ N with $n_{k+1} \ge n_k + 1$ such that $|x - x_{n_{k+1}}| < \frac{1}{k+1}$. In this way we have inductively defined a subsequence of (xn)n∈N that clearly converges to x. (For ε > 0 choose N(ε) ∈ N such that $\frac{1}{N(\varepsilon)} < \varepsilon$. Then $|x - x_{n_k}| < \frac{1}{k} \le \frac{1}{N(\varepsilon)} < \varepsilon$, ∀k ≥ N(ε).) □
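The inductive choice of indices in the direction “=⇒” is effectively an algorithm. As an illustration only, the sketch below runs it for the sequence xn = (−1)ⁿ + 1/n and the cluster point 1. The function name `extract_subsequence` and the `search_limit` safeguard are assumptions made here; the proof itself needs no search limit because suitable indices always exist.

```python
from fractions import Fraction

def x(n):
    """Sample sequence x_n = (-1)^n + 1/n with cluster points 1 and -1."""
    return (-1) ** n + Fraction(1, n)

def extract_subsequence(x, target, length, search_limit=10_000):
    """Choose n_1 < n_2 < ... with |target - x(n_k)| < 1/k, as in the proof."""
    indices = []
    n_prev = 0
    for k in range(1, length + 1):
        n = n_prev + 1
        while abs(target - x(n)) >= Fraction(1, k):
            n += 1
            if n > search_limit:
                raise ValueError("no suitable index found within search_limit")
        indices.append(n)
        n_prev = n
    return indices

idx = extract_subsequence(x, target=1, length=6)
print(idx)
```

Always taking the first admissible index is one possible choice; any larger admissible index would serve equally well.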

Example.

sequence                                 | cluster points | explanation
xn := (−1)ⁿ, n ∈ N                       | 1, −1          | x2n −→ 1, x2n+1 −→ −1
xn := (−1)ⁿ + 1/n, n ∈ N                 | 1, −1          | x2n −→ 1, x2n+1 −→ −1
xn := n, n ∈ N                           | none           | every subsequence is unbounded, thus not convergent
xn := n if n is even, 1/n if n is odd    | 0              | x2n+1 −→ 0 (exactly one cluster point, but not convergent!)

1.4.7. Theorem (Bolzano-Weierstrass). Let (xn)n∈N be a bounded sequence in R. Then (xn)n∈N has at least one cluster point in R. Every cluster point of (xn)n∈N lies in the interval [inf{x1, x2, x3, . . . }, sup{x1, x2, x3, . . . }].

Here for x, y ∈ R: [x, y] := {a ∈ R | x ≤ a ≤ y}.

Proof. For n ∈ N let

yn := inf{xm | m ≥ n} (∈ R, ∃ because of the GLBP).

Then ∀n ∈ N: yn ≤ yn+1 and

yn ≤ sup{xm | m ∈ N} (∈ R, ∃ because of the LUBP).

Thus, by the MSP ∃y := lim yn ∈ R (in fact y = lim infn→∞ xn, see later).

Claim: y is a cluster point of (xn)n∈N (in fact it is the smallest). Indeed, let ε > 0, n0 ∈ N be arbitrary. Then: ∃N(ε) ∈ N, N(ε) ≥ n0, with

\[ |y - y_n| < \frac{\varepsilon}{2}, \quad \forall n \ge N(\varepsilon), \]

and ∃m ∈ N, m ≥ N(ε) (≥ n0), with

\[ |x_m - y_{N(\varepsilon)}| < \frac{\varepsilon}{2} \]

(otherwise $y_{N(\varepsilon)} \neq \inf\{x_m \mid m \ge N(\varepsilon)\}$, because then $y_{N(\varepsilon)} + \frac{\varepsilon}{4}$ would also be a lower bound).

Thus for any ε > 0 and n0 ∈ N, ∃m ≥ n0 with

\[ |x_m - y| \le |x_m - y_{N(\varepsilon)}| + |y_{N(\varepsilon)} - y| < \varepsilon \]

and so y is a cluster point of (xn)n∈N (cf. explanations to Definition 1.4.5). Now, we show the remaining statement: Let x ∈ R be a cluster point of (xn)n∈N. By 1.4.6, ∃ $x_{n_k} \to x$. But

\[ \inf\{x_n \mid n \in \mathbb{N}\} \le x_{n_k} \le \sup\{x_n \mid n \in \mathbb{N}\}, \quad \forall k \in \mathbb{N}, \]

implies

\[ \inf\{x_n \mid n \in \mathbb{N}\} \le x \le \sup\{x_n \mid n \in \mathbb{N}\} \]

by Remark 1.2.5. □

1.4.8. Theorem. Every bounded sequence of real numbers has a convergent subsequence.

Proof. This follows immediately from Bolzano-Weierstrass and 1.4.6. □

1.4.9. Remark.

xn −→ x ∈ R ⇐⇒ (xn)n∈N is bounded and x ∈ R is the sole accumulation point of (xn)n∈N.

Proof. “⇒”: If xn −→ x, then $x_{n_k} \to x$ for any subsequence by Proposition 1.4.4. Therefore, x is the sole accumulation point by 1.4.6. Furthermore (xn)n∈N is bounded by Proposition 1.2.3.

“⇐”: Suppose (xn)n∈N is bounded and there is only one cluster point x ∈ R. If (xn)n∈N does not converge to x, then ∃ε > 0 such that ∀N ∈ N: |x − xn| ≥ ε for some n ≥ N. Thus by induction, we can find a subsequence $(x_{n_k})_{k\in\mathbb{N}}$ of (xn)n∈N with

\[ |x - x_{n_k}| \ge \varepsilon, \quad \forall k \in \mathbb{N}. \tag{$\star$} \]

Since $(x_{n_k})_{k\in\mathbb{N}}$ is bounded, it has a cluster point y ∈ R by Bolzano-Weierstrass. y is then also a cluster point of (xn)n∈N and so y = x. But by (⋆), x cannot be a cluster point of $(x_{n_k})$, a contradiction. Hence (xn)n∈N must converge to x. □

1.4.10. Lemma. (xn)n∈N is a Cauchy sequence =⇒ (xn)n∈N is bounded.

Proof. ∃N ∈ N with |xn − xm| < 1, ∀n, m ≥ N. Thus |xn| ≤ |xn − xN| + |xN| < 1 + |xN|, ∀n ≥ N, and so

\[ |x_n| \le \max\bigl(|x_1|, \ldots, |x_{N-1}|,\ 1 + |x_N|\bigr), \quad \forall n \in \mathbb{N}. \]

□

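1.4.11. Lemma. Let (xn)n∈N be a Cauchy sequence and $(x_{n_k})_{k\in\mathbb{N}}$ a subsequence with $x_{n_k} \to x$. Then xn −→ x.

Proof. Let ε > 0.

(xn) Cauchy =⇒ ∃N(ε) ∈ N such that $|x_n - x_m| < \frac{\varepsilon}{2}$, ∀n, m ≥ N(ε).

$x_{n_k} \to x$ =⇒ ∃K1(ε) ∈ N with $|x_{n_k} - x| < \frac{\varepsilon}{2}$, ∀k ≥ K1(ε).

$n_k$ ↗, unbounded =⇒ ∃K2(ε) ∈ N such that $n_k \ge N(\varepsilon)$, ∀k ≥ K2(ε).

Let $k_0 := \max(K_1(\varepsilon), K_2(\varepsilon))$. Then ∀n ≥ N(ε)

\[ |x_n - x| \le \underbrace{|x_n - x_{n_{k_0}}|}_{<\, \frac{\varepsilon}{2} \text{ since } n,\, n_{k_0} \ge N(\varepsilon)} + \underbrace{|x_{n_{k_0}} - x|}_{<\, \frac{\varepsilon}{2} \text{ since } k_0 \ge K_1(\varepsilon)} < \varepsilon. \]

□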

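Proof of Theorem 1.4.3.

\[
\begin{aligned}
(x_n) \text{ is a Cauchy sequence} \ &\overset{1.4.10}{\implies}\ (x_n) \text{ is bounded} \\
&\overset{1.4.8}{\implies}\ \exists \text{ convergent subsequence, say } x_{n_k} \to x \text{ for some } x \in \mathbb{R} \\
&\overset{1.4.11}{\implies}\ x_n \to x \text{ for some } x \in \mathbb{R}.
\end{aligned}
\]

Thus we have shown that every Cauchy sequence converges in R. □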

1.4.12. Remark. Let F be an ordered field. We have seen

\[
\begin{aligned}
F \text{ is Archimedean} \ \overset{1.2.10}{\Longleftarrow}\ F \text{ has the MSP} \ &\overset{\text{cf. Theorem 1.3.6}}{\iff}\ F \text{ has the LUBP} \\
&\overset{1.4.3,\ 1.2.10}{\implies}\ F \text{ is Cauchy-complete and } F \text{ has the Archimedean property}
\end{aligned}
\]

(“⇐=” also holds for the last implication).

In particular, an ordered field with any one of the latter 3 equivalent properties exists up to an order preserving isomorphism (see 1.2.13 Remark(i)) and is denoted by R.

1.5. The extended real number system, limit superior and inferior

1.5.1. Definition. The extended real number system R̄ consists of R and two symbols, +∞ and −∞, i.e. R̄ = R ∪ {+∞, −∞}. We define

−∞ < x < +∞, ∀x ∈ R.

Then R̄ becomes a totally ordered set, i.e. it satisfies (10)-(13) of Section 1.1. We can define inf and sup for nonempty subsets M ⊂ R̄ by

\[
\sup(M) := \begin{cases}
+\infty & \text{if } +\infty \in M, \\
\text{as before} & \text{if } M \subset \mathbb{R}\ (M \neq \emptyset), \\
\sup(M') & \text{if } M = \{-\infty\} \,\dot\cup\, M',\ \emptyset \neq M' \subset \mathbb{R}, \\
-\infty & \text{if } M = \{-\infty\}.
\end{cases}
\]

Here C = A ∪̇ B :⇐⇒ C = A ∪ B and A ∩ B = ∅ (“disjoint union”). inf(M) is defined accordingly.

We also partially extend the field operations + and · , namely for x ∈ R̄

x + (+∞) := +∞ for x > −∞,   x − (+∞) := −∞ for x < +∞,

\[ x \cdot (+\infty) := \begin{cases} +\infty, & \text{if } x > 0, \\ -\infty, & \text{if } x < 0, \end{cases} \qquad x \cdot (-\infty) := \begin{cases} -\infty, & \text{if } x > 0, \\ +\infty, & \text{if } x < 0, \end{cases} \]

and for x ∈ R,

\[ \frac{x}{+\infty} := \frac{x}{-\infty} := 0. \]

We assume that all these operations on R̄ are commutative.

Note: Some expressions like

\[ +\infty - (+\infty), \quad 0 \cdot (\pm\infty), \quad \frac{\pm\infty}{\pm\infty}, \quad \frac{\pm\infty}{-\infty}, \quad \frac{0}{0}, \quad \frac{\pm\infty}{0} \]

are not defined and R̄ is not a field (why?).

1.5.2. Definition (Convergence and accumulation points in R̄). Let (xn)n∈N be a sequence in R̄.

(i) We say that (xn)n∈N converges to +∞ (resp. −∞) and write

lim xn = +∞, xn −→ +∞, etc.

(resp. lim xn = −∞, xn −→ −∞, etc.)

if

∀K ∈ R, ∃N(K) ∈ N : xn > K, ∀n ≥ N(K)

(resp. xn < K, ∀n ≥ N(K)).

(ii) xn −→ x ∈ R̄, if (xn)n∈N converges in the sense of either (i) or Definition 1.2.1 (formulated in the same way for sequences in R̄ with limit in R).

(iii) We say that x ∈ R̄ is a cluster point (or accumulation point) of (xn)n∈N, if there exists a subsequence $(x_{n_k})_{k\in\mathbb{N}}$ of (xn)n∈N with $x_{n_k} \to x$.

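Example. 1) $x_n := \begin{cases} -\infty, & \text{if } n < 20, \\ \frac{1}{n}, & \text{if } n \ge 20, \end{cases}$ =⇒ xn → 0.

2) xn := n, ∀n ∈ N =⇒ xn → +∞.

3) $x_n := \begin{cases} 0, & n \text{ even}, \\ +\infty, & n \text{ odd}, \end{cases}$ has the cluster points 0 and +∞.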

1.5.3. Proposition. Every monotone sequence in R̄ has a limit in R̄.

Proof. It suffices to consider increasing sequences. If an increasing sequence (xn)n∈N in R̄ is bounded above in R, then either xn ≡ −∞ (i.e. xn = −∞ for all n ∈ N) or xn = −∞ for only finitely many n ∈ N. In the first case the limit is −∞ ∈ R̄. In the second case (xn)n∈N converges by the MSP in the sense of Definition 1.5.2(ii). If (xn) is not bounded above, then it converges in the sense of Definition 1.5.2(i) and the limit is +∞ ∈ R̄. □

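1.5.4. Definition. Let (xn)n∈N be a sequence in R̄. For n ∈ N we set

\[ y_n := \sup_{k \ge n} x_k := \sup\{x_k \mid k \ge n\}, \qquad z_n := \inf_{k \ge n} x_k := \inf\{x_k \mid k \ge n\}. \]

Obviously (yn)n∈N is a decreasing, and (zn)n∈N is an increasing sequence in R̄. Therefore, the following limits exist in R̄ by Proposition 1.5.3:

\[ \limsup_{n\to\infty} x_n := \overline{\lim_{n\to\infty}}\, x_n := \lim_{n\to\infty} \Bigl(\sup_{k \ge n} x_k\Bigr) \quad (\text{“limit superior”}), \]

\[ \liminf_{n\to\infty} x_n := \underline{\lim_{n\to\infty}}\, x_n := \lim_{n\to\infty} \Bigl(\inf_{k \ge n} x_k\Bigr) \quad (\text{“limit inferior”}). \]

In particular (by Proposition 1.5.3 and the definition of inf and sup in R̄)

\[ \limsup_{n\to\infty} x_n = \inf_{n\in\mathbb{N}} \Bigl(\sup_{k\ge n} x_k\Bigr), \qquad \liminf_{n\to\infty} x_n = \sup_{n\in\mathbb{N}} \Bigl(\inf_{k\ge n} x_k\Bigr). \]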
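The tail suprema yn and tail infima zn can be tabulated for a concrete sequence. The sketch below is an illustration under an explicit assumption: a computer can only inspect a finite window 1 ≤ k ≤ horizon, so max and min over that window stand in for sup and inf. For the sample sequence xn = (−1)ⁿ(1 + 1/n) chosen here, the extreme values of each tail are attained within its first two terms, so the window values agree with the true yn and zn for n < horizon. The function name `tail_sup_inf` is hypothetical.

```python
from fractions import Fraction

def x(n):
    """Sample sequence x_n = (-1)^n * (1 + 1/n); lim sup = 1, lim inf = -1."""
    return (-1) ** n * (1 + Fraction(1, n))

def tail_sup_inf(x, n_max, horizon):
    """Return (ys, zs) with ys[n-1] = max{x_k : n <= k <= horizon} and
    zs[n-1] = min{x_k : n <= k <= horizon} for n = 1, ..., n_max."""
    values = [x(k) for k in range(1, horizon + 1)]
    ys = [max(values[n - 1:]) for n in range(1, n_max + 1)]
    zs = [min(values[n - 1:]) for n in range(1, n_max + 1)]
    return ys, zs

ys, zs = tail_sup_inf(x, n_max=50, horizon=200)

# (y_n) is decreasing and (z_n) is increasing, as stated in the definition.
assert all(ys[i] >= ys[i + 1] for i in range(len(ys) - 1))
assert all(zs[i] <= zs[i + 1] for i in range(len(zs) - 1))
print(float(ys[-1]), float(zs[-1]))
```

The printed values are close to 1 and −1, which matches the limit superior and limit inferior of this sample sequence.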


1.5.5. Proposition. Let C ⊂ R̄ be the set of all accumulation points (in the sense of Definition 1.5.2) of a sequence (xn)n∈N in R̄. Then C ≠ ∅ and

\[ \liminf_{n\to\infty} x_n = \inf(C) \quad \text{and} \quad \limsup_{n\to\infty} x_n = \sup(C). \]

Proof. Let

\[ x^* = \limsup_{n\to\infty} x_n = \inf_{n} \underbrace{\sup_{k \ge n} x_k}_{=:\, y_n \searrow}. \]

Have to show: x* = sup(C). (The proof for the limit inferior is similar using Remark 1.3.5.)

Case 1: x* = −∞. Then yn ↘ −∞. In particular ∀K ∈ R, ∃n ∈ N with

\[ \sup_{k \ge n} x_k = y_n < K, \quad \text{hence } x_k < K,\ \forall k \ge n. \]

Thus xn → −∞ and −∞ is the sole cluster point of (xn)n∈N, i.e. C = {−∞} and sup(C) = −∞ = x*.

Case 2: x* ∈ R. Let z > x*. Then: ∃n ∈ N with $z > y_n = \sup_{k \ge n} x_k$. Hence z > x* cannot be a cluster point, and so x* ≥ z for any cluster point z of (xn)n∈N. Now it suffices to verify that x* is a cluster point of (xn)n∈N. ∀ε > 0 and ∀n ∈ N, we have

\[ \sup_{k \ge n} x_k = y_n \ge x^*, \]

and yn − ε is not an upper bound for {xk | k ≥ n}. Thus, we can find some k(n) ≥ n with $x_{k(n)} > y_n - \varepsilon \ge x^* - \varepsilon$. Repeating this procedure with k(n) + 1 instead of n, etc., we can find a subsequence with $x_{n_k} > x^* - \varepsilon$, ∀k ∈ N. There cannot be infinitely many k with $x_{n_k} \ge x^* + \varepsilon$ (otherwise yn ≥ x* + ε ∀n, hence x* = lim yn ≥ x* + ε, a contradiction) and so x* is a cluster point.

Case 3: x* = +∞. Since $x^* = \inf_{n\in\mathbb{N}} y_n$ it follows $y_n = \sup_{k \ge n} x_k = +\infty$, ∀n ∈ N. Thus for every K ∈ R and n ∈ N there is some k ≥ n with xk > K and so x* is a cluster point of (xn)n∈N. It is obviously the biggest. □

1.5.6. Theorem. Let (xn)n∈N be a sequence in R. Then

\[ \exists x \in \overline{\mathbb{R}} \text{ with } x_n \to x \text{ in } \overline{\mathbb{R}} \iff \liminf_{n\to\infty} x_n = \limsup_{n\to\infty} x_n. \]

In this case x is uniquely determined through

\[ x = \lim_{n\to\infty} x_n = \liminf_{n\to\infty} x_n = \limsup_{n\to\infty} x_n. \]

Proof. If (xn)n∈N is bounded, this is just Remark 1.4.9 (using 1.5.5). Suppose hence that (xn)n∈N is unbounded. Then (xn)n∈N must be either bounded below or bounded above (otherwise: ∄ lim xn in R̄ and lim inf xn ≠ lim sup xn).

“=⇒”: If (xn)n∈N is bounded below and unbounded above and xn → x, then lim xn = +∞ and x = +∞ is the sole accumulation point. Consequently lim inf xn = lim sup xn = x.

If (xn)n∈N is bounded above and unbounded below and xn → x, then similarly lim xn = lim inf xn = lim sup xn = −∞.

“⇐=”: If (xn)n∈N is bounded below and unbounded above, or bounded above and unbounded below, and lim inf xn = lim sup xn, then similarly either ∃ lim xn = +∞ or ∃ lim xn = −∞ (this is left as an exercise).

□


1.6. Euclidean space

1.6.1. Definition. Let n ∈ N. The n-dimensional Euclidean space Rⁿ consists of all n-tuples of real numbers, i.e.

\[ \mathbb{R}^n := \{x = (x_1, \ldots, x_n) \mid x_1, \ldots, x_n \in \mathbb{R}\}. \]

Thus Rⁿ = R × · · · × R is the n-fold Cartesian product of R with itself.

Addition and scalar multiplication of n-tuples are defined by

\[ (x_1, \ldots, x_n) + (y_1, \ldots, y_n) := (x_1 + y_1, \ldots, x_n + y_n), \]

\[ \alpha \cdot (x_1, \ldots, x_n) := (\alpha \cdot x_1, \ldots, \alpha \cdot x_n), \quad \alpha \in \mathbb{R}. \]

Addition and scalar multiplication satisfy the commutative, associative and distributive laws and make Rⁿ into a vector space over the real numbers. The zero element of Rⁿ,

\[ 0 := (\underbrace{0, 0, \ldots, 0}_{n\text{-times}}), \]

is also called the origin or null vector.

1.6.2. Definition. The Euclidean norm of a vector x = (x1, . . . , xn) ∈ Rⁿ is defined by

\[ \|x\| := \Bigl(\sum_{i=1}^{n} x_i^2\Bigr)^{1/2} = \sqrt{\sum_{i=1}^{n} x_i^2}. \]

Here $\sum_{i=1}^{n} a_i := a_1 + \cdots + a_n$.

The Euclidean distance between two vectors x = (x1, . . . , xn) and y = (y1, . . . , yn) is defined by

\[ d(x, y) := \|x - y\| = \Bigl(\sum_{i=1}^{n} (x_i - y_i)^2\Bigr)^{1/2}. \]

The Euclidean inner product of x = (x1, . . . , xn) and y = (y1, . . . , yn) is defined by

\[ \langle x, y \rangle := \sum_{i=1}^{n} x_i \cdot y_i. \]
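The three definitions translate directly into code. The following is a minimal sketch in which vectors are represented as plain Python lists of floats; the function names `inner`, `norm` and `dist` are chosen here for illustration.

```python
import math

def inner(x, y):
    """Euclidean inner product <x, y> = sum of x_i * y_i."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm ||x|| = <x, x>^(1/2)."""
    return math.sqrt(inner(x, x))

def dist(x, y):
    """Euclidean distance d(x, y) = ||x - y||."""
    return norm([xi - yi for xi, yi in zip(x, y)])

x = [3.0, 4.0]
origin = [0.0, 0.0]
print(norm(x), dist(x, origin), inner([1, 2, 3], [4, 5, 6]))
```

Defining the norm through the inner product and the distance through the norm mirrors the order of dependence used later for general inner product spaces.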

1.6.3. Theorem. Let x, y, z ∈ Rn.


I. Properties of the Euclidean inner product

(i) 〈x, x〉 ≥ 0 positivity

(ii) 〈x, x〉 = 0 ⇐⇒ x = 0 nondegeneracy

(iii) 〈x, y + z〉 = 〈x, y〉+ 〈x, z〉 distributivity

(iv) 〈αx, y〉 = α〈x, y〉 multiplicativity

(v) 〈x, y〉 = 〈y, x〉 symmetry

II. Properties of the Euclidean norm

(i) ‖x‖ ≥ 0 positivity

(ii) ‖x‖ = 0 ⇐⇒ x = 0 nondegeneracy

(iii) ‖α · x‖ = |α| · ‖x‖ multiplicativity

(iv) ‖x+ y‖ ≤ ‖x‖+ ‖y‖ triangle inequality

III. Properties of the Euclidean distance

(i) d(x, y) ≥ 0 positivity

(ii) d(x, y) = 0 ⇐⇒ x = y nondegeneracy

(iii) d(x, y) = d(y, x) symmetry

(iv) d(x, z) ≤ d(x, y) + d(y, z) triangle inequality

IV. Cauchy-Schwarz inequality

|〈x, y〉| ≤ ‖x‖ · ‖y‖.

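Proof. For IV. see 1.7.6 below. The other statements are mostly obvious from the definitions. We only prove II.(iv) (⇒ III.(iv)):

\[
\begin{aligned}
\|x + y\|^2 = \langle x + y, x + y \rangle &\overset{\text{I.(iii), (v)}}{=} \langle x, x\rangle + 2\langle x, y\rangle + \langle y, y\rangle \\
&\overset{\text{IV.}}{\le} \|x\|^2 + 2\|x\| \cdot \|y\| + \|y\|^2 = \bigl(\|x\| + \|y\|\bigr)^2.
\end{aligned}
\]

□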

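1.6.4. Proposition. Let x = (x1, . . . , xn) ∈ Rⁿ. Define the max-norm by

\[ |x|_{\max} := \max_{1 \le i \le n} |x_i|. \]

Then

\[ |x|_{\max} \le \|x\| \le \sqrt{n}\, |x|_{\max}. \]

Proof. Almost obvious: $|x_i|^2 \le \sum_{j=1}^{n} x_j^2 \le n\, |x|_{\max}^2$ for every 1 ≤ i ≤ n; now take square roots. □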
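Both inequalities can be spot-checked on random vectors. This is only a numerical sanity check under stated assumptions: floating-point arithmetic is used, so a small tolerance `TOL` (a value picked here, not from the notes) absorbs rounding errors, and passing the check proves nothing.

```python
import math
import random

TOL = 1e-12  # tolerance for floating-point rounding (an assumption of this sketch)

def max_norm(x):
    return max(abs(xi) for xi in x)

def euclid_norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

rng = random.Random(0)
for _ in range(1000):
    n = rng.randint(1, 6)
    x = [rng.uniform(-10, 10) for _ in range(n)]
    assert max_norm(x) <= euclid_norm(x) + TOL
    assert euclid_norm(x) <= math.sqrt(n) * max_norm(x) + TOL
```

Both bounds are sharp: for a vector with a single non-zero coordinate the left inequality is an equality, and for x = (1, . . . , 1) the right one is.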


1.7. Norms, inner products and metrics

1.7.1. Definition. A metric on a set M is a mapping

d : M ×M −→ R, such that for any x, y, z ∈M :

(i) d(x, y) = 0 ⇐⇒ x = y nondegeneracy

(ii) d(x, y) = d(y, x) symmetry

(iii) d(x, z) ≤ d(x, y) + d(y, z) triangle inequality

The pair (M,d) is called a metric space.

Remark. (i), (ii), (iii) =⇒ d(x, y) ≥ 0, hence

d : M ×M −→ R+ := [0,∞[:= {x ∈ R | 0 ≤ x <∞}.

Proof. 0 = d(x, x) ≤ d(x, y) + d(y, x) = 2d(x, y). □

1.7.2. Example. (a) M = Rn, d(x, y) := ‖x− y‖ (Euclidean distance).

(b) Discrete metric on M: for x, y ∈ M

\[ d(x, y) := \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \neq y. \end{cases} \]

(c) Induced metric: (M, d) metric space, A ⊂ M.

$d_A := d|_{A \times A}$, i.e. $d_A(x, y) = d(x, y)$, x, y ∈ A.

=⇒ (A, dA) metric space.
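On a finite set, the three metric axioms of 1.7.1 can be verified by brute force. The sketch below does this for the discrete metric; the checker `is_metric_on` is a hypothetical helper written for this illustration, and it tests the axioms only on the finite list of points it is given.

```python
from itertools import product

def discrete_metric(x, y):
    """d(x, y) = 0 if x = y, and 1 otherwise."""
    return 0 if x == y else 1

def is_metric_on(points, d):
    """Check nondegeneracy, symmetry and the triangle inequality on `points`."""
    for x, y, z in product(points, repeat=3):
        if (d(x, y) == 0) != (x == y):
            return False  # nondegeneracy fails
        if d(x, y) != d(y, x):
            return False  # symmetry fails
        if d(x, z) > d(x, y) + d(y, z):
            return False  # triangle inequality fails
    return True

print(is_metric_on(["a", "b", "c"], discrete_metric))
# The squared distance on R violates the triangle inequality:
print(is_metric_on([0, 1, 2], lambda x, y: (x - y) ** 2))
```

The second call shows that the check is not vacuous: (x − y)² is symmetric and nondegenerate, yet d(0, 2) = 4 exceeds d(0, 1) + d(1, 2) = 2.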

1.7.3. Definition. A normed vector space (V, ‖·‖) is a vector space V (over the real numbers) together with a map ‖·‖ : V → R, called a norm on V, such that

(i) ‖x‖ = 0 ⇐⇒ x = 0,

(ii) ‖λ · x‖ = |λ| · ‖x‖, ∀x ∈ V, λ ∈ R,

(iii) ‖x + y‖ ≤ ‖x‖ + ‖y‖, ∀x, y ∈ V.

Remark. A norm satisfies ‖x‖ ≥ 0, ∀x ∈ V , because

0 = ‖x− x‖ ≤ ‖x‖+ ‖ − x‖ = 2‖x‖.

Hence ‖ · ‖ : V → R+.

1.7.4. Proposition. (V, ‖ · ‖) normed vector space =⇒ d(x, y) := ‖x − y‖,x, y ∈ V , defines a metric on V .

Proof. Immediate consequence of the definitions. □

Remark. A bounded metric on a vector space V ≠ {0} (over the real numbers) cannot come from a norm as in 1.7.4, because if this were the case and v ∈ V, v ≠ 0, then

\[ d(\lambda \cdot v, 0) = \|\lambda \cdot v\| = |\lambda| \cdot \|v\| \xrightarrow[\lambda \to \infty]{} \infty. \]

1.7.5. Definition. A vector space V (over the real numbers) together with a map

〈·, ·〉 : V × V −→ R

is called an inner product space, if

(i) 〈v, v〉 ≥ 0, ∀v ∈ V.

(ii) 〈v, v〉 = 0 ⇐⇒ v = 0.

(iii) 〈λ · v, w〉 = λ〈v, w〉, ∀v, w ∈ V, λ ∈ R.

(iv) 〈u, v + w〉 = 〈u, v〉 + 〈u, w〉, ∀u, v, w ∈ V.

(v) 〈v, w〉 = 〈w, v〉, ∀v, w ∈ V.

〈·, ·〉 is then called an inner product in V .

1.7.6 (Cauchy-Schwarz inequality). If (V, 〈·, ·〉) is an inner product space, then

\[ |\langle v, w\rangle| \le \langle v, v\rangle^{1/2} \langle w, w\rangle^{1/2}, \quad \forall v, w \in V. \]

If w ≠ 0 and equality holds, then v = ρ · w for some ρ ∈ R.

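Proof. Let v, w ∈ V. If w = 0, then 〈v, w〉 = 0 and the statement holds. If w ≠ 0, define

λ := 〈w, w〉 > 0, µ := −〈v, w〉.

Then

\[
\begin{aligned}
0 &\le \langle \lambda \cdot v + \mu \cdot w,\ \lambda \cdot v + \mu \cdot w\rangle \\
&= \lambda^2 \langle v, v\rangle + 2\lambda\mu \langle v, w\rangle + \mu^2 \langle w, w\rangle \\
&= \lambda \bigl(\langle w, w\rangle\langle v, v\rangle - 2\langle v, w\rangle^2 + \langle v, w\rangle^2\bigr) \\
&= \lambda \bigl(\langle w, w\rangle\langle v, v\rangle - \langle v, w\rangle^2\bigr). \qquad (\star)
\end{aligned}
\]

Since λ > 0, we get 〈v, w〉² ≤ 〈v, v〉〈w, w〉. Furthermore, from (⋆) we get

\[
\begin{aligned}
|\langle v, w\rangle| = \langle v, v\rangle^{1/2}\langle w, w\rangle^{1/2} &\iff \langle \lambda \cdot v + \mu \cdot w,\ \lambda \cdot v + \mu \cdot w\rangle = 0 \\
&\iff \lambda \cdot v + \mu \cdot w = 0 \\
&\iff v = -\frac{\mu}{\lambda} \cdot w.
\end{aligned}
\]

□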
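The key identity of the proof and the equality case can be checked on integer vectors, where all arithmetic is exact. The sketch below uses the Euclidean inner product on Rⁿ as the example inner product; the helper names `cs_gap` and `star_identity_holds` are hypothetical.

```python
def inner(v, w):
    """Euclidean inner product, used here as the example inner product."""
    return sum(a * b for a, b in zip(v, w))

def cs_gap(v, w):
    """<v,v><w,w> - <v,w>^2, which is >= 0 by the Cauchy-Schwarz inequality."""
    return inner(v, v) * inner(w, w) - inner(v, w) ** 2

def star_identity_holds(v, w):
    """Check <lam*v + mu*w, lam*v + mu*w> = lam * (<w,w><v,v> - <v,w>^2)
    for lam = <w,w> and mu = -<v,w>, as in the proof."""
    lam, mu = inner(w, w), -inner(v, w)
    u = [lam * a + mu * b for a, b in zip(v, w)]
    return inner(u, u) == lam * cs_gap(v, w)

v, w = [1, 2, 3], [4, 5, 6]
print(cs_gap(v, w), star_identity_holds(v, w))
print(cs_gap([2, 4, 6], [1, 2, 3]))  # v = 2 * w: equality case
```

For linearly dependent vectors the gap vanishes, in line with the statement that equality forces v = ρ · w.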

1.7.7. Proposition. If (V, 〈·, ·〉) is an inner product space and

$\|v\| := \langle v, v\rangle^{1/2}$, v ∈ V,

then ‖ · ‖ is a norm on V .

Proof. Everything except the triangle inequality is straightforward. The triangle inequality follows as in the Euclidean case 1.6.3 IV. immediately from the Cauchy-Schwarz inequality. □

1.8. Construction of the real numbers

This section will be added later.

1.9. Section 1 Additional examples

Examples Section 1.1

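1) For any z ∈ F: 0 · z = 0.

Proof.

\[ 0 \cdot z \overset{(3)}{=} (0 + 0) \cdot z \overset{(5),(9)}{=} 0 \cdot z + 0 \cdot z \tag{$*$} \]

and

\[
\begin{aligned}
0 \cdot z &\overset{(1),(3)}{=} 0 + 0 \cdot z \overset{(1),(4)}{=} \bigl(-(0 \cdot z) + 0 \cdot z\bigr) + 0 \cdot z \\
&\overset{(2)}{=} -(0 \cdot z) + (0 \cdot z + 0 \cdot z) \\
&\overset{(*)}{=} -(0 \cdot z) + 0 \cdot z \\
&\overset{(1),(4)}{=} 0.
\end{aligned}
\]

□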

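2) x, y ∈ F =⇒ (x + y)² = x² + 2xy + y².

Proof.

\[
\begin{aligned}
(x + y)^2 &\overset{\text{def}}{=} (x + y)(x + y) \\
&\overset{(9)}{=} (x + y)x + (x + y)y \\
&\overset{(5),(9)}{=} xx + xy + yx + yy \\
&\overset{\text{def},\,(5)}{=} x^2 + xy + xy + y^2.
\end{aligned}
\]

But for all z ∈ F: $z + z = z(\underbrace{1 + 1}_{=:\,2}) = z \cdot 2 = 2z$.

Therefore xy + xy = 2xy. □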

3) x ∈ F =⇒ −x = (−1) · x.

Proof.

\[ x + (-1)x \overset{(7)}{=} 1 \cdot x + (-1)x \overset{(9),(4)}{=} (1 + (-1))x \overset{(4)}{=} 0 \cdot x \overset{1)}{=} 0. \]

Since additive inverses are unique, we get −x = (−1) · x. □

4) (−1)² = 1.

Proof.

\[ (-1)^2 - 1 \overset{3)}{=} (-1)(-1) + (-1)1 \overset{(9),(1)}{=} (-1)(1 + (-1)) \overset{(4)}{=} (-1) \cdot 0 \overset{(5)}{=} 0 \cdot (-1) \overset{1)}{=} 0. \]

Now add 1 on both sides to conclude. □

If F is an ordered field, then:

5) x ≤ 0 ⇒ −x ≥ 0.

Proof. If x ≤ 0, then

\[ 0 \overset{(1),(4)}{=} -x + x \overset{(14)}{\le} -x + 0 \overset{(3)}{=} -x. \]

□

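6) x ∈ F =⇒ x² ≥ 0.

Proof. For x ≥ 0 this is true by (15). If x ≤ 0, then

\[
x^2 \overset{\text{lecture}}{=} \bigl(-(-x)\bigr)\bigl(-(-x)\bigr) \overset{3)}{=} \bigl(-(-x)\bigr)\bigl((-1)(-x)\bigr) \overset{(5),(6)}{=} (-1)\bigl(-(-x)(-x)\bigr) \overset{3),(6)}{=} (-1)^2(-x)^2 \overset{4)}{=} (\underbrace{-x}_{\ge 0 \text{ by } 5)})^2 \ge 0,
\]

where the last inequality holds by (15). □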

7) x ∈ F =⇒ |x| ≥ 0.

Proof. True for x ≥ 0 by definition. If x < 0, then |x| = −x ≥ 0 by 5). □

8) Examples of partially ordered sets S that are not totally ordered:

(i) (S, ≤) where S := {f | f : R → R is a map} and for f, g ∈ S

f ≤ g :⇐⇒ f(x) ≤ g(x), ∀x ∈ R.

(ii) (S, ≤) where S := R × R, and for (x, y), (x̃, ỹ) ∈ R × R:

(x, y) ≤ (x̃, ỹ) :⇐⇒ x ≤ x̃ and y ≤ ỹ.

(iii) (S, ⊂) where for a set X, S := P(X).

Examples Section 1.4

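1) Let (xn)n≥1 be a sequence in R, such that

\[ |x_n - x_{n+1}| < \frac{1}{2^n}, \quad \forall n \ge 1. \]

Show that (xn)n≥1 converges.

Proof. For any k, n ≥ 1:

\[
\begin{aligned}
|x_n - x_{n+k}| &\le |x_n - x_{n+1}| + |x_{n+1} - x_{n+2}| + \cdots + |x_{n+k-1} - x_{n+k}| \\
&\le \frac{1}{2^n} + \frac{1}{2^{n+1}} + \cdots + \frac{1}{2^{n+k-1}} \\
&= \frac{1}{2^n}\Bigl(1 + \frac{1}{2} + \cdots + \frac{1}{2^{k-1}}\Bigr) \le \frac{2}{2^n}.
\end{aligned}
\]

Thus $|x_n - x_m| \le \frac{1}{2^{n-1}}$ if m ≥ n.

Given ε > 0 let N ∈ N be such that $\frac{1}{2^{N-1}} < \varepsilon$. Then

\[ |x_n - x_m| \le \frac{1}{2^{n-1}} \le \frac{1}{2^{N-1}} < \varepsilon, \quad \forall m \ge n \ge N. \]

(xn)n≥1 is then a Cauchy sequence and hence converges by 1.4.3. □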
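The estimate |xn − xm| ≤ 1/2ⁿ⁻¹ can be checked exactly on a concrete sequence. The sample sequence below is an assumption made for this illustration: x1 = 0 and xn+1 = xn + (−1)ⁿ/2ⁿ⁺¹, so that consecutive terms differ by 1/2ⁿ⁺¹ < 1/2ⁿ.

```python
from fractions import Fraction

def x(n):
    """x_1 = 0 and x_{n+1} = x_n + (-1)^n / 2^(n+1) (sample sequence)."""
    return sum(Fraction((-1) ** j, 2 ** (j + 1)) for j in range(1, n))

for n in range(1, 15):
    # Hypothesis of the example: consecutive terms are closer than 1/2^n.
    assert abs(x(n) - x(n + 1)) < Fraction(1, 2 ** n)
    # Conclusion of the telescoping estimate, for all tested m >= n.
    for m in range(n, 30):
        assert abs(x(n) - x(m)) <= Fraction(1, 2 ** (n - 1))
```

The check only covers finitely many index pairs; the proof above is what guarantees the bound for all m ≥ n.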

Example Section 1.5:

1) Construct a sequence (xn)n≥1 in R with cluster points 0, 1, −1, and sup{xn | n ≥ k} ≠ lim sup xn, inf{xn | n ≥ k} ≠ lim inf xn for any k ≥ 1. For instance:

\[ x_n = \begin{cases} 1 + \frac{1}{n} & \text{if } n = 3j + 1, \\ 0 & \text{if } n = 3j + 2, \\ -1 - \frac{1}{n} & \text{if } n = 3j + 3, \end{cases} \]

where j = 0, 1, 2, 3, . . . , hence n ∈ N.

CHAPTER 2

The topology of Euclidean space (and metric spaces)

In this section we study the basic topological properties of Rⁿ. Most of the material depends only on basic properties of the distance function and so it makes sense to consider general metric spaces.

2.1. Open sets

2.1.1. Definition. Let (M, d) be a metric space. For each fixed x ∈ M and ε > 0, the set

B(x, ε) := {y ∈ M | d(x, y) < ε}

is called the open ε-ball about x (also called ε-neighborhood about x, or open ball with radius ε and center x). A set A ⊂ M is said to be open, if for each x ∈ A, there exists an ε > 0 such that B(x, ε) ⊂ A. A neighborhood of a point in M is an open set containing that point.

Remark. By definition ∅ and M are open !

Example. (i) ]0, 1[ is open in R, but not open in R² with Euclidean distance.

(ii) Let d be the Euclidean metric on R². Then D := {x ∈ R² | d(x, 0) = ‖x‖ ≤ 1} is not open.

2.1.2. Proposition. In a metric space, each ε-ball B(x, ε) is open.

Proof. Let y ∈ B(x, ε). Choose ε′ = ε − d(x, y). Then ε′ > 0. Let z ∈ B(y, ε′), i.e. d(z, y) < ε′. By the triangle inequality

d(x, z) ≤ d(x, y) + d(y, z) < d(x, y) + ε′ = ε,

hence z ∈ B(x, ε), and so B(y, ε′) ⊂ B(x, ε), i.e. B(x, ε) is open.

□
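The choice ε′ = ε − d(x, y) from the proof can be tested by sampling. The sketch below works in R² with the Euclidean metric; the centre, radius and point y are made-up sample values, and `random_point_in_ball` is a hypothetical helper based on rejection sampling. Floating-point arithmetic is used, so this is an empirical check and not a proof.

```python
import math
import random

def random_point_in_ball(center, radius, rng):
    """Sample a point z with d(center, z) < radius by rejection sampling."""
    while True:
        candidate = [c + rng.uniform(-radius, radius) for c in center]
        if math.dist(center, candidate) < radius:
            return candidate

rng = random.Random(0)
x, eps = [0.0, 0.0], 1.0
y = [0.6, 0.3]                      # a point of B(x, eps)
eps_prime = eps - math.dist(x, y)   # radius chosen as in the proof
assert eps_prime > 0

for _ in range(1000):
    z = random_point_in_ball(y, eps_prime, rng)
    assert math.dist(x, z) < eps    # z lies in B(x, eps)
```

Every sampled point of B(y, ε′) lands in B(x, ε), as the triangle inequality predicts.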

2.1.3. Proposition. Let (M,d) be a metric space. Then:

(i) The intersection of a finite number of open subsets of M is open.

(ii) The union of an arbitrary collection of open subsets of M is open.

(iii) The empty set and the whole set M are open.

Remark. A general set M (not necessarily with a metric on it) together with a collection of subsets of M, called the open sets, satisfying (i)-(iii) of 2.1.3 is called a topological space and the collection of open sets (satisfying (i)-(iii)) is then called a topology on M.

Proof of 2.1.3. (i) It is enough to show that the intersection of two open sets is again open. The general result then follows by induction:

A1 ∩ · · · ∩ An+1 = (A1 ∩ · · · ∩ An) ∩ An+1.

Let hence A, B be open and let C := A ∩ B ≠ ∅ (otherwise C is already open by definition). Let x ∈ C be arbitrary. Since x ∈ A and x ∈ B and A, B are open: ∃ε, ε′ > 0 with B(x, ε) ⊂ A and B(x, ε′) ⊂ B. Let ε′′ := min(ε, ε′). Then B(x, ε′′) ⊂ A and B(x, ε′′) ⊂ B, hence B(x, ε′′) ⊂ A ∩ B and so A ∩ B is open.

(ii) Let (Ui)i∈I be a collection of open sets, I an arbitrary index set. Let $A := \bigcup_{i \in I} U_i$, and x ∈ A. Then x ∈ Ui for some i ∈ I. Since Ui is open, ∃ε > 0 with B(x, ε) ⊂ Ui ⊂ A, hence A is open.

(iii) This holds by definition. □

2.1.4. Example. (i) $\bigcap_{n \in \mathbb{N}} \,]-\tfrac{1}{n},\, 1 + \tfrac{1}{n}[ \;=\; [0, 1]$ is not open! In fact it is closed, see below.

(ii) S := {(x, y) ∈ R² | x ∈ ]0, 1[} is open, since

(x, y) ∈ S =⇒ B((x, y), min(x, 1 − x)) ⊂ S.

2.2. The interior of a set

2.2.1. Definition. Let (M, d) be a metric space, A ⊂ M. x ∈ A is called an interior point of A :⇐⇒ ∃U open with x ∈ U ⊂ A. The interior of A is the collection of all interior points of A and is denoted by int(A) or $\mathring{A}$.

Example. a) Let x ∈ Rⁿ, Rⁿ equipped with Euclidean metric, n ≥ 1. Then

int({x}) = ∅,

\[ \operatorname{int}([0, 1]) = \begin{cases} \emptyset, & \text{if } n \ge 2, \\ ]0, 1[, & \text{if } n = 1. \end{cases} \]

b) $\operatorname{int}(A) = \bigcup_{U \subset A,\ U \text{ open}} U$ (easy exercise).

2.3. Closed sets

2.3.1. Definition. Let (M,d) be a metric space. Then:

B ⊂M is closed :⇐⇒ Bc := M \B is open.

Example. a) Let x ∈ Rⁿ. Then {x} is closed, because Rⁿ \ {x} is open.

b) [0, 1] is closed in Rⁿ, n ≥ 1, since Rⁿ \ [0, 1] is open in Rⁿ. [0, 1[ is not closed in Rⁿ, n ≥ 1. For instance, for n = 3: (1, 0, 0) ∈ R³ \ [0, 1[ but B((1, 0, 0), ε) ⊄ R³ \ [0, 1[ ∀ε > 0. Hence R³ \ [0, 1[ is not open, and so [0, 1[ is not closed in R³.

2.3.2. Proposition. Let (M, d) be a metric space. Then:


(i) The union of a finite number of closed subsets of M is closed.

(ii) The intersection of an arbitrary family of closed subsets of M is closed.

(iii) The whole set M and the empty set are closed.

Proof. Follows directly from 2.1.3 using

\[ \bigcap_{i \in I} A_i = \Bigl(\bigcup_{i \in I} A_i^c\Bigr)^c. \]

□

Example. a) Let (M, d) be a metric space, ε > 0, x ∈ M. B(x, ε) is closed in $(B(x, \varepsilon), d_{B(x,\varepsilon)})$ because ∅ is open in B(x, ε).

b) {x1, x2, . . . , xd} ⊂ Rⁿ is closed in Rⁿ, because {xi} is closed in Rⁿ and $\{x_1, \ldots, x_d\} = \bigcup_{i=1}^{d} \{x_i\}$.

2.4. Accumulation points

2.4.1. Definition. Let (M, d) be a metric space. x ∈ M is called an accumulation point of A ⊂ M, if every ε-neighborhood of x also contains some point of A other than x (⇐⇒ ∀ε > 0: B(x, ε) ∩ (A \ {x}) ≠ ∅).

Notation: The accumulation points of A are denoted by acc(A).

There is also the notion of adherent point x of A:

x ∈ ad(A) :⇐⇒ x ∈ M and ∀ε > 0 : B(x, ε) ∩ A ≠ ∅.

Then: acc(A) ⊂ ad(A) and

ad(A) \ acc(A) = “isolated” points of A,

i.e. x ∈ ad(A) \ acc(A), iff

∀ε > 0 : B(x, ε) ∩ A ≠ ∅ and ∃ε > 0 : B(x, ε) ∩ (A \ {x}) = ∅,

which means

∃ε > 0 : B(x, ε) ∩ A = {x}.

Moreover, we have:

2.4.2. Lemma. We have ad(A) = A ∪ acc(A).

Proof. Let x ∈ ad(A) ∩ Aᶜ. Then for any ε > 0

\[ B(x, \varepsilon) \cap \underbrace{A}_{=\, A \setminus \{x\}} \neq \emptyset \]

and so x ∈ acc(A). Thus

ad(A) = (ad(A) ∩ A) ∪ (ad(A) ∩ Aᶜ) ⊂ A ∪ acc(A),

and obviously

A ∪ acc(A) ⊂ ad(A) ∪ ad(A) = ad(A).

□

2.4.3. Example. a) acc(]0, 1[) = [0, 1] = ad(]0, 1[).

b) acc(]0, 1[ ∪ {3}) = [0, 1] ≠ ad(]0, 1[ ∪ {3}) = [0, 1] ∪ {3}.

c) acc({(−1)ⁿ + 1/n | n ∈ N}) = {−1, 1}.

d) acc({(−1)ⁿ | n ∈ N}) = ∅, but ad({(−1)ⁿ | n ∈ N}) = {−1, 1}.

e) A ⊂ B =⇒ acc(A) ⊂ acc(B).

2.4.4. Theorem. A ⊂ M is closed ⇐⇒ A ⊃ acc(A).

Remark. In general A ⊄ acc(A), e.g. A = {x}, then {x} ⊄ acc({x}) = ∅. Or see Example 2.4.3.

Proof of 2.4.4. “=⇒”: Let A be closed. Then Aᶜ is open. W.l.o.g. Aᶜ ≠ ∅. Let x ∈ Aᶜ. Then: ∃ε > 0 with B(x, ε) ⊂ Aᶜ, hence

\[ B(x, \varepsilon) \cap (\underbrace{A \setminus \{x\}}_{=\,A}) = \emptyset \]

and so x ∉ acc(A), i.e. acc(A) ⊂ A.

“⇐=”: Suppose acc(A) ⊂ A, and let x ∈ Aᶜ. Then x ∉ acc(A), hence there is some ε > 0 with $B(x, \varepsilon) \cap (\underbrace{A \setminus \{x\}}_{=\,A}) = \emptyset$. Consequently B(x, ε) ⊂ Aᶜ and so Aᶜ is open, i.e. A is closed. □

2.4.5. Remark. A ⊂M is closed ⇐⇒ A ⊃ ad(A).

Proof. This follows directly from Lemma 2.4.2 and Theorem 2.4.4. □

2.4.6. Example. acc(Q) = R (=⇒ by 2.4.4: Q is not closed in R).

Proof. Let x ∈ R, ε > 0 be arbitrary. Then by 1.2.15(i): ∃r ∈ Q with x < r < x + ε

=⇒ B(x, ε) ∩ (Q \ {x}) ≠ ∅ =⇒ x ∈ acc(Q).

□

2.5. The closure of a set

2.5.1. Definition. Let (M, d) be a metric space, A ⊂ M. The closure of A is defined as

\[ \operatorname{cl}(A) := \overline{A} := \bigcap_{\substack{C \supset A \\ C \text{ closed}}} C \quad (\text{closed by 2.3.2(ii)}). \]

Example. cl(Q) = ? ... see the following.

2.5.2. Proposition. A ⊂ M. Then cl(A) = A ∪ acc(A) = ad(A).

Proof. The last equality follows from Lemma 2.4.2. Have to show:

\[ \bigcap_{\substack{C \supset A \\ C \text{ closed}}} C = A \cup \operatorname{acc}(A). \]

Let C be closed, C ⊃ A. Then

\[ C \overset{2.4.4}{\supset} \operatorname{acc}(C) \supset \operatorname{acc}(A), \]

and so C ⊃ A ∪ acc(A). Consequently,

\[ \bigcap_{\substack{C \supset A \\ C \text{ closed}}} C \supset A \cup \operatorname{acc}(A). \]

It remains to show: A ∪ acc(A) is closed, because then also

\[ A \cup \operatorname{acc}(A) \supset \bigcap_{\substack{C \supset A \\ C \text{ closed}}} C. \]

Let x ∈ (A ∪ acc(A))ᶜ = ad(A)ᶜ. Then

∃ε > 0 with B(x, ε) ∩ A = ∅,

i.e. B(x, ε) ⊂ Aᶜ. But then also B(x, ε) ⊂ ad(A)ᶜ, because

y ∈ B(x, ε) =⇒ ∃ε′ > 0 with B(y, ε′) ⊂ B(x, ε) ⊂ Aᶜ

and so ∃ε′ > 0 with B(y, ε′) ∩ A = ∅.

Thus (A ∪ acc(A))ᶜ = ad(A)ᶜ is open and so A ∪ acc(A) is closed.

□

Example. a) cl(Q) = Q ∪ acc(Q) = Q ∪ R = R (by Example 2.4.6).

b) $\operatorname{cl}\bigl(\underbrace{]0, 1]}_{:=\,\{x \in \mathbb{R} \mid 0 < x \le 1\}} \cup \{3\}\bigr) = \,]0, 1] \cup \{3\} \cup [0, 1] = [0, 1] \cup \{3\}$.

2.5.3. Example. $\operatorname{cl}(A) = \{x \in M \mid \underbrace{\inf\{d(x, y) \mid y \in A\}}_{=:\, d(x, A)} = 0\}$

Proof. “⊂”: Let x ∈ cl(A). Then x ∈ A ∪ acc(A) by 2.5.2.

1. case: x ∈ A ⇒ d(x, A) = 0.

2. case:

x ∈ acc(A) ⇒ ∀ε > 0, B(x, ε) ∩ (A \ {x}) ≠ ∅

⇒ ∀ε > 0 ∃y ∈ A, y ≠ x, d(x, y) < ε

⇒ d(x, A) = 0.

We have hence shown: x ∈ cl(A) ⇒ x ∈ M and d(x, A) = 0.

“⊃”: Now, let x ∈ M and d(x, A) = 0. Then

∀ε > 0, ∃y ∈ A : d(x, y) < ε.

Thus, if x ∉ A, then:

∀ε > 0, B(x, ε) ∩ (A \ {x}) ≠ ∅ ⇒ x ∈ acc(A) ⊂ A ∪ acc(A).

If x ∈ A, then x ∈ A ∪ acc(A). Consequently x ∈ A ∪ acc(A) = cl(A). □

2.6. The boundary of a set

2.6.1. Definition. Let (M, d) be a metric space, A ⊂ M. The boundary of A is defined by

bd(A) := ∂A := cl(A) ∩ cl(Aᶜ).

2.6.2. Remark. ∂A is closed and ∂A = ∂Aᶜ. Moreover, by Proposition 2.5.2, we have ∂A = ad(A) ∩ ad(Aᶜ), i.e.

∂A = {x ∈ M | ∀ε > 0 : B(x, ε) ∩ A ≠ ∅ and B(x, ε) ∩ Aᶜ ≠ ∅}.

Example. S¹ := {x ∈ R² | ‖x‖ = 1}. Then

\[ \partial S^1 = \operatorname{cl}(S^1) \cap \underbrace{\operatorname{cl}(\mathbb{R}^2 \setminus S^1)}_{=\,(\mathbb{R}^2 \setminus S^1)\, \cup\, \operatorname{acc}(\mathbb{R}^2 \setminus S^1)\, =\, \mathbb{R}^2} = \operatorname{cl}(S^1) = S^1. \]

Example. a) ∂A ⊄ acc(A) in general. Counterexample:

A = [0, 1] ∪ {3} ⊂ R.

Then ∂A = {0, 1, 3}, but acc(A) = [0, 1].

Of course acc(A) ⊄ ∂A in general since acc(A) ⊃ int(A).

b) A := {1/n | n ∈ N} ⊂ R. Then

cl(A) = A ∪ acc(A) = A ∪ {0}, ∂A = cl(A) ∩ cl(Aᶜ) = (A ∪ {0}) ∩ R = A ∪ {0},

and

int(A) = ∅, cl(A) = int(A) ∪ ∂A.


2.7. Sequences

2.7.1. Definition. Let (M, d) be a metric space and (x_n)_{n∈N} be a sequence in M. We say that (x_n)_{n∈N} converges to x ∈ M, written

lim_{n→∞} x_n = x or x_n → x (as n → ∞),

if for any neighborhood U_x of x, there exists some N = N(U_x) ∈ N, with

x_n ∈ U_x, ∀n ≥ N.

2.7.2. Proposition. Let (x_n)_{n∈N} be a sequence in a metric space (M, d), x ∈ M. Then:

x_n → x ⇐⇒ ∀ε > 0, ∃N(ε) ∈ N : d(x, x_n) < ε, ∀n ≥ N(ε).

Proof. “=⇒”: Suppose x_n → x, and let ε > 0. Since B(x, ε) is a neighborhood of x (i.e. an open set containing x), ∃N(B(x, ε)) ∈ N with

x_n ∈ B(x, ε) (⇐⇒ d(x, x_n) < ε), ∀n ≥ N(B(x, ε)).

“⇐=”: Let U_x be a neighborhood of x. Then: ∃ε > 0 with B(x, ε) ⊂ U_x. By assumption: ∃N(ε) ∈ N with

d(x, x_n) < ε (⇐⇒ x_n ∈ B(x, ε) ⊂ U_x), ∀n ≥ N = N(ε).

□

Example. Convergence in a normed vector space (V, ‖·‖). Let (x_n)_{n∈N} be a sequence in V. Then for x ∈ V:

x_n → x ⇐⇒ ∀ε > 0, ∃N(ε) ∈ N : ‖x − x_n‖ < ε, ∀n ≥ N(ε).

(This looks formally exactly the same as convergence in R^n.)

2.7.3. Proposition. Let (v_n)_{n∈N}, (w_n)_{n∈N} be sequences in a normed vector space V, (λ_n)_{n∈N} be a sequence in R, λ ∈ R, v, w ∈ V such that λ_n → λ in R, v_n → v, w_n → w in V. Then

(i) v_n + w_n → v + w.

(ii) µ v_n → µ v, ∀µ ∈ R.

(iii) λ_n u → λ u, ∀u ∈ V.

(iv) λ_n v_n → λ v.

(v) If λ_n ≠ 0, n ∈ N, λ ≠ 0, then (1/λ_n) v_n → (1/λ) v.

Proof. Same as with V = R. □

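In the vector space $V = \mathbb{R}^d$, the vectors have coordinates in $\mathbb{R}$,

\[ v = (v_1, \dots, v_d), \qquad v_n = (v_{n1}, \dots, v_{nd}). \]

2.7.4. Proposition. $v_n \to v$ in $\mathbb{R}^d$ ⇐⇒ $v_{ni} \to v_i$ in $\mathbb{R}$, $1 \le i \le d$.

Proof. “=⇒”: If $v_n \to v$, then ∀ε > 0, ∃N(ε) ∈ N with $\|v - v_n\| < \varepsilon$, ∀n ≥ N(ε). Then for every $i = 1, \dots, d$

\[ |v_i - v_{ni}| = \sqrt{(v_i - v_{ni})^2} \le \sqrt{\sum_{j=1}^{d} (v_j - v_{nj})^2} = \|v - v_n\| < \varepsilon, \quad \forall n \ge N(\varepsilon), \]

hence $v_{ni} \to v_i$ in $\mathbb{R}$ for every $i = 1, \dots, d$.

“⇐=”: If $v_{ni} \to v_i$, $1 \le i \le d$, then for ε > 0, ∃$N_1(\varepsilon), \dots, N_d(\varepsilon) \in \mathbb{N}$ with

\[ |v_{ni} - v_i| < \varepsilon, \quad \forall n \ge N_i(\varepsilon),\ 1 \le i \le d. \]

Let $N(\varepsilon) := \max(N_1(\varepsilon), \dots, N_d(\varepsilon))$. Then, by 1.6.4,

\[ \|v_n - v\| \le \sqrt{d}\, \max_{1 \le i \le d} |v_{ni} - v_i| < \sqrt{d} \cdot \varepsilon, \quad \forall n \ge N(\varepsilon). \]

□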
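The two inequalities used in the proof of 2.7.4 can be checked numerically. The sketch below is not part of the notes: the sequence `v_n`, its limit `LIMIT`, and the helper names are hypothetical choices made only for illustration. For each n it returns the largest coordinate error, the Euclidean error, and the upper bound √d · max_i |v_ni − v_i|.

```python
import math


def euclid_norm(v):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))


def v_n(n):
    """Hypothetical example sequence in R^3 with limit (1, 0, 2)."""
    return (1 + 1 / n, (-1) ** n / n ** 2, 2 - 3 / n)


LIMIT = (1.0, 0.0, 2.0)


def check_bounds(n):
    """Return (max_i |v_ni - v_i|, ||v_n - v||, sqrt(d) * max_i |v_ni - v_i|)."""
    diff = [a - b for a, b in zip(v_n(n), LIMIT)]
    comp_max = max(abs(c) for c in diff)
    return comp_max, euclid_norm(diff), math.sqrt(len(diff)) * comp_max


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, check_bounds(n))
```

In every row the three numbers should be increasing from left to right (up to rounding), and all three tend to 0, which is exactly the equivalence of coordinatewise convergence and convergence in norm.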


2.7.5. Let (M, d) be a metric space.

(i) A ⊂ M is closed ⇐⇒ for every sequence (x_n)_{n∈N} in A that converges in M, we have lim_{n→∞} x_n ∈ A.

(ii) Let B ⊂ M and x ∈ M. Then

x ∈ cl(B) ⇐⇒ ∃ sequence (x_n)_{n∈N} in B with lim x_n = x.

Proof. (i) “=⇒”: Let A be closed, x_n ∈ A and x_n → x ∈ M. Suppose x ∈ A^c. Since A^c is open, ∃ε > 0 with B(x, ε) ⊂ A^c. But then d(x, x_n) ≥ ε ∀n ∈ N, thus x_n → x is impossible.

“⇐=”: We want to show acc(A) ⊂ A. Let hence x ∈ acc(A). Thus ∀n ∈ N: ∃x_n ∈ B(x, 1/n) ∩ (A \ {x}). Clearly x_n ∈ A and x_n → x. Then x ∈ A by assumption.

(ii) “=⇒”: Let x ∈ cl(B) = B ∪ acc(B) (by 2.5.2). If x ∈ B, then choose x_n = x, ∀n ∈ N. If x ∈ acc(B), then: ∃x_n ∈ B \ {x} with x_n → x as in (i).

“⇐=”: If ∃x_n ∈ B with x_n → x ∈ M and x ∉ B, then

∀ε > 0, ∃x_n ∈ B(x, ε) ∩ (B \ {x}), i.e. x ∈ acc(B) ⊂ cl(B).

□

2.8. Completeness

2.8.1. Definition. Let (M, d) be a metric space and (x_n)_{n∈N} be a sequence in M.

(x_n)_{n∈N} is a Cauchy sequence :⇐⇒ ∀ε > 0, ∃N = N(ε) with

d(x_m, x_n) < ε ∀m, n ≥ N.

(M, d) is complete :⇐⇒ every Cauchy sequence in M converges in M.

2.8.2. Definition. A sequence (x_n)_{n∈N} in a metric space (M, d) is called bounded, if there exist x_0 ∈ M and b ∈ R with d(x_n, x_0) ≤ b for all n ∈ N. If 0 ∈ M (for instance if M is a vector space), we may choose w.l.o.g. the origin x_0 = 0 as reference point.

2.8.3. Proposition. Let (xn)n∈N be a sequence in a metric space. Then

(xn)n∈N converges =⇒ (xn)n∈N is bounded.


Proof. Similarly to the case of R (see 1.2.3). □

2.8.4. Proposition. In a metric space

(i) every convergent sequence is a Cauchy sequence,

(ii) Cauchy sequences are bounded,

(iii) if a subsequence of a Cauchy sequence converges to x, then so does the Cauchy sequence.

Proof. Similarly to the case where the metric space is R. □

2.8.5. Theorem. In R^n the following holds:

(x_n)_{n∈N} converges ⇐⇒ (x_n)_{n∈N} is a Cauchy sequence.

Proof. Follows directly from the corresponding results in R together with 2.7.4. □

2.8.6. Definition. Let (M, d) be a metric space. x ∈ M is called a cluster point (or accumulation point) of a sequence (x_n)_{n∈N} in M, if ∀ε > 0, ∃ infinitely many n ∈ N with d(x, x_n) < ε.

2.8.7. Proposition. Let (x_n)_{n∈N} be a sequence in a metric space (M, d) and x ∈ M. Then:

(i) x is a cluster point of (x_n)_{n∈N} ⇐⇒ ∀ε > 0, ∀N ∈ N, ∃n > N : d(x_n, x) < ε.

(ii) x is a cluster point of (x_n)_{n∈N} ⇐⇒ ∃ subsequence with x_{n_k} → x.

(iii) x_n → x ⇐⇒ every subsequence (x_{n_k})_{k∈N} converges to x.

(iv) x_n → x ⇐⇒ every subsequence (x_{n_k})_{k∈N} has a further subsequence (x_{n_{k_l}})_{l∈N} with x_{n_{k_l}} → x.

Proof. (i), (ii) and (iii) are similar to the corresponding proofs in R.

(iv) “=⇒”: is clear.

“⇐=” (by contraposition): Suppose (x_n)_{n∈N} does not converge to x. Then: ∃ε > 0, such that ∀N ∈ N, d(x, x_n) ≥ ε for some n ≥ N. By this we can find a subsequence (x_{n_k})_{k∈N} with

d(x_{n_k}, x) ≥ ε, ∀k ∈ N.

Obviously (x_{n_k}) has no subsequence (x_{n_{k_l}}) that converges to x. □

2.9. Series of real numbers and vectors

Throughout 2.9, let $(V, \|\cdot\|)$ be a complete normed vector space.

2.9.1. Definition. Let $(x_k)_{k\in\mathbb{N}}$ be a sequence in $V$. For $1 \le p \le q$, $p, q \in \mathbb{N}$, we define

\[ \sum_{k=p}^{q} x_k := x_p + x_{p+1} + \cdots + x_q, \]

and

\[ s_n := \sum_{k=1}^{n} x_k, \quad n \ge 1, \]

is called the n-th partial sum.

The sequence of partial sums $(s_n)_{n\ge 1}$ in $V$ is called an (infinite) series and is denoted by

\[ \sum_{k=1}^{\infty} x_k. \]

$\sum_{k=1}^{\infty} x_k$ is said to converge (resp. to converge to $x \in V$), if $(s_n)_{n\ge 1}$ converges (resp. converges to $x$ in $V$). If $\sum_{k=1}^{\infty} x_k$ converges to $x \in V$, we write

\[ \sum_{k=1}^{\infty} x_k := \lim_{n\to\infty} s_n = x. \]

If $(s_n)_{n\ge 1}$ diverges, i.e. does not converge, then $\sum_{k=1}^{\infty} x_k$ is said to diverge.

Remark. One can also consider series of the form $\sum_{k=0}^{\infty} x_k$ by an obvious modification of 2.9.1.

2.9.2. Theorem (Cauchy-criterion for series). A series $\sum_{k=1}^{\infty} x_k$ in $V$ converges, iff ∀ε > 0, ∃N = N(ε) ∈ N, such that

\[ \Big\| \sum_{k=n}^{m} x_k \Big\| < \varepsilon, \quad \forall m \ge n \ge N. \qquad (2.1) \]

In particular, if $\sum_{k=1}^{\infty} x_k$ converges, then $x_k \to 0$ (take $m = n$ in (2.1)).

Proof. Since $\sum_{k=n}^{m} x_k = s_m - s_{n-1}$, (2.1) just means that the sequence of partial sums $(s_n)_{n\ge 1}$ is a Cauchy sequence in $V$, which converges by completeness. □

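2.9.3. Definition. Let $\sum_{k=1}^{\infty} x_k$ be a series in $V$. Then:

(i) $\sum_{k=1}^{\infty} x_k$ converges absolutely :⇐⇒ $\sum_{k=1}^{\infty} \|x_k\|$ converges (in $\mathbb{R}$).

(ii) Let $\sigma : \mathbb{N} \to \mathbb{N}$ be a bijection. Then the series

\[ \sum_{k=1}^{\infty} x_{\sigma(k)} \]

is called a rearrangement of $\sum_{k=1}^{\infty} x_k$.

Remark. $\sum_{k=1}^{\infty} x_k$ is absolutely convergent =⇒ $\sum_{k=1}^{\infty} x_k$ is convergent (just use the triangle inequality $\big\| \sum_{k=n}^{m} x_k \big\| \le \sum_{k=n}^{m} \|x_k\|$ and the Cauchy-criterion).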

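2.9.4. Theorem. Let $\sum_{k=1}^{\infty} x_k$ be absolutely convergent in $V$. Then $\sum_{k=1}^{\infty} x_{\sigma(k)}$ converges absolutely and

\[ \sum_{k=1}^{\infty} x_{\sigma(k)} = \sum_{k=1}^{\infty} x_k \quad \text{as well as} \quad \sum_{k=1}^{\infty} \|x_{\sigma(k)}\| = \sum_{k=1}^{\infty} \|x_k\| \]

for any rearrangement $\sum_{k=1}^{\infty} x_{\sigma(k)}$ of $\sum_{k=1}^{\infty} x_k$.

Proof. Let ε > 0 be arbitrary. By the Cauchy-criterion for $\sum_{k=1}^{\infty} \|x_k\|$: ∃N ∈ N with

\[ \sum_{k=n}^{m} \|x_k\| < \varepsilon, \quad \forall m \ge n \ge N + 1. \qquad (**) \]

Now, $M := \max(\sigma^{-1}(1), \dots, \sigma^{-1}(N))$ is the smallest integer $M \ge N$ with $\{1, \dots, N\} \subset \{\sigma(1), \sigma(2), \dots, \sigma(M)\}$, and so for all $l \ge M$

\[ \Big\| \sum_{k=1}^{l} x_{\sigma(k)} - \sum_{k=1}^{l} x_k \Big\| \le \sum_{k=N+1}^{\max(\sigma(1), \dots, \sigma(l))} 2\|x_k\| < 2\varepsilon \quad \text{by } (**) \]

and

\[ \Big| \sum_{k=1}^{l} \|x_{\sigma(k)}\| - \sum_{k=1}^{l} \|x_k\| \Big| \le \sum_{k=N+1}^{\max(\sigma(1), \dots, \sigma(l))} 2\|x_k\| < 2\varepsilon \quad \text{by } (**). \]

□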

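The most important tests for the convergence of real series are:

2.9.5. Theorem. (i) Geometric series: For $|r| < 1$, we have

\[ \sum_{n=0}^{\infty} r^n = \frac{1}{1 - r}. \]

If $|r| \ge 1$, then $\sum_{n=0}^{\infty} r^n$ diverges.

(ii) Comparison test: If $\sum_{k=1}^{\infty} a_k$ converges and $0 \le b_k \le a_k$, ∀k ∈ N, then $\sum_{k=1}^{\infty} b_k$ converges. If $\sum_{k=1}^{\infty} c_k$ diverges and $0 \le c_k \le d_k$, ∀k ∈ N, then $\sum_{k=1}^{\infty} d_k$ diverges.

(iii) p-series: $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges, if $p > 1$ and diverges (to +∞), if $p \le 1$.

(iv) Ratio test: If $\big| \frac{a_{n+1}}{a_n} \big| \le r < 1$, ∀n ≥ n_0 ∈ N, then $\sum_{n=1}^{\infty} a_n$ converges absolutely. If $\big| \frac{a_{n+1}}{a_n} \big| \ge 1$, ∀n ≥ n_0 ∈ N, then $\sum_{n=1}^{\infty} a_n$ diverges. If $\lim_{n\to\infty} \big| \frac{a_{n+1}}{a_n} \big| = 1$, nothing can be deduced (i.e. the test is inconclusive).

(v) Root test: If $|a_n|^{1/n} \le r < 1$, ∀n ≥ n_0 ∈ N, then $\sum_{n=1}^{\infty} a_n$ converges absolutely. If $|a_n|^{1/n} \ge 1$, ∀n ≥ n_0 ∈ N, then $\sum_{n=1}^{\infty} a_n$ diverges. If $\lim_{n\to\infty} |a_n|^{1/n} = 1$, nothing can be deduced (i.e. the test is inconclusive).

(vi) Integral test: If $f$ is continuous, $f \ge 0$, and decreasing on $[1, \infty)$, then $\sum_{n=1}^{\infty} f(n)$ and $\int_1^{\infty} f(t)\,dt$ converge or diverge together.

(vii) Ratio comparison test: Let $\sum_{i=1}^{\infty} a_i$, $\sum_{i=1}^{\infty} b_i$ be series and $b_i > 0$, ∀i ∈ N. Then:

\[ \exists \lim_{i\to\infty} \frac{|a_i|}{b_i} < \infty \text{ and } \sum_{i=1}^{\infty} b_i \text{ converges} \implies \sum_{i=1}^{\infty} a_i \text{ converges absolutely}, \]

\[ \exists \lim_{i\to\infty} \frac{a_i}{b_i} > 0 \text{ and } \sum_{i=1}^{\infty} b_i \text{ diverges} \implies \sum_{i=1}^{\infty} a_i \text{ diverges}. \]

(viii) Alternating series: If the $a_i$ alternate in sign, $|a_i| \searrow 0$, then $\sum_{i=1}^{\infty} a_i$ converges and

\[ \Big| \sum_{i=1}^{\infty} a_i - \sum_{i=1}^{n} a_i \Big| \le |a_{n+1}|, \quad \forall n \in \mathbb{N}. \]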
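The error estimate in (viii) can be checked numerically for the alternating harmonic series. The sketch below is an illustration only: it assumes the known value ln 2 for the sum of the series (a fact not derived in these notes), and the function names are hypothetical. For each n it compares the actual error |s − s_n| with the bound |a_{n+1}| = 1/(n + 1).

```python
import math


def alternating_harmonic_partial_sum(n):
    """s_n = sum_{k=1}^{n} (-1)^(k+1) / k."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))


# Assumption for this illustration: the series sums to ln 2.
S = math.log(2)


def error_and_bound(n):
    """Return (|s - s_n|, |a_{n+1}|) for the alternating harmonic series."""
    err = abs(S - alternating_harmonic_partial_sum(n))
    return err, 1 / (n + 1)


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 100):
        print(n, error_and_bound(n))
```

In each printed pair the first number should not exceed the second, as the theorem predicts.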


Proof. (i) For $r = 1$ the series clearly diverges. If $r \ne 1$, then

\[ s_n = \sum_{k=0}^{n} r^k = 1 + r + \cdots + r^n = \frac{(1 + r + \cdots + r^n)(1 - r)}{1 - r} = \frac{1 - r^{n+1}}{1 - r}, \]

and the result follows letting $n \to \infty$.

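(ii) The partial sums $\sum_{k=1}^{n} b_k$, $n \in \mathbb{N}$, form an increasing sequence that is bounded above by $\sum_{k=1}^{\infty} a_k \in \mathbb{R}$, so $\sum_{k=1}^{\infty} b_k$ converges by the MSP. Finally,

\[ 0 \le \sum_{k=1}^{n} c_k \le \sum_{k=1}^{n} d_k =: s_n. \]

The partial sums $\sum_{k=1}^{n} c_k$ are increasing, hence unbounded in $n$ (otherwise they would converge). Thus $(s_n)$ cannot converge in $\mathbb{R}$, since $(s_n)$ is increasing and unbounded in $n$.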

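(iii) If $p \le 1$, then $\frac{1}{n^p} \ge \frac{1}{n} \ge 0$, ∀n ∈ N. Since the harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges by 1.2.18, $\sum_{n=1}^{\infty} \frac{1}{n^p}$ also diverges by (ii). Now, let $p > 1$. Then the partial sums

\[ s_k = \frac{1}{1^p} + \frac{1}{2^p} + \frac{1}{3^p} + \cdots + \frac{1}{k^p} \nearrow \ (\text{in } k), \quad s_k \ge 0, \]

and

\[ s_{2^k - 1} = \frac{1}{1^p} + \Big( \frac{1}{2^p} + \frac{1}{3^p} \Big) + \Big( \frac{1}{4^p} + \frac{1}{5^p} + \frac{1}{6^p} + \frac{1}{7^p} \Big) + \cdots + \Big( \frac{1}{(2^{k-1})^p} + \cdots + \frac{1}{(2^k - 1)^p} \Big) \]

\[ \le \frac{1}{1^p} + \frac{2}{2^p} + \frac{4}{4^p} + \cdots + \frac{2^{k-1}}{(2^{k-1})^p} = 1 + \Big( \frac{1}{2} \Big)^{p-1} + \Big( \frac{1}{4} \Big)^{p-1} + \cdots + \Big( \frac{1}{2^{k-1}} \Big)^{p-1} \]

\[ = \sum_{n=0}^{k-1} \Big( \frac{1}{2^n} \Big)^{p-1} = \sum_{n=0}^{k-1} \Big( \frac{1}{2^{p-1}} \Big)^{n} = \frac{1 - \big( \frac{1}{2^{p-1}} \big)^k}{1 - \frac{1}{2^{p-1}}} < \frac{1}{1 - \frac{1}{2^{p-1}}}. \]

Thus the increasing sequence $(s_k)$ is bounded above and so $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges.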

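(iv) By assumption, we can find $n_0 \in \mathbb{N}$ with

\[ \Big| \frac{a_{n+1}}{a_n} \Big| \le r < 1, \quad \forall n \ge n_0. \]

Thus

\[ |a_{n_0 + k}| \le r\, |a_{n_0 + k - 1}| \le \dots \le r^k\, |a_{n_0}|, \quad \forall k \ge 1, \]

and so, by (i) and (ii), $\sum_{n \ge n_0} |a_n|$ is convergent, hence $\sum_{n=1}^{\infty} |a_n|$ is convergent.

If $|a_{n+1}| \ge |a_n|$ (with $|a_n| \ne 0$), ∀n ≥ n_0, then

\[ |a_n| \ge |a_{n_0}| > 0, \quad \forall n \ge n_0. \]

Thus $\lim_{n\to\infty} a_n = 0$ cannot hold and so $\sum_{n=1}^{\infty} a_n$ diverges by 2.9.2.

If $\lim_{n\to\infty} \big| \frac{a_{n+1}}{a_n} \big| = 1$, then consider $\sum_{n=1}^{\infty} \frac{1}{n}$ and $\sum_{n=1}^{\infty} \frac{1}{n^2}$. Both series satisfy this criterion but one series diverges and the other converges. Thus the test cannot be conclusive.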


(v) If $|a_n|^{1/n} \le r < 1$, ∀n ≥ n_0, then $|a_n| \le r^n$, ∀n ≥ n_0. Hence after comparison with the geometric series $\sum_{n=1}^{\infty} a_n$ converges absolutely. If $|a_n| \ge 1^n = 1$, ∀n ≥ n_0 ∈ N, then again $\lim a_n = 0$ cannot hold and so $\sum_{n=1}^{\infty} a_n$ diverges.

In the case $\lim_{n\to\infty} |a_n|^{1/n} = 1$, consider $a_n = \frac{1}{n}$ and $a_n = \frac{1}{n^2}$, ∀n ∈ N. Then both corresponding series satisfy the criterion but one diverges and the other converges. Hence the test cannot be conclusive.

(vi) We have not introduced the integral nor continuity (for the proof see later).

(vii) Clear, e.g. if $\frac{|a_i|}{b_i} \to M$, then $\frac{|a_i|}{b_i} < M + 1$, ∀i ≥ n_0. Hence $|a_i| < (M + 1) b_i$, ∀i ≥ n_0. Then use (ii), etc.

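(viii) W.l.o.g. $a_1 > 0$. Otherwise consider $\tilde{a}_i := -a_i$, $i \in \mathbb{N}$, and apply the same reasoning as below. For convenience of notation define $b_i := (-1)^{i+1} a_i$, $i \in \mathbb{N}$. Then $b_i \ge 0$ and $\sum a_i = b_1 - b_2 + b_3 - b_4 + \dots$. Moreover, we have $b_i = |a_i| \searrow 0$.

Now, since each bracket below is $\ge 0$,

\[ s_{2n} = (b_1 - b_2) + \dots + (b_{2n-1} - b_{2n}) \nearrow \]

\[ s_{2n+1} = b_1 - (b_2 - b_3) - \dots - (b_{2n} - b_{2n+1}) \searrow \]

\[ s_2 \le s_4 \le \dots \le s_{2n} \le s_{2n+1} \le s_{2n-1} \le \dots \le s_1. \]

Hence

\[ (s_{2n}) \nearrow,\ s_{2n} \le s_1 \implies s_{2n} \nearrow \lim_{n\to\infty} s_{2n} =: s_{\mathrm{even}}, \]

\[ (s_{2n+1}) \searrow,\ s_{2n+1} \ge s_2 \implies s_{2n+1} \searrow \lim_{n\to\infty} s_{2n+1} =: s_{\mathrm{odd}}. \]

But $s_{2n+1} - s_{2n} = a_{2n+1} \to 0$, thus $s_{\mathrm{even}} = s_{\mathrm{odd}} =: s$.

Note that $s = \sum_{i=1}^{\infty} a_i$, since ∀ε > 0, ∃N ∈ N, with

\[ |s - s_{2n}| + |s - s_{2n+1}| < \varepsilon, \quad \forall n \ge N, \]

hence

\[ |s - s_n| < \varepsilon, \quad \forall n \ge 2N. \]

Finally, using $s_{2n} \nearrow s$ and $s_{2n+1} \searrow s$,

\[ |s - s_{2n}| = s - s_{2n} \le s_{2n+1} - s_{2n} = a_{2n+1}, \]

\[ |s - s_{2n+1}| = s_{2n+1} - s \le s_{2n+1} - s_{2n+2} = -a_{2n+2}. \]

□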
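The two closed-form expressions used in parts (i) and (iii) of the proof can be verified with exact rational arithmetic. This is only a sketch: the function names are hypothetical, r is assumed to be a rational number different from 1, and p is assumed to be an integer with p ≥ 2 so that all quantities stay rational.

```python
from fractions import Fraction


def geometric_partial_sum(r, n):
    """s_n = 1 + r + ... + r^n, summed term by term."""
    return sum(r ** k for k in range(n + 1))


def geometric_closed_form(r, n):
    """(1 - r^(n+1)) / (1 - r), valid for r != 1."""
    return (1 - r ** (n + 1)) / (1 - r)


def p_series_partial_sum(p, m):
    """s_m = sum_{n=1}^{m} 1 / n^p for an integer p (assumption: p >= 2)."""
    return sum(Fraction(1, n ** p) for n in range(1, m + 1))


def p_series_bound(p):
    """Upper bound 1 / (1 - 1/2^(p-1)) from the proof of (iii)."""
    return 1 / (1 - Fraction(1, 2 ** (p - 1)))


if __name__ == "__main__":
    r = Fraction(-2, 3)
    print(geometric_partial_sum(r, 5) == geometric_closed_form(r, 5))
    for k in range(1, 9):
        print(k, float(p_series_partial_sum(2, 2 ** k - 1)), p_series_bound(2))
```

For p = 2 the bound equals 2, and the partial sums s_{2^k − 1} stay below it for every k, as the estimate in (iii) asserts.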

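2.9.6. Example. Consider the alternating series

\[ \sum_{k \ge 1} (-1)^{k+1} \frac{1}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \dots \]

By Theorem 2.9.5(viii), we know that

\[ s_n := \sum_{k=1}^{n} (-1)^{k+1} \frac{1}{k}, \quad n \ge 1, \]

converges, say $s := \lim_{n\to\infty} s_n$.

And by 1.2.18, we know that $\sum_{k \ge 1} (-1)^{k+1} \frac{1}{k}$ does not converge absolutely. Define

\[ h_n := \sum_{k=1}^{n} \frac{1}{k}, \quad n \ge 1. \]

Then

\[ s_{2n} = \sum_{k=1}^{2n} \frac{1}{k} - 2 \sum_{k=1}^{n} \frac{1}{2k} = h_{2n} - h_n, \quad n \ge 1. \]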

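Consider the following rearrangement of $\sum_{k \ge 1} (-1)^{k+1} \frac{1}{k}$ (two positive terms followed by one negative term):

\[ \Big( 1 + \frac{1}{3} - \frac{1}{2} \Big) + \Big( \frac{1}{5} + \frac{1}{7} - \frac{1}{4} \Big) + \Big( \frac{1}{9} + \frac{1}{11} - \frac{1}{6} \Big) + \Big( \frac{1}{13} + \frac{1}{15} - \frac{1}{8} \Big) + \dots \]

and denote the corresponding sequence of partial sums by $(s'_n)_{n \ge 1}$. Then $(s'_n)_{n \ge 1}$ converges (proof is left to the reader) and

\[ s'_{3n} = \sum_{k=1}^{2n} \frac{1}{2k - 1} - \sum_{k=1}^{n} \frac{1}{2k} = h_{4n} - \sum_{k=1}^{2n} \frac{1}{2k} - \frac{1}{2} \sum_{k=1}^{n} \frac{1}{k} \]

\[ = h_{4n} - \frac{1}{2} h_{2n} - \frac{1}{2} h_n = (h_{4n} - h_{2n}) + \frac{1}{2} (h_{2n} - h_n) = s_{4n} + \frac{1}{2} s_{2n} \xrightarrow[n \to \infty]{} \frac{3}{2} s. \]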

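Consider another rearrangement of $\sum_{k \ge 1} (-1)^{k+1} \frac{1}{k}$ (a positive term followed by two negative terms):

\[ \Big( 1 - \frac{1}{2} - \frac{1}{4} \Big) + \Big( \frac{1}{3} - \frac{1}{6} - \frac{1}{8} \Big) + \Big( \frac{1}{5} - \frac{1}{10} - \frac{1}{12} \Big) + \Big( \frac{1}{7} - \frac{1}{14} - \frac{1}{16} \Big) + \dots \]

and denote the corresponding sequence of partial sums by $(s''_n)_{n \ge 1}$. Then $(s''_n)_{n \ge 1}$ converges (proof is left to the reader) and

\[ s''_{3n} = \sum_{k=1}^{n} \frac{1}{2k - 1} - \sum_{k=1}^{2n} \frac{1}{2k} = h_{2n} - \sum_{k=1}^{n} \frac{1}{2k} - \frac{1}{2} \sum_{k=1}^{2n} \frac{1}{k} \]

\[ = h_{2n} - \frac{1}{2} h_n - \frac{1}{2} h_{2n} = \frac{1}{2} (h_{2n} - h_n) = \frac{1}{2} s_{2n} \xrightarrow[n \to \infty]{} \frac{1}{2} s. \]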
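The two identities for the rearranged partial sums, s′_{3n} = s_{4n} + ½ s_{2n} and s″_{3n} = ½ s_{2n}, can be confirmed with exact fractions by actually generating the rearranged terms. The following is a sketch for illustration; the generator `rearranged_terms` and the other names are hypothetical and not part of the notes.

```python
from fractions import Fraction
from itertools import islice


def s(n):
    """Partial sum s_n of the alternating harmonic series."""
    return sum(Fraction((-1) ** (k + 1), k) for k in range(1, n + 1))


def rearranged_terms(pos_per_block, neg_per_block):
    """Yield the terms 1, 1/3, 1/5, ... and -1/2, -1/4, ... in blocks of
    pos_per_block positive terms followed by neg_per_block negative terms."""
    odd, even = 1, 2
    while True:
        for _ in range(pos_per_block):
            yield Fraction(1, odd)
            odd += 2
        for _ in range(neg_per_block):
            yield Fraction(-1, even)
            even += 2


def rearranged_partial_sum(pos_per_block, neg_per_block, m):
    """Sum of the first m terms of the rearranged series."""
    return sum(islice(rearranged_terms(pos_per_block, neg_per_block), m))


if __name__ == "__main__":
    for n in (1, 2, 5, 50):
        lhs1 = rearranged_partial_sum(2, 1, 3 * n)
        lhs2 = rearranged_partial_sum(1, 2, 3 * n)
        print(n, lhs1 == s(4 * n) + s(2 * n) / 2, lhs2 == s(2 * n) / 2)
```

Because the arithmetic is exact, the equalities hold for every n and not only approximately; the limits 3s/2 and s/2 then follow from s_n → s.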


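One can also find a rearrangement of $\sum_{k \ge 1} (-1)^{k+1} \frac{1}{k}$ that is unbounded above (likewise unbounded below), hence diverges:

\[ 1 - \frac{1}{2} + \Big( \frac{1}{3} \Big) - \frac{1}{4} + \Big( \frac{1}{5} + \frac{1}{7} \Big) - \frac{1}{6} + \Big( \frac{1}{9} + \frac{1}{11} + \frac{1}{13} + \frac{1}{15} \Big) - \frac{1}{8} + \dots \]

\[ + \Big( \frac{1}{2^n + 1} + \frac{1}{2^n + 3} + \dots + \frac{1}{2^n + (2^n - 1)} \Big) - \frac{1}{2n + 2} + \dots, \]

where each bracket satisfies

\[ \frac{1}{2^n + 1} + \frac{1}{2^n + 3} + \dots + \frac{1}{2^n + (2^n - 1)} > \frac{2^n}{2} \cdot \frac{1}{2^{n+1}} = \frac{1}{4}. \]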

A corresponding general statement can be found in the following theorem:

2.9.7. Theorem (Riemann). Let $\sum_{n \ge 1} x_n$ be a series of real numbers that converges, but not absolutely. Then for any extended real number $\alpha \in \mathbb{R} \cup \{-\infty, +\infty\}$, there exists a rearrangement $\sum_{n \ge 1} x_{\sigma(n)}$ of $\sum_{n \ge 1} x_n$ with $\sum_{n \ge 1} x_{\sigma(n)} = \alpha$.

Proof. See [2, 3.54 Theorem] where an even more general statement is proven. □

2.10. Section 2 Additional Examples

Section 2.1 examples:

1) Let A ⊂ Rn be open and B ⊂ Rn. Let

A+B := {x+ y ∈ Rn | x ∈ A and y ∈ B}.

Prove that A+B is open.

Proof. Let w ∈ A + B. Then ∃x ∈ A, y ∈ B with w = x + y. Since A is open: ∃ε > 0 with B(x, ε) ⊂ A. We want to show B(w, ε) ⊂ A + B.

Suppose z ∈ B(w, ε). Then

‖z − w‖ = ‖z − (x + y)‖ = ‖(z − y) − x‖ < ε,

hence z − y ∈ B(x, ε) ⊂ A and so z = (z − y) + y ∈ A + B. Consequently B(w, ε) ⊂ A + B. □


Section 2.2 examples:

1) Is it true that int(A ∪ B) = int(A) ∪ int(B)?

No. Counterexample: A = [0, 1], B = [1, 2]. Then int(A ∪ B) = (0, 2) but int(A) ∪ int(B) = (0, 1) ∪ (1, 2).

2) Is it true in general that in a metric space (M, d)

B(x, ε) = int({y ∈ M | d(x, y) ≤ ε})?

No, counterexample: Let d be the discrete metric on M and suppose that M contains more than one element. Let x ∈ M and ε = 1. Then {y ∈ M | d(x, y) ≤ 1} = M and so

{x} = B(x, 1) ≠ M = int({y ∈ M | d(x, y) ≤ 1}).



Section 2.3 examples:

1) Let (M, d) be a metric space and A ⊂ M be a finite set. Show that

B := {x ∈ M | d(x, y) ≤ 1 for some y ∈ A}

is closed.

Proof. Since B = ⋃_{y∈A} {x ∈ M | d(x, y) ≤ 1}, it is enough to show that for given y ∈ A, D := {x ∈ M | d(x, y) ≤ 1} is closed (cf. 2.3.2(i)).

We will show that D^c is open. W.l.o.g. D^c ≠ ∅. Let x ∈ D^c. Then d(x, y) > 1. Thus ε := d(x, y) − 1 > 0. Let z ∈ B(x, ε). Then

d(x, z) + d(z, y) ≥ d(x, y).

Consequently, since d(x, z) < ε,

d(z, y) ≥ d(x, y) − d(x, z) > d(x, y) − ε = 1,

and z ∈ D^c. Thus B(x, ε) ⊂ D^c and D^c is open.

□


Section 2.4 examples:

1) Let (M, d) be a metric space. Is it true that

D(x, ε) := {y ∈ M | d(x, y) ≤ ε} ⊂ acc(B(x, ε))?

No, counterexample: Let d be the discrete metric and suppose that M contains more than one point. Then

D(x, 1) = M and B(x, 1) = {x}, hence acc(B(x, 1)) = ∅.


Section 2.5 examples:

1) Is it true that cl(A ∩ B) = cl(A) ∩ cl(B)?

No, counterexample: A = [0, 1], B = (1, 2]. Then

cl(A ∩ B) = cl(∅) = ∅ but cl(A) ∩ cl(B) = [0, 1] ∩ [1, 2] = {1}.

2) Let ∅ ≠ S ⊂ R and S be bounded above. Then sup(S) ∈ cl(S).

Proof. Let x := sup(S). We have to show x ∈ ad(S) = cl(S). By 1.3.2

∀ε > 0, ∃y ∈ S, with y > x − ε, i.e. |x − y| < ε.

Therefore,

∀ε > 0 : B(x, ε) ∩ S ≠ ∅

and x ∈ ad(S). □


Section 2.7 examples:

1) Let (x_n) be a convergent sequence in a normed vector space, such that ‖x_n‖ ≤ 1 for all n ≥ 1. Show that the limit x also satisfies ‖x‖ ≤ 1. If ‖x_n‖ < 1 ∀n ≥ 1, must we then have ‖x‖ < 1?

Proof. One first shows that D(0, 1) := {y ∈ V | ‖y‖ ≤ 1} is closed (see Section 2.3 examples 1)). Then by 2.7.5(i), x_n ∈ D(0, 1), n ≥ 1, and x_n → x implies x ∈ D(0, 1), hence ‖x‖ ≤ 1.

This is not true if “≤” is replaced by “<”. For example in R consider x_n = 1 − 1/n, n ≥ 1: then |x_n| < 1 for all n, but the limit is x = 1. □


Section 2.8 examples:

1) Let (M, d) be a complete metric space and N ⊂ M a closed subset. Then N is also complete with the induced metric.

Proof. Let (x_n) be a Cauchy sequence in (N, d_N). Then (x_n) is a Cauchy sequence in (M, d). Since M is complete, (x_n) converges to some x ∈ M. But since N is closed, we get x ∈ N by 2.7.5(i). □

2) Show that the set S of all cluster points of a sequence (x_n) in a metric space (M, d) is closed.

Proof. Let (y_n) be a sequence in S that converges to y ∈ M. We have to show y ∈ S, because then S is closed by 2.7.5(i). Since y_n → y,

∀ε > 0, ∃N ∈ N with d(y_n, y) < ε/2, ∀n ≥ N.

Since y_N is a cluster point, there are infinitely many k ∈ N with d(x_k, y_N) < ε/2. Thus

d(x_k, y) ≤ d(x_k, y_N) + d(y_N, y) < ε

for infinitely many k ∈ N and so y ∈ S. □


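Section 2.9 examples:

1) Let $x_n = \big( \frac{1}{n^2}, \frac{1}{n} \big)$, $n \ge 1$. Does $\sum_{n \ge 1} x_n$ converge?

No. By Proposition 2.7.4, $\big( \sum_{k=1}^{n} x_k \big)_{n \ge 1}$ converges, iff $\big( \sum_{k=1}^{n} \frac{1}{k^2} \big)_{n \ge 1}$ and $\big( \sum_{k=1}^{n} \frac{1}{k} \big)_{n \ge 1}$ converge. But $\big( \sum_{k=1}^{n} \frac{1}{k} \big)_{n \ge 1}$ does not converge by 1.2.18.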


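2) Let $(x_n)$ be a sequence in a complete normed vector space $(V, \|\cdot\|)$. Suppose $\|x_n\| \le \frac{1}{2^n}$, ∀n ≥ 1. Show that $\sum_{n \ge 1} x_n$ converges (cf. Examples Section 1.4) and that $\big\| \sum_{n \ge 1} x_n \big\| \le 1$.

Proof. We first verify the Cauchy criterion. For $m \ge n$, we have

\[ \Big\| \sum_{k=n}^{m} x_k \Big\| \le \sum_{k=n}^{m} \|x_k\| \le \sum_{k=n}^{m} \frac{1}{2^k} \le \sum_{k \ge n} \frac{1}{2^k} = \frac{1}{2^{n-1}}. \]

Thus given ε > 0, we choose $N$ so that $\frac{1}{2^{N-1}} < \varepsilon$; then

\[ \Big\| \sum_{k=n}^{m} x_k \Big\| \le \frac{1}{2^{n-1}} \le \frac{1}{2^{N-1}} < \varepsilon, \quad \forall m \ge n \ge N, \]

and so $\sum_{k \ge 1} x_k$ converges by 2.9.2. Moreover, for any $n \in \mathbb{N}$

\[ \|s_n\| = \Big\| \sum_{k=1}^{n} x_k \Big\| \le \sum_{k=1}^{n} \|x_k\| \le \sum_{k=1}^{\infty} \frac{1}{2^k} = 1. \]

Thus the limit $s$ of the partial sums also satisfies $\|s\| \le 1$ by Section 2.7 examples 1). □

Remark: one can also show that $\sum_{k \ge 1} \|x_k\|$ as in 2) converges by comparison with the geometric series.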
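The tail estimate used in example 2) can be made concrete with exact fractions. The sketch below is only an illustration with hypothetical function names: `geometric_block` evaluates the finite block of the majorant series, and `smallest_N` finds the smallest N with 1/2^(N−1) < ε, which is one admissible choice of N in the Cauchy criterion.

```python
from fractions import Fraction


def geometric_block(n, m):
    """Exact value of sum_{k=n}^{m} 1/2^k for integers 1 <= n <= m."""
    return sum(Fraction(1, 2 ** k) for k in range(n, m + 1))


def smallest_N(eps):
    """Smallest N >= 1 with 1/2^(N-1) < eps, for eps > 0."""
    N = 1
    while Fraction(1, 2 ** (N - 1)) >= eps:
        N += 1
    return N


if __name__ == "__main__":
    # Each finite block equals 1/2^(n-1) - 1/2^m, hence stays below 1/2^(n-1).
    print(geometric_block(3, 10), Fraction(1, 2 ** 2) - Fraction(1, 2 ** 10))
    print(smallest_N(Fraction(1, 1000)))
```

For ε = 1/1000 the search returns N = 11, since 1/2^10 < 1/1000 ≤ 1/2^9.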

CHAPTER 3

Compact and connected sets

3.1. Compactness

3.1.1. Definition. Let M be a metric space, A ⊂ M. Then

A is sequentially compact :⇐⇒ every sequence in A has a convergent subsequence with limit in A.

Remark. By 2.8.7(ii) this is equivalent to saying that every sequence in A has a cluster point in A.

Let I be an arbitrary index set. An open cover of A is a family of open sets {U_i}_{i∈I} in M with

A ⊂ ⋃_{i∈I} U_i.

3.1.2. Definition. Let M be a metric space, A ⊂ M. Then

A is compact :⇐⇒ for any open cover {U_i}_{i∈I} of A there are finitely many i_1, . . . , i_k ∈ I with U_{i_1} ∪ · · · ∪ U_{i_k} ⊃ A.

In words: “Every open cover of A has a finite subcover”.

3.1.3. Definition. A ⊂ M is called totally bounded (or precompact), if

∀ε > 0, ∃n_ε ∈ N, and x_1, . . . , x_{n_ε} ∈ A, with A ⊂ ⋃_{i=1}^{n_ε} B(x_i, ε).

3.1.4. Remark. (i) A is totally bounded =⇒ A is bounded

(because A ⊂ B(x_1, max_{1≤i≤n_ε} d(x_1, x_i) + ε) for any fixed ε > 0).

(ii) In Definition 3.1.3 it does not matter whether x_1, . . . , x_{n_ε} ∈ M or x_1, . . . , x_{n_ε} ∈ A. Indeed, suppose ∀ε > 0, ∃x_1, . . . , x_{n_ε} ∈ M with

A ⊂ ⋃_{i=1}^{n_ε} B(x_i, ε).

Choose y_i ∈ B(x_i, ε) ∩ A (w.l.o.g. B(x_i, ε) ∩ A ≠ ∅), i = 1, . . . , n_ε. Then

A ⊂ ⋃_{i=1}^{n_ε} B(x_i, ε) ⊂ ⋃_{i=1}^{n_ε} B(y_i, 2ε).

In particular: subsets of precompact sets are again precompact.

3.1.5. Theorem. Let (M, d) be a metric space, and A ⊂ M. Then the following are equivalent:

(i) A is compact,

(ii) A is sequentially compact,

(iii) A is complete and totally bounded.


Proof. (i) ⇒ (ii): Let (x_k)_{k∈N} be a sequence in A that has no cluster point in A. Then, since no y ∈ A is a cluster point of (x_k)_{k∈N},

∀y ∈ A : ∃r_y > 0 such that N_y := {k ∈ N | x_k ∈ B(y, r_y) ∩ A} is finite.

{B(y, r_y)}_{y∈A} is an open cover of A. Thus by (i), ∃ y_1, . . . , y_n ∈ A with

A ⊂ ⋃_{i=1}^{n} B(y_i, r_{y_i}),

and so

⋃_{i=1}^{n} N_{y_i} = {k ∈ N | x_k ∈ ⋃_{i=1}^{n} B(y_i, r_{y_i}) ∩ A} = N,

which is impossible, since a finite union of finite sets is finite.

(ii) =⇒ (iii): Let (x_k)_{k∈N} be a Cauchy sequence in A. By (ii), (x_k)_{k∈N} has a convergent subsequence in A, say x_{k_l} → x ∈ A. Then

d(x_k, x) ≤ d(x_k, x_{k_l}) + d(x_{k_l}, x),

where d(x_k, x_{k_l}) < ε/2 for k, l big and d(x_{k_l}, x) < ε/2 for l big; thus x_k → x and so A is complete.

Suppose ∃ε > 0 such that A cannot be covered by finitely many ε-balls. Then one can find inductively x_k ∈ A with

x_{k+1} ∈ A \ ⋃_{i=1}^{k} B(x_i, ε).

But then d(x_k, x_l) ≥ ε for all k ≠ l, and so (x_k)_{k∈N} cannot have a convergent subsequence, which contradicts (ii). Hence A is totally bounded.

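(iii) =⇒ (i): Let $\{U_i\}_{i \in I}$ be an open cover of $A$. Define

\[ \mathcal{B} := \{ B \subset A \mid J \subset I,\ B \subset \textstyle\bigcup_{j \in J} U_j \implies J \text{ infinite} \}. \]

$\mathcal{B}$ consists of the subsets $B \subset A$ for which there is no finite subcover.

We want to show: $A \notin \mathcal{B}$.

Since $A$ is totally bounded, we have

\[ B \in \mathcal{B},\ \varepsilon > 0 \implies \exists x_1, \dots, x_{n_\varepsilon} \in A \text{ with } B \subset A \subset \bigcup_{i=1}^{n_\varepsilon} B(x_i, \varepsilon) \]

\[ \implies B(x_{i_\varepsilon}, \varepsilon) \cap B \in \mathcal{B} \text{ for at least one } i_\varepsilon \in \{1, \dots, n_\varepsilon\}. \qquad (*) \]

Suppose: $A \in \mathcal{B}$. It then follows inductively from $(*)$ with $\varepsilon = \frac{1}{k}$, $k \in \mathbb{N}$, that there is a sequence $(x_k)_{k \in \mathbb{N}}$ in $A$ with

\[ B_m := \bigcap_{k=1}^{m} B\big(x_k, \tfrac{1}{k}\big) \cap A \in \mathcal{B}, \quad \forall m \in \mathbb{N}. \qquad (B_m \searrow) \]

For each $m \in \mathbb{N}$ choose $y_m \in B_m$. Then for $m \le l$ it holds: $y_m, y_l \in B_m \subset B(x_m, \frac{1}{m})$, thus

\[ d(y_m, y_l) < \frac{2}{m}, \]

i.e. $(y_m)_{m \in \mathbb{N}}$ is a Cauchy sequence in $A$. By assumption ∃$y \in A$ with

\[ \varepsilon_m := d(y_m, y) \xrightarrow[m \to \infty]{} 0. \]

Since $y \in U_{i_0}$ for some $i_0 \in I$, we get, using $y_m \in B(x_m, \frac{1}{m})$ for the second inclusion,

\[ B_m \subset B\big(x_m, \tfrac{1}{m}\big) \subset B\big(y_m, \tfrac{2}{m}\big) \subset B\big(y, \tfrac{2}{m} + \varepsilon_m\big) \subset U_{i_0} \quad \text{for } m \text{ large}, \]

i.e. for $m$ large, we have $B_m \notin \mathcal{B}$, which is a contradiction. □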


3.2. The Heine-Borel Theorem

3.2.1. Theorem (Heine-Borel). Let A ⊂ R^n. Then

A is compact ⇐⇒ A is closed and bounded.

Proof. “=⇒”: A compact ⇒ A totally bounded and complete (by 3.1.5(iii))

=⇒ A bounded by Remark 3.1.4(i) and A closed by 2.7.5(i), 2.8.4, 2.8.1.

“⇐=”: (using Bolzano-Weierstrass) Let A ⊂ R^n be closed and bounded. We will show that A is sequentially compact. For this let x_k = (x_k^1, . . . , x_k^n), k ∈ N, be a sequence in A. Since A is bounded, each coordinate sequence is bounded. Hence by 1.4.8

(x^1_k)_{k∈N} has a convergent subsequence (x^1_{f_1(k)})_{k∈N} in R,

(x^2_{f_1(k)})_{k∈N} has a convergent subsequence (x^2_{f_2(k)})_{k∈N} in R,

. . .

(x^n_{f_{n−1}(k)})_{k∈N} has a convergent subsequence (x^n_{f_n(k)})_{k∈N} in R,

and so x_{f_n(k)} = (x^1_{f_n(k)}, . . . , x^n_{f_n(k)}), k ∈ N, converges in R^n by 2.7.4. Since x_{f_n(k)} ∈ A, k ∈ N, and A is closed, the limit is again in A by 2.7.5(i). Thus A is sequentially compact. □

Remark. (i) 3.2.1 does not hold in general metric spaces. E.g. let M be an infinite set with the discrete metric. Then M is closed and bounded, but not compact, since {B(x, 1/2)}_{x∈M} is an open cover of M with no finite subcover. Indeed, B(x, 1/2) = {x} ∀x ∈ M.

(ii) 3.2.1 holds in any normed vector space where the closed unit ball {x | ‖x‖ ≤ 1} is compact, or equivalently in any normed vector space of finite dimension.

3.3. The nested set property

3.3.1. Theorem. (i) Let {K_α}_{α∈I} be a family of compact subsets of a metric space. Suppose that the intersection of every finite subfamily is nonempty. Then

⋂_{α∈I} K_α ≠ ∅.

(ii) Nested set property: Let (F_k)_{k∈N} be a (countable) family of compact nonempty sets in a metric space, such that F_{k+1} ⊂ F_k, ∀k ∈ N. Then

⋂_{k∈N} F_k ≠ ∅.

Proof. (i) Let i ∈ I. Suppose ⋂_{α∈I} K_α = K_i ∩ (⋂_{α≠i} K_α) = ∅. Then K_i ⊂ ⋃_{α≠i} K_α^c and so {K_α^c}_{α≠i} is an open cover of K_i. Since K_i is compact, there are finitely many indices α_1, . . . , α_n ∈ I, with

K_i ⊂ K_{α_1}^c ∪ . . . ∪ K_{α_n}^c.

Thus K_i ∩ K_{α_1} ∩ . . . ∩ K_{α_n} = K_i ∩ (K_{α_1}^c ∪ . . . ∪ K_{α_n}^c)^c = ∅. But this contradicts the assumption that the intersection of any finite subfamily is nonempty.

(ii) (F_k)_{k∈N} satisfies the assumptions of (i). □


3.3.2. Remark. 3.3.1 serves to show that certain inductively defined fractal-like sets, e.g. the Cantor set, Sierpinski's gasket and carpet, are non-empty. For instance Sierpinski's carpet (in short: SC) in R^2:

SC = ⋂_{n∈N} F_n ≠ ∅ (by 3.3.1(ii)).

Moreover SC is compact by Heine-Borel and for the area A(F_n) of F_n, it holds:

A(F_1) = 1, A(F_2) = (8/9) A(F_1), A(F_3) = (8/9) A(F_2) = (8/9)^2 A(F_1), . . .

=⇒ lim_{n→∞} A(F_{n+1}) = lim_{n→∞} (8/9)^n A(F_1) = 0.
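The area recursion in 3.3.2 is easy to tabulate. The following sketch assumes exactly what the remark states, namely A(F_1) = 1 and A(F_{n+1}) = (8/9) A(F_n); the function names are hypothetical. It also searches for the first stage whose area drops below a given ε, which makes the limit statement concrete.

```python
from fractions import Fraction


def carpet_area(n):
    """Area of F_n, assuming A(F_1) = 1 and A(F_{k+1}) = (8/9) A(F_k)."""
    return Fraction(8, 9) ** (n - 1)


def first_index_below(eps):
    """Smallest n >= 1 with A(F_n) < eps, for eps > 0."""
    n = 1
    while carpet_area(n) >= eps:
        n += 1
    return n


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 50):
        print(n, float(carpet_area(n)))
    print(first_index_below(Fraction(1, 100)))
```

Since (8/9)^n tends to 0, the search terminates for every ε > 0, although the number of stages grows only logarithmically in 1/ε.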

3.4. Path-connected sets

3.4.1. Definition. Let M be a metric space and [a, b] ⊂ R, a < b. Then

ϕ : [a, b] → M is continuous :⇐⇒ for every sequence (t_k)_{k∈N} in [a, b] with t_k → t we have ϕ(t_k) → ϕ(t).

A path from x ∈ M to y ∈ M is a continuous map ϕ : [a, b] → M with ϕ(a) = x and ϕ(b) = y.

A path is said to lie in a set A ⊂ M, if ϕ(t) ∈ A for all t ∈ [a, b]. A ⊂ M is called path-connected, if for any two points x, y ∈ A, there is a path from x to y that lies in A.

Remark. Let x, y, z ∈ M and

ϕ : [a, b] → M be a path from x to y,

ψ : [c, d] → M be a path from y to z.

Then γ : [0, 1] → M defined by

\[ \gamma(t) := \begin{cases} \varphi\big((1 - 2t) \cdot a + 2t \cdot b\big), & t \in [0, 1/2], \\ \psi\big((2 - 2t) \cdot c + (2t - 1) \cdot d\big), & t \in [1/2, 1], \end{cases} \]

is a path from x to z.
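The concatenation formula of the remark can be written down directly as a function. This is a minimal sketch: `concatenate` and the two sample paths in R^2 are hypothetical names and choices for illustration, and continuity of the inputs is assumed rather than checked.

```python
def concatenate(phi, a, b, psi, c, d):
    """Return gamma: [0, 1] -> M built from phi on [a, b] and psi on [c, d].

    Assumes phi(b) == psi(c), so that gamma is well defined at t = 1/2.
    """
    def gamma(t):
        if not 0 <= t <= 1:
            raise ValueError("t must lie in [0, 1]")
        if t <= 0.5:
            return phi((1 - 2 * t) * a + 2 * t * b)
        return psi((2 - 2 * t) * c + (2 * t - 1) * d)
    return gamma


def phi(t):
    """Hypothetical path on [0, 2] from (0, 0) to (2, 0)."""
    return (float(t), 0.0)


def psi(t):
    """Hypothetical path on [1, 3] from (2, 0) to (2, 2)."""
    return (2.0, float(t) - 1.0)


gamma = concatenate(phi, 0, 2, psi, 1, 3)

if __name__ == "__main__":
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(t, gamma(t))
```

At t = 1/2 both branches of the formula give ϕ(b) = ψ(c) = y, which is why γ is continuous at the junction.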


3.5. Connected sets

3.5.1. Definition. Let M be a metric space and A ⊂ M. A is said to be disconnected, if there exist open sets U, V in M with:

(i) (A ∩ U) ∩ (A ∩ V) = ∅,
(ii) A ∩ U ≠ ∅,
(iii) A ∩ V ≠ ∅,
(iv) (A ∩ U) ∪ (A ∩ V) = A.

A is called connected, if it is not disconnected.

Exercise: (M, d) metric space, A ⊂ M. Then

U′ is open in (A, d|_A) ⇐⇒ ∃U open in M with U′ = U ∩ A.

From the exercise we see: A is connected, iff A is not the union of two disjoint nonempty open subsets of A.

Definition. I ⊂ R is an interval :⇐⇒ ∀a, b ∈ I, c ∈ R: a < c < b =⇒ c ∈ I.

Example. Let a, b ∈ R. Then [a, b[ := {x ∈ R | a ≤ x < b}, {a}, and ∅ are intervals.

3.5.2. Lemma. I ⊂ R is connected ⇐⇒ I ⊂ R is an interval.

Proof. We may assume that I contains more than one point, because ∅ and single points are connected intervals.

“⇐=”: Suppose I is a disconnected interval. Then I = U ∪ V, where U and V are disjoint nonempty open subsets of I. Thus we can find a ∈ U and b ∈ V with a < b (rename if necessary). Define

c := sup{x ∈ R | [a, x[ ⊂ U}.

Then a ≤ c ≤ b, hence c ∈ I, because a, b ∈ I and I is an interval. Clearly, c ∈ cl(U) (closure in I), because c is the limit of an increasing sequence in U that converges in I. But U is closed in I, since U = V^c. Thus c ∈ U, and so c < b and moreover, since U is also open in I: ∃δ > 0 with B(c, δ) ∩ I ⊂ U and c + δ < b. Then [c, c + δ[ ⊂ [c, b] ⊂ I since c, b ∈ I and I is an interval. Hence [c, c + δ[ ⊂ B(c, δ) ∩ I ⊂ U and so [a, c + δ[ ⊂ U, which is a contradiction to the definition of c.

“=⇒”: Let I ⊂ R be connected. If I is not an interval, then we can find a, b ∈ I and c ∉ I with a < c < b. Hence

I ∩ (c, ∞), I ∩ (−∞, c),

are disjoint nonempty open subsets of I whose union is I. Thus I is disconnected, a contradiction. □

3.5.3. Theorem. Path-connected sets are connected.

Proof. Let A be a path-connected set that is not connected. Then: ∃U, V nonempty, open in A, with U ∩ V = ∅ and U ∪ V = A. Let x ∈ U, y ∈ V. A path-connected =⇒ ∃ path ϕ : [a, b] → A from x = ϕ(a) to y = ϕ(b). Let

C := ϕ^{−1}(U), D := ϕ^{−1}(V).

Clearly

C ∪ D = ϕ^{−1}(U ∪ V) = [a, b],

C ∩ D = ϕ^{−1}(U ∩ V) = ∅,

a ∈ C ≠ ∅, and b ∈ D ≠ ∅.

Hence it is enough to show that C, D are closed in [a, b], because then [a, b] is disconnected, which contradicts the previous Lemma 3.5.2. Let hence t_k ∈ C, t_k → t ∈ [a, b]. By definition of C, we have ϕ(t_k) ∈ U and by continuity of ϕ, we have ϕ(t_k) → ϕ(t) ∈ U since U is closed in A. Thus t ∈ C and so C is closed in [a, b]. Analogously D is closed in [a, b]. □

Remark. Let U ⊂ R^n, U open. Then

U is path-connected ⇐⇒ U is connected.

Proof. “=⇒” follows from Theorem 3.5.3.

“⇐=”: Let x_0 ∈ U be arbitrary. Define

V := {y ∈ U | ∃ path from x_0 to y} ⊂ U.

Obviously x_0 ∈ V, hence V ≠ ∅. Suppose we can show that V is both open and closed in U. Then V = U. (Otherwise U is the disjoint union of the two nonempty open sets V and U \ V, hence disconnected.) But if V = U, then U is path-connected and we are done. Now, V is open in U for the following reason:

If y ∈ V, choose ε > 0 with B(y, ε) ⊂ U. If z ∈ B(y, ε), then there is a path from x_0 to z, since there is a path from x_0 to y and from y to z (straight line). Thus z ∈ V, i.e. B(y, ε) ⊂ V and V is open.

To show that V is closed in U, let y_k ∈ V be such that y_k → y ∈ U. Since U is open: ∃ε > 0 with B(y, ε) ⊂ U. Since y_k → y: ∃N ∈ N with y_k ∈ B(y, ε), ∀k ≥ N. There is a path from x_0 to y_N (since y_N ∈ V) and from y_N to y (straight line), hence from x_0 to y, i.e. y ∈ V and so V is also closed in U. □

3.6. Section 3 Additional Examples

1) Find an open cover of (0, 1] that has no finite subcover.

Solution: U_n = (1/n, 2), n ≥ 1.

2) Let (x_n)_{n≥1} converge in a metric space to x. Show that

K := {x_n | n ≥ 1} ∪ {x} is compact.

Proof. Let {U_i}_{i∈I} be an open cover of K. Then x ∈ U_{i_0} for some i_0 ∈ I. By 2.7.1, there exists N = N(U_{i_0}) ∈ N with x_n ∈ U_{i_0}, ∀n ≥ N. If N > 1, choose i_1, . . . , i_{N−1} ∈ I such that x_k ∈ U_{i_k}, k = 1, . . . , N − 1. Then

K ⊂ U_{i_0} ∪ ⋃_{k=1}^{N−1} U_{i_k}.

□

3) A closed subset F of a compact metric space K is compact.

1st proof. Let {U_i}_{i∈I} be an open cover of F. Then {U_i}_{i∈I} ∪ {K \ F} is an open cover of K, which has a finite subcover since K is compact. Thus F has a finite subcover. □

2nd proof. Let (x_n)_{n≥1} be a sequence in F. Then (x_n)_{n≥1} is a sequence in K and has a convergent subsequence, say x_{n_k} → x ∈ K by 3.1.5. Since F is closed, we have x ∈ F. Thus F is sequentially compact. □

3rd proof. Since K is totally bounded and F ⊂ K, F is also totally bounded. But F is also complete since F is a closed subset of a complete space K. Hence F is compact by 3.1.5. □

4) Special examples of path-connected sets.

a) The Sierpinski carpet is path-connected (without proof).

b) Consider the following recursively defined subset C ⊂ [0, 1]:

F_1 := [0, 1/3] ∪ [2/3, 1], obtained from [0, 1] by removing the inner third.

F_2 := [0, 1/9] ∪ [2/9, 3/9] ∪ [6/9, 7/9] ∪ [8/9, 1], obtained from F_1 by removing the inner thirds of the intervals of F_1.

In general, F_n is a union of intervals and F_{n+1} is obtained from F_n by removing the inner thirds of these intervals. Then

C := ⋂_{n≥1} F_n is called the Cantor set.

C has for instance the following properties:

(i) C is compact and C ≠ ∅ (exactly as for the Sierpinski carpet, we can use the nested set property to show this).

(ii) C has infinitely many points (just look at the end points of the intervals in F_n).

(iii) int(C) = ∅.
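The recursive construction of the sets F_n can be carried out with exact fractions. The sketch below uses hypothetical helper names and represents each F_n as a list of closed intervals (left, right). It reproduces F_1 and F_2 from the text and shows that F_n consists of 2^n intervals of total length (2/3)^n; since this length tends to 0, C cannot contain an interval, which is consistent with int(C) = ∅.

```python
from fractions import Fraction


def remove_inner_thirds(intervals):
    """Replace each closed interval [l, r] by its outer thirds."""
    out = []
    for left, right in intervals:
        third = (right - left) / 3
        out.append((left, left + third))
        out.append((right - third, right))
    return out


def cantor_stage(n):
    """Intervals of F_n, where F_1 = [0, 1/3] ∪ [2/3, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = remove_inner_thirds(intervals)
    return intervals


def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum(right - left for left, right in intervals)


if __name__ == "__main__":
    print(cantor_stage(2))
    for n in (1, 2, 5, 10):
        print(n, len(cantor_stage(n)), total_length(cantor_stage(n)))
```

The end points produced at any stage are never removed later, which is the observation behind property (ii).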

5) Examples of connected sets that are not path-connected (without proof):

(i) {(x, sin(1/x)) | x > 0} ∪ {(0, y) | y ∈ [−1, 1]} ⊂ R^2.

(ii) L = {(1, 0)} ∪ ⋃_{n≥1} L_n ⊂ R^2, where L_n = {(x, x/n) ∈ R^2 | x ∈ [0, 1]}.

(iii) D = [0, 1] × {0} ∪ ⋃_{n≥1} ({1/n} × [0, 1]) ∪ {(0, 1)}.

6) Let A ⊂ R^n, x ∈ A, y ∈ R^n \ A, and ϕ : [0, 1] → R^n be a path from x to y. Show: ∃t ∈ [0, 1] with ϕ(t) ∈ ∂A.

Proof. Let B := {x ∈ [0, 1] | ϕ([0, x]) ⊂ A}. Then 0 ∈ B, hence ∅ ≠ B ⊂ [0, 1] and t := sup(B) exists. We will show ϕ(t) ∈ ∂A.

Let ε > 0 be arbitrary. Choose t_k ∈ [0, t], t_k → t such that ϕ(t_k) ∈ A (possible by the definition of t). Since ϕ(t_k) → ϕ(t), ∃N_1 = N_1(ε) such that ϕ(t_k) ∈ B(ϕ(t), ε), ∀k ≥ N_1. By definition of t, there exists for any k ≥ 1 some s_k ∈ [0, 1] with t ≤ s_k ≤ t + 1/k and ϕ(s_k) ∉ A. Again since ϕ(s_k) → ϕ(t), ∃N_2 = N_2(ε) with ϕ(s_k) ∈ B(ϕ(t), ε) ∀k ≥ N_2. Thus B(ϕ(t), ε) ∩ A ≠ ∅ and B(ϕ(t), ε) ∩ A^c ≠ ∅, i.e. ϕ(t) ∈ ∂A by 2.6.2.

□

CHAPTER 4

Continuous mappings

4.1. Continuity

Let (M,d) and (N, ρ) be two metric spaces, A ⊂M and

f : A −→ N

be a given map.

4.1.1. Definition. Let x_0 ∈ acc(A). We say b ∈ N is the limit of f at x_0, written

b = lim_{x→x_0} f(x),

if for any ε > 0, there exists a δ > 0 such that for all x ∈ A \ {x_0} with d(x, x_0) < δ, we have ρ(f(x), b) < ε.

Remark. (a) If lim_{x→x_0} f(x) exists, it is unique.

Proof. (Obvious 2ε-argument.) Suppose b, b′ = lim_{x→x_0} f(x), and let ε > 0 be arbitrary. ∃δ_1, δ_2 > 0 with

x ∈ A \ {x_0}, d(x, x_0) < δ_1 =⇒ ρ(f(x), b) < ε,

x ∈ A \ {x_0}, d(x, x_0) < δ_2 =⇒ ρ(f(x), b′) < ε.

Thus

x ∈ A \ {x_0}, d(x, x_0) < min(δ_1, δ_2) =⇒ ρ(b′, b) ≤ ρ(b, f(x)) + ρ(f(x), b′) < 2ε,

where such an x exists for every ε > 0 since x_0 ∈ acc(A). Since ε > 0 was arbitrary, b = b′.

□

(b) If x_0 ∈ A and lim_{x→x_0} f(x) exists, it can be different from f(x_0). Let e.g.

f : A = R → N = R, f(x) := x for x ∈ R \ {0}, f(0) := 1, x_0 = 0 (ε = δ).

Then lim_{x→0} f(x) = 0 ≠ 1 = f(0).

Notation: If A = ]x_0, a] ⊂ R, a > x_0 (resp. A = [a, x_0[ ⊂ R, a < x_0) and N ⊂ R, then one writes

lim_{x→x_0} f(x) =: lim_{x→x_0+} f(x) = b (resp. lim_{x→x_0} f(x) =: lim_{x→x_0−} f(x) = b).

4.1.2. Definition. Let A ⊂ M and f : A → N. f is continuous at x_0 ∈ A, if either x_0 ∉ acc(A) or (x_0 ∈ acc(A) and) lim_{x→x_0} f(x) = f(x_0).

Equivalent formulation: f is continuous at x_0 ∈ A, if ∀ε > 0, ∃δ > 0 such that

x ∈ A and d(x, x_0) < δ =⇒ ρ(f(x), f(x_0)) < ε.

4.1.3. Definition. f : A ⊂ M → N is continuous on B ⊂ A, if it is continuous at each x ∈ B. If B = A we simply say that f is continuous.


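4.1.4. Theorem. Let $f : A \subset M \to N$ be a map. Equivalent are:

(i) $f$ is continuous,
(ii) for each convergent sequence $x_k \to x_0$ in $A$ we have $f(x_k) \to f(x_0)$,
(iii) $\forall U$ open in $N$, $f^{-1}(U)$ is open in $A$,
(iv) $\forall F$ closed in $N$, $f^{-1}(F)$ is closed in $A$.

Proof. (i) ⇒ (ii): Suppose $x_k \to x_0$ in $A$. Let $\varepsilon > 0$. By (i), $\exists \delta > 0$ with

\[ x \in A \text{ and } d(x, x_0) < \delta \implies \rho(f(x), f(x_0)) < \varepsilon. \]

Let $N(\delta) \in \mathbb{N}$ be such that $d(x_k, x_0) < \delta$, $\forall k \ge N(\delta)$. Then

\[ \rho(f(x_k), f(x_0)) < \varepsilon, \quad \forall k \ge N(\delta), \]

and (ii) is shown.

(ii) ⇒ (iv): Let $\emptyset \neq F \subset N$, $F$ closed in $N$. Let $x_k \in f^{-1}(F)$, $k \in \mathbb{N}$. Suppose $x_k \to x_0 \in A$. By (ii), $f(x_k) \to f(x_0)$ in $N$. Since $f(x_k) \in F$, $k \in \mathbb{N}$, and $F$ is closed, we have $f(x_0) \in F$. Hence $x_0 \in f^{-1}(F)$, and so $f^{-1}(F)$ is closed in $A$.

(iv) ⇒ (iii): If $U$ is open in $N$, then $F := N \setminus U$ is closed in $N$. Thus $f^{-1}(N \setminus U) = f^{-1}(N) \setminus f^{-1}(U) = A \setminus f^{-1}(U)$ is closed in $A$ by (iv), and so $f^{-1}(U)$ is open in $A$.

(iii) ⇒ (i): Let $x_0 \in A$, $\varepsilon > 0$. Since $B(f(x_0), \varepsilon)$ is open in $N$, $f^{-1}(B(f(x_0), \varepsilon))$ is open in $A$ by (iii). Thus for $x_0 \in f^{-1}(B(f(x_0), \varepsilon))$, $\exists \delta > 0$ with $B(x_0, \delta) \cap A \subset f^{-1}(B(f(x_0), \varepsilon))$, and so

\[ x \in A \text{ and } d(x, x_0) < \delta \implies x \in f^{-1}\big(B(f(x_0), \varepsilon)\big) \implies f(x) \in B(f(x_0), \varepsilon) \implies \rho(f(x), f(x_0)) < \varepsilon. \]

□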

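4.1.5. Remark. In the situation of 4.1.1, we have:

\[ \lim_{x \to x_0} f(x) = b \iff \lim_{k \to \infty} f(x_k) = b \ \text{for any sequence } (x_k)_{k \in \mathbb{N}} \text{ in } A \text{ with } x_k \to x_0 \text{ and } x_k \neq x_0,\ \forall k \in \mathbb{N}. \]

Proof. "⟹": Exactly as in 4.1.4 (i) ⟹ (ii).

"⟸": Suppose that the right hand statement is true but that we do not have $\lim_{x \to x_0} f(x) = b$. Then there exists $\varepsilon > 0$ such that for any $\delta > 0$: $\exists x \in (A \setminus \{x_0\}) \cap B(x_0, \delta)$ with $\rho(f(x), b) \ge \varepsilon$. For $\delta_n = \frac{1}{n}$, $n \in \mathbb{N}$, choose $x_n \in (A \setminus \{x_0\}) \cap B(x_0, \frac{1}{n})$ with $\rho(f(x_n), b) \ge \varepsilon$. Then $x_n \to x_0$ and $x_n \neq x_0$, $\forall n \in \mathbb{N}$, but $f(x_n) \not\to b$, a contradiction. □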

4.2. Images of compact and connected sets

Let (M,d) and (N, ρ) be metric spaces.

4.2.1. Theorem. Let $f : M \to N$ be continuous and $K \subset M$ be connected (resp. path-connected). Then $f(K)$ is connected (resp. path-connected).

Proof. Suppose that $K$ is connected but $f(K)$ is disconnected. Then

\[ \exists U \subset f(K),\ U \neq \emptyset,\ U \neq f(K), \]

that is both open and closed in $f(K)$. Thus we can find $U'$ open in $N$ and $F'$ closed in $N$ with

\[ U = f(K) \cap U' = f(K) \cap F'. \]

Since $g := f|_K : K \to f(K)$ is continuous (by 4.1.4(ii)), we have that $g^{-1}(U) = K \cap f^{-1}(U') = K \cap f^{-1}(F')$ is both open and closed in $K$ (by 4.1.4(iii), (iv)). But $\emptyset \neq g^{-1}(U) \neq K$, hence $K$ is disconnected, a contradiction.

Now suppose that $K$ is path-connected and let $f(x), f(y)$, $x, y \in K$, be two arbitrary elements of $f(K)$. Since $K$ is path-connected, there exists a path $c : [a,b] \to K$ from $x$ to $y$. Then $f \circ c : [a,b] \to f(K)$ is a path from $f(x)$ to $f(y)$. Indeed

\[ t_k \to t_0 \overset{c \text{ continuous}}{\implies} c(t_k) \to c(t_0) \overset{f \text{ continuous}}{\implies} f(c(t_k)) \to f(c(t_0)). \]

□

Example. There is no continuous bijection $f : \mathbb{R}^2 \to \mathbb{R}$.

Proof. If there is one, then $f : \mathbb{R}^2 \setminus \{0\} \to \mathbb{R} \setminus \{f(0)\}$ is still a continuous bijection. Hence $f(\mathbb{R}^2 \setminus \{0\}) = \mathbb{R} \setminus \{f(0)\}$, since $f$ is bijective, and $\mathbb{R} \setminus \{f(0)\}$ is connected by 4.2.1, since $f$ is continuous and $\mathbb{R}^2 \setminus \{0\}$ is connected. But $\mathbb{R} \setminus \{f(0)\}$ is clearly not connected, a contradiction. □

4.2.2. Theorem. Let $f : M \to N$ be continuous, and $K \subset M$ be compact. Then $f(K)$ is compact.

Proof. Let $f(x_n)$, $x_n \in K$, $n \in \mathbb{N}$, be a sequence in $f(K)$. Since $(x_n)_{n \in \mathbb{N}}$ is a sequence in $K$ and $K$ is compact, $(x_n)_{n \in \mathbb{N}}$ has a convergent subsequence in $K$, say $x_{n_k} \to x \in K$ as $k \to \infty$. Since $f$ is continuous, $y_{n_k} := f(x_{n_k}) \to f(x) \in f(K)$, hence $(f(x_n))_{n \in \mathbb{N}}$ has a convergent subsequence in $f(K)$ and so $f(K)$ is compact. □

Example. A similar statement to 4.2.2 does not hold for open or closed sets. E.g.


and

4.3. Operations on continuous mappings

4.3.1. Theorem. Let $M, N, P$ be metric spaces. Let $f : A \subset M \to N$ and $g : B \subset N \to P$ be continuous maps and $f(A) \subset B$. Then

\[ g \circ f : A \subset M \to P \ \text{is continuous.} \]

Proof. Let $U$ be open in $P$. By 4.1.4(iii), $g^{-1}(U) = U' \cap B$ for some $U'$ open in $N$, and $f^{-1}(U') = U'' \cap A$ for some $U''$ open in $M$. Since $f(A) \subset B$, we have $f^{-1}(B) = A$. Then

\[ (g \circ f)^{-1}(U) = f^{-1}\big(g^{-1}(U)\big) = f^{-1}(U') \cap f^{-1}(B) = U'' \cap A \]

is open in $A$, and by 4.1.4(iii) $g \circ f$ is continuous. □

Remark. The latter proof may be easier using the characterization of continuity with sequences.

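4.3.2. Proposition. Let $M$ be a metric space, $A \subset M$, $x_0 \in \mathrm{acc}(A)$. Let further $V$ be a real normed vector space and

\[ f : A \to V \ \text{with } \exists \lim_{x \to x_0} f(x) =: a, \qquad g : A \to V \ \text{with } \exists \lim_{x \to x_0} g(x) =: b, \qquad h : A \to \mathbb{R} \ \text{with } \exists \lim_{x \to x_0} h(x) =: r. \]

Then, we have:

(i) $\exists \lim_{x \to x_0} (f + g)(x) = a + b$, where $(f + g)(x) := f(x) + g(x)$.

(ii) $\exists \lim_{x \to x_0} (h \cdot g)(x) = r \cdot b$, where $(h \cdot g)(x) := h(x) g(x)$.

(iii) If $r \neq 0$ then $h \neq 0$ in a "neighborhood" $U_{x_0}$ of $x_0$ and

\[ \exists \lim_{x \to x_0,\ x \in U_{x_0}} \frac{g}{h}(x) = \frac{b}{r}, \qquad \Big(\frac{g}{h}(x) := \frac{g(x)}{h(x)} \ \text{whenever it makes sense}\Big). \]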


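Proof. The proof is very similar to the corresponding one of 1.2.4. Therefore we assume (i), (ii) to be already proved and only show (iii) (cf. 1.2.4(iii)):

Since $\lim_{x \to x_0} h(x) = r$: for $\varepsilon_r := \frac{|r|}{2} > 0$, $\exists \delta(\varepsilon_r) > 0$ with

\[ x \in U_{x_0} := (A \setminus \{x_0\}) \cap B(x_0, \delta(\varepsilon_r)) \implies h(x) \in B\Big(r, \frac{|r|}{2}\Big). \]

Thus

\[ |h(x)| > \frac{|r|}{2}, \quad \forall x \in U_{x_0}. \tag{$*$} \]

By (ii), it suffices to show

\[ \lim_{x \to x_0,\ x \in U_{x_0}} \frac{1}{h(x)} = \frac{1}{r}. \]

Let $\varepsilon > 0$. Again, since $\lim_{x \to x_0} h(x) = r$: for $\tilde\varepsilon := \frac{\varepsilon r^2}{2}$, $\exists \delta(\tilde\varepsilon) > 0$ with

\[ x \in (A \setminus \{x_0\}) \cap B(x_0, \delta(\tilde\varepsilon)) \implies h(x) \in B(r, \tilde\varepsilon). \tag{$**$} \]

Thus there exists $\delta := \min(\delta(\tilde\varepsilon), \delta(\varepsilon_r))$ with

\[
\begin{aligned}
x \in U_{x_0} \cap B(x_0, \delta) = (A \setminus \{x_0\}) \cap B(x_0, \delta)
&\overset{(*),(**)}{\implies} |h(x)| > \frac{|r|}{2} \ \text{and} \ |h(x) - r| < \frac{\varepsilon r^2}{2} \\
&\implies \Big|\frac{1}{h(x)} - \frac{1}{r}\Big| = \Big|\frac{r - h(x)}{h(x) \cdot r}\Big| = \frac{1}{|h(x)| \cdot |r|} \cdot |r - h(x)| < \frac{2}{r^2} \cdot \frac{\varepsilon r^2}{2} = \varepsilon \\
&\implies \frac{1}{h(x)} \in B\Big(\frac{1}{r}, \varepsilon\Big).
\end{aligned}
\]

Note: we could also have used Remark 4.1.5 to show the statements. □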

4.3.3. Corollary. Let $M$ be a metric space and $V$ be a real normed vector space. Let further $A \subset M$, $x_0 \in A$ and suppose that

\[ f, g : A \to V \quad \text{and} \quad h : A \to \mathbb{R} \]

are continuous at $x_0 \in A$. Then, we have:

(i) $f + g : A \to V$ is continuous at $x_0 \in A$,
(ii) $h \cdot g : A \to V$ is continuous at $x_0 \in A$,
(iii) if $h(x_0) \neq 0$, then $h(x) \neq 0$ for all $x$ in some neighborhood $U_{x_0}$ of $x_0$ and $\frac{g}{h} : U_{x_0} \to V$ is continuous at $x_0 \in A$.

Proof. By 4.1.4(ii) $f$ (and similarly $g$ and $h$) is continuous at $x_0 \in A$ if and only if

for each convergent sequence $x_n \to x_0$ in $A$, we have $f(x_n) \to f(x_0)$.

Using this fact and the limit theorem for sequences 1.2.4, the assertions follow immediately. The existence of a (true) neighborhood $U_{x_0} = A \cap B(x_0, \delta)$ (containing $x_0$) as in (iii) follows similarly to 4.3.2(iii). □


4.4. Boundedness of continuous functions on compact sets

4.4.1. Theorem. [Maximum-Minimum Theorem] Let $(M, d)$ be a metric space, $\emptyset \neq K \subset M$, $K$ compact, and $f : M \to \mathbb{R}$ be continuous. Then $f$ is bounded on $K$ and attains its maximum and minimum on $K$, i.e. $\exists x_0, x_1 \in K$ with

\[ f(x_0) = \inf_{x \in K} f(x) := \inf\{f(x) \mid x \in K\}, \qquad f(x_1) = \sup_{x \in K} f(x) := \sup\{f(x) \mid x \in K\}. \]

Proof. By 4.2.2 $f(K)$ is compact, hence closed and bounded by 3.2.1. Since $f(K)$ is bounded, $\inf(f(K))$ and $\sup(f(K))$ are finite, and by definition (of inf and sup) there exist sequences in $f(K)$ with

\[ f(y_n) \to \inf(f(K)) \quad \text{and} \quad f(z_n) \to \sup(f(K)) \quad \text{as } n \to \infty. \]

Since $f(K)$ is closed we have $\lim_{n \to \infty} f(y_n), \lim_{n \to \infty} f(z_n) \in f(K)$, hence $\inf(f(K)) \in f(K)$ and $\sup(f(K)) \in f(K)$ as desired. □

4.5. The intermediate value theorem

4.5.1. Theorem (Intermediate Value Theorem (IVT)). Let $M$ be a metric space and $K \subset M$ be connected. Let $f : M \to \mathbb{R}$ be continuous. Let $x, y \in K$, $c \in \mathbb{R}$, with $f(x) < c < f(y)$. Then there exists $z \in K$ with $f(z) = c$.

Proof. Suppose there is no such $z$. Let $U := (-\infty, c)$, $V := (c, \infty)$. By continuity of $f$, $f^{-1}(U) \cap K$ and $f^{-1}(V) \cap K$ are open in $K$ and nonempty, because $x \in f^{-1}(U)$, $y \in f^{-1}(V)$ and $x, y \in K$. Moreover

\[ \big(f^{-1}(U) \cap K\big) \cap \big(f^{-1}(V) \cap K\big) = \emptyset \quad \text{and} \quad \big(f^{-1}(U) \cap K\big) \cup \big(f^{-1}(V) \cap K\big) = K. \]

Hence $K$ is disconnected, a contradiction. □

Alternative proof. $K$ connected $\overset{4.2.1}{\implies}$ $f(K)$ connected $\overset{\text{Lemma } 3.5.2}{\implies}$ $f(K)$ is an interval. Hence for all $f(x), f(y) \in f(K)$ and $c \in \mathbb{R}$ with $f(x) < c < f(y)$ it follows that $c \in f(K)$, and so there exists $z \in K$ with $f(z) = c$. □

Example. (i) Let $f : [0,1] \to [0,1]$ be continuous. Then $f$ has a fixed point, i.e. $\exists x \in [0,1]$ with $f(x) = x$.

Proof. Set $g(x) := f(x) - x$, $x \in [0,1]$. Then $g : [0,1] \to [-1,1]$ is again continuous (use e.g. 4.3.3 (i)). Suppose $g(0) \neq 0$ and $g(1) \neq 0$ (otherwise we are done). Then $g(1) < 0 < g(0)$, hence by the IVT: $\exists z \in [0,1]$ with $g(z) = 0$. □
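The IVT argument for the fixed point can be turned into a small numerical experiment. The following is only a sketch and not part of the notes: the function name `fixed_point`, the tolerance, and the choice of `math.cos` (which maps $[0,1]$ into $[0,1]$) are illustrative assumptions. It bisects on $g(x) = f(x) - x$, keeping $g \ge 0$ at the left endpoint and $g < 0$ at the right one, exactly the sign pattern used in the proof.

```python
import math

def fixed_point(f, tol=1e-12):
    """Bisection on g(x) = f(x) - x over [0, 1].

    Assumes f is continuous and maps [0, 1] into [0, 1],
    so that g(0) >= 0 >= g(1) as in the proof above.
    """
    g = lambda x: f(x) - x
    lo, hi = 0.0, 1.0
    if g(lo) == 0:
        return lo
    if g(hi) == 0:
        return hi
    # Invariant: g(lo) > 0 and g(hi) <= 0, so a zero of g lies in [lo, hi].
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Illustrative choice: cos maps [0, 1] into [cos(1), 1], a subset of [0, 1].
x_star = fixed_point(math.cos)
```

Bisection only locates one fixed point; the theorem itself makes no uniqueness claim.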

(ii) Let $f : \mathbb{R} \to \mathbb{R}$ be a polynomial of degree $n$, i.e.

\[ f(x) = a_n x^n + \dots + a_1 x + a_0, \quad a_n \neq 0,\ a_i \in \mathbb{R},\ n \in \mathbb{N}. \]

If $n$ is odd, then $f$ has a real root.

Proof. Clearly $f$ is continuous (use 4.1.4 (ii)). For $x \neq 0$ we have

\[ f(x) = a_n x^n \Big(1 + \frac{a_{n-1}}{a_n \cdot x} + \frac{a_{n-2}}{a_n \cdot x^2} + \dots + \frac{a_0}{a_n \cdot x^n}\Big). \]

Thus for large $l \in \mathbb{N}$

\[ f(l) \approx a_n l^n, \qquad f(-l) \approx -a_n l^n, \]

so $f(l)$ and $f(-l)$ have opposite sign $\implies \exists x_0 \in [-l, l]$ with $f(x_0) = 0$. □

4.6. Uniform continuity

Let (M,d) and (N, ρ) be metric spaces.

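4.6.1. Definition. $f : A \subset M \to N$ is called uniformly continuous on $A$ if for every $\varepsilon > 0$ there is some $\delta > 0$ such that

\[ x, y \in A \text{ and } d(x, y) < \delta \implies \rho(f(x), f(y)) < \varepsilon. \]

Remark. (i) Uniform continuity is a property of a function on a set (continuity can be defined pointwise). Clearly:

\[ f \text{ uniformly continuous on } A \implies f \text{ continuous at every } x \in A \implies f \text{ continuous on } A. \]

(ii) If $f$ is continuous on $A$, then for each $\varepsilon > 0$ and $x_0 \in A$, there exists a $\delta > 0$ such that

\[ x \in A \text{ and } d(x, x_0) < \delta \implies \rho(f(x), f(x_0)) < \varepsilon. \]

The $\delta$ depends on $\varepsilon$ and $x_0$, whereas if $f$ is uniformly continuous, the $\delta$ only depends on $\varepsilon$ (and not on $x_0 \in A$).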

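Example. Let $f : \,]0,1] \to \mathbb{R}$, $f(x) := \frac{1}{x}$. Clearly, $f$ is continuous (use 4.1.4(ii)). But $f$ is not uniformly continuous.

Proof. Suppose $f$ is uniformly continuous. Then for $\varepsilon = 1$, $\exists \delta > 0$ with

\[ x, y \in \,]0,1] \text{ and } |x - y| < \delta \implies |f(x) - f(y)| < 1. \]

But $\exists n \ge 1$ with $\big|\frac{1}{n} - \frac{1}{2n}\big| = \frac{1}{2n} < \delta$ and $\big|f\big(\frac{1}{n}\big) - f\big(\frac{1}{2n}\big)\big| = n \ge 1$, a contradiction. □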
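The counterexample pair $\frac{1}{n}, \frac{1}{2n}$ from the proof can be produced explicitly for any given $\delta$. The sketch below is illustrative only; the helper name `witness` is an assumption, and exact rational arithmetic via `fractions.Fraction` is used so that no rounding enters the comparison.

```python
from fractions import Fraction

def f(x):
    """The function f(x) = 1/x on ]0, 1]."""
    return 1 / x

def witness(delta):
    """Return (x, y) in ]0, 1] with |x - y| < delta but |f(x) - f(y)| >= 1.

    Uses the pair x = 1/n, y = 1/(2n) from the proof, with the smallest n
    such that 1/(2n) < delta.
    """
    n = 1
    while Fraction(1, 2 * n) >= delta:
        n += 1
    return Fraction(1, n), Fraction(1, 2 * n)

x, y = witness(Fraction(1, 1000))
```

However small $\delta$ is chosen, `witness` returns a pair that violates the $\varepsilon = 1$ condition, which is what the failure of uniform continuity means.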

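4.6.2. Theorem. Let $f : M \to N$ be continuous and $K \subset M$ be a compact set. Then $f$ is uniformly continuous on $K$.

Proof. Let $\varepsilon > 0$. By continuity of $f$: for any $x \in K$, $\exists \delta_x > 0$ with

\[ y \in K \cap B(x, \delta_x) \implies \rho(f(x), f(y)) < \frac{\varepsilon}{2}. \tag{$*$} \]

Since $\{B(x, \frac{\delta_x}{2})\}_{x \in K}$ is an open cover of the compact set $K$, there exists a finite subcover $B(x_1, \frac{\delta_{x_1}}{2}), \dots, B(x_N, \frac{\delta_{x_N}}{2})$ of $K$. Let $\delta := \min\big(\frac{\delta_{x_1}}{2}, \dots, \frac{\delta_{x_N}}{2}\big)$. Then

\[
\begin{aligned}
z, y \in K \text{ and } d(z, y) < \delta
&\implies \exists i \in \{1, \dots, N\} \text{ with } y \in K \cap B\Big(x_i, \frac{\delta_{x_i}}{2}\Big) \text{ and } d(z, x_i) \le d(z, y) + d(y, x_i) < \delta + \frac{\delta_{x_i}}{2} \le \delta_{x_i} \text{ and } z \in K \\
&\implies \exists i \in \{1, \dots, N\} \text{ with } z, y \in K \cap B(x_i, \delta_{x_i}) \\
&\implies \rho(f(z), f(y)) \le \rho(f(z), f(x_i)) + \rho(f(x_i), f(y)) \overset{(*)}{<} \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned}
\]

□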

4.6.3. Example. Let $f : \,]0,1] \to \mathbb{R}$, $f(x) := \frac{1}{x}$. Then $f$ is uniformly continuous on $[a, 1]$ for any $a \in \,]0,1]$.

4.7. Differentiation of functions of one variable

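4.7.1. Definition. Let $I := \,]a,b[\, \subset \mathbb{R}$ be an open interval and $x_0 \in I$. We say that $f : I \to \mathbb{R}$ is differentiable at $x_0$ if

\[ \exists f'(x_0) := \lim_{h \to 0,\ h \neq 0} \frac{f(x_0 + h) - f(x_0)}{h}. \]

In this case $f'(x_0)$ is called the derivative of $f$ at $x_0$.

Remark. The limit is in the sense of 4.1.1 with $g(h) = \frac{f(x_0 + h) - f(x_0)}{h}$, which is defined in

\[ A := \,]a - x_0, b - x_0[\, \setminus \{0\} \qquad (a - x_0 < 0 < b - x_0), \]

and the limit is considered at $0 \in \mathrm{acc}(A)$. Thus $f$ is differentiable at $x_0$ with derivative $f'(x_0)$ if $\forall \varepsilon > 0$, $\exists \delta > 0$ with

\[ h \in \big(]a - x_0, b - x_0[\, \setminus \{0\}\big) \cap B(0, \delta) \implies |g(h) - f'(x_0)| < \varepsilon, \]

where $h \in \,]a - x_0, b - x_0[\, \setminus \{0\} \iff h \neq 0$ and $x_0 + h \in \,]a,b[$. Multiplying with $|h|$, we get

\[ |f(x_0 + h) - f(x_0) - f'(x_0) \cdot h| < \varepsilon \cdot |h|. \tag{4.1} \]

In (4.1), we can allow $h = 0$ replacing "$<$" by "$\le$".

Setting $h = x - x_0$ in 4.7.1 we get

\[ \lim_{x \to x_0,\ x \in I \setminus \{x_0\}} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0), \]

where the difference quotient is a function defined in $I \setminus \{x_0\}$, and $x_0 \in \mathrm{acc}(I \setminus \{x_0\})$.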

4.7.2. Proposition. $f$ differentiable at $x_0 \implies f$ continuous at $x_0$.

Proof. We have

\[ \lim_{x \to x_0,\ x \in I \setminus \{x_0\}} f(x) = \lim_{x \to x_0,\ x \in I \setminus \{x_0\}} \Big(\frac{f(x) - f(x_0)}{x - x_0}(x - x_0) + f(x_0)\Big) \overset{4.3.2}{=} f'(x_0) \cdot 0 + f(x_0) = f(x_0), \]

and so

\[ \lim_{x \to x_0,\ x \in I \setminus \{x_0\}} f(x) = f(x_0) \implies \lim_{x \to x_0} f(x) = f(x_0). \]

□

Alternative proof. By (4.1): $\forall \varepsilon > 0$, $\exists \delta > 0$ such that $x_0 + h \in \,]a,b[$ and $|h| < \delta$ implies

\[ |f(x_0 + h) - f(x_0) - f'(x_0) \cdot h| \le \varepsilon |h|, \]

hence

\[ |f(x_0 + h) - f(x_0)| \le |f'(x_0) \cdot h| + \varepsilon |h| = \big(|f'(x_0)| + \varepsilon\big) \cdot |h|, \]

and so $f$ even has a Lipschitz property at $x_0$. □

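Remark. By Remark 4.1.5, 4.7.1 can be reformulated as:

"For any sequence $(h_n)_{n \in \mathbb{N}}$ in $\mathbb{R}$ with $h_n \neq 0$, $x_0 + h_n \in I$ for all $n \in \mathbb{N}$, and $\lim_{n \to \infty} h_n = 0$, we have

\[ \exists \lim_{n \to \infty} \frac{f(x_0 + h_n) - f(x_0)}{h_n} =: f'(x_0)." \]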

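Example. (i) $f(x) = |x|$. Then $f$ is uniformly continuous on $\mathbb{R}$, but not differentiable at $x = 0$, because

\[ |f(x) - f(y)| = \big||x| - |y|\big| \le |x - y| \quad (\text{for any } \varepsilon > 0 \text{ choose } \delta = \varepsilon) \]

and

\[ \lim_{h \to 0+} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0+} \frac{|h|}{h} = 1, \qquad \lim_{h \to 0-} \frac{f(0 + h) - f(0)}{h} = \lim_{h \to 0-} \frac{|h|}{h} = -1 \]

\[ \implies \nexists \lim_{h \to 0,\ h \neq 0} \frac{f(0 + h) - f(0)}{h}. \]

(ii) $f(x) := x^2$, $f : \mathbb{R} \to \mathbb{R}$. Then for any $x \in \mathbb{R}$

\[ \frac{f(x + h) - f(x)}{h} = \frac{(x + h)^2 - x^2}{h} = \frac{2xh + h^2}{h} = 2x + h \longrightarrow 2x \quad \text{as } h \to 0,\ h \neq 0, \]

$\implies \forall x \in \mathbb{R}$, $\exists f'(x) = 2x$.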
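The two examples can be checked numerically by evaluating difference quotients along $h = 10^{-k}$. This is a rough sketch and not part of the notes; the helper name `diff_quotient` and the evaluation point $x = 3$ are illustrative assumptions, and floating-point quotients only approximate the limits.

```python
def diff_quotient(f, x, h):
    """Difference quotient (f(x + h) - f(x)) / h for h != 0."""
    return (f(x + h) - f(x)) / h

square = lambda x: x * x
hs = [10.0 ** (-k) for k in range(1, 7)]

# |x| at 0: quotients from the right and from the left.
right = [diff_quotient(abs, 0.0, h) for h in hs]
left = [diff_quotient(abs, 0.0, -h) for h in hs]

# x^2 at x = 3: quotients equal 2x + h up to rounding, so they approach 6.
quad = [diff_quotient(square, 3.0, h) for h in hs]
```

The one-sided quotients of $|x|$ stay at $1$ and $-1$ respectively, so no common limit exists, while the quotients of $x^2$ settle near $2x$.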

For the calculation of derivatives it is in practice more convenient to use the following theorem.

4.7.3. Theorem. Let $f, g : \,]a,b[\, \to \mathbb{R}$ be differentiable at $x_0 \in \,]a,b[$ and $c \in \mathbb{R}$. Then $c \cdot f$, $f + g$, $f \cdot g$ and $\frac{f}{g}$ (assuming here that $g(x_0) \neq 0$) are differentiable at $x_0$, and

(i) $(c \cdot f)'(x_0) = c \cdot f'(x_0)$,

(ii) $(f + g)'(x_0) = f'(x_0) + g'(x_0)$,

(iii) $(f \cdot g)'(x_0) = f'(x_0) g(x_0) + f(x_0) \cdot g'(x_0)$, ("product rule")

(iv) $\Big(\dfrac{f}{g}\Big)'(x_0) = \dfrac{f'(x_0) \cdot g(x_0) - f(x_0) g'(x_0)}{g(x_0)^2}$. ("quotient rule")

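Proof. (i) and (ii) follow immediately from the calculation rules for limits of sequences (see 1.2.4 and 4.3.2). The same is true for (iii) and (iv), but we will show these statements.

(iii) We have

\[
\begin{aligned}
\frac{f(x_0 + h) g(x_0 + h) - f(x_0) g(x_0)}{h}
&= \frac{1}{h}\Big(f(x_0 + h)\big(g(x_0 + h) - g(x_0)\big) + \big(f(x_0 + h) - f(x_0)\big) g(x_0)\Big) \\
&= f(x_0 + h) \cdot \frac{g(x_0 + h) - g(x_0)}{h} + \frac{f(x_0 + h) - f(x_0)}{h} \cdot g(x_0),
\end{aligned}
\]

where $f(x_0 + h) \to f(x_0)$, $\frac{g(x_0 + h) - g(x_0)}{h} \to g'(x_0)$ and $\frac{f(x_0 + h) - f(x_0)}{h} \to f'(x_0)$ as $h \to 0$, $h \neq 0$, $x_0 + h \in \,]a,b[$.

(iv) By (iii) it is enough to consider the case $f \equiv 1$. Since $g$ is continuous and $g(x_0) \neq 0$, $\exists$ neighborhood $U_{x_0} \subset \,]a,b[$ of $x_0$ such that $g > 0$ or $g < 0$ on $U_{x_0}$. Now

\[ \frac{1}{h}\Big(\frac{1}{g(x_0 + h)} - \frac{1}{g(x_0)}\Big) = \frac{1}{g(x_0 + h) g(x_0)} \cdot \frac{g(x_0) - g(x_0 + h)}{h} \longrightarrow -\frac{g'(x_0)}{g(x_0)^2}, \]

since $g(x_0 + h) g(x_0) \to g(x_0)^2$ and $\frac{g(x_0) - g(x_0 + h)}{h} \to -g'(x_0)$ as $h \to 0$, $h \neq 0$, $x_0 + h \in U_{x_0} \subset \,]a,b[$. □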

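4.7.4. Theorem (The Chain rule). Let $f : \,]a,b[\, \to \,]c,d[$, $g : \,]c,d[\, \to \mathbb{R}$, $f$ be differentiable at $x_0 \in \,]a,b[$ and $g$ be differentiable at $f(x_0) \in \,]c,d[$. Then $g \circ f$ is differentiable at $x_0$ and

\[ (g \circ f)'(x_0) = g'(f(x_0)) \cdot f'(x_0). \]

Proof. For $z \in \,]c,d[$ let

\[ g^*(z) := \begin{cases} \dfrac{g(z) - g(f(x_0))}{z - f(x_0)}, & \text{if } z \neq f(x_0), \\[2mm] g'(f(x_0)), & \text{if } z = f(x_0). \end{cases} \]

We have

\[ g \text{ differentiable at } f(x_0) \implies \lim_{z \to f(x_0)} g^*(z) = g'(f(x_0)), \tag{$*$} \]

and

\[ \forall z \in \,]c,d[\, : \quad g(z) - g(f(x_0)) = g^*(z)\big(z - f(x_0)\big). \tag{$**$} \]

Now, using in the first step also that $f$ is continuous at $x_0$ (by 4.7.2),

\[
\begin{aligned}
g'(f(x_0)) \cdot f'(x_0)
&\overset{(*)}{=} \lim_{x \to x_0} g^*(f(x)) \cdot \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \\
&= \lim_{x \to x_0} \frac{g^*(f(x))\big(f(x) - f(x_0)\big)}{x - x_0} \\
&\overset{(**)}{=} \lim_{x \to x_0} \frac{g(f(x)) - g(f(x_0))}{x - x_0}
\end{aligned}
\]

$\implies \exists \lim_{x \to x_0} \frac{g(f(x)) - g(f(x_0))}{x - x_0} = (g \circ f)'(x_0)$ and $g'(f(x_0)) \cdot f'(x_0) = (g \circ f)'(x_0)$. □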

Definition. Let $f : \,]a,b[\, \to \mathbb{R}$. We say that $f$ has a local maximum (resp. local minimum) at $x_0 \in \,]a,b[$ if there exists $\delta > 0$ such that

\[ f(x_0) \ge f(x) \quad (\text{resp. } f(x_0) \le f(x)), \quad \forall x \in \,]a,b[\, \cap B(x_0, \delta). \]

If in the previous line we have equality only for $x = x_0$, then $f(x_0)$ is called an isolated (or strict) local maximum (resp. minimum). By an extremum we mean either a maximum or minimum.


4.7.5. Proposition. Let $f : \,]a,b[\, \to \mathbb{R}$ have a local extremum at $x_0 \in \,]a,b[$, and let $f$ be differentiable at $x_0$. Then $f'(x_0) = 0$.

Proof. It is enough to consider the statement in case of a local maximum, otherwise consider $-f$. Then

\[ 0 \ge \lim_{h \to 0+} \frac{f(x_0 + h) - f(x_0)}{h} = f'(x_0) = \lim_{h \to 0-} \frac{f(x_0 + h) - f(x_0)}{h} \ge 0. \]

□

Remark. (i) $f'(x_0) = 0$ is a necessary but not a sufficient condition for a local extremum of a differentiable function at $x_0$. For instance, $f(x) = x^3$ satisfies $f'(0) = 0$ but doesn’t have a local extremum at $0$.

(ii) The Maximum-Minimum Theorem asserts that any continuous $f : [a,b] \to \mathbb{R}$ attains its (absolute) maximum and minimum on $[a,b]$. But if the extremum is attained at the boundary, we do not necessarily have $f'(x) = 0$, as one can see from $f(x) = x$, $x \in [0,1]$.

4.7.6. Theorem (Rolle’s Theorem). Let $f : [a,b] \to \mathbb{R}$ be continuous, differentiable on $]a,b[$ (i.e. differentiable at any $x_0 \in \,]a,b[$), and $f(a) = f(b)$. Then there is a number $c \in \,]a,b[$ such that $f'(c) = 0$ (and $f(c)$ is a global minimum or maximum).

Proof. If $f$ is constant on $[a,b]$ the statement is trivial. If $f$ is not constant, then there is $x_0 \in \,]a,b[$ with $f(x_0) > f(a) = f(b)$ or $f(x_0) < f(a) = f(b)$. Therefore, the (absolute) maximum or minimum, which exists by the Maximum-Minimum Theorem, is attained in some point $c \in \,]a,b[$. By 4.7.5, $f'(c) = 0$. □

4.7.7. Theorem (The Mean Value Theorem (MVT)). Let $f : [a,b] \to \mathbb{R}$ be continuous and let $f$ be differentiable on $]a,b[$. Then there is a point $c \in \,]a,b[$ with

\[ \frac{f(b) - f(a)}{b - a} = f'(c). \]

Proof. Define $F : [a,b] \to \mathbb{R}$ through

\[ F(x) := f(x) - \frac{f(b) - f(a)}{b - a}(x - a). \]

$F$ is continuous on $[a,b]$ and differentiable on $]a,b[$ with $F(a) = f(a) = F(b)$. By Rolle’s Theorem: $\exists c \in \,]a,b[$ with $F'(c) = 0$. Since

\[ F'(c) = f'(c) - \frac{f(b) - f(a)}{b - a}, \]

the assertion follows. □
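The construction in the proof suggests a crude way to locate a mean value point numerically: form the auxiliary function $F$ and search for an interior point where $F$ deviates most from $F(a) = F(b)$. The sketch below is only an illustration; the name `mvt_point`, the grid size, and the example $f(x) = x^3$ on $[0, 2]$ are assumptions, and a grid search only approximates $c$.

```python
def mvt_point(f, a, b, n=20000):
    """Approximate a point c in ]a, b[ with f'(c) = (f(b) - f(a)) / (b - a).

    Builds F(x) = f(x) - slope * (x - a) as in the proof and picks the
    interior grid point where |F(x) - F(a)| is largest, i.e. an
    approximate interior extremum of F (where F' vanishes by 4.7.5).
    """
    slope = (f(b) - f(a)) / (b - a)
    F = lambda x: f(x) - slope * (x - a)
    interior = [a + (b - a) * i / n for i in range(1, n)]
    c = max(interior, key=lambda x: abs(F(x) - F(a)))
    return c, slope

# Illustrative example: f(x) = x^3 on [0, 2], where f'(c) = 3c^2.
c, slope = mvt_point(lambda x: x ** 3, 0.0, 2.0)
```

For this example the slope is $4$, so the returned $c$ should be close to the solution of $3c^2 = 4$.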

4.7.8. Corollary. Let $f : [a,b] \to \mathbb{R}$ be continuous and $f' = 0$ on $]a,b[$. Then $f$ is constant.

Proof. Let $x \in \,]a,b]$. Then by the MVT on $[a,x]$: $\exists c \in \,]a,x[$ with $\frac{f(x) - f(a)}{x - a} = f'(c) = 0$. Hence $f(x) = f(a)$. Consequently, $f \equiv f(a)$. □

4.7.9. Example. Let $f : [a,b] \to \mathbb{R}$ be continuous, differentiable on $]a,b[$ and $|f'(x)| \le M$, $\forall x \in \,]a,b[$. Then: $|f(x) - f(y)| \le M |x - y|$, $\forall x, y \in [a,b]$.

Proof. $x, y \in \,]a,b[$, $x \neq y$ $\overset{\text{MVT}}{\implies}$ $\exists c \in \,]x,y[$ (or $c \in \,]y,x[$) with $f(x) - f(y) = f'(c)(x - y)$. Now take absolute values, use $|f'(c)| \le M$ and then continuity on $[a,b]$. □


4.7.10. Proposition. Let $f : [a,b] \to \mathbb{R}$ be continuous and $f$ be differentiable on $]a,b[$. If $f'(x) \ge 0$ (resp. $f'(x) > 0$, $f'(x) \le 0$, $f'(x) < 0$) for all $x \in \,]a,b[$, then $f$ is increasing (resp. strictly increasing, decreasing, strictly decreasing) on $[a,b]$.

Proof. We only treat the case $f'(x) > 0$, $\forall x \in \,]a,b[$. Suppose $f$ is not strictly increasing. Then: $\exists x_1, x_2 \in [a,b]$ with $x_1 < x_2$ and $f(x_1) \ge f(x_2)$. By the MVT: $\exists c \in \,]x_1, x_2[$ with

\[ f'(c) = \frac{f(x_2) - f(x_1)}{x_2 - x_1} \le 0, \]

a contradiction. □

4.7.11. Theorem (The Inverse Function Theorem). Let $f : [a,b] \to \mathbb{R}$ be continuous and strictly monotone (i.e. either strictly increasing or strictly decreasing). Then its inverse function $f^{-1} : f([a,b]) \to \mathbb{R}$ exists and is continuous. If $f$ is differentiable at $x \in \,]a,b[$ with $f'(x) \neq 0$, then $f^{-1}$ is differentiable at $f(x)$ and

\[ (f^{-1})'(f(x)) = \frac{1}{f'(x)}. \]

Proof. Since $f$ is strictly monotone, $f$ is injective, hence a bijection from $[a,b]$ onto $f([a,b])$. Thus $f^{-1}$ exists. In particular $f([a,b])$ is a closed and bounded interval. Next, we show that $f^{-1}$ is continuous, i.e.

\[ f(x_k) \to f(x) \text{ in } f([a,b]) \implies x_k = f^{-1}(f(x_k)) \to f^{-1}(f(x)) = x \text{ in } [a,b]. \]

Suppose $x_k$ does not converge to $x$. Then: $\exists \varepsilon > 0$ with $|x_k - x| \ge \varepsilon$ for infinitely many $k$. Choose a subsequence such that

\[ |x_{k_l} - x| \ge \varepsilon,\ \forall l \ge 1, \quad \text{and} \quad x_{k_l} \to c \in [a,b] \]

(possible by Bolzano-Weierstrass or compactness of $[a,b]$). By continuity of $f$, $f(x_{k_l}) \to f(c)$. Since also $f(x_{k_l}) \to f(x)$ and $f$ is injective, we must have $x = c$. But $|c - x| = \lim_{l \to \infty} |x_{k_l} - x| \ge \varepsilon$, a contradiction. Thus $f^{-1}$ is continuous and

\[ f(x_k) \to f(x) \iff x_k \to x. \]

Now

\[ \frac{1}{f'(x)} = \lim_{y \to x} \frac{1}{\frac{f(x) - f(y)}{x - y}} = \lim_{y \to x} \frac{x - y}{f(x) - f(y)} = \lim_{y \to x} \frac{f^{-1}(f(x)) - f^{-1}(f(y))}{f(x) - f(y)} = \lim_{f(y) \to f(x)} \frac{f^{-1}(f(x)) - f^{-1}(f(y))}{f(x) - f(y)}. \]

Therefore, the last limit exists and equals $(f^{-1})'(f(x))$. □

Notation: Let $f : \,]a,b[\, \to \mathbb{R}$ be differentiable and $f' : \,]a,b[\, \to \mathbb{R}$ its derivative. If $f'$ is again differentiable (on $]a,b[$), then

$f'' : \,]a,b[\, \to \mathbb{R}$ denotes the second derivative of $f$, i.e. $f'' = (f')'$,

⋮

$f^{(n)} : \,]a,b[\, \to \mathbb{R}$ denotes the $n$-th derivative of $f$, i.e. $f^{(n)} = (f^{(n-1)})'$.

In particular $f^{(0)} := f$.


4.7.12. Proposition. Let $f : \,]a,b[\, \to \mathbb{R}$ be differentiable. Suppose for some $x \in \,]a,b[$ we have

\[ f'(x) = 0, \quad \exists f''(x) \quad \text{and} \quad f''(x) > 0 \ (\text{resp. } f''(x) < 0). \]

Then $f$ has an isolated local minimum (resp. maximum) at $x$.

Remark. 4.7.12 only gives a sufficient but not a necessary condition for an isolated local extremum. Example: $f(x) = x^4$ has an isolated local minimum in $x = 0$, but $f''(0) = 0$.

Proof of 4.7.12. Suppose $\exists f''(x) > 0$ (the case $\exists f''(x) < 0$ works analogously). Since

\[ f''(x) = \lim_{y \to x} \frac{f'(y) - f'(x)}{y - x} > 0, \]

there exists $\delta > 0$ with $B(x, \delta) \subset \,]a,b[$ and

\[ \frac{f'(y) - f'(x)}{y - x} > 0, \quad \forall y \in B(x, \delta),\ y \neq x. \]

Since $f'(x) = 0$, it follows

\[ f'(y) < 0 \ \text{for } x - \delta < y < x, \qquad f'(y) > 0 \ \text{for } x < y < x + \delta. \]

By 4.7.10: $f$ is strictly decreasing in $[x - \delta, x]$ and $f$ is strictly increasing in $[x, x + \delta]$. Thus $x$ is an isolated local minimum of $f$. □

Definition. $f : [a,b] \to \mathbb{R}$ is called differentiable if $f : \,]a,b[\, \to \mathbb{R}$ is differentiable, and the right-hand derivative of $f$ in $a$ and the left-hand derivative of $f$ in $b$ exist, i.e.

\[ \exists \lim_{h \to 0+} \frac{f(a + h) - f(a)}{h} =: f'(a) \quad \text{and} \quad \exists \lim_{h \to 0-} \frac{f(b + h) - f(b)}{h} =: f'(b). \]

We denote by $f' : [a,b] \to \mathbb{R}$ the derivative of $f$ in the above sense. If $f' : [a,b] \to \mathbb{R}$ is again differentiable, we denote its derivative by

$f'' : [a,b] \to \mathbb{R}$ (second derivative)

⋮

$f^{(n)} : [a,b] \to \mathbb{R}$ ($n$-th derivative, $f^{(n)} = \big(f^{(n-1)}\big)'$), $f^{(0)} := f$.

4.7.13. Theorem (Taylor’s Theorem). Let $f : [a,b] \to \mathbb{R}$ be a function, $n \in \mathbb{N}$ fixed. Suppose $f^{(n-1)}$ is continuous on $[a,b]$ and that $f^{(n)}(z)$ exists for any $z \in \,]a,b[$. Let $x_0 \in [a,b]$. Then for any $x \in [a,b]$, there exists $\xi = \xi_x \in \,]x, x_0[$ (or $\xi = \xi_x \in \,]x_0, x[$), with

\[ f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2 + \dots + \frac{f^{(n-1)}(x_0)}{(n-1)!}(x - x_0)^{n-1} + \frac{f^{(n)}(\xi)}{n!}(x - x_0)^n, \]

where the last summand is the "Lagrange form of remainder term".


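Proof. Set

\[ P(y) := \sum_{k=0}^{n-1} \frac{f^{(k)}(x_0)}{k!}(y - x_0)^k, \quad y \in [a,b]. \]

Fix $x_0 \in [a,b]$ and let $x \in [a,b]$, $x \neq x_0$. Define $M = M_x \in \mathbb{R}$ through

\[ f(x) = P(x) + M(x - x_0)^n \]

and let

\[ g(y) := f(y) - P(y) - M(y - x_0)^n, \quad y \in [a,b]. \tag{4.2} \]

We have to show: $M = \frac{f^{(n)}(\xi)}{n!}$ for some $\xi$ between $x$ and $x_0$.

For $y \in \,]a,b[$, we have

\[ g^{(n)}(y) = f^{(n)}(y) - M \cdot n! \]

Thus we are done if $\exists \xi \in \,]x, x_0[$ (or $\xi \in \,]x_0, x[$, if $x_0 < x$) with $g^{(n)}(\xi) = 0$. Since by definition of $P$, $P^{(k)}(x_0) = f^{(k)}(x_0)$ for $k = 0, \dots, n-1$, we have

\[ g^{(k)}(x_0) \overset{(4.2)}{=} f^{(k)}(x_0) - P^{(k)}(x_0) - M \cdot \frac{n!}{(n-k)!}(y - x_0)^{n-k}\Big|_{y = x_0} = 0 \tag{$*$} \]

for $k = 0, \dots, n-1$. Further $g(x) = 0$ by definition of $M$. Since $g(x_0) = 0$ by $(*)$

\[ \overset{\text{MVT}}{\implies} \exists \xi_1 \text{ between } x \text{ and } x_0 \text{ with } g'(\xi_1) = 0. \]

By $(*)$ again $g'(x_0) = 0$

\[ \overset{\text{MVT}}{\implies} \exists \xi_2 \text{ between } x_0 \text{ and } \xi_1 \text{ with } g''(\xi_2) = 0. \]

After $n$ steps, we finally get $g^{(n)}(\xi_n) = 0$ for some $\xi := \xi_n$ between $x_0$ and $\xi_{n-1}$. □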

Remark. For $n = 1$, Taylor’s Theorem reduces to the MVT.

4.7.14. Corollary. Let $f : [a,b] \to \mathbb{R}$ be a function, $n \in \mathbb{N}$ be fixed. Suppose $f^{(n)}$ is continuous on $[a,b]$ and let $x_0 \in [a,b]$. Then for all $x \in [a,b]$:

\[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k + \eta(x)(x - x_0)^n, \]

where $\eta$ is a function with $\lim_{x \to x_0} \eta(x) = 0$.

Proof. By Taylor’s formula with Lagrange remainder term we have

\[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}(x - x_0)^k + \eta(x)(x - x_0)^n, \qquad \eta(x) := \frac{f^{(n)}(\xi_x) - f^{(n)}(x_0)}{n!} \quad (\xi_x \text{ depends on } x). \]

Since $\xi_x$ is between $x$ and $x_0$ we have $\xi_x \to x_0$ as $x \to x_0$. Thus by continuity of $f^{(n)}$

\[ \lim_{x \to x_0} \eta(x) = \lim_{\xi_x \to x_0} \frac{f^{(n)}(\xi_x) - f^{(n)}(x_0)}{n!} = 0. \]

□
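The Lagrange remainder gives a computable error bound whenever $|f^{(n)}|$ can be bounded between $x$ and $x_0$. The following sketch checks this for $f = \exp$, where every derivative is again $\exp$; the function names and the sample values $x_0 = 0$, $x = 0.5$ are illustrative assumptions and not taken from the notes.

```python
import math

def taylor_poly_exp(x, x0, n):
    """Taylor polynomial of exp about x0 with terms k = 0, ..., n-1."""
    return sum(math.exp(x0) * (x - x0) ** k / math.factorial(k) for k in range(n))

def lagrange_bound_exp(x, x0, n):
    """Bound for |f^(n)(xi)| / n! * |x - x0|^n with f = exp.

    Since exp is increasing, |f^(n)(xi)| <= exp(max(x, x0)) for any xi
    between x and x0.
    """
    return math.exp(max(x, x0)) * abs(x - x0) ** n / math.factorial(n)

x0, x = 0.0, 0.5
orders = range(1, 8)
errors = [abs(math.exp(x) - taylor_poly_exp(x, x0, n)) for n in orders]
bounds = [lagrange_bound_exp(x, x0, n) for n in orders]
```

Each entry of `errors` should stay below the corresponding entry of `bounds`, and both shrink rapidly as $n$ grows.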


4.8. The Riemann-Stieltjes Integral

Let $[a,b]$, $a < b$, be a given interval. A partition $P = \{x_0, \dots, x_n\}$ of $[a,b]$ is a finite set of points in $[a,b]$ such that

\[ a = x_0 < \dots < x_n = b. \]

Let $f : [a,b] \to \mathbb{R}$ be a bounded function. Let $\alpha : [a,b] \to \mathbb{R}$ be an increasing function, i.e. $\alpha(x) \le \alpha(y)$, $\forall x, y \in [a,b]$ with $x \le y$. Define the upper and lower sums w.r.t. $f$, $P$ and $\alpha$ by

\[ U(f, P, \alpha) := \sum_{i=1}^{n} \sup\{f(x) \mid x \in [x_{i-1}, x_i]\}\big(\alpha(x_i) - \alpha(x_{i-1})\big), \]

\[ L(f, P, \alpha) := \sum_{i=1}^{n} \inf\{f(x) \mid x \in [x_{i-1}, x_i]\}\big(\alpha(x_i) - \alpha(x_{i-1})\big). \]

If $\alpha(x) = x$, $\forall x \in [a,b]$, we simply write $U(f, P)$ and $L(f, P)$ for the upper and lower sums. Since $f$ is bounded:

\[ m := \inf\{f(x) \mid x \in [a,b]\} \le f(x) \le M := \sup\{f(x) \mid x \in [a,b]\}, \quad \forall x \in [a,b]. \]

Hence for any partition $P$ of $[a,b]$, we have

\[ m\big(\alpha(b) - \alpha(a)\big) \le L(f, P, \alpha) \le U(f, P, \alpha) \le M\big(\alpha(b) - \alpha(a)\big). \]

Therefore, the upper Riemann-Stieltjes-integral (RS-integral for short) of $f$ w.r.t. $\alpha$ over $[a,b]$

\[ \overline{\int_a^b} f(x)\,d\alpha(x) := \inf\{U(f, P, \alpha) \mid P \text{ partition of } [a,b]\}, \]

and the lower RS-integral of $f$ w.r.t. $\alpha$ over $[a,b]$

\[ \underline{\int_a^b} f(x)\,d\alpha(x) := \sup\{L(f, P, \alpha) \mid P \text{ partition of } [a,b]\}, \]

both exist.

4.8.1. Definition. $f$ is said to be RS-integrable w.r.t. $\alpha$ over $[a,b]$ if

\[ \underline{\int_a^b} f(x)\,d\alpha(x) = \overline{\int_a^b} f(x)\,d\alpha(x). \]

In this case, we write $f \in \mathcal{R}([a,b], \alpha)$ and the common value of upper and lower RS-integral (of $f$ w.r.t. $\alpha$) over $[a,b]$ is denoted by

\[ \int_a^b f(x)\,d\alpha(x) \quad \text{or} \quad \int_a^b f\,d\alpha \ \Big(= \underline{\int_a^b} f\,d\alpha = \overline{\int_a^b} f\,d\alpha\Big) \]

and is called the RS-integral of $f$ w.r.t. $\alpha$ over $[a,b]$.

If $\alpha(x) = x$, we write $f \in \mathcal{R}([a,b])$ and

\[ \int_a^b f(x)\,dx \]

is called the Riemann integral (R-integral for short) of $f$ over $[a,b]$.

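4.8.2. Example. Let $f : [0,1] \to \mathbb{R}$, $f(x) := 1$ if $x \in (\mathbb{R} \setminus \mathbb{Q}) \cap [0,1]$ and $f(x) := 0$ if $x \in \mathbb{Q} \cap [0,1]$, $\alpha(x) := x$, $x \in [0,1]$.

Then for any partition $P = \{x_0, \dots, x_n\}$ of $[0,1]$, we have

\[ U(f, P) = \sum_{i=1}^{n} 1 \cdot (x_i - x_{i-1}) = 1, \qquad L(f, P) = \sum_{i=1}^{n} 0 \cdot (x_i - x_{i-1}) = 0. \]

Thus

\[ \underline{\int_0^1} f(x)\,dx = 0 \neq 1 = \overline{\int_0^1} f(x)\,dx \]

and $f$ is not R-integrable.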

4.8.3. Definition. If $P$ and $P'$ are partitions of $[a,b]$ and $P \subset P'$, then $P'$ is called a refinement of $P$. Given two partitions $P$ and $P'$ of $[a,b]$, then $P \cup P'$ is called their common refinement.

4.8.4. Lemma. If $P'$ is a refinement of $P$, then

\[ L(f, P, \alpha) \le L(f, P', \alpha) \le U(f, P', \alpha) \le U(f, P, \alpha). \]

Proof. Let $P = \{x_0, \dots, x_n\}$, $P' \setminus P = \{a_1, \dots, a_m\}$ and $P_i := P \cup \{a_1, \dots, a_i\}$, $i = 1, \dots, m$. Consider $P_1$, then $x_{k-1} < a_1 < x_k$ for some $k \in \{1, \dots, n\}$. Now

\[
\begin{aligned}
L(f, P_1, \alpha) - L(f, P, \alpha)
={}& \inf\{f(x) \mid x \in [x_{k-1}, a_1]\}\big(\alpha(a_1) - \alpha(x_{k-1})\big) + \inf\{f(x) \mid x \in [a_1, x_k]\}\big(\alpha(x_k) - \alpha(a_1)\big) \\
& - \inf\{f(x) \mid x \in [x_{k-1}, x_k]\}\big(\alpha(x_k) - \alpha(x_{k-1})\big) \ge 0
\end{aligned}
\]

("inf over smaller set is bigger"). In the same way $L(f, P_{i+1}, \alpha) - L(f, P_i, \alpha) \ge 0$ for $i = 1, \dots, m-1$, and so

\[ L(f, P, \alpha) \le L(f, P', \alpha). \]

Clearly, $L(f, P', \alpha) \le U(f, P', \alpha)$ and similarly to the proof for the lower sums

\[ U(f, P', \alpha) \le U(f, P, \alpha) \]

("the sup over a smaller set is smaller"). □
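The upper and lower sums and the refinement inequality can be tried out on a small example. The sketch below is illustrative only and makes a simplifying assumption: $f$ is increasing, so that on each subinterval the supremum is the value at the right endpoint and the infimum is the value at the left endpoint. The names `upper_lower_sums`, `P` and `P_fine`, and the choices $f(x) = x$, $\alpha(x) = x^2$ on $[0,1]$, are not from the notes.

```python
def upper_lower_sums(f, alpha, partition):
    """Return (U(f, P, alpha), L(f, P, alpha)) for an increasing f.

    For increasing f, sup f on [x_{i-1}, x_i] is f(x_i) and inf f is
    f(x_{i-1}); for a general bounded f this shortcut is not valid.
    """
    U = L = 0.0
    for left, right in zip(partition, partition[1:]):
        d_alpha = alpha(right) - alpha(left)
        U += f(right) * d_alpha
        L += f(left) * d_alpha
    return U, L

f = lambda x: x            # increasing integrand (assumption of this sketch)
alpha = lambda x: x * x    # increasing integrator on [0, 1]

P = [0.0, 0.5, 1.0]
P_fine = [0.0, 0.25, 0.5, 0.75, 1.0]   # a refinement of P

U, L = upper_lower_sums(f, alpha, P)
U_fine, L_fine = upper_lower_sums(f, alpha, P_fine)
```

Passing from `P` to its refinement `P_fine` raises the lower sum and lowers the upper sum, as 4.8.4 states.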

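4.8.5. Theorem (Upper-Lower Integral inequality). For $f$ and $\alpha$ as above, it holds

\[ \underline{\int_a^b} f(x)\,d\alpha(x) \le \overline{\int_a^b} f(x)\,d\alpha(x). \]

Proof. Let $P_1, P_2$ be arbitrary partitions of $[a,b]$. Then by 4.8.4

\[ L(f, P_1, \alpha) \le L(f, P_1 \cup P_2, \alpha) \le U(f, P_1 \cup P_2, \alpha) \le U(f, P_2, \alpha). \]

Fix $P_2$ and take the sup over all $P_1$ on the l.h.s. Then

\[ \underline{\int_a^b} f(x)\,d\alpha(x) \le U(f, P_2, \alpha). \]

Then take the inf over all $P_2$ on the r.h.s. Consequently

\[ \underline{\int_a^b} f(x)\,d\alpha(x) \le \overline{\int_a^b} f(x)\,d\alpha(x). \]

□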

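4.8.6. Lemma. $f \in \mathcal{R}([a,b], \alpha)$ if and only if

\[ \forall \varepsilon > 0,\ \exists \text{ partition } P \text{ with } U(f, P, \alpha) - L(f, P, \alpha) < \varepsilon. \tag{4.3} \]

Proof. "⟸": For any partition $P$

\[ L(f, P, \alpha) \le \underline{\int_a^b} f(x)\,d\alpha(x) \le \overline{\int_a^b} f(x)\,d\alpha(x) \le U(f, P, \alpha). \]

Thus (4.3) implies

\[ \overline{\int_a^b} f(x)\,d\alpha(x) - \underline{\int_a^b} f(x)\,d\alpha(x) < \varepsilon, \quad \forall \varepsilon > 0, \]

and so $f \in \mathcal{R}([a,b], \alpha)$.

"⟹": If $f \in \mathcal{R}([a,b], \alpha)$ and $\varepsilon > 0$, then there exist partitions $P_1$ and $P_2$ of $[a,b]$ such that

\[ U(f, P_2, \alpha) - \int_a^b f(x)\,d\alpha(x) < \frac{\varepsilon}{2}, \tag{$*$} \]

and

\[ \int_a^b f(x)\,d\alpha(x) - L(f, P_1, \alpha) < \frac{\varepsilon}{2}. \tag{$**$} \]

Let $P := P_1 \cup P_2$. Then

\[ U(f, P, \alpha) \le U(f, P_2, \alpha) \overset{(*)}{<} \frac{\varepsilon}{2} + \int_a^b f(x)\,d\alpha(x) \overset{(**)}{<} L(f, P_1, \alpha) + \varepsilon \le L(f, P, \alpha) + \varepsilon \]

and (4.3) holds for $P = P_1 \cup P_2$. □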

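4.8.7. Theorem. (i) Let $f$ have at most finitely many points of discontinuity and let $\alpha$ be continuous at every point where $f$ is discontinuous. Then $f \in \mathcal{R}([a,b], \alpha)$.

(ii) Let $f$ be monotone (increasing or decreasing) and $\alpha$ be continuous on $[a,b]$. Then $f \in \mathcal{R}([a,b], \alpha)$.

Proof. W.l.o.g. we may assume that $f$ and $\alpha$ are not constant, since otherwise the statement is trivial.

(i) Let $\varepsilon > 0$ and let $a_1, \dots, a_k$ be the points of discontinuity of $f$ (note: there may be none). For each $a_i$ choose $l_i, r_i \in [a,b]$ with

\[ a \le l_1 < r_1 < \dots < l_k < r_k \le b, \qquad a_i \in \mathrm{int}([l_i, r_i]) \ (\text{interior in the metric space } [a,b]), \]

and

\[ \alpha(r_i) - \alpha(l_i) < \frac{\varepsilon}{2k(M - m)}, \quad 1 \le i \le k. \]

(This is possible since $\alpha$ is continuous at each $a_i$ and there are only finitely many $a_i$’s.)

Let

\[ C := \begin{cases} [a,b] \setminus \bigcup_{i=1}^{k} \mathrm{int}([l_i, r_i]), & \text{if } \exists \text{ points of discontinuity of } f, \\ [a,b], & \text{if } f \text{ is continuous.} \end{cases} \]

Then $C$ is compact and $f$ is continuous on $C$. Thus by 4.6.2 $f$ is uniformly continuous on $C$ and so $\exists \delta > 0$ with

\[ x, y \in C \text{ and } |x - y| < \delta \implies |f(x) - f(y)| < \frac{\varepsilon}{2(\alpha(b) - \alpha(a))}. \]

Let $P = \{x_0, \dots, x_n\} \supset \{l_1, r_1, \dots, l_k, r_k\}$ be a partition of $[a,b]$ such that

\[ P \cap \Big(\bigcup_{i=1}^{k} \,]l_i, r_i[\Big) = \emptyset \quad (\text{no partition point in } ]l_i, r_i[,\ i = 1, \dots, k) \]

and $x_i - x_{i-1} < \delta$ whenever $x_{i-1}, x_i \in C$ for some $i \in \{1, \dots, n\}$. Then

\[
\begin{aligned}
U(f, P, \alpha) - L(f, P, \alpha)
&= \sum_{j=1}^{n} \Big(\sup\{f(x) \mid x \in [x_{j-1}, x_j]\} - \inf\{f(x) \mid x \in [x_{j-1}, x_j]\}\Big)\big(\alpha(x_j) - \alpha(x_{j-1})\big) \\
&= \sum_{\substack{j \text{ with } \exists a_i \in [x_{j-1}, x_j] \\ \text{for some } i \in \{1, \dots, k\}}} \dots \ + \sum_{\substack{j \text{ with } \nexists a_i \in [x_{j-1}, x_j]}} \dots \\
&< k(M - m)\frac{\varepsilon}{2k(M - m)} + \frac{\varepsilon}{2(\alpha(b) - \alpha(a))}\big(\alpha(b) - \alpha(a)\big) = \varepsilon.
\end{aligned}
\]

Thus $f \in \mathcal{R}([a,b], \alpha)$ by 4.8.6.

(ii) We only show the statement for increasing $f$. Let $\varepsilon > 0$ be given. Let $n \in \mathbb{N}$, $x_0 = a$, $x_n = b$ and for each $i = 1, \dots, n-1$ let $x_i \in [a,b]$ with $x_n > x_i > x_{i-1}$ be given such that

\[ \alpha(x_i) = \alpha(x_{i-1}) + \frac{\alpha(b) - \alpha(a)}{n} \in [\alpha(a), \alpha(b)]. \]

(Note that this is possible by the IVT and since $\alpha$ is increasing.)

Then for $P = \{x_0, \dots, x_n\}$ as above

\[ U(f, P, \alpha) - L(f, P, \alpha) = \sum_{i=1}^{n} \big(f(x_i) - f(x_{i-1})\big)\big(\alpha(x_i) - \alpha(x_{i-1})\big) \le \frac{\alpha(b) - \alpha(a)}{n} \sum_{i=1}^{n} \big(f(x_i) - f(x_{i-1})\big) = \frac{\alpha(b) - \alpha(a)}{n}\big(f(b) - f(a)\big) < \varepsilon \quad \text{for } n \text{ big.} \]

□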
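The partition used in part (ii) can be built explicitly when $\alpha$ has a known inverse. The sketch below is only an illustration under assumptions not in the notes: $\alpha(x) = x^2$ on $[0,1]$ (so the inverse is `math.sqrt`), $f(x) = x^3$ increasing, and the helper name `gap_with_equal_alpha_steps`. It computes $U - L$ on the partition with equal $\alpha$-increments, where the sum telescopes.

```python
import math

def gap_with_equal_alpha_steps(f, alpha_inv, alpha_a, alpha_b, n):
    """U(f, P, alpha) - L(f, P, alpha) for increasing f on the partition
    with alpha(x_i) = alpha(a) + i * (alpha(b) - alpha(a)) / n.

    alpha_inv is assumed to invert alpha on [alpha(a), alpha(b)].
    """
    step = (alpha_b - alpha_a) / n
    xs = [alpha_inv(alpha_a + i * step) for i in range(n + 1)]
    # For increasing f: sup - inf on [x_{i-1}, x_i] equals f(x_i) - f(x_{i-1}).
    return sum((f(r) - f(l)) * step for l, r in zip(xs, xs[1:]))

f = lambda x: x ** 3
gaps = {n: gap_with_equal_alpha_steps(f, math.sqrt, 0.0, 1.0, n)
        for n in (10, 100, 1000)}
```

In this example $(\alpha(b) - \alpha(a))(f(b) - f(a)) = 1$, so the gap should be $\frac{1}{n}$ up to rounding, matching the estimate in the proof.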

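4.8.8. Theorem. Let $f \in \mathcal{R}([a,b], \alpha)$, $\varphi : [m, M] \to \mathbb{R}$ be continuous. Then $\varphi \circ f \in \mathcal{R}([a,b], \alpha)$.

Proof. Let $\varepsilon > 0$. Since (by 4.6.2) $\varphi$ is uniformly continuous on $[m, M]$: $\exists \delta > 0$, $\delta < \varepsilon$, with

\[ |\varphi(x) - \varphi(y)| < \varepsilon, \quad \forall x, y \in [m, M] \text{ with } |x - y| < \delta. \tag{4.4} \]

Since $f \in \mathcal{R}([a,b], \alpha)$: $\exists$ partition $P = \{x_0, \dots, x_n\}$ of $[a,b]$ with

\[ U(f, P, \alpha) - L(f, P, \alpha) < \delta^2. \tag{4.5} \]

For $i \in \{1, \dots, n\}$ let

\[ M_i := \sup\{f(x) \mid x \in [x_{i-1}, x_i]\}, \qquad M_i^* := \sup\{\varphi(f(x)) \mid x \in [x_{i-1}, x_i]\}, \]

\[ m_i := \inf\{f(x) \mid x \in [x_{i-1}, x_i]\}, \qquad m_i^* := \inf\{\varphi(f(x)) \mid x \in [x_{i-1}, x_i]\}, \]

and

\[ M^* := \sup\{|\varphi(f(x))| \mid x \in [a,b]\}. \]

Write

\[ \{1, \dots, n\} = A \cup B, \qquad A := \{i \mid M_i - m_i < \delta\}, \quad B := \{i \mid M_i - m_i \ge \delta\}. \]

We have

\[ \sum_{i \in B} \big(\alpha(x_i) - \alpha(x_{i-1})\big) \le \sum_{i \in B} \frac{M_i - m_i}{\delta}\big(\alpha(x_i) - \alpha(x_{i-1})\big) \overset{(4.5)}{<} \delta. \tag{4.6} \]

Consequently

\[
\begin{aligned}
U(\varphi \circ f, P, \alpha) - L(\varphi \circ f, P, \alpha)
&= \sum_{i \in A} \big(M_i^* - m_i^*\big)\big(\alpha(x_i) - \alpha(x_{i-1})\big) + \sum_{i \in B} \big(M_i^* - m_i^*\big)\big(\alpha(x_i) - \alpha(x_{i-1})\big) \\
&\overset{(4.4)}{\le} \sum_{i \in A} \varepsilon\big(\alpha(x_i) - \alpha(x_{i-1})\big) + \sum_{i \in B} 2M^*\big(\alpha(x_i) - \alpha(x_{i-1})\big) \\
&\overset{(4.6)}{<} \varepsilon\big(\alpha(b) - \alpha(a)\big) + 2M^*\delta < \varepsilon\big(\alpha(b) - \alpha(a) + 2M^*\big)
\end{aligned}
\]

and the assertion follows by 4.8.6. □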

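4.8.9. Theorem. Let $f, g \in \mathcal{R}([a,b], \alpha)$ and $c \in \mathbb{R}$. Then:

(a) $f + g \in \mathcal{R}([a,b], \alpha)$ and $\int_a^b (f + g)\,d\alpha = \int_a^b f\,d\alpha + \int_a^b g\,d\alpha$.

(b) $cf \in \mathcal{R}([a,b], \alpha)$ and $\int_a^b (cf)\,d\alpha = c \int_a^b f\,d\alpha$ (in case $c > 0$, we also have $f \in \mathcal{R}([a,b], c \cdot \alpha)$ and $\int_a^b f\,d(c\alpha) = c \int_a^b f\,d\alpha$).

(c) If $c' \in \,]a,b[$, then $f \in \mathcal{R}([a,c'], \alpha) \cap \mathcal{R}([c',b], \alpha)$ and $\int_a^b f\,d\alpha = \int_a^{c'} f\,d\alpha + \int_{c'}^b f\,d\alpha$.

(d) If $f(x) \le g(x)$, $\forall x \in [a,b]$, then $\int_a^b f\,d\alpha \le \int_a^b g\,d\alpha$.

(e) If $|f(x)| \le L$, $\forall x \in [a,b]$, then $\big|\int_a^b f\,d\alpha\big| \le L(\alpha(b) - \alpha(a))$.

(f) If $f \in \mathcal{R}([a,b], \alpha_1) \cap \mathcal{R}([a,b], \alpha_2)$, then $f \in \mathcal{R}([a,b], \alpha_1 + \alpha_2)$ and $\int_a^b f\,d(\alpha_1 + \alpha_2) = \int_a^b f\,d\alpha_1 + \int_a^b f\,d\alpha_2$.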

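Proof. (a) For any given partition $P$, it holds

\[ L(f + g, P, \alpha) \ge L(f, P, \alpha) + L(g, P, \alpha) \tag{4.7} \]

and

\[ U(f + g, P, \alpha) \le U(f, P, \alpha) + U(g, P, \alpha). \tag{4.8} \]

Since $f, g \in \mathcal{R}([a,b], \alpha)$: for $\varepsilon > 0$, $\exists$ partitions $P_1, P_2$ of $[a,b]$ with

\[ U(f, P_1, \alpha) \le \int_a^b f\,d\alpha + \frac{\varepsilon}{2} \quad \text{and} \quad U(g, P_2, \alpha) \le \int_a^b g\,d\alpha + \frac{\varepsilon}{2}. \tag{4.9} \]

Then

\[
\begin{aligned}
\overline{\int_a^b} (f + g)\,d\alpha &\le U(f + g, P_1 \cup P_2, \alpha) \\
&\overset{(4.8)}{\le} U(f, P_1 \cup P_2, \alpha) + U(g, P_1 \cup P_2, \alpha) \\
&\overset{4.8.4}{\le} U(f, P_1, \alpha) + U(g, P_2, \alpha) \overset{(4.9)}{\le} \int_a^b f\,d\alpha + \int_a^b g\,d\alpha + \varepsilon.
\end{aligned}
\]

Similarly: $\underline{\int_a^b} (f + g)\,d\alpha \ge \int_a^b f\,d\alpha + \int_a^b g\,d\alpha - \varepsilon$. Letting $\varepsilon \to 0$, we get

\[ \int_a^b f\,d\alpha + \int_a^b g\,d\alpha \le \underline{\int_a^b} (f + g)\,d\alpha \le \overline{\int_a^b} (f + g)\,d\alpha \le \int_a^b f\,d\alpha + \int_a^b g\,d\alpha \]

\[ \implies \exists \int_a^b (f + g)\,d\alpha \quad \text{and} \quad \int_a^b (f + g)\,d\alpha = \int_a^b f\,d\alpha + \int_a^b g\,d\alpha. \]

(c) Exercise.

The proofs of (b), (d), (e), (f) are very similar to the proof of (a), so we omit them. □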

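4.8.10. Theorem. Let $f, g \in \mathcal{R}([a,b], \alpha)$. Then:

(a) $f \cdot g \in \mathcal{R}([a,b], \alpha)$.

(b) $|f| \in \mathcal{R}([a,b], \alpha)$ and $\big|\int_a^b f\,d\alpha\big| \le \int_a^b |f|\,d\alpha$.

Proof. (a) Let $\varphi(t) = t^2$. Then by 4.8.8, we have $\varphi \circ h = h^2 \in \mathcal{R}([a,b], \alpha)$ whenever $h \in \mathcal{R}([a,b], \alpha)$. Hence by 4.8.9(a), (b)

\[ f \cdot g = \frac{1}{2}\big((f + g)^2 - f^2 - g^2\big) \in \mathcal{R}([a,b], \alpha). \]

(b) Let $\varphi(t) = |t|$. Then by 4.8.8, we have $\varphi \circ f = |f| \in \mathcal{R}([a,b], \alpha)$. For $c = 1$ or $c = -1$, we have

\[ \Big|\int_a^b f\,d\alpha\Big| = c \int_a^b f\,d\alpha \overset{4.8.9(b)}{=} \int_a^b cf\,d\alpha \overset{4.8.9(d)}{\le} \int_a^b |f|\,d\alpha, \]

since $cf \le |f|$. □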

4.8.11 (The MVT of RS-integration). Let $f \in \mathcal{R}([a,b], \alpha)$ be continuous. Then there is some $\xi \in [a,b]$ with

\[ \int_a^b f\,d\alpha = f(\xi)\big(\alpha(b) - \alpha(a)\big). \]

Proof. For any partition $P$

\[ m\big(\alpha(b) - \alpha(a)\big) \le L(f, P, \alpha) \le U(f, P, \alpha) \le M\big(\alpha(b) - \alpha(a)\big), \]

hence

\[ \int_a^b f\,d\alpha \in \big[m(\alpha(b) - \alpha(a)),\ M(\alpha(b) - \alpha(a))\big], \]

and so $\exists c \in [m, M]$ with

\[ \int_a^b f\,d\alpha = c\big(\alpha(b) - \alpha(a)\big). \]

Since $f$ is continuous, we get by the Maximum-Minimum Theorem: $\exists x_0, x_1 \in [a,b]$ with

\[ f(x_0) = m = \inf\{f(x) \mid x \in [a,b]\}, \qquad f(x_1) = M = \sup\{f(x) \mid x \in [a,b]\}. \]

If $c = m$ or $c = M$ then we are done. Thus we may assume $c \in \,]m, M[\, = \,]f(x_0), f(x_1)[$. Then by the IVT 4.5.1: $\exists \xi \in [a,b]$ with $f(\xi) = c$. □

Integration and differentiation:

For b ≤ a, we define

\[
\int_a^b f\,d\alpha := -\int_b^a f\,d\alpha,
\]

so that in particular

\[
\int_a^a f\,d\alpha = -\int_a^a f\,d\alpha = 0.
\]

4.8.12. Definition. Let I ⊂ R be a bounded interval that contains at least two points. Then a differentiable function F : I → R is called an antiderivative or primitive of f : I → R, if F′ = f.

4.8.13. Remark. Let F be an antiderivative of f and let G : I → R be another differentiable function. Then

G is an antiderivative of f ⇐⇒ G − F = c for some c ∈ R
(“antiderivatives are unique up to constants”).

Proof. “=⇒”: Since F′ = f, G′ = f implies G′ = F′, hence (G − F)′ = 0, and so G − F is constant by 4.7.8.

“⇐=”: G − F = c for some c ∈ R =⇒ G′ = (F + c)′ = F′ = f. □


4.8.14. Proposition. Let I ⊂ R be a bounded interval that contains at least two points. Let f : I → R be continuous and a ∈ I. Define

\[
F(x) := \int_a^x f(y)\,dy, \quad x \in I.
\]

Then F is differentiable with F′ = f.

Proof. By 4.8.7(i), F is well-defined. For x ∈ I and h ≠ 0 such that x + h ∈ I, we have

\[
\frac{F(x+h) - F(x)}{h}
= \frac{1}{h}\Big( \int_a^{x+h} f(y)\,dy - \int_a^x f(y)\,dy \Big)
= \frac{1}{h} \int_x^{x+h} f(y)\,dy,
\]

where the last equality uses 4.8.9(c) and the definition of $\int_a^b$ for b ≤ a. By the MVT of integration (with α(x) = x): ∃ξ_h ∈ [x, x + h] (or ξ_h ∈ [x + h, x] if h < 0) with

\[
\int_x^{x+h} f(y)\,dy = f(\xi_h) \cdot h.
\]

Since ξ_h → x as h → 0 and f is continuous, we get

\[
\frac{F(x+h) - F(x)}{h} = f(\xi_h) \xrightarrow[h \to 0]{} f(x).
\]

Thus F is differentiable at x and F′(x) = f(x). □

4.8.15. Theorem (Fundamental Theorem of Calculus (FTC)). Let I ⊂ R be a bounded interval that contains at least two points. Let f : I → R be continuous. Then f has an antiderivative F, such that for any two points a, b ∈ I, we have

\[
\int_a^b f(x)\,dx = F(b) - F(a).
\]

If G is another antiderivative of f, then we also have

\[
\int_a^b f(x)\,dx = G(b) - G(a).
\]

Proof. The assertions directly follow from 4.8.13 and 4.8.14. □
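The statement of 4.8.14 and the FTC can be checked numerically. The following is a minimal sketch, not part of the notes: it assumes f = cos and a = 0, approximates the integral by a midpoint Riemann sum (the helper name `riemann_midpoint` and the step counts are illustrative choices), and compares a difference quotient of F with f.

```python
import math

def riemann_midpoint(f, a, b, n=2000):
    """Midpoint Riemann sum approximating the integral of f from a to b.

    For b < a the sign convention int_a^b := -int_b^a is used.
    """
    if a == b:
        return 0.0
    if b < a:
        return -riemann_midpoint(f, b, a, n)
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def F(x):
    """Approximation of F(x) = int_0^x cos(y) dy (assumed example)."""
    return riemann_midpoint(math.cos, 0.0, x)

x, h = 0.7, 1e-4
quotient = (F(x + h) - F(x)) / h  # should be close to f(x) = cos(0.7)

print(quotient, math.cos(x))
print(F(1.0), math.sin(1.0))      # F(b) - F(a) with antiderivative sin
```

The difference quotient only approximates F′(x); its accuracy is limited by both h and the quadrature error, so the comparison is a plausibility check rather than a proof.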

RS-integral and R-integral are related through the following:

4.8.16. Theorem. Assume α is differentiable on [a, b] with α′ ∈ R([a, b]). Then

f ∈ R([a, b], α) ⇐⇒ fα′ ∈ R([a, b])

and in any of these cases

\[
\int_a^b f\,d\alpha = \int_a^b f(x)\,\alpha'(x)\,dx.
\]

Proof. Let ε > 0. By 4.8.6 there is a partition P = {x0, . . . , xn} of [a, b] such that

U(α′, P) − L(α′, P) < ε. (4.10)

By the MVT 4.7.7, we can find $t_i \in\ ]x_{i-1}, x_i[$ with

\[
\alpha(x_i) - \alpha(x_{i-1}) = \alpha'(t_i)(x_i - x_{i-1}), \quad 1 \le i \le n. \tag{4.11}
\]

For arbitrary $s_i \in [x_{i-1}, x_i]$, we then get

\[
\sum_{i=1}^n |\alpha'(s_i) - \alpha'(t_i)|\,(x_i - x_{i-1})
\le \sum_{i=1}^n \Big( \sup\{\alpha'(x) \mid x \in [x_{i-1}, x_i]\} - \inf\{\alpha'(x) \mid x \in [x_{i-1}, x_i]\} \Big)(x_i - x_{i-1})
= U(\alpha', P) - L(\alpha', P) \overset{(4.10)}{<} \varepsilon. \tag{4.12}
\]

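Thus, with M > 0 such that |f(x)| ≤ M for all x ∈ [a, b],

\[
\Big| \sum_{i=1}^n f(s_i)\big(\alpha(x_i) - \alpha(x_{i-1})\big) - \sum_{i=1}^n f(s_i)\alpha'(s_i)(x_i - x_{i-1}) \Big|
\overset{(4.11)}{=} \Big| \sum_{i=1}^n f(s_i)\big(\alpha'(t_i) - \alpha'(s_i)\big)(x_i - x_{i-1}) \Big|
\overset{(4.12)}{<} M\varepsilon. \tag{4.13}
\]

In particular

\[
\begin{aligned}
\sum_{i=1}^n f(s_i)\big(\alpha(x_i) - \alpha(x_{i-1})\big)
&\overset{(4.13)}{<} \underbrace{\sum_{i=1}^n f(s_i)\alpha'(s_i)(x_i - x_{i-1})}_{\le U(f\alpha', P)} + M\varepsilon, \\
\sum_{i=1}^n f(s_i)\alpha'(s_i)(x_i - x_{i-1})
&\overset{(4.13)}{<} \underbrace{\sum_{i=1}^n f(s_i)\big(\alpha(x_i) - \alpha(x_{i-1})\big)}_{\le U(f, P, \alpha)} + M\varepsilon.
\end{aligned} \tag{4.14}
\]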

Since si ∈ [xi−1, xi] was arbitrary, (4.14) implies

U(f, P, α) ≤ U(fα′, P ) +Mε and U(fα′, P ) ≤ U(f, P, α) +Mε,

and also the same inequalities for the lower sums. Hence

|U(f, P, α)− U(fα′, P )| ≤Mε, (4.15)

and

|L(f, P, α)− L(fα′, P )| ≤Mε. (4.16)

Now assume f ∈ R([a, b], α). Then since (4.15) and (4.16) hold for any refinement of P (since (4.10), (4.11) hold for any refinement of P), we may assume that

U(f, P, α) − L(f, P, α) < Mε (4.17)

(otherwise consider a common refinement). Using (4.15)-(4.17), we get

U(fα′, P) − L(fα′, P) < 3Mε

and so fα′ ∈ R([a, b]) by 4.8.6 and by (4.15), (4.16) (holding for any refinement of P)

\[
\int_a^b f\,d\alpha = \int_a^b f(x)\,\alpha'(x)\,dx. \tag{4.18}
\]

If we assume fα′ ∈ R([a, b]) we can conclude similarly that f ∈ R([a, b], α) and that (4.18) holds. □
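The identity (4.18) can be illustrated numerically. The sketch below is only an illustration under assumed data: f = cos and α(x) = x³ on [0, 1] (so α′(x) = 3x²), a uniform partition with midpoint tags, and a hypothetical helper `tagged_sum`. It compares a sum for ∫ f dα with a sum for ∫ fα′ dx.

```python
import math

def tagged_sum(f, alpha, points, tags):
    """Sum of f(c_i) * (alpha(x_i) - alpha(x_{i-1})) over a tagged partition."""
    return sum(
        f(c) * (alpha(points[i]) - alpha(points[i - 1]))
        for i, c in enumerate(tags, start=1)
    )

def alpha(x):
    return x ** 3          # assumed integrator, differentiable on [0, 1]

def alpha_prime(x):
    return 3 * x ** 2

def f_alpha_prime(x):
    return math.cos(x) * alpha_prime(x)

def identity(x):
    return x

n = 1000
points = [i / n for i in range(n + 1)]
tags = [(points[i - 1] + points[i]) / 2 for i in range(1, n + 1)]

lhs = tagged_sum(math.cos, alpha, points, tags)        # approximates int f d(alpha)
rhs = tagged_sum(f_alpha_prime, identity, points, tags)  # approximates int f*alpha' dx
exact = 3 * (2 * math.cos(1) - math.sin(1))            # closed form of int_0^1 3x^2 cos x dx

print(lhs, rhs, exact)
```

Both sums agree with each other and with the closed form up to the discretization error, which is what 4.8.16 predicts in the limit of fine partitions.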


4.8.17. Theorem (Integration by parts). Let f, g : [a, b] −→ R be differentiable with continuous derivatives f′ and g′. Then

\[
\int_a^b f'(x) g(x)\,dx = f(b)g(b) - f(a)g(a) - \int_a^b f(x) g'(x)\,dx.
\]

Proof. For F := f · g, we have

F′(x) = f′(x)g(x) + f(x)g′(x) (continuous).

Furthermore, f′ · g ∈ R([a, b]) and f · g′ ∈ R([a, b]) as products of continuous functions. Then by the FTC and the linearity of the integral

\[
F(b) - F(a) = \int_a^b f'(x) g(x)\,dx + \int_a^b f(x) g'(x)\,dx.
\]

□
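As a quick worked instance of the integration by parts formula (an illustration with functions chosen here, not taken from the notes), take $f(x) = e^x$ and $g(x) = x$ on $[0, 1]$:

```latex
\[
\int_0^1 e^x \, x \, dx
= e^1 \cdot 1 - e^0 \cdot 0 - \int_0^1 e^x \cdot 1 \, dx
= e - (e - 1)
= 1 .
\]
```

Here $f' = e^x$ and $g' = 1$ are continuous, so the hypotheses of the theorem are satisfied.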

4.8.18. Theorem (Taylor’s Theorem with integral form of the remainder). Let f : [a, b] → R be a function and n ∈ N be fixed. Suppose that $f^{(n)}$ exists and is continuous on [a, b]. Let x0 ∈ [a, b]. Then for all x ∈ [a, b]

\[
f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k + R_n(x), \tag{4.19}
\]

where

\[
R_n(x) := \frac{1}{(n-1)!} \int_{x_0}^x (x - t)^{n-1} f^{(n)}(t)\,dt.
\]

Proof. For n = 1, (4.19) reduces to

\[
f(x) = f(x_0) + \int_{x_0}^x f'(t)\,dt,
\]

which holds by the FTC.

Suppose (4.19) holds for n − 1 (where n ≥ 2). Then

\[
\begin{aligned}
R_{n-1}(x) &= \frac{1}{(n-2)!} \int_{x_0}^x (x - t)^{n-2} f^{(n-1)}(t)\,dt \\
&= -\int_{x_0}^x f^{(n-1)}(t)\, \frac{d}{dt}\Big( \frac{(x - t)^{n-1}}{(n-1)!} \Big)\,dt \\
&\overset{4.8.17}{=} \int_{x_0}^x f^{(n)}(t)\, \frac{(x - t)^{n-1}}{(n-1)!}\,dt - f^{(n-1)}(x)\,\frac{0^{n-1}}{(n-1)!} + f^{(n-1)}(x_0)\,\frac{(x - x_0)^{n-1}}{(n-1)!} \\
&= R_n(x) + f^{(n-1)}(x_0)\,\frac{(x - x_0)^{n-1}}{(n-1)!}.
\end{aligned}
\]

Thus (4.19) also holds for n. □

4.8.19. Theorem (Change of variable). Let I ⊂ R be an interval that contains more than two points and f : I → R be a continuous function. Let ϕ : [a, b] → I be continuously differentiable. Then

\[
\int_a^b f(\varphi(t))\,\varphi'(t)\,dt = \int_{\varphi(a)}^{\varphi(b)} f(t)\,dt.
\]

Proof. Let $F(x) := \int_{\varphi(a)}^x f(t)\,dt$, x ∈ I. Then F is differentiable with F′ = f and for t ∈ [a, b]

(F ◦ ϕ)′(t) = F′(ϕ(t))ϕ′(t) = f(ϕ(t))ϕ′(t) (continuous in t).

Thus

\[
\int_a^b f(\varphi(t))\,\varphi'(t)\,dt = (F \circ \varphi)(b) - (F \circ \varphi)(a) = \int_{\varphi(a)}^{\varphi(b)} f(t)\,dt.
\]

□
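Two small worked instances of the change of variable formula (the functions are chosen here for illustration). With $\varphi(t) = t^2$ on $[0, 1]$ and $f = \cos$:

```latex
\[
\int_0^1 \cos(t^2)\, 2t \, dt
= \int_{\varphi(0)}^{\varphi(1)} \cos(u)\, du
= \int_0^1 \cos(u)\, du
= \sin 1 .
\]
```

Note that the theorem does not require $\varphi$ to be monotone. For instance, with $\varphi(t) = \sin t$ on $[0, \pi]$ and any continuous $f$:

```latex
\[
\int_0^{\pi} f(\sin t) \cos t \, dt
= \int_{\sin 0}^{\sin \pi} f(u)\, du
= \int_0^0 f(u)\, du
= 0 .
\]
```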

The RS-integral and infinite series:

4.8.20. Definition. The unit step function U is defined by

\[
U(x) := \begin{cases} 0, & \text{if } x \le 0, \\ 1, & \text{if } x > 0. \end{cases}
\]

4.8.21. Theorem. Let s ∈ ]a, b[ and f : [a, b] → R be bounded and continuous at s, α(x) := U(x − s), x ∈ [a, b]. Then f ∈ R([a, b], α) and

\[
\int_a^b f\,d\alpha = f(s).
\]

Proof. Consider a partition P = {x0, . . . , x3} of [a, b] with x1 = s. Then

\[
U(f, P, \alpha) = \sum_{i=1}^3 \sup\{f(x) \mid x \in [x_{i-1}, x_i]\}\big(\alpha(x_i) - \alpha(x_{i-1})\big) = \sup\{f(x) \mid x \in [s, x_2]\} =: M_2,
\]

and

\[
L(f, P, \alpha) = \inf\{f(x) \mid x \in [s, x_2]\} =: m_2.
\]

Since f is continuous at s, we have $M_2 \to f(s)$ and $m_2 \to f(s)$ as $x_2 \to s$. Hence f ∈ R([a, b], α) and

\[
\int_a^b f\,d\alpha = f(s).
\]

□

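4.8.22. Theorem. Let $c_n \ge 0$, n ≥ 1, and suppose $\sum_{n \ge 1} c_n$ converges. Let $(s_n)_{n \ge 1}$ be a sequence of distinct points in ]a, b[ (i.e. $s_n \ne s_m$ for n, m ≥ 1, n ≠ m) and

\[
\alpha(x) := \sum_{n \ge 1} c_n U(x - s_n), \quad x \in [a, b].
\]

Let f : [a, b] → R be continuous. Then f ∈ R([a, b], α) and

\[
\int_a^b f\,d\alpha = \sum_{n \ge 1} c_n f(s_n).
\]

Proof. By the comparison test, α is real-valued for any x ∈ [a, b]. Obviously, α is increasing (since U is increasing), α(a) = 0, $\alpha(b) = \sum_{n \ge 1} c_n$. Thus f ∈ R([a, b], α).

Let ε > 0. Choose N0 ∈ N, such that

\[
\sum_{n=N+1}^{\infty} c_n < \varepsilon, \quad \forall N \ge N_0.
\]

For N ≥ N0 set

\[
\alpha_1(x) = \sum_{n=1}^{N} c_n U(x - s_n), \qquad \alpha_2(x) = \sum_{n=N+1}^{\infty} c_n U(x - s_n), \quad x \in [a, b].
\]

Then α1, α2 are increasing, hence f ∈ R([a, b], α1) ∩ R([a, b], α2). Thus using 4.8.21 and 4.8.9(b),(f), we get

\[
\int_a^b f\,d\alpha_1 = \sum_{n=1}^{N} c_n f(s_n),
\]

and (since $\alpha_2(b) - \alpha_2(a) = \sum_{n=N+1}^{\infty} c_n - 0 < \varepsilon$)

\[
\Big| \int_a^b f\,d\alpha_2 \Big| \overset{4.8.9(e)}{\le} \underbrace{\sup_{x \in [a,b]} |f(x)|}_{=M} \cdot\ \varepsilon.
\]

Then by 4.8.9(f)

\[
\Big| \int_a^b f\,d\alpha - \sum_{n=1}^{N} c_n f(s_n) \Big| = \Big| \int_a^b f\,d\alpha_2 \Big| \le M \cdot \varepsilon, \quad \forall N \ge N_0.
\]

Therefore

\[
\sum_{n=1}^{N} c_n f(s_n) \xrightarrow[N \to \infty]{} \int_a^b f\,d\alpha.
\]

□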
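The formula of 4.8.22 can be observed numerically. The following sketch uses assumed data that do not appear in the notes: c_n = 2^{-n}, s_n = 1/(n+1) in ]0, 1[, f = cos, and the series truncated after N = 30 terms (so it illustrates the finite sum α1 from the proof). The sum over a fine uniform partition with midpoint tags is compared with the weighted sum of the values f(s_n).

```python
import math

def unit_step(x):
    """Unit step function U: 0 for x <= 0, 1 for x > 0."""
    return 1.0 if x > 0 else 0.0

N = 30                                          # truncation level (assumption)
c = [2.0 ** -n for n in range(1, N + 1)]        # weights c_n >= 0, summable
s = [1.0 / (n + 1) for n in range(1, N + 1)]    # distinct jump points in ]0, 1[

def alpha(x):
    """Truncated step integrator alpha_1(x) = sum_{n<=N} c_n U(x - s_n)."""
    return sum(cn * unit_step(x - sn) for cn, sn in zip(c, s))

m = 2000
points = [i / m for i in range(m + 1)]
alpha_values = [alpha(x) for x in points]

# Sum with midpoint tags: only subintervals containing a jump contribute.
rs = sum(
    math.cos((points[i - 1] + points[i]) / 2) * (alpha_values[i] - alpha_values[i - 1])
    for i in range(1, m + 1)
)
series = sum(cn * math.cos(sn) for cn, sn in zip(c, s))

print(rs, series)
```

Each jump of α contributes its weight c_n times a value of f taken within one mesh width of s_n, so the two numbers differ by at most a small multiple of the mesh.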

RS-sums:

Let P = {x0, . . . , xn} be a partition of [a, b] and let C = {c1, . . . , cn} ⊂ [a, b] satisfy $c_i \in [x_{i-1}, x_i]$, 1 ≤ i ≤ n. Any such pair (P, C) is called a tagged partition of [a, b].

For a bounded function f : [a, b] → R,

\[
S(f, (P, C)) := \sum_{i=1}^n f(c_i)(x_i - x_{i-1})
\]

is called the Riemann sum (R-sum for short) of f w.r.t. the tagged partition (P, C). Accordingly, if α : [a, b] → R is increasing,

\[
S(f, (P, C), \alpha) := \sum_{i=1}^n f(c_i)\big(\alpha(x_i) - \alpha(x_{i-1})\big)
\]

is called the Riemann-Stieltjes sum (RS-sum for short) of f w.r.t. the tagged partition (P, C) and α.
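The definitions of R-sums and RS-sums translate directly into code. The sketch below is one possible rendering; the function names `rs_sum`, `uniform_partition` and `midpoints` are illustrative and not part of the notes. An R-sum is the special case α(x) = x.

```python
def rs_sum(f, alpha, points, tags):
    """RS-sum S(f, (P, C), alpha) = sum_i f(c_i) * (alpha(x_i) - alpha(x_{i-1})).

    points: partition x_0 < x_1 < ... < x_n of [a, b]
    tags:   c_1, ..., c_n with x_{i-1} <= c_i <= x_i
    """
    if len(tags) != len(points) - 1:
        raise ValueError("need exactly one tag per subinterval")
    total = 0.0
    for i, c in enumerate(tags, start=1):
        if not points[i - 1] <= c <= points[i]:
            raise ValueError("tag c_i must lie in [x_{i-1}, x_i]")
        total += f(c) * (alpha(points[i]) - alpha(points[i - 1]))
    return total

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def midpoints(points):
    """Tags chosen as midpoints of the subintervals."""
    return [(points[i - 1] + points[i]) / 2 for i in range(1, len(points))]

pts = uniform_partition(0.0, 1.0, 100)
tags = midpoints(pts)
print(rs_sum(lambda x: x, lambda x: x, pts, tags))        # R-sum of f(x) = x on [0, 1]
print(rs_sum(lambda x: 1.0, lambda x: x * x, pts, tags))  # telescopes to alpha(1) - alpha(0)
```

For f ≡ 1 the RS-sum telescopes to α(b) − α(a) for every tagged partition, which gives a convenient sanity check.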

4.8.23. Definition. Let f : [a, b] → R be a bounded function. f is called integrable in the sense of Riemann’s original definition, if there exists R ∈ R with the following property: ∀ε > 0, ∃δ > 0, such that

|S(f, (P, C)) − R| < ε

for every tagged partition (P, C) with $|P| := \mathrm{mesh}(P) := \max_{1 \le i \le n} |x_i - x_{i-1}| < \delta$.

4.8.24. Theorem. Let f : [a, b]→ R be a bounded function. Then

f ∈ R([a, b]) ⇐⇒ f is integrable in the sense of Riemann’s original definition.

In any of these cases, we have

\[
\int_a^b f(x)\,dx = R,
\]

where R is as in Definition 4.8.23.

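Proof. “⇐=”: Let ε > 0. By assumption, ∃δ > 0 such that for any tagged partition (P, C) of [a, b] with |P| < δ, we have

|S(f, (P, C)) − R| < ε/2. (4.20)

For any such P = {x0, . . . , xn}, we can choose C = {c1, . . . , cn} such that

\[
\sup_{x \in [x_{i-1}, x_i]} f(x) - \frac{\varepsilon}{2(b-a)} < f(c_i), \quad 1 \le i \le n.
\]

Hence

\[
U(f, P) - \varepsilon/2 < S(f, (P, C)) \overset{(4.20)}{<} R + \varepsilon/2,
\]

and so U(f, P) < R + ε. For the upper integral this gives

\[
\overline{\int_a^b} f\,dx < R + \varepsilon, \quad\text{and, since } \varepsilon > 0 \text{ was arbitrary,}\quad \overline{\int_a^b} f\,dx \le R.
\]

Similarly, we can change the points C if necessary to $\tilde{C} = \{\tilde{c}_1, \dots, \tilde{c}_n\}$, so that the tagged partition $(P, \tilde{C})$ satisfies (4.20) and

\[
\inf_{x \in [x_{i-1}, x_i]} f(x) + \frac{\varepsilon}{2(b-a)} > f(\tilde{c}_i), \quad 1 \le i \le n.
\]

Hence

\[
L(f, P) + \varepsilon/2 > S(f, (P, \tilde{C})) \overset{(4.20)}{>} R - \varepsilon/2,
\]

so L(f, P) > R − ε, and for the lower integral

\[
\underline{\int_a^b} f\,dx > R - \varepsilon, \quad\text{and, since } \varepsilon > 0 \text{ was arbitrary,}\quad \underline{\int_a^b} f\,dx \ge R.
\]

Since the lower integral never exceeds the upper integral, both are equal to R, i.e. f ∈ R([a, b]) with $\int_a^b f\,dx = R$.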


“=⇒”: Suppose f ∈ R([a, b]) and let ε > 0 be given. Then ∃ partition Q = {x0, . . . , xk} of [a, b] with

U(f, Q) − L(f, Q) < ε/4. (4.21)

Claim: If P = {x0, . . . , xn} is a partition of [a, b] with

\[
|P| < \delta := \frac{\varepsilon}{4k(M - m)},
\]

then

U(f, P) − L(f, P) < ε.

Proof. Let P′ = P ∪ Q. Then there are at most k − 1 partition points of Q that are not in P. Consequently

U(f, P) − U(f, P′) < (M − m)δ(k − 1) < ε/4. (4.22)

(Illustration: in the worst case only one point of Q lies in between two points of P, k − 1 times, e.g. for the point $x_j \in\ ]x_i, x_{i+1}[\ \cap\ Q$, where $x_i, x_{i+1} \in P$, we have

\[
\underbrace{\Big( \sup_{[x_i, x_{i+1}]} f - \sup_{[x_i, x_j]} f \Big)}_{< M - m} (x_j - x_i)
+ \underbrace{\Big( \sup_{[x_i, x_{i+1}]} f - \sup_{[x_j, x_{i+1}]} f \Big)}_{< M - m} (x_{i+1} - x_j)
< (M - m)(x_{i+1} - x_i) < (M - m)\delta.)
\]

Now:

\[
U(f, P) \overset{(4.22)}{<} U(f, P') + \varepsilon/4 \overset{Q \subset P'}{\le} U(f, Q) + \varepsilon/4 \overset{(4.21)}{<} L(f, Q) + \varepsilon/2,
\]

and using the analogue of (4.22) for lower sums, we get

L(f, P′) − L(f, P) < ε/4 and then L(f, P) > U(f, Q) − ε/2.

Hence

U(f, P) < L(f, Q) + ε/2 ≤ U(f, Q) + ε/2 < L(f, P) + ε,

and so

U(f, P) − L(f, P) < ε for any partition P with |P| < δ,

and the claim is shown. □

Since

\[
L(f, P) \le \int_a^b f\,dx \le U(f, P) \quad\text{and}\quad L(f, P) \le S(f, (P, C)) \le U(f, P),
\]

we obtain with the help of the claim

\[
\Big| S(f, (P, C)) - \int_a^b f\,dx \Big| < \varepsilon
\]

for any tagged partition (P, C) with |P| < δ.

Thus f is integrable in the sense of Riemann’s original definition with $R = \int_a^b f\,dx$. □

4.8.25. Remark. Nearly exactly as for 4.8.24 one can see that for continuous and increasing α : [a, b] −→ R:

f ∈ R([a, b], α) ⇐⇒ there exists R ∈ R with the following property: ∀ε > 0, ∃δ > 0, such that |S(f, (P, C), α) − R| < ε for every tagged partition (P, C) of [a, b] with |P| < δ.

In any of these cases, it holds $R = \int_a^b f\,d\alpha$.

4.8.26. Corollary. Let f : [a, b] → R be bounded and α : [a, b] → R be continuous and increasing. Suppose f ∈ R([a, b], α). Then for any sequence of partitions

\[
\{(x_i^n)_{0 \le i \le N(n)}\}_{n \ge 1} \text{ of } [a, b], \quad N(n) \in \mathbb{N},
\]

with

\[
\lim_{n \to \infty} \max_{1 \le i \le N(n)} |x_i^n - x_{i-1}^n| = 0
\]

and any sequence $\{(\xi_i^n)_{1 \le i \le N(n)}\}_{n \ge 1}$ with $\xi_i^n \in [x_{i-1}^n, x_i^n]$, ∀i, n, 1 ≤ i ≤ N(n), we have

\[
\int_a^b f\,d\alpha = \lim_{n \to \infty} \sum_{i=1}^{N(n)} f(\xi_i^n) \cdot \big( \alpha(x_i^n) - \alpha(x_{i-1}^n) \big).
\]

Proof. This follows immediately from 4.8.25. □

4.8.27. Theorem (Hölder’s inequality for RS-integrals). Let α : [a, b] → R be continuous and increasing and f, g ∈ R([a, b], α). Let p, q > 1 with $\frac{1}{p} + \frac{1}{q} = 1$. Then

\[
\int_a^b |f(x) g(x)|\,d\alpha \le
\underbrace{\Big( \int_a^b |f(x)|^p\,d\alpha \Big)^{1/p}}_{=: \|f\|_p}
\underbrace{\Big( \int_a^b |g(x)|^q\,d\alpha \Big)^{1/q}}_{=: \|g\|_q}.
\]

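Proof. Let p, q be as in the statement. For any real numbers $a_k, b_k$, 1 ≤ k ≤ n, we have (Hölder’s inequality for numbers)

\[
\sum_{k=1}^n |a_k b_k| \le \Big( \sum_{k=1}^n |a_k|^p \Big)^{1/p} \Big( \sum_{k=1}^n |b_k|^q \Big)^{1/q}. \tag{$*$}
\]

Let $\{(x_k^n)_{0 \le k \le n}\}_{n \ge 1}$ be a sequence of partitions of [a, b] with

\[
\lim_{n \to \infty} \max_{1 \le k \le n} |x_k^n - x_{k-1}^n| = 0
\]

and $\{(\xi_k^n)_{1 \le k \le n}\}_{n \ge 1}$ be a sequence with $\xi_k^n \in [x_{k-1}^n, x_k^n]$, 1 ≤ k ≤ n, n ≥ 1. Then, writing $\Delta_k^n \alpha := \alpha(x_k^n) - \alpha(x_{k-1}^n)$,

\[
\begin{aligned}
\int_a^b |f(x) \cdot g(x)|\,d\alpha
&\overset{4.8.26}{=} \lim_{n \to \infty} \sum_{k=1}^n |f(\xi_k^n) \cdot g(\xi_k^n)|\, \Delta_k^n \alpha \\
&= \lim_{n \to \infty} \sum_{k=1}^n |f(\xi_k^n)| \big( \Delta_k^n \alpha \big)^{1/p} \cdot |g(\xi_k^n)| \big( \Delta_k^n \alpha \big)^{1/q} \\
&\overset{(*)}{\le} \lim_{n \to \infty} \Big( \sum_{k=1}^n |f(\xi_k^n)|^p\, \Delta_k^n \alpha \Big)^{1/p} \Big( \sum_{k=1}^n |g(\xi_k^n)|^q\, \Delta_k^n \alpha \Big)^{1/q} \\
&= \|f\|_p \cdot \|g\|_q \quad \text{(by 4.8.26 and 4.8.8)}.
\end{aligned}
\]

□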

4.9. Functions of bounded variation

4.9.1. Definition. Let a < b and f : [a, b] → R be a given function. The variation of f over [a, b] is the quantity

\[
V_a^b f := \sup\Big\{ \sum_{i=1}^n |f(x_i) - f(x_{i-1})| \ \Big|\ n \in \mathbb{N},\ a = x_0 < x_1 < \dots < x_n = b \Big\}.
\]

If $V_a^b f$ is finite, we say f is of bounded variation on [a, b]. If f is not of bounded variation on [a, b], we write $V_a^b f = \infty$. We set

\[
V_a^a f := 0.
\]

4.9.2. Theorem. Let f : [a, b] → R be a function and c ∈ ]a, b[. Then

\[
V_a^b f = V_a^c f + V_c^b f.
\]

Proof. First, we show $V_a^b f \le V_a^c f + V_c^b f$. We may assume $V_a^c f, V_c^b f < \infty$. Let P = {x0, . . . , xn} be any partition of [a, b] and $c \in [x_{k-1}, x_k]$. Then

\[
\sum_{i=1}^{k-1} |f(x_i) - f(x_{i-1})| + |f(c) - f(x_{k-1})| \le V_a^c f,
\]

\[
|f(x_k) - f(c)| + \sum_{i=k+1}^{n} |f(x_i) - f(x_{i-1})| \le V_c^b f.
\]

Since $|f(x_k) - f(x_{k-1})| \le |f(x_k) - f(c)| + |f(c) - f(x_{k-1})|$, we have

\[
\sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| \le V_a^c f + V_c^b f.
\]

Since P was arbitrary the first claim follows. Next, we show $V_a^b f \ge V_a^c f + V_c^b f$. If $V_a^b f = \infty$, then the inequality holds. If either $V_a^c f = \infty$ or $V_c^b f = \infty$, then clearly $V_a^b f = \infty$. Thus we may assume that all three variations are finite. From the definition of sup: ∀ε > 0, ∃ partition $P_{ac} = \{x_0, \dots, x_{k-1}, x_k = c\}$ of [a, c] with

\[
\sum_{i=1}^{k} |f(x_i) - f(x_{i-1})| > V_a^c f - \frac{\varepsilon}{2}.
\]

Similarly: ∃ partition $P_{cb} = \{x_k = c, \dots, x_n\}$ of [c, b] with

\[
\sum_{i=k+1}^{n} |f(x_i) - f(x_{i-1})| > V_c^b f - \frac{\varepsilon}{2}.
\]


Hence

\[
V_a^b f \ge \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| > V_a^c f + V_c^b f - \varepsilon.
\]

Since ε > 0 was arbitrary, the second claim follows. □

4.9.3. Example. Let f : [a, b] → R be increasing and x ∈ [a, b]. Then

\[
V_a^x f = f(x) - f(a), \quad x \in [a, b].
\]

Proof. Let P = {x0, . . . , xn} be any partition of [a, x], x > a. Then

\[
\sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| = \sum_{i=1}^{n} \big( f(x_i) - f(x_{i-1}) \big) = f(x) - f(a).
\]

Hence $V_a^x f = f(x) - f(a)$. For x = a by definition $V_a^a f = 0 = f(a) - f(a)$. Similarly, if f is decreasing: $V_a^x f = f(a) - f(x)$, x ∈ [a, b]. □

4.9.4. Theorem. Let f : [a, b] → R be continuous on [a, b] and differentiable on ]a, b[ with f′ bounded on ]a, b[. Then $V_a^b f < \infty$, i.e. f is of bounded variation on [a, b].

Proof. Let M > 0 be such that |f′(x)| ≤ M, ∀x ∈ ]a, b[, and let P = {x0, . . . , xn} be any partition of [a, b]. By the MVT: $\exists \xi_i \in\ ]x_{i-1}, x_i[$ with

\[
|f(x_i) - f(x_{i-1})| = \underbrace{|f'(\xi_i)|}_{\le M}\, |x_i - x_{i-1}|, \quad 1 \le i \le n.
\]

Hence

\[
\sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| \le M \sum_{i=1}^{n} |x_i - x_{i-1}| = M(b - a).
\]

□

4.9.5. Theorem. Let f be of bounded variation on [a, b] and

\[
\varphi(x) := V_a^x f, \quad x \in [a, b].
\]

Then ϕ is increasing, $\varphi(y) - \varphi(x) = V_x^y f$ for y ≥ x, and if

f is left-continuous at x ∈ ]a, b], or
f is continuous at x ∈ [a, b], or
f is right-continuous at x ∈ [a, b[,

then the same is true for ϕ.

Proof. The first statement is clear and the second follows from 4.9.2. We only show:

f is left-continuous at x ∈ ]a, b] =⇒ ϕ is left-continuous at x. (∗)

The other two assertions follow similarly. It is enough to show (∗) for x = b, because one can always consider x as right end point.

Let ε > 0 be given. Then ∃ partition P = {x0, . . . , xn} of [a, b] with

\[
V_a^b f < \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| + \frac{\varepsilon}{2}. \tag{$**$}
\]

W.l.o.g. $|f(b) - f(x_{n-1})| < \varepsilon/2$. Otherwise insert a further point $\tilde{x}_{n-1} \in\ ]x_{n-1}, b[$ sufficiently near to b so that $|f(b) - f(\tilde{x}_{n-1})| < \varepsilon/2$, which is possible since f is left-continuous at b; by the triangle inequality the sum in (∗∗) does not decrease, so (∗∗) still holds. Then for any $x \in [x_{n-1}, b]$ (i.e. for any x ∈ [a, b] with $|x - b| \le |x_{n-1} - b| =: \delta$), we have

\[
\begin{aligned}
\varphi(x) = V_a^x f \ge V_a^{x_{n-1}} f
&\ge \sum_{i=1}^{n-1} |f(x_i) - f(x_{i-1})| \\
&> \sum_{i=1}^{n-1} |f(x_i) - f(x_{i-1})| + |f(b) - f(x_{n-1})| - \frac{\varepsilon}{2} \\
&= \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| - \frac{\varepsilon}{2}
\overset{(**)}{>} V_a^b f - \varepsilon = \varphi(b) - \varepsilon.
\end{aligned}
\]

Since ϕ is increasing, |ϕ(b) − ϕ(x)| < ε and (∗) holds. □

4.9.6. Theorem. Let $V_a^b f < \infty$. Then, there are increasing functions g, h : [a, b] → R with

f = g − h

and

\[
V_a^x f = g(x) + h(x) - f(a), \quad x \in [a, b].
\]

Moreover, if f is left-continuous, right-continuous, resp. continuous at x ∈ [a, b], then so are g and h.

Proof. For x ∈ [a, b], set

\[
g(x) := \frac{1}{2}\big( f(a) + V_a^x f + f(x) \big), \qquad h(x) := \frac{1}{2}\big( f(a) + V_a^x f - f(x) \big).
\]

Then g − h = f and

\[
V_a^x f = 2g(x) - f(a) - f(x) = g(x) + h(x) - f(a).
\]

For y > x,

\[
2g(y) - 2g(x) = V_a^y f - V_a^x f + f(y) - f(x) \overset{4.9.2}{=} V_x^y f + f(y) - f(x) \ge 0,
\]

since

\[
V_x^y f \ge |f(y) - f(x)| \ge -\big( f(y) - f(x) \big).
\]

Thus g is increasing. Similarly, h is increasing and the assertions about continuity directly follow from 4.9.5 and the explicit formulas for g and h. □

4.9.7. Remark. The decomposition of f in 4.9.6 is not unique. Indeed, choose any increasing ψ : [a, b] → R that is left-continuous, right-continuous, or continuous at x. Then we also have the decomposition

\[
f = \underbrace{(g + \psi)}_{=: f_1 \nearrow} - \underbrace{(h + \psi)}_{=: f_2 \nearrow},
\]

with f1, f2 left-continuous, right-continuous, or continuous at x. Moreover, if ψ is strictly increasing, then so are f1 and f2.

4.9.8. Theorem. Suppose f, f′ : [a, b] −→ R are continuous. Then $V_a^b f < \infty$ and

\[
V_a^b f = \int_a^b |f'(x)|\,dx.
\]

Proof. $V_a^b f < \infty$ follows from 4.9.4. Let P := {x0, . . . , xn} be any partition of [a, b]. Then by the FTC

\[
\sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| = \sum_{i=1}^{n} \Big| \int_{x_{i-1}}^{x_i} f'(x)\,dx \Big| \le \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} |f'(x)|\,dx = \int_a^b |f'(x)|\,dx,
\]

hence

\[
V_a^b f \le \int_a^b |f'(x)|\,dx.
\]

Next, we show $\int_a^b |f'(x)|\,dx \le V_a^b f$. For this, let ε > 0. Since f′ is uniformly continuous on [a, b]: ∃δ > 0 with

x, y ∈ [a, b] and |x − y| < δ =⇒ |f′(x) − f′(y)| < ε.

Choose a partition P = {x0, . . . , xn} of [a, b] with $x_i - x_{i-1} < \delta$, ∀i ∈ {1, . . . , n}. Now

\[
x \in [x_{i-1}, x_i] \implies |f'(x)| < |f'(x_i)| + \varepsilon
\]

and so

\[
\begin{aligned}
\int_{x_{i-1}}^{x_i} |f'(x)|\,dx
&\le \underbrace{|f'(x_i)|(x_i - x_{i-1})}_{= |f'(x_i)(x_i - x_{i-1})|} + \varepsilon(x_i - x_{i-1}) \\
&= \Big| \int_{x_{i-1}}^{x_i} \big( f'(x) + f'(x_i) - f'(x) \big)\,dx \Big| + \varepsilon(x_i - x_{i-1}) \\
&\le \Big| \int_{x_{i-1}}^{x_i} f'(x)\,dx \Big| + \Big| \int_{x_{i-1}}^{x_i} \big( f'(x_i) - f'(x) \big)\,dx \Big| + \varepsilon(x_i - x_{i-1}) \\
&\le |f(x_i) - f(x_{i-1})| + 2\varepsilon(x_i - x_{i-1}),
\end{aligned}
\]

hence

\[
\int_a^b |f'(x)|\,dx \le \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| + 2\varepsilon(b - a) \le V_a^b f + 2\varepsilon(b - a).
\]

Since ε > 0 was arbitrary, the claim follows. □
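The formula of 4.9.8 can be checked on a concrete example. The sketch below assumes f = sin on [0, 2π], for which the variation should equal ∫ |cos x| dx = 4; the helper names and the partition size are illustrative choices. The partition sum is a lower bound for the variation, and for a fine partition it comes close to it.

```python
import math

def variation_sum(f, points):
    """Sum of |f(x_i) - f(x_{i-1})| over a partition (a lower bound for the variation)."""
    return sum(abs(f(points[i]) - f(points[i - 1])) for i in range(1, len(points)))

def midpoint_integral(g, a, b, n):
    """Midpoint Riemann sum approximating the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

a, b, n = 0.0, 2 * math.pi, 1000
points = [a + (b - a) * i / n for i in range(n + 1)]

v = variation_sum(math.sin, points)                               # approximates V_0^{2 pi} sin
integral = midpoint_integral(lambda x: abs(math.cos(x)), a, b, n)  # approximates int |sin'|

print(v, integral)
```

Because n is divisible by 4, the partition contains the turning points π/2 and 3π/2 (up to rounding), so the partition sum already essentially equals the full variation.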

4.9.9. Remark. The explicit representation of the variation as an integral in 4.9.8 can be used to get an explicit representation of g, h in the proof of 4.9.6.

4.9.10. Theorem. Let f : [a, b] → R be increasing. Then

\[
f(x+) := \inf_{\substack{h > 0 \\ x + h \in [a,b]}} f(x + h), \quad x \in [a, b[\,,
\qquad
f(x-) := \sup_{\substack{h > 0 \\ x - h \in [a,b]}} f(x - h), \quad x \in\ ]a, b],
\]

both exist and the points of discontinuity of f are at most countable.

Proof. Define the jump height of f

\[
\delta f(x) := \begin{cases}
f(x+) - f(x-), & \text{if } x \in\ ]a, b[, \\
f(x) - f(x-), & \text{if } x = b, \\
f(x+) - f(x), & \text{if } x = a.
\end{cases}
\]

Then 0 ≤ δf(x) ≤ f(b) − f(a) and f is continuous at x, if and only if δf(x) = 0. Let

\[
E_n := \Big\{ x \in [a, b] \ \Big|\ \delta f(x) > \frac{1}{n} \Big\}.
\]

Then $\bigcup_{n \ge 1} E_n = \{x \in [a, b] \mid \delta f(x) > 0\}$ is the set of points where f is discontinuous. For a set A define #A ∈ N ∪ {0, ∞} as the number of its elements. Since

\[
\# E_n \le \inf\Big\{ k \in \mathbb{N} \ \Big|\ f(a) + k \cdot \frac{1}{n} > f(b) \Big\} < \infty,
\]

we get that $\bigcup_{n \ge 1} E_n$ is at most countable by Corollary 1.1.5 of Section 1.1. □

4.9.11. Corollary. Let $V_a^b f < \infty$. Then f is continuous on [a, b] except on a set which is at most countable.

4.9.12. Definition. Let f, α : [a, b] → R be continuous, $V_a^b \alpha < \infty$. Let α = α1 − α2 be any decomposition of α as difference of continuous increasing functions α1, α2 : [a, b] → R as in 4.9.6. Then the RS-integral of f w.r.t. α is defined by

\[
\int_a^b f\,d\alpha := \int_a^b f\,d\alpha_1 - \int_a^b f\,d\alpha_2.
\]

$\int_a^b f\,d\alpha$ is well-defined, i.e. independent of the choice of α1, α2. Indeed, let $\alpha = \tilde{\alpha}_1 - \tilde{\alpha}_2$ for some other continuous increasing functions $\tilde{\alpha}_1, \tilde{\alpha}_2 : [a, b] \to \mathbb{R}$. Using 4.8.26, we get

\[
\int_a^b f\,d\alpha_1 - \int_a^b f\,d\alpha_2 = \int_a^b f\,d\tilde{\alpha}_1 - \int_a^b f\,d\tilde{\alpha}_2.
\]

Moreover for any sequence of partitions

\[
\{(x_i^n)_{0 \le i \le N(n)}\}_{n \ge 1} \text{ of } [a, b], \quad N(n) \in \mathbb{N},
\]

with

\[
\lim_{n \to \infty} \max_{1 \le i \le N(n)} |x_i^n - x_{i-1}^n| = 0,
\]

and any sequence $\{(\xi_i^n)\}_{n \ge 1}$ of intermediate points $\xi_i^n \in [x_{i-1}^n, x_i^n]$, 1 ≤ i ≤ N(n), n ≥ 1, we have by 4.8.26

\[
\int_a^b f\,d\alpha = \lim_{n \to \infty} \sum_{i=1}^{N(n)} f(\xi_i^n)\big( \alpha(x_i^n) - \alpha(x_{i-1}^n) \big).
\]

Section 4 Additional Examples:

1) Using the ε-δ criterion, show that f : ]0,∞[ → R, x ↦ f(x) = 1/x is continuous.

Solution: Let x0 ∈ ]0,∞[ be arbitrary. Then for x ∈ ]0,∞[ with |x − x0| < δ, where δ ≤ x0/2 (so that x > x0/2), we have

\[
|f(x) - f(x_0)| = \Big| \frac{1}{x} - \frac{1}{x_0} \Big| = \Big| \frac{x - x_0}{x x_0} \Big| < \frac{\delta}{x x_0} < \frac{2\delta}{x_0^2}.
\]

Let ε > 0 be arbitrary. We need |f(x) − f(x0)| < ε, which we can get if $\frac{2\delta}{x_0^2} \le \varepsilon$, hence $\delta \le \frac{\varepsilon x_0^2}{2}$. Thus for $\delta := \min\big( \frac{x_0}{2}, \frac{\varepsilon x_0^2}{2} \big)$, we get

x ∈ ]0,∞[ and |x − x0| < δ =⇒ |f(x) − f(x0)| < ε.

2) Let f : Rn −→ Rm be continuous. Show that {x ∈ Rn | ‖f(x)‖ < 1} is open.


Solution: We have $\{x \in \mathbb{R}^n \mid \|f(x)\| < 1\} = \{x \in \mathbb{R}^n \mid f(x) \in B(0, 1)\} = f^{-1}(B(0, 1))$, which is open by 4.1.4(iii).

3) Let K ⊂ R2 be compact. Prove that

A := {x ∈ R | ∃y ∈ R with (x, y) ∈ K}

is compact.

Proof. Let $f : \mathbb{R}^2 \to \mathbb{R}$, f(x, y) = x. Then f is continuous by 4.1.4(ii). We claim: A = f(K) (⇒ A is compact by 4.2.2). To prove the claim, let x ∈ A. Then (x, y) ∈ K for some y ∈ R, hence x = f(x, y) ∈ f(K). Conversely, if x ∈ f(K) then x = f(x, y) for some (x, y) ∈ K. Hence there is y ∈ R with (x, y) ∈ K, i.e. x ∈ A. □

4) Let $f : \mathbb{R}^2 \to \mathbb{R}$ be continuous. Show that A := {f(x) | ‖x‖ = 1} is a closed interval of the form [a, b].

Proof. We have $A = f(S^1)$, where $S^1 = \{x \in \mathbb{R}^2 \mid \|x\| = 1\}$ is the unit circle. Since $S^1$ is (path-)connected and compact, we have that A is a connected and compact subset of R (see Theorems 4.2.1 and 4.2.2), hence a compact interval (cf. Lemma 3.5.2). But compact intervals are of the form [a, b]. □

5) Let $f : \ ]\tfrac{1}{2}, 1] \to \mathbb{R}$, $f(x) := \frac{1}{x}$. Then f is continuous by example 1), but does not attain its maximum. Indeed, we have $\sup\{f(x) \mid x \in\ ]\tfrac{1}{2}, 1]\} = 2 \notin f\big( ]\tfrac{1}{2}, 1] \big)$, i.e. $\nexists\, x \in\ ]\tfrac{1}{2}, 1]$ with f(x) = 2.

6) Give an example of an unbounded discontinuous function on a compact set.

For instance: f : [0, 1] → R, $f(x) := \frac{1}{x}$, if x > 0, f(0) := 0.

7) Let f : [0, 1] → R, $f(x) = \frac{x}{x^2 + 1}$. Verify the Maximum-Minimum Theorem for f.

Solution: First f is continuous by 4.3.3(iii) and [0, 1] is obviously compact. We will explicitly verify that the maximum is attained at x = 1 and the minimum is attained at x = 0. First, as x ≥ 0, we have $f(x) = \frac{x}{x^2 + 1} \ge 0$. Thus 0 = f(0) = inf{f(x) | x ∈ [0, 1]}.

Since $0 \le (x - 1)^2 = x^2 - 2x + 1$, we have $x^2 + 1 \ge 2x$. Hence for x ≠ 0

\[
f(x) = \frac{x}{x^2 + 1} \le \frac{x}{2x} = \frac{1}{2} = f(1).
\]

Thus f(1) = sup{f(x) | x ∈ [0, 1]}.

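8) Let (M, d1) and (N, d2) be metric spaces. Then a product metric d on M × N can be defined by

\[
d\big( (x, y), (\tilde{x}, \tilde{y}) \big) := d_1(x, \tilde{x}) + d_2(y, \tilde{y}), \quad (x, y), (\tilde{x}, \tilde{y}) \in M \times N.
\]

Easy to check: (M × N, d) is a metric space and

\[
(x_n, y_n) \to (x, y) \text{ in } M \times N \iff x_n \to x \text{ in } M \text{ and } y_n \to y \text{ in } N.
\]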

9) Let f : A ⊂ M → N be continuous (M, N metric spaces) and let K ⊂ A be connected. Then

graph(f|K) := {(x, f(x)) ∈ M × N | x ∈ K}

is connected in M × N with product metric as in 8).

Proof. Let g : A ⊂ M → M × N, g(x) := (x, f(x)), x ∈ A. Then g is continuous w.r.t. the product metric of example 8) on M × N. But g(K) = graph(f|K) is then connected by 4.2.1. □

Note: The statement remains true, if one replaces connected by compact or closed.

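10) Let

\[
f(x, y) := \begin{cases}
\dfrac{x^2 y}{x^4 + y^2}, & \text{if } (x, y) \in \mathbb{R}^2 \setminus \{(0, 0)\}, \\
0, & \text{if } (x, y) = (0, 0).
\end{cases}
\]

Then f is continuous on $\mathbb{R}^2 \setminus \{(0, 0)\}$ but not continuous at (0, 0). Continuity on $\mathbb{R}^2 \setminus \{(0, 0)\}$ is clear. To see that f is discontinuous at (0, 0) let for a, b ≠ 0

ϕ(t) := (t · a, t · b), t > 0.

Then

\[
f(\varphi(t)) = \frac{t^2 a^2\, t b}{t^4 a^4 + t^2 b^2} = \frac{a^2 b\, t}{t^2 a^4 + b^2} \xrightarrow[t \to 0]{} 0.
\]

For α ≠ 0 define

\[
\tilde{\varphi}(t) := (t, t^2 \alpha), \quad t > 0.
\]

Then

\[
f(\tilde{\varphi}(t)) = \frac{t^2\, t^2 \alpha}{t^4 + t^4 \alpha^2} = \frac{\alpha}{1 + \alpha^2} \xrightarrow[t \to 0]{} \frac{\alpha}{1 + \alpha^2}.
\]

Thus

\[
\lim_{t \to 0} f(\varphi(t)) \ne \lim_{t \to 0} f(\tilde{\varphi}(t)),
\]

and so f cannot be continuous at (0, 0). However, if

\[
E := \{(x, y) \in \mathbb{R}^2 \mid x \ge 0 \text{ and } |y| \le x^3\},
\]

then f|E is continuous, since

\[
\Big| \frac{x^2 y}{x^4 + y^2} - 0 \Big| = \frac{x^2 |y|}{x^4 + y^2} \overset{(x,y) \in E}{\le} \frac{|x|^5}{x^4 + y^2} \le |x| \longrightarrow 0 \quad \text{as } (x, y) \to (0, 0),\ (x, y) \in E.
\]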
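The two limits along different paths in example 10) can be made visible by direct evaluation. This is only a numerical illustration; the concrete values a = 1, b = 2 and α = 0.5 are assumptions chosen for the sketch.

```python
def f(x, y):
    """f(x, y) = x^2 y / (x^4 + y^2) away from the origin, f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x ** 4 + y * y)

a, b = 1.0, 2.0   # direction of the straight line through the origin
alpha = 0.5       # parameter of the parabola y = alpha * x^2

for t in (1e-1, 1e-2, 1e-3):
    along_line = f(t * a, t * b)          # tends to 0 as t -> 0
    along_parabola = f(t, alpha * t * t)  # constant alpha / (1 + alpha^2)
    print(t, along_line, along_parabola)
```

Along every straight line the values tend to 0, while along the parabola they stay at α/(1 + α²) = 0.4, so no single limit at the origin exists.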

11) Let f : [a, b] −→ R be differentiable and suppose that

|f(x)| + |f′(x)| ≠ 0, ∀x ∈ [a, b].

Show that f has only finitely many zeros in [a, b] (a zero of f is an x ∈ [a, b] with f(x) = 0).

Proof. Suppose there are infinitely many zeros. Then we can find a sequence $y_n \in [a, b]$ with $f(y_n) = 0$, ∀n ≥ 1 and $y_n \ne y_m$, ∀n ≠ m. Since [a, b] is compact: ∃y ∈ [a, b] such that $y_{n_k} \to y$ for some subsequence. Note that $y_{n_k} = y$ is only possible for at most one k ∈ N. By continuity of f: $0 = f(y_{n_k}) \to f(y)$ as k → ∞, so f(y) = 0. Now

\[
f'(y) = \lim_{k \to \infty} \frac{f(y_{n_k}) - f(y)}{y_{n_k} - y} = 0,
\]

hence

|f′(y)| + |f(y)| = 0,

which contradicts the assumption. □

Illustration: Consider

\[
f(x) := \begin{cases} x^2 \sin\frac{1}{x}, & \text{if } x \ne 0, \\ 0, & \text{if } x = 0. \end{cases}
\]

Then f is continuous on R since f is obviously continuous on R \ {0} and f is also continuous at 0 since

\[
\Big| x^2 \sin\frac{1}{x} - 0 \Big| \le |x^2| \xrightarrow[x \to 0]{} 0.
\]

f is differentiable on R \ {0} since $x^2$ and $\sin\frac{1}{x}$ are differentiable there. Hence by 4.7.3(iii) applied to x0 ∈ R \ {0}, we get

\[
f'(x_0) = 2x_0 \sin\frac{1}{x_0} + x_0^2 \Big( -\frac{1}{x_0^2} \cos\frac{1}{x_0} \Big) = 2x_0 \sin\frac{1}{x_0} - \cos\frac{1}{x_0}.
\]

But f is also differentiable at x0 = 0 with f′(0) = 0 since for any x ∈ R \ {0}

\[
\Big| \frac{f(x) - f(0)}{x - 0} \Big| = \Big| x \sin\frac{1}{x} \Big| \le |x| \xrightarrow[x \to 0]{} 0.
\]

We do not have f′(x) → f′(0) as x → 0, thus f′ is not continuous on R (only on R \ {0}). f has infinitely many zeros in [0, 1]:

\[
f(y_k) = 0, \quad \text{if } y_k = \frac{1}{k\pi},\ k \in \mathbb{N}.
\]

Thus by the above: ∃y ∈ [0, 1] with |f(y)| + |f′(y)| = 0. Obviously y = 0 does fulfill this, but note that

\[
y_{n_k} \xrightarrow[k \to \infty]{} y = 0 \quad \text{with } |f(y)| + |f'(y)| = 0,
\]

exactly as in the proof above.

12) Let α > 1 and let f : [a, b] −→ R satisfy

\[
|f(x) - f(y)| \le M |x - y|^{\alpha}, \tag{$*$}
\]

for some constant M ≥ 0 and all x, y ∈ [a, b]. Show that f is constant on [a, b].

Proof. Let x0 ∈ ]a, b[ be arbitrary. Then ∀x ∈ ]a, b[ with x ≠ x0

\[
\frac{|f(x) - f(x_0)|}{|x - x_0|} \le M |x - x_0|^{\alpha - 1} \xrightarrow[x \to x_0]{} 0 \quad (\text{since } \alpha > 1).
\]

Thus f′(x0) = 0. Since f is continuous by (∗), f is constant by 4.7.8. □

13) The function f : R −→ ]0,∞[, $f(x) := e^x$, x ∈ R, is continuous on R and strictly increasing on R, since $f'(x) = e^x > 0$, ∀x ∈ R. Indeed, by 4.7.10 f is strictly increasing on any [a, b] ⊂ R, hence on R.

Let ln : ]0,∞[ −→ R, $\ln := f^{-1}$, which exists by 4.7.11. Actually, by 4.7.11 ln is defined on ]0,∞[ and continuous there.

Moreover since f is differentiable at any x ∈ R, we get by 4.7.11 that ln is differentiable at any f(x) ∈ ]0,∞[ with

\[
(\ln)'(f(x)) = \frac{1}{f'(x)}. \tag{$*$}
\]

Let y ∈ ]0,∞[ be arbitrary. Since f is surjective: ∃x ∈ R with f(x) = y (⇐⇒ $x = f^{-1}(y) = \ln y$). Thus

\[
(\ln)'(y) \overset{(*)}{=} \frac{1}{f'(\ln y)} = \frac{1}{e^{\ln y}} = \frac{1}{y}, \quad \forall y \in\ ]0, \infty[.
\]

14) Let f : R → R be differentiable, f(0) ≠ 0, such that for some c > 0: $f'(x) > \frac{1}{c}$, ∀x ∈ R. Show that f has exactly one zero in between 0 and −cf(0).

Proof. Suppose f(0) > 0 (the case f(0) < 0 is similar). By the MVT: ∃ξ ∈ ]−cf(0), 0[ with

\[
\frac{f(0) - f(-cf(0))}{cf(0)} = f'(\xi) > \frac{1}{c},
\]

hence f(0) − f(−cf(0)) > f(0) and so f(−cf(0)) < 0. By the IVT: ∃x0 ∈ ]−cf(0), 0[ with f(x0) = 0. Since f′(x) > 0, ∀x ∈ R, we get by 4.7.10 that f is strictly increasing on R. Therefore f can have at most one zero. □

15) We have seen in Section 4.5: If f : [0, 1] → [0, 1] is continuous, then f has a fixed point, i.e. ∃x0 ∈ [0, 1] with f(x0) = x0.

Now suppose additionally that f is differentiable in ]0, 1[ and that f′(x) ≠ 1, ∀x ∈ ]0, 1[. Then show that x0 is unique. Does the uniqueness of x0 still hold if we only assume that f is differentiable in ]0, 1[?

Solution: Suppose there exist ξ1, ξ2 ∈ [0, 1], ξ1 < ξ2, with f(ξ1) = ξ1, f(ξ2) = ξ2. Then by the MVT:

\[
\exists \eta \in\ ]\xi_1, \xi_2[ \text{ with } 1 = \frac{f(\xi_2) - f(\xi_1)}{\xi_2 - \xi_1} = f'(\eta) \ne 1,
\]

which is a contradiction.

Differentiability in ]0, 1[ is not sufficient for the uniqueness of x0. Counterexample: f(x) = x, x ∈ [0, 1].

16) Let f be differentiable in [a, b]. Suppose f(a) > f(b) and f′(a) > 0. Show: ∃ξ ∈ ]a, b[ with f′(ξ) = 0.

Proof. Since f is differentiable on [a, b], f is continuous on [a, b]. Then by the Maximum-Minimum Theorem:

\[
\exists \xi \in [a, b] \text{ with } f(\xi) = \sup_{x \in [a,b]} f(x).
\]

Since f(a) > f(b) we have b ≠ ξ. Since

\[
0 < f'(a) = \lim_{x \to a+} \frac{f(x) - f(a)}{x - a},
\]

there exists a δ > 0 with f(x) > f(a), ∀x ∈ ]a, a + δ[. Thus a ≠ ξ and so f attains its maximum in ]a, b[. Then f′(ξ) = 0 by 4.7.5. □

17) Let f : [0,∞[ → R be continuous and differentiable in ]0,∞[. Suppose f(0) ≤ 0 and that f′ is increasing in ]0,∞[. Show that

\[
g(x) := \frac{f(x)}{x}, \quad x \in\ ]0, \infty[,
\]

is increasing. Can f(0) ≤ 0 be dropped?

Solution: Let x > 0. By the MVT

\[
\exists \xi \in\ ]0, x[ \text{ with } \frac{f(x) - f(0)}{x - 0} = f'(\xi),
\]

hence

\[
f(x) = x f'(\xi) + f(0). \tag{$*$}
\]

Since f′ is increasing, we have f′(x) ≥ f′(ξ), thus

\[
g'(x) \overset{4.7.3(iv)}{=} \frac{x f'(x) - f(x)}{x^2} \overset{(*)}{=} \frac{f'(x) - f'(\xi)}{x} - \frac{f(0)}{x^2} \ge 0,
\]

and so g is increasing in ]0,∞[ by 4.7.10.

For f(x) ≡ 1, $g(x) = \frac{1}{x}$ is strictly decreasing, hence f(0) ≤ 0 cannot be omitted.

18) Let f : ]0,∞[ → R be three times differentiable. Suppose:

\[
\exists \lim_{x \to \infty} f(x) = \alpha \in \mathbb{R}, \qquad \exists \lim_{x \to \infty} f'''(x) = 0.
\]

Show that $\exists \lim_{x \to \infty} f'(x)$, $\exists \lim_{x \to \infty} f''(x)$ and

\[
\lim_{x \to \infty} f'(x) = \lim_{x \to \infty} f''(x) = 0.
\]

Proof. Let x0 > 1. By Taylor’s Theorem with x = x0 + 1

\[
\exists \xi_1 \in\ ]x_0, x_0 + 1[ \text{ with } f(x_0 + 1) = f(x_0) + f'(x_0) + \frac{1}{2} f''(x_0) + \frac{1}{6} f'''(\xi_1)
\]

and with x = x0 − 1

\[
\exists \xi_2 \in\ ]x_0 - 1, x_0[ \text{ with } f(x_0 - 1) = f(x_0) - f'(x_0) + \frac{1}{2} f''(x_0) - \frac{1}{6} f'''(\xi_2).
\]

Adding both expressions

\[
f''(x_0) = f(x_0 + 1) + f(x_0 - 1) - 2f(x_0) - \frac{1}{6}\big( f'''(\xi_1) - f'''(\xi_2) \big) \tag{4.23}
\]

and subtracting both expressions

\[
2f'(x_0) = f(x_0 + 1) - f(x_0 - 1) - \frac{1}{6}\big( f'''(\xi_1) + f'''(\xi_2) \big). \tag{4.24}
\]

Letting x0 → ∞, (4.23) and (4.24) directly imply the assertions. □

19) Let f, g : R → R both be differentiable. Suppose that for some interval ]a, b[ ⊂ R, we have $f(g(x)) = e^x$, x ∈ ]a, b[. Show that g has no local extremum in ]a, b[.

Proof. By the chain rule

\[
(f \circ g)'(x) = f'(g(x))\, g'(x) = e^x \ne 0, \quad \forall x \in\ ]a, b[.
\]

Thus g′(x) ≠ 0, ∀x ∈ ]a, b[ and by 4.7.5, g cannot have a local extremum in ]a, b[. □

20) f : ]a, b[ −→ R is called convex, if

f(tx + (1 − t)y) ≤ tf(x) + (1 − t)f(y), ∀x, y ∈ ]a, b[, t ∈ ]0, 1[.

Let x < u < y and t ∈ ]0, 1[ be such that

u = tx + (1 − t)y.

Then

u − x = (1 − t)(y − x),

and so

\[
f(u) \le f(x) + (1 - t)\big( f(y) - f(x) \big) = f(x) + \frac{f(y) - f(x)}{y - x}(u - x) =: g(u). \tag{$*$}
\]

In particular (∗) implies

\[
\frac{f(u) - f(x)}{u - x} \le \frac{f(y) - f(x)}{y - x}, \quad a < x < u < y < b. \tag{4.25}
\]

Moreover, since

\[
\frac{y - u}{y - x},\ \frac{u - x}{y - x} \in\ ]0, 1[ \quad\text{with}\quad \frac{y - u}{y - x} + \frac{u - x}{y - x} = 1,
\]

we get

\[
\begin{aligned}
f(u) = f\Big( \frac{y - u}{y - x} \cdot x + \frac{u - x}{y - x} \cdot y \Big)
&\le \frac{y - u}{y - x} f(x) + \frac{u - x}{y - x} f(y) \\
&= \frac{y - u}{y - x}\big( f(x) - f(y) \big) + \underbrace{\Big( \frac{u - x}{y - x} + \frac{y - u}{y - x} \Big)}_{=1} f(y)
\end{aligned}
\]

\[
\iff \frac{f(u) - f(y)}{y - u} \le \frac{f(x) - f(y)}{y - x}
\iff \frac{f(y) - f(x)}{y - x} \le \frac{f(y) - f(u)}{y - u}. \tag{4.26}
\]

(4.25) and (4.26) form the Chordal Slope Lemma.

The Chordal Slope Lemma implies that for each x ∈ ]a, b[

\[
y \longmapsto h(y) := \frac{f(y) - f(x)}{y - x}, \quad y \in\ ]a, b[\ \setminus \{x\}, \tag{4.27}
\]

is increasing. Then f is Lipschitz continuous on each compact subinterval [c, d] ⊂ ]a, b[.

Indeed, let x, y ∈ [c, d], x ≠ y and ε > 0 such that a < c − ε, d + ε < b. Then

\[
\frac{f(c - \varepsilon) - f(x)}{(c - \varepsilon) - x} \overset{(4.27)}{\le} \frac{f(y) - f(x)}{y - x} \overset{(4.27)}{\le} \frac{f(d + \varepsilon) - f(x)}{(d + \varepsilon) - x}
\]

and so

\[
\Big| \frac{f(y) - f(x)}{y - x} \Big| \le \max\Big\{ \Big| \frac{f(c - \varepsilon) - f(x)}{(c - \varepsilon) - x} \Big|,\ \Big| \frac{f(d + \varepsilon) - f(x)}{(d + \varepsilon) - x} \Big| \Big\} =: C_x.
\]

Since $\sup_{x \in [c,d]} C_x < \infty$, we obtain the Lipschitz continuity of f on [c, d]. In particular f is continuous on ]a, b[.

Note: If we define convex as we did before but for closed or half-closed intervals, then a convex function need not be continuous.

Example: f : [a, b] −→ R defined by

\[
f(x) = \begin{cases} 0, & \text{if } x \in [a, b[, \\ 1, & \text{if } x = b, \end{cases}
\]

is not continuous, but satisfies

\[
f(tx + (1 - t)y) \le t f(x) + (1 - t) f(y) \quad \forall x, y \in [a, b],\ t \in\ ]0, 1[. \tag{4.28}
\]

Moreover, if g : [a, b] −→ R satisfies (4.28) and is continuous, it does not need to have a convex extension f : ]a − ε, b + ε[ −→ R, i.e. a convex f with f = g on [a, b].

Counterexample: $g(x) := -\sqrt{x}$, x ∈ [0, 1] does not have a convex extension.

Proof. Suppose there exists a convex extension F of g. Then for sufficiently small h ∈ ]0, 1[

\[
\frac{F(-h) - F(0)}{-h} \overset{(4.27)}{\le} \frac{F(h^4) - F(0)}{h^4} = -\frac{1}{h^2}
\]

\[
\implies F(-h) \ge \frac{1}{h} \xrightarrow[h \searrow 0]{} \infty,
\]

which is a contradiction (since F is continuous). □

20) Let f : ]a, b[ → R be convex. Then

(i) ∀x ∈ ]a, b[

\[
\exists f'_-(x) := \lim_{h \to 0-} \frac{f(x + h) - f(x)}{h}, \qquad
\exists f'_+(x) := \lim_{h \to 0+} \frac{f(x + h) - f(x)}{h}.
\]

(ii) $f'_+, f'_- : \ ]a, b[ \to \mathbb{R}$ are increasing, more precisely: ∀x, y ∈ ]a, b[ with x < y it holds $f'_-(x) \le f'_+(x) \le f'_-(y) \le f'_+(y)$.

Proof. Exercise. □

21) Let f : ]a, b[ −→ R be differentiable. Then

f is convex ⇐⇒ f′ : ]a, b[ −→ R is increasing.

Proof. “=⇒” Since f is convex and differentiable, we get by 20)(ii) for x < y

\[
f'(x) = f'_-(x) = f'_+(x) \le f'_-(y) = f'_+(y) = f'(y).
\]

“⇐=” Suppose f is not convex. Then we can find x, y ∈ ]a, b[, x < y and t ∈ ]0, 1[ with

\[
f(\underbrace{tx + (1 - t)y}_{=: u}) > t f(x) + (1 - t) f(y).
\]

Hence (cf. (4.25) and (4.26)) x < u < y and

\[
\frac{f(u) - f(x)}{u - x} > \frac{f(y) - f(u)}{y - u}.
\]

By the MVT: ∃ξ1 ∈ ]x, u[, ξ2 ∈ ]u, y[, hence ξ1 < ξ2, with

f′(ξ1) > f′(ξ2),

which contradicts the assumption that f′ is increasing. □


22) Let f : ]a, b[ −→ R be twice differentiable. Then

f is convex ⇐⇒ f ′′(x) ≥ 0, ∀x ∈ ]a, b[ .

Proof. This follows directly from 21) and 4.7.10 (applied to each [c, d] ⊂ ]a, b[). □

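Example: Let $x, y > 0$ and $p, q > 1$ with $\frac1p + \frac1q = 1$. Then Young's inequality holds, i.e.
\[
xy \le \frac{x^p}{p} + \frac{y^q}{q}.
\]

Proof. Consider $\ln : \,]0,\infty[\, \to \mathbb{R}$ (see 13)). ln is twice differentiable with
\[
\ln''(y) = -\frac{1}{y^2} < 0, \quad \forall y \in\, ]0,\infty[\,.
\]
By 22), $-\ln$ is convex in $]0,\infty[$. Hence
\[
-\ln\left(\frac1p x^p + \frac1q y^q\right) \le -\left(\frac1p \ln(x^p) + \frac1q \ln(y^q)\right) = -(\ln x + \ln y) = -\ln(x \cdot y)
\]
\[
\Longrightarrow\quad \ln(x \cdot y) \le \ln\left(\frac1p x^p + \frac1q y^q\right).
\]
Taking the exponential on both sides, the result follows. □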
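Young's inequality is easy to test numerically. The following is only an illustrative sketch and not part of the proof: the helper name `young_gap` and the sampling ranges are assumptions made for this example. It evaluates the difference between the right-hand and left-hand side for random positive x, y and random p > 1.

```python
import random


def young_gap(x: float, y: float, p: float) -> float:
    """Return x**p/p + y**q/q - x*y, where q is the conjugate exponent of p."""
    q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1
    return x**p / p + y**q / q - x * y


random.seed(0)  # fixed seed so the experiment is reproducible
samples = [
    (random.uniform(0.01, 10.0), random.uniform(0.01, 10.0), random.uniform(1.1, 5.0))
    for _ in range(1000)
]
# Young's inequality says the gap is never negative (up to rounding).
smallest_gap = min(young_gap(x, y, p) for x, y, p in samples)
print(smallest_gap >= -1e-9)
```

The gap vanishes when $x^p = y^q$, for instance for x = y = 4 and p = q = 2.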

Young's inequality can be used to show Hölder's inequality for numbers, which is used in the proof of 4.8.27.

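Claim: Let $a_k, b_k$, $1 \le k \le n$, be any real numbers and $p, q > 1$ with $\frac1p + \frac1q = 1$. Then
\[
\sum_{k=1}^n |a_k b_k| \le \left(\sum_{k=1}^n |a_k|^p\right)^{1/p} \left(\sum_{k=1}^n |b_k|^q\right)^{1/q}.
\]

Proof. Define
\[
|a|_p := \left(\sum_{k=1}^n |a_k|^p\right)^{\frac1p}, \qquad |b|_q := \left(\sum_{k=1}^n |b_k|^q\right)^{\frac1q}.
\]
If $|a|_p = 0$ or $|b|_q = 0$, then the claim is true. We may hence assume that $|a|_p \ne 0$ and $|b|_q \ne 0$. Set
\[
c_k := \frac{|a_k|}{|a|_p}, \qquad d_k := \frac{|b_k|}{|b|_q}.
\]
Then by Young's inequality (and taking the sum $\sum_{k=1}^n$)
\[
\sum_{k=1}^n |c_k d_k| \le \sum_{k=1}^n \frac{|c_k|^p}{p} + \sum_{k=1}^n \frac{|d_k|^q}{q}
= \sum_{k=1}^n \frac{|a_k|^p}{p\,|a|_p^p} + \sum_{k=1}^n \frac{|b_k|^q}{q\,|b|_q^q}
= \frac1p + \frac1q = 1
\]
and the claim follows. □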
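As a sanity check of Hölder's inequality for numbers, both sides can be computed for concrete vectors. This is a minimal sketch; the function name `holder_sides` and the sample data are hypothetical and chosen only for illustration.

```python
def holder_sides(a, b, p):
    """Return (lhs, rhs) of Hoelder's inequality for real sequences a, b and exponent p > 1."""
    q = p / (p - 1.0)  # conjugate exponent, 1/p + 1/q = 1
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    norm_a = sum(abs(x) ** p for x in a) ** (1.0 / p)  # |a|_p
    norm_b = sum(abs(y) ** q for y in b) ** (1.0 / q)  # |b|_q
    return lhs, norm_a * norm_b


lhs, rhs = holder_sides([1.0, -2.0, 3.0], [4.0, 0.5, -1.0], p=3.0)
print(lhs <= rhs)
```

For p = q = 2 and a = b both sides agree, which is the equality case of the Cauchy-Schwarz inequality.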

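23) Let $\varphi : \,]a,b[\, \to \mathbb{R}$ be a function. We say $\varphi$ has a support tangent line at $x_0 \in\, ]a,b[$, if there exists $\gamma = \gamma_{x_0} \in \mathbb{R}$ with
\[
l(x) := \varphi(x_0) + \gamma (x - x_0) \le \varphi(x), \quad \forall x \in\, ]a,b[\,.
\]
Show the following:

$\varphi$ is convex $\iff$ $\varphi$ has a support tangent line at each point of ]a, b[.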



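24) Jensen's Inequality: Let $f \in \mathcal{R}([a,b],\alpha)$, $f : [a,b] \to [m,M] \subset\, ]c,d[$, $\alpha(b) - \alpha(a) = 1$, and $\varphi : \,]c,d[\, \to \mathbb{R}$ convex. Then $\int_a^b f\,d\alpha \in\, ]c,d[$ and $\varphi \circ f \in \mathcal{R}([a,b],\alpha)$ with
\[
\varphi\left(\int_a^b f\,d\alpha\right) \le \int_a^b \varphi \circ f\,d\alpha.
\]

Proof. Since $\varphi$ is continuous, we have $\varphi \circ f \in \mathcal{R}([a,b],\alpha)$ by 4.8.8. Let
\[
x_0 := \int_a^b f\,d\alpha \in [m,M] \subset\, ]c,d[\,.
\]
By 23) $\varphi$ has a support tangent line at $x_0$, i.e. there exists $\gamma = \gamma_{x_0} \in \mathbb{R}$ with
\[
\varphi(y) \ge l(y) = \varphi(x_0) + \gamma (y - x_0), \quad \forall y \in\, ]c,d[\,.
\]
Then, by 4.8.9(d) and 4.8.7(i),
\[
\int_a^b \varphi(f(x))\,d\alpha(x) \ge \int_a^b \big(\varphi(x_0) + \gamma (f(x) - x_0)\big)\,d\alpha(x)
= \varphi(x_0) + \gamma \underbrace{\left(\int_a^b f(x)\,d\alpha(x) - x_0\right)}_{=0}
= \varphi\left(\int_a^b f\,d\alpha\right).
\]
□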

25) Show that if α is strictly increasing, then one can choose ξ ∈ ]a, b[ in the MVT of RS-integration 4.8.11.


Note: If α is not strictly increasing, then the conclusion of 25) is wrong.

Counterexample: [a, b] = [0, 1], f(x) = x, α(x) = 1 for x ∈ ]0, 1], α(0) = 0. Then, as in 4.8.21, one shows for any ξ ∈ ]0, 1]
\[
\int_0^1 x\,d\alpha = f(0) = f(0)(1-0) \ne f(\xi)(1-0).
\]


26) Let $f, g : [a,b] \to \mathbb{R}$ be continuous and $f(x) \le g(x)$, $\forall x \in [a,b]$. Suppose $f(x_0) < g(x_0)$ for some $x_0 \in [a,b]$. Show that then
\[
\int_a^b f\,dx < \int_a^b g\,dx.
\]

Proof. $h(x) := g(x) - f(x)$ is continuous in [a, b]. We have $h(x) \ge 0$ and $h(x_0) > 0$. Thus there are an interval $[c,d] \subset [a,b]$ containing more than one point and some $\varepsilon > 0$ such that $h(x) \ge \varepsilon$ for all $x \in [c,d]$. Then
\[
\int_a^b g\,dx - \int_a^b f\,dx = \int_a^b h\,dx \ge \int_c^d h\,dx \ge \varepsilon (d-c) > 0.
\]
□

27) Let $f : [a,b] \to \mathbb{R}$ be continuous, $f(x) \ge 0$, $\forall x \in [a,b]$, and $\int_a^b f\,dx = 0$. Show that then $f(x) = 0$, $\forall x \in [a,b]$.

Proof. Suppose there exists $x_0 \in [a,b]$ with $f(x_0) > 0$. Then $\int_a^b f\,dx > 0$ by 26), a contradiction. □

28) Let $f : [a,b] \to \mathbb{R}$ be continuous and
\[
\int_a^b f \cdot g\,dx = 0, \quad \forall g \in \mathcal{R}([a,b]).
\]
Show that then $f(x) = 0$, $\forall x \in [a,b]$.

Proof. Choose $g = f$. Then $\int_a^b f^2\,dx = 0$ and by 27) we get $f^2(x) = 0$, hence $f(x) = 0$, $\forall x \in [a,b]$. □

29) Let $f : [a,b] \to \mathbb{R}$ be two times differentiable and suppose that f'' is continuous on [a, b]. Show without using integration by parts that
\[
\int_a^b x f''(x)\,dx = \big(b f'(b) - a f'(a)\big) - \big(f(b) - f(a)\big).
\]

Proof. (Of course the statement follows immediately from $x f''(x) = (x f'(x))' - f'(x)$ using the FTC.) For $z \in [a,b]$, consider
\[
\phi(z) := \int_a^z x f''(x)\,dx - \big(z f'(z) - a f'(a)\big) + \big(f(z) - f(a)\big).
\]
Then, by the FTC,
\[
\phi'(z) = z f''(z) - \big(z f''(z) + f'(z)\big) + f'(z) = 0, \quad \forall z \in [a,b],
\]
and therefore, by 4.7.8,
\[
\phi(z) = \phi(a) = 0, \quad \forall z \in [a,b].
\]
The claim follows with $z = b$. □

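30) Let $f : [0,1] \to \mathbb{R}$ be continuous and $0 < a < b$. Determine the limit
\[
L := \lim_{\varepsilon \to 0+} \int_{a\varepsilon}^{b\varepsilon} \frac{f(x)}{x}\,dx.
\]

Solution: Since $\ln x$ is differentiable on $[a\varepsilon, b\varepsilon]$ and $(\ln x)' = \frac1x$ is integrable on $[a\varepsilon, b\varepsilon]$, we get by 4.8.16
\[
L = \lim_{\varepsilon \to 0+} \int_{a\varepsilon}^{b\varepsilon} f(x)\,d(\ln x)
= \lim_{\varepsilon \to 0+} f(\xi_\varepsilon)\big(\ln(b\varepsilon) - \ln(a\varepsilon)\big)
\]
for some $\xi_\varepsilon \in [a\varepsilon, b\varepsilon]$ by the MVT of integration, and hence
\[
L = \lim_{\varepsilon \to 0+} f(\xi_\varepsilon) \ln\left(\frac ba\right) = f(0) \ln\left(\frac ba\right),
\]
since $\xi_\varepsilon \to 0$ as $\varepsilon \to 0+$ and f is continuous.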
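The value $f(0)\ln(b/a)$ can be illustrated numerically. The sketch below is an assumption-laden experiment, not part of the solution: it picks f = cos, a = 1, b = 3 and uses a plain midpoint rule (the helper `scaled_integral` is hypothetical). The substitution $x = \varepsilon s$ rewrites the integral as $\int_a^b f(\varepsilon s)/s\,ds$, which avoids integrating over a tiny interval.

```python
import math


def scaled_integral(f, a, b, eps, n=2000):
    """Midpoint-rule approximation of the integral of f(x)/x over [a*eps, b*eps].

    With x = eps*s this equals the integral of f(eps*s)/s over [a, b].
    """
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        s = a + (k + 0.5) * h  # midpoint of the k-th subinterval of [a, b]
        total += f(eps * s) / s
    return total * h


# Compare with the predicted limit f(0) * ln(b/a) = cos(0) * ln(3).
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, scaled_integral(math.cos, 1.0, 3.0, eps), math.log(3.0))
```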

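31) Let f be continuous on $\mathbb{R}$. Show that
\[
\int_0^x f(t)(x-t)\,dt = \int_0^x \left(\int_0^u f(t)\,dt\right) du, \quad \forall x \in \mathbb{R}.
\]

Proof. Let $g(u) := \int_0^u f(t)\,dt$, so that $g' = f$ by the FTC. Using integration by parts (with the factor 1 as the derivative of $u \mapsto u$), we get
\[
\int_0^x 1 \cdot g(u)\,du = \big[u\,g(u)\big]_0^x - \int_0^x u f(u)\,du
= x \int_0^x f(t)\,dt - \int_0^x t f(t)\,dt
= \int_0^x f(t)(x-t)\,dt.
\]
In particular, using induction, one can show that $\forall n \in \mathbb{N}$, $x \in \mathbb{R}$
\[
\int_0^x f(t) \frac{(x-t)^n}{n!}\,dt = \int_0^x \left(\int_0^{x_n} \left(\cdots \int_0^{x_2} \left(\int_0^{x_1} f(t)\,dt\right) dx_1 \cdots\right) dx_{n-1}\right) dx_n.
\]
□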

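32) Find the following limit:
\[
\lim_{n\to\infty} \left(\frac{1}{n+1} + \frac{1}{n+2} + \dots + \frac{1}{3n}\right).
\]
Solution: It holds
\[
\frac{1}{n+1} + \frac{1}{n+2} + \dots + \frac{1}{3n}
= \frac{1}{2n}\left(\frac{1}{\frac12 + \frac{1}{2n}} + \frac{1}{\frac12 + \frac{2}{2n}} + \dots + \frac{1}{\frac12 + \frac{2n}{2n}}\right).
\]
Since $f(x) = \frac{1}{\frac12 + x}$ is continuous on [0, 1], we get $f \in \mathcal{R}([0,1])$ and, by 4.8.26,
\[
\lim_{n\to\infty} \left(\frac{1}{n+1} + \frac{1}{n+2} + \dots + \frac{1}{3n}\right)
= \lim_{n\to\infty} \sum_{k=1}^{2n} f\left(\frac{k}{2n}\right)\left(\frac{k}{2n} - \frac{k-1}{2n}\right)
= \int_0^1 \frac{1}{\frac12 + x}\,dx
= \left[\ln\left(x + \frac12\right)\right]_0^1 = \ln\frac32 - \ln\frac12 = \ln 3.
\]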
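The convergence to ln 3 is slow but visible numerically. A minimal sketch (the function name `tail_sum` is chosen here only for illustration):

```python
import math


def tail_sum(n: int) -> float:
    """Return 1/(n+1) + 1/(n+2) + ... + 1/(3n)."""
    return sum(1.0 / k for k in range(n + 1, 3 * n + 1))


# The differences to ln 3 shrink roughly like 1/(3n).
for n in (10, 100, 1000, 10000):
    print(n, tail_sum(n), math.log(3.0) - tail_sum(n))
```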

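33) For $\alpha \ge 0$ calculate
\[
L = \lim_{n\to\infty} \sum_{k=1}^n \frac{k^\alpha}{n^{\alpha+1}}.
\]

Solution: Let $f(x) = x^\alpha$. Then, by 4.8.26,
\[
L = \lim_{n\to\infty} \sum_{k=1}^n \left(\frac kn\right)^\alpha \frac1n
= \lim_{n\to\infty} \sum_{k=1}^n f\left(\frac kn\right)\left(\frac kn - \frac{k-1}{n}\right)
= \int_0^1 x^\alpha\,dx
= \left[\frac{x^{\alpha+1}}{\alpha+1}\right]_0^1 = \frac{1}{\alpha+1}.
\]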
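The same Riemann-sum argument can be checked numerically. This sketch (with the hypothetical helper `power_sum`) evaluates the sum for a few exponents and compares it with $1/(\alpha+1)$:

```python
def power_sum(alpha: float, n: int) -> float:
    """Return sum_{k=1}^{n} k**alpha / n**(alpha + 1), written as a Riemann sum of x**alpha."""
    return sum((k / n) ** alpha for k in range(1, n + 1)) / n


# Compare the Riemann sums with the limit 1/(alpha + 1).
for alpha in (0.0, 0.5, 2.0):
    print(alpha, power_sum(alpha, 10000), 1.0 / (alpha + 1.0))
```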

34) Let $f, g \in C([a,b]) := \{h : [a,b] \to \mathbb{R} \mid h \text{ continuous}\}$. Without using Hölder's inequality, show
\[
\left|\int_a^b f \cdot g\,dx\right| \le \left(\int_a^b f^2\,dx\right)^{1/2} \left(\int_a^b g^2\,dx\right)^{1/2}. \tag{$*$}
\]

Proof. By 27) applied to $f^2$, $f \in C([a,b])$, we have
\[
\int_a^b f^2\,dx = 0 \iff f(x) = 0,\ \forall x \in [a,b], \text{ i.e. } f = 0 \text{ in } C([a,b]).
\]
C([a, b]) is a vector space over $\mathbb{R}$ and it is easy to see that
\[
(f, g) := \int_a^b f \cdot g\,dx, \quad f, g \in C([a,b]),
\]
defines an inner product on C([a, b]). Thus ($*$) is just the Cauchy-Schwarz inequality 1.7.6. □

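35) Let $f : [0,1] \to \mathbb{R}$ be differentiable, f' continuous and f(0) = 0. Show that then
\[
|f(x)| \le \left(\int_0^1 \big(f'(t)\big)^2\,dt\right)^{1/2}, \quad \forall x \in [0,1].
\]

Proof. We have for $x \in [0,1]$, by the FTC and 34),
\[
|f(x)| = |f(x) - f(0)| = \left|\int_0^x 1 \cdot f'(t)\,dt\right|
\le \underbrace{\left(\int_0^x 1\,dt\right)^{1/2}}_{=\sqrt{x}\,\le\,1}\ \underbrace{\left(\int_0^x \big(f'(t)\big)^2\,dt\right)^{1/2}}_{\le \left(\int_0^1 f'(t)^2\,dt\right)^{1/2}}
\le \left(\int_0^1 \big(f'(t)\big)^2\,dt\right)^{1/2}.
\]
□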

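36) Show that
\[
f(x) := \begin{cases} x^2 \cos\left(\frac{\pi}{x^2}\right), & x \in\, ]0,1], \\ 0, & x = 0, \end{cases}
\]
is differentiable in [0, 1], but $V_0^1 f = \infty$.

Proof. We have for $x \in\, ]0,1]$
\[
f'(x) = 2x \cos\left(\frac{\pi}{x^2}\right) + \frac{2\pi}{x} \sin\left(\frac{\pi}{x^2}\right) \tag{$*$}
\]
and for $x = 0$
\[
\left|\frac{h^2 \cos\left(\frac{\pi}{h^2}\right) - 0}{h}\right| \le |h| \xrightarrow[h \to 0]{} 0,
\]
hence $f'(0) = 0$. Consider
\[
x_i := \sqrt{\frac{1}{1 + \frac i2}}, \quad i = 1, 2, \dots, 2n-2,
\]
and the intervals
\[
[x_{i+1}, x_i], \quad i = 2n-3, \dots, 2, 1.
\]
Note that $x_1 = \sqrt{2/3}$ and $x_{2n-2} = \sqrt{1/n}$. By 4.9.2 we get
\[
V_0^1 f \ge V_{\sqrt{1/n}}^{\sqrt{2/3}} f \ge \sum_{i=1}^{2n-3} |f(x_{i+1}) - f(x_i)|
= \sum_{i=1}^{2n-3} \left|x_{i+1}^2 \cos\left(\frac{\pi}{x_{i+1}^2}\right) - x_i^2 \cos\left(\frac{\pi}{x_i^2}\right)\right|
\]
\[
= \sum_{i=1}^{2n-3} \Bigg|\frac{1}{1 + \frac{i+1}{2}} \underbrace{\cos\left(\pi\left(1 + \frac{i+1}{2}\right)\right)}_{\in \{-1,0,1\}} - \frac{1}{1 + \frac i2} \underbrace{\cos\left(\pi\left(1 + \frac i2\right)\right)}_{\in \{-1,0,1\}}\Bigg|
\ge \sum_{i=1}^{2n-3} \underbrace{\frac{1}{1 + \frac{i+1}{2}}}_{= \frac{2}{i+3}} \xrightarrow[n \to \infty]{} \infty.
\]
□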
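The divergence of the variation sums in 36) can be observed numerically. The sketch below uses the partition points $x_i = \sqrt{1/(1+i/2)}$ from the proof; the function names `variation_sum` and `lower_bound` are assumptions made for this illustration.

```python
import math


def f(x: float) -> float:
    """f(x) = x^2 cos(pi / x^2) for x > 0 and f(0) = 0."""
    return x * x * math.cos(math.pi / (x * x)) if x > 0 else 0.0


def variation_sum(n: int) -> float:
    """Sum of |f(x_{i+1}) - f(x_i)| for i = 1, ..., 2n-3 with x_i = sqrt(1 / (1 + i/2))."""
    xs = [math.sqrt(1.0 / (1.0 + i / 2.0)) for i in range(1, 2 * n - 1)]  # i = 1..2n-2
    return sum(abs(f(xs[j + 1]) - f(xs[j])) for j in range(len(xs) - 1))


def lower_bound(n: int) -> float:
    """The harmonic-type lower bound sum_{i=1}^{2n-3} 2 / (i + 3) from the proof."""
    return sum(2.0 / (i + 3) for i in range(1, 2 * n - 2))


# Both columns grow without bound (logarithmically in n).
for n in (10, 100, 1000):
    print(n, variation_sum(n), lower_bound(n))
```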

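37) Show that
\[
f(x) := \begin{cases} x^2 \cos\left(\frac{\pi}{x}\right), & x \in\, ]0,1], \\ 0, & x = 0, \end{cases}
\]
satisfies $V_0^1 f < \infty$.

Proof. We have for $x \in\, ]0,1]$
\[
f'(x) = 2x \cos\left(\frac{\pi}{x}\right) + \pi \sin\left(\frac{\pi}{x}\right)
\]
and for $x = 0$
\[
\left|\frac{h^2 \cos\left(\frac{\pi}{h}\right) - 0}{h}\right| \le |h| \xrightarrow[h \to 0]{} 0,
\]
thus $f'(0) = 0$. Therefore f' is bounded and $V_0^1 f < \infty$ follows from 4.9.4. □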

38) Show: $V_a^b f < \infty \Longrightarrow$ f is bounded on [a, b].

Proof. For any $x \in [a,b]$, we have
\[
|f(x)| \le |f(x) - f(a)| + |f(a)| \le |f(b) - f(x)| + |f(x) - f(a)| + |f(a)| \le V_a^b f + |f(a)|.
\]
□


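39) We have $\forall \alpha, \beta \in \mathbb{R}$:
\[
V_a^b(\alpha f + \beta g) \le |\alpha|\,V_a^b f + |\beta|\,V_a^b g
\]
and
\[
V_a^b |f| \le V_a^b f.
\]

Proof. The first inequality holds because $\forall x, y \in [a,b]$
\[
\big|(\alpha f + \beta g)(x) - (\alpha f + \beta g)(y)\big| \le |\alpha|\,|f(x) - f(y)| + |\beta|\,|g(x) - g(y)|
\]
and the second because
\[
\big||f|(x) - |f|(y)\big| \le \big|f(x) - f(y)\big|.
\]
□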

40) Let $g : [a,b] \to [c,d]$ satisfy $V_a^b g < \infty$ and let $f : [c,d] \to \mathbb{R}$ satisfy a Lipschitz condition, i.e. there exists $L \ge 0$ with
\[
\big|f(x) - f(y)\big| \le L\,|x - y|, \quad \forall x, y \in [c,d].
\]
Then $V_a^b(f \circ g) < \infty$ and $V_a^b(f \circ g) \le L \cdot V_a^b g$.

Proof. For any partition $P = \{x_0, \dots, x_n\}$ of [a, b], we have
\[
\sum_{i=1}^n \big|f \circ g(x_i) - f \circ g(x_{i-1})\big| \le \sum_{i=1}^n L\,\big|g(x_i) - g(x_{i-1})\big| \le L \cdot V_a^b g.
\]
□

41) It holds: $V_a^b f < \infty \Longrightarrow V_a^b\big(|f|^p\big) < \infty$, $\forall p \ge 1$.

Proof. By 39) $V_a^b |f| \le V_a^b f < \infty$. In particular, by 38) f is bounded and so we may assume $f : [a,b] \to [c,d]$. The map $x \mapsto |x|^p$ has a bounded derivative on [c, d] and therefore satisfies a Lipschitz condition on [c, d]. Let L denote the Lipschitz constant. Then by 40)
\[
V_a^b\big(|f|^p\big) \le L \cdot V_a^b |f| \le L \cdot V_a^b f < \infty.
\]
□

42) We have:
\[
V_a^b f < \infty \text{ and } V_a^b g < \infty \Longrightarrow V_a^b\big(\max(f,g)\big) < \infty.
\]
Here $\max(f,g)(x) := \max\{f(x), g(x)\}$.

Proof. We have
\[
\max\{f(x), g(x)\} = \frac{|f(x) - g(x)| + f(x) + g(x)}{2} = \frac12 |f - g|(x) + \frac12 f(x) + \frac12 g(x).
\]
Hence by 39)
\[
V_a^b\big(\max(f,g)\big) \le \frac12 \Big(V_a^b |f - g| + V_a^b f + V_a^b g\Big)
\le \frac12 \Big(V_a^b f + \underbrace{V_a^b(-g)}_{= V_a^b g} + V_a^b f + V_a^b g\Big)
= V_a^b f + V_a^b g.
\]
□
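The estimate $V_a^b(\max(f,g)) \le V_a^b f + V_a^b g$ already holds for the variation sums over any fixed partition, which makes it easy to test. In the sketch below the choice f = sin, g = cos on $[0, 2\pi]$ and the helper names are assumptions made purely for illustration.

```python
import math


def variation_on_partition(func, points):
    """Sum of |func(x_i) - func(x_{i-1})| over consecutive points of a partition."""
    return sum(abs(func(points[i]) - func(points[i - 1])) for i in range(1, len(points)))


def pointwise_max(f, g):
    """Return the function x -> max(f(x), g(x))."""
    return lambda x: max(f(x), g(x))


# Uniform partition of [0, 2*pi] with 1000 subintervals.
points = [2.0 * math.pi * k / 1000 for k in range(1001)]
v_f = variation_on_partition(math.sin, points)
v_g = variation_on_partition(math.cos, points)
v_max = variation_on_partition(pointwise_max(math.sin, math.cos), points)
print(v_max <= v_f + v_g)
```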

43) Let $V_a^b f < \infty$ and g, h be as in 4.9.6, i.e.
\[
g(x) = \frac12 \big(f(a) + V_a^x f + f(x)\big), \qquad h(x) = \frac12 \big(f(a) + V_a^x f - f(x)\big).
\]
Let $g_1, h_1$ be any other increasing functions with $f = g_1 - h_1$. Then:
\[
\left.\begin{aligned} g(y) - g(x) &\le g_1(y) - g_1(x) \\ h(y) - h(x) &\le h_1(y) - h_1(x) \end{aligned}\right\} \quad \forall x, y \in [a,b] \text{ with } x \le y.
\]
In particular: $V_a^b g \le V_a^b g_1$ and $V_a^b h \le V_a^b h_1$.

Proof. By 39), $\forall x, y \in [a,b]$ with $x \le y$
\[
V_x^y f \le V_x^y g_1 + V_x^y h_1 = g_1(y) - g_1(x) + h_1(y) - h_1(x).
\]
Moreover
\[
g(y) - g(x) = \frac12 \big(V_x^y f + f(y) - f(x)\big).
\]
Hence
\[
g(y) - g(x) \le \frac12 \Big(g_1(y) - g_1(x) + h_1(y) - h_1(x) + g_1(y) - h_1(y) - \big(g_1(x) - h_1(x)\big)\Big) = g_1(y) - g_1(x).
\]
Similarly
\[
h(y) - h(x) \le h_1(y) - h_1(x).
\]
□

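44) Let f be continuous and α be increasing on [a, b]. Show that
\[
F(x) := \int_a^x f\,d\alpha, \quad x \in [a,b],
\]
satisfies $V_a^b F < \infty$.

Proof. Since f is continuous on [a, b], there exists $M \ge 0$ with $|f(x)| \le M$, $\forall x \in [a,b]$. For any partition $P = \{x_0, \dots, x_n\}$ of [a, b], we have by the MVT of RS-integration for some $\xi_i \in [x_{i-1}, x_i]$
\[
|F(x_i) - F(x_{i-1})| = \left|\int_a^{x_i} f\,d\alpha - \int_a^{x_{i-1}} f\,d\alpha\right|
= \left|\int_{x_{i-1}}^{x_i} f\,d\alpha\right|
= \big|f(\xi_i)\big(\alpha(x_i) - \alpha(x_{i-1})\big)\big|
\le M\,\big|\alpha(x_i) - \alpha(x_{i-1})\big|.
\]
Thus
\[
\sum_{i=1}^n \big|F(x_i) - F(x_{i-1})\big| \le M\big(\alpha(b) - \alpha(a)\big)
\]
and so $V_a^b F \le M(\alpha(b) - \alpha(a))$. □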

45) Let f satisfy $V_a^b f < \infty$ and $f(x) \ge m > 0$, $\forall x \in [a,b]$. Show that there are two increasing functions p, q such that
\[
f = \frac pq \quad \text{on } [a,b].
\]

Proof. By 38), $V_a^b f < \infty$ implies that f is bounded. Thus $m \le f(x) \le M$ for some $M > 0$ and all $x \in [a,b]$. Consider
\[
F(x) := \ln x, \quad x \in [m,M].
\]
Then F is Lipschitz continuous on [m, M] since $F'(x) = \frac1x$ is bounded on [m, M]. Therefore by 40), $F \circ f = \ln f$ has bounded variation on [a, b]. Using 4.9.6, we can write
\[
\ln f = g - h,
\]
where g, h are increasing on [a, b]. Thus
\[
f = \frac{e^g}{e^h}
\]
and we may choose $p = e^g$, $q = e^h$. □

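46) Let $V_a^b f < \infty$ and define
\[
\bar f(x) := \begin{cases} \frac{1}{x-a} \int_a^x f(t)\,dt, & x \in\, ]a,b], \\ 0, & x = a. \end{cases}
\]
Then show: $V_a^b \bar f < \infty$.

Proof. By 4.9.6, we know that $f = g - h$ with g, h increasing, and so
\[
\int_a^x f(t)\,dt = \int_a^x g(t)\,dt - \int_a^x h(t)\,dt, \quad x \in [a,b].
\]
It is now enough to show that if p is an increasing function on [a, b], then the same is true for
\[
\bar p(x) := \begin{cases} \frac{1}{x-a} \int_a^x p(t)\,dt, & x \in\, ]a,b], \\ 0, & x = a. \end{cases}
\]
Indeed, then by 39) and 4.9.3
\[
V_a^b \bar f \le V_a^b \bar g + V_a^b \bar h \le \bar g(b) - \bar g(a) + \bar h(b) - \bar h(a) < \infty.
\]
We have for $a < x < y < b$:
\[
\bar p(y) - \bar p(x) = \frac{1}{y-a} \int_a^y p(t)\,dt - \frac{1}{x-a} \int_a^x p(t)\,dt
= \frac{1}{y-a} \int_x^y p(t)\,dt + \underbrace{\left(\frac{1}{y-a} - \frac{1}{x-a}\right)}_{= \frac{x-y}{(y-a)(x-a)}} \int_a^x p(t)\,dt
\]
\[
\ge \underbrace{\frac{1}{y-a} \int_x^y p(x)\,dt}_{= \frac{y-x}{(y-a)(x-a)}\,p(x)(x-a) \,=\, -\frac{x-y}{(y-a)(x-a)} \int_a^x p(x)\,dt} + \frac{x-y}{(y-a)(x-a)} \int_a^x p(t)\,dt
= \underbrace{\frac{x-y}{(y-a)(x-a)}}_{\le 0} \int_a^x \underbrace{\big(p(t) - p(x)\big)}_{\le 0}\,dt \ge 0.
\]
□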


47) Show that if f is Lipschitz continuous on [a, b] with Lipschitz constant L, then $\varphi(x) := V_a^x f$ is also Lipschitz continuous on [a, b] with Lipschitz constant L.

Proof. Let $a \le x < y \le b$ and let $\{x_0, \dots, x_n\}$ be an arbitrary partition of [x, y]. Then
\[
\sum_{i=1}^n |f(x_i) - f(x_{i-1})| \le L \sum_{i=1}^n (x_i - x_{i-1}) = L(y - x),
\]
and so
\[
\varphi(y) - \varphi(x) = V_x^y f \le L(y - x) \quad \text{for } a \le x < y \le b.
\]
The latter clearly implies the Lipschitz continuity of $\varphi$ with Lipschitz constant L. □

Additional examples related to previous sections and Section 4:

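For $a < b$ define
\[
S := C([a,b]) := \{f \mid f : [a,b] \to \mathbb{R} \text{ is continuous}\}.
\]
Then S becomes a vector space over $\mathbb{R}$ with the operations, for $f_1, f_2, f \in S$ and $\alpha \in \mathbb{R}$,
\[
\left.\begin{aligned} (f_1 + f_2)(x) &:= f_1(x) + f_2(x) \\ (\alpha f)(x) &:= \alpha f(x) \end{aligned}\right\} \quad x \in [a,b].
\]
Let
\[
\|f\|_a := |f(a)|, \quad f \in S.
\]
Then:

(i) $f = 0$ (i.e. $f(x) = 0$ $\forall x \in [a,b]$) $\Longrightarrow \|f\|_a = 0$.

(ii) $\|\alpha f\|_a = |\alpha|\,|f(a)| = |\alpha|\,\|f\|_a$, $\forall f \in S$, $\alpha \in \mathbb{R}$.

(iii) $\|f + g\|_a = |f(a) + g(a)| \le |f(a)| + |g(a)| = \|f\|_a + \|g\|_a$, $\forall f, g \in S$.

But $\|\cdot\|_a$ is degenerate, since

$g(x) := x - a$, $x \in [a,b]$, satisfies $g \in S$, $g \ne 0$, but $\|g\|_a = |g(a)| = 0$.

Thus $\|\cdot\|_a$ is a degenerate norm (also called a semi-norm) on S. We have for a sequence $(f_n)_{n \ge 1}$ in S, $f \in S$ and g as above:
\[
\begin{array}{ccc}
f_n \xrightarrow[n\to\infty]{} f \text{ in } (S, \|\cdot\|_a) & \iff & f_n(a) \xrightarrow[n\to\infty]{} f(a) \\
\Updownarrow & & \Updownarrow \\
f_n \longrightarrow f + g =: \tilde f \text{ in } (S, \|\cdot\|_a) & \iff & f_n(a) \longrightarrow f(a) + \underbrace{g(a)}_{=0}
\end{array}
\]
Thus $f_n \to f$ and $f_n \to \tilde f$ in $(S, \|\cdot\|_a)$, but $\tilde f - f = g \ne 0$, so limits are not unique w.r.t. $\|\cdot\|_a$.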


49) Norms that are not equivalent:
Consider the vector space $S := C([-1,1])$. Then by 34)
\[
(f, g)_2 := \int_{-1}^1 f \cdot g\,dx, \quad f, g \in S,
\]
defines an inner product on S and so (cf. Proposition 1.7.7)
\[
\|f\|_2 := \sqrt{(f, f)_2}, \quad f \in S,
\]
defines a norm on S. Another norm on S is given by
\[
\|f\|_{\sup} := \sup_{x \in [-1,1]} |f(x)|, \quad f \in S.
\]
Indeed, one immediately verifies the norm axioms. Obviously,
\[
\|f\|_2 = \sqrt{\int_{-1}^1 |f(x)|^2\,dx} \le \sqrt2\,\|f\|_{\sup}, \quad \forall f \in S,
\]

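but no converse estimate holds. Indeed, for $\varepsilon > 0$ let
\[
f_\varepsilon(x) := \sqrt{\max\left(0, \frac1\varepsilon\left(1 - \frac{|x|}{\varepsilon}\right)\right)}.
\]
Then
\[
\|f_\varepsilon\|_2 \equiv 1 \quad \text{but} \quad \|f_\varepsilon\|_{\sup} = \sqrt{\frac1\varepsilon} \nearrow +\infty \text{ as } \varepsilon \searrow 0.
\]
Thus
\[
\nexists\, C > 0 \text{ with } \|f\|_{\sup} \le C\,\|f\|_2, \quad \forall f \in S.
\]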
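The two norms of $f_\varepsilon$ can be compared numerically. The following sketch approximates $\|f_\varepsilon\|_2$ by a midpoint rule on $[-1,1]$ and reads off $\|f_\varepsilon\|_{\sup} = f_\varepsilon(0)$; the helper names and the grid size are assumptions for this illustration, and $\varepsilon \le 1$ is assumed so that the support of $f_\varepsilon$ lies in $[-1,1]$.

```python
import math


def f_eps(x: float, eps: float) -> float:
    """f_eps(x) = sqrt(max(0, (1 - |x|/eps) / eps))."""
    return math.sqrt(max(0.0, (1.0 - abs(x) / eps) / eps))


def l2_norm(eps: float, n: int = 20000) -> float:
    """Midpoint-rule approximation of the L2 norm of f_eps on [-1, 1]."""
    h = 2.0 / n
    total = sum(f_eps(-1.0 + (k + 0.5) * h, eps) ** 2 for k in range(n))
    return math.sqrt(total * h)


def sup_norm(eps: float) -> float:
    """The supremum of f_eps is attained at x = 0."""
    return f_eps(0.0, eps)


# The L2 norm stays near 1 while the sup norm grows like 1/sqrt(eps).
for eps in (1.0, 0.1, 0.01):
    print(eps, l2_norm(eps), sup_norm(eps))
```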

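Typically, this happens since S is ∞-dimensional (see below). In particular, closed and bounded subsets of S are in general not compact (cf. for instance the Arzela-Ascoli Theorem ?? below).

To see that $S = C([-1,1])$ is not finite dimensional, consider any sequence $(x_i)_{i \in \mathbb{N}}$ of distinct points in [−1, 1], i.e. $x_i \in [-1,1]$, $\forall i \in \mathbb{N}$, and $x_i \ne x_j$ whenever $i \ne j$. Then
\[
\forall n \in \mathbb{N},\ \exists \varphi_{n,i} \in S,\ 1 \le i \le n, \text{ with } \quad
\varphi_{n,i}(x_j) = \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{else}, \end{cases} \quad (j \in \mathbb{N}).
\]
For instance: $x_i = \frac1i$, $i \ge 1$ [figure omitted].

Let $\alpha_i \in \mathbb{R}$, $1 \le i \le n$, be arbitrary and suppose
\[
f = \sum_{i=1}^n \alpha_i \varphi_{n,i} = 0 \text{ in } S, \text{ i.e. pointwise on } [-1,1].
\]
Then in particular
\[
0 = f(x_j) = \alpha_j, \quad 1 \le j \le n,
\]
$\Longrightarrow$ $(\varphi_{n,i})_{1 \le i \le n}$ are linearly independent,

$\Longrightarrow$ (since n was arbitrary) S is not finite dimensional.

Moreover, $(S, \|\cdot\|_2)$ is not complete. Illustration: [figure omitted]

Then $f_n \xrightarrow[n\to\infty]{} f$ w.r.t. $\|\cdot\|_2$, but $f \notin S$.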

Bibliography

[1] R.M. Dudley, Real Analysis and Probability, Cambridge University Press 2004.
[2] W. Rudin, Principles of Mathematical Analysis, Third Edition, McGraw-Hill 1976.
