compiled notes of course 18 - tel aviv universityostrover/teaching/18125.pdf · compiled notes of...

102
Compiled Notes of Course 18.125 Real and Functional Analysis Instructor: Yaron Ostrover Notes Author: Dahua Lin May 01, 2008

Upload: trinhtu

Post on 15-May-2018

213 views

Category:

Documents


1 download

TRANSCRIPT

Compiled Notes of Course 18.125

Real and Functional Analysis

Instructor: Yaron OstroverNotes Author: Dahua Lin

May 01, 2008

Contents

I Measure and Integration Theory 3

1 Lebesgue Measure Theory 41.1 Why Riemann Theory is Not Enough? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.1.1 Brief Review of Riemann Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41.1.2 Riemann Theory vs Lebesgue Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 41.1.3 The Basic Idea of Lebesgue Integration . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.2 Generic Measure Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.2.1 A Naive Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.2.2 σ-algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51.2.3 Measurable and Measure Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.2.4 Continuity of Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.3 Lebesgue Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.3.1 The Measure of Intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.3.2 Outer Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131.3.3 From Outer Measure to Lebesgue Measure . . . . . . . . . . . . . . . . . . . . . . . . 151.3.4 Existence of Mon-measurable Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

1.4 Borel Measurable Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201.4.1 Borel Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201.4.2 Measurable Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

2 Integration Theory 242.1 Lebesgue Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.1.1 Simple Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242.1.2 Lebesgue Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252.1.3 Additivity, Monotone Convergence Theorem, and Induced Measure . . . . . . . . . . . 272.1.4 Lebesgue Integration of Complex Functions . . . . . . . . . . . . . . . . . . . . . . . . 312.1.5 Dominated Convergence Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

2.2 Null Sets and Almost Everywhere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352.2.1 Null Sets and Complete Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352.2.2 Almost Everywhere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

2.3 Riesz Representation Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382.4 Principles of Measure Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

3 Introduction to Lp Spaces 463.1 Important Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

3.1.1 Convex Sets and Convex Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463.1.2 Jensen’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463.1.3 Holder’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473.1.4 Minkowski’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

3.2 Lp Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483.2.1 From Lp space to Lp space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483.2.2 Lp Convergence and Completeness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

3.3 Important Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503.3.1 Lusin’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

1

3.3.2 Egoroff’s Theorem and Convergence in measure . . . . . . . . . . . . . . . . . . . . . . 52

4 Product Measure 544.1 Product Measure Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

4.1.1 Product Measurable Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 544.1.2 Measure by Extension: Hahn-Kolmogorov theorem . . . . . . . . . . . . . . . . . . . . 554.1.3 Monotone Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574.1.4 Product Measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

4.2 Fubini’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

5 Complex Measures 615.1 The Space of Complex Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615.2 Lebesgue Decomposition of Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

5.2.1 Absolute continuity and Singularity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625.2.2 Lebesgue Decomposition Theorem and Radon-Nikodym Theorem . . . . . . . . . . . . 64

II Fundamentals of Functional Analysis 67

6 Review of Linear Algebra 686.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 686.2 Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 696.3 Linear Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 706.4 Quotient Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

7 Normed and Banach Spaces 737.1 Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

7.1.1 Metric and Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737.1.2 Convergence in Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

7.2 Normed Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757.2.1 Norm and Normed Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757.2.2 Seminorm and Norm of Quotient Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . 76

7.3 Banach Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

8 Hilbert Spaces 818.1 Inner Product and Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 818.2 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 838.3 Hilbert Spaces and Closed Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 848.4 Complete Orthonormal Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

9 Linear Functionals 909.1 The Space of Linear Functionals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 909.2 Dual Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

9.2.1 Bounded Linear Functionals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 919.2.2 Dual Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 929.2.3 Riesz Representation Theorem on Hilbert Space . . . . . . . . . . . . . . . . . . . . . 939.2.4 Hahn-Banach Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 949.2.5 Second Dual . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

10 Linear Operators 9710.1 The Space of Linear Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9710.2 Inverse Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9810.3 Dual Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10010.4 Convergence of Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

2

Part I

Measure and Integration Theory

3

Chapter 1

Lebesgue Measure Theory

1.1 Why Riemann Theory is Not Enough?

1.1.1 Brief Review of Riemann Integration

Let f : I → R be a real valued function defined on a real interval I = [a, b]. A partition P of I is definedby P = a = t0 < t1 < . . . < tn = b. A choice function ε is a function that chooses a point from each smallintervals in the partition: ξ(P ) = (t∗1, t

∗2, . . . , t

∗n), s.t. t∗i ∈ [ti, ti+1]∀i = 1, . . . , n. Then, we can define a sum

to approximate the integral as follows

I(f, P, ξ) =n∑

i=1

n∑i=1

f(t∗i )(ti+1 − ti), (1.1)

where n is the number of paritioned intervals in P .If the sum defined above converges as the maximum length of the partitioned intervals in P approaches

zero. The limit of the sum is called the Riemann integral of f on I, and f is said to be Riemann integrable.Formally, a real valued function f is said to be integrable on I if there exists v ∈ R such that

∀ε > 0,∃δ > 0, s.t.∀P, ∀ξ, s.t. max1≤j≤n

|tj − tj−1| < δ, |I(f, P, ξ) − v| < ε,

then v is called the Riemann integral of f on I, and f is said to be Riemann integrable.

1.1.2 Riemann Theory vs Lebesgue Theory

1. “Riemann integration mainly works for continuous functions”. An example that is not Riemann inte-grable is the Dirichlet function defined by

D(x) =

{1 x ∈ Q0 x /∈ Q

.

While, “Lebesgue integration works for every function that you can imagine”.

2. Lebesgue theory readily answers a series of important questions, especially the convergence of integra-tion. Does fn → f imply

∫fn →

∫f ?

3. Lebesgue theory is the foundation of many modern mathematical branches, including modern probabilitytheory and functional analysis.

1.1.3 The Basic Idea of Lebesgue Integration

The theoretical development of the Lebesgue integration in a rigorous manner is very technical, however, thebasic idea behind is pretty simple.

4

1.2. GENERIC MEASURE SPACE CHAPTER 1. LEBESGUE MEASURE THEORY

“In doing Riemann integration, we divide the domain of a function, while in Lebesgue integration, we tryto divide its range.” ∫

f =∑

r∈rgnf

r · sizeof{x|f(x) = r}.

The key question here is that how to define the size of a set, which leads to the study of measure.

1.2 Generic Measure Space

1.2.1 A Naive Approach

We intend to assign a nonnegative value to each subset to measure its size. This can be defined as a functionfrom subsets to nonnegative real values:

µ : 2R → [0,∞], (1.2)

where 2R refers to the set of all subsets of R.We hope that the function µ satisfies the following properties:

• µ(∅) = 0.

• µ(R) = ∞.

• (countable additivity) µ(⊔

i∈N Ai) =∑

i∈N µ(Ai).

• µ([a, b]) = b− a, ∀b ≥ a ∈ R.

• (monotonicity) A ⊆ B ⇒ µ(A) ≤ µ(B)

• (translation invariance) µ(A+ y) = µ(A), ∀y ∈ R.

Then we can construct approximation to the integration by the following series.

In(f) =∞∑

k=−∞

k

2nµ{x|f(x) ∈ [

k

2n,k + 12n

]}, (1.3)

we can see that this way of construction actually divides the range into intervals of length 1/2n. As napproches infinity, we are expecting that In approches the integral.

The Bad news is that “there is no such function µ defined on 2R that simultaneously satisfies all conditionsabove.” More strictly, simultaneous satisfaction of µ(∅) = 0, µ([a, b]) = b − a and countable additivity isimpossible for µ : 2R → [0,∞].

Clearly we want too much. To address this issue, we try to construct a function defined on only areasonable subset of 2R that satisfies all the above conditions, rather than making it defined on the entire 2R,which is too big.

1.2.2 σ-algebra

To construct such a “reasonable collection of subsets”, we introduce an algebraic system of subsets calledσ-algebra.

Definition 1.1 (σ-algebra). A σ-algebra over a set X is a non-empty class F of subsets of X that includesthe empty set and is closed under complementation and countable unions. Formally, we have

1. ∅ ∈ F

2. A ∈ F ⇒ Ac ∈ F

3. ∀i ∈ N, Ai ∈ F ⇒∪∞

i=1Ai ∈ F .

Given any non-empty set X, it is easy to verify that the following are σ-algebras:

1. 2X

5

1.2. GENERIC MEASURE SPACE CHAPTER 1. LEBESGUE MEASURE THEORY

2. {∅, X}

3. Given any A ⊂ X, {∅, A,Ac, X}. This σ-algebra is the smallest σ-algebra that contains A, denoted byσ({A}).

Proposition 1.1 (Properties of σ-algebra). For any σ-algebra F over a non-empty set X, we have

1. (inclusion of X) X ∈ F

2. (closed under intersection and union) A,B ∈ F ⇒ A ∩B ∈ F , and A\B ∈ F

3. (closed under countable intersection) ∀i ∈ N, Ai ∈ F ⇒∩n

i=1Ai ∈ F

4. (closed under finite union and intersection) ∀i = 1, . . . , n, Ai ∈ F ⇒∪n

i=1Ai ∈ F , and∩n

i=1Ai ∈F .

Proof. 1. X ∈ F follows from the fact that ∅ ∈ F and X = ∅c.

2. A,B ∈ F ⇒ Ac ∈ F , Bc ∈ F , hence A ∩B = (Ac ∪Bc)c ∈ F and A\B = A ∩Bc ∈ F .

3. ∀i ∈ N, Ai ∈ F ⇒ ∀i ∈ N, Aci ∈ F . Hence,

∞∩i=1

Ai =

( ∞∪i=1

Aci

)c

∈ F .

4. Given Ai ∈ F , i = 1, . . . n, we can augment the finite collection of subsets to an countable collection byletting Ai = ∅,∀i > n. Clearly, all sets in the augmented collection are in F . Then,

n∪i=1

Ai =∞∪

i=1

Ai ∈ F .

The closeness of finite intersection immediately follows from that of finite union, as

n∩i=1

Ai =

(n∪

i=1

Aci

)c

∈ F .

1.2.3 Measurable and Measure Spaces

Definition 1.2 (Measurable space). A non-empty set X together with a σ-algebra F defined over it is calleda measurable space, denoted by (X,F).

Elements of F are said to be F-measurable, or simply measurable if F is clear from context.

Definition 1.3 (Measure). A measure of a measurable space (X,F) is a map µ : F → [0,∞] such that

1. µ(∅) = 0;

2. (countable additivity) µ(⊔∞

i=1Ai) =∑∞

i=1 µ(Ai) for disjoint sets A1, A2, . . . in F .

Note that the first condition is required to prevent the trivial construction that assigns infinity to everyset in F .

Any measurable space admits a measure. A trivial example is to let µ(A) = 0 for each A ∈ F .

Definition 1.4 (Measure space). A measurable space (X,F) together with a measure µ is called a measurespace, denoted by (X,F , µ).

Proposition 1.2 (Properties of measure space). Given a measure space (X,F , µ), the measure µ satisfiesthe following statements:

1. (finite additivity) Given n disjoint sets A1, . . . , An in F , µ(⊔n

i=1Ai) =∑n

i=1 µ(Ai).

6

1.2. GENERIC MEASURE SPACE CHAPTER 1. LEBESGUE MEASURE THEORY

2. (monotonicity) ∀A,B ∈ F s.t.A ⊆ B, µ(A) ≤ µ(B).

3. (measure of difference set) ∀A,B ∈ F s.t.A ⊆ B, µ(A) <∞, µ(B\A) = µ(B) − µ(A).

4. (inclusion-exclusion principle) ∀A,B ∈ F s.t.µ(A∩B) < +∞, µ(A∪B) = µ(A)+µ(B)−µ(A∩B).

Proof. 1. Augment the finite collection of subsets to a countable collection by letting Ai = ∅ for all i > n.It is obvious that the subsets in the augmented collection are all in F and mutually disjoint, thuscountable additivity can be applied as follows,

µ

(n⊔

i=1

Ai

)= µ

( ∞⊔i=1

Ai

)=

∞∑i=1

µ(Ai) =n∑

i=1

µ(Ai).

The last equality is from the fact that µ(Ai) = µ(∅) = 0,∀i > n.

2. We can write B as B = A t (B\A), where B\A ∈ F because A,B ∈ F . From the finite additivityproved above, we have

µ(B) = µ(A) + µ(B\A) ≥ µ(A),

by nonnegativity of measure.

3. From the formula above, we haveµ(B\A) = µ(B) − µ(A)

whenever µ(A) <∞. Note that µ(A) <∞ is indispensable as ∞−∞ is undefined in the extended realsystem.

4. Rewrite A ∪B asA ∪B = A t (B\A) = A t (B\(A ∩B)).

Clearly, B\(A∩B) ∈ F and it has no overlap with A. By finite additivity and the measure of differenceset proved above, we have

µ(A ∪B) = µ(A) + µ(B\(A ∩B)) = µ(A) + µ(B) − µ(A ∩B).

Here, the condition that µ(A ∩B) <∞ is necessary due to the reason given above.

1.2.4 Continuity of Measure

To study the continuity of a measure, we first define the monotonical sequences of sets.

Definition 1.5 (Monotonical sequence of sets). A sequence of sets (En)∞n=1 is said to increases to E if∀i ∈ N, Ei ⊆ Ei+1, and

∪∞i=1Ei = E, denoted by Ei ↑ E. A sequence of sets (En)∞n=1 is said to decreases

to E if ∀i ∈ N, Ei ⊇ Ei+1, and∩∞

i=1Ei = E, denoted by Ei ↓ E.

Lemma 1.1 (Continuity of measure). Given a measure space (X,F , µ), let (Ei)∞i=1 be a sequence of sets inF , then

1. En ↑ E ⇒ µ(E) ⇒ limn→∞ µ(En);

2. En ↓ E, and ∃Ei s.t. µ(Ei) <∞ ⇒ µ(E) = limn→∞ µ(En)

Proof. 1. This statement is proved by three steps: express the limit set as a disjoint union → write itsunion as a series by countable additivity → prove the convergence of the series.

(a) First, we prove that

E =∞∪

i=1

Ei = E1 t

( ∞⊔i=1

Ei+1\Ei

)(1.4)

For conciseness, we denote the set defined in the right hand by E. To establish the equality, weshow both E ⊆ E and E ⊆ E. The former follows directly from the fact that E1 ⊆ E andEi\Ei−1 ⊆ Ei ⊆ E for all i = 1, 2, . . ..

7

1.2. GENERIC MEASURE SPACE CHAPTER 1. LEBESGUE MEASURE THEORY

Now, we prove the other direction. For arbitrary x ∈ E, there exists Ei such that x ∈ Ei. Let k bethe minimum integer such that x ∈ Ek. If k = 1, then x ∈ E1, otherwise k > 1 and x ∈ Ek\Ek−1.In both cases, it is obvious that x ∈ E. As x is arbitrarily choosen from E, we have E ⊆ E.Furthermore, we need to verify that the sets in the right hand side are disjoint. For each positiveinteger i, we have E1 ⊆ Ei, it follows that

E1 ∩ (Ei+1\Ei) = (E1 ∩ Eci ) ∩ Ei+1 = ∅ ∩ Ei+1 = ∅.

In addition, for any i < j ∈ nsp, we have Ei ⊆ Ei+1 ⊆ Ej , and thus

(Ei+1\Ei) ∩ (Ej+1\Ej) = Eci ∩ (Ei+1 ∩Ec

j ) ∩Ej+1 = Eci ∩ ∅ ∩Ej+1 = ∅.

Therefore, the sets {E1, Ei+1\Ei,∀i ∈ N} are mutually disjoint.

(b) Based on (1.4), we have the following by countable additivity

µ(E) = µ(E1) +∞∑

i=1

µ(Ei+1\Ei) (1.5)

On the other hand, it can be easily shown by induction that

Ek = E1 t

(k−1⊔i=1

Ei+1\Ei

). (1.6)

by finite additivity, we have

µ(Ek) = µ(E1) +k−1∑i=1

µ(Ei+1\Ei). (1.7)

Comparing (1.5) and (1.7), we can see that µ(Ek) is a finite partial sum of the series µ(E).

(c) When µ(E) < +∞, the series (1.4) is absolutely convergent (due to nonnegativity of measure), itfollows that limk→∞ µ(Ek) → µ(E). Otherwise, µ(E) → ∞, then given any v > 0, there is k ∈ Nsuch that the partial sum in (1.7) is larger than v. Hence, limk→∞ µ(Ek) = ∞ = µ(E).

2. Because Ei ⊇ Ei+1, ∀i, it can be easily shown that for any k,

E =∞∩

i=1

Ei =∞∩

i=k

Ei.

Since there exists k0 such that µ(Ek0) < ∞ and thus µ(Ei) < ∞ for all i ≥ k0, we can express theintersection of any sequence of the given condition as the intersection of sequence of finite-measure sets.Hence, we assume µ(E1) < +∞ without lossing generality.

(a) First of all, we prove the following identity

E = E1 ∩

( ∞∩i=1

(Ei\Ei+1)c

)= E1\

( ∞∪i=1

(Ei\Ei+1)

).

We only need to prove the first equality, while the second one immediately follows.Let G =

∩∞i=1(Ei\Ei+1)c, then E ⊆ E1 ∩ G follows from the fact that E ⊆ E1 and E ⊆ Ei+1 ⊆

(Ei\Ei+1)c. Now, we prove the other direction. For arbitrary x ∈ E1 ∩ G, we have x ∈ E1 andx ∈ Ei\Ei+1 for all i ∈ N. It can be easily shown by induction that

Ek = E1 ∩

(k−1∩i=1

(Ei\Ei+1)c

). (1.8)

Hence, x ∈ Ek for all k ∈ N. As a result, x ∈ E =∩∞

i=1Ei, and thus E1 ∩G ⊆ E.

8

1.3. LEBESGUE MEASURE CHAPTER 1. LEBESGUE MEASURE THEORY

In addition, for all i < j ∈ N we have (Ei\Ei+1) ∩ (Ej\Ej+1) due to Ej ⊆ Ei+1. Hence, the setsin form of Ei\Ei+1 are disjoint. Furthermore,

∪∞i=1(Ei\Ei+1) ⊆ E1 due to Ei\Ei+1 ⊆ Ei ⊆ E1.

Therefore, (1.8) can be rewritten as

E1\E =∞⊔

i=1

(Ei\Ei+1). (1.9)

And from (1.8), we have

E1\Ek =k−1⊔i=1

(Ei\Ei+1). (1.10)

(b) From (1.9) and (1.10), we have (E1\Ek) ↑ (E1\E). Applying the conclusion for increasing sequencesets that we have just proved, we get

µ(E1\E) = limn→∞

µ(E\En). (1.11)

As E ⊆ E1, En ⊆ E1 for all n ∈ N,and µ(E1) <∞, the measure of E and En,∀n are finite. Then,the above formula can be further written into

µ(E1) − µ(E) = limn→∞

(µ(E1) − µ(En)) = µ(E1) − limn→∞

µ(En) (1.12)

It immediately leads to µ(E) = limn→∞ µ(En).

It is important to note that the existence of a finite-measure set Ei is crucial for the second statement inthe above lemma. Here, I give an example. Consider the sequence of sets (Ei)∞i=1 with Ei = (i,∞). Clearly,µ(Ei) = ∞ for each i ∈ N, however, this sequences obviously decreases to E = ∅, and thus µ(E) = 0, whichis not the limit of µ(Ei).

1.3 Lebesgue Measure

In the following, we are trying to construct a measure on R, which is called the Lebesgue measure, denotedby m. The goal is that Lebesgue measure should satisfy the following properties

1. m(interval) = len(interval);

2. translation invariance: m(x+A) = m(A);

3. σ-additivity.

The plan of constructing Lebesgue measure comprises three stages:

1. define the measure m on some simple sets, i.e. intervals;

2. extend the definition to all reasonable sets by approximating them with finite or countable union ofintervals;

3. restrict to a certain σ-algebra to achieve σ-additivity.

1.3.1 The Measure of Intervals

Definition 1.6 (Semi-algebra). A collection of subsets S ⊆ 2X is called a semi-algebra over X if it satisfies

1. ∅ ∈ S and X ∈ S;

2. closed under intersection: A,B ∈ S ⇒ A ∩B ∈ S;

9

1.3. LEBESGUE MEASURE CHAPTER 1. LEBESGUE MEASURE THEORY

3. if A ∈ S, then Ac can be expressed as finite union of sets in S, as

X\A =N∪

i=1

Bi

where Bi ∈ S, ∀i = 1, . . . , N .

Consider the set of all intervals in R, which is defined as

Int = {I is in form of 〈a, b〉|a, b ∈ [−∞,∞], a ≤ b}.

Here, we say that a set I ⊆ R is form of 〈a, b〉 if I is either of the following: [a, b], (a, b], [a, b) or (a, b).

Proposition 1.3. The set of intervals Int is a semi-algebra.

Proof. 1. ∅ = (a, a) for arbitrary a ∈ R, thus ∅ ∈ Int, and R = (−∞,+∞) ∈ Int.

2. Given arbitrarily two intervals respectively in form of 〈a1, b1〉 and 〈a2, b2〉, let a′ = max(a1, a2) andb′ = min(b1, b2. If a′ ≤ b′, their intersection is in form of 〈a′, b′〉, otherwise their intersection is ∅. Inboth cases, the intersection is in Int, thus Int is closed under intersection.

3. For I = [a, b] ∈ Int, we have Ic = (−∞, a) ∪ (b,∞) ∈ Int. Similar argument shows that the thirdcondition also applies to other interval forms: [a, b), (a, b] and (a, b).

We define the length of an interval by l(〈a, b〉) = b− a.

Lemma 1.2. The length of intervals defined over Int satisfies σ-additivity, that is if I ∈ Int and I1, I2, . . . ∈Int, and {Ii}∞i=1 are disjoint, then

I =∞⊔

k=1

Ik ⇒ l(I) =∞∑

k=1

l(Ik). (1.13)

Proof. 1. First of all, we proof that the length satisfies finite additivity, that is

∀I, Ik ∈ S, I =N⊔

k=1

Ik ⇒ l(I) =N∑

k=1

l(Ik).

Let I be in form of 〈a, b〉 with a ≤ b and Ik be in form of 〈ak, bk〉 with ak ≤ bk. In the case whereI = tN

k=1Ik = ∅, we must have I = ∅ and Ik = ∅,∀k = 1, . . . , N . Then, l(I) = 0 and∑N

k=1 l(Ik) =∑Nk=1 0 = 0. The statement holds. In the following, we assume I 6= ∅, and Ik 6= ∅ for all k = 1, . . . , N .

(If there are some empty sets in {Ik}Nk=1 we can simply remove them without affecting the equality as

the length of empty sets are zeros.)

We rearrange Ik in the order of ak such that a1 ≤ a2 ≤ · · · ≤ aN . Now we show that this orderedsequence of intervals satisfies the following properties.

(a) a1 = a. If a1 < a, then for any x ∈ (a1, a), x ∈ I1 but x /∈ I; if a1 > a, then for any x ∈ (a, a1),x ∈ I but x /∈ Ik,∀k = 1, . . . , N , and thus x is not in their union. Hence, for I =

⊔Nk=1 Ik, it is

necessary that a1 = a.(b) bN = b. The proof of this is similar to above. If bN < b, then for any x ∈ (bN , b), x ∈ I but

x /∈⊔N

k=1 Ik; or if bN > b, then for any x ∈ (b, bN ), x ∈ Ik but x /∈ I.(c) bk = ak+1 for all k = 1, . . . , N − 1. If bk < ak+1, then for any x ∈ (bk, ak+1), it has a = a1 ≤

ak ≤ bk < x < ak+1 ≤ aN ≤ bN = b, thus x ∈ I, but it is clear that x is not in Ik for any k.If bk > ak+1, then the set (ak+1, bk), which is contained in both Ik and Ik+1, is not empty. Thisviolates the condition that Ik and Ik+1 are disjoint.

Based on the three properties, we have

N∑i=1

l(Ik) =N∑

i=1

(bk − ak) = −a1 +N−1∑k=1

(bk − ak+1) + bN = bN − a1 = b− a = l(I). (1.14)

Hence, the finite additivity is established.

10

1.3. LEBESGUE MEASURE CHAPTER 1. LEBESGUE MEASURE THEORY

2. Then, we prove countable sub-additivity.That is to prove

∀I, Ik ∈ Int, I =∞⊔

k=1

Ik ⇒ l(I) =∞∑

k=1

l(Ik).

In doing this, we first prove it for closed and bounded interval I = [a, b]. Let Ik be in form of 〈ak, bk〉with ak ≤ bk. Given arbitrary ε > 0, we set I ′k = (a′k, b

′k) with a′k = ak−2−(k+1)ε and b′k = bk +2−(k+1)ε.

Clearly, Ik ⊂ I ′k, and thus I ⊆∩∞

k=1 I′k.

The collection {I ′k}∞k=1 constitutes a open cover of I. And since I = [a, b] is closed and bounded, thereis an finite sub-cover of I due to Heine-Borel theorem. For convenience of notation, we denote the setsin this finite sub-cover by {Jk}N

k=1, which are respectively (ck, dk).

Then for any x ∈ I, there exists some 1 ≤ k ≤ N such that x ∈ Jk. From the finite sub-cover, weconstruct a sub collection as follows. In the first step, we choose k1 such that a ∈ Jk1 . The selectionprocedure continues as follows, after selecting Jki , if dki > b then it is done, otherwise we select ki+1

such that dki ∈ Jki+1 . As the sub-cover is finite, this procedure can be finished with finite steps. It iseasy to see that the selected collection has: ck1 < a, cki < cki+1 < dki ,∀i = 1, . . . , N ′ − 1, and dkN′ > b,where N ′ is the number of the selected sets. So, we have

N ′∑i=1

l(Jki) =N ′∑i=1

(dki − cki) = −ck1 +N ′−1∑i=1

(dk − ck+1) + dkN′ > dkN′ − ck1 > b− a = l([b, a]). (1.15)

On the other hand,

N ′∑i=1

l(Jki) ≤N∑

k=1

l(Jk) ≤∞∑

k=1

l(I ′k) =∞∑

k=1

(l(Ik) + 2−kε) =

( ∞∑k=1

l(Ik)

)+ ε. (1.16)

Combining the above two inequalities, we have

l([b, a]) <

( ∞∑k=1

l(Ik)

)+ ε. (1.17)

This holds for arbitrary positive number ε, hence

l([b, a]) ≤∞∑

k=1

l(Ik). (1.18)

Now, the sub-additivity is established for any close interval [a, b]. we then continue to show this forother forms of intervals.

For I = [a, b) =⊔∞

k=1 Ik, and consider a family of close sets Aε = [a, b − ε] where 0 < ε < b − a. It iseasy to see that

⊔∞k=1 Ik covers Aε defined above. Since Aε is a close interval, from the sub-additivity

for close interval proved above, we have

b− a− ε = l(Aε) ≤∞∑

k=1

l(Ik), ∀ ε s.t. 0 < ε < b− a (1.19)

thus∞∑

k=1

l(Ik) ≥ b− a = l(I). (1.20)

With similar argument, we can as well establish sub-additivity for (a, b] and (a, b).

3. We finally prove inequality in opposite direction, that is

∀I, Ik ∈ Int, I =∞⊔

k=1

Ik ⇒ l(I) ≥∞∑

i=1

l(Ik).

11

1.3. LEBESGUE MEASURE CHAPTER 1. LEBESGUE MEASURE THEORY

First of all, we show that for any N > 0, I can be written as

I =

(N⊔

k=1

Ik

)⊔(M⊔i=1

Ji

)(1.21)

for some disjoint intervals J1, . . . , JM ∈ Int. From De-Morgan’s rule,

I\

(N∪

k=1

Ik

)= I ∩

(N∪

k=1

Ik

)c

= I ∩

(N∩

k=1

Ick

). (1.22)

As Int is a semi-algebra, for each Ik we have

Ick =

nk⊔j=1

Bk,j ,

for some disjoint intervals Bk,j ∈ Int, and thus

N∩k=1

Ick =

N∩k=1

nk∪j=1

Bk,j

=n1∪

j1=1

· · ·nN∪

jN=1

(B1,j1 ∩B2,j2 ∩ · · · ∩BN,jN) (1.23)

For two distinct terms in this union, Bj1 ∩ · · · ∩BjNand Bj′

1∩ · · · ∩Bj′

N, there exists l such that jl 6= j′l ,

thus Bl,jl∩ Bl,j′

l= ∅, and these two distinct terms are disjoint. Hence, we can write this union by

rearranging tht termsN∩

k=1

Ick =

M⊔i=1

Ci. (1.24)

Here, M = n1n2 · · ·nN . Ci are derived by reindexing the terms in (1.23), and thus they are disjoint.Pluging this back into (1.22), we have

I\

(N∪

k=1

Ik

)= I ∩

(M⊔i=1

Ci

)=

M⊔i=1

(I ∩ Ci) =M⊔i=1

Ji, (1.25)

where Ji = I ∩ Ci ∈ Int for i = 1, . . . ,M . Note that Ik are mutually disjoint, and from (1.25), tMi=1Ji

are contained in I and disjoint from tNk=1Ik, so

I =

(N⊔

k=1

Ik

)⊔(M⊔i=1

Ji

), (1.26)

where Ji ∈ Int for i = 1, . . . ,M . The identity of (1.21) is established. Then, from finite additivityproved above, we have

l(I) =N∑

i=1

l(Ik) +M∑

j=1

l(Jk) ≥N∑

i=1

l(Ik). (1.27)

Note that this holds for any N > 0. Taking the limit as N → ∞, we get

l(I) ≥ limN→∞

N∑k=1

l(Ik) =∞∑

k=1

l(Ik). (1.28)

Now, both direction of the inequality has been proved, as a result, the countable additivity of length inInt is thus established.

Corollary 1.1. Let I ∈ Int be an interval, and {Ji}∞i=1 with Ji ∈ Int for all i ∈ N be a cover of I, i.e.I ⊆

∪∞i=1 Ji, then

l(I) ≤∞∑

i=1

l(Ji).

12

1.3. LEBESGUE MEASURE CHAPTER 1. LEBESGUE MEASURE THEORY

Proof. Let K1 = J1 and Ki = Ji\Ji−1 for i ≥ 2. It is easy to see that the sets in {Ki}∞i=1 are disjoint. And,since Int is a semi-algebra, each Ki can be written into a finite disjoint union of intervals as

Ki =ni⊔

j=1

Li,j . (1.29)

We extend the definition of length to finite disjoint union of intervals by l(⊔n

k=1 Ik) =∑n

k=1 l(Ik), then

l(Ki) =ni∑

j=1

l(Li,j). (1.30)

On the other hand, Ji = (Ji ∩ Ji−1) tKi, and Ji ∩ Ji−1 ∈ Int, thus by additivity of length,

l(Ji) = l(Ji ∩ Ji−1) + l(Ki) ≥ l(Ki). (1.31)

Now, as I ⊆⊔∞

i=1Ki, we can write

I =∞⊔

i=1

(I ∩Ki) =∞⊔

i=1

ni⊔j=1

(I ∩ Li,j). (1.32)

Here, I ∩ Li,j are intervals, then by σ-additivity of lengths, we have

l(I) =∞∑

i=1

ni∑j=1

l(I ∩ Li,j) ≤∞∑

i=1

ni∑j=1

l(Li,j) =∞∑

i=1

l(Ki) ≤∞∑

i=1

l(Ji). (1.33)

1.3.2 Outer Measure

Definition 1.7 (Outer measure). A function µ∗ : 2X → [0,∞] is said to be an outer measure over X if itsatisfies

1. µ∗(∅) = 0;

2. µ∗(A) ≤ µ∗(B) whenever A ⊆ B;

3. (sub-additivity) µ∗(∪∞

i=1Ai) ≤∑∞

i=1 µ∗(Ai).

Definition 1.8 (Lebesgue’s outer measure). Lebesgue’s outer measure is a function m∗ : 2X → [0,∞]given by

m∗(A) = inf

{ ∞∑k=1

l(Ik) | A ⊆∞∪

i=1

Ik, Ik ∈ Int

}.

Proposition 1.4. Lebesgue’s outer measure is an outer measure over R.

Proof. We proof Lebesgue’s outer measure satisfies the three conditions of outer measure.

1. First of all, Lebesgue’s outer measure is non-negative, which follows from the non-negativity of intervallength. For any a ∈ R, we can write

∅ =∞∪

i=1

Ai,

with Ai = (a, a). Then,

m∗(∅) ≤∞∑

i=1

l(Ai) =∞∑

i=1

0 = 0.

So, m∗(∅) = 0.

2. When A ⊆ B, any cover of B is also a cover of A, then based on the definition of m∗ as an infimium,m∗(A) ≤ m∗(B) directly follows.

13

1.3. LEBESGUE MEASURE CHAPTER 1. LEBESGUE MEASURE THEORY

3. Given arbitrary ε > 0, for any i ∈ N, there exists I(j)i ⊆ X for all j = 1, . . . ,∞ such that

Ai ⊆∞∪

j=1

I(j)i ,

and∞∑

j=1

m∗(I(j)i ) < m∗(Ai) +

ε

2i.

Hence,∞∪

i=1

Ai ⊆∞∪

i=1

∞∪j=1

I(j)i ,

and∞∑

i=1

∞∑j=1

m∗(I(j)i ) <

∞∑i=1

(m∗(Ai) +

ε

2i

)=

∞∑i=1

m∗(Ai) + ε.

Note that countable union of countable set is countable, then from the definition of m∗, we have

m∗

( ∞∪i=1

Ai

)<

∞∑i=1

m∗(Ai) + ε,

for arbitrary ε > 0. Hence,

m∗

( ∞∪i=1

Ai

)≤

∞∑i=1

m∗(Ai).

The following lemma shows that the definition of outer measure is consistent with the length of intervals.

Lemma 1.3. Let A = t∞k=1Ik where Ik ∈ Int for all k ∈ N, then m∗(A) =

∑∞i=1 l(Ik). In particular, for each

interval I ∈ Int, we have m∗(I) = l(I).

Proof. First, m∗(A) ≤∑∞

i=1 l(Ik) immediately follows from the definition of outer measure. So, to establishthe equality, it suffices to show

m∗(A) ≥∞∑

i=1

l(Ik).

For arbitrary ε > 0, there exists a cover of of A by {Ji}∞i=1 with Ji ∈ Int, ∀i ∈ N, such that

∞∑i=1

l(Ji) ≤ m∗

( ∞⊔k=1

Ik

)+ ε. (1.34)

Since A = t∞k=1Ik ⊆

∪∞i=1 Ji, we have Ik ⊆

∪∞i=1 Ji ∩ Ik. Note here that each Ji ∩ Ik is also an interval, hence

l(Ik) ≤∞∑

i=1

l(Ji ∩ Ik), ∀k ∈ N. (1.35)

It follows that∞∑

k=1

l(Ik) ≤∞∑

k=1

∞∑i=1

l(Ji ∩ Ik) =∞∑

i=1

∞∑k=1

l(Ji ∩ Ik). (1.36)

(TODO: the exchange of summation order need to be justified).Note that Ji ⊇

⊔∞k=1(Ji ∩ Ik), we have

l(Ji) ≥∞∑

k=1

l(Ji ∩ Ik). (1.37)

14

1.3. LEBESGUE MEASURE CHAPTER 1. LEBESGUE MEASURE THEORY

Combining (1.36), (1.37), and (1.34), we get

∞∑k=1

l(Ik) ≤∞∑

i=1

l(Ji) ≤ m∗

( ∞⊔k=1

Ik

)+ ε. (1.38)

As ε > 0 is arbitrary,∞∑

k=1

l(Ik) ≤ m∗

( ∞⊔k=1

Ik

)= m∗(A). (1.39)

Proposition 1.5 (Translation-Invariance of m∗). The Lebesgue’s outer measure m∗ is translation invariant,i.e. for any E ⊆ R and any x ∈ R, m∗(E) = m∗(E + x).

Proof. Note that fact that {Ik}∞k=1 is a cover of E ⊆ R, if and only if {Ik + x}∞k=1 is a cover of E + x. Inaddition, length of intervals is translation invariant, i.e. l(I + x) = l(I) for any I ∈ Int and xinR. Then, wehave for any x ∈ R,

m∗(E) = inf

{ ∞∑k=1

l(Ik)|E ⊆∞∪

k=1

Ik, Ik ∈ Int

}

= inf

{ ∞∑k=1

l(Ik)|E + x ⊆∞∪

k=1

(Ik + x), Ik ∈ Int

}

= inf

{ ∞∑k=1

l(Ik + x)|E + x ⊆∞∪

k=1

(Ik + x), Ik ∈ Int

}= m∗(E + x). (1.40)

Proposition 1.6. The Lebesgue’s outer measure of countable set in R is 0.

Proof. Let A = {x1, x2, . . .} ∈ R be a countable set, given arbitrary ε > 0, we can construct a collectionof intervals that cover A as follows. Let Cε = {Ik} with Ik = (xk − 2−kε, xk + 2−kε). It is obvious thatA ⊆

∪∞k=1 Ik. The total length of this collection is

∞∑i=1

l(Ik) = 2∞∑

i=1

2−kε = 2ε. (1.41)

Hence, m∗(A) ≤ 2ε for any ε > 0, as a result, m∗(A) = 0.

As immediate corollary of this proposition, we have m∗(Q) = 0 and the outer measure of countable unionof countable sets is zero.

Proposition 1.7. The Lebesgue’s measure of Cantor set is zero.

Proof. Here gives a brief sketch of the proof. Consider the construction of Cantor set as an infinite refinementprocess. At the k-step, it ends up with 2k intervals with total length

(23

)k. The intervals derived after any

finite steps form a cover of the Cantor set. Therefore, by the definition of m∗, we have m∗(C) ≤(

23

)k for anyk ∈ N, where C denotes the Cantor set. Taking k → ∞ results in m∗(C) = 0.

1.3.3 From Outer Measure to Lebesgue Measure

We have defined an outer measure and shown its consistency with the definition of length on intervals. In thefinal step, we are going to restrict the domain of the outer measure to achieve σ-additivity, which eventuallyleads to the Lebesgue measure on R.

The restricted domain is characterized by the following criterion.

15

1.3. LEBESGUE MEASURE CHAPTER 1. LEBESGUE MEASURE THEORY

Definition 1.9 (Lebesgue measurable set). A subset E ⊆ R is said to be Lebesgue measurable if it satisfiesthe Caratheodory’s condition given by

∀T ⊆ R, m∗(T ) = m∗(T ∩ E) +m∗(T ∩Ec).

In the following, we use B0 to denote the collection of all Lebesgue measurable sets for conciseness. And,we denote the restriction of the outer measure to B0 by m, i.e. m = m∗|B0 . We will show later that B0 is aσ-algebra, and m is a measure over B0.

Theorem 1.1 (Properties of Lebesgue measurable sets). The Lebesgue measurable sets defined above have

1. ∅,R ∈ B0 with m(∅) = 0 and m(R) = ∞.

2. E ∈ B0 ⇒ Ec ∈ B0.

3. E1, E2 ∈ B0 ⇒ E1 ∪ E2, E1 ∩ E2 ∈ B0, which immediately follows that E1\E2 ∈ B0. By induction, wefurther have B0 is closed under finite intersection and finite union.

4. Let {Ei}Ni=1 ⊂ B0 be disjoint, then m∗

(T ∩

⊔Ni=1Ei

)=∑N

i=1m∗(T ∩ Ei). This is a generalization of

the Caratheodory’s condition.

5. m is translation invariant, i.e. E ∈ B0 ⇒ m(E + x) = m(E), ∀x ∈ R.

6. B0 is closed under countable union and intersection. {Ei}∞i=1 ⊂ B0 ⇒∪∞

i=1Ei,∩∞

i=1Ei ∈ B0.

7. m is countable additive. m (⊔∞

i=1Ei) =∑∞

i=1m(Ei) for any disjoint collection of Ei in B0.

The properties given above immediately leads to the definition of Lebesgue measure.

Definition 1.10 (Lebesgue measure). The collection B0 of all Lebesgue measurable sets is a σ-algebra, andm defined above is a measure over B0, called the Lebesgue measure. The triple (R,B0,m) is called theLebesgue measure space on R.

Now, we prove the theorem.

Proof. Let B0 be Lebesgue measurable sets, i.e. the sets satisfying the Caratheodory’s condition, and m bethe restriction of outer measure to B0.

1. For any T ⊆ R,m∗(T ∩ ∅) +m∗(T ∩ ∅c) = m∗(∅) +m∗(T ) = m∗(T ), (1.42)

andm∗(T ∩ R) +m∗(T ∩ Rc) = m∗(T ) +m∗(∅) = m∗(T ). (1.43)

So, ∅,R ∈ B0, and thus m(∅) = m∗(∅) = 0 and m(R) = m∗(R) = ∞.

2. For any T ⊆ R, and E ∈ B0,

m∗(T ∩Ec) +m∗(T ∩ (Ec)c) = m∗(T ∩ Ec) +m∗(T ∩E) = m∗(T ), (1.44)

so Ec ∈ B0.

3. For any T ⊆ R, and E1, E2,∈ B0, We first prove that E1 ∪ E2 ∈ B0. Note that

T ∩ (E1 ∪ E2) = T ∩ (E1 ∪ (E2 ∩ Ec1)) = (T ∩ E1) ∪ (T ∩Ec

1 ∩ E2). (1.45)

By sub-additivity of m∗, we have

m∗(T ∩ (E1 ∪E2)) ≤ m∗(T ∩ E1) +m∗(T ∩ Ec1 ∩E2). (1.46)

On the other hand, that E2 ∈ B0 results in

m∗(T ∩Ec1 ∩E2) +m∗(T ∩Ec

1 ∩ Ec2) = m∗(T ∩Ec

1). (1.47)

16

1.3. LEBESGUE MEASURE CHAPTER 1. LEBESGUE MEASURE THEORY

Combining (1.46) and (1.47), and considering E1 ∈ B0, we have

m∗(T ∩ (E1 ∪E2)) +m∗(T ∩ (E1 ∪ E2)c)=m∗(T ∩ (E1 ∪ E2)) +m∗(T ∩Ec

1 ∩ Ec2)

≤m∗(T ∩ E1) +m∗(T ∩ Ec1 ∩ E2) +m∗(T ∩ Ec

1 ∩Ec2)

=m∗(T ∩ E1) +m∗(T ∩ Ec1)

=m∗(T ). (1.48)

The opposite direction of the inequality, that is

m∗(T ) ≤ m∗(T ∩ (E1 ∪E2)) +m∗(T ∩ (E1 ∪ E2)c),

directly follows from sub-additivity of m∗. Hence, the equality is established, and E1 ∪E2 ∈ B0.

While the closeness under intersection can be directly obtained based on that of set complement andunion. The argument is briefly given as follows. Since E1, E2 ∈ B0, Ec

1, Ec2 ∈ B0, then E1 ∩ E2 =

(Ec1 ∪ Ec

2)c ∈ B0.

4. We prove this property by induction. When N = 1, the equality trivially holds. Suppose the propertyholds for N = k, then we show that it holds for N = k + 1. For conciseness, we let Gk =

⊔ki=1Ei. As

Ek+1 ∈ B0,m∗(T ∩Gk+1) = m∗(T ∩Gk+1 ∩Ek+1) +m∗(T ∩Gk+1 ∩ Ec

k+1). (1.49)

Here,Gk+1 ∩Ek+1 = Ek+1, and Gk+1 ∩ Ec

k+1 = Gk.

Then, we havem∗(T ∩Gk+1) = m∗(T ∩Gk) +m∗(T ∩Ek+1) (1.50)

Due to the induction assumption,

m∗(T ∩Gk) =k∑

i=1

m∗(T ∩ Ei), (1.51)

Combining (1.50) and (1.51) leads to

m∗(T ∩Gk+1) =k+1∑i=1

m∗(T ∩ Ei). (1.52)

To sum up, this property holds for all N ∈ N.

5. We first prove several identities for set translation,

(A+ x) ∩ (B + x) = (A ∩B) + x. (1.53)

(A+ x)c = Ac + x. (1.54)

(A+ x) ∩ (B + x)c = (A ∩Bc) + x. (1.55)

This can be shown by

p ∈ (A+ x) ∩ (B + x) ⇔ p− x ∈ A ∩B ⇔ p ∈ A ∩B + x. (1.56)

andp ∈ (A+ x)c ⇔ p /∈ A+ x⇔ p− x /∈ A⇔ p− x ∈ Ac ⇔ p ∈ Ac + x. (1.57)

The third identity immediately follows the first two.

Then, we have for any E, T ∈ 2R,

T ∩ (E + x) = ((T − x) + x) ∩ (E + x) = ((T − x) ∩E) + x. (1.58)

17

1.3. LEBESGUE MEASURE CHAPTER 1. LEBESGUE MEASURE THEORY

and similarly, we haveT ∩ (E + x)c = ((T − x) ∩Ec) + x (1.59)

Due to the translation invariance of m∗, we have

m∗(T ∩ (E + x)) = m∗(((T − x) ∩E) + x) = m∗((T − x) ∩E) (1.60)

and likewisem∗(T ∩ (E + x)c) = m∗((T − x) ∩ Ec). (1.61)

Whenever E ∈ B0, we have for any T ,

m∗(T ∩(E+x))+m∗(T ∩(E+x)c) = m∗((T−x)∩E)+m∗((T−x)∩Ec) = m∗(T−x) = m∗(T ). (1.62)

This shows that E+x ∈ B0. Because m is a restriction of m∗ and m∗ is translation-invariant, it followsdirectly that

m(E + x) = m∗(E + x) = m∗(E) = m(E). (1.63)

6. Given {Ei}∞i=1, let B1 = E1 and Bi = Ei\(∪i−1

j=1Ei

)for i ≥ 2, then it is easy to see that {Bi}∞i=1 is

disjoint, and∞∪

i=1

Ei =∞⊔

i=1

Bi.

As B0 is closed under union and set difference, Bi ∈ B0 for all i ∈ N.

In the following, we are going to show that the right hand side is Lebesgue measurable, that is to showfor any T ⊆ R,

m∗(T ) = m∗

(T ∩

( ∞⊔i=1

Bi

))+m∗

(T\

( ∞⊔i=1

Bi

)). (1.64)

The inequality (≤) directly follows from σ-sub-additivity of m∗. To establish the equality, it suffices toshow the other direction (≥). For any k ∈ N,

⊔ki=1Bi ∈ B0, then

m∗(T ) = m∗

(T ∩

(k⊔

i=1

Bi

))+m∗

(T\

(k⊔

i=1

Bi

))

=k∑

i=1

m∗(T ∩Bi) +m∗

(T\

(k⊔

i=1

Bi

))

≥k∑

i=1

m∗(T ∩Bi) +m∗

(T\

( ∞⊔i=1

Bi

)). (1.65)

Here, the first equality is due to the 4th property proved above, while the second inequality is due tomonotonicity of m∗. Since (1.65) holds for every k ∈ N, taking k → ∞, we get

m∗(T ) ≥∞∑

i=1

m∗(T ∩Bi) +m∗

(T\

( ∞⊔i=1

Bi

))

≥ m∗

(T ∩

( ∞⊔i=1

Bi

))+m∗

(T\

( ∞⊔i=1

Bi

)). (1.66)

7. First, m (⊔∞

i=1Ei) ≤∑∞

i=1m(Ei) directly follows the σ-sub-additivity of m∗. For the other direction ofinequality, we have for arbitrary N > 0,

m

( ∞⊔i=1

Ei

)≥ m

(N⊔

i=1

Ei

)=

N∑i=1

m(Ei). (1.67)

Here, the first inequality is due to the monotonicity of m∗, while the second equality follows from the4th property proved above by letting T = R.

18

1.3. LEBESGUE MEASURE CHAPTER 1. LEBESGUE MEASURE THEORY

Taking N → ∞, we get

m

( ∞⊔i=1

Ei

)=

∞∑i=1

m(Ei). (1.68)

Proposition 1.8. Intervals are Lebesgue measurable, i.e. Int ⊂ B0.

Proof. Here, we need to prove that given I ∈ Int, for any T ⊆ R, m∗(T ) = m∗(T ∩ I) +m∗(T ∩ Ic).First, given arbitrary ε > 0, there exist {Ik : Ik ∈ Int}∞k=1 that covers T sucht that

m∗(T ) ≥∞∑

k=1

l(Ik) − ε =∞∑

k=1

m∗(Ik) − ε. (1.69)

As T ∩ I ⊆∪∞

k=1(Ik ∩ I),

m∗(T ∩ I) ≤∞∑

k=1

m∗(Ik ∩ I). (1.70)

Likewise,

m∗(T ∩ Ic) ≤∞∑

k=1

m∗(Ik ∩ Ic). (1.71)

Combining (1.70) and (1.71),

m∗(T ∩ I) +m∗(T ∩ Ic) ≤∞∑

k=1

(m∗(Ik ∩ I) +m∗(Ik ∩ Ic)) . (1.72)

Note that Ik ∩ I ∈ Int for any k ∈ N, and due to that Int is a semi-algebra, we can write Ik ∩ Ic =⊔nk

i=1 Jk,i

with Jk,i ∈ Int for any k, i. Then,

m∗(Ik ∩ I) +m∗(Ik ∩ Ic) = m∗(Ik ∩ I) +nk∑i=1

m∗(Ik ∩ Jk,i)

= l(Ik ∩ I) +nk∑i=1

l(Ik ∩ Jk,i)

= l

((Ik ∩ I) t

(nk⊔i=1

Jk,i

))= l(Ik) = m∗(Ik). (1.73)

Plugging (1.73) into (1.72) and (1.69), we have

m∗(T ∩ I) +m∗(T ∩ Ic) ≤∞∑

k=1

m∗(Ik) ≤ m∗(T ) + ε. (1.74)

As ε > 0 is arbitrary, it is necessary that

m∗(T ∩ I) +m∗(T ∩ Ic) ≥ m∗(T ). (1.75)

The other direction of the inequality follows directly from m∗’s σ-sub-additivity.

Definition 1.11 (Borel algebra). The Borel algebra on R is the smallest sigma-algebra that includes all opensubsets of R, denoted by B(R).

Proposition 1.9. All sets in Borel algebra of R are Lebesgue measurable, i.e. B(R) ⊂ B0.

To prove this proposition, we make use of the Lindelof’s Lemma.

Lemma 1.4 (Lindelof’s Lemma). Every open subset of R is a countable union of open intervals.

Proof. Since B0 is a σ-algebra over R, it suffices to prove that every open set is in R. From Lindelof’s Lemma,each open set can be written as countable union of open intervals, and open intervals are in B0, so each openset is in B0.

Corollary 1.2. All open and closed sets in R is Lebesgue measurable.

19

1.4. BOREL MEASURABLE FUNCTIONS CHAPTER 1. LEBESGUE MEASURE THEORY

1.3.4 Existence of Mon-measurable Sets

Though B0 is a very large collection, but it does not contain everything, i.e. B0 6= 2R.To “construct” a non-measurable set, we need Axiom of Choice, which has many equivalent form.

Theorem 1.2 (Axiom of Choice (AoC)). For any collection C of non-empty sets, there exists a set thatcontains exactly one element from each set in C.

Equivalently, for any collection of C of non-empty sets, there exists a “choice function” f defined on C,such that for any A ∈ C, f(A) ∈ A.

Equivalently, for any collection of C of non-empty sets, the Cartesian product∏

A∈C A is not empty.

Theorem 1.3 (Existence of non-measurable sets). There exists a set E ⊂ R such that E /∈ B0.

Proof. First, we define an equivalence relation on R by

x ∼ y ⇔ x− y ∈ Q, ∀x, y ∈ R.

It is trivial to verify that ∼ defined above is really an equivalence relation. Let [x] denote the equivalenceclass containing x ∈ R. Then, [x] ∩ (0, 1) 6= ∅, due to the obvious fact that there exist rationales in (x− 1, x)for any x ∈ R.

Let E ⊂ [0, 1] be a set with a single representative from every equivalence class defined above. By axiomof choice, such a set exists. In the following, we show that it is non-measurable.

Note thatR =

⊔r∈Q

(E + r).

As Q is countable. Assume E is measurable, by countable additivity and translation invariance of m, we have

∞ = m(R) = m

⊔r∈Q

(E + r)

= m

⊔r∈Q

E

, (1.76)

which follows that m(E) > 0. On the other hand, let

F =⊔

r∈Q∩(0,1)

(E + r),

then F ⊆ [0, 2], and thus2 ≥ m(F ) ≥

∑r∈Q∩(0,1)

m(E) = ∞. (1.77)

Hence, E is not measurable.

1.4 Borel Measurable Functions

1.4.1 Borel Algebra

Lemma 1.5. Let S a collection of σ-algebra over X, then∩

M∈S M is also a σ-algebra over X.

Proof. Let M∗ =∩

M∈S M, then

1. ∅, X ∈ M,∀M ∈ S ⇒ ∅, X ∈ M∗ =∩

M∈S M;

2. A ∈ M∗ ⇒ A ∈ M,∀M ∈ S ⇒ Ac ∈ M,∀M ∈ S ⇒ Ac ∈ M∗;

3. {Ai}∞i=1 ⊆ M∗ ⇒ {Ai}∞i=1 ⊆ M,∀M ∈ S,⇒∪∞

i=1Ai ⊆ M,∀M ∈ S ⇒∪∞

i=1Ai ⊆ M∗.

Lemma 1.6 (σ-algebra generated by F). Given a set X and any collection of subset F ⊆ 2X , there exists asmallest σ-algebra that contains F , which is called the σ-algebra generated by F , and denoted by σ(F).

20

1.4. BOREL MEASURABLE FUNCTIONS CHAPTER 1. LEBESGUE MEASURE THEORY

Here, when we say M∗ is a “smallest σ-algebra”, it means that if M is also a σ-algebra satisying the samecondition, then M∗ ⊆ M. Hence, we can see that if such a “smallest σ-algebra” exists, it must be unique.

Proof. Let S be the class of all σ-algebra that contains F . First of all S is not empty, for any non-emtpy setX and collection F ⊆ 2X , it is easy to see that 2X is a σ-algebra over X that contains F .

Define M∗ =∩

M∈S M, we are going to show that M∗ is the smallest σ-algebra that is desired.

1. F ⊆ M∗ =∩

M∈S M, as F ⊆ M, ∀M ∈ S.

2. ∀M ∈ S, M∗ ⊆ M by its definition.

3. As M∗ is an intersection of σ-algebras over X, by the lemma above, it is itself a σ-algebra.

Together, they imply that M∗ is the smallest σ-algebra containing F .

Definition 1.12 (Borel σ-algebra). Let (X, T ) be a topological space, then σ(T ), the σ-algebra over Xgenerated by the topology T , is called the Borel σ-algebra of the topological space.

Conventionally, if the topology of a space is implicit, we use B(X) to denote the Borel σ-algebra over thedefault topological space of X.

The following are some examples of Borel σ-algebras for different topological space.

1. For the indiscrete topology T = {∅, X}, σ(T ) = {∅, X};

2. For the discrete topology T = 2X , σ(T ) = 2X .

3. For the topology T = {∅, U,X}, σ(T ) = {∅, U, U c, X}.

4. For the topology T = {∅, U1, U2, X}, such with U1 ⊂ U2, σ(T ) = {∅, U1, U2, Uc1 , U

c2 , U2\U1, (U2\U1)c}.

(You can derive this set by constructing with union, intersection, and complement, and carefully removethe identical sets)

5. B(R) and B(Rn) are respectively borel algebra over real field and real vector spaces, which contains allopen and closed sets, and all countable unions of closed sets Fσ, and all countable intersections of opensets Gδ, and so on. Note that the sets in the borel algebra are not restricted to those can be expressedin these ways.

Proposition 1.10. Let I = {(−∞, a) : a ∈ R}, then σ(I) = B(R).

Proof. It is obvious that I ∈ B(R), which immediately follows that σ(I) ⊆ B(R). Then, we are going to provethe inclusion in other direction.

1. For each α ∈ R, [α,+∞) = (−∞, α)c ∈ σ(I);

2. For each α < β ∈ R, [α, β) = (−∞, β) ∩ [α,∞) ∈ σ(I);

3. Thus, (α, β) =∪∞

n=1[α+ 1/n, β) ∈ σ(I).

While, in R all open sets are countable union of open intervals, so all open sets are in σ(I), which leads tothat the natural topology of R is contained in σ(I), and hence B(R) ⊆ σ(I).

1.4.2 Measurable Functions

Before going into the definition, we first show two useful propositions in the following lemma.

Lemma 1.7. Let (X,FX) and (Y,FY ) be two measurable spaces, and f : X → Y be any function, then

1. GX = {f−1(A)|A ∈ FY ) is a σ-algebra over X;

2. GY = {A ⊆ Y |f−1(A) ∈ FX} is a σ-algebra over Y .

Proof. The verification of these statements are trivial by noting that f−1(∅) = ∅, f−1(X) = X, and f−1

preserves basic set operations including arbitrary union, arbitrary intersection, and set complement.

21

1.4. BOREL MEASURABLE FUNCTIONS CHAPTER 1. LEBESGUE MEASURE THEORY

Definition 1.13 ((Generic) Measurable function). Let (X,FX) and (Y,FY ) be measurable spaces, then afunction f : X → Y is called a measurable function if A ∈ FY ⇒ f−1(A) ∈ FX , in other words, thepre-image of any measurable set is measurable.

Definition 1.14 ((Borel) Measurable function). Let (X,F) be a measurable space, and (Y, T ) be a topologicalspace, then a function f : X → Y is called a (Borel) measurable function when it is measurable in genericsense for Y endowed with the Borel σ-algebra σ(T ).

The borel measurability of a function can be equivalently defined as follows.Let (X,F) be a measurable space, and (Y, T ) be a topological space a function f : X → Y is called a

(Borel) measurable function if for any open set A ⊂ Y , f−1(A) is measurable.

The following lemma shows that the two definitions of “borel measurable function” are indeed equivalent.

Lemma 1.8. Let (X,F) be a measurable space, and (Y, T ) be a topological space endowed with a Borelσ-algebra B(T ), give a function f : X → Y , f−1(A) ∈ F , ∀A ∈ σ(T ) iff f−1(A) ∈ F , ∀A ∈ T .

Proof. It is trivial to see that f−1(A) ∈ F , ∀A ∈ B(T ) ⇒ f−1(A) ∈ F , ∀A ∈ T as T ⊆ B(T ). In thefollowing, we show the converse.

Suppose, f has f−1(A) ∈ F for all open set A in Y . Let G = {A ⊆ Y |f−1(A) ∈ F}, then by the lemmaabove, G is a σ-algebra over Y . Clearly, all open sets are in G, in other words, T ⊆ G, thus B(Y ) = σ(T ) ⊆ G,which implies that for each A ∈ σ(T ), f−1(A) ∈ F .

Proposition 1.11. Let (X,F) be a measurable space, and f : X → R be a function satisfying f−1((−∞, a)) ∈F for all a ∈ R, then f is a borel measurable.

Proof. We have shown in previous section that any open set in R can be derived from set complement,intersection, and countable union of sets in form of (−∞, a), which follows that for any open set A ⊆ R, itspreimage f−1(A) can be expressed in form of the countable union, intersection, and set complements of thesets in form of f−1((−∞, a)) which are measurable. Hence, the preimage of any open set in R is measurablein (X,F), thus f is borel measurable.

Proposition 1.12. Let X and Y be topological spaces, then any continuous function f : X → Y is measurable.

Proof. This immediately follows from the fact that in a topological space (by convention endowed with Borelσ-algebra), any open set is measurable.

Proposition 1.13. Let X be a measurable space, Y and Z be topological spaces. Given any function f : X →Y and continuous function g : Y → Z, we have

1. if f is continous then g ◦ f is continuous;

2. if f is borel measurable then g ◦ f is borel measurable.

Proof. Given any open set V ⊂ Z, g−1(V ) is open due to the continuity of g. Note that (g ◦ f)−1(V ) =f−1(g−1(V )), so if f is continuous, then f−1(g−1(V )) is open, thus g ◦ f is continous; or if f is measurable,then f−1(g−1(V )) is measurable, thus g ◦ f is measurable.

Lemma 1.9. Let X be a measurable space, and Y be a topological space. Let u : X → R and v : X → Rbe measurable functions, and φ : R2 → Y be continuous functions. Then, h : X2 → Y defined by h(x, y) =φ(u(x), v(y)) is measurable.

Proof. We define f : X2 → R2 by f(x, y) = (u(x), v(y)), then h = φ ◦ f . Since φ is continuous, from theproposition above, we can see that it suffices to show that f is measurable.

Given any open box in form of V = I1 × I2 ⊆ R2 with I1 and I2 being open intervals in R, thenf−1(V ) = u−1(I1) ∩ v−1(I2), where u−1(I1) and v−1(I2) are both measurable due to the measurability of uand v, thus f−1(V ) is measurable.

Since open boxes constitute basis of the standard topology of R2, and R2 is second countable underthis topology, we can see that any open set U ⊂ R2 can be written as countable union of open boxes asU =

∪∞i=1 Vi, then f−1(U) =

∪∞i=1 f

−1(Vi) is measurable.

Theorem 1.4. For measurable functions, we have

22

1.4. BOREL MEASURABLE FUNCTIONS CHAPTER 1. LEBESGUE MEASURE THEORY

1. Let u and v be real measurable functions, then f = u+ iv is complex measurable.

2. Let f = u+ iv be complex measurable function, then u, v, |f | are real measurable.

3. Let f, g be real (complex) measurable functions, then f + g and fg are real (complex) measurable.

4. Let E be a measurable set, then its indicator function χE is measurable.

Proof. These statements are immediate corollaries of the above propositions.

1. We can write f(x) = φ(u(x), v(x)) with φ : R2 → C given by φ(u, v) = u+ i ∗ v which is continuous andthus measurable.

2. This follows from the fact that taking real or imaginary part, or taking magnitude, are all continuousand thus measurable functions.

3. We can write (f + g)(x) = φ+(f(x), g(x)) with φ+(u, v) = u + v and (fg)(x) = φ×(f(x), g(x)) withφ×(u, v) = uv. Both φ+ and φ× are continuous and thus measurable.

4. The pre-image of χ−1E can only be either one of the following: ∅, E, Ec, and X, which are all measurable

whenever E is measurable.

Theorem 1.5. Let (X,F) be a measurable space, and fn : X → R be a sequence of Borel measurablefunctions, then supn fn, infn fn, lim supn fn and lim infn fn are all Borel measurable.

Proof. We first look at supn fn. We claim that for each a ∈ R,(sup

nfn

)−1

((a,+∞)) =∞∪

n=1

f−1n ((a,+∞)). (1.78)

This can be shown as follows

x ∈(

supnfn

)−1

((a,+∞)) ⇒ supnfn(x) > a⇒ ∃m ∈ N, fm(x) > a

⇒ x ∈ f−1m ((a,+∞)) ⊆

∞∪n=1

f−1n ((a,+∞)); (1.79)

and

x ∈∞∪

n=1

f−1n ((a,+∞)) ⇒ ∃m ∈ N, x ∈ f−1

m ((a,+∞))

⇒ fm(x) > a⇒ supnfn(x) > a

⇒ x ∈(

supnfn

)−1

((a,+∞)). (1.80)

The claim is shown, which readily leads to the conclusion that supn fn is borel measurable.Similar argument shows that infn fn is also borel measurable. While, lim supn fn = infn supkgen fn and

lim infn fn = supn infk≥n fn, so they are also borel measurable.

23

Chapter 2

Integration Theory

2.1 Lebesgue Integration

2.1.1 Simple Functions

Definition 2.1 (Simple function). A function s : X → R is called a simple function if s(X) is finite.A simple function s can be uniquely written as

s =n∑

i=1

αiχAi

where αi ∈ R and Ai = s−1(αi) = {x ∈ X|s(x) = αi}.

Proposition 2.1. A simple function s given by s =∑n

i=1 αiχAi is measurable iff Ai is measurable for alli = 1, . . . , n.

Proof. 1. Given any set of B ∈ R, we have

χ−1Ai

(B) =

∅ (0 ∈ Ai and 1 ∈ Ai),Ai (0 /∈ Ai and 1 ∈ Ai),Ac

i (0 ∈ Ai and 1 /∈ Ai),X (0 ∈ Ai and 1 ∈ Ai).

(2.1)

Hence, if Ai is measurable, then χAi is measurable. As a finite sum of measurable function, s ismeasurable.

2. Since s is measurable, Ai = s−1({αi}) is measurable.

Definition 2.2 (Monotonical function sequence). Let f1, f2, . . . be a sequence of real valued functions definedon X, such that for every x ∈ X, f1(x), f2(x), . . . is an increasing sequence, then f1, f2, . . . is called anincreasing sequence of functions.

Similarly, we can define decreasing sequence of functions.

Theorem 2.1. Each non-negative measurable function can be approximated (pointwisely) by an increasingsequence of simple measurable functions.

Formally, let f : X → [0,+∞] be a measurable function, then there exists a increasing sequence of simplemeasurable functions s1 ≤ s2 ≤ · · · , such that

limn→∞

sn(x) = f(x), ∀x ∈ X.

Proof. We construct a function Ψn : [0,+∞] → [0,+∞] in as follows. First, we partition [0, n] into n2n

intervals of length δn = 2−n. Then for each t ≥ 0, there exists a unique kt such that ktδn < t ≤ (k + 1)δn.We then define

Ψn(t) =

{kδn (0 ≤ t < n),n (n ≤ t ≤ +∞).

24

2.1. LEBESGUE INTEGRATION CHAPTER 2. INTEGRATION THEORY

Note that Ψn is a simple Borel measurable function as it can only take integer values from between 0 andn2n, and Ψ1,Ψ2, . . . is an increasing sequence of functions.

In addition, ∀t ∈ [0,+∞], t− δn < Ψn(t) < t, hence Ψn(t) converges to t pointwisely as n→ +∞.Define sn = Ψn ◦ f . Note that sn is a simple measurable function, as both Ψn and f are measurable,

and Ψn is a simple function. Then (sn)∞n=1 is an increasing sequence of simple measurable functions thatconverges to f pointwisely.

In the following, we use the term positive measure to refer to a measure µ : M → [0,+∞] such thatthere exists a nonempty set A ∈ M with µ(A) < +∞.

2.1.2 Lebesgue Integral

We first define the Lebesgue integral on simple functions, and then extend it to all measurable functions.

Definition 2.3 (Lebesgue Integral). Given a measure space (X,M, µ), and a simple measurable functions : X → [0,+∞]. Note that s can be uniquely written as s =

∑ni=1 αiχAi with A1, . . . , An being mutually

disjoint. Then, its Lebesgue integral on a measurable set E ∈ M, denoted by∫

Esdµ is defined by∫

E

sdµ =n∑

i=1

αiµ(E ∩Ai).

Let f : X → [0,+∞] be non-negative measurable function, then its Lebesgue integral is given by∫E

fdµ = sup{∫

E

sdµ

∣∣∣∣ s is simple measurable, and 0 ≤ s ≤ f} .

Proposition 2.2. Given a measure space (X,M, µ), non-negative measurable functions f, g : X → [0,+∞],measurable sets E,A,B ∈ M, and c ∈ [0,∞], then we have

1. f ≤ g ⇒∫

Efdµ ≤

∫Egdµ;

2. A ⊂ B ⇒∫

Afdµ ≤

∫Bfdµ;

3.∫

Ecfdµ = c

∫Efdµ;

4. f |E ≡ 0 ⇒∫

Efdµ = 0 (This holds even when µ(E) = +∞);

5. µ(E) = 0 ⇒∫

Efdµ = 0 (This holds even when f |E = +∞);

6.∫

Efdµ⇒

∫XχEfdµ.

Proof. We prove these properties respectively. For convenience, we denote

Sf = {s |s is simple measurable, and 0 ≤ s ≤ f}.

Then, ∫E

fdµ = sups∈Sf

∫E

sdµ. (2.2)

1. As f ≤ g, s ≤ f ⇒ s ≤ g, it means that Sf ⊂ Sg, which follows that∫E

fdµ = sups∈Sf

∫E

sdµ ≤ sups∈Sg

∫E

sdµ =∫

E

gdµ. (2.3)

2. Let s =∑n

i=1 αiχCi be a non-negative simple measurable function. When A ⊂ B,∫A

sdµ =n∑

i=1

αiµ(A ∩ Ci) ≤n∑

i=1

αiµ(B ∩ Ci) =∫

B

sdµ. (2.4)

Hence, ∫A

fdµ = sups∈Sf

∫A

sdµ ≤ sups∈Sf

∫B

sdµ =∫

B

fdµ. (2.5)

25

2.1. LEBESGUE INTEGRATION CHAPTER 2. INTEGRATION THEORY

3. Let s =∑n

i=1 αiχCi be a non-negative simple measurable function, then when c ≥ 0, cs =∑n

i=1(cαi)χCi

is also a non-negative simple measurable function, which has∫E

csdµ =n∑

i=1

(cαi)µ(Ci) = c

n∑i=1

αiµ(Ci) = c

∫E

sdµ. (2.6)

In addition, we note that s ≤ f ⇔ cs ≤ cf , which means Scf = {cs|s ∈ Sf}. Hence,∫E

cfdµ = sups∈Scf

∫E

sdµ = sups∈Sf

∫E

csdµ = c · sups∈Sf

∫E

sdµ = c

∫E

fdµ. (2.7)

4. For each s ∈ Sf , 0 ≤ s(x) ≤ f(x),∀x ∈ X, hence, when f(x) = 0,∀x ∈ E ⊂ X, s(x) = 0,∀x ∈ E. Foreach s satisfying the above condition, we have∫

E

sdµ = 0 · µ(E ∩ s−1(0)) = 0. (2.8)

As a result, ∫E

fdµ = sups∈Sf

∫E

sdµ = sups∈Sf

0 = 0. (2.9)

5. For each s ∈ Sf , write it in form of s =∑n

i=1 αiχAi . And, note that µ(E) = 0, and thus µ(E ∩ Ai) =0,∀i = 1, . . . , n, then ∫

E

sdµ =n∑

i=1

αiµ(E ∩Ai) =n∑

i=1

αi · 0 = 0. (2.10)

Hence, ∫E

fdµ = sups∈ssetf

∫E

sdµ = sups∈Sf

0 = 0. (2.11)

6. Let s =∑n

i=1 αiχAi be a simple non-negative measurable function. We note that

χEs =n∑

i=1

αiχE∩Ai . (2.12)

It follows that ∫X

χEsdµ =n∑

i=1

αiµ(E ∩Ai) =∫

E

sdµ. (2.13)

In addition, s ≤ f ⇒ χEs ≤ χEf , in other words, s ∈ Sf ⇒ χEs ∈ SχEf∫E

fdµ = sups∈Sf

∫E

sdµ = sups∈Sf

∫X

χEsdµ ≤ sups′∈SχEf

∫X

s′dµ =∫

X

χEfdµ. (2.14)

On the other hand, for every s ∈ SχEf , we have s(x) = 0,∀x ∈ Ec. Write s into s =∑n

i=1 αiχCi , thenαi > 0 ⇒ Ci ∈ E ⇔ Ci ∩E = Ci, therefore,∫

X

sdµ =n∑

i=1

αiµ(Ci) =n∑

i=1

αiµ(Ci ∩ E) =∫

E

sdµ. (2.15)

Hence, ∫X

χEfdµ = sups∈SχEf

∫X

sdµ = sups∈SχEf

∫E

sdµ =∫

E

χEfdµ. (2.16)

Note that χEf ≤ f , by monotonicity shown above∫E

χEfdµ ≤∫

E

fdµ. (2.17)

As a result, ∫X

χEfdµ ≤∫

E

fdµ. (2.18)

26

2.1. LEBESGUE INTEGRATION CHAPTER 2. INTEGRATION THEORY

2.1.3 Additivity, Monotone Convergence Theorem, and Induced Measure

We first prove finite additivity of simple functions, and then generalize it to countable additivity of genericmeasurable functions.

Lemma 2.1. Each non-negative simple measurable function induces a measure.Formally, let s be a non-negative simple measurable function defined in a measure space (X,M, µ), and

define Ψ : M → [0,+∞] by

Ψ(E) =∫

E

sdµ, ∀E ∈ M,

then Ψ is a measure.

Proof. We show that Ψ satisfies the condition of a measure.

1. Ψ(∅) =∫∅ sdµ = 0 due to µ(∅) = 0.

2. We write s =∑n

i=1 αiχAi Let E1, E2, . . . be disjoint measurable sets, and let E =⊔∞

k=1Ek, then

Ψ(E) =n∑

i=1

αiµ(E ∩Ai) =n∑

i=1

αiµ

(( ∞⊔k=1

Ek

)∩Ai

)=

n∑i=1

αiµ

( ∞⊔k=1

(Ek ∩Ai)

)

=n∑

i=1

αi

( ∞∑k=1

µ(Ek ∩Ai)

)=

∞∑k=1

n∑i=1

αiµ(Ek ∩Ai)

=∞∑

k=1

∫Ek

sdµ =∞∑

k=1

Ψ(Ek). (2.19)

Proposition 2.3. Lebesgue Integral is additive on simple measurable functions.Formally, let s and t be two non-negative simple measurable functions defined in a measure space (X,M, µ),

then ∫X

(s+ t)dµ =∫

X

sdµ+∫

X

tdµ.

As an immediate corollary, we have for each measurable set E ∈ M,∫E

(s+ t)dµ =∫

X

(χEs+ χEt)dµ =∫

X

χEsdµ+∫

X

χEtdµ =∫

E

sdµ+∫

E

tdµ.

Proof. We define for each E ∈ M, Ψs+t(E) =∫

E(s + t)dµ, Ψs(E) =

∫Esdµ, Ψt(E) =

∫Etdµ, then the

functions Ψs+t,Ψs and Ψt are all measures. Let s =∑m

i=1 αiχAi and t =∑n

j=1 βjχBj , such that⊔m

i=1Ai =⊔nj=1Bj = X. Let Eij = Ai ∩Bj for i = 1, . . . ,m and j = 1, . . . , n. It is obvious that Eij are measurable and

mutually disjoint, and they haveX =

⊔1≤i≤m, 1≤n≤n

Eij .

We note ∫Eij

(s+ t)dµ = (αi + βj)µ(Eij) = αiµ(Eij) + βjµ(Eij) =∫

Eij

sdµ+∫

Eij

tdµ. (2.20)

27

2.1. LEBESGUE INTEGRATION CHAPTER 2. INTEGRATION THEORY

In addition,

∫X

(s+ t)dµ = Ψs+t(X) = Ψs+t

⊔i,j

Eij

=m∑

i=1

n∑j=1

Ψs+t(Eij)

=m∑

i=1

n∑j=1

∫Eij

(s+ t)dµ =m∑

i=1

n∑j=1

(∫Eij

sdµ+∫

Eij

tdµ

)

=m∑

i=1

n∑j=1

Ψs(Eij) +m∑

i=1

n∑j=1

Ψt(Eij) = Ψs

⊔i,j

Eij

+ Ψt

⊔i,j

Eij

= Ψs(X) + Ψt(X) =

∫X

sdµ+∫

X

tdµ. (2.21)

Before generalizing the additivity to generic measurable functions, we still need to prove the followingtheorem, namely the Lebesgue Monotone Convergence Theorem, which in itself is a very important theoremin integration theory.

Theorem 2.2 (Montotone Convergence Theorem (MCT)). Let f1, f2, . . . be an increasing sequence of non-negative measurable functions defined on a measure space (X,M, µ), assume that it converges pointwisely toa function f , i.e

limn→∞

fn(x) = f(x), ∀x ∈ X,

then f is measurable and ∫X

fdµ =∫

X

limn→∞

fdµ. = limn→∞

∫X

fndµ.

Proof. By monotonicity of the Lebesgue integral, we have∫

Xfidµ ≤

∫Xfi+1dµ for all i ∈ N due to fi ≤ fi+1.

Hence, the real value sequence(∫

Xfndµ

)∞n=1

is an increasing sequence, and thus have a unique limit α ∈ [0,∞].

1. As (fn)∞n=1 is increasing, we have fn ≤ f , which follows that∫X

fndµ ≤∫

X

fdµ, ∀n ∈ N. (2.22)

Take the limit for left hand side, we get

α = limn→∞

∫X

fndµ ≤∫

X

fdµ. (2.23)

2. Let s be a simple measurable function satisfying 0 ≤ s ≤ f , and c be any real value with 0 < c < 1.Define Ψ : M → [0,+∞] : E 7→

∫Esdµ, then Ψ is a measure.

On the other hand, let En = {x ∈ X : fn(x) ≥ cs(x)}, then it is easy to see that En is measurable(note En = (fn − cs)−1[0,∞]). In addition, fn(x) ≥ cs(x) ⇒ fn+1(x) ≥ cs(x), it follows that E1 ⊆E2 ⊆ · · · . Furthermore, we claim that

∪∞n=1En = X. This claim is briefly shown as follows. First,

En ⊆ X ⇒∪∞

n=1En ⊆ X. In showing the other direction, for each x ∈ X, if f(x) = 0, then s(x) = 0,thus it is obvious that x ∈ En,∀n ∈ N. Otherwise, f(x) > cs(x), as fn(x) ↑ f(x), there exists n suchthat f(x) − fn(x) < f(x) − cs(x), i.e fn(x) > cs(x), thus x ∈ En. The claim is proved.

By continuity of measure, we have

limn→∞

Ψ(En) = Ψ(E) ⇔ limn→∞

∫En

sdµ =∫

X

sdµ. (2.24)

From the basic property of Lebesgue integral, we have∫X

fndµ ≥∫

En

fndµ ≥∫

En

csdµ = c

∫En

sdµ. (2.25)

28

2.1. LEBESGUE INTEGRATION CHAPTER 2. INTEGRATION THEORY

Take the limit as n→ ∞ for both sides, then

α = limn→∞

∫X

fndµ ≥ c limn→∞

∫En

sdµ = c

∫X

sdµ. (2.26)

This holds for any 0 < c < 1, which follows that

α = limn→∞

∫X

fndµ ≥∫

X

sdµ. (2.27)

Theorem 2.3 (Countable Additivity of Lebesgue Integral). Let f1, f2, . . . of non-negative measurable func-tions on a measure space (X,M, µ), and define f : X → [0,+∞] by

f(x) =∞∑

n=1

fn(x), ∀x ∈ X,

then ∫X

fdµ =∫

X

∞∑n=1

fndµ =∞∑

n=1

∫X

fndµ.

Proof. 1. We first prove for the sum of two functions. Let f and g be two non-negative measurablefunctions, then there exist increasing sequences of non-negative simple measurable functions (si)∞i=1 ↑ fand (ti)∞i=1 ↑ g, then (si + ti)∞i=1 ↑ f + g. Based on the additivity proved on simple functions, we have∫

X

(si + ti)dµ =∫

X

sidµ+∫

X

tidµ. (2.28)

By MCT, we get∫X

(f+g)dµ =∫

x

limi→∞

(si+ti)dµ = limi→∞

∫X

(si+ti)dµ = limi→∞

∫X

sidµ+ limi→∞

∫X

tidµ =∫

X

fdµ+∫

X

gdµ.

(2.29)

2. By induction, we can show the finite additivity as∫X

n∑i=1

fidµ =n∑

i=1

∫X

fidµ. (2.30)

3. Let gn =∑n

i=1 fi, then gn ↑ g∞ = f . By finite additivity and MCT,∞∑

i=1

∫X

fidµ = limn→∞

n∑i=1

∫X

fidµ = limn→∞

∫X

n∑i=1

fidµ = limn→∞

∫X

gndµ =∫

X

limn→∞

gndµ =∫

X

fdµ.

(2.31)

As an immediate corollary, we can show that the order of infinite sum can be exchanged for non-negativeterms.

Corollary 2.1. Consider a non-negative function a : N × N → [0,+∞], we have∞∑

i=1

∞∑j=1

a(i, j) =∞∑

j=1

∞∑i=1

a(i, j).

Proof. Suppose we are working with a measure space on N with counting measure µ. Define a sequence ofnon-negative function f1, f2, . . . by fj(i) = a(i, j),∀i, j ∈ N. Then fj are measurable functions, hence

∞∑i=1

∞∑j=1

a(i, j) =∞∑

i=1

∞∑j=1

fj(i) =∫

N

∞∑j=1

fjdµ =∞∑

j=1

∫Nfjdµ =

∞∑j=1

∞∑i=1

fj(i) =∞∑

j=1

∞∑i=1

a(i, j). (2.32)

29

2.1. LEBESGUE INTEGRATION CHAPTER 2. INTEGRATION THEORY

The following lemma states the relation between lim inf and Lebesgue integral, which has important utility.

Lemma 2.2 (Fatou’s Lemma). Let f1, f2, . . . be a sequence of non-negative measurable functions defined ona measure space (X,M, µ), then ∫

X

lim infn→∞

fndµ ≤ lim infn→∞

∫X

fndµ.

Proof. For each k ∈ N, define gk : X → [0,+∞] by gk(x) = infi≥k fi(x). Then, g1, g2, . . . is an increasingsequence of non-negative measurable functions, and it has

limk→∞

gk(x) = supk∈N

gk(x) = supk∈N

infi≥k

fi(x) = lim infn→∞

fn(x). (2.33)

By MCT, we have

limk→∞

∫X

gkdµ =∫

X

limk→∞

gkdµ =∫

X

lim infn→∞

fndµ. (2.34)

In addition, by definition, gn ≤ fn, thus by monotonicity,∫X

gndµ ≤∫

X

fndµ. (2.35)

Taking lim inf as n→ ∞ of both sides, we get∫X

lim infn→∞

fndµ = limk→∞

∫X

gkdµ = lim infn→∞

∫X

gndµ ≤ lim infn→∞

∫X

fndµ. (2.36)

Theorem 2.4. Given a measure space (X,M, µ), every non-negative measurable function f : X → [0,+∞]defines a measure ν : M → [0,+∞] by

ν(E) =∫

E

fdµ, ∀E ∈ M.

And, for each measurable function g : X → [0,+∞], we have∫X

gdν =∫

X

gfdµ.

This equation can be summarized as the rule dν = fdµ.

Proof. 1. We first prove that ν is a measure, by showing that it satisfies the conditions of a measure.

(a) ν(∅) =∫∅ fdµ = 0.

(b) Let {Ei}∞i=1 be a sequence of pairwisely disjoint measurable sets, and let E =∪∞

i=1Ei. Note that

χEf =∞∑

i=1

χEif. (2.37)

Then by countable additivity of Lebesgue integral, we get

ν(E) =∫

E

fdµ =∫

X

χEfdµ =∫

X

∞∑i=1

χEifdµ =∞∑

i=1

∫X

χEifdµ =∞∑

i=1

∫Ei

fdµ =∞∑

i=1

ν(Ei).

(2.38)

2. Then, we prove the second statement, namely dν = fdµ.

We first show that this holds for each characteristic functions of measurable set E.∫X

χEdν = ν(E) =∫

E

fdµ =∫

X

χEfdµ. (2.39)

30

2.1. LEBESGUE INTEGRATION CHAPTER 2. INTEGRATION THEORY

Then for each non-negative simple measurable function s =∑n

i=1 αiχAi . Then by additivity,∫X

sdν =∫

X

n∑i=1

αiχAidν =n∑

i=1

αi

∫X

χAidν =n∑

i=1

αi

∫X

χAifdµ =∫

X

n∑i=1

αiχAifdµ =∫

X

sfdµ.

(2.40)Finally, for arbitrary non-negative measurable function g, there exists an increasing sequence of non-negative simple measurable functions (s1, s2, . . .) such that sn ↑ g (pointwisely). It follows that (snf) ↑gf . Then, by MCT∫

X

gdν =∫

X

limn→∞

sndν = limn→∞

∫X

sndν = limn→∞

∫X

snfdµ =∫

X

limn→∞

snfdµ =∫

X

gfdµ. (2.41)

2.1.4 Lebesgue Integration of Complex Functions

In the following, we generalize the concept of Lebesgue integral from non-negative real functions to genericcomplex functions.

Definition 2.4 (L1 functions). Given a measure space (X,M, µ), we denote

L1(X,µ) ={f : X → C

∣∣∣∣ f is measurable, and∫

X

|f |dµ < +∞}.

Each function in L1(X,µ) is called an L1 function. In other words, an L1 function is a measurable functionf such that |f | has finite Lebesgue integral.

L1(X,µ) is sometimes written as L1(X) or L1(µ), when the underlying measure or the universe is clearfrom context.

Definition 2.5 (Lebesgue integral of complex functions). Let f = u + iv be a complex measurable functiondefined on a measure space (X,M, µ), where u and v are respectively the real and imaginary part of f , suchthat u, v are measurable, and f ∈ L1(X,µ). Then for each E ∈ M, we define∫

E

fdµ =∫

E

udµ+ i

∫E

vdµ =(∫

E

u+dµ−∫

E

u−dµ

)+ i

(∫E

v+dµ−∫

E

v−dµ

),

where u+ = max(u, 0), u− = min(u, 0), v+ = max(v, 0), and v− = min(v, 0).(Note that as |u±| ≤ |u| ≤ |f | and |v±| ≤ |v| ≤ |f |, hence u± and v± are all non-negative L1-functions.)

The following proposition states that Lebesgue integration is a linear functional acting on a L1 function.

Proposition 2.4 (Linearity of Lebesgue integration). Given a measure space (X,M, µ). Let f, g ∈ L1(X,µ),and α, β ∈ C, then αf + βg ∈ L1(X,µ) and∫

X

(αf + βg)dµ = α

∫X

fdµ+ β

∫X

gdµ.

To prove the linearity, we first note the following simple fact

Lemma 2.3. Given a measure space (X,M, µ) and let f, g, f ′, g′ be non-negative L1 functions such thatf − g = f ′ − g′, then ∫

X

fdµ−∫

X

gdµ =∫

X

f ′dµ−∫

X

g′dµ.

Proof. This can be shown by addivitity of Lebesgue integral of non-negative functions, as

f − g = f ′ − g′ ⇒ f + g′ = f ′ + g

⇒∫

X

(f + g′)dµ =∫

X

(f ′ + g)dµ⇒∫

X

fdµ+∫

X

g′dµ =∫

X

f ′dµ+∫

X

gdµ

⇒∫

X

fdµ−∫

X

gdµ =∫

X

f ′dµ−∫

X

g′dµ. (2.42)

31

2.1. LEBESGUE INTEGRATION CHAPTER 2. INTEGRATION THEORY

Now we come back to prove the linearity.

Proof. We first show∫

X(f + g)dµ =

∫Xfdµ +

∫Xgdµ, and then show

∫Xαfdµ = α

∫Xfdµ, then linearity

immediately follows from these two properties.

1. The statement that f+g is an L1 function follows from the fact that |f+g| ≤ |f |+ |g| and monotonicityof Lebesgue integral.

Decompose f and g into real and imaginary parts as f = uf + ivf and g = ug + ivg, then f + g =(uf +ug)+ i(vf + vg), where uf +ug and vf + vg are respectively the real and imaginary parts of f + g.

Then we have∫X

(f + g)dµ =(∫

X

(uf + ug)+dµ−∫

X

(uf + ug)−dµ)

+ i

(∫X

(vf + vg)+dµ−∫

X

(vf + vg)−dµ).

(2.43)

Note that

(uf + ug)+ − (uf + ug)− = Re(f) + Re(g) = (u+f − u−f ) + (u+

g − u−g ) = (u+f + u+

g ) − (u−f + u−g );(2.44)

(vf + vg)+ − (vf + vg)− = Im(f) + Im(g) = (v+f − v−f ) + (v+

g − v−g ) = (v+f + v+

g ) − (v−f + v−g ). (2.45)

Then, by the lemma above,∫X

(uf + ug)+dµ−∫

X

(uf + ug)−dµ =∫

X

(u+f + u+

g )dµ−∫

X

(u−f + u−g )dµ

=∫

X

u+f dµ+

∫X

u+g dµ−

∫X

u−f dµ−∫

X

u−f dµ

=(∫

X

u+f dµ−

∫X

u−f dµ

)+(∫

X

u+g dµ−

∫X

u−g dµ

). (2.46)

Likewise, we have∫X

(vf + vg)+dµ−∫

X

(vf + vg)−dµ =(∫

X

v+f dµ−

∫X

v−f dµ

)+(∫

X

v+g dµ−

∫X

v−g dµ

). (2.47)

Combining the results above, we get∫X

(f + g)dµ =(∫

X

u+f dµ−

∫X

u−f dµ

)+ i

(∫X

v+f dµ−

∫X

v−f dµ

)+(∫

X

u+g dµ−

∫X

u−g dµ

)+ i

(∫X

v+g dµ−

∫X

v−g dµ

)=∫

X

fdµ+∫

X

gdµ. (2.48)

2. The statement that αf is an L1 function follows from the fact of |αf | ≤ |α||f |. Let f = u + iv, whereu = Re(f) and v = Im(f).

If α ∈ R and α ≥ 0, then (αu)± = αu± and (αv)± = αv±, thus∫X

αfdµ =∫

X

(αu+ iαv)dµ =(∫

X

(αu)+dµ−∫

X

(αu)−dµ)− i

(∫X

(αv)+dµ−∫

X

(αv)−dµ)

= α

((∫X

u+dµ−∫

X

u−dµ

)− i

(∫X

v+dµ−∫

X

v−dµ

))= α

∫X

fdµ. (2.49)

32

2.1. LEBESGUE INTEGRATION CHAPTER 2. INTEGRATION THEORY

In addition, we claim∫

X(−f)dµ = −

∫Xfdµ. Combining the result above and this claim leads to that∫

Xαfdµ = α

∫Xfdµ for all α ∈ R. This claim is briefly shown below∫

X

(−f)dµ =(∫

X

(−u)+dµ−∫

X

(−u)−dµ)

+ i

(∫X

(−v)+dµ−∫

X

(−v)−dµ)

=(∫

X

u−dµ−∫

X

u+dµ

)+ i

(∫X

v−dµ−∫

X

v+dµ

)= −

((∫X

u+dµ−∫

X

u−dµ

)+ i

(∫X

v+dµ−∫

X

v−dµ

))= −

∫X

fdµ. (2.50)

Now we generalize the conclusion from the case with α ∈ R to α ∈ C. Write α = a + ib, thenαf = (au− bv) + i(bu+ av). Hence,∫

X

αfdµ =∫

X

(au− bv)dµ+ i

∫X

(bu+ av)dµ =∫

X

audµ+∫

X

(−b)vdµ+ i

∫X

budµ+ i

∫X

avdµ

= a

∫X

udµ− b

∫X

vdµ+ ib

∫X

udµ+ ia

∫X

vdµ

= (a+ ib)(∫

X

udµ+∫

X

vdµ) = α

∫X

fdµ. (2.51)

By linearity, we can extend the monotonicity of Lebesgue integral from non-negative functions to allreal-valued L1 functions.

Proposition 2.5. Given a measure space (X,M, µ) and f, g : X → R ∈ L1(X,µ), then

f(x) ≤ g(x), ∀x ∈ X ⇒∫

X

fdµ ≤∫

X

gdµ.

Proof.

f ≤ g ⇒ g − f ≥ 0 ⇒∫

X

gdµ−∫

X

fdµ =∫

X

(g − f)dµ ≥ 0 ⇒∫

X

fdµ ≤∫

X

gdµ. (2.52)

Proposition 2.6. Given a measure space (X,M, µ) and f ∈ L1(X,µ), then∣∣∣∣∫X

fdµ

∣∣∣∣ ≤ ∫X

|f |dµ.

Proof. Let z =∫

Xfdµ. It is obvious that

∫X|f |dµ ≥ 0, hence if z = 0, the statement trivially holds. Now

we consider the case in which z 6= 0. Let α = z/|z|, then |α| = 1 and αz = |z|. Thus,∣∣∣∣∫X

fdµ

∣∣∣∣ = α

∫X

fdµ =∫

X

αfdµ. (2.53)

Write αf = ua + iva, then |ua(x)| ≤ |αf(x)| = |f(x)|,∀x ∈ X. By definition,

|z| =∫

X

αfdµ =∫

X

uadµ+ i

∫X

vadµ. (2.54)

since |z| ∈ R,∫

Xvadµ = 0. Then by monotonicity,∣∣∣∣∫

X

fdµ

∣∣∣∣ = |z| =∫

X

uadµ ≤∫

X

|f |dµ. (2.55)

33

2.1. LEBESGUE INTEGRATION CHAPTER 2. INTEGRATION THEORY

2.1.5 Dominated Convergence Theorem

Dominated Convergence Theorem (DCT) introduced below is one of the most important theorem in measureand integration theory, which establishes a widely applicable condition under which the order of limit andintegration can be exchanged.

Theorem 2.5 (Dominated Convergence Theorem (DCT)). Let (fn)∞n=1 be a sequence of complex measurablefunction defined in a measure space (X,M, µ) and f is a complex function satisfying

f(x) = limn→∞

fn(x),∀x ∈ X.

If there exists a positive function g ∈ L1(µ), such that |fn(x)| ≤ g(x),∀x ∈ X, then f ∈ L1(µ) and

limn→∞

∫X

|f − fn|dµ = 0,

thus ∫X

fdµ = limn→∞

∫X

fndµ.

Proof. 1. We first show that f is integrable, i.e f ∈ L1(µ). Since |fn(x)| ≤ g(x), ∀x ∈ X, ∀n ∈ N,|f(x)| = limn→∞ |fn(x)| ≤ g(x), hence ∫

X

|f |dµ ≤∫

X

gdµ <∞, (2.56)

which implies that f ∈ L1(µ).

2. Note that |fn(x) − f(x)| ≤ 2g(x), we define a real valued function hn : X → R for each n ∈ N byhn(x) = 2g(x) − |fn(x) − f(x)|. It is clear that hn is positive function for each n. In addition, aslimn→∞ fn = f , limn→∞ hn = 2g. Then, we have∫

X

2gdµ =∫

X

limn→∞

(2g − |fn − f |)dµ =∫

X

lim infn→∞

(2g − |fn − f |)dµ

≤ lim infn→∞

(2g − |fn − f |)dµ (by Fatou’s Lemma)

=∫

X

2gdµ+ lim infn→∞

(−∫

X

|fn − f |dµ)

=∫

X

2gdµ− lim supn→∞

∫X

|fn − f |dµ. (2.57)

Since g ∈ L1(µ),∫

X2gdµ < +∞, hence it is necessary that

lim supn→∞

∫X

|fn − f |dµ = limn→∞

∫X

|fn − f |dµ = 0. (2.58)

3. Finally, we have for each n,

limn→∞

∣∣∣∣∫X

fdµ−∫

X

fndµ

∣∣∣∣ ≤ limn→∞

∫X

|f − fn|dµ = 0 (2.59)

It follows that ∫X

fdµ = limn→∞

∫X

fndµ. (2.60)

34

2.2. NULL SETS AND ALMOST EVERYWHERE CHAPTER 2. INTEGRATION THEORY

2.2 Null Sets and Almost Everywhere

2.2.1 Null Sets and Complete Measure

Definition 2.6 (Null set). Let (X,M, µ) be a measure space, a set A ⊂ X is called a null set, or zeromeasure set, if it is contained in a set with measure zero, i.e there exists B ∈ M such that µ(B) = 0 andA ⊆ B.

Definition 2.7 (Complete measure). Let (X,M, µ) be a measure space, µ is said to be a complete measureif for every E ∈ M with µ(E) = 0, one has A ⊆ E ⇒ A ∈ M.

In other words, a measure is complete if every null set is measurable.

Theorem 2.6 (Completion of a measure). Given a measure space (X,M, µ), define M = {E ⊂ X|∃A,B ∈M, A ⊆ E ⊆ B, and µ(B\A) = 0}, and µ : M → [0,+∞], such that for each E ∈ M as defined above,µ(E) = µ(A). Then, M is a σ-algebra over X, and µ is well-defined and it is a complete measure.

Proof. 1. First of all, we show that M is a σ-algebra.

Note that M ⊆ M, since for each E ∈ M, we can choose A = E and B = E, which have A ⊆ E ⊆ Band B\A = ∅ (thus µ(B\A) = 0). Hence, E ∈ M. Then, we verify that M satisfies the three conditionsof a σ-algebra.

(a) ∅ ∈ M ⊆ M.

(b) Suppose E ∈ M, there exists A,B ∈ M such that A ⊆ E ⊆ B and µ(B\A) = 0. Then for Ec, wehave Bc ⊆ Ec ⊆ Ac, where both Ac and Bc are in M. In addition, Ac\Bc = Ac ∩B = B\A, thusµ(Ac\Bc) = 0. Hence, Ec ∈ M.

(c) Suppose {En}∞n=1 ⊂ M, there exists, A1, A2, . . . ∈ M and B1, B2, . . . ∈ M such that An ⊆ En ⊆Bn and µ(Bn\An) = 0 for each n ∈ N. Then for

∪∞n=1En, we have

∞∪n=1

An ⊆∞∪

n=1

En ⊆∞∪

n=1

Bn (2.61)

where∪∞

n=1An ∈ M and∪∞

n=1Bn ∈ M. In addition,( ∞∪n=1

Bn

)\

( ∞∪n=1

An

)⊆

∞∪n=1

(Bn\An). (2.62)

It implies that

µ

(( ∞∪n=1

Bn

)\

( ∞∪n=1

An

))≤

∞∑n=1

µ(Bn\An) = 0. (2.63)

Therefore,∪∞

n=1En ∈ M.

2. We then show that µ is well defined.

Formally, this is equivalent to the following statement: Given E ∈ M, and there exists A1, B1, A2, B2 ∈M such that A1 ⊆ E ⊆ B1, A2 ⊆ E ⊆ B2, and µ(B1\A1) = µ(B2\A2) = 0, then µ(A1) = µ(A2).

Note that A2 ⊆ B1, then we have A2\A1 ⊆ B1\A1, hence µ(A2\A1) ≤ µ(B1\A1) = 0. Consequently,µ(A2) ≤ µ(A1) + µ(A2\A1) = µ(A1), likewise, we have µ(A1) ≤ µ(A2). It follows that µ(A1) = µ(A2).

3. In the following, we continue to show that µ is a measure of the measurable space (X,M).

(a) For ∅, we can choose A = B = ∅, thus µ(∅) = µ(∅) = 0.

(b) Let {En}∞n=1 ⊂ M be a countable collection of disjoint sets which are measurable w.r.t M, and letE =

∪∞n=1En. Then we can find A1, A2, . . . ∈ M and B1, B2, . . . ∈ M such that An ⊆ En ⊆ Bn

and µ(Bn\An) = 0 for each n ∈ N. Let A =∪∞

n=1An and B =∪∞

n=1Bn. We have shown abovethat A ⊆ E ⊆ B and µ(B\A) = 0. Hence, µ(E) = µ(A), and µ(En) = µ(An).{En}∞n=1 are disjoint andAn ⊆ En, it follows that {An}∞n=1 are disjoint, hence µ(A) =

∑∞n=1 µ(An).

As a result, µ(E) =∑∞

n=1 µ(En).

35

2.2. NULL SETS AND ALMOST EVERYWHERE CHAPTER 2. INTEGRATION THEORY

4. Finally, we show that µ is complete.

Let E ⊂ X such that there exists B ∈ M such that E ⊆ B and µ(B) = 0, then E ∈ M, whichimmediately follows from the definition of M by choosing A = ∅.

2.2.2 Almost Everywhere

Definition 2.8 (Almost everywhere). Let (X,M, µ) be a measure space, and P be a property of X, then wesay that P holds almost everywhere with respect to µ if there exists N ∈ M with µ(N) = 0 such that Pholds for each x ∈ X\N . This is notated by P holds a.e.[µ]. Here, [µ] can be omitted if the measure is clearfrom context.

In other words, P holds almost everywhere, if it holds over the entire X except for a null set.

Definition 2.9 (Almost everywhere equality). Let f and g be two measurable functions defined on a measurespace (X,M, µ), we say that f equals g almost everywhere if µ{x|f(x) 6= g(x)} = 0, notated by f = g, a.e.[µ].

Proposition 2.7. Almost everywhere equality is an equivalence relation between measurable functions.

It is trivial to check this.

Theorem 2.7. Let (X,M, µ) be a measure space, f and g be two measurable functions, then

f = g, a.e.[µ] ⇒∫

X

fdµ =∫

X

gdµ.

Proof. Let N = {x|f(x) 6= g(x)}. If N = ∅, then f is the same as g, then∫

Xfdµ =

∫Xgdµ trivially holds.

Otherwise, we have∫

Xfdµ =

∫X\N

fdµ +∫

Nfdµ. Since µ(N) = 0,

∫Nfdµ = 0, thus

∫Xfdµ =

∫X\N

fdµ.Likewise,

∫Xgdµ =

∫X\N

gdµ. And f equals g on X\N , we thus have∫

X\Nfdµ =

∫X\N

gdµ. Hence, theequality is established.

In the following, we extend the concept of measurable function to those defined almost everywhere (butnot necessarily the entire space).

Definition 2.10 (measurable function (defined almost everywhere)). Let (X,M, µ) be a measure space,E ∈ M and µ(Ec) = 0, Y be a topological space, a function f : E → Y is said to be measurable on X iff−1(V ) ∩ E is measurable for each open set V ⊆ Y .

In addition, we define its Lebesgue integral over X by∫X

fdµ :=∫

E

fdµ

and we say f ∈ L−1(X,µ) if∫

E|f |dµ < +∞.

Lemma 2.4. Integrable function is finite almost everywhere. Formally, let (X,M, µ) be a measure space,and f ∈ L1(µ), then

|f(x)| < +∞, a.e.[µ].

Proof. Let Sn = {x ∈ X||f(x)| > n}, and M =∫

X|f |dµ, since f ∈ L1(µ), M < +∞. Then, we have

M =∫

X

|f |dµ ≥∫

Sn

|f |dµ ≥∫

Sn

ndµ = nµ(Sn). (2.64)

Hence, µ(Sn) ≤ M/n, thus limn→∞ µ(Sn) = 0. Let R = {x ∈ X||f(x)| = +∞}, then R ⊆ Sn, ∀n ∈ N, itfollows that µ(R) ≤ µ(Sn), ∀n ∈ N, therefore, µ(R) = 0.

The following theorem states some important facts about infinite series of integrable functions.

Theorem 2.8. Let (fn)∞n=1 be a sequence of measurable functions defined almost everywhere on a measurespace (X,M, µ), such that

∑∞n=1

∫X|fn|dµ < +∞ then. we have

1. f(x) =∑∞

n=1 fn(x) converges almost everywhere w.r.t µ;

36

2.2. NULL SETS AND ALMOST EVERYWHERE CHAPTER 2. INTEGRATION THEORY

2. f is integrable, i.e f ∈ L1(X,µ);

3.∫

Xfdµ =

∑∞n=1

∫fndµ.

Note that even when each of fn is defined on the entire X, their sum converges almost everywhere (notnecessarily the entire X).

Proof. Let Sn is the domain of fn, in which fn is defined, then µ(Scn) = 0. Let S = ∩∞

n=1Sn, it is easy to seethat µ(Sc) = 0. We define φ : S → [0,+∞] by φ(x) =

∑∞n=1 |fn(x)|, then by MCT,∫

S

φdµ =∫

S

∞∑n=1

|fn|dµ =∞∑

n=1

∫S

|fn|dµ < +∞ (2.65)

Hence, φ ∈ L1(µ). Let E = {x ∈ S|φ(x) < +∞}, by the lemma above, µ(Ec) = 0. Note that f is absolutelyconvergent on E and |f(x)|leφ(x), ∀x ∈ E, therefore, f ∈ L1(µ).

Let gn =∑N

i=1 fi(x), then |gn(x)| ≤ φ(x), ∀x ∈ E. By definition of infinite series, limn→∞ gn(x) =f(x), ∀x ∈ E. Hence, by DCT (gn is dominated by φ ∈ L1(µ)),∫

E

fdµ = limn→∞

∫E

gndµ = limn→∞

n∑i=1

∫E

fidµ =∞∑

n=1

∫E

fndµ. (2.66)

Proposition 2.8. Let (X,M, µ) be a measure space, then

1. If f : X → [0,+∞] is measurable such that∫

Efdµ = 0 for E ∈ M, then f = 0, a.e.[µ] on E.

2. Let f ∈ L1(µ), such that∫

Efdµ = 0,∀E ∈ M, then f = 0, a.e.[µ] on X.

3. Let f ∈ L1(µ), such that |∫

Xfdµ| =

∫X|f |dµ, then there exists α ∈ C such that αf = |f |, a.e. on X.

Proof. 1. Let An = {x ∈ E|f(x) > 1/n}, then

0 =∫

E

fdµ ≥∫

An

fdµ ≥∫

An

1ndµ =

1nµ(An). (2.67)

Hence, µ(An) = 0. Let S = {x ∈ E|f(x) > 0}, note that S =∪∞

n=1An. As a result,

µ(S) ≤∞∑

n=1

µ(An) = 0. (2.68)

It means that f(x) = 0, a.e.[µ] on E.

2. Decompose f into f = (u+ − u−) + i(v+ − v−), since f is measurable, u+, u−, v+, v− are measurable.Let A = (u+)−1((0,+∞]). And, we have

0 = Re∫

A

fdµ =∫

A

u+dµ (2.69)

By the statement shown above, u+ = 0 almost every on A, thus µ(A) = 0. Likewise, the measure of(u−)−1((0,+∞]), (v+)−1((0,+∞]), and (v−)−1((0,+∞]) are all zeros. Note that

f−1(R\{0}) ⊂ (u+)−1((0,+∞]) ∪ (u−)−1((0,+∞]) ∪ (v+)−1((0,+∞]) ∪ (u−)−1((0,+∞]). (2.70)

Hence, µ(f−1(R\{0}) = 0, which means that f = 0, a.e.[µ] on X.

3. Let z =∫

Xfdµ, and choose α = z/|z|, then αz = |z| ∈ [0,+∞]. Then

|z| =∫

X

αfdµ =∫

X

|f |dµ. (2.71)

Note that |α| = 1, |alphaf | = |f |, thus Re(αf) ≤ |f | on X, i.e Re(|f | −αf) ≥ 0 on X, let h = |f | −αf ,we can write h = u+ + iv, since u− = 0 on X. From (2.71),

∫Xhdµ = 0, it follows that

∫Xu+dµ = 0,

thus u+ = 0, a.e. on X, i.e |f | = Re(αf), a.e. on X, thus |f | = αf, a.e. on X.

37

2.3. RIESZ REPRESENTATION THEOREM CHAPTER 2. INTEGRATION THEORY

Theorem 2.9. Let (X,M, µ) be a measure space with locally finite measure, f ∈ L1(µ), and S be a closedsubset of C, if for each E ∈ M with 0 < µ(E) < +∞, one has

AE(f) =1

µ(E)

∫E

fdµ ∈ S,

then f(x) ∈ S, a.e.[µ] on X.

Proof. Since S is closed, Sc is open. If, Sc = ∅, then we are done. Otherwise, for each c ∈ Sc, there existsr > 0 such that Br(c) ⊂ Sc, where Br(c) is the open ball of radius r centered at c.

Note that we need prove f−1(Sc) has zero measure, and f−1(Sc) is countable union of the sets in form off−1(Br(c) with Br(c) ⊆ Sc. Hence, it is enough to show that µ(f−1(Br(c))) = 0 for each Br(c) ⊆ Sc.

By contradition, we assume that there exists Br(c) ⊆ Sc such that µ(E) > 0 where E = f−1(Br(c)). (Ifµ(E) = +∞, we can find any finite-measure subset in it. This can be done since µ is locally finite.) Then,we have

|AE(f) − c| =∣∣∣∣ 1µ(E)

∫E

fdµ− 1µ(E)

∫E

cdµ

∣∣∣∣ ≤ 1µ(E)

∫E

|f − c|dµ ≤ 1µ(E)

∫E

rdµ = r. (2.72)

Since, AE(f) ∈ S, and Br(c) ⊆ Sc, hence ∀x ∈ S, |x− c| > r, thus AE(f) > r, leading to contradition.

(Note that in the lecture, the condition is given as µ(X) < +∞, actually, as we have seen that this canbe relaxed to a mild condition, namely local finiteness, such that it applies to more cases.)

Theorem 2.10. Let {En}∞n=1 be a collection of measurable sets in a measure space (X,M, µ) such that∑∞n=1 µ(En) < +∞, then almost every x ∈ X is covered by at most finitely many En.

Proof. Let A = {x ∈ X|x is in infinitely many En}, we need show µ(A) = 0. Define, g =∑∞

n=1 χEn , then itis equivalent to proving g < +∞, a.e. on X as x ∈ A⇔ g(x) = +∞. This readily follows from∫

X

gdµ =∫

X

∞∑n=1

χEndµ =∞∑

n=1

∫X

χEndµ =∞∑

n=1

µ(En) < +∞. (2.73)

Note that in this proof, A can be formally written as A =∩∞

n=1

∪∞k=nEn.

Remark: it is claimed in the lecture that µ(A) = 0 ⇔∑∞

n=1 µ(En) < +∞ when µ(X) < +∞. I noticethat this claim is not necessarily true.

2.3 Riesz Representation Theorem

Definition 2.11 (positive linear functional). A positive linear functional defined on a function space isa linear functional that yields non-negative value for each non-negative function.

Definition 2.12 (Regular measure). Let X be a topological space, and M is a σ-algebra over X, then ameasure µ : M → [0,+∞] is called a regular measure, if it satisfies

1. (outer regularity)µ(E) = inf{µ(V )|E ⊆ V, V is open}, ∀E ∈ M;

2. (inner regularity)µ(E) = sup{µ(K)|K ⊆ E, K is compact}, ∀E ∈ M.

In particular, we say that µ satisifies outer regularity at E if µ(E) = inf{µ(V )|E ⊆ V, V is open}, andthat is satisfies inner regularity at K if µ(E) = sup{µ(K)|K ⊆ E, K is compact}.

Theorem 2.11 (Riesz representation theorem in a σ-compact space). Let X be a Hausdorff, locally compact,and σ-compact topological space, Λ is a positive linear functional on CC(X), then there exists a unique regularBorel measure µ such that Λf =

∫Xfdµ, ∀f ∈ Cc(X).

38

2.3. RIESZ REPRESENTATION THEOREM CHAPTER 2. INTEGRATION THEORY

A more general version of Resiz representation theorem (without the requirement of σ-compactness) is

Theorem 2.12 (Riesz representation theorem). Let X be a Hausdorff, locally compact topological space, Λbe a positive linear functional on CC(X), then there exists a σ-algebra M that contains the Borel σ-algebra,and there exists a unique positive measure µ on M such that it satisfies the following five conditions:

C.1: Λf =∫

Xfdµ, ∀f ∈ Cc(X);

C.2: µ(K) < +∞ for each compact set K;

C.3: (outer regularity) µ(E) = inf{µ(V )|E ⊆ V, V is open} for all E ∈ M;

C.4: ((semi) inner regularity) µ(E) = sup{µ(K)|K ⊆ E, K is compact} for all E ∈ M with µ(E) < +∞ orE is open;

C.5: (completeness) µ is a complete measure, i.e. µ(E) = 0 ⇒ A ∈ M, ∀A ⊆ E.

We divide the proof into two parts: the first part is to show the existence of such µ; the second part is toshow that the measure µ that satisfies all the properties is unique.

In addition, for conciseness, we introduce the following notations to represent the relation between afunction f : X → [0, 1] ∈ Cc(X) and a set A. f ≺ A means suppf ⊆ A; while f � V means supp(1−f) ⊂ A,i.e. f(x) = 1, ∀x ∈ A.

Proof of Existence. We first construct the σ-algebra M and the measure µ explicitly, and then prove that theconstructed objects satisfy the required conditions.

First of all, we define a non-negative real valued function µ : 2X → [0,+∞] as follows. For each open setV ,

µ0(V ) = sup{Λf |f ≺ V },and then it can be extended to each subset E of X as

µ(E) = inf{µ0(V )|E ⊆ V, V is open}.

Actually, µ as defined above is an outer measure that on which we are going to construct the σ-algebra andderive the measure by restriction.

Claim (1) µ is an outer measure on X with µ(E) = µ0(E) when E is open, i.e. µ is an extension of µ0.

Proof of Claim (1). The proof takes several steps,

1. First of all, we show that µ0 satisfies the following three conditions.

(a) (µ0(∅) = 0) It is easy to see that the only function f that has f ≺ ∅ is the zero function, thusΛf = 0, and µ0(∅) = 0.

(b) (monotonicity) Given two open sets V1 ⊂ V2, f ≺ V1 ⇒ f ≺ V2, and thus µ0(V1) ≤ µ0(V2) bydefinition, as V2 may admit a larger set of functions whose supports contained in it.

(c) (sub-additivity) We first show that µ0(V1 ∪ V2) ≤ µ0(V1) + µ0(V2) for any two open sets V1, V2,and then extend it to finite sub-additivity.Consider any function g ≺ V1 ∪ V2, by partition of unity, there exist h1 and h2 such that h1 ≺ V1,h2 ≺ V2 and h1(x) + h2(x) = 1, ∀x ∈ supp g. As a result, h1g ≺ V1, h2g ≺ V2, and h1g+ h2g = g.By linearity of Λ,

Λg = Λ(h1g + h2g) = Λ(h1g) + Λ(h2g) ≤ µ0(V1) + µ0(V2), (2.74)

This holds for any g ≺ V1 ∪ V2, consequently,

µ0(V1 ∪ V2) = supg≺V1∪V2

Λg ≤ µ0(V1) + µ0(V2). (2.75)

By induction, this results can be generalized to finite sub-additivity as for any open sets V1, . . . , Vn,

µ0

(n∪

i=1

Vi

)≤

n∑i=1

µ0(Vi). (2.76)

39

2.3. RIESZ REPRESENTATION THEOREM CHAPTER 2. INTEGRATION THEORY

2. Then, we show that µ is an outer measure based on the results above.

(a) (µ(∅) = 0) The smallest open set that contains an empty set is obviously an empty set. Thus, bydefinition, we have µ(∅) = 0.

(b) (monotonicity) Given any subsets E1 and E2 of X with E1 ⊂ E2, it is clearly that E2 ⊆ V ⇒E1 ⊆ V . It follows that µ(E1) ≤ µ(E2) by definition.

(c) (countable sub-additivity) Consider a countable collection of sets {Ei}∞i=1, and let E =∪∞

i=1Ei.We assume that µ(Ei) < +∞ for all i, otherwise, the sub-additivity trivially holds as both sidesare infinity. By definition of µ(E), given any ε > 0, for each i, there is an open set Vi with Ei ⊆ Vi

such that µ0(Vi) ≤ µ(Ei)+2−iε. Let f : X → [0, 1] ∈ Cc(X) have f ≺ E. Then, {Vi}∞i=1 is an opencover of supp f , which is a compact set, hence there is a finite number N such that f ≺

∪Ni=1 Vi.

It follows that

Λf ≤ µ0

(N∪

i=1

Vi

)≤

N∑i=1

µ0(Vi) ≤∞∑

i=1

µ(Ei) + ε. (2.77)

As this holds for all open sets V ⊇∪∞

i=1Ei and all f ≺ V , we have

µ(E) = sup{Λf |f ≺ V ⊇ E} ≤∞∑

i=1

µ(Ei) + ε. (2.78)

Since it holds for any ε > 0, the sub-additivity is established.

3. Finally, we show that µ and µ0 agrees on open sets. Note that when E is open, the smallest open setthat contains E is itself. By monotonicity of µ0, we have

µ(E) = inf{µ0(V )|E ⊆ V, V is open} = µ0(E). (2.79)

The claim (1) is finally proved.

For the convenience of discussing inner and outer regularity, we define

µ∗(E) = sup{µ(K)|K ⊆ E, K is compact}, and µ∗(E) = inf{µ(V )|E ⊆ V, V is open}.

Hence, µ(E) satisfies outer regularity at E when µ(E) = µ∗(E), and satisfies inner regularity when µ(E) =µ∗(E). We can see that µ is inherently outer regular by its definition. (C.3 automatically holds)

Then, we defineMF = {E|µ(E) < +∞ and µ(E) = µ∗(E)},

andM = {E|K ∩ E ∈ MF , ∀K compact}.

We are going to show that M is an σ-algebra with µ being a measure over it. To this end, we first establishseveral important results about MF .

Claim (2) if K is compact, then K ∈ MF and µ(K) = inf{Λf |K ≺ f}.Claim (3) if V is open, then µ(V ) = µ∗(V ), i.e. µ(V ) < +∞ ⇒ V ∈ MF .

We note that C.4 is established when these two claims are proved.

Proof of Claim (2). First, it immediately follows from the monotonicity of µ that for each compact set K,

µ(K) = sup{µ(K ′)|K ′ ⊆ K, K ′ is compact}. (2.80)

Therefore, K ∈ MF . In the following, we will show the identity that µ(K) = inf{Λf |f � K}.For a given compact set K, let f : X → [0, 1] ∈ Cc(X) satisfy f � K and 0 < α < 1, we define

Vα = f−1((α,+∞)), which is open due to the continuity of f . We note that K ⊆ Vα for any α ∈ (0, 1) andαg ≤ f, ∀g ≺ Vα. By claim (1), we have

µ(K) ≤ µ(Vα) = supg≺Vα

Λg ≤ α−1Λf, (2.81)

40

2.3. RIESZ REPRESENTATION THEOREM CHAPTER 2. INTEGRATION THEORY

this holds any f � K and any α ∈ (0, 1), thus µ(K) ≤ inf{Λf |K ≺ f}, since Λf < +∞ for all f ∈ Cc(f)(why?), µ(K) < +∞. On the other hand, by definition of µ and claim (1), given ε > 0, there is an open set Vwith K ⊂ V such that µ(V ) < µ(K) + ε. By Urysohn lemma, there is a f ∈ CC(X) with K ≺ f ≺ V , hence,

Λf ≤ µ(V ) ≤ µ(K) + ε. (2.82)

As this holds for any ε > 0, we have µ(K) ≥ inff�K Λf , and thus µ(K) = inf{Λf |K ≺ f}.

Proof of Claim (3). Given any open set V , first of all µ(V ) ≥ µ∗(V ), which directly follows from the mono-tonicity of µ and the definition of µ∗. For the other direction, we need to show that for any open set V ,

sup{Λf |f ≺ V } ≤ sup{µ(K)|K ⊆ V }. (2.83)

It suffices to show that for each f ≺ V , there exists a compact set K ⊂ V such that Λf ≤ µ(K). To showthis, we let K = supp f , clearly, K ⊆ V due to f ≺ V . And, for each f ′ � K, we have Λf ≤ Λf ′, due topositiveness of Λ, thus Λf ≤ µ(K) = inf{Λf ′|f ′ � K} (the last equality is due to claim (2)).

An important property with MF is that µ satisfies countable additivity on MF . This property will beused in proving other claims.

Claim (4) Let {Ei}∞i=1 be a countable collection of disjoint sets in MF , let E =⊔∞

i=1Ei, then µ(E) =∑∞i=1 µ(Ei). In addition, if µ(E) < +∞, E ∈ MF .

Proof. We first show that for any two compact sets K1 and K2, we have µ(K1 tK2) = µ(K1)+µ(K2). Fromclaim (1), we immediately have µ(K1 tK2) ≤ µ(K1)+µ(K2) due to sub-additivity. Hence, it suffices to showthe inequality in the other direction, i.e. µ(K1 tK2) ≥ µ(K1) + µ(K2).

First, by Urysohn lemma, there exist f : X → [0, 1] ∈ Cc(X) such that f(x) = 1,∀x ∈ K1 and f(x) =0,∀x ∈ K2. On the other hand, from claim (2), given any ε > 0, there is a g : X → [0, 1] ∈ Cc(X) such thatK1 tK2 ≺ g and Λg < µ(K1 tK2) + ε. As a result,

µ(K1) + µ(K2) ≤ Λ(fg) + Λ((1 − f)g) = Λg ≤ µ(K1 tK2) + ε. (2.84)

As this holds for any ε > 0, µ(K1) + µ(K2) ≤ µ(K1 tK2), thus µ(K1 tK2) = µ(K1) + µ(K2). By induction,this generalizes to any finite union of compact sets.

Then, we consider a countable collection {Ei}∞i=1 ⊂ MF where Ei are disjoint. Let E =⊔∞

i=1Ei, Likewise,it is enough to prove µ(E) ≥

∑∞i=1 µ(Ei). Given ε > 0, for each i, since Ei ∈ MF , there exists a compact

set Hi ⊆ Ei such that µ(Hi) ≥ µ(Ei) − 2−iε. Let Gn =⊔n

i=1Hi, then Gn is a compact set for each n, andµ(Gn) =

∑ni=1 µ(Hi) due to the conclusion shown right above.

µ(E) ≥ µ(Gn) =n∑

i=1

µ(Hi) ≥ µni=1µ(Ei) − ε. (2.85)

As this holds for any n ∈ N and ε > 0, take n→ ∞, we have µ(E) ≥∑n

i=1 µ(Ei).

We have shown that compact sets and open sets with finite µ are all in MF . Actually, we will show thatany set in MF can be approximated by compact sets from below, and by open sets from above.

Claim (5) If E ∈ MF , for any ε > 0, there exist a compact set K and an open set V such that K ⊆ E ⊆ Vand µ(V \K) < ε.

Proof of Claim (5). By definition of MF , µ∗ and µ∗, for each E ∈ MF , there exist K and V , such that

µ(V ) − ε

2< µ(E) < µ(V ) +

ε

2. (2.86)

Note that V \K is open, and µ(V ) < µ(K) + ε < +∞, by claim (3), V \K ∈ MF . On the other hand, byclaim (4), µ(V ) = µ(K) + µ(V \K), thus µ(V \K) < ε.

Now, we have sufficiently characterize what are in MF .Claim (6) Let A,B ∈ MF , then A ∪B,A ∩B,A\B ∈ MF .

41

2.3. RIESZ REPRESENTATION THEOREM CHAPTER 2. INTEGRATION THEORY

Proof of Claim (6). From the definition of MF and monotonicity of µ, we can see that to show E ∈ MF , itsuffices to show there exists a compact set K ⊆ E such that µ(E) ≤ µ(K) + ε for any given ε > 0.

Let A,B ∈ MF , then there exist K1,K2, V1 and V2 with K1 ⊆ A ⊆ V1 and K2 ⊆ B ⊆ V2 such thatµ(Vi\Ki) < ε for any given ε > 0, according to claim (5).

1. (A ∪ B ∈ MF ). Clearly, K1 ∪ K2 ⊆ A ∪ B ⊆ V1 ∪ V2, K1 ∪ K2 is compact, and V1 ∪ V2 is open. Inaddition, we have

(V1 ∪ V2)\(K1 ∪K2) ⊆ (V1\K1) ∪ (V2\K2). (2.87)

As µ is an outer measure,

µ((V1 ∪ V2)\(K1 ∪K2)) ≤ µ(V1\K1) + µ(V2\K2) ≤ 2ε. (2.88)

2. (A ∩B ∈ MF ). Clearly, K1 ∩K2 ⊆ A ∩B ⊆ V1 ∩ V2, K1 ∩K2 is compact, and V1 ∩ V2 is open. And,

(V1∩V2)\(K1∩K2) = V1∩V2∩(Kc1∪Kc

2) = (V1∩Kc1∩V2)∪(V1∩V2∩Kc

2) ⊆ (V1\K1)∪(V2\K2), (2.89)

µ((V1 ∩ V2)\(K1 ∩K2)) ≤ µ(V1\K1) + µ(V2\K2) ≤ 2ε. (2.90)

3. (A\B ∈ MF ). Note that

A\B ⊆ V1\V2 ⊆ (V1\K1) ∪ (K1\V2) ∪ (V2\K2), (2.91)

thusµ(A\B) ≤ µ(K1\V2) + 2ε, (2.92)

where K1\V2 is compact set contained in A\B.

To sum up, MF is closed under finite union, finite intersection, and set difference.

By extending MF to M = {E|K ∩E ∈ MF ,∀K compact}, we obtain an σ-algebra.

Claim (7) M is a σ-algebra that contains the Borel σ-algebra.

Proof of Claim (7). We first show that M is a σ-algebra.

1. ∅ ∈ M, because ∅ ∩K = ∅ ∈ MF for every compact set K.

2. Let A ∈ M, i.e. for each compact set K, A∩K ∈ MF , thus Ac ∩K = K\(A∩K) ∈ MF , as K ∈ MF

and MF is closed under set difference. Therefore, Ac ∈ M.

3. Let {Ai}∞i=1 ⊂ M, then for for any compact set K, Ai ∩ K ∈ MF for each i. Let B1 = A1 ∩ Kand Bn+1 = (An+1 ∩ K)\

∪ni=1Bi for each n ≥ 1. It is easy to see that Bn ∈ MF for each n due

to the fact that MF is closed under union and set difference. We note that A ∩ K =⊔∞

n=1Bn, andµ(A ∩K) ≤ µ(K) < +∞, by claim (4), A ∩K ∈ MF . As this holds for any compact set K, A ∈ M.

Now, we can conclude that M is a σ-algebra. In the following, we show that it contains the Borel σ-algebra.It suffices to show that it contains every closed set. For each closed set C, and any compact set K, C ∩K iscompact, and thus C ∩K ∈ MF , which follows that C ∈ M.

Claim (8) MF = {A ∈ M|µ(A) < +∞}, i.e. A ∈ MF if and only if A ∈ M and µ(A) < +∞.

Proof of Claim (8). The proof is conducted in two directions respectively.

1. A ∈ MF ⇒ A ∈ M and µ(A) < +∞. Let A ∈ MF , then for any compact set K, K ∈ MF by claim(2), and A ∩ K ∈ MF by claim (6), thus A ∈ M. The statement µ(A) < +∞ is directly from thedefinition of MF .

42

2.3. RIESZ REPRESENTATION THEOREM CHAPTER 2. INTEGRATION THEORY

2. A ∈ M and µ(A) < +∞ ⇒ A ∈ MF . It suffices to show that given any ε > 0, there exists a compactset K ⊆ A such that µ(A) ≤ µ(K) + ε.

As µ(A) = inf{µ(V )|A ⊆ V and V is open}, fix ε > 0, there exists an open set V ⊇ A such thatµ(V ) < +∞, by claim (5), there exists an open set K ⊆ V such that µ(V \K) < ε/2. As A ∈ M,A∩K ∈ MF , then there exists another compact set K ′ such that K ′ ⊆ A∩K and µ((A∩K)\K ′) < ε/2.Due to the sub-additivity of µ, we have

µ(A) ≤ µ(K ′) + µ(A\K ′), (2.93)

whereA\K ′ ⊆ ((A ∩K)\K ′) ∪ (A\K) ⊆ ((A ∩K)\K ′) ∪ (V \K) ≤ ε/2 + ε/2 = ε. (2.94)

Hence, we can conclude that A ∈ MF .

Claim (9) The restriction of µ to M is a complete measure.

Proof. We have shown in claim (1) that µ(∅) = 0, thus we only need to show µ satisfies σ-additivity on M.Let {Ei}∞i=1 be a collection of disjoint sets in M, and let E =

⊔∞i=1Ei. If there exist i with µ(Ei) = +∞,

then by monotonicity µ(E) = +∞, and the countability of µ trivially holds in this case. Assume µ(Ei) < +∞for each i, then every Ei is in M by claim (8). The countable additivity in this case has been established byclaim (4).

Finally, we show that µ is complete. Let A ∈ M with µ(A) = 0, and B ⊂ A. By claim (8), A ∈ MF . Toshow that B is also in MF , it is enough to show that for every compact set K with K ⊆ B, µ(K) = 0, thisdirectly follows from the monotonicity of µ.

Claim (10) Λf =∫

Xfdµ, ∀f ∈ Cc(X). (This corresponds to C.1)

Proof. First of all we prove that for any real valued function f : X → R ∈ Cc(X), we have Λf =∫

Xfdµ.

Let K = supp f . Since f is continuous, f(K) is compact, and thus there exists a closed interval [a, b]with f(X) ⊆ [a, b]. Given any ε > 0, choose a finite sequence of values y0, y1, . . . , yn such that y0 < a ≤ y1 ≤· · · ≤ yn = b such that yi+1 − yi < ε for i = 0, . . . , n− 1. Define Ei = f−1((yi−1, yi])∩K, then E1, . . . , En area collection of disjoint Borel sets, with

⊔ni=1Ei = K. Therefore, for each i, there exists an open set Vi with

Ei ⊆ Vi such that µ(Vi) < µ(Ei) + ε/n, and f(x) < yi + ε, forallx ∈ Vi. Then {Vi}ni=1 form a open cover of

K, by partition of unity, For Vi, there is hi ≺ Vi, such that∑n

i=1 hi(x) = 1,∀x ∈ K. Hence, f =∑n

i=1 hif .Note that

µ(K) ≤ Λ

(n∑

i=1

hi

)=

n∑i=1

Λhi, (2.95)

and hif ≤ (yi + ε)hi, thus

Λf ≤n∑

i=1

Λ(hif) ≤n∑

i=1

Λ((yi + ε)hi) =n∑

i=1

(yi + ε)Λhi

=n∑

i=1

(|a| + yi + ε)Λhi − |a|n∑

i=1

Λhi (2.96)

Here, for each i, Λhi ≤ µ(Vi) < µ(Ei) + ε/n, as hi ≺ Vi, and∑n

i=1 Λhi ≥ mu(K), we then have

Λf ≤n∑

i=1

(|a| + yi + ε)(µ(Ei) + ε/n) − |a|µ(K). (2.97)

Note µ(K) =∑n

i=1 µ(Ei), we further have

Λf ≤n∑

i=1

(yi − ε)µ(Ei) + 2εµ(K) +ε

n

n∑i=1

(|a| + yi + ε). (2.98)

43

2.3. RIESZ REPRESENTATION THEOREM CHAPTER 2. INTEGRATION THEORY

Here, 1n

∑ni=1 yi ≤ |b|, and

∑ni=1(yi − ε)µ(Ei) ≤

∫Xfdµ (by monotonicity of Lebesgue integral), thus

Λf ≤∫

X

fdµ+ ε(2µ(K) + |a| + |b| + ε). (2.99)

As this holds for any ε > 0, Λf ≤∫

Xfdµ. From this, we also have Λ(−f) ≤

∫X

(−f)dµ, leading to theinequality in opposite direction, and resulting in the equality Λf =

∫Xfdµ for any real valued function f .

It is not straightforward to extend this equality to general complex functions with compact supports, byrespectively considering the real and imaginary parts.

Up to here, the proof of existence is completed, in the following, we will prove the uniqueness.

Proof of Uniqueness. Suppose M is a σ-algebra, and µ1, µ2 are two measures on M that satisfy the conditionsC.1 to C.5. Since they both satisfy C.4, if they agree on compact sets, then they are identical, thus it sufficesto prove µ1 = µ2 for each compact set.

For each compact set K, given ε > 0, by C.3, there is an open set V with µi(V ) ≤ µi(K) + ε for i = 1, 2.By Urysohn lemma, there is f : X → [0, 1] ∈ Cc(X) with K ≺ f ≺ V , then

µl(K) =∫

X

χKdµ1 ≤∫

X

fdµ1 = Λf =∫

X

fdµ2 ≤∫

X

χV dµ2 = µ2(V ) ≤ µ2(K) + ε (2.100)

As this holds for arbitrary ε > 0, µ1(K) ≤ µ2(K), likewise, µ2(K) ≤ µ1(K), thus µ1(K) = µ2(K).

44

2.4. PRINCIPLES OF MEASURE THEORY CHAPTER 2. INTEGRATION THEORY

2.4 Principles of Measure Theory

There are four important results in measure theory, which we call the principles of measure theory, whoseproofs will be given in later lectures.

Let (X,M, µ) be a measure space, then

1. Every measurable set is nearly Borel . For each measurable set A, there exist Fσ, Gδ and N suchthat A = Fσ ∪N = Gδ ∪N , where Fσ is a countable union of closed sets, Gδ is a countable intersectionof open sets, and N is a null set.

2. Every measurable set is nearly open . For each measurable set A, given ε > 0, there is an open setU with A ⊆ U such that µ(U) < µ(A) + ε.

3. Every measurable function is nearly continuous. This is Lusin’s theorem.

4. Every pointwise convergent sequence of measurable function is nearly uniform convergent .This is Egorov’s theorem.

45

Chapter 3

Introduction to Lp Spaces

3.1 Important Inequalities

3.1.1 Convex Sets and Convex Functions

Definition 3.1 (Convex set). Let C be a subset of a complex vector space X, then C is called a convex setif x, y ∈ C ⇒ (1 − λ)x+ λy ∈ C, ∀λ ∈ [0, 1].

Definition 3.2 (Convex function). Let f : X → R be a real valued function defined on a complex vector spaceX, then f is called a convex function if f((1 − λ)x+ λy) ≤ (1 − λ)f(x) + λf(y), ∀x, y ∈ X, λ ∈ [0, 1]. Inparticular, if the equality holds iff x = y, then f is said to be strictly convex.

Some important convex functions defined on R include (− log), exp, and x 7→ |x|α with α ≥ 1.

Proposition 3.1. Each seminorm is convex. In particular, each norm is convex.

Proof. Let X be a vector space with a seminorm p defined on it. Given x, y ∈ X, then when λ ∈ [0, 1],p((1 − λ)x+ λy) ≤ (1 − λ)p(x) + λp(y), which directly follows from the definition of seminorm.

3.1.2 Jensen’s Inequality

In the following, we introduce several important inequalities.

Theorem 3.1 (Jensen’s Inequality). Let (X,M, µ) be a probability space (a measure space with µ(X) = 1),f : X → R ∈ L1(X,µ), and ϕ : R → R be a convex function, then

ϕ

(∫X

fdµ

)≤∫

X

(ϕ ◦ f)dµ.

Proof. Since ϕ is convex, at each x0 ∈ R, there exist a, b ∈ R such that ϕ(x0) = ax0 + b and ϕ(x) ≥ax + b, ∀x ∈ R, (here, y = ax + b defines a supporting plane of the epigraph of ϕ at x0). Let x0 =

∫Xfdµ,

then we have

ϕ

(∫X

fdµ

)= ϕ(x0) = ax0 + b = a

∫X

fdµ+ b =∫

(af + b)dµ ≤∫

(φ ◦ f)dµ. (3.1)

The proof is completed.

By considering measure on finite sets, we immediately derive the following finite form of the inequality.

Corollary 3.1 (Jensen’s Inequality in finite form). Let g : X → R be a convex function on vector space,x1, . . . , xn ∈ X, and α1, . . . , αn ∈ [0, 1] with

∑ni=1 αi = 1, then

g

(n∑

i=1

αixi

)≤

n∑i=1

αig(xi).

46

3.1. IMPORTANT INEQUALITIES CHAPTER 3. INTRODUCTION TO LP SPACES

3.1.3 Holder’s Inequality

Definition 3.3 (Holder’s conjugate pair). Let p, q ∈ [1,+∞], they are said to be conjugate if 1/p+1/q = 1.In particular, p = 2 is self-conjugate, and (1,+∞) is a conjugate-pair.

Definition 3.4 (Essential supremum). Let f : X → R be a real valued measurable function defined on ameasure space (X,M, µ), then the essential supremum of f on a measurable subset S of X is

ess supS

f = inf{a ∈ R : µ({x : f(x) > a}) = 0}.

In particular, when S = X, we denote ess supX f by ess sup f , which is called the essential supremum of f .

Definition 3.5 (|| · ||p function). Let (X,M, µ) be a measure space, let p ∈ [0,+∞], we define || · ||p : X → Ras follows: when p < +∞,

||f ||p =(∫

X

|f |pdµ)1/p

and when p = +∞,||f ||∞ = ess sup

X|f |.

Theorem 3.2 (Holder’s Inequality). Let (X,M, µ) be a measure space, p, q ∈ [1,+∞] be Holder conjugate(i.e. 1/p+ 1/q = 1), and f, g : X → C be measurable functions,

||fg||1 ≤ ||f ||p||g||q.

In particular, if p, q < +∞, it can be written as∫X

|fg|dµ ≤(∫

X

|f |pdµ)1/p(∫

X

|g|qdµ)1/q

.

Proof. We first consider the case where p, q < +∞. Let a = ||f ||p and b = ||g||q, if either a or b is infinite,then we are done. Hence, it suffices to assume that a, b < +∞. In addition, without losing generality, weassume that fg 6= 0, a.e.[µ]. (otherwise, we can just consider the subset within which this is satisfied). Letf ′ = f/a and g′ = g/a, then ∫

X

|f ′|pdµ = 1 and∫

X

|g′|pdµ = 1 (3.2)

Note that 0 < |f ′(x)| < +∞, a.e.[µ] and so is g. For each x satisfying this condition, let s(x) = p log |f ′(x)|and t(x) = q log |g′(x)|, which are defined almost everywhere. Then

|f ′(x)| = exp(s(x)p

)a.e.[µ] and |g′(x)| = exp

(t(x)q

), a.e.[µ] (3.3)

Note that exp is convex, and 1/p+ 1/q = 1, we have

|f ′(x)g′(x)| = exp(s(x)p

+t(x)q

)≤ es(x)

p+et(x)

q=

1p|f ′(x)|p +

1q|g′(x)|q. (3.4)

Taking integration of both sides, we have∫X

|f ′g′|dµ ≤ 1p

∫X

|f ′|dµ+1q

∫X

|g′|dµ =1p

+ 1q = 1. (3.5)

As a result,

||fg||1 =∫

X

|fg|dµ ≤ ab = ||f ||p||g||q. (3.6)

When p = 1 and q = ∞. Let a = ||f ||1 and b = ||g||q, then |g(x)| ≤ q, a.e.[µ]. Assume that b < +∞, otherwisewe are done. As a result,

||fg||1 =∫

X

|fg|dµ ≤∫

X

b|f |dµ = b

∫X

|f |dµ = ||g||∞||f ||1. (3.7)

The proof is completed.

47

3.2. LP SPACES CHAPTER 3. INTRODUCTION TO LP SPACES

3.1.4 Minkowski’s Inequality

Theorem 3.3 (Minkowski’s Inequality). Let (X,M, µ) be a measure space, f, g : X → C be measurablefunctions, and p ∈ [1,+∞], then

||f + g||p ≤ ||f ||p + ||g||p.

In particular, when p < +∞, it can be written as(∫X

|f + g|pdµ)1/p

=(∫

X

|f |pdµ)1/p

+(∫

X

|g|pdµ)1/p

.

Proof. It suffices to assume that ||f ||p < +∞ and ||g||p < +∞ and ||f + g||p > 0, otherwise we are done.When p < +∞, we have

||f + g||pp =∫

X

|f + g|pdµ ≤∫

X

(|f | + |g|)|f + g|p−1dµ =∫

X

|f ||f + g|p−1dµ+∫

X

|g||f + g|p−1dµ. (3.8)

With Holder’s inequality, let q = p/(p− 1), we have∫X

|f ||f + g|p−1dµ ≤(∫

X

|f |p)1/p(∫

X

|f + g|(p−1)/q

)1/q

=(∫

X

|f |p)1/p(∫

X

|f + g|p)1−1/p

= ||f ||p||f + g||p−1p . (3.9)

Likewise, we have ∫X

|g||f + g|p−1dµ ≤ ||g||p||f + g||p−1p . (3.10)

Hence,||f + g||pp ≤ (||f ||p + ||g||p)||f + g||p−1

p , (3.11)

thus||f + g||p ≤ ||f ||p + ||g||p. (3.12)

When p = +∞, let a = ||f ||∞ = ess sup |f |, and b = ||g||∞ = ess sup |g|, then given ε > 0, µ{x||f(x)| >a+ε/2} = 0, and µ{x||g(x)| > b+ε/2} = 0. Since |f(x)+g(x)| > a+b+ε⇒ |f(x)| > a+ε/2, or |g(x)| > b+ε/2,

µ{x||f(x) + g(x)| > a+ b+ ε} ≤ µ{x||f(x)| > a+ ε/2}µ{x||g(x)| > b+ ε/2} = 0. (3.13)

Hence, a+ b+ ε is an essential upper bound of |f + g|. This holds for all ε > 0, thus

||f + g||∞ = ess sup |f + g| ≤ a+ b = ||f ||∞ + ||g||∞. (3.14)

3.2 Lp Spaces

3.2.1 From Lp space to Lp space

Definition 3.6 (Lp space). Let (X,M, µ) be a measure space, and

Lp(X,M, µ) = {f : X → C | f is measurable, and ||f ||p < +∞},

by linearity of Lebesgue integration and Minkowski’s inequality, we can see that when p ∈ [1,+∞], Lp(X,M, µ)is a vector space with seminorm || · ||p, called the Lp space. When the measure space is clear from the context,the following simplified notations are often used: Lp(X,µ), Lp(X), Lp(µ), or Lp.

By Holder’s inequality, we immediately have

Proposition 3.2. Let p, q ∈ [1,+∞] be Holder conjugate (i.e. 1/p + 1/q = 1), if f ∈ Lp and g ∈ Lq, thenfg ∈ L1.

48

3.2. LP SPACES CHAPTER 3. INTRODUCTION TO LP SPACES

Proposition 3.3. Let f, g : X → C are both measurable functions defined on a measure space (X,M, µ),then ||f − g||p = 0 if and only if f = g a.e.[µ].

Proof. 1. (f = g a.e.[µ] ⇒ ||f − g||p = 0).

This direction is proved as follows: when p < +∞,

f = g a.e.[µ] ⇒ |f − g|p = 0 a.e.[µ] ⇒ ||f − g||pp =∫

X

|f − g|pdµ = 0. (3.15)

When p = +∞, f = g a.e.[µ] ⇒ ess sup |f − g| = 0.

2. (||f − g||p = 0 ⇒ f = g a.e.[µ]).

When p < +∞, let Sα = {x||f(x) − g(x)|p > α}, it is easy to see that ||f − g||pp ≥ αµ(Sα), thusµ(Sα) = 0. Let S = {x|f(x) 6= g(x)}. We note that S =

∪∞n=1 S1/n, thus µ(S) ≤

∑∞n=1 µ(S1/n) = 0,

which follows that f = g a.e.[µ].

When p = +∞, ess sup |f − g| = 0 ⇒ |f − g| = 0 a.e.[µ] ⇒ f = g a.e.[µ].

According to quotient space theorem, we can derive a normed space from (Lp, || · ||p) by merging theequivalent functions.

Definition 3.7 (Lp space). Let (X,M, µ) be a measure space, p ∈ [1,+∞], and ∼ be a relation in Lp(X,M, µ)with f ∼ g ⇔ ||f − g||p = 0 ⇔ f = g a.e.[µ], then Lp(X,M, µ) = {[f ] | f ∈ Lp(X,M, µ)} together with thenorm || · ||p given by ||[f ]||p = ||f ||p constitute a normed space, called the Lp space, denoted by Lp(X,M, µ).The simplified notation Lp(X,µ), Lp(X), Lp(µ), and Lp are often used when the measure is clear in context.

For Lp spaces, we have the following remarks

1. The elements in Lp are equivalence classes, but not functions.

2. It is meaningless to evaluate point-values of elements in Lp, given any x ∈ X, as different functions inthe same equivalence class may yield different values at that point.

3. We can take Lebesgue integration of the elements of Lp, as different functions in the same equivalenceclass yield the same integral value.

3.2.2 Lp Convergence and Completeness

Definition 3.8 (Lp convergence). Let (X,M, µ) be a measure space, and (fn)∞n=1 be a sequence of functionsin Lp(X,M, µ), then we say (fn) converges to f in Lp-norm when limn→∞ ||fn − f || = 0, denoted by

fnLp−−→ f .

Proposition 3.4. L∞ convergence is equivalent to uniform convergence on X except for a null set.

We note that Lp convergence is neither a sufficient condition nor a necessary condition of almost every-where convergence. We give two examples to show this (under Lebesgue measure space of R)c.

1. Let fn : R → R be defined by fn = χ[n,n+1], then for every p ∈ [1,+∞], fn ∈ Lp(R,B,m), and fn → 0everywhere, but ||fn||p = 1 for each n.

2. Consider the following sequence of functions: f1,0, f1,1, f2,0, . . . , f2,3, . . . , fk,0, fk,2k−1, . . . with fk,n =χ[n/2k,(n+1)/2k], then their Lebesgue integration converges to zero, but they converge nowhere.

Definition 3.9 (σ-finite measure). A measure µ defined on a measurable space (X,M) is called a σ-finitemeasure, if X is a countable union of measurable sets with finite measure. In this case, (X,M, µ) is calleda σ-finite measure space.

Lemma 3.1. Every Cauchy sequence (fn)∞n=1 in Lp has a subsequence (fnk)∞k=1 that converges to a function

f ∈ Lp almost everywhere.

49

3.3. IMPORTANT THEOREMS CHAPTER 3. INTRODUCTION TO LP SPACES

Proof. First, we consider the case with p < +∞. Since (fn)∞n=1 is Lp-Cauchy, we can choose (nk)∞k=1 suchthat ||fnk

− fnk+1 ||p < 2−k, we claim this sequence converges to a function f ∈ Lp almost everywhere. Letgm = |fn1 | +

∑mk=2 |fnk

− fnk−1 |, then for each m, gm ∈ Lp as it is a finite sum of Lp functions. By Fatou’slemma

|| limm→∞

gm||pp =∫

X

(lim

m→∞gm

)p

dµ =∫

X

limm→∞

gpmdµ ≤ lim

m→∞

∫X

gpmdµ = lim

m→∞||gm||pp. (3.16)

Hence,

|| limm→∞

gm||p ≤ limm→∞

||gm||p ≤ ||fn1 ||p +m∑

k=2

||fnk− fnk−1 ||p ≤ ||fn1 ||p + 1. (3.17)

It implies that (gm) is a non-decreasing sequence of non-negative functons that is bounded above almosteverywhere. Let g = limm→∞ gm, then g is defined almost everywhere, and g ∈ Lp. Define hm(x) =fn1(x) +

∑mk=2(fnk

(x) − fnk−1(x)), then (hm(x))∞m=1 is Cauchy almost everywhere, by completeness of C,it converges almost everywhere, let f(x) = limm→∞ hm(x), then f is defined almost everywhere. Note thathm = fnm , thus limk→∞ fnk

= f, a.e.[µ].In addition,

||f ||pp =∫

X

|f |pdµ =∫

X

limk→∞

|fnk|pdµ ≤ lim inf

k→∞

∫X

|fnk|pdµ = lim

k→∞||fnk

||pp ≤ g ∈ Lp. (3.18)

Thus f ∈ Lp.When p = +∞, L∞-Cauchy implies uniform Cauchy on X except for a null set. Thus, fn(x) is Cauchy

sequence almost everywhere, and thus fn converges almost everywhere. In addition, uniform Cauchy impliesuniform boundedness, thus f(x) = limn→∞ fn(x) has ||f ||∞ < +∞.

Theorem 3.4 (Riesz-Fisher Theorem). For p ∈ [1,+∞], Lp spaces are complete.

Proof. For each Cauchy sequence (fn)∞n=1 in Lp, we choose a subsequence (fnk)∞k=1 with ||fnk

−fnk−1 || < 2−k.From the proof of the above lemma, we know that fnk

converges to some f ∈ Lp almost everywhere. It remains

to show that fnLp

−−→ f . To this end, it suffices to show that fnk

Lp

−−→ f (Cauchy sequence with convergentsubsequence converges),

||f − fnm ||pp =∫

X

|f − fnm |pdµ =∫

X

limk→∞

|fnk− fnm |pdµ ≤ lim inf

k→∞

∫X

|fnk− fnm |pdµ = lim

k→∞||fnk

− fnm ||pp(3.19)

which converges to 0 as m→ ∞.

3.3 Important Theorems

3.3.1 Lusin’s Theorem

Lusin’s theorem states that every measurable function is nearly continuous, or more exactly, every measurablefunction with finite measure support is continuous on nearly all its domain.

Theorem 3.5 (Lusin’s theorem). Let µ be a regular Borel measure on a locally compact Hausdorff space Xsuch that µ(K) < +∞ for each compact subset K. Suppose f : X → C is a complex measurable functionwith finite measure support (i.e. µ(suppf) < +∞), then for each ε > 0, there exists a function with compactsupport g ∈ CC(X) such that µ({x|f(x) 6= g(x)}) < ε.

Proof. We start from a simple case where f is a non-negative function bounded above by 1 and with compactsupport, and then generalize the conclusion to a wider family of functions.

The basic idea is to decompose f into a series of attenuating differences, approximate each of term by acontinuous function using Urysohn lemma, and construct a uniformly convergent series with them.

Assume that f : X → R be a measurable function with compact support A = supp f (note µ(A) < +∞),and f(X) ⊆ [0, 1]. Then, we can construct an increasing sequence of non-negative simple functions (sn)∞n=1,such that sn ↑ f , and sn − sn−1 = 2−nχTn where Tn is measurable for each n.

Let t1 = s1, tn = sn − sn−1 for n > 1, then we have f =∑∞

n=1 tn. Since X is locally compact Hausdorff,and the support A is compact, there exists an open set U ⊃ A such that U is compact. By regularity of µ, for

50

3.3. IMPORTANT THEOREMS CHAPTER 3. INTRODUCTION TO LP SPACES

each n there is a compact set Kn, and an open set Vn such that Kn ⊂ A ⊂ Vn ⊂ U and µ(Vn\Kn) < 2−nε.By Urysohn lemma, we can choose hn ∈ Cc(X) such that Kn ≺ hn ≺ Vn.

Then, we define gm =∑m

n=1 2−nhn. For gm we have the following two claims:

1. gm converges uniformly onX. Given ξ > 0, choose an integerN withN > log2 ε+1, then∑∞

n=N+1 2−nhn ≤∑∞n=N+1 2−n < ξ. Hence, gm converges uniformly. Let g = limm→∞ gm =

∑∞n=1 2−nhn. By uniform

convergence theorem, g is a continuous function.

2. g has a compact support, since supp g ⊂ V , which is compact. Hence, g ∈ Cc(X).

3. Note that 2−nhn = tn on Kn. Hence, f(x) = g(x) except on S =∪∞

n=1(Vn\Kn), which has

µ(S) ≤∞∑

n=1

µ(Vn\Kn) <∞∑

n=1

2−nε = ε. (3.20)

From these claims we can conclude that the theorem holds for any non-negative functions bounded above by1 and has compact support.

We then generalize this conclusion through several steps.

1. Let f be a bounded non-negative function with compact support with an upper bound M . Then f/Mis bounded above by 1, which can be approximated by g ∈ Cc(X), then f can be approximated by Mg.

2. Let f be a bounded complex function with compact support, we can write f = (u+ − u−) + i(v+ − v−)where u+, u−, v+, v− are all bounded non-negative functions with compact support. Thus, we canapproximate them respectively by g+

u , g−u , g

+v , g

−v , and hence f can be approximated by g = (g+

u −g−u )+i(g+

v − g−v ).

3. Let f be a bounded complex function with a finite measure support A which is not necessarily compact.Then, by regularity of measure, we can find a compact set K ⊆ A such that µ(A\K) < ε/2. In addition,we can approximate the restrction f |K which is a bounded function with compact support by g ∈ Cc(X)such that g differs from f |K in a set S with µ(S) < ε/2. Hence, g differs from f in the set S ∪ (A\K)whose measure is less than ε.

4. If f has a finite measure support, but is not necessarily bounded. Let Bn = {x||f(x)| > n}, Then (Bn)is a non-increasing sequence of sets, with

∩∞n=1Bn = ∅. Since µ(B1) < µ(A) < +∞, thus by continuity

of µ, limn→∞ µ(Bn) = 0. We can choose N such that µ(BN ) < ε/2, then f ′ = (1 − χBN )f is boundedabove by N and has a finite measure support, thus we can find g to approximate f ′ such that g differsfrom f ′ in a set S with µ(S) < ε/2. As a result, g differs from f in the set S ∪ BN whose measure isless than ε.

The proof of the theorem is completed.

Based on Lusin’s theorem, we can derive the following important result.

Theorem 3.6. Let X be a locally compact Hausdorff space, µ be a regular Borel measure on X with µ(K) <+∞ for every compact set K, then Cc(X) is dense in Lp(X,µ), when 1 ≤ p < +∞.c

In other words, given any f ∈ Lp(X,µ), there exists a sequence of functions (gn)∞n=1 in Cc(X) such thatgn converges to f in Lp-norm, i.e. limn→∞ ||f − gn||p = 0.

Proof. The proof is conducted in two stages. First, we approach f by a sequence of functions with finitemeasure support, and then approximate them with compactly supported functions by Lusin’s theorem.

Let S be the set of all measurable functions with finite measure support, i.e.

S = {s : X → C|s is measurable, and µ({x|s(x) 6= 0}) < +∞}.

We first claim that S is dense in Lp(X,µ). This is shown as follows. For each f ∈ Lp that is non-negative, we can choose an increasing sequence of simple measurable functions (sn)∞n=1 such that sn ↑ f, a.e..Obviously, sn ∈ Lp for each n. An integrable simple function must have finite measure support, thus sn ∈ Sfor each n. In addition, |f − sn|p < |f |p and |f |p ∈ L1, by dominated convergence theorem, we have

limn→∞

||f − sn||pp = limn→∞

∫X

|f − sn|pdµ =∫

X

limn→∞

|f − sn|dµ = 0. (3.21)

51

3.3. IMPORTANT THEOREMS CHAPTER 3. INTRODUCTION TO LP SPACES

Since x 7→ xp is a continuous function when p ∈ [1,∞), thus limn→∞ ||f − sn||p = 0.While when f complex measurable function in Lp, we can approximate each component (positive/negative

parts or the real/imaginary parts) respectively. To sum up, S is dense in Lp, and thus the claim is proved.We then claim that Cc(X) is dense in S. This is shown as follows. For every s ∈ S, by Lusin’s lemma,

we can find g ∈ Cc(X) such that |g| ≤ ||s||∞, and g differs from s on a set with measure less than ε for anygiven ε > 0. As a result,

||g − s||p =(∫

X

|g − s|pdµ)1/p

≤ (2||s||p∞ε)1/p = ε1/p(2||s||p∞)1/p. (3.22)

It implies that for each s we can find g ∈ Cc(X) such that ||g − s||p can be arbitrarily small, which followsthat Cc(X) is dense in S.

Combining the two claims above, we can conclude that Cc(X) is dense in S. (Let A ⊃ B ⊃ C be in ametric space, such that B is dense in A, and C is dense in B, then C is dense in A).

Remarks:

1. When p = +∞, Cc(X) is not dense in L∞(X). Because convergence in L∞-norm implies uniform con-vergence, we can immediately see that any non-continuous function cannot be approached by continuousfunctions in L∞-norm (due to uniform convergence theorem).

2. The completion of (Cc(X), || · ||∞) is the space of all continuous functions on X such that for everyε > 0, there exists a compact set K ⊂ X with |f(x)| < ε on X\K.

3.3.2 Egoroff’s Theorem and Convergence in measure

The Egoroff’s theorem states that pointwise convergent sequence of measurable functions in a finite measurespace is nearly uniformly convergent.

Theorem 3.7 (Egoroff’s theorem). Let (X,M, µ) be a measure space with µ(X) < +∞, and (fn)∞n=1 be asequence of measurable functions that converges almost everywhere to f . Then, for every ε > 0, there existsa subset E ⊂ X with µ(X\E) < ε such that (fn) converges uniformly on E.

Proof. Let Y ⊆ X be the set of all points at which fn(x) converges to f(x), then µ(Y c) = 0, and let

S(n, k) = {x||fi(x) − fj(x)| < 1/k, ∀i, j > n}.

It is easy to see that we have S(n, k) ⊆ S(n + 1, k) for each n, and thus (S(n, k)) is an increasing sequencefor each given k. In addition, for each x ∈ Y , limn→∞ fn(x) = f(x), thus there exists an integer N such that|fi(x) − fj(x)| < 1/k when i, j > N , therefore, x ∈ S(N, k). This follows that Y ⊆

∪∞n=1 S(n, k) for each k.

By continuity of the measure µ, we have

µ(X) = µ(Y ) ≤ limn→∞

µ(S(n, k)) ≤ µ(X). (3.23)

Then, for each k, we can choose nk such that |µ(X) − µ(S(n, k))| < 2−kε. Let E =∩∞

k=1 S(nk, k). Then, wehave the following two claims

1. µ(X\E) < ε, this can be seen by

µ(X\E) ≤∞∑

k=1

µ(X\S(nk, k)) < ε. (3.24)

2. (fn) converges to f uniformly on E. For each δ > 0, we can find k > 1/δ, then for every x ∈ S(nk, k),we have |fi(x) − fj(x)| < 1/k < δ when i, j > nk by definition. And, note that E ⊆ S(nk, k). Thisclaim is proved.

Combining the two claims above, we can see that E is the set that we desire. The proof is completed.

It is important to note that µ(X) < +∞ is necessary. If µ is not a finite measure, then fn = χ[n,n+1] is apointwise convergent sequence that is clearly not uniformly convergent.

52

3.3. IMPORTANT THEOREMS CHAPTER 3. INTRODUCTION TO LP SPACES

Definition 3.10 (Convergence in measure). Let (X,M, µ) be a measure space, f, f1, f2, . . . be measurablefunctions on X, then we say fn converges to f in measure if

∀ε > 0, limn→∞

µ({x||fn(x) − f(x)| > ε}) = 0,

or equivalently,∀ε > 0, ∃N ∈ N+ s.t. µ({x||fn(x) − f(x)| > ε}) < ε, ∀n > N.

As an important corollary of Egoroff’s theorem, in a finite measure space, almost everywhere convergenceimplies convergence in measure.

Proposition 3.5. Let (X,M, µ) be a measure space with µ(X) < +∞, and (fn)∞n=1 be a sequence of mea-surable functions that converges almost everywhere to f , then fn converges to f in measure.

Proof. By Egoroff’s theorem, we can find a subset E with µ(X\E) < ε for any given ε > 0 such that fn

converges to f uniformly on E. Hence, we can find N , such that |fn(x) − f(x)| <= ε on E when n > N . Itfollows that

µ({x||fn(x) − f(x)| > ε}) ≤ µ(X\E) < ε. (3.25)

The proposition is proved.

Note that the converse is in general NOT true. However, we have the following “weak” converse.

Proposition 3.6. Let (X,M, µ) be a measure space with µ(X) < +∞, and (fn)∞n=1 be a sequence of mea-surable functions that converges to a measurable function f in measure. Then, there exists a subsequence(fnk

)∞k=1 such that fnkconverges to f almost everywhere.

Proof. We choose nk such thatµ({x||f(x) − fnk

(x)| > 2−k}) < 2−k. (3.26)

Let Ek = {x||f(x) − fnk(x)| > 2−k}. Then if x /∈

∪∞i=k Ei, then |fni(x) − f(x)| < 2−i for every i ≥ k.

Thus fni(x) → f(x) when i → ∞. Let A =∩∞

k=1

∪∞i=k Ei, if x /∈ A, then x /∈

∪∞i=k Ei for some k, then

fni(x) → f(x) when i→ ∞. It means that fni converges to f on X\A. And, by continuity of µ, we have

µ(A) ≤ limk→∞

µ

( ∞∪i=k

Ei

)= lim

k→∞

∞∑i=k

2−i = limk→∞

2−k+1 = 0. (3.27)

Hence, (fnk) converges to f almost everywhere.

Dominated convergence theorem holds when the condition of almost everywhere convergence is replacedby convergence in measure.

Theorem 3.8 (Dominated convergence theorem with convergence in measure condition). Let (fn)∞n=1 bea sequence of measurable functions defined on a σ-finite measure space (X,M, µ) with |fn| < g for someg ∈ L1(µ). If fn converges to f in measure, then f ∈ L1(µ) and∫

X

fdµ = limn→∞

∫X

fndµ.

Proof. Let A ⊂ X be a measurable subset with µ(A) < +∞, then we claim that∫

Afdµ = limn→∞

∫Afndµ.

Since fn converges to f in measure, it is easy to see that fn|A converges to f |A in measure, therefore, anysubsequence of fn|A converges to f |A in measure. By the proposition above, we can choose a subsequencfor each subsequence of fn|A such that the chosen subsequence converges to f |A almost everywhere, by thestandard form of DCT, we can conclude that the integral of the chosen subsequence converges to

∫Afdµ.

From the above argument, we can see that every subsequence of the sequence (∫

Afndµ) in itself contains a

subsequence that converges to∫

Afdµ, thus (

∫Afndµ) →

∫Afdµ when n→ ∞. The claim is proved.

If µ(X) < +∞, then we are done. If not, since X is σ-finite, we can find a countable collection of disjointfinite measure sets {Ak}∞k=1 such that X =

⊔∞k=1Ak. Then,∫

X

fdµ =∞∑

k=1

∫Ak

fdµ =∞∑

k=1

limn→∞

∫Ak

fndµ = limn→∞

∞∑k=1

∫Ak

fndµ = limn→∞

∫X

fndµ. (3.28)

Since both f and each fn are dominated by some g ∈ L1, thus all series in above formula is absolutelyconvergent, and thus the operations are valid.

53

Chapter 4

Product Measure

4.1 Product Measure Space

4.1.1 Product Measurable Space

Definition 4.1 (Product measurable space). Let (X,M) and (Y,N ) be two measurable spaces, and denoteby M×N the σ-algebra generated by the subsets of form A × B with A ∈ M and B ∈ N (i.e. the smallestσ-algebra that contains the sets of form A×B). Then we call (X × Y,M×N ) be the product measurablespace.

Definition 4.2 (Sections of a set). Let E ⊂ X × Y , then Ex = {y|(x, y) ∈ E} ⊂ Y is called a x-section ofE; Ey = {x|(x, y) ∈ E} ⊂ X is called a y-section of E.

Proposition 4.1. Let (X,M) and (Y,N ) be two measurable spaces, and E be measurable in their productmeasurable space, (i.e. E ∈ M×N ), then Ex ∈ N ∀x ∈ X and Ey ∈ N ∀y ∈ Y .

Important strategy of proof: Before the proof, we first note an important strategy that will be repeatedlyused in proving a series of propositions. Let M be a σ-algebra generated by a S, (i.e. M is the smallestσ-algebra containing S), then to prove some statement P holds for every measurable set in M, it sufficesto show that all sets for which P holds constitute a σ-algebra that contains S. Typically, it comprises thefollowing steps:

1. show that P holds for every set in S;

2. define C to be the collection of all sets for which P holds

3. show that S ⊂ C, which is equivalent to showing that P holds for every set in S;

4. show that C is a σ-algebra by verifying the three conditions;

5. finally, we can conclude that M ⊂ C, which means that P holds for every set in M.

Proof. Let C be the collection of all sets that satisfy Ex ∈ N ∀x ∈ X and Ey ∈ M ∀y ∈ Y . It suffices toprove that C is a σ-algebra that contains the sets of the form A×B with A ∈ M and B ∈ N .

First, it is easy to show that E = A×B ∈ C when A ∈ M and B ∈ N . In this case, Ex is either ∅ or B,and Ey is either ∅ or A, which are all measurable sets.

Then, we show that C defined above is a σ-algebra.

1. ∅ = ∅ × ∅ ∈ C.

2. Let E ∈ C. For Ec, we have (Ec)x = (Ex)c ∈ N and (Ec)y = (Ey)c ∈ M, thus Ec ∈ C.

3. Let E =∪∞

n=1En with En ∈ C, for each n. Then, Ex =∪∞

n=1(En)x ∈ N , and Ey =∪∞

n=1(En)y ∈ M,thus E ∈ C.

Hence, we can conclude that C is a σ-algebra that contains M.

54

4.1. PRODUCT MEASURE SPACE CHAPTER 4. PRODUCT MEASURE

Definition 4.3 (Sections of a function). Given a function f : X × Y → Z, then fx : Y → Z defined byfx(y) = f(x, y) is called a x-section of f ; fy : X → Z defined by fy(x) = f(x, y) is called a y-section of f .

Proposition 4.2. Let (X,M) and (Y,N ) be two measurable spaces, and Z be a topological space, f : (X ×Y,M × N ) → Z be a measurable function, then fx : Y → Z is measurable on (Y,N ), and fy : X → Z ismeasurable on (X,M).

Proof. Let A be a measurable set in Z, and E = f−1(A) then for any x ∈ X, (fx)−1(Z) = {y|f(x, y) ∈ A} =Ex. since f is measurable, thus E ∈ M×N , and thus Ex ∈ N . Therefore, fx is measurable. Likewise, wecan show the measurability of fy.

4.1.2 Measure by Extension: Hahn-Kolmogorov theorem

Before deriving the concept of product measure, we first introduce the following important theorem aboutextending a function to a measure.

Theorem 4.1 (Hahn-Kolmogorov theorem). Let X be a non-empty set, A be an algebra of subsets of X(i.e. closed under set difference, finite union and finite intersection), then any countable additive functionµ0 : A → [0,+∞] extends to a measure µ on M = σ(A) (the σ-algebra generated by aset). If µ0 is σ-finite,then the extension is unique.

Proof. The proof proceeds by a series of claims.Claim 1: Define µ∗ : 2X → [0,+∞] by

µ∗(E) = inf

{ ∞∑n=1

µ0(An)

∣∣∣∣∣E ⊆∞∪

n=1

An, An ∈ A ∀n ∈ N+

}.

then µ∗ is an outer measure on X.

Proof of Claim 1. 1. We consider ∅ as being covered by an empty collection, thus µ∗(∅) = 0.

2. The monotonicity of µ∗ follows from that µ∗ is defined as an infimum.

3. For sub-additivity, let E =∪∞

n=1En, then given ε > 0, for each En, we can choose a cover {Ani}∞i=1 suchthat

∑∞i=1 µ0(Ani) < µ∗(E) + 2−nε. Then all Ani form a countable cover of E, with

∑n,i µ0(Ani) <∑∞

n=1 µ∗(En) + ε. Hence, µ∗(E) ≤

∑∞n=1 µ

∗(En).Then, we can conclude that µ∗ is an outer measure.

Claim 2: µ∗ extends µ0, i.e. µ∗(A) = µ0(A) for every A ∈ A.

Proof of Claim 2. Let A ∈ A. Clearly, A ⊆ A ∪ ∅ ∪ ∅ ∪ · · · , which follows that µ∗(A) ≤ µ0(A). For the otherdirection (µ0(A) ≤ µ∗(A)), it suffices to assume that µ∗(A) < +∞, otherwise we are done. given ε > 0,choose {An}∞n=1 that covers A and

∑∞n=1 µ0(An) < µ∗(A) + ε. B1 = A ∩ A1 and Bn = A ∩ (An\

∪i<nAi),

then {Bn}∞n=1 are in A and they form a partition of A. By countable addivitity and monotonicity of µ0 inA, we have

µ0(A) = µ0

( ∞∪n=1

Bn

)=

∞∑n=1

µ0(Bn) ≤∞∑

n=1

µ0(An) < µ∗(A) + ε. (4.1)

As this holds for every ε > 0, µ0(A) ≤ µ∗(A).

Claim 3: Let M be the collection of all sets satisfying the Caratheodory condition, i.e. E ∈ M iffµ∗(S) = µ∗(S ∩ E) + µ∗(S ∩ Ec), ∀S ⊂ X. Then, M is a σ-algebra.

Proof of Claim 3. 1. When E = ∅, for each S, µ∗(S ∩ E) = µ∗(∅) = 0, and µ∗(S\E) = µ∗(S). Hence, thecondition trivially holds.

2. Let E ∈ M, then Ec ∈ M. This can be easily seen by noting the symmetry in Caratheodory condition.

55

4.1. PRODUCT MEASURE SPACE CHAPTER 4. PRODUCT MEASURE

3. Let E1, E2 ∈ M, and E = E1 ∪ E2. Then,

µ∗(S ∩ (E1 ∪ E2)) + µ∗(S ∩ (Ec1 ∩ Ec

2)) ≤ µ∗(S ∩ E1) + µ∗(S ∩Ec1 ∩E2) + µ∗(S ∩ Ec

1 ∩Ec2)

= µ∗(S ∩ E1) + µ∗(S ∩Ec1) = µ∗(S). (4.2)

Combining this with sub-addivitity of µ∗, we derive the Caratheodory condition. Thus E1 ∪ E2 ∈ M.

4. Note that E1 ∩E2 = (Ec1 ∪Ec

2)c and E1\E2 = E1 ∩Ec

2, we immediately obtain that M is closed underfinite union and intersection as well as set differences. In other words, M is an algebra of subsets of X.

5. It remains to show that M is closed under countable union. Let E =∪∞

n=1En with En ∈ M for eachn ∈ N+. Due to sub-additivity of µ∗, it suffices to show that given any ε > 0, µ∗(S ∩ E) + µ∗(S\E) <µ∗(S) + ε. We can assume here that µ∗(S) < +∞, otherwise we are done.

Let Bn = En\∪

i<nEi, then Bn ∈ M for each n, since M is an algebra. And, E =⊔∞

n=1Bn. LetGn =

∪ni=1Bn, then {Gn}∞n=1 ∈ M, and hence Given ε > 0 and S ⊂ X, by sub-additivity of µ∗, we

have

µ∗(S ∩E) ≤∞∑

n=1

µ∗(S ∩Bn). (4.3)

Hence, we can find N such that

µ∗(S ∩E) <N∑

n=1

µ∗(S ∩Bn) + ε. (4.4)

By induction from Caratheodory condition, it is easy to show that

µ∗(S ∩GN ) =N∑

n=1

µ∗(S ∩Bn). (4.5)

Note that GN satisfies Caratheodory condition, hence,

µ∗(S ∩ E) + µ∗(S ∩ Ec) < µ∗(S ∩GN ) + µ∗(S ∩GcN ) + ε = µ∗(S) + ε. (4.6)

Now, we can conclude that M is a σ-algebra.

Claim 4: A ⊂ M, i.e. Caratheodory condition holds for every A ∈ A.

Proof of Claim 4. It suffices to show that for each A ∈ A, S ⊂ X, and ε > 0, µ∗(E ∩ A) + µ∗(E ∩ Ac) <µ∗(E) + ε. We can assume that µ∗(E) < +∞, otherwise we are done. By definition of µ∗, we can choose{An}∞n=1 ⊂ A covering A such that

∑∞n=1 µ0(An) < µ∗(E) + ε. Then

µ∗(E ∩A) + µ∗(E ∩Ac) ≤∞∑

n=1

(µ∗(An ∩A) + µ∗(An ∩Ac)) =∞∑

n=1

µ∗(An) < µ∗(E) + ε. (4.7)

Hence, A ∈ M.

Claim 5: Define µ : M → [0,+∞] by µ = µ∗|M, then µ is a measure on M.

Proof of Claim 5. From claim 1, we directly know that µ(∅) = 0. It remains to show the σ-additivity. Let{En}∞n=1 ⊂ M be disjoint and E =

⊔∞n=1En, and GN =

⊔Nn=1En. It suffices to show that µ∗(E) ≥∑∞

n=1 µ∗(En). By induction from Caratheodory induction, it is easy to show that

µ∗(E) ≥ µ∗(GN ) =N∑

n=1

µ∗(En) (4.8)

Take the limit as N → ∞, we get

µ∗(E) ≥N∑

n=1

µ∗(En). (4.9)

56

4.1. PRODUCT MEASURE SPACE CHAPTER 4. PRODUCT MEASURE

We can conclude the existence of µ from the claims above. In the following, we continue to show theuniqueness when µ0 is σ-finite, which is summarized by the following claim.

Claim 6: Given any measure ν defined on M that agrees with µ0 on A, ν = µ∗.

Proof of Claim 6. First, we can see that ν ≤ µ∗ which immediately follows from monotonicity of ν. LetE ∈ M, it suffices to show that given ε > 0, ν(E) > µ∗(E)− ε. Assume that µ∗(E) < +∞, we will generalizeit later.

∑∞n=1 µ0(An) < µ∗(E)−ε/2, and let Gn =

∪ni=1An, and G =

∪∞i=1An. It has G ⊃ E. Since G ∈ M,

we haveµ∗(G\E) = µ∗(G) − µ∗(E) <

ε

2. (4.10)

On the other hand, limn→∞ µ∗(Gn) = µ(G) ≥ µ(E), hence, there exists N such that µ∗(GN ) > µ∗(E)− ε/2.Since GN can be expressed as finite union of disjoint sets in A, we can derive that ν(GN ) = µ∗(GN ), sincethey are both measures agreeing on A. Consequently,

ν(E) = ν(G) − ν(G\E) ≥ ν(GN ) − ν(G\E) = µ∗(GN ) − ν(G\E) > µ∗(E) − ε/2 − ε/2 = µ∗(E) − ε. (4.11)

We then consider the case where µ∗(E) = +∞, due to σ-finiteness, we can find a {An}∞n=1 in A that cover Eand µ∗(An) = µ0(An) < +∞. Let Gn =

∪ni=1An, then Gn ∈ A, and µ∗(Gn) < +∞, and thus

ν(E ∩Gn) = µ∗(E ∩Gn). (4.12)

Since µ∗(E) = +∞, given any M > 0, we can find an N , such that µ∗(E) > M and thus ν(E) > M . Itimplies that ν(E) also equals ∞.

The proof of the entire theorem is completed.

4.1.3 Monotone Class

Definition 4.4 (Monotone class). A collection of sets M is called a monotone class if it is closed undermonotonical limit,

1. let E1, E2, . . . ∈ M, if E1 ⊂ E2 ⊂ · · · , then∪∞

i=1Ei ∈ M;

2. let E1, E2, . . . ∈ M, if E1 ⊃ E2 ⊃ · · · , then∩∞

i=1Ei ∈ M.

It is trivial to see that

Proposition 4.3. Every σ-algebra is a monotone class.

Proposition 4.4. Arbitrary intersection of monotone classes is a monotone class.

Then “the smallest monotone class” containing some collection C is defined to be the intersection of allmonotone classes that contain C. The following lemma is important in establishing the product measure.

Lemma 4.1. Let A be an algebra of subsets of X, then the smallest monotone class containing A is preciselythe σ-algebra generated by A.

Proof. Let M denote the smallest monotone class containing A. Then, we are to show M = σ(A). Sinceσ(A) is a monotone class, we have M ⊆ σ(A). For the other direction, it is enough to show that M is aσ-algebra.

1. ∅ ∈ A ⊂ M,

2. Let N = {E ∈ M|Ec ∈ M}. By definition N ⊆ M, and it is easy to verify that N is a monotone classcontaining A, since M is the smallest M ⊆ N . Thus M = N , which follows that M is closed underset complement.

3. Let F ∈ M, and DF = {F |F ∩E ∈ M}. It is easy to verify that DF is a monotone class, thus M ⊂ DF

for each F ∈ M. It implies that M is closed under finite intersection. Note that E1 ∪E2 = (Ec1 ∩Ec

2)c,

therefore, M is also closed under finite union. As a result, we can conclude that M is a algebra.

It is easy to see from definition that, being both a monotone class and an algebra of subsets, M is a σ-algebra.

57

4.1. PRODUCT MEASURE SPACE CHAPTER 4. PRODUCT MEASURE

4.1.4 Product Measure

Proposition 4.5. Let (X,M) and (Y,N ) be two measurable space, and define

A =

{n⊔

i=1

Ai ×Bi

∣∣∣∣∣Ai ∈ M, Bi ∈ N , {Ai ×Bi}∞i=1 are disjount

}

Then, A is the smallest algebra of subsets that contains all sets of the form A×B with A ∈ M and B ∈ N .

Proof. Denote by A∗ the smallest algebra of subsets that contains {A×B|A ∈ M, B ∈ N}. We are to showthat A = A∗. One direction A ⊂ A∗ is trivial, which is directly from the fact that A∗ is closed under finiteunion. For the other direction, it suffices to show that A is in itself an algebra, i.e. it is closed under setcomplement, finite union and intersection. Verification of this is lengthy but not difficult, and the details areomitted here.

Proposition 4.6. Let (X,M, µ) and (Y,N , ν) be two measure spaces, and A be the smallest algebra contain-ing {A×B|A ∈ M, B ∈ N}, then for each E ∈ A, we can write E =

⊔ni=1Ai ×Bi with Ai ∈ M and Bi ∈ N

for each i. Define ρ0 : A → [0,+∞] by ρ0(E) =∑n

i=1 µ(Ai)ν(Bi), then ρ0 is well defined and is countableadditive on A.

Proof. Given E =⊔n

i=1Ai × Bi, then for each x ∈ X, Ex =⊔

i:x∈AiBi, which is clearly measurable in Y .

Define φ : X → [0,+∞] by φ(x) = ν(Ex), then φ(x) =∑n

i=1 ν(Bi)χAi , which is clearly a measurable function.We then define λ : A → [0,+∞] by

λ(E) =∫

X

φdµ.

Then, we have

λ(E) =∫

X

φdµ =n∑

i=1

∫X

ν(Bi)χAidµ =n∑

i=1

µ(Ai)ν(Bi) = ρ0(E). (4.13)

This shows that no matter how you partition E into disjoint union of Ai × Bi, ρ0(E) will be evaluated toλ(E), thus it is well defined.

By monotone convergence theorem, we can immediately see that λ satisfies countable additivity, so doesρ0 since it is identical to λ.

Let (X,M, µ) and (Y,N , ν) be two σ-finite measure spaces, and A be the smallest algebra containing{A × B|A ∈ M, B ∈ N}. Let ρ0 : A → [0,+∞] be defined by ρ0 (

⊔ni=1Ai ×Bi) =

∑ni=1 µ(Ai)ν(Bi), then

ρ0 is countably additive on A. By Hahn-Kolmogorov theorem, ρ0 uniquely extends to a measure on M×N .The following theorem gives an constructive description of the measure.

Theorem 4.2. Let (X,M, µ) and (Y,N , ν) be two σ-finite measure spaces, and given a measurable setE ∈ M × N , let φE : X → [0,+∞] be defined by φE(x) = ν(Ex) and ψE : Y → [0,+∞] be defined byψE(y) = µ(Ey), then φE and ψE are measurable, and∫

X

φEdµ =∫

Y

ψEdν.

In particular, when E =⊔n

i=1Ai ×Bi with Ai ∈ M and Bi ∈ N , then the integral equals∑n

i=1 µ(Ai)ν(Bi).

Proof. Let D be all the subsets of X × Y for which the stated condition holds,

D ={E

∣∣∣∣φE , ψE are measurable, and∫

X

φEdµ =∫

Y

ψEdν

}.

It suffices to show that D contains an σ-algebra that contains all sets in form of A × B with A ∈ M andB ∈ N .

Let A be the smallest algebra of subsets that contains every A×B with A ∈ M and B ∈ N . Given E ∈ A,we can write E =

⊔ni=1Ai × Bi with Ai ∈ M and Bi ∈ N for each i = 1, . . . , n. Then φE =

∑ni=1 ν(Bi)χAi

and ψE =∑n

i=1 µ(Ai)χBi . It is easy to verify that E ∈ D, i.e. the stated conditions holds for E.Due to lemma 4.1, to complete the proof, it suffices to show that D is a monotone class. (If this is true,

then D contains the smallest monotone class which is precisely M×N .)

58

4.2. FUBINI’S THEOREM CHAPTER 4. PRODUCT MEASURE

Let {En}∞n=1 ⊂ D with E1 ⊂ E2 ⊂ · · · , and E =∪∞

n=1En. Then, for each x ∈ X, En,x ↑ Ex andfor each y ∈ Y , Ey

n ↑ Ey. Due to monotonicity and continuity of measure, we have ν(En,x) ↑ ν(Ex), andµ(Ey

n) ↑ µ(Ey). Hence, φE and ψE are measurable. By monotone convergence theorem,∫X

φEdµ = limn→∞

∫X

φEndµ = lim

n→∞

∫Y

ψEndν =

∫Y

ψEdν. (4.14)

Hence, the stated condition holds for E, and thus E ∈ D. We can show that E ∈ D when En ↓ E in a similarway and using σ-finiteness. The above argument shows that D is a monotone class, thus it contains M×Nby lemma 4.1. It follows that the stated condition holds for every measurable set in M×N .

Note that the given integral is precisely the λ defined in the proof of proposition 4.6, hence it equals∑ni=1 µ(Ai)ν(Bi) when E =

⊔ni=1Ai ×Bi with Ai ∈ M and Bi ∈ N .

Definition 4.5 (Product measure). Let (X,M, µ) and (Y,N , ν) be two σ-finite measure spaces, and given ameasurable set E ∈ M× N , let φE : X → [0,+∞] be defined by φE(x) = ν(Ex) and ψE : Y → [0,+∞] bedefined by ψE(y) = µ(Ey), then the product measure µ× ν : M → [0,+∞] is defined by

(µ× ν)(E) =∫

X

ν(Ex)dµ =∫

Y

µ(Ey)dν.

From theorem 4.2, we know that µ× ν is well defined on M×N . While the σ-additivity follows from theproperties of Lebesgue integration and monotone convergence theorem.

Definition 4.6 (Product measure space). Let (X,M, µ) and (Y,N , ν) be two σ-finite measure spaces, then(X × Y,M×N , µ× ν) is called the product measure space.

4.2 Fubini’s theorem

Fubini’s theorem is an important theorem related to Lebesgue integration on measure product spaces.

Theorem 4.3 (Fubini’s theorem). Let (X,M, µ) and (Y,N , ν) be two measure spaces, f : X × Y → C beM×N -measurable function, then

1. If f(x) is a non-negative function, i.e. f(X) ⊆ [0,+∞], and let

φ(x) =∫

Y

fx(y)dν and ψ(y) =∫

X

fy(x)dµ

then φ : X → [0,+∞] and ψ : Y → [0,+∞] are measurable, and∫X×Y

fd(µ× ν) =∫

X

φdµ =∫

Y

ψdν. (4.15)

This can be written as follows∫X×Y

f(x, y)dµ(x)dν(y) =∫

X

(∫Y

f(x, y)dν(y))dµ(x) =

∫Y

(∫X

f(x, y)dµ(x))dν(y). (4.16)

2. If f is a complex function and∫X

(∫Y

|f(x, y)|dν(y))dµ(x) < +∞ or

∫Y

(∫X

|f(x, y)|dµ(x))dν(y) < +∞,

then f ∈ L1(µ× ν).

3. If f ∈ L1(µ×ν), then fx ∈ L1(Y, ν) for almost every x ∈ X, and fy ∈ L1(X,µ) for almost every y ∈ Y ,and φ ∈ L1(µ) and ψ ∈ L1(ν), and Eq.(4.16) holds.

Proof. Here just gives a very brief sketch. The product measure theorem has essentially established a restrictedFubini’s theorem on characteristic functions of measurable sets. This can be readily extended to all non-negative simple functions. By monotone convegence theorem, we can further generalize the results to generalnon-negative measurable functions, and then to complex integrable functions.

59

4.2. FUBINI’S THEOREM CHAPTER 4. PRODUCT MEASURE

We note that σ-finiteness is required here, without which the theorem is not true in general. Here gives anexample. Let X = Y = [0, 1] and (X,M, µ) is Lebesgue measure space, and (Y,N , ν) be counting measurespace. Define f(x, y) = δx,y, which is the characteristic function of the measurable set {(x, y) ∈ X×Y |x = y}.We can see that on one hand ∫

X

(∫Y

f(x, y)dν(y))dµ(x) =

∫X

1dµ(x) = 1; (4.17)

on the other hand, ∫Y

(∫X

f(x, y)dµ(x))dν(y) =

∫X

0dν(y) = 0. (4.18)

Exchanging the order of integration gives rise to different results.

60

Chapter 5

Complex Measures

5.1 The Space of Complex Measures

Definition 5.1 (Complex measure). Let (X,M) be a measurable space, then µ : M → C is called a complexmeasure, if it satisfies the following conditions

1. µ(∅) = 0;

2. if E =⊔∞

i=1Ei with Ei ∈ M for each i and Ei ∩ Ej = ∅ when i 6= j, then µ(E) =∑∞

i=1 µ(Ei).

To make the distinction clear, the measure defined previously whose range is in [0,+∞] is called positivemeasure. A complex measure whose range lies in R is called a real measure or a signed measure.

Definition 5.2 (Total variation). Let µ be a complex measure on a measurable space (X,M), then its totalvariation |µ| : M → [0,+∞] is defined by

|µ|(E) = supE

∑E∈E

|µ(E)|

Here, E is a countable partition of E comprised of measurable sets.

Proposition 5.1. Let µ be a complex measure on a measurable space (X,M), then its total variation |µ| isa positive measure on (X,M).

Proof. It is trivial to see that |µ|(∅) = 0, just by considering a collection of all emptysets as a partition of anemptyset. The important part is to show that |µ| satisfies σ-additivity on M.

Let {An}∞n=1 ⊂ M be a collection of disjoint measurable sets and A =⊔∞

n=1An. By definition, givenε > 0, for each An, we can find a countable partition {Eni}∞i=1 where An =

⊔∞i=1Eni and Eni ∈ M, ∀n, i, such

that∑∞

i=1 |µ(Eni)| > |µ|(An)−2−nε, Gathering all Eni together, we form a collection {Eni, n ∈ N+, i ∈ N+},which is obviously a countable partition of A. Hence, by definition of |µ|, we have

|µ|(A) ≥∞∑

n=1

∞∑i=1

|µ(Eni)| =∞∑

n=1

(|µ|(An) − 2−nε) =

( ∞∑n=1

|µ|(An)

)− ε (5.1)

As this holds for every ε > 0,

|µ|(A) ≥∞∑

n=1

|µ|(An). (5.2)

For the other direction, let {Ek}∞k=1 be a countable partition of A, where each Ek is measurable. Hence, thecollection {Ek ∩An}∞k=1 forms a countable partition of An. By defnition, we have

∞∑k=1

|µ(Ek ∩An)| ≤ |µ|(An). (5.3)

61

5.2. LEBESGUE DECOMPOSITION OF MEASURES CHAPTER 5. COMPLEX MEASURES

On the other hand, {Ek ∩An}∞n=1 is a partition of Ek. Consequently, by additivity of complex measure,

∞∑k=1

|µ(Ek)| =∞∑

k=1

∣∣∣∣∣∞∑

n=1

µ(Ek ∩An)

∣∣∣∣∣ ≤∞∑

k=1

∞∑n=1

|µ(Ek ∩An)| =∞∑

n=1

∞∑k=1

|µ(Ek ∩An)| ≤∞∑

n=1

|µ|(An). (5.4)

Here, the exchange of the sum order is justified by the fact that |µ(Ek ∩ An)| are non-negative. Since thisholds for every partition {Ek} of A, it holds for their supremium, as a result

|µ|(A) ≤∞∑

n=1

|µ|(An). (5.5)

Hence, the σ-additivity of |µ| is established, and we can conclude that |µ| is a positive measure.

Proposition 5.2. Let (X,M) be a measurable space, then

1. all complex measures on (X,M) forms a vector space;

2. all real measures on (X,M) forms a vector space;

3. if µ1, µ2 are positive measures, then µ1 + µ2 and αµ1 with α ≥ 0 are positive measures.

It is trivial to verify this proposition.

Theorem 5.1 (Hahn-Jordan decomposition). Let µ be a real measure on a measurable space (X,M), thenwe can write µ = µ+ − µ− with

µ+ =12(|µ| + µ)

µ− =12(|µ| − µ)

Then both µ+ and µ− are positive measures on (X,M). The decomposition given above is called the Hahn-Jordan decomposition of µ.

Proof. Since |µ| and µ are real measures, thus µ+ and µ− are also real measures. The non-negativeness ofµ+ and µ− readily follows from the definition of |µ|.

5.2 Lebesgue Decomposition of Measures

5.2.1 Absolute continuity and Singularity

Definition 5.3 (Absolute continuity). Let µ and λ respectively be positive and complex measures on a mea-surable space (X,M). λ is said to be absolutely continuous w.r.t µ, denoted by λ� µ if ∀E ∈ M, µ(E) =0 ⇒ λ(E) = 0.

The following proposition gives an equivalent characterization of absolute continuity in ε− δ language.

Proposition 5.3. Let µ and λ respectively be positive and complex measures on a measurable space (X,M),then λ� µ if and only if ∀ε > 0, ∃δ > 0, ∀E ∈ M, µ(E) < δ ⇒ |λ|(E) < ε.

Definition 5.4 (Measure support). Let λ be a complex measure on a measurable space (X,M) and A ∈ M,we say λ is supported in A if ∀E ∈ M, λ(E) = λ(E ∩A).

Definition 5.5 (Singularity). Let µ and λ respectively be positive and complex measures on a measurablespace (X,M). λ is said to be singular w.r.t µ, denoted by λ ⊥ µ, if there is A ∈ M such that λ is supportedon A and µ(A) = 0.

Definition 5.6 (Mutual singularity). Let λ1, λ2 be complex measures on a measurable space (X,M), we sayλ1 and λ2 are mutually singular, denoted by λ1 ⊥ λ2, if there exist two disjoint measurable sets A and Bsuch that λ1 and λ2 are respectively supported on A and B.

62

5.2. LEBESGUE DECOMPOSITION OF MEASURES CHAPTER 5. COMPLEX MEASURES

The following proposition shows that the definitions of singularity and mutual singularity are consistent.

Proposition 5.4. Let µ and λ respectively be positive and complex measures on a measurable space (X,M)then λ is singular w.r.t µ if and only if λ and µ are mutually singular.

Proof. If λ is singular w.r.t. µ, then we can find A such that λ is supported in A and µ(A) = 0, as a result,for every E ∈ M, we have

µ(E ∩Ac) = µ(E) − µ(E ∩A) = µ(E). (5.6)

Hence, µ is supported in Ac, which is a measurable set disjoint from A. Hence, µ and λ are mutually singular.For the other direction, we assume that λ and µ are mutually singular, then we have two disjoint measurablesets A and B such that λ and µ are respectively supported in A and B. As a result,

µ(A) = µ(A ∩B) = µ(∅) = 0. (5.7)

Hence, λ is singular w.r.t. µ.

Proposition 5.5. Let µ be a (complex/real/positive) measure on a measurable space (X,M) that is supportedin A ∈ M, and B be a measurable set with A ⊂ B, then λ is also supported on B.

Proof. For each E ∈ M, we have µ(E∩Ac) = µ(E)−µ(E∩A) = 0. Thus µ(E∩(B\A)) = µ(E∩(B\A)∩Ac) =0. As a result, µ(E ∩B) = µ(E ∩A) + µ(E ∩ (B\A)) = µ(E ∩A) = µ(E). Hence, µ is supported on B.

Proposition 5.6. Let µ be a positive measure, and λ1, λ2 be complex measures on a measurable space (X,M).Then,

1. if λ1, λ2 are supported in A ∈ M, so is α1λ1 + α2λ2 for each α1, α2 ∈ C;

2. if λ1 � µ and λ2 � µ, then α1λ1 + α2λ2 � µ for each α1, α2 ∈ C;

3. if λ1 ⊥ µ and λ2 ⊥ µ, then α1λ1 + α2λ2 ⊥ µ for each α1, α2 ∈ C;

Proof. 1. For each E ∈ M, we have

(α1λ1 +α2λ2)(E ∩A) = α1λ1(E ∩A) +α2λ2(E ∩A) = α1λ1(E) +α2λ2(E) = (α1λ1 +α2λ2)(E) (5.8)

Hence, α1λ1 + α2λ2 is supported in A.

2. For each E with µ(E) = 0, we have

(α1λ1 + α2λ2)(E) = α1λ1(E) + α2λ2(E) = α1 · 0 + α2 · 0 = 0. (5.9)

Hence, α1λ1 + α2λ2 � µ.

3. We can find A1 and A2 respectively for λ1 and λ2 such that λ1 and λ2 are respectively supported inA1 and A2, and µ(A1) = 0, and µ(A2) = 0. Then λ1 and λ2 are both supported in A1 ∪ A2, andµ(A1 ∪A2) = 0. In addition, by the first statement, we know α1λ1 + α2λ2 is supported in A1 ∪A2.

Proposition 5.7. Let λ be a complex measure on a measurable space (X,M), then λ� |λ|.

Proof. For each E ∈ M with |λ|(µ) = 0, by definition of |λ|, we have

|λ(E)| ≤ |λ|(E) = 0. (5.10)

Hence λ(E) = 0. Therefore, λ� |λ|.

Proposition 5.8. Let µ and λ be a respective positive and complex measure on a measurable space (X,M),then

1. if λ is supported in A ∈ M, so is |λ|;

2. if λ� µ, so is |λ|;

3. if λ ⊥ µ, so is |λ|.

63

5.2. LEBESGUE DECOMPOSITION OF MEASURES CHAPTER 5. COMPLEX MEASURES

Proof. 1. If λ is supported in A, then we have for each E ∈ M,

|λ|(E) = supE

∑E∈E

|λ(Ei)| = supE

∑E∈E

|λ(E ∩A)| = |λ|(E ∩A). (5.11)

Here E represents a countable partition of E, and thus {E ∩A|E ∈ E} is a countable partition of E ∩A.Hence, we can conclude that |λ| is supported in A.

2. Assume λ � µ. For each A ∈ M with µ(A) = 0, we have µ(E) = 0 and thus λ(E) = 0 for everymeasurable set E ⊂ A. Hence, for each countable partition E of A, we have∑

E∈E

|λ(E)| = 0. (5.12)

By definition, we have |λ|(A) = 0. Hence, |λ| � µ.

3. Assume λ ⊥ µ. Then we can find A ∈ M such that λ is supported in A, and µ(A) = 0. By the firststatement, we know that |λ| is also supported on A. Hence, |λ| ⊥ µ.

Corollary 5.1. Let λ1 and λ2 be complex measures on a measurable space (X,M) with λ1 ⊥ λ2, then|λ1| ⊥ |λ2|.

Proof. Since λ1 ⊥ λ2, we can find two disjoint measurable sets A and B such that λ1, λ2 are respectivelysupported in A and B. Then |λ1| and |λ2| are also respectively supported in A and B, thus |λ1| ⊥ |λ2|.

Proposition 5.9. Let µ be a positive measure and λ1, λ2 be complex measures on a measurable space (X,M),with λ1 � µ and λ2 ⊥ µ, then λ1 ⊥ λ2. In particular, if λ1 = λ2 = λ, then λ = 0.

Proof. Since λ2 ⊥ µ, there are two disjoint measurable sets A and B such that µ and λ2 are respectivelysupported in A and B. As a result, λ1 is supported in A, thus λ1 ⊥ λ2. If λ1 = λ2 = λ, then for each E ∈ M,we have

λ(E) = λ(E ∩A) = λ(E ∩A ∩B) = λ(∅) = 0. (5.13)

Hence λ = 0.

5.2.2 Lebesgue Decomposition Theorem and Radon-Nikodym Theorem

Proposition 5.10. Let µ be a positive measure on a measurable space (X,M), and h ∈ L1(µ), defineλ : M → C by

λ(E) =∫

E

hdµ,

then λ is absolutely continuous w.r.t. µ, i.e. λ� µ.

Proof. This immediately follows from the fact that integration on a null set is zero.

Lemma 5.1. Let (X,M, µ) be a σ-finite positive measure space, then there exists f ∈ L1(µ) such that0 < f < 1.

Proof. Since µ is σ-finite, we can write X =∪∞

n=1En with µ(En) < +∞. Let

gn =2−n

1 + µ(En)χEn (5.14)

Let f =∑∞

n=1 gn. We can see that for each x ∈ X,

0 < f(x) <∞∑

n=1

2−n = 1. (5.15)

And by monotone convergence theorem,∫X

fdµ =∞∑

i=1

∫X

gndµ =∞∑

i=1

2−n

1 + µ(En)µ(En) ≤

∞∑i=1

2−n = 1. (5.16)

Hence, f ∈ L1(µ).

64

5.2. LEBESGUE DECOMPOSITION OF MEASURES CHAPTER 5. COMPLEX MEASURES

Theorem 5.2 (Lebesgue Decomposition Theorem). Let µ be a σ-finite positive measure on a measurablespace (X,M), and λ be a σ-finite complex measure on (X,M), then there are unique complex measures λa

and λs such that λ = λa + λs, λa � µ, and λs ⊥ µ.

Theorem 5.3 (Radon-Nikodym Theorem). Let (X,M, µ) be a σ-finite measure space, and λ be a finitemeasure that is absolutely continuous w.r.t. µ, then there is h ∈ L1(µ) such that dλ = hdµ, i.e.

∀E ∈ M, λ(E) =∫

E

hdµ.

Here, h is called the Radon-Nikodym derivative of λ w.r.t µ, and is written as h = dλ/dµ.

These two theorems are proved together in the following.

Proof. First of all, we assume λ be positive and finite. The conclusion will be generalized later.By previous lemma, we can choose w ∈ L1(µ) with 0 < w < 1. Define φ : M → [0,+∞] by

ϕ(E) = λ(E) +∫

E

wdµ. (5.17)

We can write this as dϕ = dλ+ wdµ for short. It is easy to verify that ϕ is a positive and finite measure on(X,M), and λ ≤ ϕ. Note that L2(X,M, ϕ) is a Hilbert space. Consider the functional given by f 7→

∫Xfdλ.

Then we have ∣∣∣∣∫X

fdλ

∣∣∣∣ ≤ ∫X

|f |dλ ≤∫

X

|f |dϕ = 〈|f |, 1〉 ≤ ||f ||2 · ||1||2 = ||f ||2(ϕ(X))1/2. (5.18)

Since ϕ(X) < +∞, thus the functional defined above is a bounded linear functional. By Riesz representationtheorem, we there exists a unique g ∈ L2(ϕ) such that∫

X

fdλ = 〈f, g〉 =∫

X

fgdϕ, ∀f ∈ L2(ϕ). (5.19)

This can be written as dλ = gdϕ. Hence, for each measurable set E, we have

λ(E) =∫

X

gdϕ, (5.20)

and thus the average of g on E has

AE(g) =1

ϕ(E)

∫X

gdϕ =λ(E)ϕ(E)

∈ [0, 1]. (5.21)

Hence, we can conclude that g(x) ∈ [0, 1], a.e.[ϕ]. Without losing generality, we can assume that 0 ≤ gle1.Moreover, we have ∫

X

fdλ =∫

X

fgdϕ =∫

X

fgdλ+∫

X

fgwdµ. (5.22)

This can be rewritten as ∫X

f(1 − g)dλ =∫

X

fgwdµ. (5.23)

Let A = g−1[0, 1), and B = g−1{1}, then X = A tB (due to 0 ≤ gle1). Define λa, λs : M → [0,+∞] by

λa(E) := λ(E ∩A) and λs(E) := λ(E ∩B). (5.24)

It is easy to verify that λa and λs are both positive measures and λ = λa + λs. From Eq.(5.23), we have∫B

wdµ =∫

B

gwdµ

∫X

χBgwdµ =∫

X

χB(1 − g)dµ =∫

B

0dµ = 0. (5.25)

Since 0 < w < 1, from this we can conclude that µ(B) = 0; while it is obvious that λs is supported in B, thusλs ⊥ µ. For λa, we let fn = (1 +

∑ni=1 g

i)χE . Again, by Eq.(5.23), we have∫E

(1 − gn+1)dλ =∫

X

fn(1 − g)dλ =∫

X

fngwdλ =∫

E

g(1 + g + · · · + gn)dλ. (5.26)

65

5.2. LEBESGUE DECOMPOSITION OF MEASURES CHAPTER 5. COMPLEX MEASURES

As n → ∞, we have 1 − gn+1 ↑ chiA and g(1 + g + · · · + gn)w ↑ h where h is some non-negative measurablefunction. By monotone convergence theorem, we have χAdλ = hdµ.

λa(E) = λ(E ∩A) =∫

E

χAdλ =∫

E

hdµ. (5.27)

Hence, λa � µ. Recall our assumption that λ is a finite measure, hence h ∈ L1(µ).Note that up to now we have prove Radon-Nikodym theorem for the case with λ being a positive measure.

If λ is a finite complex measure, we can decompose it into real and imaginary, positive and negative parts(Hahn-Jordan decomposition), and find the Radon-Nikodym derivatives for each of this part.

For Lebesgue decomposition theorem, we can decompose the space into disjoint union of finite measuresets. And perform the decomposition within each set respectively.

In the following we show that the Lebesgue decomposition is unique. Suppose λ = λa +λs = λ′a +λ′s withλa, λ

′a � µ and λs, λ

′s ⊥ µ. Then, we have

λ′a − λa = λs − λ′s. (5.28)

While λ′a − λa � µ and λs − λ′s ⊥ µ, thus λ′a = λa and λ′s = λs.The uniqueness of the Radon-Nikodym derivative (up to almost everywhere equality) follows readily from

the fact that ∫E

hdµ =∫

E

h′dµ ∀E ∈ M ⇒ h = h′ a.e.[µ]. (5.29)

Theorem 5.4 (Polar decomposition). Let µ be a complex measure on a measurable space (X,M), then thereexists a complex measurable function h with |h(x)| = 1 ∀x ∈ X, such that dµ = hd|µ|, i.e.

µ(E) =∫

E

hd|µ|, ∀E ∈ M.

Proof. Since µ � |µ|, by Radon-Nikodym theorem, there exists a complex measurable function h such thatµ = hd|µ|. In the following, we show that |h(x)| = 1 almost everywhere w.r.t. |µ|. Let Ar = {x ∈ X :|h(x)| < r}. Let {Er,j}j be partition of Ar.

∞∑j=1

|µ(Er,j)| =∞∑

j=1

∣∣∣∣∣∫

Er,j

hd|µ|

∣∣∣∣∣ ≤ r

∞∑j=1

|µ|(Er,j) = rµ(Ar). (5.30)

This follows that |µ|(Ar) ≤ r|µ|(Ar), implying that |µ|(Ar) = 0. As this holds for any 0 ≤ r < 1, we knowthat |h(x)| ≥ 1, a.e.[µ]. For any E ∈ M with |µ|(E) > 0, we have

|AE(h)| =∣∣∣∣ 1|µ|(E)

∫E

hd|µ|∣∣∣∣ = |µ(E)|

|µ|(E)≤ 1. (5.31)

Therefore, |h(x)| ≤ 1, a.e.[µ]. Then, we can conclude that |h(x)| = 1, a.e.[µ]. Choose an h′ = h, a.e.[µ] andh′(x) = 1, ∀x ∈ X, then we are done.

66

Part II

Fundamentals of Functional Analysis

67

Chapter 6

Review of Linear Algebra

6.1 Vector Spaces

First, we review several key concepts in linear algebra, including vector space, linear functional, linear trans-form, subspaces, and quotient spaces. It is important to note that the definitions of some concepts are givenin a generalized way that is different from the one in an elementary linear algebra textbook, such as linearspan, basis, etc.

Definition 6.1 (Vector space). A vector space is a set X over a field F is a set X together with an addtion+ : X ×X → X and a scalar multiplication · : F ×X → X, which satisfy the following axioms

1. (commutativity of addition) ∀x, y ∈ X x+ y = y + x;

2. (associativity of addition) ∀x, y, z ∈ X (x+ y) + z = x+ (y + z);

3. (identity element of addition) ∃0 ∈ X ∀x ∈ X x+ 0 = x;

4. (inverse element of addition) ∀x ∈ X ∃(−x) x+ (−x) = 0;

5. (identity of scalar multiplication) ∀x 1x = x;

6. (distributivity of scalar multiplication over vector addition) ∀α ∈ F, x, y ∈ X α(x+ y) = αx+ αy;

7. (distributivity of scalar multiplication over field addition) ∀α, β ∈ F, x ∈ X (α+ β)x = αx+ βx;

8. (compatibility of scalar multiplication and field multiplication) ∀α, β ∈ F, x ∈ X α(βx) = (αβ)x.

For conciseness of notation, one typically writes x− y to represent x+ (−y). When F is R, it is called a realvector space; when F is C, it is called a complex vector space.

Definition 6.2 (Linear independence). Let S be a subset of a vector space X over field F, if there exists afinite subset x1, . . . , xn ∈ S and α1, . . . , αn ∈ F which are not all zeros, such that

∑ni=1 αixi = 0, then S is

called linearly dependent; otherwise it is called linear independent.

Definition 6.3 (Linear span). Let S be a subset of a vector space X, the linear span of X, denoted byspanS is defined to be the set of all linear combinations of finitely many vectors in S,

spanS =

{n∑

i=1

αixi

∣∣∣∣∣x1, . . . , xn ∈ S, α1, . . . , αn ∈ F

}If spanS = X, we say that S spans X.

Definition 6.4 (Basis). A basis of a vector space X is a linearly independent set that spans X.

Definition 6.5 (Dimension). If a vector space X admits a finite basis, then all basis in X is finite, and theyhave the same cardinality called the dimension of X, denoted by dimX. Then such a vector space is calleda finite dimensional vector space. Otherwise (X admits no finite basis), then it is called an infinitedimensional vector space, and we write dimX = +∞ in this case.

68

6.2. SUBSPACES CHAPTER 6. REVIEW OF LINEAR ALGEBRA

Note that the notion of basis introduced above works well in finite dimensional spaces, but there areseveral subtly different ways in generalizing it to infinite dimensional spaces. We will discuss a generalizednotion of basis later (when introducing normed spaces).

6.2 Subspaces

Definition 6.6 (Subspace). Let X be a vector space over the field F, and Y ⊂ X, if Y together with theaddition and scalar multiplication inherited from X forms a vector space over F, then Y is called a subspaceof X

Proposition 6.1. Let Y be a subset of a vector space X over field F, then Y is a subspace of X if and onlyif 0 ∈ Y and Y is closed under linear operations i.e. x, y ∈ Y ⇒ x+ y ∈ Y and x ∈ Y, α ∈ F ⇒ αx ∈ Y .

Proposition 6.2. Intersection of subspaces of a vector space is also a subspace.

Proof. Let U be a collection of subspaces, denote V =∩

U∈U U .

1. Since every U ∈ U is a subspace, 0 ∈ U, ∀U ∈ U , thus 0 ∈ V .

2. Let x, y ∈ V and α, β ∈ F, then x, y ∈ U, ∀U ∈ U , thus αx + βy ∈ U, ∀U ∈ U , which means thatαx+ βy ∈ V . Hence, V is closed under linear operations.

To conclude, V is a subspace.

Proposition 6.3 (Direct sum of subapces). Let X be a vector space over the field F, U and V be subspacesof X, then denote

U + V = {u+ v|u ∈ U, v ∈ V },

then U +V is also a subspace of X. In particular, U ∩V = {0} if and only if every x ∈ U +V can be uniquelyexpressed as x = u + v with u ∈ U and v ∈ V . In this case, U + V is called a direct sum of U and V ,denoted by U ⊕ V .

Proof. We first show that U + V is a subspace of X. Let x1, x2 ∈ U + V , then we can write x1 = u1 + v1 andx2 = u2 + v2, with u1, u2 ∈ U and v1, v2 ∈ V . Hence, for any α1, α2 ∈ F, we have

α1x1 + α2x2 = α1(u1 + v1) + α2(u2 + v2) = (α1u1 + α2u2) + (α1v1 + α2v2) ∈ U + V. (6.1)

It follows that U + V is a subspace of X, so is U ⊕ V .Then, we show that U ∩ V = {0} if and only if the decomposition of x into u+ v with u ∈ U and v ∈ V

is unique. For one direction, we assume that U ∩ V = {0}, and then show the uniqueness of the expression.Suppose x = u+ v = u′ + v′ with u, u′ ∈ U and v, v′ ∈ V . Hence, u− u′ = v− v′ which are in both U and V ,thus u− u′ = v − v′ = 0, and therefore u = u′ and v = v′. For the other direction, we assume the uniquenessand prove that U ∩ V = {0}. If there is non-zero vector x ∈ U ∩ V , then −x ∈ U ∩ V , and we can write0 = 0 + 0 and 0 = x+ (−x), contradicting the uniqueness of expression.

Proposition 6.4. Let X be a vector space and U be its subspace, then there is a subspace V of X such thatX = U ⊕ V . (Note that V is generally not unique).

Proposition 6.5. Let U and V be subspaces of vector X over the field F, then

dim(U + V ) + dim(U ∩ V ) = dim(U) + dim(V ),

In particular,dim(U ⊕ V ) = dim(U) + dim(V ).

69

6.3. LINEAR MAPS CHAPTER 6. REVIEW OF LINEAR ALGEBRA

6.3 Linear Maps

Definition 6.7 (Linear map). Let X and Y be vector spaces over field F, a map T : X → Y is called alinear map if it preserves linear operations, i.e.

T (αx+ βy) = αT (x) + βT (y), ∀x, y ∈ X, α, β ∈ F.

A linear map is also called a linear transform, or a linear operator. We often write Tx instead of T (x).

Definition 6.8 (Domain and Range of linear map). Let T : X → Y be a linear map, the set {T (x)|x ∈ X} iscalled the range or the image of T , denoted by ImT . And X is called the domain of T , denoted by DomT .

Proposition 6.6. Let T : X → Y be a linear map, then ImT is subspace of Y .

Proof. Let y1, y2 ∈ ImT , then there are x1, x2 ∈ X such that y1 = Tx1 and y2 = Tx2. Hence, for anyα1, α2 ∈ F, we have

α1y1 + α2y2 = α1Tx1 + α2Tx2 = T (α1x1 + α2x2) ∈ ImT. (6.2)

Hence ImT is a subspace of Y .

Definition 6.9 (Kernel(Null space)). Let T : X → Y be a linear map, the set {x ∈ X|T (x) = 0}, denoted bykerT , is called the kernel or null space of f .

Proposition 6.7. Let T : X → Y be a linear map, then kerT is a subspace of X.

Proof. Let x1, x2 ∈ kerT , then for any α1, α2 ∈ F, we have T (α1x1 + α2x2) = α1Tx1 + α2Tx2 = 0, whichfollows that α1x1 + α2x2 ∈ kerT . Hence, kerT is a subspace of X.

Proposition 6.8. Let T : X → Y be a bijective linear map, then the inverse map T−1 is also linear.

Proof. Let y1, y2 ∈ Y , then since T is bijective, there exist unique x1, x2 ∈ X such that Tx1 = y1 andTx2 = y2, i.e. x1 = T−1y1 and x2 = T−1y2. Then, for any α1, α2 ∈ F, we have T (α1x1+α2x2) = α1y1+α2y2,thus T−1(α1y1 + α2y2) = α1x1 + α2x2 = α1T

−1x1 + α2T−1x2. It means that T−1 is linear.

Definition 6.10 (Linear isomorphism). Let X and Y be two vector spaces, a bijective linear map T : X → Y(there exists an inverse map) is called a linear isomorphism. X and Y are said to be linearly isomorphicif there is a linear isomorphism between them.

Proposition 6.9. Let T : X → Y be a linear map, then

1. T is injective if and only if Tx = 0 ⇒ x = 0, i.e. kerT = {0};

2. T is surjective if and only if ImT = Y ;

3. T is bijective (isomorphism) if and only if kerT = {0} and ImT = Y .

Proof. 1. First, we assume that T is injective. Then there is a unique x such that Tx = 0. And we knowT0 = 0, thus Tx = 0 ⇒ x = 0. For the other direction, we assume that Tx = 0 ⇒ x = 0. Then forx1, x2 ∈ X with x1 6= x2, we have Tx1 − Tx2 = T (x1 − x2) 6= 0, i.e. Tx1 6= Tx2, thus T is injective.

2. This is just a re-statement of the definition of a surjective map.

3. This immediately follows from the above two points.

The following proposition tells us that the characteristics of a linear map can be determined from itsbehavior on the basis.

Proposition 6.10. Let B be a basis of a vector space X, and T : X → Y be a linear map, and TB = {Te|e ∈B},

1. if T is injective, then TB is linearly independent;

2. if T is surjective, then TB spans Y ;

70

6.4. QUOTIENT SPACES CHAPTER 6. REVIEW OF LINEAR ALGEBRA

3. if T is bijective (isomorphism), then TB is a basis of Y .

Proof. We only need to prove the first two points, while the third one is an immediate corollary of them.

1. Suppose T is injective and TB is linearly dependent, then there exists e1, . . . , en ∈ B, and α1, . . . , αn ∈ Fsuch that

∑ni=1 αiTei = 0, thus T (

∑ni=1 αiei) = 0. Because T is injective,

∑ni=1 αiei = 0, contradicting

the assumption that B is a basis.

2. When T is surjective, for each y ∈ Y , we have y = Tx for some x ∈ X. Since B is a basis of X, we canwrite x =

∑ni=1 αiei with ei ∈ B for each i. As a result, y =

∑ni=1 αiTei, thus y ∈ span(TB).

Theorem 6.1. Let T : X → Y be a linear transform, then

dim(X) = dim(ImT ) + dim(kerT ).

Proof. Choose a subspace V of X such that X = kerT ⊕ V , then dim(X) = dim(kerT ) + dim(V ). It sufficesto show that V and ImT is isomorphic. Let T ′ : V → ImT be the restriction of T , we are going to show thatT ′ is an isomorphism.

Given x ∈ V with T ′x = 0, then Tx = 0, thus x ∈ kerT . Note that V ∩ kerT = {0}, thus x = 0.Hence, T ′ is injective. For each y ∈ ImT , then there exists x ∈ X such that y = Tx. On the other hand,X = kerT ⊕ V , hence there is u ∈ kerT and v ∈ V such that x = u + v. Therefore, y = Tu + Tv, andTu = 0, thus y = Tv = T ′v. Hence, T ′ is surjective. To conclude, T ′ is an linear isomorphism, thusdim(ImT ) = dim(V ).

Definition 6.11 (Rank of linear transform). Let T be a linear transform, then dim(ImT ) is called the rankof T , denoted by rankT .

6.4 Quotient Spaces

Proposition 6.11. Let X be a vector space and E be its subspace, then the relation x ∼ y defined by x−y ∈ Eis an equivalence relation, and the equivalence class containing x is given by [x] = x + E = {x + v|v ∈ E},which is called a coset of E.

Definition 6.12 (Quotient space). Let X be a vector space over field F and E be its subspace, then theequivalence relation x ∼ y defined by x − y ∈ E induces the quotient set Q = {[x]|x ∈ E}, on which we candefine addition as

[x] + [y] = [x+ y].

and scalar multiplication asα[x] = [αx].

These operations are well defined (independent from the choice of representatives). Hence, Q together withthe these linear operations forms a vector space over field F, called the quotient space, denoted by X/E.

The following proof verifies that the linear operations defined on the quotient space are well defined.

Proof. For addition, we are to show x ∼ x′, y ∼ y′ ⇒ x+y ∼ x′+y′, which can be seen from (x′+y′)−(x+y) =(x′ − x) + (y′ − y) ∈ E. For scalar multiplication, we are to show x ∼ x′ ⇒ αx ∼ αx′, ∀α ∈ F, which can beseen from αx′ − αx = α(x′ − x) ∈ E.

Proposition 6.12. Let X be a vector space, and E be its subspace, let Y = X/E, and define T : X → Y byTx = [x], then T is a linear map, and kerT = E.

Proof. Let x, y ∈ X, then for any α, β ∈ F, we have

T (αx+ βy) = [αx+ βy] = [αx] + [βy] = α[x] + β[y] = αTx+ βTy (6.3)

Hence, T is linear. AndTx = 0 ⇔ [x] = 0 ⇔ x ∼ 0 ⇔ x ∈ E. (6.4)

This implies that kerT = E.

71

6.4. QUOTIENT SPACES CHAPTER 6. REVIEW OF LINEAR ALGEBRA

Theorem 6.2. Let T : X → Y be a linear map then X/ kerT is a isomorphic to ImT , in particular, the mapT ′ : X/ kerT → ImT given by T ′[x] = Tx is a well defined linear isomorphism.

Proof. To show that T ′ is well defined, it is equivalent to showing that given x, y ∈ X with x ∼ y, we haveTx = Ty, this can be seen from

Tx− Ty = T (x− y) = 0, (6.5)

the last equality is due to that x− y ∈ kerT . Then, we show that T ′ is an isomorphism. First, given x withT ′[x] = 0, then Tx = 0, thus x ∈ kerT , i.e. [x] = [0]. Hence, T is injective. Second, for each y ∈ ImT , wecan find x ∈ X with Tx = y, and thus T ′[x] = y. Hence, T is surjective.

Proposition 6.13 (Codimension). Let X be a vector space over field F, and E be a subspace of X, then thedimension of the quotient space dim(X/E) is called the codimension of E in X, denoted by codimE.

Proposition 6.14. Let X be a vector space, U and V be its subspaces with X = U ⊕ V , then X/U isisomorphic to V , in particular, T : V → X/U defined by v 7→ [v] to V is an isomorphism.

Proof. Suppose Tv = [0], then v ∈ U , since U ∩ V = {0}, thus v = 0. Hence, T is injective. Given any[x] ∈ X/U , we can write x = u+ v, thus x− v ∈ U , i.e. x ∼ v. As a result, Tv = [v] = [x]. It follows that Tis surjective.

Corollary 6.1. Let X be a vector space over field F, and E be a subspace of X, then

dim(E) + codim(E) = dim(E) + dim(X/E) = dim(X).

72

Chapter 7

Normed and Banach Spaces

7.1 Metric Spaces

7.1.1 Metric and Metric Spaces

Definition 7.1 (Metric (distance)). Let X be a set, then d : X ×X → R is called a metric or a distanceon X if it satisfies

1. (positiveness) d(x, y) ≥ 0, ∀x, y ∈ X, and d(x, y) = 0 ⇔ x = y;

2. (symmetry) d(x, y) = d(y, x), ∀x, y ∈ X;

3. (triangle inequality) d(x, z) ≤ d(x, y) + d(y, z), ∀x, y, z ∈ X.

Definition 7.2 (Metric space). A set X together with a metric d defined thereon is called a metric space,denoted by (X, d).

Proposition 7.1. Each metric space induces a topology, which is generated by the topological basis {B(x, r)|x ∈X, r > 0}, where B(x, r) = {y ∈ X|d(y, x) < r}. Being a topological space with the induced topology, a metricspace is Hausdorff.

Definition 7.3 (Open ball, Closed ball, and Sphere). Let (X, d) be a metric space, then

1. B(x0, r) = {x ∈ X|d(x, x0) < r} is called the open ball centered at x0 with radius r;

2. B(x0, r) = {x ∈ X|d(x, x0) ≤ r} is called the closed ball centered at x0 with radius r;

3. S(x0, r) = {x ∈ X|d(x, x0) = r} is called the sphere centered at x0 with radius r.

It is easy to see that open balls are open sets, closed balls and spheres are closed sets.

7.1.2 Convergence in Metric Spaces

Definition 7.4 (Convergence in metric space). Let (X, d) be a metric space, a sequence (xn)∞n=1 is said toconverges to x ∈ X, if d(xn, x) → 0, and x is called the limit of (xn), denoted by limn→∞ xn = x or xn → x.

Definition 7.5 (Bounded set). A subset A of a metric space (X, d) is said to be bounded, if there existsM > 0 such that d(x, y) ≤M, ∀x, y ∈ A.

Definition 7.6 (Cauchy sequence). Let (X, d) be a metric space, a sequence (xn)∞n=1 is called a Cauchysequence for every ε > 0, there exists N such that d(xm, xn) < ε, ∀m,n > N .

The relations between convergence, boundedness and Cauchy sequence are summarized by the followingproposition.

Proposition 7.2. Let (X, d) be a metric space, and (xn)∞n=1 be a sequence in X, then

1. if (xn) is a Cauchy sequence, then (xn) is bounded;

73

7.1. METRIC SPACES CHAPTER 7. NORMED AND BANACH SPACES

2. if (xn) is convergent, then (xn) is a Cauchy sequence;

3. if (xn) is a Cauchy sequence and has a convergent subsequence that converges to x, then (xn) convergesto x;

4. if every subsequence of (xn) has a subsequence that converges to the same limit x, then (xn) convergesto x.

Proof. 1. Since (xn) is Cauchy, we can find N , such that d(xN , xm) < ε for some ε > 0 and every m > N .Let D = max d(x1, xN ), . . . , d(xN−1, xN ), ε, then for any i, j ∈ N+, we have

d(xi, xj) ≤ d(xi, xN ) + d(xj , xN ) ≤ 2D. (7.1)

Hence, (xn) is bounded.

2. Let (xn) converge to x, then for any ε > 0, there exists N such that d(xn, x) < ε/2 for any n > N .Hence, for any m,n > N , we have d(xm, xn) ≤ d(xm, x) + d(xn, x) < ε, which means that (xn) isCauchy.

3. Let (xn) be a Cauchy sequence with a convergent subsequence (xnk) that converges to x. Then, given

ε > 0, we can find N such that d(xnk, x) < ε/2 when nk > N , and d(xn, xm) < ε/2 when n,m > N ,

thus for every n > N , we can find a nK > n, and d(xn, x) ≤ d(xn, xnK) + d(xnK

, x) < ε. Hence (xn)converges to x.

4. Suppose (xn) does not converge to x, we can find a subsequence of (xnk) such that d(xnk

, x) > ε,∀k ∈ N+

for some ε > 0. It is easy to see that every subsequence of (xnk) does not converge to x, violating the

assumption that every subsequence has a subsequence that converges to x.

Definition 7.7 (Dense set). Let X be a topological space, and A ⊂ X is called dense in X, if its closureA = X.

There are several equivalent statements in testing whether a set is dense.

Proposition 7.3. Let X be a metric space, A be a subset of X then the following statements are equivalent

1. A is dense;

2. for every x ∈ X, there is a sequence (an)∞n=1 in A such that an → x;

3. every open ball in X contains some point in A.

Proof. 1. (1) ⇒ (2). This immediately follows from the fact in topology that every point of A is either inA or a limit point of A;

2. (2) ⇒ (3). Given x ∈ X, we find (an) in A that converges to x. For any ε > 0, there exists N ∈ N+

such that d(an, x) < ε i.e. an ∈ B(x, ε) for every n > N . Hence, B(x, ε) ∩A 6= ∅.

3. (3) ⇒ (1). If statment (3) does not hold, then we can find a non-empty open ball U with U ∩ A = ∅,which implies that A ⊂ U c. Since U c is closed. A ⊂ U c, thus A 6= X.

Definition 7.8 (Separable metric space). A metric space X is said to be separable if there exists a countabledense set in X.

Definition 7.9 (Complete metric space). A metric space X is said to be complete if every Cauchy sequencein X is convergent.

74

7.2. NORMED SPACES CHAPTER 7. NORMED AND BANACH SPACES

7.2 Normed Spaces

7.2.1 Norm and Normed Spaces

In the following discusssion, the field F is either R or C.

Definition 7.10. Let X be a vector space over the field F, a norm on X is a real-valued function || · || onX that satisfies

1. ∀x ∈ X ||x|| ≥ 0 and ||x|| = 0 ⇔ x = 0;

2. ∀x ∈ X,α ∈ F ||αx|| = |α|||x||;

3. ∀x, y ∈ X, ||x+ y|| ≤ ||x|| + ||y||.

Definition 7.11 (Normed space). A (real or complex) vector space X together with a norm || · || definedthereon is called a normed space, denoted by (X, || · ||).

Proposition 7.4. Let (X, || · ||) be a normed space, then d : X × X → R defined by d(x, y) = ||x − y|| isa metric on X, which is called the induced metric. Hence, every normed space is considered as a metricspace with with the induced normed.

Proposition 7.5. Let (X, || · ||) be a normed space, then the norm x 7→ ||x|| is a continuous function.

Proof. When xn → x, by definition of convergence in metric space, it has ||xn−x|| → 0, hence |||xn||−||x||| ≤||xn − x|| → 0, which implies that || · || is continuous.

Proposition 7.6. The addition (x, y) 7→ x + y and scalar multiplication (α, x) 7→ αx in a normed space iscontinuous.

Proof. Let (xn) → x, (yn) → y, and (αn) → α. Then,

||(x+ y) − (xn + yn)|| = ||(x− xn) + (y − yn)|| = ||x− xn|| + ||y − yn|| → 0 (7.2)

And, since (xn) is convergent, thus it is bounded, and there is M such that ||xn|| ≤M for every n, and thus

||αnxn − αx|| ≤ ||αnxn − αxn|| + ||αxn − αx|| ≤M ||αn − α|| + ||α|| · ||xn − x|| → 0. (7.3)

The two equations above respectively show the continuity of addition and scalar multiplication.

Definition 7.12 (Equivalent norms). Let X be a vector space, || · ||1 and || · ||2 be two norms defined on Xif there are a, b > 0 such that

a||x||1 ≤ ||x||2 ≤ b||x||1, ∀x ∈ X,

then || · ||1 and || · ||2 are said to be equivalent norms. It is easy to verify that that this is an equivalencerelation between norms on X.

Theorem 7.1. Let X be a vector space, || · ||1 and || · ||2 be two norms on X, then they are equivalent normsif and only if they induce the same topology of X.

Proof. First we show that equivalent norms induce the same topology. Suppose there are a, b > 0 such thata||x||1 ≤ ||x||2 ≤ b||x||1, ∀x ∈ X. Let T1 and T2 be the topologies respectively induced by || · ||1 and || · ||2.Given A ∈ T1, then for every a ∈ A, we can find an r > 0 such that {x|||x − a||1 < r} ⊂ A, and hence{x|||x− a||2 < br} ⊂ A, thus A ∈ T2. Likewise, we can show A ∈ T2 ⇒ A ∈ T1. As a result, T1 = T2.

Then we show that the norms that induce the same topology are equivalent. Suppose both || · ||1 and|| · ||2 induce the topology T , then within {x|||x||1 < 1}, we can find {x|||x||2 ≤ a} for some a > 0, and in{x|||x||2 < 1} we can find C = {x|||x||1 < b} for some b > 0, hence we have ||x||2 = a ⇒ ||x||1 < 1 and||x||1 = b⇒ ||x||2 < 1, therefore, (1/b)||x||1 ≤ ||x||2 ≤ a||x||1 for every x, which implies that || · ||1 and || · ||2are equivalent norms.

Theorem 7.2. All norms defined on a finite dimensional vector space are equivalent.

75

7.2. NORMED SPACES CHAPTER 7. NORMED AND BANACH SPACES

Proof. Let X be an n-dimensional vector spaces with a basis {e1, . . . , en}. Then any vector x ∈ X can beuniquely written as x =

∑ni=1 αiei. We define ||x||1 =

∑ni=1 |αi|, then it is easy to verify that || · ||1 is a norm

on X. It suffices to show that every norm on X is equivalent to || · ||1.Let || · || be a norm on X. Then for every x =

∑ni=1 αiei ∈ X, we have

||x|| ≤n∑

i=1

|αi|||ei|| ≤(

nmaxi=1

||ei||) n∑

i=1

|αi| =(

nmaxi=1

||ei||)||x||1. (7.4)

Let S = {x|||x|| = 1}, which is a closed set due to the continuity of norm, thus || · ||1 attains the minimumvalue on S, i.e. there exists x∗ ∈ S with ||x∗||1 = minx∈S ||x||1. Since x∗ 6= 0, thus ||x∗||1 > 0. Hence, forevery x 6= 0 we have

||x||1 = ||x|| · ||(x/||x||)||1 ≥ ||x|| · ||x∗||1. (7.5)

Combining the results above, we can conclude that || · || is equivalent to || · ||1.

7.2.2 Seminorm and Norm of Quotient Spaces

Definition 7.13 (Seminorm). Let X be a vector space over the field F, a seminorm on X is a real-valuedfunction ρ on X that satisfies

1. ∀x ∈ X ρ(x) ≥ 0;

2. ∀x ∈ X,α ∈ F ρ(αx) = |α|ρ(x);

3. ∀x, y ∈ X, ρ(x+ y) ≤ ρ(x) + ρ(y).

In particular, a seminorm ρ is a norm if it satisfies ρ(x) = 0 ⇒ x = 0.

Theorem 7.3. Let E be a subspace of a vector space X, and ρ : X → R be a seminorm on X, then thefunction ρ′ : X/E → R given by ρ′([x]) = infy∈E ρ(x− y) is a well defined seminorm. In particular, if ρ is anorm on X, and E is closed (w.r.t the topology induced by ρ), then ρ′ is a norm on X/E.

Proof. 1. First of all, we need to show that the given construction is well defined, i.e. when x ∼ x′, wehave infy∈X0 ρ(x− y) = infy′∈X0 ρ(x

′ − y′).

Given x, x′ ∈ X with x ∼ x′, i.e. x−x′ ∈ E. Let d = infy∈X0 ρ(x−y), we can find (yn)∞n=1 in E such thatρ(x− yn) → d. Then for x′, we let y′n = x′−x+ yn, then y′n ∈ E for each n, and ρ(x′− y′n) = ρ(x− yn),as a result, ρ(x′ − y′n) → d. Hence,

infy′∈E

ρ(x′ − y′) ≤ infnρ(x′ − y′n) = d = inf

y∈Eρ(x− y). (7.6)

Likewise, we have infy∈E ρ(x − y) ≤ infy′∈E ρ(x′ − y′). Hence, the equality holds, implying that theconstruction is well defined.

2. Then, we need to show that the defined function is a seminorm.

(a) ρ([x]) is non-negative for each x ∈ X, which directly inherits from the non-negativity of ρ. Inparticular, when [x] = [0], since x ∈ E, we have ρ′([x]) ≤ rho(x− x) = ρ(0) = 0.

(b) Consider ρ(α[x]). If α = 0,

ρ′(α[x]) = ρ′([αx]) = ρ′([0]) = 0 = |α|ρ′([x]). (7.7)

If α 6= 0, let (yn) be a sequence in E that has ||x − yn|| → ρ′([x]). Then (αyn) is also a sequencein E, and it has ρ(αx− αyn) = |α|ρ(x− yn). Hence by definition, we have

ρ′(α[x]) = ρ′([αx]) = infy∈X0

ρ(αx− y) ≤ infnρ(αx− αyn) = |α| · inf

nρ(x− yn) = |α| · ρ′([x]). (7.8)

On the other hand, we note that x = α−1αx. Applying the above conclusion, we have

ρ′([x]) = ρ′(α−1[αx]) ≤ |α|−1 · ρ′([αx]). (7.9)

which implies that ρ′([αx]) ≥ |α|ρ′([x]). Combining the above two results, we obtain ρ′(α[x]) =|α| · ρ′([x]).

76

7.3. BANACH SPACES CHAPTER 7. NORMED AND BANACH SPACES

(c) Let x, y ∈ X, then there exists sequences (un)∞n=1 and (vn)∞n=1 in E such that ρ(x− un) → ρ′([x])and ρ(y − vn) → ρ′([y]), let wn = un + vn, then (wn) is also a sequence in E, and we have

ρ′([x] + [y]) = ρ′([x+ y]) = infw∈X0

ρ(x+ y − w) ≤ infnρ(x+ y − wn)

= infnρ((x− un) + (y − vn)) ≤ lim

n→∞ρ(x− un) + ρ(y − vn) = ρ′([x]) + ρ′([y]). (7.10)

The triangle inequality is thus established.

Therefore, the function defined by ρ([x]) = infy∈X0 ρ(x− y) is a well-defined seminorm on X/X0.

3. Finally, we show that when ρ is a norm, ρ′ is a norm on X/E. Suppose ρ′([x]) = 0, then there exists asequence (yn) in E such that ρ(x, yn) → 0, which means that yn → x. Since E is closed, it contains allits limit points, thus x ∈ E, i.e. x ∼ 0 or equivalently [x] = [0]. Hence ρ′ is a norm.

Theorem 7.4. Let X be a (complex or real) vector space, and ρ be a seminorm on X, then E = {x|ρ(x) = 0}is a subspace of X. On the quotient space X/E, the function given by ||[x]|| = infy∈E ρ(x− y) is a norm, andit has ||[x]|| = ρ(x).

Proof. We first show that E is a closed subspace. Let x, y ∈ E, and α, β ∈ F, then by sub-additivity, we haveρ(αx+ βy) ≤ αρ(x) + βρ(y) = 0, thus αx+ βy ∈ E.

By theorem 7.3, we know that || · || as defined above is a seminorm. To show that it is a norm, it sufficesto show that ||[x]|| = 0 ⇒ x ∈ E. By definition, there is a sequence (yn) in E such that ρ(x, yn) → 0.By sub-additivity of ρ, we have ρ(x) ≤ ρ(x, yn) + ρ(yn) = ρ(x, yn), which can be arbitrarily small. Hence,ρ(x) = 0, thus x ∈ E. Then, we can conclude that || · || is a norm on X E.

Finally, we show the equality ||[x]|| = ρ(x). For each y ∈ E, we have ρ(x) ≤ ρ(x, y) + ρ(y) = ρ(x, y),hence ρ(x) ≤ infy∈E ρ(x, y) = ||[x]||. On the other hand, ρ(x) = ρ(x− 0) ≥ infy∈E ρ(x, y). Together, we have||[x]|| = ρ(x).

7.3 Banach Spaces

Recall that a complete metric space is a space in which every Cauchy sequence converges.

Definition 7.14 (Banach space). A complete normed space is called a Banach space.

Proposition 7.7. A subspace Y of a Banach space X is complete if and only if it is closed.

Proof. First, suppose Y is complete. Given a sequence in Y that converges to some point x ∈ X, then it isCauchy, and by completeness of Y , it converges in Y , thus x ∈ Y , which implies that Y contains all its limitpoints, thus Y is closed. Second, suppose Y is closed. Given a Cauchy sequence in Y , it must converge tosome point x ∈ X due to completeness of X, and thus by closedness of Y , we have x ∈ Y , which implies thatY is complete.

Definition 7.15 (Isometry). Let X and Y be normed spaces, then T : X → Y is called a isometry if it isan isomorphism and preserves norm, i.e. ||Tx|| = ||x||, ∀x ∈ X. X and Y are said to be isometric if thereis an isometry between them.

Theorem 7.5 (Existence of completion of normed space). Let E be a normed vector space, then there existsa complete normed space E and a linear transform T such that ||Tx|| = x, ∀x ∈ E, and TE is dense in E.Here, E is called a completion of E.

Proof. The basic idea is that we start from the vector space of Cauchy sequences and then derive a quotientspace by combining the ones that are equivalent (we will define the equivalence later during the construction).Let C(E) be the set of all Cauchy sequences of E.

Claim 1 C(E) is a vector space, i.e. it is closed under addition and scalar multiplication.

77

7.3. BANACH SPACES CHAPTER 7. NORMED AND BANACH SPACES

Proof of Claim 1. Let x, y ∈ C(E), then given ε > 0, there exists sufficiently large N such that ||xm − xn|| <ε/2 and ||ym − yn|| < ε/2 when m,n > N . Hence,

||(x+ y)m − (x+ y)n|| = ||(xm − xn) + (ym − yn)|| ≤ ||xm − xn|| + ||ym − yn|| < ε. (7.11)

Thus, x+y ∈ C(E). Given α ∈ C, if α = 0, then for any sequence x, αx is the zero sequence, which is obviouslyCauchy and thus in C(E). If α 6= 0, given x ∈ C(E) and ε > 0, there exists M such that ||xm − xn|| < ε/|α|.As a result,

||(αx)m − (αx)n|| = ||α(xm − xn)|| = |α| · ||xm − xn|| < ε. (7.12)

Hence, αx ∈ C(E). Then, we can conclude that C(E) is a vector space.

Given x, y ∈ C(E), we define the following relation

x ∼ y ⇔ limn→∞

||xn − yn|| = 0.

Claim 2 The relation defined above is an equivalence relation.

Proof of Claim 2. First, x ∼ x is obvious, since ||xn − xn|| = 0, ∀n, its limit must be zero. Second, x ∼y ⇒ y ∼ x follows from the fact that ||xn − yn|| = ||yn − xn||. Third, if x ∼ y and y ∼ z, then by triangleinequality, ||xn − zn|| ≤ ||xn − yn|| + ||yn − zn|| → 0, thus x ∼ z.

In this sense, we call x and y equivalent Cauchy sequences if x ∼ y, and we use [x] to denote all the Cauchysequences that are equivalent to x.

Claim 3 Based on the equivalence relation defined above, [0] is a subspace of C(E).

Proof of Claim 3. Given x, y ∈ C(E) with x ∼ 0 and y ∼ 0, and α, β ∈ C, then

limn→∞

||αxn + βyn|| ≤ limn→∞

(|α| · ||xn|| + |β| · ||yn||) = |α| limn→∞

||xn|| + |β| limn→∞

||yn|| = 0. (7.13)

Hence, αxn + βyn ∼ 0, implying that [0] is a subspace.

Hence, we can define the quotient space E = C(E)/[0] with [x] + [y] = [x + y] and [αx] = α[x]. (It is awell known result in linear algebra that the operations in quotient space are well defined).

Claim 4 The vector space E defined above can be equipped with a norm || · || defined by ||[x]|| = limn→∞ ||xn||.

Proof of Claim 4. First of all, we need to verify that such a definition is well defined, i.e. x ∼ y ⇒limn→∞ ||xn|| = limn→∞ ||yn||. This is briefly shown below.

limn→∞

||xn|| = limn→∞||yn + (xn − yn)|| ≤ limn→∞

||yn|| + limn→∞

||xn − yn|| = limn→∞

||yn||, (7.14)

likewise, we can get limn→∞ ||yn|| ≤ limn→∞ ||xn||. Then, the equality is established. In the following, weshow that this real valued function is a norm on E.

1. ||[x]|| is non-negative, which follows from the fact that ||xn|| is non-negative. And, when [x] = [0], i.e.x ∼ 0, then by definition of the equivalence relation, we have

||[x]|| = 0 ⇔ limn→∞

||xn|| = 0 ↔ x ∼ 0 ⇔ [x] = [0]. (7.15)

2. Given x ∈ C(E) and α ∈ C, we have

||α · [x]|| = ||[α · x]|| = limn→∞

||αxn|| = limn→∞

|α| · ||xn|| = |α| · limn→∞

||xn|| = |α| · ||[x]||. (7.16)

3. Given x, y ∈ C(E), we have

||[x]+[y]|| = ||[x+y]|| = limn→∞

||xn +yn|| ≤ limn→∞

(||xn||+ ||yn||) = limn→∞

||xn||+ limn→∞

||yn|| = ||[x]||+ ||[y]||.(7.17)

78

7.3. BANACH SPACES CHAPTER 7. NORMED AND BANACH SPACES

Hence, we can conclude that || · || is a well-defined norm on E.

Claim 5 The normed space (E, || · ||) defined above is complete, i.e. it is a Banach space.

Proof of Claim 5. Let ([x1], [x2], . . .) be a Cauchy sequence in E. Note that each element in this sequence isin itself an equivalence class of Cauchy sequences. We use xij to denote the j-th element in xi. To show thatE is complete, it suffices to show that there is [y] ∈ E such that limn→∞ ||[xn − y]|| = 0.

We define y by yj = xjj (take the j-th vector in the sequence xj to be the j-th vector in the sequence y).First, we need to show that y ∈ C(E), i.e. it is a Cauchy sequence. We do this as follows.

Since ([xn]) is Cauchy, given ε > 0, we can find N1 such that ||[xm]− [xn]|| = limk→∞ ||xmk − xnk|| < ε/3when m,n > N1. Then, we can choose N2, such that when k > N2, ||xmk − xnk|| < ε/3. Hence, whenm,n, q > max(N1, N2), we have ||xmm−xqm|| < ε/3, and ||xqn−xnn|| < ε/3. On the other hand, we fix someq > max(N1, N2), since xq is Cauchy, we can chooseN3 such that whenm,n > N3, we have ||xqm−xqn|| < ε/3.Together, for every m,n > max(N1, N2, N3), we have

||ym − yn|| = ||xmm − xnn|| ≤ ||xmm − xqm|| + ||xqm − xqn|| + ||xqn − xnn|| < ε/3 + ε/3 + ε/3 = ε. (7.18)

It follows that y is a Cauchy sequence.Then, we show that ([xn]) converges to [y]. Given ε > 0, we can choose M , such that ||[xn − xk]|| < ε

for each n, k < ε, meaning limi→∞ ||xni − xki|| < ε, so when k is sufficiently large ||xnk − xkk|| < ε, whichfollows that limk→∞ ||xnk − xkk|| < ε. By definition, it means that ||[xn]− [y]|| < ε, ∀n > M . As we can findsuch M for arbitrarily small ε > 0, we can conclude that limn→∞ ||[xn]− [y]|| = 0, in other words, [xn] → [y].Therefore, E is complete.

We use x = (x, x, . . .) to denote a constant sequence (which is clearly Cauchy), and define T : E → E by

Tx = [x].

Claim 6 The T defined above is an injective linear map with ||Tx|| = ||x||, ∀x ∈ E.

Proof of Claim 6. The linearity of T can be seen from below. Given x, y ∈ E and α, β ∈ C, we have

T (αx+ βy) = [ ˜αx+ βy] = [αx+ βy] = α[x] + β[y] = αTx+ βTy. (7.19)

Then we show that it preserves norm as follows.

||Tx|| = ||[x]|| = limn→∞

||xn|| = ||x||. (7.20)

Since T preserves norm, Tx = 0 ⇒ ||x|| = ||Tx|| = 0 ⇒ x = 0, thus T is injective.

Claim 7 TE is dense in E, i.e. TE = E.

Proof of Claim 7. It suffices to prove that given any [y] ∈ E, there is a sequence of (xn)∞n=1 in E such thatTxn → [y].

Given [y] ∈ E, we define xn = yn, then since y is a Cauchy sequence.

limn→∞

||[y] − Txn|| = limn→∞

||[y] − [xn]|| = limn→∞

(lim

k→∞||yk − xnk||

)= lim

n→∞lim

k→∞||yk − yn|| = 0. (7.21)

It means that Txn → y, and therefore, we can conclude that TE is dense in E.

The entire proof of existence is completed.

Theorem 7.6 (Uniqueness of completion of normed space). Let E1 and E2 be two completion of E, then theyare isometric. Formally, if E1 and E2 are complete normed spaces, and there are linear maps T1 : E → E1

and T2 : E → E2 which satisfy TiE = E and ||Tix|| = ||x|| ∀x ∈ E for each i = 1, 2, then there is an isometryT : E1 → E2.

79

7.3. BANACH SPACES CHAPTER 7. NORMED AND BANACH SPACES

Proof. We have shown above that T1 and T2 are both injective. By restricting their target spaces, we getT ′

1 : E → T1E and T ′2 : E → T2E, which are both isomorphisms. Define T = T ′

2 ◦ T ′−11 : T1E → T2E. Then

T is an isomorphism between T1E and T2E.Based on T , we define T : E1 → E2 as follows. Given y1 ∈ E1, since T1E is dense in E1, we can

choose a sequence (xn) in E such that (T1xn) converges to y1. Because T1 preserves norm, we know that||xm − xn|| = ||T1xm − T1xn||, and note that (T1xn) is Cauchy (due to convergence), hence (xn) is Cauchy.And, since T2 also preserves norm, we can see that the sequence (T2xn) in E2 is also a Cauchy sequence. Bycompleteness of E2, T2xn converges to a unique element y2 ∈ E2. The map T is defined to be y1 7→ y2, wherey2 is found as described above.

Claim 1 The map T is well-defined.

Proof of Claim 1. We need to prove that y2 is independent from the choice of the intermediate Cauchysequence (xn) in E. Let x and x′ be two sequences in E such that T1xn → y1 and T1x

′n → y1, hence

T1(xn − x′n) → 0, which follows that ||xn − x′n|| = ||T1(xn − x′n)|| → 0. Note that T2 also preserves norm,thus ||T2xn − T2x

′n|| → 0, implying that T2xn and T2x

′n converges to the same limit. Therefore, y2 chosen by

the described process is unique for each y1, meaning that T is well defined.

Claim 2 T is a linear map with ||T x|| = ||x|| for every x ∈ E1.

Proof of Claim 2. Let x1, y1 ∈ E1 and α, β ∈ C, we choose two sequences (un) and (vn) in E with T1un → x1

and T1vn → y1. Let x2 = T x1 = limn→∞ T2un and y2 = T y1 = limn→∞ T2vn. (We have shown above thatx2 and y2 are well defined and independent from the choice of (un) and (vn)). Then, it is easy to see thatT1(αun + βvn) → (αx1 + βy1), hence

T (αx1 + βy1) = limn→∞

T2(αun + βvn) = α limn→∞

T2un + β limn→∞

T2vn = αTx1 + βTx2. (7.22)

Hence, T is a linear map.In addition, by continuity of norm and the assumption that T1 and T2 preserve norms, we have

||T x1|| = || limn→∞

T2un|| = limn→∞

||T2un|| = limn→∞

||un|| = limn→∞

||T1un|| = || limn→∞

T1un|| = ||x1||. (7.23)

Hence, T preserves norm.

Claim 3 T is a linear isomorphism.

Proof of Claim 3. It suffices to show that T is bijective. First, since T preserves norm, thus

T x = 0 ⇒ ||T x|| = 0 ⇒ ||x|| = 0 ⇒ x = 0. (7.24)

Hence, T is injective. Then we show that it is also surjective. Given y2 ∈ E2, we can find a sequence (xn) inE with T2xn → y2, and hence xn is a Cauchy sequence, thus T1xn converges to an element in E1, denoted byy1. According to the construction process described above, we know T y1 = y2. Therefore, we can concludethat T is a linear isomorphism.

To sum up, T is an isometry. The proof of the theorem of uniqueness is completed.

Proposition 7.8. Every finite dimensional normed space is a Banach space.

Proof. Let X be a normed space with a basis {e1, . . . , en}. Given a Cachy sequence (xi)∞i=1 in X. Letxi =

∑nj=1 αijej . Note that all norms are equivalent in finite dimensional space, thus (xi) is Cachy in terms

of the norm defined by ||x||1 =∑n

j=1 |αj | where x =∑n

j=1 αjej . From this, we can readily see that (αij)∞i=1

is Cachy for each j, hence it is convergent (due to completeness of R and C) Define βj = limi→∞ aij , andx =

∑nj=1 βjej . Then

||xi − x|| ≤n∑

j=1

|αij − βj | · ||ej || → 0. (7.25)

Hence xi → x by definition. Therefore, we can conclude that X is a Banach space.

Corollary 7.1. Every finite dimensional subspace of a normed space is closed.

80

Chapter 8

Hilbert Spaces

8.1 Inner Product and Inner Product Spaces

Definition 8.1 (Inner product). Let H be a complex vector space, then a bineary operation 〈·, ·〉 : H×H → Cis called an inner product if it satisfies

1. (linearity w.r.t the first argument) ∀x, y, z ∈ H,α, β ∈ C 〈αx+ βy, z〉 = α〈x, z〉 + β〈y, z〉;

2. (conjugate symmetry) ∀x, y ∈ H 〈x, y〉 = 〈y, x〉;

3. (positiveness) ∀x ∈ H 〈x, x〉 ≥ 0 and 〈x, x〉 = 0 ⇔ x = 0.

Definition 8.2 (Inner product space). A vector space H together with an inner product defined thereon iscalled an inner product space.

Theorem 8.1 (Cauchy-Schwartz inequality). Let H be an inner product space and x, y ∈ H, then

|〈x, y〉| ≤ 〈x, x〉1/2〈y, y〉1/2,

and the equality holds if and only if x and y are linearly dependent.

Proof. Without losing generality, we assume y 6= 0 through the entire proof.

0 ≤ 〈x− λy, x− λy〉 = 〈x, x〉 + |λ|2〈y, y〉 − 2Re(λ〈x, y〉) (8.1)

Let λ = 〈x, y〉/〈y, y〉, then we have

〈x, x〉 +|〈x, y〉|2

〈y, y〉− 2

|〈x, y〉|2

〈y, y〉≥ 0. (8.2)

It follows that|〈x, y〉|2 ≤ 〈x, x〉〈y, y〉. (8.3)

The inequality is derived. Suppose x and y are linearly independent, we can write x = λy. then

|〈x, y〉|2 = |〈λy, y〉|2 = |λ|2〈y, y〉2 (8.4)

and〈x, x〉〈y, y〉 = 〈λy, λy〉〈y, y〉 = |λ|2〈y, y〉2 (8.5)

Combining them, we get the equality. For the other direction, we assume that the equality holds, then

〈x, x〉 +|〈x, y〉|2

〈y, y〉− 2

|〈x, y〉|2

〈y, y〉= 0, (8.6)

by letting λ = 〈x, y〉/〈y, y〉, the equality can be written into

|〈x, y〉|2

〈y, y〉− 2

|〈x, y〉|2

〈y, y〉= 〈x− λy, x− λy〉 = 0. (8.7)

It implies that x− λy = 0.

81

8.1. INNER PRODUCT AND INNER PRODUCT SPACES CHAPTER 8. HILBERT SPACES

Proposition 8.1. Let H be an inner product space and, then || · || : H → R given by ||x|| = 〈x, x〉1/2 definesa norm on H, hence an inner product space can be considered as a normed space with the induced norm.

Proof. The positiveness of || · || follows from the positiveness of inner product, while the ||αx|| = |α| · ||x||follows from the linearity and conjuate symmetry. It remains to show the triangle inequality.

||x+ y||2 = 〈x+ y, x+ y〉2 = 〈x, x〉 + 〈y, y〉 + 2Re〈x, y〉 = ||x||2 + ||y||2 + 2Re〈x, y〉. (8.8)

By Cauchy-Schwartz’s inequality

Re〈x, y〉 ≤ |〈x, y〉| ≤ 〈x, x〉1/2〈y, y〉1/2 = ||x|| · ||y||. (8.9)

Hence,||x+ y||2 ≤ ||x||2 + ||y||2 + 2||x|| · ||y|| = (||x|| + ||y||)2. (8.10)

The triangle inequality is derived.

The following are some typical examples of Hilbert space.

1. Cn: 〈x, y〉 =∑n

i=1 xiyi;

2. l2: 〈x, y〉 =∑∞

i=1 xiyi;

3. L2(X,µ): 〈f, g〉 =∫

Xfgdµ.

Proposition 8.2. The inner product is a continuous function of both arguments.

Proof. Consider an inner product space H, let (xn)∞n=1 be a sequence converging to x ∈ H, and (yn)∞n=1 bea sequence converging to y ∈ H, it suffices to show that 〈xn, yn〉 → 〈x, y〉. This can be seen from

|〈xn, yn〉 − 〈x, y〉| ≤ |〈xn, yn〉 − 〈x, yn〉| + |〈x, yn〉 − 〈x, y〉|= |〈xn − x, yn〉| + |〈x, 〈〉, yn − y〉| ≤ ||xn − x|| · ||yn|| + ||x|| · ||yn − y|| → 0. (8.11)

Note that in the above deduction, we made use of the fact that ||yn|| is bounded.

We have seen that every inner product space is a normed space (every inner product induces a norm), butthe converse is in general not true, namely not every normed space can be an inner product space (not everynorm can be induced by an inner product). The following theorem specifies the condition when the conversecan be true.

Theorem 8.2. Let (X, || · ||) be a complex normed space, then || · || can be induced by an inner product if andonly if it satisfies the parallelogram law as follows

||x+ y||2 + ||x− y||2 = 2(||x||2 + ||y||2), ∀x, y ∈ X.

under such condition, the inner product satisfies the following polarization identity.

〈x, y〉 =14(||x+ y||2 − ||x− y||2 + i||x+ iy||2 − i||x− iy||2

).

Proof. For one direction, we assume that || · || is induced by an inner product, i.e. ||x|| = 〈x, x〉1/2 for everyx ∈ X, and then prove the parallelogram law as follows.

||x+ y||2 + ||x− y||2 = 〈x+ y, x+ y〉 + 〈x− y, x− y〉= ||x||2 + ||y||2 + 2Re〈x, y〉 + ||x||2 + ||y||2 − 2Re〈x, y〉= 2(||x||2 + ||y||2). (8.12)

In addition, we have

||x+ y||2 − ||x− y||2 = (||x||2 + ||y||2 + 2Re〈x, y〉) − (||x||2 + ||y||2 − 2Re〈x, y〉) = 4Re〈x, y〉, (8.13)

82

8.2. ORTHOGONALITY CHAPTER 8. HILBERT SPACES

and

(||x+ iy||2 − ||x− iy||2) = (||x||2 + ||y||2 + 2Re〈x, iy〉) − (||x||2 + ||y||2 − 2Re〈x, iy〉) = 4Im〈x, y〉. (8.14)

Hence, the polarization identity is established.For the other direction, the parallelogram law is assumed, we construct an inner product that induces the

given norm. We define 〈·, ·〉 by

〈x, y〉 =14(||x+ y||2 − ||x− y||2 + i||x+ iy||2 − i||x− iy||2

). (8.15)

One can verify that such a construction is an inner product.

Corollary 8.1. An isometry T between Hilbert spaces preserves inner product, i.e. 〈Tx, Ty〉 = 〈x, y〉.

Proposition 8.3. From theorem 8.2, we know that the inner product is uniquely determined by the norm,since an isometry preserves the norm, it also preserves the inner product.

8.2 Orthogonality

Definition 8.3 (Orthogonality). Let H be an inner product space and x, y ∈ H, then x and y are said to beorthogonal if 〈x, y〉 = 0, denoted by x ⊥ y.

We notation is often used for sets. Let A,B be subsets of H, and x ∈ H, then x ⊥ A means x ⊥ y, ∀y ∈ A,and A ⊥ B means x ⊥ y, ∀x ∈ A y ∈ B.

Definition 8.4 (Orthonormal set). Let H be an inner product space and S ⊂ H, then S is called anorthogonal set if the elements in S are mutually orthogonal, i.e. x ⊥ y, ∀x, y ∈ S, x 6= y. Particularly, if||x|| = 1, ∀x ∈ S, S is called an orthonormal set, or an orthonormal system.

Theorem 8.3 (Pythagorean theorem). Let H be an inner product space, and x1, . . . , xn ∈ H be mutuallyorthogonal, then

||x1 + · · · + xn||2 = ||x1||2 + · · · + ||xn||2.

Proof. The proof is simply done by expanding the inner product in the left hand side, and then applying theorthogonal condition to eliminate the mutual terms.

Corollary 8.2. Let H be an inner product space, and {e1, . . . , en} be an orthonormal set in H, then for anyα1, . . . , αn ∈ C, we have ∣∣∣∣∣

∣∣∣∣∣n∑

i=1

αiei

∣∣∣∣∣∣∣∣∣∣2

=n∑

i=1

|αi|2.

Proposition 8.4. A orthogonal set of non-zero vectors is linearly independent.

Proof. Let H be an inner product space, and S ⊂ H be an orthogonal set. Then it is obvious that any finitesubset {x1, . . . , xn} ⊂ S is orthogonal. Given α1, . . . , αn ∈ C with

∑ni=1 αixi = 0, then by Pythagorean, we

have ∣∣∣∣∣∣∣∣∣∣

n∑i=1

αixi

∣∣∣∣∣∣∣∣∣∣ =

n∑i=1

|αi|2||xi||2 = 0, (8.16)

implying that αi = 0 for each i = 1, . . . , n. Hence, x1, . . . , xn are linearly independent. As the linearindependence holds for any finite subset of S, S is linearly independent.

Theorem 8.4 (Bessel’s Inequality). Let (en)∞n=1 be an orthonormal sequence in an inner product space H,then for any x ∈ H, we have

∞∑n=1

|〈x, en〉|2 ≤ ||x||2. (8.17)

(Remark: this inequality also holds for finite sequence).

83

8.3. HILBERT SPACES AND CLOSED SUBSPACES CHAPTER 8. HILBERT SPACES

Proof. Let yn =∑n

i=1〈x, ei〉ei, and zn = x− yn. Then ||y||2 =∑n

i=1 |〈x, ei〉|2. In addition, we for each i ≤ nwe have

〈zn, ei〉 = 〈x, ei〉 −n∑

j=1

(〈x, ej〉)ejei = 〈x, ei〉 − 〈x, ej〉 = 0, (8.18)

thus ei ⊥ zn, which follows that yn ⊥ zn, and hence by Pythagorean, we have

n∑i=1

|〈x, ei〉|2 = ||yn||2 = ||x||2 − ||zn||2 ≤ ||x||2. (8.19)

As this holds for any n ∈ N+, we can conclude that

limi→∞

∞∑i=1

|〈x, ei〉|2 ≤ ||x||2. (8.20)

Theorem 8.5 (Gram-Schmidt orthonormalization). Let {xn}∞n=1 be a linearly independent subset of an innerproduct space H, then we let

e1 =x1

||x1||

rn = xn −n−1∑i=1

〈xn, ei〉ei, ∀n > 1

en =rn

||rn||

Then {en}∞n=1 be an orthonormal set, and for each n, span{ei}ni=1 = span{xi}n

i=1, and span{ei}∞i=1 =span{xi}∞i=1. In particular if {xn}∞n=1 is a basis of H, then {ei}∞i=1 is also a basis of H.

Proof. We first need to show that the procedure is well defined, i.e. rn 6= 0 for each n > 1, which directlyfollows the linear independence assumption. And, ||en|| = 1, ∀n directly follows from the normalization steps.To show that {en} is orthogonal, it suffices to show that 〈rn, ei〉 = 0 for every i < n. This can be seen from

〈rn, ei〉 = 〈xn, ei〉 −n−1∑j=1

〈xn, ei〉〈ej , ei〉 = 〈xn, ei〉 − 〈xn, ei〉 = 0. (8.21)

Subsequently, we show that span{ei}ni=1 = span{xi}n

i=1. This can be easily shown by induction and theconstruction rules. When this is proved, span{ei}∞i=1 = span{xi}∞i=1 immediately follows due to the definitionof span.

8.3 Hilbert Spaces and Closed Subspaces

Definition 8.5 (Hilbert space). A complete inner product space is called a Hilbert space.

Theorem 8.6 (Projection theorem). Let H be a Hilbert space, and E be a closed convex subset of H, thenfor every x ∈ H, there exists a unique y ∈ E such that

||x− y|| = infz∈E

||x− z||.

Here, y is called the projection of x onto E, denoted by y = PEx. In particular, if C is a closed subspace ofH (C is obviously convex), then x − PEx ⊥ E. Conversely, if there is a point y ∈ E such that x − y ⊥ E,then y is unique and is given by y = PEx.

Proof. Given x and E, let d = infz∈E ||x− z||. Then, we can find a sequence (zn)∞n=1 such that ||x− zn|| → d.We claim that (zn) is a Cauchy sequence, which is shown as follows. Given ε > 0, we can find N such that

84

8.3. HILBERT SPACES AND CLOSED SUBSPACES CHAPTER 8. HILBERT SPACES

||x − zn||2 < d2 − ε/2 for any n > N . Let m,n > N and v = (zm + zn)/2, by convexity of E, v ∈ E, thus||x− v|| ≥ d. According to parallelogram law, we have

||zm − zn||2 = 2(||x− zm||2 + ||x− zn||2) − 4||x− v||2 < 4d2 + ε− 4d2 = ε. (8.22)

It implies that (zn) is a Cauchy sequence, and since E is closed in a complete space, zn converges to somey ∈ E, thus ||x− y|| = d.

In the following, we show that such y is unique. Let y1, y2 ∈ E with ||x − y1|| = ||x − y2|| = d, we havev = (y1 + y2)/2 ∈ E by convexity of E. Hence, by parallelogram, we have

||y1 − y2||2 = 2(||x− y1||2 + ||x− y2||2) − 4||x− v||2 ≤ 4d2 − 4d2 = 0. (8.23)

Hence y1 = y2. The uniqueness is shown.Let E be a closed subspace, and x ∈ E and y = PEx, if x−y ⊥ E does NOT hold, then we can find v ∈ E

such that 〈x− y, v〉 6= 0. Choose α = 〈x−y,v〉||v||2 , then we have

||x− (y + αv)||2 = ||x− y||2 − |α|2|〈x− y, v〉|2 < ||x− y||2. (8.24)

Note that y + αv ∈ E, this result contradicts the assumption that ||x− y|| attains the minimum. Hence, wecan conclude that x− y ⊥ E.

Finally, we prove that the y ∈ E that satisfies x − y ⊥ E must have y = PEx. For such a y we on onehand have ||x− y|| ≥ ||x− PEx||, and on the other hand, by Pythagorean

||x− y||2 = ||x− PEx||2 − ||PEx− y||2, (8.25)

hence the only possibility is that ||PEx− y|| = 0, thus y = PEx.

Proposition 8.5. Let E be a closed subspace of a Hilbert space H, then the following conditions are equivalent

1. x ∈ E;

2. x = PEx;

3. ||x||2 = 〈x, PEx〉;

4. ||x|| = ||PEx||.

Proof. 1. (1) ⇔ (2). Since x ∈ E, it is obvious that infz∈E ||x−z|| attains minium at z = x, thus PEx = x.Since PEx ∈ E, the other direction is obvious.

2. (2) ⇒ (3).〈x, x〉 = 〈x, PEx+ (x− PEx)〉 = 〈x, PEx〉 + 〈x, x− PEx〉 = 〈x, PEx〉. (8.26)

3. (3) ⇒ (4).

||x−PEx||2 = ||x||2 + ||PEx||2 − 2Re〈x, PEx〉 = ||x||2 + ||PEx||2 − 2||x||2 = ||PEx||2 −||x||2 ≥ 0. (8.27)

Hence ||x||2 = ||PEx||2.

4. (4) ⇒ (1), (2). By Pythagorean and projection theorem, we have

||x||2 = ||PEx||2 + ||x− PEx||2, (8.28)

hence ||x− PEx|| = 0, which implies (2), and equivalently (1).

Proposition 8.6. Let H be an inner product space, and S ⊂ H be any subset, define S⊥ = {y|y ⊥ x,∀x ∈ S}.Then S⊥ is a closed subspace. And S ⊆ (S⊥)⊥.

85

8.4. COMPLETE ORTHONORMAL SYSTEMS CHAPTER 8. HILBERT SPACES

Proof. Let x, y ∈ S⊥, α, β ∈ C, then for every z ∈ S, we have

〈αx+ βy, z〉 = α〈x, z〉 + β〈y, z〉 = 0, (8.29)

which implies that αx + βy ∈ S⊥, hence S⊥ is a subspace of H. In addition, let (xn) be a sequence in S⊥

that converges to x ∈ H, then for every y ∈ S, by continuity of inner product,

〈x, y〉 = limn→∞

〈xn, y〉 = limn→∞

0 = 0, (8.30)

which follows that S⊥ is closed. On the other hand,

x ∈ S ⇒ x ⊥ y ∀y ∈ S⊥ ⇒ x ∈ (S⊥)⊥, (8.31)

thus S ⊆ (S⊥)⊥.

Theorem 8.7 (Orthogonal complement). Let E be a closed subspace of a Hilbert space H, and E⊥ = {y|y ⊥x ∀x ∈ E}, then

1. H = E ⊕ E⊥;

2. (E⊥)⊥ = E.

Here, E⊥ is called the orthogonal complement of E in H.

Proof. 1. (H = E ⊕ E⊥). Given x ∈ E ∩ E⊥, by definition, x ⊥ x, hence x = 0, which implies thatE ∩ E⊥ = {0}. Let x ∈ H, then by projection theorem, we can write x = PEx + (x − PEx), in whichPEx ∈ E and x− PEx ∈ E⊥, hence we can conclude that H = E ⊕ E⊥.

2. ((E⊥)⊥ = E). From proposition 8.6, we know that E ⊆ (E⊥)⊥. For the other direction, let x ∈ (E⊥)⊥.Note that x− PEx ∈ E⊥, then 〈x, x− PEx〉 = 0, implying that ||x||2 = 〈x, PEx〉, hence x ∈ E.

Proposition 8.7. Let E1 and E2 be two subspaces of H, then E1 ⊆ E2 ⇒ E⊥2 ⊆ E⊥

1 . In particular, whenboth E1 and E2 are closed, then the converse also holds, i.e. E1 ⊆ E2 ⇔ E⊥

2 ⊆ E⊥1 .

Proof. For one direction, if E1 ⊆ E2, then y ⊥ x,∀x ∈ E2 ⇒ y ⊥ x,∀x ∈ E1, then E⊥2 ⊆ E⊥

1 . For the otherdirection, when E1 and E2 are both closed, we have E1 = E⊥

1 and E2 = E⊥2 , hence

E⊥2 ⊆ E⊥

1 ⇒ (E⊥1 )⊥ ⊆ (E⊥

2 )⊥ ⇒ E1 ⊆ E2. (8.32)

8.4 Complete Orthonormal Systems

Definition 8.6 (Convergent series). Let X be a normed space and (xi)∞i=1 be a sequence in X, then thepartial sum of (xi) is defined by sn =

∑ni=1 xi. If the sequence of partial sums (sn) converges to s ∈ X,

then we say that the infinite series∑∞

i=1 xi converges to s. This can be written as

s =∞∑

i=1

xi.

Definition 8.7 (Absolute convergence). Let X be a normed space and (xi)∞i=1 be a sequence in X, then theinfinite series

∑∞i=1 xi is said to be absolutely convergent if the infinite series

∑∞i=1 ||xi|| converges.

The following theorem states the relation between absolute convergence and convergence of an infiniteseries.

Theorem 8.8. Let X be a normed space and (xi)∞i=1 be a sequence in X, then if the infinite series∑∞

i=1 xi

is absolutely convergent, then its partial sum is a Cauchy sequence. In paricular, if X is a Banach space, thenit converges.

86

8.4. COMPLETE ORTHONORMAL SYSTEMS CHAPTER 8. HILBERT SPACES

Proof. Suppose∑∞

i=1 xi is absolutely convergent. Let an =∑n

i=1 ||xi||, then an is a bounded monotonesequence, and thus it converges to some a ∈ [0,+∞). Given ε > 0, we can find N such that a − an =∑∞

i=n+1 ||xi|| < ε, ∀n > N . Let sn =∑n

i=1 xi be the partial sum of the given sequence. Then, for anym > n > N , we have

||sm − sn|| =

∣∣∣∣∣∣∣∣∣∣

m∑i=n+1

xi

∣∣∣∣∣∣∣∣∣∣ ≤

m∑i=n+1

||xi|| ≤ a− an < ε, (8.33)

which implies that (sn) is a Cachy sequence. When X is complete, it converges.

Theorem 8.9. Let H be a Hilbert product space, and (ei)∞i=1 be an orthonormal sequence in H, then

limn→∞

∣∣∣∣∣∣∣∣∣∣

n∑i=1

αiei

∣∣∣∣∣∣∣∣∣∣ =

( ∞∑i=1

|ai|2)1/2

,

if the right hand side is finite, then the infinite series∑∞

i=1 αiei converges and∣∣∣∣∣∣∣∣∣∣∞∑

i=1

αiei

∣∣∣∣∣∣∣∣∣∣ =

( ∞∑i=1

|ai|2)1/2

.

Proof. By Pythagorean theorem, we have ∣∣∣∣∣∣∣∣∣∣

n∑i=1

αiei

∣∣∣∣∣∣∣∣∣∣2

=n∑

i=1

|αi|2. (8.34)

Both hand side are increasing sequences, thus

limn→∞

∣∣∣∣∣∣∣∣∣∣

n∑i=1

αiei

∣∣∣∣∣∣∣∣∣∣2

= limn→∞

n∑i=1

|αi|2. (8.35)

By continuity x 7→ x1/2 for x ≥ 0, we have

limn→∞

∣∣∣∣∣∣∣∣∣∣

n∑i=1

αiei

∣∣∣∣∣∣∣∣∣∣ =

(lim

n→∞

n∑i=1

|αi|2)1/2

=

( ∞∑i=1

|αi|2)1/2

. (8.36)

If the right hand side is finite, then the infinite series∑n

i=1 αiei is absolutely convergent, there is s ∈ H withs =

∑∞i=1 αiei. By continuity of norm, we have

||s|| = || limn→∞

sn|| = limn→∞

||sn|| =

( ∞∑i=1

|αi|2)1/2

. (8.37)

Definition 8.8 (Complete system). Let X be a normed space, then a subset S is called a complete systemif spanS = X.

We give several examples of complete systems in Hilbert spaces

1. {x 7→ xn}∞n=0 is a complete system in (C[0, 1], || · ||∞) and in L2[0, 1].

2. {x 7→ 1} ∪ {x 7→ cos(nx)}∞n=1cup{x 7→ sin(nx)}∞n=1 is a complete system in L2[−π, π].

Definition 8.9 (Complete orthonormal system). An orthonormal set in a Hilbert space that is also a completesystem is called a complete orthonormal system.

Theorem 8.10. Let S be a subset of Hilbert space H, then S is a complete system if and only if x ⊥ S (i.e.x ⊥ y, ∀y ∈ S) implies x = 0.

87

8.4. COMPLETE ORTHONORMAL SYSTEMS CHAPTER 8. HILBERT SPACES

Proof. For one direction, we assume that S is a complete system in H. and x ⊥ y, ∀y ∈ S. Since spanS = H,we can find a sequence (xn) in spanS such that xn → x. By continuity of inner product and the fact thatx ⊥ s for every s ∈ spanS, then

〈x, x〉 = limn→∞

〈xn, x〉 = limn→∞

0 = 0. (8.38)

It follows that x = 0.For the other direction, let E = spanS, it suffices to show that H ⊆ E. Let x ∈ H, then x = PEx+ (x−

PEx). Clearly, x− PEx ⊥ S, hence x− PEx = 0, thus x = PEx ∈ E, implying that H ⊆ E.

In discussing infinite dimensional normed spaces, we use the following notion of basis, which is a general-ization of the basis in finite dimensional space.

Definition 8.10 (Basis in normed space). Let X be a (infinite dimensional) normed space, then {ei}∞i=1

is called a basis of X if for each x ∈ X, there exists a unique sequence (αi)∞i=1 in C such that such thatx =

∑∞i=1 αiei.

Theorem 8.11. A complete orthonormal system of a Hilbert space is a basis. Concretely, let H be a Hilbertspace with a complete orthonormal system {ei}∞i=1, then for each x ∈ H, there is a unique sequence (αi)∞i=1 ∈ l2

such that x =∑∞

i=1 αiei, and αi = 〈x, ei〉. The result can be written as

x =∞∑

i=1

〈x, ei〉ei.

Proof. From Bessel’s inequality, we know that for each x ∈ H

∞∑i=1

|〈x, ei〉|2 ≤ ||x||2, (8.39)

hence the infinite series∑∞

i=1〈x, ei〉ei is absolutely convergent and thus converges. Let y =∑∞

i=1〈x, ei〉ei,then y − x ⊥ ei for each ei. Since {ei}∞i=1 is a complete system, by theorem 8.10, we have y − x = 0, thusy = x. This shows that every x ∈ H can be expressed in form of the given infinite series.

In the following, we show that such an expression is unique. Suppose x =∑∞

i=1 αiei, then we have

z =∞∑

i=1

(αi − 〈x, ei〉)ei = 0. (8.40)

and for each i,αi − 〈x, ei〉 = 〈z, ei〉. (8.41)

Combining the above two leads to αi = 〈x, ei〉 for each i.

Theorem 8.12 (Parseval’s identity). Let {ei}ni=1 be an orthonormal set in a Hilbert space H, then it is

complete if and only if the following Parseval’s identity is satisfied for every x ∈ H

||x||2 =∞∑

i=1

|〈x, ei〉|2.

Proof. If {ei}∞i=1 is complete, then it is a basis, and thus for each x ∈ H, we have

x =∞∑

i=1

〈x, ei〉ei. (8.42)

By theorem 8.9, we get

||x||2 =∞∑

i=1

|〈x, ei〉|2. (8.43)

The parseval’s identity is thus established. For the other direction, the parseval’s identity is assumed. letE = span{ei}∞i=1, then given x ∈ H, by Pythagorean, we have

||x− PEx|| = ||x||2 − ||PEx||2 = ||x||2 − ||x||2 = 0. (8.44)

It implies that x = PEx ∈ E. Hence, the given set is a complete system.

88

8.4. COMPLETE ORTHONORMAL SYSTEMS CHAPTER 8. HILBERT SPACES

Up to now, we have derived several theorems in characterizing a complete orthonormal system. They aresummarized by the following theorem.

Theorem 8.13 (Characterization of complete orthonormal system). Let {ei}∞i=1 be an orthonormal set in aHilbert space H, then the following statements are equivalent:

1. {ei}∞i=1 is a complete system (i.e. {ei}∞i=1 = H);

2. 〈x, ei〉 = 0 ∀i⇒ x = 0;

3. {ei}∞i=1 is a basis of H (i.e. ∀x ∈ H x =∑∞

i=1〈x, ei〉ei);

4. parseval’s identity is satisfied (i.e. ∀x ∈ H ||x||2 =∑∞

i=1 |〈x, ei〉|2).

Not every Hilbert space admits a complete orthonormal sequence, the following proposition gives anequivalent condition.

Proposition 8.8. A Hibert space has a complete orthonormal sequence if and only if it is separable (i.e. ithas a countable dense set).

Proof. Let H be a Hilbert space, then if H has a complete orthonormal sequence (ei)∞i=1, we let Sn ={∑n

i=1 αiei|αi ∈ Q}, then it is easy to show that∪∞

n=1 Sn is a dense set in H.For the other direction, if H is separable, there is a countable dense, denoted by S = si

∞i=1, from which

we can select a maximal linearly independent subset by induction, and by applying Gram-Schmidt orthonor-malization to this subset, we can obtain a complete orthonormal system.

Theorem 8.14. Every infinite dimensional separable Hilbert spaces are isometric. In particular, every infinitedimensional separable Hilbert space is isometric to l2.

Proof. Let H be an infinite dimensional separable Hilbert space, we can choose an orthonormal basis {ei}∞i=1.Then we can define T : H → l2 by Tx = (〈x, ei〉)∞i=1. By theorem 8.9(for showing onto) 8.11(for showingone-to-one) and parseval’s identity (for showing norm-perserving), we can prove that T is an isometry. HenceH is isometric to l2. Therefore, all infinite dimensional separable Hilbert spaces are isometric.

89

Chapter 9

Linear Functionals

9.1 The Space of Linear Functionals

All vector spaces in this notes are real or complex vector spaces.

Definition 9.1 (Linear functional). Let X be a vector space over the field F (F is either R or C), then alinear map f : X → F is called a linear functional.

In other words, a linear functional is a linear map whose range is in R or C.

Definition 9.2 (The space of linear functionals). All linear functionals defined on a vector space X forms avector space, called the space of linear functionals, denoted by X#. The addition and scalar multiplicationare defined as

(f + g)(x) = f(x) + g(x) and (αf)(x) = αf(x).

The following are two examples of linear functionals:

1. Given f ∈ C[0, 1], F : C[0, 1] → R defined by F (x) =∫ 1

0x(t)f(t)dt is a linear functional;

2. Let H be a inner product space, given y ∈ H, then f : H → C defined by f(x) = 〈x, y〉 is a linearfunctional.

Lemma 9.1. Let f 6= 0 be a linear functional on X, then and x0 ∈ X satisfy f(x0) 6= 0. Then for eachx ∈ X, it can be uniquely written into x = y + λx0, where y ∈ ker f . In other words, X = ker f ⊕ span{x0}.

Proof. Since f 6= 0, there is v ∈ X such that f(x0) 6= 0. Then, we can write

x = (x− f(x)x0) + f(x)x0 =(x− f(x)

f(x0)x0

)+

f(x)f(x0)

x0. (9.1)

Note that

f

(x− f(x)

f(x0)x0

)= f(x) − f(x)

f(x0)f(x0) = 0. (9.2)

For uniqueness, if there are y1, y2 ∈ ker f and λ1, λ2 ∈ F with y1 + λ1x0 = y2 + λ2x0, then (λ1 − λ2)x0 =y2 − y1 ∈ ker f , thus f((λ1 − λ2)x0) = 0, since f(x0) 6= 0, we have λ1 = λ2, as a result, y1 = y2.

Corollary 9.1. Let f be a non-zero linear functional on X, then

codim(ker f) = 1.

Corollary 9.2. Let f, g be non-zero linear functionals on X, then ker f = ker g if and only if f = λg forsome λ 6= 0.

Proof. First, assume f = λg for some λ 6= 0. Then f(x) = 0 ⇔ g(x) = 0, it implies that ker f = ker g.For the other direction, assume ker f = ker g = E. We can choose x0 ∈ E. By the lemma above, we knowthat for each x ∈ X, we can write x = y + γx0 with y ∈ E. Then, f(x) = γf(x0) and g(x) = γg(x0). Letλ = f(x0)/g(x0), we know f(x) = λg(x). Note that λ is a constant independent of x.

90

9.2. DUAL SPACES CHAPTER 9. LINEAR FUNCTIONALS

9.2 Dual Spaces

9.2.1 Bounded Linear Functionals

Definition 9.3 (Bounded linear functional). Let X be a normed space, then a linear functional f on X iscalled a bounded linear functional if there exists c > 0 such that |f(x)| ≤ c||x||, ∀x ∈ X.

Lemma 9.2. Let f be a linear functional on a normed space X, then f is continuous if and only if f iscontinuous at 0.

Proof. The “only if” direction is trivial. For “if” part, let x ∈ X, for any sequence (xn) that converges to x,we have xn − x converges to 0. Suppose f is continuous at 0, then f(xn) − f(x) = f(xn − x) → 0, implyingthat f(xn) → f(x), thus f is continuous on X.

Theorem 9.1. Let f be a linear functional on a normed space X, then f is continuous if and only if f isbounded.

Proof. Suppose f is continuous, then there exists δ > 0, such that |f(x)| < 1 whenever ||x|| ≤ δ. Then, foreach x ∈ X and x 6= 0, we let u = δ

||x||x, then ||u|| = δ and

|f(x)| =||x||δ

|f(u)| ≤ 1δ||x||. (9.3)

Hence, f is bounded. For the other direction, assume there is M > 0 such that |f(x)| ≤ M ||x|| for eachx ∈ X. Let (xn) be a sequence that converges to x, then |f(xn) − f(x)| = |f(xn − x)| ≤ M ||xn − x|| → 0,which means that f(xn) → f(x).

Proposition 9.1. Let f be a linear functional on a normed space X, then f is bounded if and only if ker fis closed.

Proof. If f is bounded, then it is continuous, we immediately have ker f = f−1{0} is closed, since {0} isclosed. On the other hand, if f is unbounded. Then we can find a sequence (xn)∞n=1 with ||xn|| = 1 andf(xn) > n for each n. Let yn = xn/f(xn), then f(yn) = 1 and ||yn|| < 1/n. Hence yn → 0. Let zn = yn − y1,then f(zn) = 0, i.e. zn ∈ ker f , and zn → −y1, where f(−y1) = −1. It shows that there exists a sequence inker f that converges to some point outside ker f . Hence, ker f is not closed.

Lemma 9.3. Let E be a closed subspace of a normed space X and x0 ∈ X ∩Ec such that X = E⊕ span{x0}.Let d = infy∈E ||x0 − y||, then there exists a bounded linear functional f on E such that ker f = E, ||f || = 1,and f(x0) = d.

Proof. Since E is closed, Ec is open. Hence, there is a open ball B(x0, ε) ⊂ Ec for some ε > 0, and thusd > ε > 0. Since X = E ⊕ span{x0}, each x ∈ X can be written uniquely as x = y + αx0. Then we candefine a functional f0 on X by f0(x) := α. Due to the uniqueness of decomposition, f0 is well-defined. Thelinearity of f0 can be seen from

(αx1 + αx2) = α1(y1 + λ1x0) + α2(y2 + λ2x0) = (α1y1 + α2y2) + (α1λ1 + α2λ2)x0 (9.4)

It means that f0(α1x1 + α2x2) = α1λ1 + α2λ2 = α1f0(x1) + α2f0(x2). In addition, we can easily see thatf0(x0) = 1 and f0(y) = 0, ∀y ∈ E, implying that ker f = E. In the following, we will show that f0 is abounded functional with ||f0|| = 1/d. For each x ∈ X and x /∈ E, there exists y ∈ E and α 6= 0 such that

x = y + αx0 = α(x0 − y′), (9.5)

where y′ = −y/α ∈ E. Hence,

|f0(x)| = |α| ≤ ||x||||x0 − y′||

≤ 1d||x||. (9.6)

In addition, when x ∈ E, f(x) = 0 ≤ (1/d)||x||. The above shows that f0 is bounded with ||f0|| ≤ 1/d. Tocomplete the proof, it remains to show the other direction, namely ||f0|| ≥ 1/d. By definition of d, we canchoose a sequence (yn) in E such that ||x0 − yn|| → d. And, it can be readily seen that f0(x0 − yn) = 1 foreach n. As a result,

||f0|| ≥ supn

|f0(x0 − yn)|||x0 − yn||

≥ limn→∞

1||x0 − yn||

=1d. (9.7)

Take f = df0, then we are done.

91

9.2. DUAL SPACES CHAPTER 9. LINEAR FUNCTIONALS

Proposition 9.2. Let E be a closed subspace of a normed space X with codim(E) = 1, then there exists abounded linear functional f with ker f = E.

Proof. Since codim(E) = 1, there exists a subspace V with dim(V ) = 1 such that X = E ⊕ V . Then, thereexists x0 ∈ X ∩ Ec such that V = span{x0}. By lemma 9.3, we know there is a bounded linear functional fon X such that ker f = E.

To sum up, for a linear functional f on a normed space, the following statements are equivalent:

1. f is bounded;

2. f is continuous;

3. ker f is closed.

9.2.2 Dual Space

Proposition 9.3. All bounded linear functionals on X form a linear subspace of X#.

Proof. Let f and g be bounded linear functionals on X, and |f(x)| ≤ A||x||, |g(x)| ≤ B||x|| for every x ∈ X.Then, we have for each x

|(f + g)(x)| = |f(x) + g(x)| ≤ |f(x)| + |g(x)| ≤ A||x|| +B||x|| = (A+B)||x|| (9.8)

and|(αf)(x)| = |αf(x)| ≤ (|α|A)||x|| (9.9)

Hence, f + g and αf are both bounded linear functionals.

Proposition 9.4. Let || · || be a real valued functions defined on all bounded linear functionals, which is givenby

||f || = supx 6=0

|f(x)|||x||

= sup{|f(x)| : ||x|| = 1}.

Then || · || is a norm on the space of all bounded linear functionals.

Proof. The non-negativeness of || · || directly follows from the definition. And

||αf || = sup{|αf(x)| : ||x|| = 1} = sup{|α| · |f(x)| : ||x|| = 1} = |α| sup{|f(x)| : ||x|| = 1} = |α| · ||f ||. (9.10)

For sub-additivity, we have

||f + g|| = sup{|f(x) + g(x)| : ||x|| = 1} ≤ sup{|f(x)| + |g(x)| : ||x|| = 1}= sup{|f(x)| : ||x|| = 1} + sup{|g(x)| : ||x|| = 1}= ||f || + ||g||. (9.11)

In addition, when ||f || = 0, we have |f(x)| = 0 for all x 6= 0, hence f = 0.

Definition 9.4. Let X be a normed vector space. The vector space of all bounded linear functionals on Xendowed with the norm defined above is called the dual space of X, denoted by X∗.

Proposition 9.5. Let X be a normed space, and f ∈ X∗, then

|f(x)| ≤ ||f || · ||x||, ∀x ∈ X.

Proposition 9.6. Let X be a normed space, and f ∈ X∗ with f 6= 0, then

inf{||x|| : f(x) = 1} =1

||f ||.

92

9.2. DUAL SPACES CHAPTER 9. LINEAR FUNCTIONALS

Proof. For every x with f(x) = 1, we have ||f || · ||x|| ≥ |f(x)| = 1, thus ||x|| ≥ 1/||f ||, hence

inf{||x|| : f(x) = 1} ≥ 1/||f ||. (9.12)

By definition of ||f ||, we can choose a sequence (xn) with ||xn|| = 1, ∀n and |f(xn)| → ||f ||. Withoutlosing generality, we assume f(xn) 6= 0 for all n. Let yn = xn/f(xn), then f(yn) = 1, ∀n. and ||yn|| =||xn||/|f(xn)| → 1/||f ||. Hence

inf{||x|| : f(x) = 1} ≤ infn∈N+

||yn|| ≤ limn→∞

||yn|| = 1/||f ||. (9.13)

Combining the two inequalities together, we can get the equality.

Theorem 9.2. Let X be a normed vector space, then its dual space X∗ is a Banach space.

Proof. Consider a Cauchy sequence (fn) in X∗, then for each x ∈ X, we have ||fn(x) − fm(x)|| ≤ ||fn −fm|| · ||x||, hence (fn(x)) is a Cauchy sequence. By completeness of a real field or a complex field, the limitof (fn(x)) exists. Then we can define a functional f by

f(x) := limn→∞

fn(x).

It is easy to verify that f is linear (due to linearity of limit operation). In addition, since (fn) is Cauchy,there is N such that ||fn − fm|| < ε when n,m > N , thus for each x, and n > N

|f(x) − fn(x)| = | limm→∞

fm(x) − fn(x)| = limm→∞

|fm(x) − fn(x)| ≤ limm→∞

||fm − fn|| · ||x|| ≤ ε||x||. (9.14)

Hence, f − fn ∈ X∗, thus f = (f − fn) + fn ∈ X∗. Then, we can conclude that X∗ is complete.

Proposition 9.7. Let X be a normed space, and (fn) be a sequence in X∗ that converges to f ∈ X∗, thenfor each x ∈ X, we have

limn→∞

fn(x) = f(x).

Proof. This can be seen from |f(x) − fn(x)| ≤ ||fn − f || · ||x|| → 0.

Proposition 9.8. Let X be a normed space, and (fn) be a sequence in X∗ that converges to f ∈ X∗, and(xn) be a sequence in X that converges to x ∈ X, then

limn→∞

fn(xn) = f(x).

Proof. Since (xn) is convergent, it is bounded. Let M = supn ||xn||, then

|f(x) − fn(xn)| ≤ |f(x) − f(xn)| + |f(xn) − fn(xn)| ≤ ||f || · ||x− xn|| + ||f || ·M → 0. (9.15)

9.2.3 Riesz Representation Theorem on Hilbert Space

Theorem 9.3 (Riesz Representation Theorem). Let H be a Hilbert space, and f be a linear functional onH. Then f ∈ H∗ if and only if there exists a unique y ∈ H such that f(x) = 〈x, y〉, ∀x ∈ H. In addition,||f || = ||y||.

Proof. We first show the “if” part. Let f be defined by f(x) = 〈x, y〉. By Cauchy-Swartz inequality, we havefor each x, |f(x)| ≤ ||x|| · ||y||, which implies that ||f || ≤ ||y||. Take x = y, then ||f || · ||y|| ≥ |f(y)| = ||y||2,hence ||f || ≥ ||y||. Therefore ||f || = ||y||. Obviously, f ∈ H∗.

For the “only if” part, let f ∈ H∗. If f = 0, then we can take y = 0. In the following, we assume f 6= 0.Let L = ker f , then L is a closed subspace of H. with codim(L) = 1. Since H = L⊕L⊥, we have dim(L⊥) = 1.Hence, L⊥ = span{y0} for some y0 ∈ H. It is easy to see that f(y0) 6= 0, otherwise y0 ∈ L ∩ L⊥ = {0}. Foreach x ∈ H, we can write x = z + λy0 with z ∈ L. Then f(x) = λf(y0). Take y = y0f(y0)/||y0||2, then

〈x, y0f(y0)/||y0||2〉 = 〈z + λy0, y0f(y0)/||y0||2〉 = λ〈y0, y0f(y0)/||y0||2〉 = λf(y0) = f(x). (9.16)

Hence f = 〈·, y〉.Finally, we show the uniqueness. If f(x) = 〈x, y1〉 = 〈x, y2〉 for each x, Then 〈x, y1 − y2〉 = 0, ∀x ∈ H.

Hence, ||y1 − y2|| = 〈y1 − y2, y1 − y2〉 = 0, thus y1 = y2.

93

9.2. DUAL SPACES CHAPTER 9. LINEAR FUNCTIONALS

The following is a corollary derived by applying the Riesz representation theorem to Lebesgue measuretheory.

Corollary 9.3. Let (X,M, µ) be a measure space, then L2(µ) is a Hilbert space. Let λ : L2(µ) → R be abounded linear functional, then there exists g ∈ L2(µ) such that

λ(f) =∫

X

fgdµ, ∀f ∈ L2(µ)

and

||g||2 =∫

X

|g|2dµ = supf 6=0

|λ(f)|||f ||2

.

9.2.4 Hahn-Banach Theorem

Hahn-Banach theorem is one of the most important theorem in functional analysis. It states that everybounded linear functional defined on a subspace has a norm-preserving extension.

Theorem 9.4 (Hahn-Banach Theorem). Let X be a normed space, and E be its subspace, then for eachf0 ∈ E∗, there exists f ∈ X∗ with f |E = f0 and ||f ||X∗ = ||f0||E∗ .

Proposition 9.9. Let X be a normed space, for each x0 ∈ X, there exists f ∈ X∗ such that ||f || = 1 andf(x0) = ||x0||.

Proof. Define a linear functional one the subspace span{x0}, by f0(λx0) := λ||x0||. It is easy to verify thatthis is a linear functional and has norm ||f0|| = 1. By Hahn-Banach theorem, there is an extension f ∈ X∗

with ||f || = 1 and f(x0) = f0(x0) = ||x0||.

Proposition 9.10. Let X be a normed space, then for each x ∈ X, we have

||x|| = supf∈X∗,f 6=0

|f(x)|||f ||

= sup{|f(x)| : ||f || = 1}.

Proof. First, from |f(x)| ≤ ||f || · ||x||, we know

sup{|f(x)| : ||f || = 1} ≤ ||x||. (9.17)

On the other hand, for each x, we can find f ′ with ||f ′|| = 1 and f ′(x) = ||x||, hence

sup{|f(x)| : ||f || = 1} ≥ |f ′(x)| = ||x||. (9.18)

Then the equality is established.

Corollary 9.4. Let X be a normed space, if x ∈ X has f(x) = 0, ∀f ∈ X∗, then x = 0.

Proof. We can choose f ∈ X∗ such that f(x) = ||x||, by assumption, we have ||x|| = 0, thus x = 0.

Corollary 9.5. Let X be a normed space, and x1, x2 ∈ X have x1 6= x2, then there is f ∈ X∗ such thatf(x1) 6= f(x2).

Proof. We can choose f ∈ X∗ such that f(x1 − x2) = ||x1 − x2|| > 0, then f(x1) 6= f(x2).

Proposition 9.11. Let X be a normed vector space and E be a proper closed subspace of X, if x0 ∈ X ∩Ec

and d = infy∈E ||x0 − y|| then there exists f ∈ X∗ such that ||f || = 1, f(y) = 0 on E, and f(x0) = d.

Proof. Since E is closed and x0 /∈ E, we know that d > 0. Otherwise, we can find a sequence in E thatconverges to x0, contradicting the closedness of E. Consider the augmented space E+ = E ⊕ span{x0}, thenE is a subspace of E+ with codim(E) = 1 (note that the codimension is w.r.t E+). By lemma 9.3, thereexists a bounded linear functional f0 defined on E+ such that ||f0|| = 1, f0 = 0 on E and f(x0) = d. ByHahn-Banach theorem, we can find a norm-preserving extension of f0 on X, then we are done.

94

9.2. DUAL SPACES CHAPTER 9. LINEAR FUNCTIONALS

Proposition 9.12. Let X be a normed space, for each subspace L of X we denote

L⊥ = {f ∈ X∗|f(x) = 0, ∀x ∈ L},

and for each subspace F of X∗ we denote

F⊥ = {x ∈ X|f(x) = 0, ∀f ∈ F}.

Let L be a subspace of X, then

1. L⊥ is a closed subspace of X∗;

2. L ⊂ (L⊥)⊥;

3. If L is closed, then L = (L⊥)⊥.

Proof. 1. It is easy to verify that L⊥ is a subspace of X∗. Let (fn) be a sequence in L⊥ that converges tof ∈ X∗. To prove that L⊥ is closed, It suffices to show that f ∈ L⊥. This can be seen from

f(x) = limn→∞

fn(x) = 0. (9.19)

2. For each x ∈ L, we have f(x) = 0 for every f ∈ L⊥, thus x ∈ (L⊥)⊥ by definition. It follows thatL ⊂ (L⊥)⊥.

3. It suffices to show that (L⊥)⊥ ⊂ L, which equivalent to x /∈ L ⇒ x /∈ (L⊥)⊥. Let x /∈ L, then since Lis closed, d = infy∈L ||x− y|| > 0, then there exists f ∈ L⊥ such that f(x) = d, thus x /∈ (L⊥)⊥.

9.2.5 Second Dual

Definition 9.5 (Second dual). Let X be a normed space, then the dual space of X∗ is called the seconddual space of X, denoted by X∗∗.

Proposition 9.13. Let X be a normed vector space, for each x ∈ X, we can define a linear functionalx : X∗ → F by x(f) = f(x), then x is a bounded linear functional (i.e. gx ∈ X∗∗) with ||x|| = ||x||. The mapx 7→ x is called the canonical embedding of X into X∗∗, which is injective.

Proof. 1. For each x, x is a linear functional on X∗, which follows from

x(αf + βg) = (αf + βg)(x) = αf(x) + βg(x) = αx(f) + βx(g) (9.20)

2. We then show ||x|| = ||x|| as follows (

||x|| = sup{x(f) : ||f || = 1} = sup{|f(x)| : ||f || = 1} = ||x||. (9.21)

Note that we utilize proposition 9.10.

3. The map x 7→ x is linear. This can be seen from

( ˜αx+ βy)(f) = f(αx+ βy) = αf(x) + βf(y) = αx(f) + βy(f). (9.22)

4. The map x 7→ x is injective. This can be seen from x = 0 ⇒ f(x) = 0,∀f ∈ X∗ ⇒ x = 0.

Definition 9.6 (Reflexivity). A normed space X is said to be reflexive if the canonical embedding of X intoX∗∗ is an isomorphism. In other words, for each z ∈ X∗∗ we can find x ∈ X such that z(f) = f(x), ∀f ∈ X∗.

Proposition 9.14. Every reflexive normed space is complete.

Proof. If a normed space X is reflexive, then the canonical embedding is an isometry between X and X∗∗.We know that X∗∗ is complete, since it is in itself a dual space. Hence, X is complete.

95

9.2. DUAL SPACES CHAPTER 9. LINEAR FUNCTIONALS

Proposition 9.15 (Dual basis). Let X be an n-dimensional vector space with basis {e1, . . . , en}, then thereexists a unique set of vectors {f1, . . . , fn} in X∗ such that

fi(ej) = δij =

{1 i = j

0 i 6= j.

And {f1, . . . , fn} forms a basis of X∗, which is called the dual basis of {e1, . . . , en}.

Proof. First, we show the existence of {f1, . . . , fn} by construction. Since {e1, . . . , en} is a basis of X, foreach x it can be uniquely written as x =

∑ni=1 αiei with αi ∈ F. It is easy to check that αi depends linearly

on x. We define fi by fi(x) := αi. Then fi is well-defined (due to uniqueness of basis expansion) and is alinear functional. In addition, since every norm is equivalent for a finite dimensional spaces. There is C, suchthat

n∑i=1

|αi| ≤ C||x||. (9.23)

From this, we can readily see that fi is a bounded functional for each i.Then we show that {f1, . . . , fn} is a basis. Suppose there exists c1, . . . , cn such that

∑ni=1 cifi = 0. Then

for each ek, we haven∑

i=1

cifi(ek) =n∑

i=1

ciδik = ck = 0. (9.24)

It follows that to make∑n

i=1 cifi zero, all its coefficients must be zero. Hence {f1, . . . , fn} is linearly inde-pendent.

Given f ∈ X∗, for each x ∈ X, it can be written as x =∑n

i=1 αiei, then we have

f(x) = f

(n∑

i=1

αiei

)=

n∑i=1

αif(ei). (9.25)

And let f ′ =∑n

i=1 f(ei)fi, then

f ′(x) =n∑

i=1

f(ei)fi(x) =n∑

i=1

f(ei)αi. (9.26)

Clearly f = f ′ =∑∞

i=1 αifi. Hence, we can conclude that {f1, . . . , fn} is a basis of X∗.

Proposition 9.16. Every finite dimensional space is reflexive.

Proof. Let X be an n-dimensional space, and E = {e1, . . . , en} be its basis. Then, we can get a dual basis{f1, . . . , fn} for X∗. To show that the canonical embedding of X into X∗∗ is an isomorphism, it suffices toshow that {e1, . . . , en} is a dual basis of {f1, . . . , fn} for X∗∗. This can be seen from

ei(fj) = fj(ei) = δij , ∀i, j = 1, . . . , n (9.27)

Then we are done.

96

Chapter 10

Linear Operators

10.1 The Space of Linear Operators

In functional analysis, a linear map is also called a linear operator.

Definition 10.1 (Bouned linear operator). Let X and Y be normed vector spaces, a linear operator T : X →Y is called a bounded linear operator if there exists M > 0 such that

||Tx|| ≤M ||x||, ∀x ∈ X

The following proposition shows that all bounded linear operators between two given normed spacesconstitute a vector space.

Proposition 10.1. Let T1, T2 : X → Y be bounded linear operators, then for any α1, α2 ∈ C, α1T1 +α2T2 isalso a bounded linear operator.

Proof. Since T1 and T2 are bounded, there are M1,M2 > 0 such that for each x ∈ X, ||T1x|| ≤ M1||x|| and||T2x|| ≤M2||x||, thus

||(α1T1 + α2T2)x|| = ||α1T1x+ α2T2x|| ≤ |α1| · ||T1x|| + |α2| · ||T2x|| ≤ (|α1|M1 + |α2|M2)||x||. (10.1)

Hence, α1T1 + α2T2 is also bounded.

Definition 10.2 (The normed space of bounded linear operators). Let X and Y be normed vector spaces,then all bounded linear operators from X to Y form a vector space, denoted by L(X,Y ). In addition, thisspace is endowed with a norm || · || : L(X,Y ) → R defined by

||T || = supx 6=0

||Tx||||x||

= sup{||Tx|| : ||x|| = 1}.

It is easy to verify that the norm defined above is really a norm. In addition, by the definition of operatornorm, we immediately have

Proposition 10.2. Let T : X → Y be a bouned linear operator, then we have

||Tx|| ≤ ||T || · ||x||, ∀x ∈ X.

Proposition 10.3. Let X and Y be normed vector spaces, and T : X → Y be a linear operator, then T isbounded if and only if T is continuous.

Proof. First, assume T is bounded, i.e. there exists M > 0 such that ||Tx|| ≤ M ||x|| for each x ∈ X. Thenfor any x0 ∈ X and ε > 0, we can choose δ = ε/M such that when ||x − x0|| < δ, we have ||Tx − Tx0|| ≤M ||x− x0|| < Mδ, thus T is continuous.

For the other direction, we assume T is continuous. Then, there exists δ > 0 such that when ||x|| ≤ δ,||Tx|| ≤ 1. Then for arbitrary x ∈ X, we have

||Tx|| =∥∥∥∥T δx

||x||

∥∥∥∥ · ||x||δ ≤ 1δ||x||. (10.2)

Hence, T is bounded.

97

10.2. INVERSE OPERATORS CHAPTER 10. LINEAR OPERATORS

Corollary 10.1. Let T : X → Y be a bounded linear operator, then for any sequence (xn) in X that convergesto x, we have Txn → Tx.

Corollary 10.2. Let T : X → Y be a linear operator between normed spaces X and Y , then T is boundedthen kerT is closed.

Proof. If T is bounded, thus T is continuous, hence kerT = T−1{0} is closed.

Proposition 10.4. Let A : X → Y and B : Y → Z be bounded linear operators, then the operator BA : X →Z is bounded and has

||BA|| ≤ ||A|| · ||B||.

Proof. For every x ∈ X we have

||BAx|| ≤ ||B|| · ||Ax|| ≤ ||B|| · ||A|| · ||x||. (10.3)

If T is a linear operator from X to X, then we use the following notation Tn to denote the compositionof n such operator.

Corollary 10.3. Let T : X → X be a bounded linear operator, then

||Tn|| ≤ ||T ||n.

Theorem 10.1. Let X be any normed space, and Y be a Banach space, then L(X,Y ) is a Banach space.

Proof. What we need to show here is that L(X,Y ) is complete. Let (An) be any Cauchy sequence in L(X,Y ).Then given ε > 0, there exists N such that ∀m,n > N, ||Am − An|| < ε, thus ∀m,n > N, ||Amx− Anx|| ≤||Am − An|| · ||x|| < ε||x||, thus (Anx) is a Cauchy sequence for each x ∈ X. Since Y is complete, (Anx)converges for each x. Then we can define a map A : X → Y by

Ax := limn→∞

Anx. (10.4)

It is trivial to verify that A is a linear map. Since (An) is a Cauchy sequence, it is bounded, i.e. supn ||An|| <+∞. Hence, we have

||Ax|| ≤ supn

||Anx|| ≤ supn

(||An|| · ||x||) = (supn

||An||)||x||. (10.5)

This shows that A ∈ L(X,Y ). In addition, given ε > 0, there is N ∈ N+ such that when m,n > N ,||An −Am|| < ε, and thus

∀x ||x|| ≤ 1 ⇒ ||Anx−Amx|| ≤ ε. (10.6)

Take m→ +∞, by continuity of norm, we have for each n > N ,

||Anx−Ax|| = ||Anx− limm→∞

(Amx)|| = limm→∞

||Anx−Amx|| ≤ ε. (10.7)

It follows that ||An −A|| → 0, i.e. An → A.

10.2 Inverse Operators

Definition 10.3 (Inverse operator). Let A : X → Y be a linear operator, if there is a linear operatorB : Y → X such that AB = IdX and BA = IdY then B is called the linear operator of A, denoted by A−1,and A is called invertible.

Proposition 10.5. A linear operator A : X → Y is invertible if and only if it is bijective (one-to-one andonto).

Proposition 10.6. Let A : X → Y and B : Y → Z be invertible linear operators, then BA : X → Z isinvertible and (BA)−1 = A−1B−1.

98

10.2. INVERSE OPERATORS CHAPTER 10. LINEAR OPERATORS

Proof. This can be readily seen from (BA)(A−1B−1) = BIdY B−1 = BB−1 = IdZ and (A−1B−1)(BA) =

A−1IdY A = A−1A = IdX .

Proposition 10.7. Let T : X → Y be a linear operator onto Y such that there exists b > 0 and ||Tx|| ≥b||x|| ∀x ∈ X, then T is invertible and T−1 is a bounded linear operator with ||T−1|| ≤ 1/b.

Proof. First, we can see T is injective from

Tx = 0 ⇒ 0 ≥ b||x|| ⇒ ||x|| = 0 ⇒ x = 0. (10.8)

Hence, T is bijective, and thus it is invertible. In addition, we note that x = TT−1x, thus

||x|| = ||TT−1x|| ≥ b||T−1x||, ∀x ∈ X (10.9)

Thus ||T−1x|| ≤ (1/b)||x||, which follows that T−1 is bounded and has ||T || ≤ 1/b.

Proposition 10.8. Let A : X → X be a bounded linear operator with ||A|| < 1, then I −A is invertible, and

(I −A)−1 =∞∑

k=0

Ak, with ||(I −A)−1|| ≤ (1 − ||A||)−1.

Proof. Let B =∑∞

k=0Ak. First, since ||A|| < 1,

∞∑k=0

||A||k ≤ (1 − ||A||)−1.

It shows that as an infinite series, B converges, and B is a bounded operator with ||B|| ≤ (1 − ||A||)−1. Itremains to show that B is the inverse of I−A. We note that I+AB = I+BA = B, thus I = B−AB = (I−A)Band I = B −BA = B(I −A).

Corollary 10.4. Let A be a bounded linear operator that has a bounded inverse, and B be a linear operatorthat satisfies ||A−B|| < 1/||A−1||, then B is invertible.

Proof. We note thatB = A− (A−B) = A(I −A−1(A−B)). (10.10)

And||A−1(A−B)|| < ||A−1||(1/||A−1||) = 1 (10.11)

Hence, (I −A−1(A−B)) is invertible, thus B is invertible.

This corollary tells us that for each bounded linear operator with bounded inverse, there is a neighborhoodin which the operators possess such properties as well.

In the following, we introduce the Open mapping theorm. It is an important theorem in functional analysis,which gives a useful sufficient condition to judge whether the inverse of a bounded operator is bounded.

Theorem 10.2 (Open mapping theorem). Let X and Y be Banach spaces, and T : X → Y be a boundedlinear operator, then T is an open mapping, i.e. T sends every open set in X to an open set in Y .

Corollary 10.5. Let X and Y be Banach spaces, and T : X → Y be an invertible bounded linear operator,then T−1 is bounded.

Proof. By open mapping theorem, T is an open mapping. Hence, (T−1)−1(U) = T (U) is open for each openset U in X, thus T−1 is continuous, thus it is bounded.

99

10.3. DUAL OPERATORS CHAPTER 10. LINEAR OPERATORS

10.3 Dual Operators

Definition 10.4 (Dual operator). Let A : X → Y be a bounded linear operator. Then for each h ∈ Y ∗,the linear functional f on X defined by f(x) := h(Ax) is a bounded, i.e. f ∈ X∗. This can be seen from|f(x)| ≤ ||h|| · ||A|| · ||x||. Then, we can define an operator A∗ : Y ∗ → X∗ by

(A∗h)(x) = h(Ax), ∀h ∈ Y ∗

which is called the dual operator of A.

Proposition 10.9. Let A : X → Y be a bounded linear operator, then its dual operator A∗ : Y ∗ → X∗ isalso a bounded linear operator, with ||A∗|| = ||A||.

Proof. We first show that A∗ is linear. This can be seen from

(A∗(αf + βg))(x) = (αf + βg)(Ax) = αf(Ax) + βg(Ax) = α(A∗f)(x) + β(A∗g)(x)= (αA∗f + βA∗g)(x). (10.12)

In addition, we have for each f ∈ Y ∗,

|(A∗f)(x)| = |f(Ax)| ≤ ||f || · ||A|| · ||x||. (10.13)

Hence ||A∗f || ≤ ||A|| · ||f ||, which implies that ||A∗|| ≤ ||A||. On the other hand, from the corollary of Hahn-Banach theorem, for every x ∈ X, there is gx ∈ Y ∗ with ||gx|| = 1, and gx(Ax) = ||Ax||. Let fx = A∗gx,then

||Ax|| = gx(Ax) = (A∗gx)(x) ≤ ||A∗gx|| · ||x|| ≤ ||A∗|| · ||gx|| · ||x|| = ||A∗|| · ||x||. (10.14)

Taking supremium of the left hand side, we get ||A|| ≤ ||A∗||.

Proposition 10.10. Let A,B ∈ L(X,Y ), then (αA+ βB)∗ = αA∗ + βB∗ for each α, β ∈ C.

Proof. For each f ∈ Y ∗ and x ∈ X, we have

((αA+ βB)∗f)(x) = f((αA+ βB)(x)) = f(αAx+ βBx) = αf(Ax) + βf(Bx)= α(A∗f)(x) + β(B∗f)(x) = ((αA∗ + βB∗)(f))(x). (10.15)

Proposition 10.11. Let A ∈ L(X,Y ) be invertible, then A∗ is also invertible, and (A∗)−1 = (A−1)∗.

Proof. For each f ∈ Y ∗ and x ∈ X, we have

((A−1)∗A∗f)(x) = (A∗f)(A−1x) = f(AA−1x) = f(x), (10.16)

thus (A−1)∗A∗f = f, ∀f ∈ Y ∗. Similarly, we can show A∗(A−1)∗f = f, ∀f ∈ X∗.

Proposition 10.12. Let A ∈ L(X,Y ) and B ∈ L(Y, Z) then (BA)∗ = A∗B∗.

Proof. For each f ∈ Z∗, and x ∈ X, we have

((BA)∗f)(x) = f(BAx) = (B∗f)(Ax) = (A∗(B∗f))(x), (10.17)

hence (BA)∗ = A∗B∗.

10.4 Convergence of Operators

There are several different notions of convergence in regard to linear operators.

Definition 10.5 (Convergence of operators). Let X and Y be normed vector spaces, (An) be a sequence ofoperators in L(X,Y ) and A ∈ L(X,Y ), then

100

10.4. CONVERGENCE OF OPERATORS CHAPTER 10. LINEAR OPERATORS

1. we say (An) converges uniformly to A or (An) converges in norm to A if ||An −A|| → 0, denotedby An → A;

2. we say (An) converges strongly to A if Anx→ Ax, ∀x ∈ X, denoted by Ans−→ A;

3. we say (An) converges weakly to A if f(Anx) → Ax, ∀x ∈ X, f ∈ Y ∗, denoted by Anw−→ A.

Proof. Uniform convergence of operators implies strong convergence; strong convergence implies weak con-vergence.

Proof. 1. (An → A⇒ Ans−→ A). Assume An → A, for each x ∈ X, we have

||Anx−Ax|| ≤ ||An −A|| · ||x|| → 0 (10.18)

Hence, Ans−→ A.

2. (Ans−→ A⇒ An

w−→ A). This directly follows from the fact that when f is a bounded linear functional,then xn → x⇒ f(xn) → f(x).

101