
Some Notes on Measure Theory

Chris Preston

This version: February 2005

These notes present the material on measures and kernels which is needed in order to read my lecture notes Specifications and their Gibbs states [16]. They could perhaps be used as a general introduction to some parts of measure theory, but the account is somewhat biased and the contents are determined entirely by the kind of results used in [16]. In particular, the topological aspects of measure theory are missing completely.

The theory in [16] really only requires an integral for non-negative mappings, and so we also make this restriction here. The disadvantage is then that we have non-negative cones of mappings instead of vector spaces, but the advantage is that the value ∞ is much less of a nuisance and in most cases can be regarded as just another number.

There are many texts providing a more balanced account of measure theory. The classical text is Halmos [8] and a very good modern book is Cohn [3]; the first course I gave on the subject was based on Taylor [17]. However, the book everyone should look at at least once is Meyer [14].

Chris Preston
Pankow
February, 2005

Contents

1 Extended real numbers
2 Measurable spaces and mappings
3 Measures
4 The Caratheodory extension theorem
5 Measures on the real line
6 How the integral will be introduced
7 Partially ordered sets
8 Real valued mappings
9 Real valued measurable mappings
10 The integral
11 The Daniell integral
12 The Radon-Nikodym theorem
13 Image and pre-image measures
14 Kernels
15 Product measures
16 Countably generated measurable spaces
17 The Dunford-Pettis theorem
18 Substandard Borel spaces
19 The Kolmogorov extension property
20 Convergence of conditional expectations
21 Existence of conditional distributions
22 Standard Borel spaces
23 The usual diagonal argument
Bibliography
Index

1 Extended real numbers

Most of the mappings we will be dealing with take their values in the set $\mathbb{R}^+_\infty$ of non-negative extended real numbers. In this chapter we list (without proofs) the facts which we need about these numbers. The reader should check through the list to see that there are no surprises.

The natural numbers $\{0, 1, 2, \ldots\}$ are denoted by $\mathbb{N}$, the positive natural numbers $\{1, 2, \ldots\}$ by $\mathbb{N}^+$. A countable set is one which is either finite or countably infinite (the latter meaning it has the same cardinality as $\mathbb{N}$).

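Put $\mathbb{R}^+ = \{a \in \mathbb{R} : a \ge 0\}$ and let $\mathbb{R}^+_\infty = \mathbb{R}^+ \cup \{\infty\}$; the operations of addition and multiplication on $\mathbb{R}^+$ will be extended to $\mathbb{R}^+_\infty$ by letting $a + \infty = \infty + a = \infty$ for all $a \in \mathbb{R}^+_\infty$, $a \cdot \infty = \infty \cdot a = \infty$ for all $a \in \mathbb{R}^+_\infty \setminus \{0\}$ and $0 \cdot \infty = \infty \cdot 0 = 0$. As can be easily verified, these extended operations satisfy the usual associative, commutative and distributive laws (without any restriction).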
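The conventions above can be tried out in a few lines of Python. This is only a sketch under stated assumptions: `math.inf` stands in for $\infty$, non-negative floats stand in for elements of $\mathbb{R}^+$, and the names `ext_add` and `ext_mul` are hypothetical helpers, not notation from these notes. Floating-point multiplication yields `nan` for `0 * inf`, so the convention $0 \cdot \infty = 0$ has to be coded explicitly.

```python
import math

INF = math.inf  # stands in for the element ∞ of R+∞ (assumption of this sketch)

def ext_add(a: float, b: float) -> float:
    """Addition on [0, ∞]: a + ∞ = ∞ + a = ∞."""
    return a + b  # float addition already behaves this way for non-negative inputs

def ext_mul(a: float, b: float) -> float:
    """Multiplication on [0, ∞] with the convention 0 · ∞ = ∞ · 0 = 0."""
    if a == 0 or b == 0:
        return 0.0  # plain float arithmetic would give nan for 0 * inf
    return a * b

# Spot check of the distributive law a · (b + c) = a · b + a · c on a few samples.
samples = [0.0, 1.0, 2.0, INF]
distributive = all(
    ext_mul(a, ext_add(b, c)) == ext_add(ext_mul(a, b), ext_mul(a, c))
    for a in samples for b in samples for c in samples
)
```

The sample values are chosen so that all finite products and sums are exact in floating point; the check is an illustration, not a proof.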

If $a, b \in \mathbb{R}^+_\infty$ then $a - b$ is not always defined; however, it is useful to always assign $|a - b|$ a value: This is the usual value if $a, b \in \mathbb{R}^+$ and $\infty$ in all other cases, which means that $|\infty - \infty| = \infty$. Let $a, b \in \mathbb{R}^+_\infty$ with $a \le b$; then there exists $c \in \mathbb{R}^+_\infty$ with $b = a + c$, and $c$ is unique unless both $a$ and $b$ are equal to $\infty$. It is convenient to define $b - a$ to be the 'real' difference if $a, b \in \mathbb{R}^+$ and to be $\infty$ otherwise. Thus $b = a + (b - a)$ for all $a, b \in \mathbb{R}^+_\infty$ with $a \le b$ and $b - a = \infty$ whenever $b = \infty$.

The usual total order $\le$ on $\mathbb{R}^+$ will be extended to a total order on $\mathbb{R}^+_\infty$ (also denoted by $\le$) by letting $a \le \infty$ for each $a \in \mathbb{R}^+_\infty$. In particular $a \le |a - b| + b$ and $b \le |a - b| + a$ for all $a, b \in \mathbb{R}^+_\infty$. Put $a \wedge b = \min\{a, b\}$, $a \vee b = \max\{a, b\}$ and note that $|a - b| = (a \vee b - a) + (a \vee b - b)$ for all $a, b \in \mathbb{R}^+_\infty$.

We use $a \ge b$ as an alternative notation for $b \le a$. Moreover, $a < b$ means of course $a \le b$ but not $a = b$, and $a > b$ means $a \ge b$ but not $a = b$.
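As a quick sanity check of the conventions for $|a - b|$ and $b - a$, here is a small Python sketch. It assumes `math.inf` represents $\infty$; the function names `abs_diff` and `ext_sub` are hypothetical and chosen only for this illustration.

```python
import math

INF = math.inf  # stands in for ∞ (assumption of this sketch)

def abs_diff(a: float, b: float) -> float:
    """|a - b|: the usual value for finite a, b and ∞ otherwise (so |∞ - ∞| = ∞)."""
    if a == INF or b == INF:
        return INF
    return abs(a - b)

def ext_sub(b: float, a: float) -> float:
    """b - a for a <= b: the real difference if both are finite, ∞ otherwise."""
    if a > b:
        raise ValueError("b - a is only defined here for a <= b")
    if a == INF or b == INF:
        return INF
    return b - a

# Check |a - b| = (a ∨ b - a) + (a ∨ b - b) and b = a + (b - a) on a few samples.
samples = [0.0, 1.0, 2.5, INF]
identity_holds = all(
    abs_diff(a, b) == ext_sub(max(a, b), a) + ext_sub(max(a, b), b)
    for a in samples for b in samples
)
recovers_b = all(
    b == a + ext_sub(b, a)
    for a in samples for b in samples if a <= b
)
```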

A sequence $\{a_n\}_{n\ge1}$ from $\mathbb{R}^+_\infty$ converges to $a \in \mathbb{R}^+$ if for each $\varepsilon > 0$ there exists $m \ge 1$ such that $|a - a_n| < \varepsilon$ for all $n \ge m$. This means exactly that there exists $p \ge 1$ such that $a_n \in \mathbb{R}^+$ for all $n \ge p$ and that the sequence $\{a_n\}_{n\ge p}$ converges to $a$ in $\mathbb{R}$. A sequence $\{a_n\}_{n\ge1}$ from $\mathbb{R}^+_\infty$ converges to $\infty$ if for each $b \in \mathbb{R}^+$ there exists $m \ge 1$ such that $a_n \ge b$ for all $n \ge m$. If $\{a_n\}_{n\ge1}$ converges to $a \in \mathbb{R}^+_\infty$ then $a$ is uniquely determined by $\{a_n\}_{n\ge1}$; this value $a$ is called the limit of the sequence and will be denoted by $\lim_{n\to\infty} a_n$, or mostly just by $\lim_n a_n$. If $\{a_n\}_{n\ge1}$ is a sequence from $\mathbb{R}^+_\infty$ then the statement $\lim_n a_n = a$ is short for the statement that the sequence $\{a_n\}_{n\ge1}$ converges with limit $a$.

Let $\{a_n\}_{n\ge1}$ be a convergent sequence from $\mathbb{R}^+_\infty$ with $a = \lim_n a_n$. If $a \in \mathbb{R}^+$ then $\lim_n |a - a_n| = 0$. However, if $a = \infty$ then $|a - a_n| = \infty$ for all $n \ge 1$.

A sequence $\{a_n\}_{n\ge1}$ from $\mathbb{R}^+_\infty$ is said to be increasing if $a_n \le a_{n+1}$ and decreasing if $a_{n+1} \le a_n$ for all $n \ge 1$. Each increasing sequence $\{a_n\}_{n\ge1}$ converges: If there exists $a \in \mathbb{R}^+$ with $a_n \le a$ for all $n \ge 1$ then $\lim_n a_n$ is just the limit in $\mathbb{R}$, and if no such $a \in \mathbb{R}^+$ exists then $\lim_n a_n = \infty$. Moreover each decreasing sequence $\{a_n\}_{n\ge1}$ also converges: Either $a_n = \infty$ for all $n \ge 1$, in which case $\lim_n a_n = \infty$, or $a_n < \infty$ for all $n \ge p$ for some $p \ge 1$, and then $\lim_n a_n$ is just the limit of the sequence $\{a_n\}_{n\ge p}$ in $\mathbb{R}$.

Let $A$ be a subset of $\mathbb{R}^+_\infty$; an element $a \in \mathbb{R}^+_\infty$ is said to be an upper bound (resp. lower bound) for $A$ if $b \le a$ (resp. $b \ge a$) for all $b \in A$. An upper bound (resp. lower bound) $a$ is called a least upper bound (resp. greatest lower bound) for $A$ if $a \le b$ for each upper bound $b$ for $A$ (resp. if $a \ge b$ for each lower bound $b$ for $A$). If a least upper bound (resp. greatest lower bound) exists then it is clearly unique and it will be denoted by $\sup(A)$ (resp. by $\inf(A)$).

Each non-empty subset $A$ of $\mathbb{R}^+_\infty$ possesses both a least upper bound and a greatest lower bound. If $A$ is a bounded subset of $\mathbb{R}^+$ (i.e., if $b$ is an upper bound for $A$ for some $b \in \mathbb{R}^+$) then $\sup(A)$ is the least upper bound of $A$ in $\mathbb{R}$. If $A$ has no upper bound in $\mathbb{R}^+$ then $\sup(A) = \infty$. If $A = \{\infty\}$ then $\inf(A) = \infty$. If $A \ne \{\infty\}$ then $\inf(A)$ is the greatest lower bound of $A \setminus \{\infty\}$ in $\mathbb{R}$.

If $\{a_n\}_{n\ge1}$ is any sequence from $\mathbb{R}^+_\infty$ and $m \ge 1$ then $\{a_n : n \ge m\}$ will be used to denote the set of values $\{a \in \mathbb{R}^+_\infty : a = a_n \text{ for some } n \ge m\}$.

If $\{a_n\}_{n\ge1}$ is an increasing sequence from $\mathbb{R}^+_\infty$ then $\lim_n a_n$ is just the least upper bound of its set of values $\{a_n : n \ge 1\}$. Thus $\lim_n a_n$ is the unique element $a$ of $\mathbb{R}^+_\infty$ with the properties: (i) $a_n \le a$ for all $n \ge 1$ and (ii) if $b \in \mathbb{R}^+_\infty$ with $b \le a$ and $b \ne a$ then $b \le a_n$ for all large enough $n$. Similarly, if $\{a_n\}_{n\ge1}$ is a decreasing sequence from $\mathbb{R}^+_\infty$ then $\lim_n a_n$ is the greatest lower bound of $\{a_n : n \ge 1\}$, and therefore $\lim_n a_n$ is the unique element $a$ of $\mathbb{R}^+_\infty$ with the properties: (i) $a_n \ge a$ for all $n \ge 1$ and (ii) if $b \in \mathbb{R}^+_\infty$ with $b \ge a$ and $b \ne a$ then $b \ge a_n$ for all large enough $n$.

Let $\{a_n\}_{n\ge1}$ be any sequence from $\mathbb{R}^+_\infty$ and for $n \ge 1$ let $b_n = \sup\{a_m : m \ge n\}$ and $c_n = \inf\{a_m : m \ge n\}$. Then the sequence $\{b_n\}_{n\ge1}$ is decreasing and $\{c_n\}_{n\ge1}$ is increasing. The limits $\lim_n b_n$ and $\lim_n c_n$ are denoted by $\limsup_n a_n$ and $\liminf_n a_n$ respectively. Then $\liminf_n a_n \le \limsup_n a_n$, with equality if and only if the sequence $\{a_n\}_{n\ge1}$ converges. Moreover, if this is the case then

$$\liminf_{n\to\infty} a_n = \lim_{n\to\infty} a_n = \limsup_{n\to\infty} a_n .$$
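The monotonicity of $\{b_n\}$ and $\{c_n\}$ can be seen concretely on a finite stretch of a sequence. The following Python sketch is only an illustration: it works with a finite list, so the "tails" are truncated and the computed values are not the true $\limsup$ and $\liminf$ of an infinite sequence. The names `tail_sups` and `tail_infs` are hypothetical.

```python
import math

def tail_sups(a):
    """b_n = sup{a_m : m >= n}, computed over the finite list a (truncated tails)."""
    out, current = [], 0.0          # 0 is the least element of R+∞
    for x in reversed(a):
        current = max(current, x)
        out.append(current)
    return out[::-1]

def tail_infs(a):
    """c_n = inf{a_m : m >= n}, computed over the finite list a (truncated tails)."""
    out, current = [], math.inf     # ∞ is the greatest element of R+∞
    for x in reversed(a):
        current = min(current, x)
        out.append(current)
    return out[::-1]

a = [3, 1, 4, 1, 5, 2, 2]
b = tail_sups(a)   # decreasing in n
c = tail_infs(a)   # increasing in n
```

In this example `c[n] <= a[n] <= b[n]` holds for every index, which is the finite analogue of $\liminf_n a_n \le \limsup_n a_n$.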

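A binary operation $\star : \mathbb{R}^+_\infty \times \mathbb{R}^+_\infty \to \mathbb{R}^+_\infty$ is said to be monotone if $a \star b \le a' \star b'$ whenever $a \le a'$ and $b \le b'$. If $\star$ is monotone and $\{a_n\}_{n\ge1}$ and $\{b_n\}_{n\ge1}$ are increasing sequences from $\mathbb{R}^+_\infty$ then $\{a_n \star b_n\}_{n\ge1}$ is also an increasing sequence. A monotone operation $\star$ is defined to be continuous if

$$\lim_{n\to\infty} (a_n \star b_n) = \Bigl(\lim_{n\to\infty} a_n\Bigr) \star \Bigl(\lim_{n\to\infty} b_n\Bigr)$$

holds for all increasing sequences $\{a_n\}_{n\ge1}$ and $\{b_n\}_{n\ge1}$ from $\mathbb{R}^+_\infty$.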

The operations $+$, $\cdot$, $\vee$ and $\wedge$ on $\mathbb{R}^+_\infty$ are all continuous (and so in particular monotone). Moreover, for all $a, b \in \mathbb{R}^+$ the operation $(c, d) \mapsto ac + bd$ is also continuous. Furthermore, these operations are all finite, where a binary operation $\star$ on $\mathbb{R}^+_\infty$ is said to be finite if $a \star b \in \mathbb{R}^+$ for all $a, b \in \mathbb{R}^+$.

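Intervals in $\mathbb{R}^+_\infty$ are defined as expected: For all $a, b \in \mathbb{R}^+_\infty$ with $a < b$ put

$$[a, b] = \{c \in \mathbb{R}^+_\infty : a \le c \le b\} , \quad (a, b) = \{c \in \mathbb{R}^+_\infty : a < c < b\} ,$$

$$(a, b] = \{c \in \mathbb{R}^+_\infty : a < c \le b\} , \quad [a, b) = \{c \in \mathbb{R}^+_\infty : a \le c < b\} .$$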

2 Measurable spaces and mappings

Before we define a measure in Chapter 3 we must first look at various classes of subsets (algebras, σ-algebras, monotone classes, d-systems) of a given set and the relationships between them. We also need to consider some elementary properties of measurable spaces and mappings.

Let $X$ be a non-empty set and $\mathcal{S}$ be a non-empty subset of $\mathcal{P}(X)$ (where $\mathcal{P}(X)$ denotes the set of all subsets of $X$). Then $\mathcal{S}$ is called

— an algebra if $X \setminus A \in \mathcal{S}$ for each $A \in \mathcal{S}$ and $\mathcal{S}$ is closed under finite unions (i.e., $A_1 \cup \cdots \cup A_n \in \mathcal{S}$ whenever $A_1, \ldots, A_n \in \mathcal{S}$),

— a σ-algebra if $X \setminus A \in \mathcal{S}$ for each $A \in \mathcal{S}$ and $\mathcal{S}$ is closed under countable unions (i.e., $\bigcup_{n\ge1} A_n \in \mathcal{S}$ for each sequence $\{A_n\}_{n\ge1}$ of elements from $\mathcal{S}$),

— a monotone class if whenever $\{A_n\}_{n\ge1}$ is an increasing (resp. a decreasing) sequence of elements from $\mathcal{S}$ then $\bigcup_{n\ge1} A_n \in \mathcal{S}$ (resp. $\bigcap_{n\ge1} A_n \in \mathcal{S}$),

— a d-system if $X \in \mathcal{S}$ and $A_2 \setminus A_1 \in \mathcal{S}$ for all $A_1, A_2 \in \mathcal{S}$ with $A_1 \subset A_2$.

It is clear that a σ-algebra is both an algebra and a monotone class, and that an algebra always contains the elements $\emptyset$ and $X$. Moreover, an algebra is also closed under finite intersections and a σ-algebra is closed under countable intersections.

Lemma 2.1 (1) A monotone class is a σ-algebra if and only if it is an algebra.

(2) Let $\{A_n\}_{n\ge1}$ be a sequence from an algebra $\mathcal{A}$. Then there exists a disjoint sequence $\{B_n\}_{n\ge1}$ from $\mathcal{A}$ with $B_n \subset A_n$ for each $n \ge 1$ and $\bigcup_{n\ge1} B_n = \bigcup_{n\ge1} A_n$.

(3) An algebra $\mathcal{A}$ is a σ-algebra if and only if $\bigcup_{n\ge1} A_n \in \mathcal{A}$ for each increasing sequence $\{A_n\}_{n\ge1}$ from $\mathcal{A}$.

(4) An algebra $\mathcal{A}$ is a σ-algebra if and only if $\bigcup_{n\ge1} A_n \in \mathcal{A}$ for each disjoint sequence $\{A_n\}_{n\ge1}$ from $\mathcal{A}$.

Proof Straightforward.

Let $\mathcal{S}$ be a subset of $\mathcal{P}(X)$. Then, since $\mathcal{P}(X)$ is a σ-algebra containing $\mathcal{S}$ and an arbitrary intersection of σ-algebras is again a σ-algebra, it follows that the intersection of all the σ-algebras containing $\mathcal{S}$ is a σ-algebra containing $\mathcal{S}$. This σ-algebra is called the σ-algebra generated by $\mathcal{S}$ and it will be denoted by $\sigma(\mathcal{S})$. Of course, if $\mathcal{F}$ is any σ-algebra containing $\mathcal{S}$ then by construction $\mathcal{F}$ contains $\sigma(\mathcal{S})$, which means that $\sigma(\mathcal{S})$ is the smallest σ-algebra containing $\mathcal{S}$.

The algebra, the monotone class and the d-system generated by $\mathcal{S}$ are defined in exactly the same way, and these subsets of $\mathcal{P}(X)$ will be denoted by $a(\mathcal{S})$, $m(\mathcal{S})$ and $d(\mathcal{S})$ respectively. Note that if $\mathcal{S}$ is empty then $m(\mathcal{S})$ is also empty and $\sigma(\mathcal{S}) = a(\mathcal{S}) = d(\mathcal{S}) = \{\emptyset, X\}$.

It is instructive to see what $\sigma(\mathcal{T})$ and $a(\mathcal{T})$ are in the special case in which $\mathcal{T}$ is a non-empty finite subset of $\mathcal{P}(X)$, and to be explicit suppose that $\mathcal{T}$ consists of the $n$ elements $A_1, \ldots, A_n$. Denote by $p(\mathcal{T})$ the set of all non-empty subsets of $X$ having the form $A'_1 \cap \cdots \cap A'_n$, where for each $j$ the set $A'_j$ is either $A_j$ or $X \setminus A_j$. Then $p(\mathcal{T})$ is finite (and in fact can contain at most $2^n$ elements). Moreover, denote by $\bar{p}(\mathcal{T})$ the set of all subsets of $X$ which can be written as unions of elements from $p(\mathcal{T})$. (An empty union is allowed here, so $\emptyset \in \bar{p}(\mathcal{T})$.)

Lemma 2.2 (1) The elements of $p(\mathcal{T})$ form a partition of $X$, meaning that each point of $X$ lies in exactly one element of $p(\mathcal{T})$.

(2) Let $1 \le j \le n$; then for each $B \in p(\mathcal{T})$ either $B \subset A_j$ or $B \cap A_j = \emptyset$. Hence (by (1)) $A_j$ is the disjoint union of the elements of $p(\mathcal{T})$ it contains.

(3) $\sigma(\mathcal{T}) = a(\mathcal{T}) = \bar{p}(\mathcal{T})$.

Proof (1) It is clear that different elements of $p(\mathcal{T})$ are disjoint. Moreover, it is also clear that for each $x \in X$ there is an element of $p(\mathcal{T})$ containing $x$ (since $x$ is either in $A_j$ or $X \setminus A_j$ for each $j$) and this just says that the union of all the elements in $p(\mathcal{T})$ is equal to $X$.

(2) This is clear.

(3) It follows immediately from (1) that $\bar{p}(\mathcal{T})$ is a σ-algebra and thus also an algebra, and by (2) $\mathcal{T} \subset \bar{p}(\mathcal{T})$. But any algebra or σ-algebra containing $\mathcal{T}$ must contain $p(\mathcal{T})$ and thus also $\bar{p}(\mathcal{T})$, and therefore $\sigma(\mathcal{T}) = a(\mathcal{T}) = \bar{p}(\mathcal{T})$.

Lemma 2.2 implies in particular that σ(T ) and a(T ) are finite when T is finite.
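For a finite family of subsets of a finite set the construction behind Lemma 2.2 can be carried out directly. The following Python sketch is an illustration only; the function names `atoms` and `generated_algebra` are hypothetical, and sets are represented as `frozenset` objects.

```python
from itertools import combinations, product

def atoms(X, T):
    """Non-empty intersections A'_1 ∩ ... ∩ A'_n with A'_j either A_j or X \\ A_j."""
    X = frozenset(X)
    cells = set()
    for choice in product([True, False], repeat=len(T)):
        cell = X
        for keep, A in zip(choice, T):
            A = frozenset(A)
            cell = cell & (A if keep else X - A)
        if cell:                      # only non-empty intersections are kept
            cells.add(cell)
    return cells

def generated_algebra(X, T):
    """All unions of atoms (the empty union gives the empty set)."""
    cells = list(atoms(X, T))
    family = set()
    for r in range(len(cells) + 1):
        for combo in combinations(cells, r):
            family.add(frozenset().union(*combo))
    return family

X = {1, 2, 3, 4}
T = [{1, 2}, {2, 3}]
P = atoms(X, T)               # the partition {1}, {2}, {3}, {4}
G = generated_algebra(X, T)   # here: all 16 subsets of X
```

With $k$ atoms the generated algebra has exactly $2^k$ elements, which makes the finiteness statement explicit.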

The following very useful fact is called the monotone class theorem:

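Proposition 2.1 If $\mathcal{A}$ is an algebra then $\sigma(\mathcal{A}) = m(\mathcal{A})$.

Proof This is equivalent to showing that $\sigma(\mathcal{A})$ is a monotone class and that $m(\mathcal{A})$ is a σ-algebra. But a σ-algebra is always a monotone class, and by Lemma 2.1 (1) a monotone class is a σ-algebra if and only if it is an algebra. It thus remains to show that $m(\mathcal{A})$ is an algebra. Let $m'(\mathcal{A}) = \{X \setminus A : A \in m(\mathcal{A})\}$ and put $\mathcal{M} = m(\mathcal{A}) \cap m'(\mathcal{A})$. Then $\mathcal{A} \subset \mathcal{M}$, and it is easily checked that $\mathcal{M}$ is a monotone class, and therefore $m(\mathcal{A}) \subset \mathcal{M}$, i.e., $\mathcal{M} = m(\mathcal{A})$. But this just means that $X \setminus A \in m(\mathcal{A})$ for each $A \in m(\mathcal{A})$. Now for each $A \in m(\mathcal{A})$ let $\mathcal{M}(A) = \{B \in m(\mathcal{A}) : A \cup B \in m(\mathcal{A})\}$; then it is again easily checked that $\mathcal{M}(A)$ is a monotone class. Moreover, if in addition $A \in \mathcal{A}$ then $\mathcal{A} \subset \mathcal{M}(A)$, and hence $\mathcal{M}(A) = m(\mathcal{A})$ for each $A \in \mathcal{A}$. This implies that $C \cup D \in m(\mathcal{A})$ for all $C \in \mathcal{A}$, $D \in m(\mathcal{A})$ (since $D \in \mathcal{M}(C) = m(\mathcal{A})$), and hence $\mathcal{A} \subset \mathcal{M}(D)$ for each $D \in m(\mathcal{A})$, i.e., $\mathcal{M}(D) = m(\mathcal{A})$ for each $D \in m(\mathcal{A})$. In other words, $m(\mathcal{A})$ is closed under finite unions.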

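A subset $\mathcal{S} \subset \mathcal{P}(X)$ is said to be closed under finite intersections if $A_1 \cap A_2 \in \mathcal{S}$ for all $A_1, A_2 \in \mathcal{S}$.

Proposition 2.2 If $\mathcal{S}$ is closed under finite intersections then $d(\mathcal{S})$ is an algebra.

Proof The set $d(\mathcal{S})$ is non-empty, since $X \in d(\mathcal{S})$, and also $X \setminus A \in d(\mathcal{S})$ for all $A \in d(\mathcal{S})$, since $d(\mathcal{S})$ is a d-system. Put

$$\mathcal{C} = \{A \in d(\mathcal{S}) : A \cap B \in d(\mathcal{S}) \text{ for all } B \in \mathcal{S}\} ;$$

then $\mathcal{S} \subset \mathcal{C} \subset d(\mathcal{S})$ and it is easily seen that $\mathcal{C}$ is a d-system. Hence $\mathcal{C} = d(\mathcal{S})$, i.e., $A \cap B \in d(\mathcal{S})$ for all $A \in d(\mathcal{S})$, $B \in \mathcal{S}$. Now put

$$\mathcal{D} = \{A \in d(\mathcal{S}) : A \cap B \in d(\mathcal{S}) \text{ for all } B \in d(\mathcal{S})\} ;$$

then $\mathcal{S} \subset \mathcal{D} \subset d(\mathcal{S})$ and again $\mathcal{D}$ is a d-system. Therefore $\mathcal{D} = d(\mathcal{S})$, i.e., $A \cap B \in d(\mathcal{S})$ for all $A, B \in d(\mathcal{S})$. This shows that $d(\mathcal{S})$ is an algebra.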

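Let $Y$ be a non-empty subset of $X$; for each non-empty subset $\mathcal{S}$ of $\mathcal{P}(X)$ denote by $\mathcal{S}|_Y$ the subset of $\mathcal{P}(Y)$ consisting of all sets having the form $Y \cap A$ for some $A \in \mathcal{S}$. $\mathcal{S}|_Y$ is referred to as the trace of $\mathcal{S}$ on $Y$. If $\mathcal{E}$ is a σ-algebra of subsets of $X$ then $\mathcal{E}|_Y$ is a σ-algebra of subsets of $Y$.

Proposition 2.3 $\sigma(\mathcal{S}|_Y) = \sigma(\mathcal{S})|_Y$.

Proof Denote by $\mathcal{F}$ the subset of $\mathcal{P}(X)$ consisting of all sets of the form $F \cup (G \setminus Y)$ with $F \in \sigma(\mathcal{S}|_Y)$ and $G \in \sigma(\mathcal{S})$, and therefore $\mathcal{F}|_Y = \sigma(\mathcal{S}|_Y)$. Then it is easily checked that $\mathcal{F}$ is a σ-algebra; moreover, $\mathcal{F}$ contains $\mathcal{S}$, since if $A \in \mathcal{S}$ then $A = (A \cap Y) \cup (A \setminus Y)$ and $A \cap Y \in \mathcal{S}|_Y \subset \sigma(\mathcal{S}|_Y)$. Thus $\sigma(\mathcal{S}) \subset \mathcal{F}$, which implies that $\sigma(\mathcal{S})|_Y \subset \mathcal{F}|_Y = \sigma(\mathcal{S}|_Y)$. On the other hand, $\sigma(\mathcal{S})|_Y$ is a σ-algebra containing $\mathcal{S}|_Y$, and hence also $\sigma(\mathcal{S}|_Y) \subset \sigma(\mathcal{S})|_Y$.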
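On a finite set the statement of Proposition 2.3 can be checked by brute force. The sketch below is an illustration under the assumption that $X$ is finite, so that closing a family under complements and pairwise unions already yields the generated σ-algebra; the helper names `sigma` and `trace` are hypothetical.

```python
def sigma(X, S):
    """σ-algebra on the finite set X generated by the family S (brute-force closure)."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(A) for A in S}
    while True:
        new = {X - A for A in family} | {A | B for A in family for B in family}
        if new <= family:             # nothing new: family is closed
            return family
        family |= new

def trace(family, Y):
    """Trace of a family of sets on Y: all sets Y ∩ A with A in the family."""
    Y = frozenset(Y)
    return {frozenset(A) & Y for A in family}

X = {1, 2, 3, 4, 5}
Y = {2, 3, 5}
S = [{1, 2}, {2, 3, 4}]

left = sigma(Y, trace(S, Y))      # σ(S|Y), a σ-algebra on Y
right = trace(sigma(X, S), Y)     # σ(S)|Y
```

Both sides come out as the full power set of $Y$ in this example, in agreement with the proposition.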

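Let $X$ and $Y$ be non-empty sets and $f : X \to Y$ be a mapping; then for each $B \subset Y$ put $f^{-1}(B) = \{x \in X : f(x) \in B\}$ and for each $\mathcal{S} \subset \mathcal{P}(Y)$ put

$$f^{-1}(\mathcal{S}) = \{A \in \mathcal{P}(X) : A = f^{-1}(B) \text{ for some } B \in \mathcal{S}\} .$$

Lemma 2.3 (1) If $\mathcal{E} \subset \mathcal{P}(X)$ is a σ-algebra then $\{B \in \mathcal{P}(Y) : f^{-1}(B) \in \mathcal{E}\}$ is a σ-algebra of subsets of $Y$.

(2) For each σ-algebra $\mathcal{F}$ of subsets of $Y$ the set $f^{-1}(\mathcal{F})$ is a σ-algebra of subsets of $X$.

Proof Both (1) and (2) follow from the fact that $f^{-1}(\emptyset) = \emptyset$, $f^{-1}(Y) = X$, $f^{-1}(Y \setminus B) = X \setminus f^{-1}(B)$ for each $B \subset Y$ and $f^{-1}\bigl(\bigcup_{n\ge1} B_n\bigr) = \bigcup_{n\ge1} f^{-1}(B_n)$ for each sequence $\{B_n\}_{n\ge1}$ of subsets of $Y$.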

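Proposition 2.4 $f^{-1}(\sigma(\mathcal{S})) = \sigma(f^{-1}(\mathcal{S}))$ for each subset $\mathcal{S}$ of $\mathcal{P}(Y)$.

Proof By Lemma 2.3 (2) $f^{-1}(\sigma(\mathcal{S}))$ is a σ-algebra and $f^{-1}(\mathcal{S}) \subset f^{-1}(\sigma(\mathcal{S}))$, and this implies that $\sigma(f^{-1}(\mathcal{S})) \subset f^{-1}(\sigma(\mathcal{S}))$. But by Lemma 2.3 (1) the set $\mathcal{E} = \{B \in \mathcal{P}(Y) : f^{-1}(B) \in \sigma(f^{-1}(\mathcal{S}))\}$ is a σ-algebra, and it contains $\mathcal{S}$; hence $\sigma(\mathcal{S}) \subset \mathcal{E}$ and thus

$$f^{-1}(\sigma(\mathcal{S})) \subset f^{-1}(\mathcal{E}) = \{A \in \mathcal{P}(X) : A = f^{-1}(B) \text{ for some } B \in \mathcal{E}\} \subset \sigma(f^{-1}(\mathcal{S})) .$$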
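Proposition 2.4 can also be verified mechanically when $X$ and $Y$ are finite. The following Python sketch assumes finite sets (so that closure under complements and pairwise unions gives the generated σ-algebra) and represents the mapping $f$ as a dictionary; the names `sigma` and `preimage` are hypothetical helpers for this illustration.

```python
def sigma(X, S):
    """σ-algebra on the finite set X generated by the family S (brute-force closure)."""
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(A) for A in S}
    while True:
        new = {X - A for A in family} | {A | B for A in family for B in family}
        if new <= family:
            return family
        family |= new

def preimage(f, B):
    """f^{-1}(B) for a mapping f given as a dict from X to Y."""
    return frozenset(x for x in f if f[x] in B)

X = {0, 1, 2, 3, 4, 5}
Y = {0, 1, 2, 3}
f = {x: x % 3 for x in X}          # not surjective: 3 is never attained
S = [{0}, {1, 3}]

left = {preimage(f, B) for B in sigma(Y, S)}          # f^{-1}(σ(S))
right = sigma(X, [preimage(f, B) for B in S])         # σ(f^{-1}(S))
```

The example deliberately uses a mapping that is not surjective, since the proposition does not require surjectivity.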

A pair $(X, \mathcal{E})$ consisting of a non-empty set $X$ and a σ-algebra $\mathcal{E}$ of subsets of $X$ is called a measurable space.

Let $(X, \mathcal{E})$ and $(Y, \mathcal{F})$ be measurable spaces. A mapping $f : X \to Y$ is said to be a measurable mapping from $(X, \mathcal{E})$ to $(Y, \mathcal{F})$ if $f^{-1}(\mathcal{F}) \subset \mathcal{E}$. This will also be expressed by saying that $f : (X, \mathcal{E}) \to (Y, \mathcal{F})$ is measurable, or just by saying that $f : X \to Y$ is measurable if the σ-algebras $\mathcal{E}$ and $\mathcal{F}$ can be determined from the context.

Lemma 2.4 Let $f : X \to Y$ be a mapping and $\mathcal{S}$ be a subset of $\mathcal{F}$ with $\mathcal{F} = \sigma(\mathcal{S})$. Then $f$ is measurable if and only if $f^{-1}(\mathcal{S}) \subset \mathcal{E}$.

Proof This follows immediately from Proposition 2.4.

If $f : (X, \mathcal{E}) \to (Y, \mathcal{F})$ and $g : (Y, \mathcal{F}) \to (Z, \mathcal{G})$ are measurable mappings then the composition $g \circ f : X \to Z$ is a measurable mapping from $(X, \mathcal{E})$ to $(Z, \mathcal{G})$.

Let $f : (X, \mathcal{E}) \to (Y, \mathcal{F})$ be a measurable mapping, and thus $f^{-1}(\mathcal{F}) \subset \mathcal{E}$. Now there are two further stronger properties which we will sometimes require of such a mapping. The first is that $f^{-1}(\mathcal{F}) = \mathcal{E}$ should hold and the second is that both $f^{-1}(\mathcal{F}) = \mathcal{E}$ and $f(X) \in \mathcal{F}$ hold. The first property is important when dealing with countably generated measurable spaces (Chapter 16) and the second when dealing with what we call substandard Borel spaces (Chapter 18); these are some kind of poor man's version of standard Borel spaces. Therefore, when looking at constructions which preserve measurability (for example, taking products and disjoint unions) we will also check whether these two additional properties are preserved.

If $X$ is a non-empty topological space then $\mathcal{O}_X$ will denote the set of open subsets of $X$. The σ-algebra $\sigma(\mathcal{O}_X)$ is called the Borel σ-algebra of $X$ or the σ-algebra of Borel subsets of $X$ and it will be denoted by $\mathcal{B}_X$.


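Lemma 2.5 If $X$ and $Y$ are non-empty topological spaces and $f : X \to Y$ is a continuous mapping then $f^{-1}(\mathcal{B}_Y) \subset \mathcal{B}_X$, i.e., $f$ is a measurable mapping from $(X, \mathcal{B}_X)$ to $(Y, \mathcal{B}_Y)$. Moreover, if $f$ is a homeomorphism then $f^{-1}(\mathcal{B}_Y) = \mathcal{B}_X$.

Proof If $f$ is continuous then $f^{-1}(\mathcal{O}_Y) \subset \mathcal{O}_X$ and thus by Proposition 2.4

$$f^{-1}(\mathcal{B}_Y) = f^{-1}(\sigma(\mathcal{O}_Y)) = \sigma(f^{-1}(\mathcal{O}_Y)) \subset \sigma(\mathcal{O}_X) = \mathcal{B}_X .$$

Moreover, if $f$ is a homeomorphism then $f^{-1}(\mathcal{O}_Y) = \mathcal{O}_X$ and so Proposition 2.4 here implies that $f^{-1}(\mathcal{B}_Y) = \mathcal{B}_X$.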

The set $\mathbb{R}^+_\infty$ will be considered as a topological space with respect to the order topology: In this topology a subset $U$ of $\mathbb{R}^+_\infty$ is defined to be open if for each $x \in U$ there exists an open interval $J$ with $x \in J \subset U$, where an open interval is a set having one of the forms $\{x \in \mathbb{R}^+_\infty : a < x < b\}$ with $a < b$, $\{x \in \mathbb{R}^+_\infty : x < b\}$ with $b \ne 0$ and $\{x \in \mathbb{R}^+_\infty : x > a\}$ with $a \ne \infty$. The mapping $t \mapsto t/(1 - t)$ defines a homeomorphism from $[0, 1]$ onto $\mathbb{R}^+_\infty$ and so in particular $\mathbb{R}^+_\infty$ is compact. Moreover, the relative topology of $\mathbb{R}^+$ as a subset of $\mathbb{R}^+_\infty$ is the same as the relative topology of $\mathbb{R}^+$ as a subset of $\mathbb{R}$. The Borel σ-algebra of $\mathbb{R}^+_\infty$ will be denoted by $\mathcal{B}^+_\infty$.
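The mapping $t \mapsto t/(1 - t)$ and its inverse are easy to write down explicitly. The Python sketch below is an illustration only: it assumes `math.inf` represents $\infty$, reads the value at $t = 1$ as $\infty$, and uses the hypothetical names `phi` and `phi_inv`.

```python
import math

def phi(t: float) -> float:
    """The mapping t -> t / (1 - t) from [0, 1] onto [0, ∞], with phi(1) = ∞."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return math.inf if t == 1.0 else t / (1.0 - t)

def phi_inv(x: float) -> float:
    """Inverse mapping x -> x / (1 + x) from [0, ∞] onto [0, 1], with phi_inv(∞) = 1."""
    if x < 0:
        raise ValueError("x must be non-negative")
    return 1.0 if x == math.inf else x / (1.0 + x)
```

Both mappings are strictly increasing, so they carry open intervals of one ordered set onto open intervals of the other, which is what makes `phi` a homeomorphism for the order topologies.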

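The most basic measurable space in the whole of what follows is $(\mathbb{R}^+_\infty, \mathcal{B}^+_\infty)$. If $(X, \mathcal{E})$ is a measurable space then the set of all measurable mappings from $(X, \mathcal{E})$ to $(\mathbb{R}^+_\infty, \mathcal{B}^+_\infty)$ will be denoted by $M(\mathcal{E})$, thus a mapping $f : X \to \mathbb{R}^+_\infty$ is in $M(\mathcal{E})$ if and only if $f^{-1}(\mathcal{B}^+_\infty) \subset \mathcal{E}$. (We omit the $X$ from this notation because it can always be inferred from the σ-algebra $\mathcal{E}$.)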

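It is usually rather easy to show that a mapping $f : X \to \mathbb{R}^+_\infty$ is in $M(\mathcal{E})$. This is because by Lemma 2.4 it is enough to show that $f^{-1}(A) \in \mathcal{E}$ for each $A \in \mathcal{S}$, where $\mathcal{S}$ is any subset of $\mathcal{B}^+_\infty$ with $\sigma(\mathcal{S}) = \mathcal{B}^+_\infty$, and there are many simple choices here for $\mathcal{S}$. For example, consider the following subsets of $\mathcal{P}(\mathbb{R}^+_\infty)$:

$$\mathcal{S}_1 = \{[0, a] : a \in \mathbb{R}^+ \setminus \{0\}\} , \quad \mathcal{S}_2 = \{[0, a) : a \in \mathbb{R}^+ \setminus \{0\}\} ,$$

$$\mathcal{S}_3 = \{[a, \infty] : a \in \mathbb{R}^+ \setminus \{0\}\} , \quad \mathcal{S}_4 = \{(a, \infty] : a \in \mathbb{R}^+ \setminus \{0\}\} .$$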

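Lemma 2.6 $\sigma(\mathcal{S}_1) = \sigma(\mathcal{S}_2) = \sigma(\mathcal{S}_3) = \sigma(\mathcal{S}_4) = \mathcal{B}^+_\infty$.

Proof Let $\mathcal{O}$ be the set of open subsets of $\mathbb{R}^+_\infty$ and $\mathcal{I}$ the set of open intervals (occurring in the definition of the order topology). Then each element of $\mathcal{O}$ can be written as a countable union of elements from $\mathcal{I}$, thus $\mathcal{O} \subset \sigma(\mathcal{I})$ and hence $\mathcal{B}^+_\infty = \sigma(\mathcal{O}) \subset \sigma(\mathcal{I})$. On the other hand, $\mathcal{I} \subset \mathcal{O}$ and so $\sigma(\mathcal{I}) \subset \sigma(\mathcal{O}) = \mathcal{B}^+_\infty$. This shows that $\mathcal{B}^+_\infty = \sigma(\mathcal{I})$. Let $a \in \mathbb{R}^+ \setminus \{0\}$; then

$$[0, a] = \bigcap_{n\ge1} [0, a + n^{-1}) \in \sigma(\mathcal{S}_2) , \quad [0, a) = \mathbb{R}^+_\infty \setminus [a, \infty] \in \sigma(\mathcal{S}_3) ,$$

$$[a, \infty] = \bigcap_{n\ge2} (a(1 - n^{-1}), \infty] \in \sigma(\mathcal{S}_4) , \quad (a, \infty] = \mathbb{R}^+_\infty \setminus [0, a] \in \sigma(\mathcal{S}_1) ,$$

thus $\mathcal{S}_1 \subset \sigma(\mathcal{S}_2)$, $\mathcal{S}_2 \subset \sigma(\mathcal{S}_3)$, $\mathcal{S}_3 \subset \sigma(\mathcal{S}_4)$ and $\mathcal{S}_4 \subset \sigma(\mathcal{S}_1)$, which implies that $\sigma(\mathcal{S}_1) = \sigma(\mathcal{S}_2) = \sigma(\mathcal{S}_3) = \sigma(\mathcal{S}_4)$. Denote this common σ-algebra by $\mathcal{F}$. If $a \in \mathbb{R}^+ \setminus \{0\}$ then $[0, a) \in \mathcal{I}$, hence $\mathcal{S}_2 \subset \mathcal{I}$ and so $\mathcal{F} = \sigma(\mathcal{S}_2) \subset \sigma(\mathcal{I}) = \mathcal{B}^+_\infty$. Now let $a, b \in \mathbb{R}^+ \setminus \{0\}$ with $a < b$; then $\{x \in \mathbb{R}^+_\infty : a < x < b\} = [0, b) \cap (a, \infty] \in \mathcal{F}$ and the same holds for the other possibilities in $\mathcal{I}$. Therefore $\mathcal{I} \subset \mathcal{F}$, which implies $\mathcal{B}^+_\infty = \sigma(\mathcal{I}) \subset \mathcal{F}$. This shows that $\mathcal{F} = \mathcal{B}^+_\infty$.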

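Putting together Lemmas 2.4 and 2.6 results in the following criteria:

Lemma 2.7 A mapping $f : X \to \mathbb{R}^+_\infty$ is in $M(\mathcal{E})$ as soon as one of the following conditions is satisfied:

(1) $\{x \in X : f(x) \le a\} \in \mathcal{E}$ for all $a \in \mathbb{R}^+ \setminus \{0\}$,

(2) $\{x \in X : f(x) < a\} \in \mathcal{E}$ for all $a \in \mathbb{R}^+ \setminus \{0\}$,

(3) $\{x \in X : f(x) \ge a\} \in \mathcal{E}$ for all $a \in \mathbb{R}^+ \setminus \{0\}$,

(4) $\{x \in X : f(x) > a\} \in \mathcal{E}$ for all $a \in \mathbb{R}^+ \setminus \{0\}$.

Proof For $n = 1, 2, 3, 4$ the condition (n) says that $f^{-1}(\mathcal{S}_n) \subset \mathcal{E}$ and therefore by Lemma 2.4 $f^{-1}(\mathcal{B}^+_\infty) \subset \mathcal{E}$, since by Lemma 2.6 $\mathcal{B}^+_\infty = \sigma(\mathcal{S}_n)$. Hence $f \in M(\mathcal{E})$.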
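Criterion (1) of Lemma 2.7 can be turned into a simple test when $X$ is finite. The sketch below is an illustration under that assumption: the σ-algebra is given explicitly as a set of `frozenset` objects, the mapping as a dictionary, and the name `is_measurable` is hypothetical. Since $f$ takes only finitely many values, the sets $\{f \le a\}$ with $a > 0$ are either empty or of the form $\{f \le v\}$ for a value $v$ of $f$, so it suffices to test the thresholds that occur as values.

```python
import math

def is_measurable(f, sigma_algebra):
    """Check {x : f(x) <= a} ∈ E for every threshold a that is a value of f."""
    thresholds = set(f.values())
    return all(
        frozenset(x for x in f if f[x] <= a) in sigma_algebra
        for a in thresholds
    )

# σ-algebra on X = {1, 2, 3, 4} generated by the partition {1, 2}, {3, 4}.
E = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), frozenset({1, 2, 3, 4})}

f = {1: 0.5, 2: 0.5, 3: math.inf, 4: math.inf}   # constant on the partition cells
g = {1: 0.0, 2: 1.0, 3: 1.0, 4: 1.0}             # separates the points 1 and 2
```

Here `f` is measurable with respect to `E`, while `g` is not, because $\{g \le 0\} = \{1\}$ is not in `E`.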

Of course, the choice of $\mathcal{S}_n$, $n = 1, 2, 3, 4$ was somewhat arbitrary, and many other similar choices could have been made.

Next let $S$ be a non-empty set and for each $s \in S$ let $(X_s, \mathcal{E}_s)$ be a measurable space; put $X = \prod_{s\in S} X_s$. We will now define an appropriate product σ-algebra on the cartesian product $X$. Let us call a subset of $X$ a measurable rectangle if it has the form $\prod_{s\in S} E_s$ with $E_s \in \mathcal{E}_s$ for each $s$ and $E_s \ne X_s$ for only finitely many $s \in S$, and denote the set of such measurable rectangles by $\mathcal{R}$; note that $\mathcal{R}$ is closed under finite intersections.

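Lemma 2.8 Denote by $\mathcal{A}$ (resp. $\mathcal{A}'$) the set of all finite unions (resp. all finite disjoint unions) of elements of $\mathcal{R}$. Then $\mathcal{A}$ is an algebra and $\mathcal{A}' = \mathcal{A}$. Moreover, $\sigma(\mathcal{A}) = \sigma(\mathcal{R})$.

Proof It is immediate that $\sigma(\mathcal{A}) = \sigma(\mathcal{R})$ since $\mathcal{R} \subset \mathcal{A}$ and $\mathcal{A} \subset \sigma(\mathcal{R})$. Now $\mathcal{A}' \subset \mathcal{A}$ and $\mathcal{A}$ is closed under finite unions, and hence it will follow that $\mathcal{A}$ is an algebra with $\mathcal{A}' = \mathcal{A}$ once we show that if $A \in \mathcal{A}$ then $A$ and $X \setminus A$ are both elements of $\mathcal{A}'$. Thus consider $A \in \mathcal{A}$, so $A$ has the form $R_1 \cup \cdots \cup R_n$ with $R_j = \prod_{s\in S} E_{sj}$, where $E_{sj} \in \mathcal{E}_s$ for each $s \in S$, $j = 1, \ldots, n$. Moreover, there exists a finite subset $S'$ of $S$ such that $E_{sj} = X_s$ for all $s \notin S'$, $j = 1, \ldots, n$. For each $s \in S$ let $\mathcal{T}_s$ be the subset of $\mathcal{P}(X_s)$ consisting of the elements $E_{s1}, \ldots, E_{sn}$, and let $\mathcal{U}$ be the subset of $\mathcal{P}(X)$ consisting of all elements of the form $\prod_{s\in S} E_s$ with $E_s \in p(\mathcal{T}_s)$ for each $s$ (where $p(\mathcal{T}_s)$ is as in Lemma 2.2, and note that $p(\mathcal{T}_s) = \{X_s\}$ for each $s \notin S'$). Then $\mathcal{U} \subset \mathcal{R}$ (since $p(\mathcal{T}_s) \subset \mathcal{E}_s$ for each $s$) and by Lemma 2.2 (1) the elements of $\mathcal{U}$ form a partition of $X$. Furthermore, by Lemma 2.2 (2) it follows that if $U \in \mathcal{U}$ then either $U \subset A$ or $U \cap A = \emptyset$, and this implies that $A$ is the (disjoint) union of the elements of $\mathcal{U}$ it contains, and the same holds true of $X \setminus A$. In particular, $A$ and $X \setminus A$ are both elements of $\mathcal{A}'$.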

The product of the σ-algebras $\mathcal{E}_s$, $s \in S$, is defined to be the σ-algebra $\sigma(\mathcal{R})$, and this σ-algebra will be denoted by $\prod_{s\in S} \mathcal{E}_s$. Moreover, the product of the measurable spaces $(X_s, \mathcal{E}_s)$, $s \in S$, is defined to be the measurable space $(X, \mathcal{E})$, where $\mathcal{E} = \prod_{s\in S} \mathcal{E}_s$. For each $s \in S$ there is the projection mapping $p_s : X \to X_s$ given by $p_s(\{x_s\}_{s\in S}) = x_s$ for each $\{x_s\}_{s\in S} \in X$, and $p_s : (X, \mathcal{E}) \to (X_s, \mathcal{E}_s)$ is clearly measurable. Now consider a further measurable space $(Y, \mathcal{F})$ and for each $s \in S$ let $f_s : Y \to X_s$ be a mapping. Then there is a mapping $f : Y \to X$ defined by letting $f(y) = \{f_s(y)\}_{s\in S}$ for each $y \in Y$, and of course $f_s = p_s \circ f$ for each $s \in S$. Moreover, all mappings from $Y$ to $X$ can be obtained in this way.

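Proposition 2.5 Let $f : Y \to X$ be a mapping. Then $f : (Y, \mathcal{F}) \to (X, \mathcal{E})$ is measurable if and only if $p_s \circ f : (Y, \mathcal{F}) \to (X_s, \mathcal{E}_s)$ is measurable for each $s \in S$.

Proof The condition is clearly necessary since $p_s$ is measurable for each $s$. Thus suppose the mapping $p_s \circ f : Y \to X_s$ is measurable for each $s \in S$. Then, since

$$f^{-1}\Bigl(\prod_{s\in S} E_s\Bigr) = f^{-1}\Bigl(\bigcap_{s\in S} p_s^{-1}(E_s)\Bigr) = \bigcap_{s\in S} f^{-1}(p_s^{-1}(E_s)) = \bigcap_{s\in S} (p_s \circ f)^{-1}(E_s)$$

(and $(p_s \circ f)^{-1}(X_s) = Y$, so that only finitely many sets in the last intersection differ from $Y$), it follows that $f^{-1}(R) \in \mathcal{F}$ for each measurable rectangle $R$, and therefore by Lemma 2.4 $f$ is measurable.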

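Suppose now that we have two product spaces: For each $s \in S$ let $(X_s, \mathcal{E}_s)$ and $(Y_s, \mathcal{F}_s)$ be measurable spaces and let $X = \prod_{s\in S} X_s$, $\mathcal{E} = \prod_{s\in S} \mathcal{E}_s$, $Y = \prod_{s\in S} Y_s$ and $\mathcal{F} = \prod_{s\in S} \mathcal{F}_s$. Also for each $s \in S$ let $f_s : X_s \to Y_s$ be a mapping; then there is a mapping $f : X \to Y$ given by $f(\{x_s\}_{s\in S}) = \{f_s(x_s)\}_{s\in S}$ for all $\{x_s\}_{s\in S} \in X$.

Proposition 2.6 (1) If $f_s : (X_s, \mathcal{E}_s) \to (Y_s, \mathcal{F}_s)$ is measurable for each $s \in S$ then $f : (X, \mathcal{E}) \to (Y, \mathcal{F})$ is measurable.

(2) If $f_s^{-1}(\mathcal{F}_s) = \mathcal{E}_s$ for each $s \in S$ then $f^{-1}(\mathcal{F}) = \mathcal{E}$.

(3) If $f_s^{-1}(\mathcal{F}_s) = \mathcal{E}_s$ and $f_s(X_s) \in \mathcal{F}_s$ for each $s \in S$ and the set $S$ is countable then $f(X) \in \mathcal{F}$.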


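Proof (1) Let $\mathcal{R}_X$ (resp. $\mathcal{R}_Y$) be the measurable rectangles in $X$ (resp. in $Y$). If $R = \prod_{s\in S} F_s \in \mathcal{R}_Y$ then $f^{-1}(R) = \prod_{s\in S} f_s^{-1}(F_s) \in \mathcal{R}_X$ and thus $f^{-1}(\mathcal{R}_Y) \subset \mathcal{R}_X$. Therefore by Lemma 2.4 $f$ is measurable.

(2) Let $R = \prod_{s\in S} E_s \in \mathcal{R}_X$; then, since $f_s^{-1}(\mathcal{F}_s) = \mathcal{E}_s$, there exists $F_s \in \mathcal{F}_s$ with $f_s^{-1}(F_s) = E_s$ and, since $f_s^{-1}(Y_s) = X_s$, we can choose $F_s = Y_s$ whenever $E_s = X_s$. Thus $R' = \prod_{s\in S} F_s \in \mathcal{R}_Y$ and $f^{-1}(R') = \prod_{s\in S} f_s^{-1}(F_s) = \prod_{s\in S} E_s = R$ and this, together with (1), shows that $f^{-1}(\mathcal{R}_Y) = \mathcal{R}_X$. Hence by Proposition 2.4

$$f^{-1}(\mathcal{F}) = f^{-1}(\sigma(\mathcal{R}_Y)) = \sigma(f^{-1}(\mathcal{R}_Y)) = \sigma(\mathcal{R}_X) = \mathcal{E} .$$

(3) Clearly $f(X) = \prod_{s\in S} f_s(X_s)$. Thus if $S$ is finite then $f(X)$ is a measurable rectangle, hence assume $S$ is countably infinite. Let $\{s_n\}_{n\ge1}$ be an enumeration of the elements in $S$ and for each $n \ge 1$ put $R_n = \prod_{k=1}^{n} f_{s_k}(X_{s_k}) \times \prod_{k>n} Y_{s_k}$. Then $R_n$ is a measurable rectangle for each $n \ge 1$ and $f(X) = \bigcap_{n\ge1} R_n$. Therefore $f(X) \in \mathcal{F}$.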

We now discuss what are called sections, but only do this for the product of two spaces. Let $X$, $Y$, $Z$ be sets and let $f : X \times Y \to Z$ be a mapping. For each $x \in X$ let $f_x : Y \to Z$ be the mapping defined by $f_x(y) = f(x, y)$ for all $y \in Y$ and for each $y \in Y$ let $f^y : X \to Z$ be the mapping defined by $f^y(x) = f(x, y)$ for all $x \in X$. These mappings are known as sections.

Let $(X, \mathcal{E})$, $(Y, \mathcal{F})$ and $(Z, \mathcal{G})$ be measurable spaces.

Proposition 2.7 Let $f : (X \times Y, \mathcal{E} \times \mathcal{F}) \to (Z, \mathcal{G})$ be a measurable mapping. Then the mapping $f_x : (Y, \mathcal{F}) \to (Z, \mathcal{G})$ is measurable for each $x \in X$ and $f^y : (X, \mathcal{E}) \to (Z, \mathcal{G})$ is measurable for each $y \in Y$.

Proof If $B \subset X \times Y$ then for each $x \in X$ put $B_x = \{y \in Y : (x, y) \in B\}$ and for each $y \in Y$ put $B^y = \{x \in X : (x, y) \in B\}$; thus $(I_B)_x = I_{B_x}$ and $(I_B)^y = I_{B^y}$. Then $(f_x)^{-1}(G) = (f^{-1}(G))_x$ and $(f^y)^{-1}(G) = (f^{-1}(G))^y$ for all $G \in \mathcal{G}$, $x \in X$ and $y \in Y$. It is thus enough to show that if $x \in X$ and $y \in Y$ then $B_x \in \mathcal{F}$ and $B^y \in \mathcal{E}$ for all $B \in \mathcal{E} \times \mathcal{F}$. Let $S_x : Y \to X \times Y$ be the mapping given by $S_x(y) = (x, y)$ for each $y \in Y$; then

$$S_x^{-1}(E \times F) = \begin{cases} F & \text{if } x \in E , \\ \emptyset & \text{otherwise} , \end{cases}$$

which implies that $S_x^{-1}(R) \in \mathcal{F}$ for each measurable rectangle $R$, and therefore by Lemma 2.4 $S_x$ is measurable. But $B_x = S_x^{-1}(B)$, and hence $B_x \in \mathcal{F}$ for each $B \in \mathcal{E} \times \mathcal{F}$. In the same way $B^y \in \mathcal{E}$ for all $B \in \mathcal{E} \times \mathcal{F}$.
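Sections of a subset of a finite product are straightforward to compute. The following Python sketch is an illustration only; subsets of $X \times Y$ are represented as sets of pairs and the names `section_x`, `section_y` and `rectangle` are hypothetical.

```python
def section_x(B, x):
    """B_x = {y : (x, y) ∈ B} for a set B of pairs."""
    return {y for (u, y) in B if u == x}

def section_y(B, y):
    """B^y = {x : (x, y) ∈ B} for a set B of pairs."""
    return {x for (x, v) in B if v == y}

def rectangle(E, F):
    """The rectangle E × F as a set of pairs."""
    return {(x, y) for x in E for y in F}

R = rectangle({1, 2}, {"a", "b"})
B = R | {(3, "c")}
```

For the rectangle `R` the section at `x` is `{"a", "b"}` when `x` lies in `{1, 2}` and empty otherwise, which is exactly the case distinction for $S_x^{-1}(E \times F)$ used in the proof.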


Besides the product of measurable spaces there is the dual concept of a disjoint union. Let $S$ be a non-empty set and for each $s \in S$ let $(X_s, \mathcal{E}_s)$ be a measurable space; assume that the sets $X_s$, $s \in S$, are disjoint and put $X = \bigcup_{s\in S} X_s$. Let

$$\bigcup_{s\in S} \mathcal{E}_s = \{E \subset X : E \cap X_s \in \mathcal{E}_s \text{ for all } s \in S\} ;$$

then $\mathcal{E} = \bigcup_{s\in S} \mathcal{E}_s$ is clearly a σ-algebra and the measurable space $(X, \mathcal{E})$ is called the disjoint union of the measurable spaces $(X_s, \mathcal{E}_s)$, $s \in S$.

For each $s \in S$ the inclusion mapping $i_s : X_s \to X$ (with $i_s(x) = x$ for each $x \in X_s$) is measurable and in fact $i_s^{-1}(\mathcal{E}) = \mathcal{E}_s$. The result corresponding to Proposition 2.5 holds (but is rather trivial): Let $(Y, \mathcal{F})$ be a measurable space and let $f : X \to Y$ be a mapping. Then $f : (X, \mathcal{E}) \to (Y, \mathcal{F})$ is measurable if and only if $f \circ i_s : (X_s, \mathcal{E}_s) \to (Y, \mathcal{F})$ is measurable for each $s \in S$.

The result corresponding to Proposition 2.6 also holds. Suppose we have two disjoint unions: For each $s \in S$ let $(X_s, \mathcal{E}_s)$ and $(Y_s, \mathcal{F}_s)$ be measurable spaces with the families of sets $\{X_s\}_{s\in S}$ and $\{Y_s\}_{s\in S}$ both disjoint and let $(X, \mathcal{E})$ (resp. $(Y, \mathcal{F})$) be the disjoint union of the measurable spaces $(X_s, \mathcal{E}_s)$, $s \in S$ (resp. $(Y_s, \mathcal{F}_s)$, $s \in S$). Also for each $s \in S$ let $f_s : X_s \to Y_s$ be a mapping; then there is a mapping $f : X \to Y$ given by $f(x) = f_s(x)$ for all $x \in X_s$, $s \in S$.

Proposition 2.8 (1) If $f_s : (X_s, \mathcal{E}_s) \to (Y_s, \mathcal{F}_s)$ is measurable for each $s \in S$ then $f : (X, \mathcal{E}) \to (Y, \mathcal{F})$ is measurable.

(2) If $f_s^{-1}(\mathcal{F}_s) = \mathcal{E}_s$ for each $s \in S$ then $f^{-1}(\mathcal{F}) = \mathcal{E}$.

(3) If $f_s^{-1}(\mathcal{F}_s) = \mathcal{E}_s$ and $f_s(X_s) \in \mathcal{F}_s$ for each $s \in S$ and the set $S$ is countable then $f(X) \in \mathcal{F}$.

Proof This is very straightforward.

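Suppose now $S$ is countable. Let $(X, \mathcal{E})$ be the disjoint union of the measurable spaces $(X_s, \mathcal{E}_s)$, $s \in S$, and let $(Y, \mathcal{F})$ be a measurable space. Also for each $s \in S$ let $f_s : X_s \to Y$ be a mapping; then there is a mapping $f : X \to S \times Y$ given by $f(x) = (s, f_s(x))$ for all $x \in X_s$, $s \in S$.

Proposition 2.9 (1) If $f_s : (X_s, \mathcal{E}_s) \to (Y, \mathcal{F})$ is measurable for each $s \in S$ then $f : (X, \mathcal{E}) \to (S \times Y, \mathcal{P}(S) \times \mathcal{F})$ is measurable.

(2) If $f_s^{-1}(\mathcal{F}) = \mathcal{E}_s$ for each $s \in S$ then $f^{-1}(\mathcal{P}(S) \times \mathcal{F}) = \mathcal{E}$.

(3) If $f_s^{-1}(\mathcal{F}) = \mathcal{E}_s$ and $f_s(X_s) \in \mathcal{F}$ for each $s \in S$ then $f(X) \in \mathcal{P}(S) \times \mathcal{F}$.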


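Proof For each $s \in S$ let $Y_s = \{s\} \times Y$ and $\mathcal{F}_s = \{\{s\} \times F : F \in \mathcal{F}\}$, thus $\mathcal{F}_s$ is a σ-algebra of subsets of $Y_s$. Then $S \times Y$ is the disjoint union of the sets $Y_s$, $s \in S$. Let $\mathcal{D}$ be the σ-algebra $\bigcup_{s\in S} \mathcal{F}_s$, hence $A \subset S \times Y$ is an element of $\mathcal{D}$ if and only if $A_s \in \mathcal{F}$ for each $s \in S$, where $A_s$ is the section $\{y \in Y : (s, y) \in A\}$. But $\mathcal{D} = \mathcal{P}(S) \times \mathcal{F}$: If $A \in \mathcal{D}$ then $\{s\} \times A_s$ is a measurable rectangle in $S \times Y$ for each $s \in S$ and hence, since $S$ is countable, $A = \bigcup_{s\in S} (\{s\} \times A_s) \in \mathcal{P}(S) \times \mathcal{F}$. Conversely, if $A \in \mathcal{P}(S) \times \mathcal{F}$ then by Proposition 2.7 $A_s \in \mathcal{F}$ for each $s \in S$, and so $A \in \mathcal{D}$. This shows that $(S \times Y, \mathcal{P}(S) \times \mathcal{F})$ is the disjoint union of the measurable spaces $(Y_s, \mathcal{F}_s)$, $s \in S$. The result thus follows from Proposition 2.8, since the mapping $f : X \to S \times Y$ here corresponds to the mapping $f$ in Proposition 2.8.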

These results about disjoint unions are rather superficial and the only reason we have presented them is because they do play a role when studying point processes in [16].

Let us end the chapter by looking at the product of topological spaces; here there are two σ-algebras: The first is the product of the Borel σ-algebras, and the second is the Borel σ-algebra of the product topology, and we are interested in finding conditions which ensure that they are equal.

If $X$ is a topological space then a subset $\mathcal{O}'_X$ of $\mathcal{O}_X$ is a base for the topology if for each $U \in \mathcal{O}_X$ and each $x \in U$ there exists $V \in \mathcal{O}'_X$ with $x \in V \subset U$. If $\mathcal{O}'_X$ can be chosen to be countable then $X$ is said to have a countable base for its topology.

For each $s \in S$ let $X_s$ be a non-empty topological space and put $X = \prod_{s\in S} X_s$. A subset of $X$ is called an open rectangle if it has the form $\prod_{s\in S} U_s$ with $U_s$ an open subset of $X_s$ for each $s \in S$ and $U_s \ne X_s$ for only finitely many $s \in S$; the set of such open rectangles will be denoted by $\mathcal{R}^O$. The product topology on $X$ is defined by stipulating that $\mathcal{R}^O$ should be a base for the topology; a subset $U \subset X$ is thus open if for each $x \in U$ there exists $R \in \mathcal{R}^O$ with $x \in R \subset U$.

We now have the product σ-algebra $\mathcal{F} = \prod_{s\in S} \mathcal{B}_{X_s}$, with $\mathcal{B}_{X_s}$ the Borel σ-algebra of $X_s$ for each $s \in S$ and also the Borel σ-algebra $\mathcal{B}_X$ of $X$.

Proposition 2.10 If $S$ is countable and each $X_s$ has a countable base for its topology then $\mathcal{F} = \mathcal{B}_X$.

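Proof We first show that $\mathcal{F} \subset \mathcal{B}_X$ always holds (without any assumptions on $S$ and the $X_s$'s). For $s \in S$ let $\mathcal{R}_s$ denote the set of measurable rectangles having the form $\prod_{t\in S} B_t$ with $B_s \in \mathcal{B}_{X_s}$ and $B_t = X_t$ for $t \ne s$ and let $\mathcal{R}^O_s$ denote the set of open rectangles having the form $\prod_{t\in S} U_t$ with $U_s \in \mathcal{O}_{X_s}$ and $U_t = X_t$ for $t \ne s$. Moreover, let $p_s : X \to X_s$ be the projection mapping with $p_s(\{x_t\}_{t\in S}) = x_s$ for each $\{x_t\}_{t\in S} \in X$. Then $\mathcal{R}_s = p_s^{-1}(\mathcal{B}_{X_s})$ and $\mathcal{R}^O_s = p_s^{-1}(\mathcal{O}_{X_s})$ and therefore by Proposition 2.4 $\mathcal{R}_s = p_s^{-1}(\mathcal{B}_{X_s}) = p_s^{-1}(\sigma(\mathcal{O}_{X_s})) = \sigma(\mathcal{R}^O_s) \subset \sigma(\mathcal{O}_X) = \mathcal{B}_X$. But if $R$ is a measurable rectangle in $X$ then there exist $s_1, \ldots, s_n \in S$ and $R_k \in \mathcal{R}_{s_k}$, $1 \le k \le n$, such that $R = \bigcap_{k=1}^{n} R_k$. This shows that each measurable rectangle is in $\mathcal{B}_X$ and thus $\mathcal{F} \subset \mathcal{B}_X$.

We now need the assumptions. For each $s \in S$ let $\mathcal{V}_s$ be a countable base for the topology on $X_s$ and let $\mathcal{V}$ be the set of all open rectangles of the form $\prod_{s\in S} V_s$, where $V_s \in \mathcal{V}_s$ for all $s \in A$ for some finite set $A \subset S$ and $V_s = X_s$ for all $s \in S \setminus A$. Then $\mathcal{V}$ is countable and it is easy to see that $\mathcal{V}$ is a base for the topology on $X$. In particular this means that each open subset of $X$ can be written as a countable union of elements from $\mathcal{V}$, thus $\mathcal{O}_X \subset \sigma(\mathcal{V})$ and hence $\mathcal{B}_X = \sigma(\mathcal{O}_X) \subset \sigma(\mathcal{V})$. But each element of $\mathcal{V}$ is a measurable rectangle and therefore $\mathcal{B}_X \subset \mathcal{F}$. This, together with the first part, implies that $\mathcal{F} = \mathcal{B}_X$.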

3 Measures

Measures are introduced in this chapter and some of their elementary properties studied. These measures are usually defined on σ-algebras, but it is also necessary to look at measures defined just on algebras.

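In what follows let $X$ be a non-empty set. Let $\mathcal{S}$ be a non-empty subset of $\mathcal{P}(X)$. A mapping $\mu : \mathcal{S} \to \mathbb{R}^+_\infty$ is said to be

- sub-additive if $\mu(A \cup B) \le \mu(A) + \mu(B)$ for all $A, B \in \mathcal{S}$ with $A \cup B \in \mathcal{S}$.

- additive if $\mu(A \cup B) = \mu(A) + \mu(B)$ for all $A, B \in \mathcal{S}$ with $A \cup B \in \mathcal{S}$ and $A \cap B = \emptyset$.

- countably sub-additive if

$$\mu\Bigl(\bigcup_{n\ge1} A_n\Bigr) \le \sum_{n\ge1} \mu(A_n)$$

for each sequence $\{A_n\}_{n\ge1}$ in $\mathcal{S}$ with $\bigcup_{n\ge1} A_n \in \mathcal{S}$.

- countably additive if

$$\mu\Bigl(\bigcup_{n\ge1} A_n\Bigr) = \sum_{n\ge1} \mu(A_n)$$

for each disjoint sequence $\{A_n\}_{n\ge1}$ in $\mathcal{S}$ with $\bigcup_{n\ge1} A_n \in \mathcal{S}$.

- monotone if $\mu(A) \le \mu(B)$ for all $A, B \in \mathcal{S}$ with $A \subset B$.

- continuous if it is monotone and

$$\mu\Bigl(\bigcup_{n\ge1} A_n\Bigr) = \lim_{n\to\infty} \mu(A_n)$$

for every increasing sequence $\{A_n\}_{n\ge1}$ from $\mathcal{S}$ with $\bigcup_{n\ge1} A_n \in \mathcal{S}$.

- ∅-continuous if it is monotone and $\lim_n \mu(A_n) = 0$ whenever $\{A_n\}_{n\ge1}$ is a decreasing sequence from $\mathcal{S}$ with $\bigcap_{n\ge1} A_n = \emptyset$.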

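It is clear that if $\mu : \mathcal{S} \to \mathbb{R}^+_\infty$ has one of these properties then the restriction of $\mu$ to a subset $\mathcal{T}$ of $\mathcal{S}$ has the same property. It is also clear that if $\emptyset \in \mathcal{S}$ then any countably sub-additive mapping $\mu : \mathcal{S} \to \mathbb{R}^+_\infty$ with $\mu(\emptyset) = 0$ is sub-additive and every countably additive mapping $\mu : \mathcal{S} \to \mathbb{R}^+_\infty$ with $\mu(\emptyset) = 0$ is additive.

If $\mathcal{S}$ is a subset of $\mathcal{P}(X)$ with $\emptyset, X \in \mathcal{S}$ then an additive mapping $\mu : \mathcal{S} \to \mathbb{R}^+_\infty$ with $\mu(\emptyset) = 0$ is said to be a finitely additive measure on $\mathcal{S}$. A countably additive mapping $\mu : \mathcal{S} \to \mathbb{R}^+_\infty$ with $\mu(\emptyset) = 0$ is called a measure on $\mathcal{S}$. A measure is always a finitely additive measure (just take $A_n = \emptyset$ for each $n \ge 3$).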
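On a finite set the definitions above can be checked by brute force. The following sketch is only an illustration: the set X, the weights and the helper names (`powerset`, `is_additive`, `is_monotone`) are assumptions made for this example and do not occur in the notes. It compares a weighted counting measure with the mapping A ↦ |A|², which is monotone but not additive.

```python
from fractions import Fraction
from itertools import combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = sorted(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def is_additive(mu, family):
    """Check mu(A | B) == mu(A) + mu(B) for disjoint A, B in family with A | B in family."""
    fam = set(family)
    return all(mu(a | b) == mu(a) + mu(b)
               for a in fam for b in fam
               if not (a & b) and (a | b) in fam)

def is_monotone(mu, family):
    """Check mu(A) <= mu(B) whenever A is a subset of B."""
    fam = set(family)
    return all(mu(a) <= mu(b) for a in fam for b in fam if a <= b)

X = frozenset({0, 1, 2})
subsets = powerset(X)
weights = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}  # illustrative weights

def mu(a):
    """Weighted counting measure: sum of the weights of the points of a."""
    return sum((weights[x] for x in a), Fraction(0))

def nu(a):
    """Square of the cardinality: monotone, but not additive."""
    return Fraction(len(a) ** 2)

additive_mu = is_additive(mu, subsets)
additive_nu = is_additive(nu, subsets)
monotone_nu = is_monotone(nu, subsets)
```

Since P(X) is finite here, additivity together with µ(∅) = 0 already makes `mu` a measure; the distinction between finitely additive measures and measures only shows up on infinite families.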

In the following let A ⊂ P(X) be an algebra.


Lemma 3.1 A finitely additive measure µ on A is monotone.

Proof Let A, B ∈ A with A ⊂ B. Then B \ A ∈ A and thus

µ(A) ≤ µ(A) + µ(B \ A) = µ(A ∪ (B \ A)) = µ(B) ,

since µ is additive. Thus µ is monotone.

Proposition 3.1 A finitely additive measure µ on A is a measure if and only if it is continuous.

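Proof Suppose first that $\mu$ is a measure and let $\{A_n\}_{n\ge1}$ be an increasing sequence of elements from $\mathcal{A}$ with $A = \bigcup_{n\ge1} A_n \in \mathcal{A}$; then $A$ is the disjoint union of the sets $A_m \setminus A_{m-1}$, $m \ge 1$ (with $A_0 = \emptyset$), and these sets are elements of $\mathcal{A}$. Hence

$$\mu(A) = \sum_{m\ge1} \mu(A_m \setminus A_{m-1}) = \lim_{n\to\infty} \sum_{m=1}^{n} \mu(A_m \setminus A_{m-1}) = \lim_{n\to\infty} \mu\Bigl(\bigcup_{m=1}^{n} (A_m \setminus A_{m-1})\Bigr) = \lim_{n\to\infty} \mu(A_n) .$$

Conversely, suppose that $\mu$ is continuous and let $\{A_n\}_{n\ge1}$ be a disjoint sequence of elements from $\mathcal{A}$ with $A = \bigcup_{n\ge1} A_n \in \mathcal{A}$. For each $n \ge 1$ let $A'_n = \bigcup_{m=1}^{n} A_m$; then $\{A'_n\}_{n\ge1}$ is an increasing sequence of elements from $\mathcal{A}$ with $\bigcup_{n\ge1} A'_n = A$, and therefore

$$\mu(A) = \mu\Bigl(\bigcup_{n\ge1} A'_n\Bigr) = \lim_{n\to\infty} \mu(A'_n) = \lim_{n\to\infty} \mu\Bigl(\bigcup_{m=1}^{n} A_m\Bigr) = \lim_{n\to\infty} \sum_{m=1}^{n} \mu(A_m) = \sum_{n\ge1} \mu(A_n) ;$$

i.e., $\mu$ is a measure.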

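Lemma 3.2 Let $\mu$ be a measure on $\mathcal{A}$ and $\{A_n\}_{n\ge1}$ be a decreasing sequence from $\mathcal{A}$ with $A = \bigcap_{n\ge1} A_n \in \mathcal{A}$ and $\mu(A_1) < \infty$. Then $\lim_n \mu(A_n) = \mu(A)$.

Proof For $n \ge 1$ let $B_n = A_1 \setminus A_n$; then $\{B_n\}_{n\ge1}$ is an increasing sequence from $\mathcal{A}$ with $\bigcup_{n\ge1} B_n = A_1 \setminus A$. Thus by Proposition 3.1 $\lim_n \mu(B_n) = \mu(A_1 \setminus A)$. But $\mu(A_1 \setminus A) = \mu(A_1) - \mu(A)$ and $\mu(B_n) = \mu(A_1) - \mu(A_n)$ for each $n \ge 1$, since $\mu(A_1) < \infty$, and therefore $\lim_n \mu(A_n) = \mu(A)$.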

A measure or a finitely additive measure µ on A is finite if µ(X) < ∞. If µ(X) = 1 then a measure µ on A is called a probability measure.

Proposition 3.2 A finite finitely additive measure $\mu$ on $\mathcal{A}$ is a measure if and only if it is ∅-continuous.

Proof Lemma 3.2 implies that a finite measure is ∅-continuous. Thus let $\mu$ be a finite finitely additive measure on $\mathcal{A}$ which is ∅-continuous. Let $\{A_n\}_{n\ge1}$ be an increasing sequence from $\mathcal{A}$ with $A = \bigcup_{n\ge1} A_n \in \mathcal{A}$. For each $n \ge 1$ put $B_n = A \setminus A_n$; then $\{B_n\}_{n\ge1}$ is a decreasing sequence from $\mathcal{A}$ with $\bigcap_{n\ge1} B_n = \emptyset$ and therefore $\lim_n \mu(B_n) = 0$. But $\mu(A) = \mu(B_n) + \mu(A_n)$ for each $n \ge 1$, since $\mu$ is additive, and hence $\mu(A) = \lim_n (\mu(B_n) + \mu(A_n)) = \lim_n \mu(A_n)$. This shows that $\mu$ is continuous and hence by Proposition 3.1 $\mu$ is a measure on $\mathcal{A}$.

Lemma 3.3 A measure $\mu$ on $\mathcal{A}$ is countably sub-additive. In particular, if $\{A_n\}_{n\ge1}$ is a sequence from $\mathcal{A}$ with $\mu(A_n) = 0$ for each $n \ge 1$ and $A = \bigcup_{n\ge1} A_n$ is an element of $\mathcal{A}$ then $\mu(A) = 0$.

Proof Let $\{A_n\}_{n\ge1}$ be a sequence from $\mathcal{A}$ with $A = \bigcup_{n\ge1} A_n \in \mathcal{A}$. Then there exists by Lemma 2.1 (2) a disjoint sequence $\{B_n\}_{n\ge1}$ from $\mathcal{A}$ with $B_n \subset A_n$ for each $n \ge 1$ and $\bigcup_{n\ge1} B_n = A$. Then $\mu(B_n) \le \mu(A_n)$ for each $n \ge 1$, since $\mu$ is monotone, and $\mu(A) = \sum_{n\ge1} \mu(B_n)$, since $\mu$ is countably additive. Hence $\mu(A) \le \sum_{n\ge1} \mu(A_n)$, which implies that $\mu$ is countably sub-additive.
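The disjoint sequence used in this proof can be produced by the usual construction B_n = A_n \ (A_1 ∪ ... ∪ A_{n−1}), which is also the one written out in the proof of Lemma 4.3. The sketch below applies it to a finite list of finite sets; the function name `disjointify` and the sample sets are assumptions made for illustration.

```python
def disjointify(sets):
    """Return B_1, B_2, ... with B_n = A_n minus the union of A_1, ..., A_{n-1}."""
    seen = set()      # union of the sets processed so far
    result = []
    for a in sets:
        a = set(a)
        result.append(a - seen)
        seen |= a
    return result

A = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4, 5}]
B = disjointify(A)

# With the counting measure, mu(union) equals the sum of the mu(B_n),
# which is at most the sum of the mu(A_n).
union = set().union(*A)
sum_B = sum(len(b) for b in B)
sum_A = sum(len(a) for a in A)
```

For the counting measure this gives µ(A) = Σ µ(B_n) ≤ Σ µ(A_n), which is the inequality of the lemma in a finite setting.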

If $\mu$ is a probability measure on $\mathcal{A}$ then Lemma 3.3 is usually applied to the complements of the sets in the lemma, i.e., in the form: If $\{B_n\}_{n\ge1}$ is a sequence from $\mathcal{A}$ with $\mu(B_n) = 1$ for each $n \ge 1$ and $B = \bigcap_{n\ge1} B_n \in \mathcal{A}$ then $\mu(B) = 1$.

Now let $(X, \mathcal{E})$ be a measurable space (and so $\mathcal{E}$ is a σ-algebra). Then $\mathcal{E}$ is in particular an algebra and so the above results are still valid when $\mathcal{A}$ is replaced by $\mathcal{E}$. (Note that in this case the requirements involving $\bigcap_{n\ge1} A_n$ and $\bigcup_{n\ge1} A_n$ in Lemmas 3.2 and 3.3 are then automatically satisfied.)

If $\mu_1$ and $\mu_2$ are measures on $\mathcal{E}$ and $a_1, a_2 \in \mathbb{R}^+$ then the linear combination $a_1\mu_1 + a_2\mu_2$ is also a measure, where $(a_1\mu_1 + a_2\mu_2)(E) = a_1\mu_1(E) + a_2\mu_2(E)$ for all $E \in \mathcal{E}$.

Proposition 3.3 below is a useful criterion for determining that two finite measures are equal.

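Lemma 3.4 Let $\mu_1, \mu_2$ be measures on $\mathcal{E}$ and let $S \in \mathcal{E}$ be such that the numbers $\mu_1(S)$ and $\mu_2(S)$ are finite and equal. Then

$$\mathcal{D} = \{A \in \mathcal{E} : \mu_1(A \cap S) = \mu_2(A \cap S)\}$$

is both a d-system and a monotone class.

Proof If $A_1, A_2 \in \mathcal{D}$ with $A_1 \subset A_2$ then

$$\begin{aligned}
\mu_2((A_2 \setminus A_1) \cap S) &= \mu_2((A_2 \cap S) \setminus (A_1 \cap S)) \\
&= \mu_2(A_2 \cap S) - \mu_2(A_1 \cap S) = \mu_1(A_2 \cap S) - \mu_1(A_1 \cap S) \\
&= \mu_1((A_2 \cap S) \setminus (A_1 \cap S)) = \mu_1((A_2 \setminus A_1) \cap S)
\end{aligned}$$

and so $A_2 \setminus A_1 \in \mathcal{D}$. Thus $\mathcal{D}$ is a d-system, since clearly $X \in \mathcal{D}$. If $\{A_n\}_{n\ge1}$ is an increasing sequence from $\mathcal{D}$ and $A = \bigcup_{n\ge1} A_n$ then $\{A_n \cap S\}_{n\ge1}$ is an increasing sequence from $\mathcal{E}$ with $A \cap S = \bigcup_{n\ge1} (A_n \cap S)$ and hence by Proposition 3.1

$$\mu_1(A \cap S) = \lim_{n\to\infty} \mu_1(A_n \cap S) = \lim_{n\to\infty} \mu_2(A_n \cap S) = \mu_2(A \cap S)$$

and so $A \in \mathcal{D}$. On the other hand, if $\{A_n\}_{n\ge1}$ is a decreasing sequence from $\mathcal{D}$ and $A = \bigcap_{n\ge1} A_n$ then in the same way Lemma 3.2 shows that $A \in \mathcal{D}$ (note that $\mu_i(A_1 \cap S) \le \mu_i(S) < \infty$). This implies that $\mathcal{D}$ is also a monotone class.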

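Lemma 3.5 Let $\mathcal{S}$ be a subset of $\mathcal{E}$ closed under finite intersections and such that there exists an increasing sequence $\{S_n\}_{n\ge1}$ from $\mathcal{S}$ with $\bigcup_{n\ge1} S_n = X$. Let $\mu_1, \mu_2$ be measures on $\mathcal{E}$ and suppose that the numbers $\mu_1(S)$ and $\mu_2(S)$ are equal and finite for all $S \in \mathcal{S}$. Then $\mu_1(E) = \mu_2(E)$ for all $E \in \sigma(\mathcal{S})$. In particular, $\mu_1 = \mu_2$ if $\sigma(\mathcal{S}) = \mathcal{E}$.

Proof Fix $S \in \mathcal{S}$ and let $\mathcal{D} = \{A \in \mathcal{E} : \mu_1(A \cap S) = \mu_2(A \cap S)\}$; by Lemma 3.4 $\mathcal{D}$ is a d-system and thus $d(\mathcal{S}) \subset \mathcal{D}$, since $\mathcal{S} \subset \mathcal{D}$. Hence $m(d(\mathcal{S})) \subset \mathcal{D}$, since by Lemma 3.4 $\mathcal{D}$ is also a monotone class. But by Proposition 2.2 $d(\mathcal{S})$ is an algebra and so by Proposition 2.1 $m(d(\mathcal{S})) = \sigma(d(\mathcal{S}))$. Hence $\sigma(\mathcal{S}) \subset \sigma(d(\mathcal{S})) \subset \mathcal{D}$, which implies that $\mu_1(E \cap S) = \mu_2(E \cap S)$ for all $E \in \sigma(\mathcal{S})$. This shows that $\mu_1(E \cap S) = \mu_2(E \cap S)$ for all $E \in \sigma(\mathcal{S})$ and all $S \in \mathcal{S}$. Now let $\{S_n\}_{n\ge1}$ be an increasing sequence from $\mathcal{S}$ with $\bigcup_{n\ge1} S_n = X$, and let $E \in \sigma(\mathcal{S})$. Then $\{E \cap S_n\}_{n\ge1}$ is an increasing sequence from $\mathcal{E}$ with $\bigcup_{n\ge1} (E \cap S_n) = E$ and thus by Proposition 3.1

$$\mu_1(E) = \lim_{n\to\infty} \mu_1(E \cap S_n) = \lim_{n\to\infty} \mu_2(E \cap S_n) = \mu_2(E) .$$

Therefore $\mu_1(E) = \mu_2(E)$ for all $E \in \sigma(\mathcal{S})$.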

Proposition 3.3 Let S ⊂ E be closed under finite intersections with X ∈ S and σ(S) = E. Let µ1, µ2 be finite measures on E with µ1(A) = µ2(A) for all A ∈ S. Then µ1 = µ2.

Proof This is a special case of Lemma 3.5.
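The assumption that the generating family is closed under finite intersections cannot simply be dropped. The following small example is an illustration added here, not taken from the notes; the weights and helper names are assumptions. It exhibits two probability measures on a four-point set which agree on a generating family that is not closed under intersections, and yet differ.

```python
from fractions import Fraction

X = frozenset({0, 1, 2, 3})
half, quarter, zero = Fraction(1, 2), Fraction(1, 4), Fraction(0)

w1 = {0: quarter, 1: quarter, 2: quarter, 3: quarter}   # uniform weights
w2 = {0: half, 1: zero, 2: half, 3: zero}               # mass only on 0 and 2

def measure(weights):
    """Measure on P(X) determined by point weights."""
    return lambda a: sum((weights[x] for x in a), Fraction(0))

mu1, mu2 = measure(w1), measure(w2)

def closed_under_intersection(family):
    """True if A & B lies in family for all A, B in family."""
    fam = set(family)
    return all((a & b) in fam for a in fam for b in fam)

# A family containing X whose generated sigma-algebra is all of P(X),
# but which does not contain {0,1} & {1,2} = {1}.
S = [frozenset({0, 1}), frozenset({1, 2}), X]

agree_on_S = all(mu1(a) == mu2(a) for a in S)
is_pi_system = closed_under_intersection(S)
differ_on_point = mu1(frozenset({1})) != mu2(frozenset({1}))
```

Adding {1} to the family makes it closed under finite intersections, and then the two measures no longer agree on it, in line with Proposition 3.3.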

Proposition 3.3 will often be applied when $\mathcal{A}$ is an algebra with $\sigma(\mathcal{A}) = \mathcal{E}$. Then $\mathcal{A}$ is closed under finite intersections and $X \in \mathcal{A}$; thus if $\mu_1, \mu_2$ are finite measures on $\mathcal{E}$ with $\mu_1(A) = \mu_2(A)$ for all $A \in \mathcal{A}$ then $\mu_1 = \mu_2$. Proposition 3.4 below shows that this fact also holds for σ-finite measures: A measure $\mu$ on $\mathcal{A}$ is said to be σ-finite if there exists a sequence $\{A_n\}_{n\ge1}$ from $\mathcal{A}$ with $\mu(A_n) < \infty$ for each $n \ge 1$ and $X = \bigcup_{n\ge1} A_n$. When dealing with σ-finite measures it is often convenient to choose the sequence $\{A_n\}_{n\ge1}$ here either to be increasing or to be disjoint, and clearly both of these alternatives are possible (in the latter case making use of Lemma 2.1 (2)).

Proposition 3.4 Let $\mathcal{A} \subset \mathcal{E}$ be an algebra with $\sigma(\mathcal{A}) = \mathcal{E}$ and $\mu$ be a σ-finite measure on $\mathcal{A}$. Let $\nu_1, \nu_2$ be measures on $\mathcal{E}$ with $\nu_1(A) = \nu_2(A) = \mu(A)$ for all $A \in \mathcal{A}$. Then $\nu_1 = \nu_2$.

Proof Let $\mathcal{S} = \{S \in \mathcal{A} : \mu(S) < \infty\}$. Then $\mathcal{S}$ is a subset of $\mathcal{E}$ closed under finite intersections and, since $\mu$ is σ-finite, there exists an increasing sequence $\{S_n\}_{n\ge1}$ from $\mathcal{S}$ with $\bigcup_{n\ge1} S_n = X$. In particular, $\mathcal{A} \subset \sigma(\mathcal{S})$ and so $\sigma(\mathcal{S}) = \mathcal{E}$. Moreover, for each $S \in \mathcal{S}$ the numbers $\nu_1(S)$ and $\nu_2(S)$ are finite and equal (since they are both equal to $\mu(S)$). Hence by Lemma 3.5 $\nu_1(E) = \nu_2(E)$ for all $E \in \sigma(\mathcal{S}) = \mathcal{E}$, i.e., $\nu_1 = \nu_2$.

There are good reasons for studying σ-finite measures: First, Lebesgue measure on the real line (perhaps the most fundamental measure of all) is σ-finite but not finite. Second, there are several important results which hold for σ-finite measures (for example Fubini’s theorem and the Radon-Nikodym theorem) which do not hold for general measures.

The following is a version of Proposition 3.2 for σ-finite measures which will be needed in Chapter 5.

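Lemma 3.6 Let $\mathcal{A} \subset \mathcal{P}(X)$ be an algebra and let $\mu$ be a finitely additive measure on $\mathcal{A}$. Let $\{C_n\}_{n\ge1}$ be an increasing sequence from $\mathcal{A}$ with $X = \bigcup_{n\ge1} C_n$ and $\mu(C_n) < \infty$ for each $n \ge 1$ such that

(1) $\lim_n \mu(A \cap C_n) = \mu(A)$ for all $A \in \mathcal{A}$.

(2) If $\{A_n\}_{n\ge1}$ is a decreasing sequence from $\mathcal{A}$ with $\bigcap_{n\ge1} A_n = \emptyset$ and $A_1 \subset C_p$ for some $p \ge 1$ then $\lim_n \mu(A_n) = 0$.

Then $\mu$ is a measure on $\mathcal{A}$.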

Proof Let $\{A_n\}_{n\ge1}$ be an increasing sequence from $\mathcal{A}$ with $A = \bigcup_{n\ge1} A_n \in \mathcal{A}$. Fix $p \ge 1$, put $A' = A \cap C_p$ and let $A'_n = A_n \cap C_p$ for each $n \ge 1$; then $\{A'_n\}_{n\ge1}$ is an increasing sequence from $\mathcal{A}$ with $A' = \bigcup_{n\ge1} A'_n \in \mathcal{A}$. Now for each $n \ge 1$ put $B_n = A' \setminus A'_n$; then $\{B_n\}_{n\ge1}$ is a decreasing sequence from $\mathcal{A}$ with $\bigcap_{n\ge1} B_n = \emptyset$ and $B_1 \subset C_p$ and hence by (2) $\lim_n \mu(B_n) = 0$. Now $\mu(A') = \mu(B_n) + \mu(A'_n)$ for each $n \ge 1$, since $\mu$ is additive, and thus

$$\mu(A \cap C_p) = \mu(A') = \lim_{n\to\infty} (\mu(B_n) + \mu(A'_n)) = \lim_{n\to\infty} \mu(A'_n) \le \lim_{n\to\infty} \mu(A_n) .$$

It therefore follows from (1) that $\mu(A) = \lim_p \mu(A \cap C_p) \le \lim_n \mu(A_n)$. But $\mu(A_n) \le \mu(A)$ for all $n \ge 1$, since $\mu$ is monotone, and hence also $\lim_n \mu(A_n) \le \mu(A)$. This shows that $\mu$ is continuous and so by Proposition 3.1 $\mu$ is a measure on $\mathcal{A}$.

4 The Caratheodory extension theorem

In this chapter we prove the Caratheodory extension theorem (Theorem 4.2). The main part of this result (which also occurs as Theorem 4.1) states that if A ⊂ P(X) is an algebra then any measure on A extends to a measure on σ(A). Another proof of Theorem 4.1 will be given in Chapter 11 as a by-product of the construction of the Daniell integral.

In what follows let X be a non-empty set and let A ⊂ P(X) be an algebra.

Theorem 4.1 Every measure µ on A can be extended to a measure on σ(A): There exists a measure ν on σ(A) such that ν(A) = µ(A) for all A ∈ A. Moreover, if µ is σ-finite then ν is uniquely determined by µ.

Proof The existence of ν is part of Theorem 4.2. The uniqueness for a σ-finite measure µ follows immediately from Proposition 3.4.

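A monotone countably sub-additive mapping $\mu : \mathcal{P}(X) \to \mathbb{R}^+_\infty$ with $\mu(\emptyset) = 0$ is called an outer measure.

Let $\mathcal{S}$ be a subset of $\mathcal{P}(X)$ containing $\emptyset$ and $X$ and let $\mu : \mathcal{S} \to \mathbb{R}^+_\infty$ be a measure on $\mathcal{S}$. Define a mapping $\mu^* : \mathcal{P}(X) \to \mathbb{R}^+_\infty$ by

$$\mu^*(A) = \inf\Bigl\{ \sum_{n\ge1} \mu(A_n) : \{A_n\}_{n\ge1} \text{ is a sequence from } \mathcal{S} \text{ with } A \subset \bigcup_{n\ge1} A_n \Bigr\} .$$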

Lemma 4.1 The mapping µ∗ is an outer measure.

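Proof It is clear that $\mu^*$ is monotone and that $\mu^*(\emptyset) = 0$ (since $\mu(\emptyset) = 0$). It thus remains to show that $\mu^*$ is countably sub-additive.

Let $\{S_n\}_{n\ge1}$ be a sequence from $\mathcal{P}(X)$ and put $S = \bigcup_{n\ge1} S_n$. If $\mu^*(S_n) = \infty$ for some $n$ then also $\mu^*(S) = \infty$, since $S_n \subset S$ and $\mu^*$ is monotone, and in this case $\mu^*(S) = \infty = \sum_{n\ge1} \mu^*(S_n)$. We can thus assume that $\mu^*(S_n) < \infty$ for each $n \ge 1$. Let $\varepsilon > 0$; then for each $n \ge 1$ there is a sequence $\{A_{n,k}\}_{k\ge1}$ from $\mathcal{S}$ with $S_n \subset \bigcup_{k\ge1} A_{n,k}$ and $\sum_{k\ge1} \mu(A_{n,k}) < \mu^*(S_n) + 2^{-n}\varepsilon$.

Let $h : \mathbb{N}^+ \to \mathbb{N}^+ \times \mathbb{N}^+$ be any bijective mapping and let $\{B_m\}_{m\ge1}$ be the sequence from $\mathcal{S}$ with $B_m = A_{h(m)}$ for each $m \ge 1$. Let $M \ge 1$; then there exist $N \ge 1$, $K \ge 1$ so that $h(m) \in \{1, \ldots, N\} \times \{1, \ldots, K\}$ for all $m \in \{1, \ldots, M\}$ and thus

$$\begin{aligned}
\sum_{m=1}^{M} \mu(B_m) = \sum_{m=1}^{M} \mu(A_{h(m)}) &\le \sum_{n=1}^{N} \sum_{k=1}^{K} \mu(A_{n,k}) \\
&\le \sum_{n=1}^{N} (\mu^*(S_n) + 2^{-n}\varepsilon) \le \sum_{n=1}^{N} \mu^*(S_n) + \varepsilon \le \sum_{n\ge1} \mu^*(S_n) + \varepsilon .
\end{aligned}$$

Therefore $\sum_{m\ge1} \mu(B_m) \le \sum_{n\ge1} \mu^*(S_n) + \varepsilon$. But $S = \bigcup_{n\ge1} S_n \subset \bigcup_{m\ge1} B_m$ (since for each $n \ge 1$ and each $x \in S_n$ there exists $k \ge 1$ with $x \in A_{n,k}$ and hence there exists $m \ge 1$ with $x \in B_m$) and it follows that

$$\mu^*(S) \le \sum_{m\ge1} \mu(B_m) \le \sum_{n\ge1} \mu^*(S_n) + \varepsilon .$$

Since $\varepsilon > 0$ is arbitrary we have $\mu^*(S) \le \sum_{n\ge1} \mu^*(S_n)$ and this shows that $\mu^*$ is countably sub-additive.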
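The proof works with any bijection h : N+ → N+ × N+. One concrete choice, given here only as an illustration, runs through the pairs along anti-diagonals; the generator name `enumerate_pairs` is an assumption of this sketch.

```python
from itertools import count, islice

def enumerate_pairs():
    """Yield h(1), h(2), ... running through N+ x N+ along the anti-diagonals n + k = const."""
    for total in count(2):              # total = n + k takes the values 2, 3, 4, ...
        for n in range(1, total):
            yield (n, total - n)

# The first 45 values are exactly the pairs (n, k) with n + k <= 10.
pairs = list(islice(enumerate_pairs(), 45))

# For a given M the proof needs N, K with h(m) in {1..N} x {1..K} for all m <= M;
# the maxima of the two coordinates will do.
M = 20
N = max(n for n, _ in pairs[:M])
K = max(k for _, k in pairs[:M])
```

Every pair (n, k) is reached after finitely many steps and no pair is repeated, which is all that the argument uses.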

The mapping µ∗ is called the outer measure generated by µ.

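Theorem 4.2 Let $\mu$ be a measure on $\mathcal{A}$ and let $\mu^*$ be the outer measure generated by $\mu$. Put

$$\mathcal{B} = \{S \in \mathcal{P}(X) : \mu^*(T \cap S) + \mu^*(T \setminus S) = \mu^*(T) \text{ for all } T \in \mathcal{P}(X)\} .$$

Then:

(1) $\mathcal{B}$ is a σ-algebra with $\mathcal{A} \subset \mathcal{B}$.

(2) The restriction of $\mu^*$ to $\mathcal{B}$ is a measure.

(3) $\mu^*(A) = \mu(A)$ for all $A \in \mathcal{A}$.

(4) If $B \in \mathcal{B}$ with $\mu^*(B) < \infty$ then for each $\varepsilon > 0$ there exists a sequence $\{A_n\}_{n\ge1}$ from $\mathcal{A}$ with $B \subset \bigcup_{n\ge1} A_n$ such that $\mu^*\bigl(\bigcup_{n\ge1} A_n \setminus B\bigr) < \varepsilon$.

(5) If $B \in \mathcal{B}$ with $\mu^*(B) < \infty$ then for each $\varepsilon > 0$ there exists $A \in \mathcal{A}$ such that $\mu^*((A \setminus B) \cup (B \setminus A)) < \varepsilon$.

Let $\nu$ be the restriction of $\mu^*$ to $\mathcal{B}$; then the measure $\nu$ on $\mathcal{B}$ is called the Caratheodory extension of the measure $\mu$ on $\mathcal{A}$.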
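On a finite set the outer measure and the σ-algebra B of Theorem 4.2 can be computed by exhaustive search. The sketch below is a toy illustration under stated assumptions: the set X, the algebra with atoms {0,1} and {2,3} and the values of µ are made up for the example, and the infimum over sequences is replaced by a minimum over finite sub-families, which is harmless here because the algebra is finite and repeating a set in a cover only increases the sum.

```python
from itertools import combinations

def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = sorted(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

X = frozenset({0, 1, 2, 3})
# Hypothetical measure on the algebra generated by the atoms {0, 1} and {2, 3}.
mu = {frozenset(): 0, frozenset({0, 1}): 1, frozenset({2, 3}): 2, X: 3}

def outer(t):
    """mu*(t): the cheapest cover of t by members of the algebra."""
    members = list(mu)
    best = float("inf")
    for r in range(1, len(members) + 1):
        for cover in combinations(members, r):
            if t <= frozenset().union(*cover):
                best = min(best, sum(mu[a] for a in cover))
    return best

def caratheodory_measurable(s):
    """Check mu*(T & S) + mu*(T - S) == mu*(T) for every subset T of X."""
    return all(outer(t & s) + outer(t - s) == outer(t) for t in powerset(X))

measurable = {s for s in powerset(X) if caratheodory_measurable(s)}
```

In this example both atoms have positive measure, so the measurable sets turn out to be exactly the four members of the algebra; a set such as {0} splits the atom {0,1} and fails the defining identity with T = {0,1}.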

As preparation for the proof of Theorem 4.2 we need a couple of lemmas.

Lemma 4.2 Let λ : P(X) → R+∞ be an outer measure and put

B = {S ∈ P(X) : λ(T ∩ S) + λ(T \ S) = λ(T ) for all T ∈ P(X)} .

Then B is a σ-algebra and the restriction of λ to B is a measure.

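Proof (1) It is clear that $\emptyset \in \mathcal{B}$ and so in particular $\mathcal{B} \ne \emptyset$.

(2) $X \setminus S \in \mathcal{B}$ for each $S \in \mathcal{B}$, since $T \cap (X \setminus S) = T \setminus S$ and $T \setminus (X \setminus S) = T \cap S$ for all $T \in \mathcal{P}(X)$.

(3) Let $S_1, S_2 \in \mathcal{B}$; then for all $T \in \mathcal{P}(X)$

$$\begin{aligned}
\lambda(T) &= \lambda(T \cap S_1) + \lambda(T \setminus S_1) \\
&= \lambda(T \cap S_1) + \lambda((T \setminus S_1) \cap S_2) + \lambda((T \setminus S_1) \setminus S_2) \\
&= \lambda(T \cap (S_1 \cup S_2) \cap S_1) + \lambda((T \cap (S_1 \cup S_2)) \setminus S_1) + \lambda(T \setminus (S_1 \cup S_2)) ,
\end{aligned}$$

since $(T \setminus S_1) \cap S_2 = (T \cap (S_1 \cup S_2)) \setminus S_1$, $(T \setminus S_1) \setminus S_2 = T \setminus (S_1 \cup S_2)$ and also $S_1 = (S_1 \cup S_2) \cap S_1$. But

$$\lambda(T \cap (S_1 \cup S_2) \cap S_1) + \lambda((T \cap (S_1 \cup S_2)) \setminus S_1) = \lambda(T \cap (S_1 \cup S_2))$$

and therefore $\lambda(T \cap (S_1 \cup S_2)) + \lambda(T \setminus (S_1 \cup S_2)) = \lambda(T)$ for all $T \in \mathcal{P}(X)$. Hence $S_1 \cup S_2 \in \mathcal{B}$.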

(4) By (1), (2) and (3) B is an algebra.

(5) Let S1, S2 ∈ B with S1 ∩ S2 = ∅; then, since S1 ∈ B

λ(T ∩ (S1 ∪ S2)) = λ(T ∩ (S1 ∪ S2) ∩ S1) + λ((T ∩ (S1 ∪ S2)) \ S1)

= λ(T ∩ S1) + λ(T ∩ S2)

for all T ∈ P(X).

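(6) Let $\{S_n\}_{n\ge1}$ be a disjoint sequence from $\mathcal{B}$. Then by induction on $m$ it follows from (5) that for all $m \ge 1$

$$\lambda\Bigl(T \cap \Bigl(\bigcup_{n=1}^{m} S_n\Bigr)\Bigr) = \sum_{n=1}^{m} \lambda(T \cap S_n)$$

for all $T \in \mathcal{P}(X)$.

(7) Let $\{S_n\}_{n\ge1}$ be a disjoint sequence from $\mathcal{B}$. Since $\lambda$ is monotone it follows from (4) and (6) that for all $m \ge 1$

$$\begin{aligned}
\lambda(T) &= \lambda\Bigl(T \cap \Bigl(\bigcup_{n=1}^{m} S_n\Bigr)\Bigr) + \lambda\Bigl(T \setminus \Bigl(\bigcup_{n=1}^{m} S_n\Bigr)\Bigr) \\
&= \sum_{n=1}^{m} \lambda(T \cap S_n) + \lambda\Bigl(T \setminus \Bigl(\bigcup_{n=1}^{m} S_n\Bigr)\Bigr) \\
&\ge \sum_{n=1}^{m} \lambda(T \cap S_n) + \lambda\Bigl(T \setminus \Bigl(\bigcup_{n\ge1} S_n\Bigr)\Bigr)
\end{aligned}$$

for all $T \in \mathcal{P}(X)$. Thus, since $\lambda$ is countably sub-additive,

$$\lambda(T) \ge \sum_{n\ge1} \lambda(T \cap S_n) + \lambda\Bigl(T \setminus \Bigl(\bigcup_{n\ge1} S_n\Bigr)\Bigr) \ge \lambda\Bigl(T \cap \Bigl(\bigcup_{n\ge1} S_n\Bigr)\Bigr) + \lambda\Bigl(T \setminus \Bigl(\bigcup_{n\ge1} S_n\Bigr)\Bigr) .$$

But an outer measure is sub-additive and hence also

$$\lambda(T) \le \lambda\Bigl(T \cap \Bigl(\bigcup_{n\ge1} S_n\Bigr)\Bigr) + \lambda\Bigl(T \setminus \Bigl(\bigcup_{n\ge1} S_n\Bigr)\Bigr) .$$

Therefore $\lambda(T) = \lambda\bigl(T \cap \bigl(\bigcup_{n\ge1} S_n\bigr)\bigr) + \lambda\bigl(T \setminus \bigl(\bigcup_{n\ge1} S_n\bigr)\bigr)$ for all $T \in \mathcal{P}(X)$, and this shows that $\bigcup_{n\ge1} S_n \in \mathcal{B}$.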

(8) By (4), (7) and Lemma 2.1 (4) B is a σ-algebra.

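(9) The restriction of $\lambda$ to $\mathcal{B}$ is a measure: Let $\{S_n\}_{n\ge1}$ be a disjoint sequence from $\mathcal{B}$. In (7) we saw that

$$\lambda(T) \ge \sum_{n\ge1} \lambda(T \cap S_n) + \lambda\Bigl(T \setminus \Bigl(\bigcup_{n\ge1} S_n\Bigr)\Bigr)$$

for all $T \in \mathcal{P}(X)$. In particular with $T = \bigcup_{n\ge1} S_n$ this implies that

$$\lambda\Bigl(\bigcup_{n\ge1} S_n\Bigr) \ge \sum_{n\ge1} \lambda(S_n) + \lambda(\emptyset) = \sum_{n\ge1} \lambda(S_n)$$

and therefore $\lambda\bigl(\bigcup_{n\ge1} S_n\bigr) = \sum_{n\ge1} \lambda(S_n)$, since $\lambda$ is countably sub-additive.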

Lemma 4.3 Let µ be a measure on A and let µ∗ be the outer measure generated by µ. Then:

(1) µ∗(T ) = µ∗(T ∩ A) + µ∗(T \ A) for all A ∈ A, T ∈ P(X).

(2) µ∗(A) = µ(A) for all A ∈ A.

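Proof (1) Let $A \in \mathcal{A}$ and $T \in \mathcal{P}(X)$; then $\mu^*(T) \le \mu^*(T \cap A) + \mu^*(T \setminus A)$, since $\mu^*$ is sub-additive. We thus need to show that $\mu^*(T) \ge \mu^*(T \cap A) + \mu^*(T \setminus A)$, and for this we can assume that $\mu^*(T) < \infty$. Let $\varepsilon > 0$; then there is a sequence $\{A_n\}_{n\ge1}$ from $\mathcal{A}$ with $T \subset \bigcup_{n\ge1} A_n$ such that $\sum_{n\ge1} \mu(A_n) \le \mu^*(T) + \varepsilon$. But

$$\begin{aligned}
\sum_{n\ge1} \mu(A_n) &= \sum_{n\ge1} (\mu(A_n \cap A) + \mu(A_n \setminus A)) \\
&= \sum_{n\ge1} \mu(A_n \cap A) + \sum_{n\ge1} \mu(A_n \setminus A) \ge \mu^*(T \cap A) + \mu^*(T \setminus A) ,
\end{aligned}$$

since $A_n \cap A,\ A_n \setminus A \in \mathcal{A}$ and $T \cap A \subset \bigcup_{n\ge1} (A_n \cap A)$, $T \setminus A \subset \bigcup_{n\ge1} (A_n \setminus A)$. Thus $\mu^*(T) + \varepsilon \ge \mu^*(T \cap A) + \mu^*(T \setminus A)$, and since $\varepsilon > 0$ was arbitrary it follows that $\mu^*(T) \ge \mu^*(T \cap A) + \mu^*(T \setminus A)$.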

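(2) Let $A \in \mathcal{A}$; then

$$\mu^*(A) = \inf\Bigl\{ \sum_{n\ge1} \mu(A_n) : \{A_n\}_{n\ge1} \text{ is a sequence from } \mathcal{A} \text{ with } A \subset \bigcup_{n\ge1} A_n \Bigr\}$$

and in particular $\mu^*(A) \le \mu(A)$. (Just take $A_1 = A$ and $A_n = \emptyset$ for all $n > 1$.) Now let $\{A_n\}_{n\ge1}$ be any sequence from $\mathcal{A}$ with $A \subset \bigcup_{n\ge1} A_n$. Let $B_1 = A_1 \cap A$ and for each $n > 1$ put $B_n = \bigl(A_n \setminus \bigcup_{k=1}^{n-1} A_k\bigr) \cap A$. Then $\{B_n\}_{n\ge1}$ is a disjoint sequence from $\mathcal{A}$ with $B_n \subset A_n$ for each $n \ge 1$ and $\bigcup_{n\ge1} B_n = A$. Thus, since $\mu$ is a measure on $\mathcal{A}$ (and so in particular monotone),

$$\mu(A) = \sum_{n\ge1} \mu(B_n) \le \sum_{n\ge1} \mu(A_n)$$

and, taking the infimum over all such sequences, this shows that $\mu(A) \le \mu^*(A)$. Hence $\mu^*(A) = \mu(A)$ for all $A \in \mathcal{A}$.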

Proof of Theorem 4.2: (1), (2) and (3) follow directly from Lemmas 4.2 and 4.3.

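(4) Let $B \in \mathcal{B}$ with $\mu^*(B) < \infty$ and let $\varepsilon > 0$. Then there is a sequence $\{A_n\}_{n\ge1}$ from $\mathcal{A}$ with $B \subset \bigcup_{n\ge1} A_n$ such that $\sum_{n\ge1} \mu(A_n) < \mu^*(B) + \varepsilon$. Now $\mu(A_n) = \mu^*(A_n)$ for each $n \ge 1$ and $\mu^*\bigl(\bigcup_{n\ge1} A_n\bigr) \le \sum_{n\ge1} \mu^*(A_n)$, and therefore

$$\mu^*\Bigl(\bigcup_{n\ge1} A_n \setminus B\Bigr) = \mu^*\Bigl(\bigcup_{n\ge1} A_n\Bigr) - \mu^*(B) < \varepsilon ,$$

since by (2) the restriction of $\mu^*$ to $\mathcal{B}$ is a measure and so in particular it is additive.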

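(5) Exactly as in (4) there is a sequence $\{A_n\}_{n\ge1}$ from $\mathcal{A}$ with $B \subset \bigcup_{n\ge1} A_n$ such that $\sum_{n\ge1} \mu(A_n) < \mu^*(B) + \varepsilon/2$. Choose $m \ge 1$ with $\sum_{n\ge m+1} \mu(A_n) < \varepsilon/2$ and put $A = \bigcup_{n=1}^{m} A_n$. Then $A \in \mathcal{A}$ and

$$\begin{aligned}
\mu^*((A \setminus B) \cup (B \setminus A)) &= \mu^*(A \setminus B) + \mu^*(B \setminus A) \\
&\le \mu^*\Bigl(\bigcup_{n\ge1} A_n \setminus B\Bigr) + \mu^*\Bigl(\bigcup_{n\ge m+1} A_n\Bigr) < \varepsilon ,
\end{aligned}$$

since $(A \setminus B) \cap (B \setminus A) = \emptyset$, and again using the fact that the restriction of $\mu^*$ to $\mathcal{B}$ is additive.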

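Lemma 4.4 Let $\nu$ on $\mathcal{B}$ be the Caratheodory extension of a measure $\mu$ on $\mathcal{A}$. Then for each $B \in \mathcal{B}$ with $\nu(B) < \infty$ there exists $A \in \sigma(\mathcal{A})$ with $B \subset A$ and $\nu(A \setminus B) = 0$.

Proof By Theorem 4.2 (4) there exists for each $m \ge 1$ a sequence $\{A_{m,n}\}_{n\ge1}$ from $\mathcal{A}$ with $B \subset \bigcup_{n\ge1} A_{m,n}$ such that $\nu\bigl(\bigcup_{n\ge1} A_{m,n} \setminus B\bigr) < 1/m$. Therefore $A = \bigcap_{m\ge1} \bigcup_{n\ge1} A_{m,n} \in \sigma(\mathcal{A})$ with $B \subset A$ and for each $m \ge 1$

$$\nu(A \setminus B) \le \nu\Bigl(\bigcup_{n\ge1} A_{m,n} \setminus B\Bigr) < 1/m ,$$

i.e., $\nu(A \setminus B) = 0$.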


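Proposition 4.1 Let $\nu$ on $\mathcal{B}$ be the Caratheodory extension of a σ-finite measure $\mu$ on $\mathcal{A}$. Then for each $B \in \mathcal{B}$ there exists $A \in \sigma(\mathcal{A})$ with $B \subset A$ and $\nu(A \setminus B) = 0$.

Proof Since $\mu$ is σ-finite there is a sequence $\{C_n\}_{n\ge1}$ from $\mathcal{A}$ with $\mu(C_n) < \infty$ for each $n \ge 1$ such that $X = \bigcup_{n\ge1} C_n$. Now let $B \in \mathcal{B}$ and for each $n \ge 1$ put $B_n = B \cap C_n$; then $\nu(B_n) \le \mu(C_n) < \infty$ and so by Lemma 4.4 there exists $A_n \in \sigma(\mathcal{A})$ with $B_n \subset A_n$ so that $\nu(A_n \setminus B_n) = 0$. Put $A = \bigcup_{n\ge1} A_n$; then $A \in \sigma(\mathcal{A})$ with $B \subset A$ and $\nu(A \setminus B) = 0$, since

$$A \setminus B = \Bigl(\bigcup_{n\ge1} A_n\Bigr) \setminus \Bigl(\bigcup_{n\ge1} B_n\Bigr) \subset \bigcup_{n\ge1} (A_n \setminus B_n)$$

and $\nu\bigl(\bigcup_{n\ge1} (A_n \setminus B_n)\bigr) \le \sum_{n\ge1} \nu(A_n \setminus B_n) = 0$.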

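Proposition 4.2 Let $\mu$ be a σ-finite measure on $\mathcal{A}$ and also denote by $\mu$ the unique extension to a measure on $\sigma(\mathcal{A})$. Then for each $B \in \sigma(\mathcal{A})$ and each $\varepsilon > 0$ there exists a sequence $\{A_n\}_{n\ge1}$ from $\mathcal{A}$ with $B \subset \bigcup_{n\ge1} A_n$ such that $\mu\bigl(\bigcup_{n\ge1} A_n \setminus B\bigr) < \varepsilon$.

Proof Let $\mu^*$ be the outer measure generated by $\mu$; then $\mu(B) = \mu^*(B)$ for all $B \in \sigma(\mathcal{A})$ (by Theorem 4.2), and since $\mu$ is σ-finite there is an increasing sequence $\{E_m\}_{m\ge1}$ from $\mathcal{A}$ with $\mu(E_m) < \infty$ for each $m \ge 1$ and $X = \bigcup_{m\ge1} E_m$. Now let $B \in \sigma(\mathcal{A})$ and let $\varepsilon > 0$; since $\mu(B \cap E_m) < \infty$, Theorem 4.2 (4) implies that for each $m \ge 1$ there exists a sequence $\{F_{m,k}\}_{k\ge1}$ from $\mathcal{A}$ with $B \cap E_m \subset \bigcup_{k\ge1} F_{m,k}$ so that

$$\mu\Bigl(\bigcup_{k\ge1} F_{m,k} \setminus (B \cap E_m)\Bigr) = \mu^*\Bigl(\bigcup_{k\ge1} F_{m,k} \setminus (B \cap E_m)\Bigr) < 2^{-m}\varepsilon .$$

Let $\{A_n\}_{n\ge1}$ be an enumeration of the double sequence $\{F_{m,k}\}_{m\ge1,k\ge1}$. Then $\{A_n\}_{n\ge1}$ is a sequence from $\mathcal{A}$ with $B \subset \bigcup_{n\ge1} A_n$ and

$$\begin{aligned}
\mu\Bigl(\bigcup_{n\ge1} A_n \setminus B\Bigr) &\le \mu\Bigl(\bigcup_{m\ge1} \Bigl(\bigcup_{k\ge1} F_{m,k} \setminus (B \cap E_m)\Bigr)\Bigr) \\
&\le \sum_{m\ge1} \mu\Bigl(\bigcup_{k\ge1} F_{m,k} \setminus (B \cap E_m)\Bigr) < \sum_{m\ge1} 2^{-m}\varepsilon = \varepsilon .
\end{aligned}$$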

5 Measures on the real line

In this chapter we show how all reasonable measures on the Borel subsets of the real line R can be constructed. The most important example (and perhaps the most important measure of all) is Lebesgue measure.

Recall that the σ-algebra of Borel subsets of R is defined to be the σ-algebra BR = σ(OR), where OR is the set of open subsets of R. It is easy to see that BR contains every interval, and in particular all intervals of the form (a, b] with a < b.

A measure µ on BR is called locally finite if µ(B) < ∞ for each bounded set B ∈ BR. Thus µ is locally finite if and only if µ((a, b]) < ∞ for all a, b ∈ R with a < b. A locally finite measure is σ-finite, although in general the converse does not hold.

Lemma 5.1 Let µ be a locally finite measure on BR. Then there exists a mapping α : R → R such that µ((a, b]) = α(b) − α(a) for all a, b ∈ R with a < b. Moreover, any such mapping is increasing and right continuous. (This means: For each x ∈ R and each ε > 0 there exists δ > 0 such that |α(y) − α(x)| < ε for all y ∈ [x, x + δ).)

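Proof Define a mapping $\alpha : \mathbb{R} \to \mathbb{R}$ by

$$\alpha(x) = \begin{cases} \mu((0, x]) & \text{if } x \ge 0, \\ -\mu((x, 0]) & \text{if } x < 0. \end{cases}$$

Then $\mu((a, b]) = \alpha(b) - \alpha(a)$ holds for all $a, b \in \mathbb{R}$ with $a < b$: If $0 \le a < b$ then $\mu((a, b]) = \mu((0, b]) - \mu((0, a]) = \alpha(b) - \alpha(a)$; on the other hand, if $a < 0 \le b$ then $\mu((a, b]) = \mu((a, 0]) + \mu((0, b]) = \alpha(b) - \alpha(a)$; and finally if $a < b < 0$ then $\mu((a, b]) = \mu((a, 0]) - \mu((b, 0]) = \alpha(b) - \alpha(a)$.

Now let $\alpha : \mathbb{R} \to \mathbb{R}$ be any mapping such that $\mu((a, b]) = \alpha(b) - \alpha(a)$ for all $a, b \in \mathbb{R}$ with $a < b$. Then $\alpha(y) - \alpha(x) = \mu((x, y]) \ge 0$ if $x < y$ and thus $\alpha$ is increasing. Moreover, $\alpha$ is right continuous: Let $x \in \mathbb{R}$ and for each $n \ge 1$ let $A_n = (x, x + 1/n]$; then $\{A_n\}_{n\ge1}$ is a decreasing sequence from $\mathcal{B}_{\mathbb{R}}$ with $\bigcap_{n\ge1} A_n = \emptyset$ and thus by Lemma 3.2 it follows that

$$\lim_{n\to\infty} (\alpha(x + 1/n) - \alpha(x)) = \lim_{n\to\infty} \mu(A_n) = \mu(\emptyset) = 0 ,$$

since $\mu(A_1) = \alpha(x + 1) - \alpha(x) < \infty$. Let $\varepsilon > 0$; then there exists $N \ge 1$ such that $|\alpha(x + 1/N) - \alpha(x)| < \varepsilon$ and hence $|\alpha(y) - \alpha(x)| < \varepsilon$ for all $y \in [x, x + 1/N)$, since $\alpha$ is increasing.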

The main result of this chapter states that the converse of Lemma 5.1 holds:



Theorem 5.1 Let α : R → R be increasing and right continuous. Then there exists a unique measure ν on BR such that ν((a, b]) = α(b) − α(a) for all a, b ∈ R with a < b.

The most important example of Theorem 5.1 is when α is the identity mapping idR : R → R (with idR(x) = x for all x ∈ R). This mapping is strictly increasing and continuous and therefore by Theorem 5.1 there exists a unique measure λ on BR such that λ((a, b]) = b − a for all a, b ∈ R with a < b. The measure λ is called Lebesgue measure on BR.

We start the proof of Theorem 5.1 by looking at the uniqueness.

Lemma 5.2 Let α : R → R be increasing and right continuous. Then there is at most one measure ν on BR so that ν((a, b]) = α(b) − α(a) for all a, b ∈ R with a < b.

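Proof Let $\nu_1, \nu_2$ be measures on $\mathcal{B}_{\mathbb{R}}$ with $\nu_1((a, b]) = \alpha(b) - \alpha(a) = \nu_2((a, b])$ for all $a, b \in \mathbb{R}$ with $a < b$. Then the numbers $\nu_1(S)$ and $\nu_2(S)$ are finite and equal for all $S \in \mathcal{S}$, where $\mathcal{S} = \{\emptyset\} \cup \{(a, b] : a, b \in \mathbb{R} \text{ with } a < b\}$. But the set $\mathcal{S}$ is closed under finite intersections and there exists an increasing sequence $\{S_n\}_{n\ge1}$ from $\mathcal{S}$ with $\bigcup_{n\ge1} S_n = \mathbb{R}$ (for example $S_n = (-n, n]$). Therefore by Lemma 3.5 $\nu_1(F) = \nu_2(F)$ for all $F \in \sigma(\mathcal{S})$. But $\sigma(\mathcal{S}) = \mathcal{B}_{\mathbb{R}}$: If $a, b \in \mathbb{R}$ with $a < b$ then $(a, b) = \bigcup_{n\ge1} (a, b - n^{-1}] \in \sigma(\mathcal{S})$, and thus $\mathcal{B}_{\mathbb{R}} = \sigma(\mathcal{O}_{\mathbb{R}}) \subset \sigma(\mathcal{S})$, since each open subset of $\mathbb{R}$ can be written as a countable union of open intervals. This implies that $\nu_1 = \nu_2$.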

The rest of the chapter deals with the existence of the measure ν.

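For $-\infty \le a < b < +\infty$ let us write $\langle a, b\rangle = (a, b]$ and for $-\infty \le a < +\infty$ put $\langle a, +\infty\rangle = (a, +\infty)$. Let

$$\mathcal{A} = \Bigl\{ \bigcup_{k=1}^{n} \langle a_k, b_k\rangle : n \ge 0 \text{ and } -\infty \le a_1 < b_1 < a_2 < \cdots < a_n < b_n \le +\infty \Bigr\} ,$$

where $\bigcup_{k=1}^{0} \langle a_k, b_k\rangle = \emptyset$; in particular $(-\infty, x] \in \mathcal{A}$ for each $x \in \mathbb{R}$.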

Lemma 5.3 A is an algebra with σ(A) = BR.

Proof Easy exercise. (The proof that $\sigma(\mathcal{A}) = \mathcal{B}_{\mathbb{R}}$ is already contained in the proof of Lemma 5.2.)

Lemma 5.4 For each $A \in \mathcal{A}$ there exists a unique $n \ge 0$ and unique numbers $-\infty \le a_1 < b_1 < a_2 < \cdots < a_n < b_n \le +\infty$ such that $A = \bigcup_{k=1}^{n} \langle a_k, b_k\rangle$.


Proof This is clear.

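By Lemma 5.4 we can define a mapping $\mu : \mathcal{A} \to \mathbb{R}^+_\infty$ by

$$\mu\Bigl(\bigcup_{k=1}^{n} \langle a_k, b_k\rangle\Bigr) = \sum_{k=1}^{n} (\alpha(b_k) - \alpha(a_k)) ,$$

where $\alpha(-\infty) = \inf(\alpha(\mathbb{R}))$, $\alpha(+\infty) = \sup(\alpha(\mathbb{R}))$ and $\mu\bigl(\bigcup_{k=1}^{0} \langle a_k, b_k\rangle\bigr) = 0$. In particular (with $n = 1$) this means $\mu((a, b]) = \alpha(b) - \alpha(a)$ for all $a, b \in \mathbb{R}$ with $a < b$.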
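For finite endpoints the definition of µ on A translates directly into code. The sketch below is an illustration only: the function name `stieltjes_measure` and the representation of an element of A as a list of (a_k, b_k) pairs are assumptions, and the infinite endpoints allowed in the text are not handled.

```python
import math

def stieltjes_measure(intervals, alpha):
    """mu of the union of the half-open intervals (a_k, b_k], given as pairs (a_k, b_k)
    with a_1 < b_1 < a_2 < ... < a_n < b_n, for an increasing right continuous alpha."""
    endpoints = [x for pair in intervals for x in pair]
    if any(x >= y for x, y in zip(endpoints, endpoints[1:])):
        raise ValueError("endpoints must be strictly increasing")
    return sum(alpha(b) - alpha(a) for a, b in intervals)

# alpha = identity gives the total length (Lebesgue measure of the union).
length = stieltjes_measure([(0, 1), (2, 5)], lambda x: x)

# alpha = floor is increasing and right continuous; it counts the integers in (a, b].
integers = stieltjes_measure([(0.5, 3)], math.floor)

# n = 0 corresponds to the empty set.
empty = stieltjes_measure([], lambda x: x)
```

With the identity mapping the value is the sum of the lengths, and with `math.floor` the interval (0.5, 3] receives mass 3, one unit for each of the integers 1, 2 and 3.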

Lemma 5.5 The mapping $\mu : \mathcal{A} \to \mathbb{R}^+_\infty$ is additive: If $A, B \in \mathcal{A}$ with $A \cap B = \emptyset$ then $\mu(A \cup B) = \mu(A) + \mu(B)$.

Proof Easy exercise.

Denote by A′ the set of bounded elements in A.

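Lemma 5.6 Let $A \in \mathcal{A}'$ and let $\varepsilon > 0$; then there exists $n \ge 0$ and numbers $-\infty < a_1 < b_1 < a_2 < \cdots < a_n < b_n < +\infty$ with $\bigcup_{k=1}^{n} [a_k, b_k] \subset A$ such that $\mu\bigl(\bigcup_{k=1}^{n} (a_k, b_k]\bigr) \ge \mu(A) - \varepsilon$.

Proof Another exercise. (It is here that the right continuity of α is needed.)

Let $A \in \mathcal{A}'$ and $\varepsilon > 0$; by Lemma 5.6 there exists $B \in \mathcal{A}'$ such that the closure $\overline{B}$ of $B$ is a subset of $A$ with $\mu(B) \ge \mu(A) - \varepsilon$. For each $n \ge 1$ let $C_n = (-n, n]$; then $\{C_n\}_{n\ge1}$ is an increasing sequence from $\mathcal{A}'$ with $\mathbb{R} = \bigcup_{n\ge1} C_n$. Moreover, $\mu(C_n) = \alpha(n) - \alpha(-n) < \infty$ for all $n \ge 1$.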

Lemma 5.7 µ(A) = limn µ(A ∩ Cn) for all A ∈ A.

Proof This is clear.

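Lemma 5.8 $\mu$ is a measure on $\mathcal{A}$.

Proof Clearly $\mu(\emptyset) = 0$ and thus by Lemma 5.5 $\mu$ is a finitely additive measure on $\mathcal{A}$. (Note that $\mu(\mathbb{R}) = \sup(\alpha(\mathbb{R})) - \inf(\alpha(\mathbb{R}))$ need not be finite, as the case $\alpha = \mathrm{id}_{\mathbb{R}}$ shows.) We are going to make use of Lemma 3.6 with the sequence $\{C_n\}_{n\ge1}$; condition (1) there holds by Lemma 5.7. So let $\{A_n\}_{n\ge1}$ be a decreasing sequence from $\mathcal{A}$ with $\bigcap_{n\ge1} A_n = \emptyset$ and $A_1 \subset C_p$ for some $p \ge 1$ (and thus $A_n \in \mathcal{A}'$ for all $n \ge 1$). Let $\varepsilon > 0$; by Lemma 5.6 there exists for each $n \ge 1$ an element $B_n \in \mathcal{A}'$ with $\overline{B}_n \subset A_n$ and $\mu(B_n) > \mu(A_n) - 2^{-n}\varepsilon$. For each $n \ge 1$ let $D_n = \bigcap_{k=1}^{n} B_k$; then $D_n \in \mathcal{A}'$ with $\overline{D}_n \subset A_n$. Moreover,

$$D_n = \bigcap_{k=1}^{n} B_k = \bigcap_{k=1}^{n} (A_k \setminus (A_k \setminus B_k)) = \Bigl(\bigcap_{k=1}^{n} A_k\Bigr) \setminus \bigcup_{k=1}^{n} (A_k \setminus B_k) = A_n \setminus \bigcup_{k=1}^{n} (A_k \setminus B_k)$$

and this implies that

$$\mu(D_n) \ge \mu(A_n) - \sum_{k=1}^{n} \mu(A_k \setminus B_k) \ge \mu(A_n) - \sum_{k=1}^{n} 2^{-k}\varepsilon > \mu(A_n) - \varepsilon ,$$

i.e., $\mu(D_n) > \mu(A_n) - \varepsilon$. Now $\bigcap_{n\ge1} A_n = \emptyset$ and so $\bigcap_{n\ge1} \overline{D}_n = \emptyset$. But $\{\overline{D}_n\}_{n\ge1}$ is a decreasing sequence of closed subsets of the compact set $\overline{C}_p = [-p, p]$ and hence there exists $m \ge 1$ so that $\overline{D}_m = \emptyset$. This means that $\mu(A_n) < \varepsilon$ for all $n \ge m$, which shows that $\lim_n \mu(A_n) < \varepsilon$. Thus $\lim_n \mu(A_n) = 0$, and therefore by Lemma 3.6 $\mu$ is a measure on $\mathcal{A}$.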

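Now by Theorem 4.1 the measure $\mu$ on $\mathcal{A}$ can be extended to a measure $\nu$ on $\mathcal{B}_{\mathbb{R}}$ (since by Lemma 5.3 $\sigma(\mathcal{A}) = \mathcal{B}_{\mathbb{R}}$) and then

$$\nu((a, b]) = \mu((a, b]) = \alpha(b) - \alpha(a)$$

for all $a, b \in \mathbb{R}$ with $a < b$. This completes the proof of Theorem 5.1.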

6 How the integral will be introduced

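Let $(X, \mathcal{E})$ be a measurable space and let $\mu$ be a measure on $\mathcal{E}$. In the following four chapters we are going to define the integral $\int f \, d\mu$ of $f$ with respect to $\mu$ for each measurable mapping $f : X \to \mathbb{R}^+_\infty$. This will result in a functional $\int : M(\mathcal{E}) \to \mathbb{R}^+_\infty$ with the following properties:

(1) $\int (af + bg) \, d\mu = a \int f \, d\mu + b \int g \, d\mu$ for all $f, g \in M(\mathcal{E})$, $a, b \in \mathbb{R}^+$.

(2) If $f, g \in M(\mathcal{E})$ with $g \le f$ then $\int g \, d\mu \le \int f \, d\mu$.

(3) If $\{f_n\}_{n\ge1}$ is an increasing sequence from $M(\mathcal{E})$ then

$$\int \Bigl(\lim_{n\to\infty} f_n\Bigr) d\mu = \lim_{n\to\infty} \int f_n \, d\mu ,$$

where $\lim_n f_n$ is defined pointwise by $(\lim_n f_n)(x) = \lim_n f_n(x)$ for each $x \in X$.

(4) $\int I_E \, d\mu = \mu(E)$ for all $E \in \mathcal{E}$, where the mapping $I_E : X \to \mathbb{R}^+_\infty$ is given by

$$I_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{otherwise.} \end{cases}$$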
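When X is finite and every subset is measurable, an integral with properties (1), (2) and (4) is simply a weighted sum. The sketch below is an illustration only; the point weights and the helper names `integral` and `indicator` are assumptions and are not part of the construction carried out in the following chapters. Property (3) concerns limits of sequences and is not exercised here.

```python
from fractions import Fraction

X = [0, 1, 2]
weight = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}  # illustrative weights

def mu(E):
    """Measure of a subset E of X."""
    return sum((weight[x] for x in E), Fraction(0))

def integral(f):
    """Integral of a non-negative f on X with respect to mu, as a finite weighted sum."""
    return sum((f(x) * weight[x] for x in X), Fraction(0))

def indicator(E):
    """The mapping I_E with I_E(x) = 1 if x is in E and 0 otherwise."""
    return lambda x: 1 if x in E else 0

f = lambda x: x * x
g = lambda x: x + 1
a, b = Fraction(2), Fraction(3)

lhs = integral(lambda x: a * f(x) + b * g(x))   # integral of af + bg
rhs = a * integral(f) + b * integral(g)         # a * int f + b * int g
```

Here `lhs` and `rhs` agree, which is property (1), and `integral(indicator(E))` returns `mu(E)`, which is property (4).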

Note that the statements (1) and (3) presuppose that the set M(E) has certain properties, namely that:

(5) af + bg ∈ M(E) for all f, g ∈ M(E) and all a, b ∈ R+.

(6) If {fn}n≥1 is an increasing sequence from M(E) then limn fn ∈ M(E).

Properties (3) and (6) are essentially statements about the order structure on $M(\mathcal{E})$. Property (6) states that $M(\mathcal{E})$ is complete with respect to the usual partial order $\le$ on $M(\mathcal{E})$ (in which $g \le f$ if and only if $g(x) \le f(x)$ for all $x \in X$). Moreover, (3) then says that the integral should be a continuous mapping between the partially ordered sets $M(\mathcal{E})$ and $\mathbb{R}^+_\infty$.

This observation will be used as the basis for our approach to introducing the integral. In Chapter 7 we look at some general results about partially ordered sets, in Chapter 8 we apply these results to the set $M(X)$ of all mappings from $X$ to $\mathbb{R}^+_\infty$ and then in Chapter 9 specialise to the set $M(\mathcal{E})$ of measurable mappings. Finally, the integral appears in Chapter 10. In Chapter 11 we give another approach to the integral (the Daniell integral). In particular, this provides an alternative proof of the Caratheodory extension theorem (Theorem 4.1).


7 Partially ordered sets

Our approach to the integral makes use of some general results about continuous mappings defined on complete partially ordered sets. These results are presented in the present chapter.

Let D be a non-empty set. A binary relation ≤ on D is a partial order if:

(1) z ≤ z for all z ∈ D.

(2) z1 = z2 whenever z1 ≤ z2 and z2 ≤ z1.

(3) z1 ≤ z3 whenever z1 ≤ z2 and z2 ≤ z3.

In other words, the relation ≤ is reflexive, anti-symmetric and transitive.

A pair $(D, \le)$ consisting of a non-empty set $D$ and a partial order $\le$ is called a partially ordered set or, for short, a poset. If $(D, \le)$ is a poset then we mostly just write $D$ instead of $(D, \le)$ and assume that $\le$ can be determined from the context. In fact, unless something explicit to the contrary is stated, the partial orders will always be denoted by $\le$ (even if several posets are being considered at the same time). The notation $z_1 \ge z_2$ will often be used as an alternative to $z_2 \le z_1$. Moreover, $z_1 < z_2$ will mean that $z_1 \le z_2$ but $z_1 \ne z_2$.

In what follows let $D$ be a poset. A sequence of elements $\{z_n\}_{n\ge1}$ from $D$ is said to be increasing if $z_n \le z_{n+1}$ and decreasing if $z_{n+1} \le z_n$ for all $n \ge 1$.

Let $A$ be a subset of $D$; an element $z \in D$ is said to be an upper bound for $A$ if $z' \le z$ for all $z' \in A$. An upper bound $z$ is called a least upper bound for $A$ if $z \le z'$ for each upper bound $z'$ for $A$. If a least upper bound exists then it is clearly unique and it will be denoted by $\sup(A)$.

The poset $D$ is said to be complete if each non-empty subset of $D$ possesses a least upper bound. For the rest of the chapter assume that $D$ is complete.

If $\{z_n\}_{n\ge1}$ is an increasing sequence of elements from $D$ then the least upper bound $\sup(\{z_n : n \ge 1\})$ will be denoted by $\lim_n z_n$. Thus $\lim_n z_n$ is the unique element $z$ of $D$ with the properties: (i) $z_n \le z$ for all $n \ge 1$ and (ii) if $z' \in D$ with $z_n \le z'$ for all $n \ge 1$ then $z \le z'$. Note that this is not the definition of the limit used for $\mathbb{R}^+_\infty$ in Chapter 1, but it is equivalent to it.

A binary operation $\star : D \times D \to D$ is said to be monotone if $w \star z \le w' \star z'$ whenever $w \le w'$ and $z \le z'$. If $\star$ is monotone and $\{w_n\}_{n\ge1}$ and $\{z_n\}_{n\ge1}$ are increasing sequences from $D$ then $\{w_n \star z_n\}_{n\ge1}$ is also an increasing sequence. A monotone operation $\star$ is defined to be continuous if

$$\lim_{n\to\infty} (w_n \star z_n) = \Bigl(\lim_{n\to\infty} w_n\Bigr) \star \Bigl(\lim_{n\to\infty} z_n\Bigr)$$

holds for all increasing sequences $\{w_n\}_{n\ge1}$ and $\{z_n\}_{n\ge1}$ from $D$.



There is a binary operation $\vee : D \times D \to D$ defined by $w \vee z = \sup\{w, z\}$ for all $w, z \in D$; it is clear that $\vee$ is monotone.

Lemma 7.1 The operation $\vee$ is continuous.

Proof Let $\{w_n\}_{n\ge1}$, $\{z_n\}_{n\ge1}$ be increasing sequences from $D$ with $w = \lim_n w_n$ and $z = \lim_n z_n$; put $v = \lim_n (w_n \vee z_n)$. Then $w_n \vee z_n \le w \vee z$ for all $n \ge 1$, and hence $v \le w \vee z$. On the other hand, $w_n \le w_n \vee z_n \le v$ for all $n \ge 1$ and so $w \le v$, and in the same way $z \le v$. Therefore $w \vee z \le v$ and this shows that $w \vee z = v$.

If $\star$ is a binary operation on $D$ then a subset $N$ of $D$ will be called $\star$-closed or closed under $\star$ if $z \star w \in N$ for all $z, w \in N$; the subset $N$ is said to be complete if $\lim_n z_n \in N$ for each increasing sequence $\{z_n\}_{n\ge1}$ from $N$.

For each subset $N$ of $D$ let $N^\uparrow$ denote the set of all elements of $D$ having the form $\lim_n z_n$ for some increasing sequence $\{z_n\}_{n\ge1}$ from $N$. Thus $N \subset N^\uparrow$ and $N = N^\uparrow$ if and only if $N$ is complete.

Lemma 7.2 Let $N$ be a $\vee$-closed subset of $D$ and let $\{z_n\}_{n\ge1}$ be an increasing sequence from $N^\uparrow$ with $z = \lim_n z_n$. Then there exists an increasing sequence $\{w_n\}_{n\ge1}$ from $N$ with $w_n \le z_n$ for all $n \ge 1$ and $\lim_n w_n = z$. In particular, this implies $z \in N^\uparrow$.

Proof For each $n \ge 1$ there exists an increasing sequence $\{z_{n,m}\}_{m\ge1}$ from $N$ with $z_n = \lim_m z_{n,m}$. For each $m \ge 1$ let $w_m = z_{1,m} \vee \cdots \vee z_{m,m}$; then $w_m \in N$, since $N$ is $\vee$-closed, and

$$w_m = z_{1,m} \vee \cdots \vee z_{m,m} \le z_{1,m+1} \vee \cdots \vee z_{m,m+1} \le w_{m+1} ,$$

since $\vee$ is monotone. Therefore $\{w_m\}_{m\ge1}$ is an increasing sequence from $N$; put $w = \lim_m w_m$. Now

$$w_n = z_{1,n} \vee \cdots \vee z_{n,n} \le z_1 \vee \cdots \vee z_n = z_n ,$$

i.e., $w_n \le z_n$ for all $n \ge 1$ and so in particular $w = \lim_n w_n \le \lim_n z_n = z$. But on the other hand, $z_{n,m} \le z_{1,m} \vee \cdots \vee z_{m,m} = w_m \le w$ for all $m \ge n \ge 1$; thus $z_n = \lim_m z_{n,m} \le w$ for all $n \ge 1$ and hence $z \le w$, i.e., $w = z$.
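The diagonal construction in this proof can be tried out in the complete poset [0, ∞] with its usual order, where ∨ is the maximum. The double sequence below is a made-up example (an assumption of this sketch): z_{n,m} = (1 − 2^{−n})(1 − 1/(m+1)) increases in m to z_n = 1 − 2^{−n}, and z_n increases to z = 1.

```python
from fractions import Fraction

def z_limit(n):
    """z_n = 1 - 2**(-n), the limit over m of z(n, m)."""
    return 1 - Fraction(1, 2 ** n)

def z(n, m):
    """z_{n,m}: increasing in m with limit z_n (an illustrative choice)."""
    return z_limit(n) * (1 - Fraction(1, m + 1))

def w(m):
    """w_m = z_{1,m} v ... v z_{m,m}; in [0, inf] the operation v is max."""
    return max(z(n, m) for n in range(1, m + 1))

ws = [w(m) for m in range(1, 200)]
gap = 1 - ws[-1]   # distance of w_199 from the limit z = 1
```

The values w_m form an increasing sequence with w_m ≤ z_m, and they approach the same limit 1 as the z_n, as the lemma asserts.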

Proposition 7.1 Let N be a ∨-closed subset of D. Then N↑ is ∨-closed and complete. Moreover, if ? is a continuous operation on D and N is ?-closed then so is N↑.

Proof Lemma 7.2 implies in particular that N↑ is complete. Let ? be continuous and suppose N is ?-closed. Let z, w ∈ N↑ and let {zn}n≥1, {wn}n≥1 be increasing sequences from N with z = limn zn and w = limn wn. Then zn ? wn ∈ N, since N is ?-closed, and limn (zn ? wn) = z ? w, since ? is continuous. Thus z ? w ∈ N↑, since N↑ is complete. This shows that N↑ is ?-closed, and in particular N↑ is ∨-closed, since by Lemma 7.1 ∨ is continuous.

Let E be a further complete poset and let N be a subset of D. A mapping ψ : N → E is said to be monotone if ψ(w) ≤ ψ(z) whenever w, z ∈ N with w ≤ z. Let ψ : N → E be monotone; then {ψ(zn)}n≥1 is an increasing sequence in E for each increasing sequence {zn}n≥1 from N.

A monotone mapping ψ : N → E is said to be pre-continuous if

ψ(z) ≤ lim_{n→∞} ψ(zn)

holds whenever {zn}n≥1 is an increasing sequence from N and z ∈ N is such that z ≤ limn zn. Moreover, if N is complete then a monotone mapping ψ : N → E is said to be continuous if

ψ(lim_{n→∞} zn) = lim_{n→∞} ψ(zn)

holds for each increasing sequence {zn}n≥1 from N.

Lemma 7.3 Let N be a complete subset of D and ψ : N → E be a monotone mapping. Then ψ is continuous if and only if it is pre-continuous.

Proof Assume first that the mapping ψ is pre-continuous. Let {zn}n≥1 be an increasing sequence from N and put z = limn zn. Then in particular z ≤ limn zn and hence ψ(z) ≤ limn ψ(zn). But zn ≤ z and therefore ψ(zn) ≤ ψ(z) for all n ≥ 1, since ψ is monotone. Thus also limn ψ(zn) ≤ ψ(z), i.e., limn ψ(zn) = ψ(z), and this shows ψ is continuous. Now assume that ψ is continuous. Let {zn}n≥1 be an increasing sequence from N and let z ∈ N with z ≤ z′ = limn zn. Then, since ψ is monotone, ψ(z) ≤ ψ(z′) = ψ(limn zn) = limn ψ(zn), and this shows ψ is pre-continuous.

The following result gives the basic criterion for the existence of a continuous extension (and is, for example, the essential tool in Chapter 10 for defining the integral).

Proposition 7.2 Let N be a ∨-closed subset of D and let ψ : N → E be a monotone mapping. Then there exists a continuous mapping ψ′ : N↑ → E such that ψ′(z) = ψ(z) for all z ∈ N if and only if ψ is pre-continuous.


Proof If there exists a continuous mapping ψ′ : N↑ → E such that ψ′(z) = ψ(z) for all z ∈ N then by Lemma 7.3 ψ′ is pre-continuous. Therefore ψ, being the restriction of a pre-continuous mapping, is itself pre-continuous.

Assume then conversely that ψ is pre-continuous. Let z ∈ N↑ and let {zn}n≥1, {wn}n≥1 be increasing sequences from N with limn zn = z = limn wn. Then zm ≤ z = limn wn and thus ψ(zm) ≤ limn ψ(wn) for each m ≥ 1, which implies limn ψ(zn) ≤ limn ψ(wn). But the same argument also shows of course that limn ψ(wn) ≤ limn ψ(zn) and therefore limn ψ(zn) = limn ψ(wn). There is thus a unique mapping ψ′ : N↑ → E such that ψ′(limn zn) = limn ψ(zn) for each increasing sequence {zn}n≥1 from N. In particular ψ′(z) = ψ(z) for each z ∈ N.

The mapping ψ′ is monotone: Let z, w ∈ N↑ with z ≤ w and let {zn}n≥1, {wn}n≥1 be increasing sequences from N with z = limn zn and w = limn wn. Then, since N is ∨-closed, {zn ∨ wn}n≥1 is an increasing sequence from N and by Lemma 7.1 limn(zn ∨ wn) = z ∨ w = w. Thus

ψ′(z) = lim_{n→∞} ψ(zn) ≤ lim_{n→∞} ψ(zn ∨ wn) = ψ′(w) .

The mapping ψ′ is continuous: Let {zn}n≥1 be an increasing sequence from N↑ with z = limn zn. Then ψ′(zn) ≤ ψ′(z) for each n ≥ 1, since ψ′ is monotone, and therefore limn ψ′(zn) ≤ ψ′(z). But by Lemma 7.2 there exists an increasing sequence {wn}n≥1 from N with wn ≤ zn for each n ≥ 1 and limn wn = z. Thus

ψ′(z) = lim_{n→∞} ψ(wn) = lim_{n→∞} ψ′(wn) ≤ lim_{n→∞} ψ′(zn) ≤ ψ′(z)

and hence ψ′(z) = limn ψ′(zn).

If there exists a continuous mapping ψ′ : N↑ → E such that ψ′(z) = ψ(z) for all z ∈ N then it is uniquely determined by ψ. (Let z ∈ N↑; then z = limn zn for some increasing sequence {zn}n≥1 from N and so ψ′(z) = limn ψ′(zn) = limn ψ(zn).)

Let A be a subset of D; an element z ∈ D is said to be a lower bound for A if z ≤ z′ for all z′ ∈ A. A lower bound z is called a greatest lower bound for A if z′ ≤ z for each lower bound z′ for A. If a greatest lower bound exists then it is clearly unique and will be denoted by inf(A).

The poset D will be called co-complete if each non-empty subset of D possesses a greatest lower bound. If D is co-complete and {zn}n≥1 is a decreasing sequence from D then the greatest lower bound inf({zn : n ≥ 1}) will also be denoted by limn zn. Therefore limn zn is the unique element z of D with the properties: (i) z ≤ zn for all n ≥ 1 and (ii) if z′ ∈ D with z ≤ z′ and z′ ≠ z then zn ≤ z′ for all large enough n. Again, this is not the definition of the limit used for R+∞ in Chapter 1, but it is equivalent to it.

If D is co-complete then there is also a binary operation ∧ : D×D → D defined by z ∧ w = inf{z, w} for all z, w ∈ D. The operation ∧ is monotone but in general need not be continuous.


In a co-complete poset D a subset N is said to be co-complete if limn zn ∈ N for each decreasing sequence {zn}n≥1 from N.

In what follows assume that D is complete and co-complete. If {zn}n≥1 is any sequence from D and m ≥ 1 then {zn : n ≥ m} will be used to denote the set of values {z ∈ D : z = zn for some n ≥ m}.

Let {zn}n≥1 be any sequence from D and for n ≥ 1 let un = sup{zm : m ≥ n} and vn = inf{zm : m ≥ n}. Then the sequence {un}n≥1 is decreasing and {vn}n≥1 is increasing. As in R+∞ the limits limn un and limn vn are denoted by lim supn zn and lim infn zn respectively. It is easy to see that lim infn zn ≤ lim supn zn, and if {zn}n≥1 is either increasing or decreasing then

lim inf_{n→∞} zn = lim_{n→∞} zn = lim sup_{n→∞} zn .

In a complete and co-complete poset D this equality can be used to define when a general sequence {zn}n≥1 from D converges. We do not pursue this topic further because we will only be dealing with posets where the convergence is defined explicitly.

8 Real valued mappings

The general results of the previous chapter will now be applied to the poset M(X) of all mappings from a non-empty set X to R+∞.

As was stated in Chapter 1 the poset R+∞ is both complete and co-complete: Let A be a non-empty subset of R+∞. If there exists b ∈ R+ with a ≤ b for all a ∈ A then A is a bounded subset of R and in this case sup(A) is the least upper bound of A in R; if no such b ∈ R+ exists then sup(A) = ∞. Moreover, if A = {∞} then inf(A) = ∞; otherwise inf(A) is the greatest lower bound of A \ {∞} in R.

The operations +, ·, ∨ and ∧ on R+∞ are all continuous (and so in particular monotone). Moreover, for all a, b ∈ R+ the operation (c, d) ↦ ac + bd is also continuous.

Most of the mappings we will be dealing with take their values in R+∞ and so we next look at the elementary properties of such mappings.

Let X be a non-empty set; the set of all mappings from X to R+∞ will be denoted by M(X). The total order on R+∞ induces a partial order ≤ on M(X) defined pointwise by stipulating that f ≤ g if and only if f(x) ≤ g(x) for all x ∈ X. Thus (M(X), ≤) is a poset, and whenever we consider M(X) as a poset then it will always be with respect to this partial order. The poset M(X) is both complete and co-complete. If A is a non-empty subset of M(X) then its least upper bound is the mapping given by sup(A)(x) = sup{f(x) : f ∈ A} for all x ∈ X and its greatest lower bound is the mapping given by inf(A)(x) = inf{f(x) : f ∈ A} for all x ∈ X. In particular, if {fn}n≥1 is either increasing or decreasing then

(lim_{n→∞} fn)(x) = lim_{n→∞} fn(x)

for all x ∈ X. Moreover, if {fn}n≥1 is any sequence from M(X) then

(lim sup_{n→∞} fn)(x) = lim sup_{n→∞} fn(x) and (lim inf_{n→∞} fn)(x) = lim inf_{n→∞} fn(x)

for all x ∈ X. These statements about limn fn, lim supn fn and lim infn fn are direct consequences of how the partial order ≤ is defined on M(X); however we use their form as the basis for a definition and say that a sequence {fn}n≥1 from M(X) converges to f ∈ M(X) if limn fn(x) = f(x) for all x ∈ X, and denote the limit f by limn fn. It then follows from the properties of convergence in R+∞ that {fn}n≥1 converges to f if and only if lim infn fn = lim supn fn and in this case lim infn fn = limn fn = lim supn fn.

For each binary operation ? on R+∞ there is a corresponding operation on M(X) also denoted by ? and defined pointwise by letting (f ? g)(x) = f(x) ? g(x) for all x ∈ X. If ? is a monotone (resp. continuous) binary operation on R+∞ then the corresponding operation ? on M(X) is also monotone (resp. continuous).



In particular, this results in the continuous operations +, ·, ∨ and ∧ on M(X). Moreover, for all a, b ∈ R+ the operation (f, g) ↦ af + bg is continuous. We also need the scalar multiplication · : R+ × M(X) → M(X) given by (af)(x) = af(x) for all x ∈ X. (It turns out that R+ and not R+∞ is the ‘correct’ set of scalars.)

For each a ∈ R+ the constant mapping in M(X) with value a will also be denoted by a. It will be clear from the context which usage is intended. (Of course, it doesn’t matter if af is interpreted as the scalar multiplication of a with f or as the product of the constant mapping a with f.) For all f, g ∈ M(X) define the mapping |f − g| ∈ M(X) pointwise by letting |f − g|(x) = |f(x) − g(x)| for each x ∈ X (recalling that |a − b| = ∞ as soon as one of a and b is equal to ∞).

A non-empty subset N of M(X) is said to be a subspace if af + bg ∈ N for all f, g ∈ N and all a, b ∈ R+. Note that a subspace always contains the constant mapping 0. By a complete (resp. co-complete) subspace is meant a subspace N for which N is also a complete (resp. co-complete) subset of M(X). If ? is a binary operation on M(X) then a subspace N is closed under ? (or is ?-closed) if f ? g ∈ N for all f, g ∈ N.

If N is a subspace of M(X) then the statement that N′ is a subspace of N just means that N′ is a subspace of M(X) with N′ ⊂ N. Moreover, the statement that N′ is a complete subspace of N is just a shortened form of the statement that N′ is a complete subspace of M(X) with N′ ⊂ N. The same usage applies when ‘complete’ is replaced by some other adjective.

Proposition 8.1 Let N be a ∨-closed subspace of M(X). Then N↑ is a subspace of M(X) which is complete and ∨-closed. Moreover, if N is closed under a continuous operation ? on M(X) then so is N↑.

Proof This follows from Proposition 7.1. (N↑ is a subspace since for all a, b ∈ R+ the operation (f, g) ↦ af + bg is continuous.)

If f, g ∈ M(X) with g ≤ f then define the mapping f − g ∈ M(X) pointwise by letting (f − g)(x) = f(x) − g(x) for all x ∈ X (recalling that ∞ − b = ∞ for all b ∈ R+∞). Thus f = g + (f − g).

A subspace N of M(X) will be called complemented if for all f, g ∈ N with g ≤ f there exists h ∈ N with f = g + h. In particular, N will be complemented if f − g ∈ N for all f, g ∈ N with g ≤ f.

Denote by MF(X) the set of mappings f ∈ M(X) with f(x) < ∞ for all x ∈ X; thus MF(X) is a complemented subspace of M(X). If N is a subspace of MF(X) then N is complemented if and only if f − g ∈ N for all f, g ∈ N with g ≤ f.

Note that f ∨ g + f ∧ g = f + g for all f, g ∈ M(X) and thus a complemented subspace of MF(X) is ∧-closed if and only if it is ∨-closed. Note also that if a subspace N of M(X) contains the constant 1 then a ∈ N for all a ∈ R+.
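The identity f ∨ g + f ∧ g = f + g also holds at points where one of the values is ∞. The following is a minimal sketch checking this on a three-point set; the set, the two mappings and the helper names join, meet and add are hypothetical choices made for illustration only, and mappings are represented as Python dictionaries with math.inf standing for ∞.

```python
import math

INF = math.inf  # plays the role of the value ∞ in R+∞

def join(f, g):
    """Pointwise maximum f ∨ g of two mappings given as dicts x -> value."""
    return {x: max(f[x], g[x]) for x in f}

def meet(f, g):
    """Pointwise minimum f ∧ g."""
    return {x: min(f[x], g[x]) for x in f}

def add(f, g):
    """Pointwise sum f + g (with a + ∞ = ∞)."""
    return {x: f[x] + g[x] for x in f}

# Two illustrative mappings on the set {x1, x2, x3}; f takes the value ∞ at x2.
f = {"x1": 0.5, "x2": INF, "x3": 2.0}
g = {"x1": 1.5, "x2": 3.0, "x3": 2.0}

lhs = add(join(f, g), meet(f, g))   # f ∨ g + f ∧ g
rhs = add(f, g)                     # f + g
print(lhs == rhs)
```

At each point one of the two values is the maximum and the other the minimum, so the two sides contain the same summands in a possibly different order.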


A complemented subspace of M(X) which contains 1 and is closed under both ∨ and ∧ will be called normal. This definition may seem to be based on a somewhat arbitrary combination of conditions. However, Propositions 9.1 and 9.3 will show that a complete subspace is normal if and only if it has the form M(F) for some σ-algebra F ⊂ P(X).

Proposition 8.2 A complete normal subspace N of M(X) is also co-complete. Moreover, |f − g| ∈ N for all f, g ∈ N, and if g ≤ f then f − g ∈ N.

Proof Let {fn}n≥1 be a decreasing sequence from N with f = limn fn and assume first that f1 ∈ MF(X). For each n ≥ 1 let gn = f1 − fn; then {gn}n≥1 is an increasing sequence from N and thus g = limn gn ∈ N. But g = f1 − f and so f = f1 − g ∈ N, since N is complemented and g ≤ f1 with f1 ∈ MF(X). Now let {fn}n≥1 be any decreasing sequence from N and again put f = limn fn. The above argument applied to the sequence {fn ∧ m}n≥1 then shows that f ∧ m ∈ N for each m ≥ 1. But {f ∧ m}m≥1 is an increasing sequence with limm f ∧ m = f and hence f ∈ N. This shows that N is co-complete.

Next, let f, g ∈ N with g ≤ f; if n ≥ 1 then g ∧ n ∈ MF(X) and so in this case hn = f − (g ∧ n) is the unique element of N with f = (g ∧ n) + hn and note that hn(x) = ∞ whenever f(x) = ∞. Now {hn}n≥1 is a decreasing sequence from N and so h = limn hn ∈ N, since N is co-complete. Thus h = f − g, since f = g + h and h(x) = ∞ whenever f(x) = ∞, i.e., f − g ∈ N. Finally, |f − g| ∈ N for all f, g ∈ N, since |f − g| = (f ∨ g − f) + (f ∨ g − g).

Let N be a normal subspace of M(X). Then by Proposition 8.1 N↑ is closed under both ∧ and ∨ and of course 1 ∈ N↑. Therefore N↑ is normal if and only if it is complemented. However, in general this will fail to be the case, and the best partial result is perhaps the following:

Lemma 8.1 Let N be a ∨-closed complemented subspace of M(X) and let g ∈ N and f ∈ N↑ with g ≤ f. Then there exists h ∈ N↑ with f = g + h.

Proof Let {fn}n≥1 be an increasing sequence from N with f = limn fn; then {g ∨ fn}n≥1 is an increasing sequence from N, also with limn(g ∨ fn) = g ∨ f = f. Now g ≤ g ∨ fn and so there exists hn ∈ N with (g ∨ fn) = g + hn; but

(g ∨ fn) = (g ∨ f1) ∨ · · · ∨ (g ∨ fn) = (g + h1) ∨ · · · ∨ (g + hn) = g + h′n ,

where h′n = h1 ∨ · · · ∨ hn, and {h′n}n≥1 is an increasing sequence from N. Let h = limn h′n; then h ∈ N↑ and f = g + h.

Lemma 8.2 Let N be a normal subspace of M(X). Then N↑ is normal if and only if it is co-complete.

Proof As noted above, N↑ is normal if and only if it is complemented. Moreover, by Proposition 8.2 a complete normal subspace is co-complete. It thus remains to show that N↑ is complemented when it is co-complete. Therefore suppose N↑ is co-complete. Let f, g ∈ N↑ with g ≤ f and let {gn}n≥1 be an increasing sequence from N with limn gn = g. By Lemma 8.1 there exists hn ∈ N↑ such that f = gn + hn; for each n ≥ 1 let h′n = h1 ∧ · · · ∧ hn; then

f = g1 ∨ · · · ∨ gn + h1 ∧ · · · ∧ hn = gn + h′n

and {h′n}n≥1 is a decreasing sequence from N↑. Hence, since N↑ is co-complete, h′ = limn h′n ∈ N↑, and f = g + h′. Therefore N↑ is complemented.

A mapping f ∈ M(X) is said to be simple (or elementary) if the set of values f(X) of f is a finite subset of R+ and the set of such simple mappings will be denoted by ME(X). For each F ⊂ X define IF ∈ ME(X) by

IF(x) = 1 if x ∈ F, and IF(x) = 0 otherwise.

Lemma 8.3 (1) ME(X) is a normal subspace of M(X) which is ?-closed for each finite binary operation ? on R+∞.

(2) If N is a subspace of M(X) with IF ∈ N for all F ⊂ X then ME(X) ⊂ N, which means ME(X) is the smallest subspace of M(X) containing the mappings IF, F ⊂ X.

(3) For each f ∈ M(X) there is an increasing sequence {fn}n≥1 from ME(X) such that f = limn fn. Thus M(X) = (ME(X))↑.

Proof (1) If ? is a finite operation on R+∞ then ME(X) is ?-closed since

(f ? g)(X) ⊂ S = {a ? b : a ∈ f(X), b ∈ g(X)}

and S is a finite subset of R+ whenever both f(X) and g(X) are. In particular, af + bg ∈ ME(X) for all f, g ∈ ME(X), a, b ∈ R+, since (c, d) ↦ ac + bd is a finite operation on R+∞; hence ME(X) is a subspace of M(X). Moreover, the subspace ME(X) is complemented, since if f, g ∈ ME(X) with g ≤ f then (f − g)(X) is a subset of the finite set {a − b : a ∈ f(X), b ∈ g(X)}. Finally, it is clear that 1 ∈ ME(X).

(2) If f ∈ ME(X) then f = ∑_{a∈f(X)} a IEa, where Ea = {x ∈ X : f(x) = a}; thus if IF ∈ N for each F ⊂ X then ME(X) ⊂ N.

(3) Let f ∈ M(X) and for each n ≥ 1 define fn ∈ ME(X) by

fn = ∑_{m=1}^{n2^n} (m − 1)2^{−n} IEm,n ,

where Em,n = {x ∈ X : (m − 1)2^{−n} < f(x) ≤ m2^{−n}}. Then fn ≤ fn+1 ≤ f for each n ≥ 1 and f(x) ≤ 2^{−n} + fn(x) for all x ∈ X with f(x) ≤ n. Thus {fn}n≥1 is an increasing sequence from ME(X) with limn fn = f.
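The dyadic approximation used in part (3) can be tried out numerically. The following is a minimal sketch which computes the n-th approximation at a single point from the value f(x) alone; the function name dyadic_approx is hypothetical. Note one assumption that goes beyond the formula above: where f(x) > n (in particular where f(x) = ∞) the sketch returns n rather than 0, which is the usual way of making the approximations also increase to the value ∞.

```python
import math

def dyadic_approx(value, n):
    """n-th dyadic lower approximation of a value in [0, ∞].

    For 0 < value <= n this is (m - 1) * 2**-n, where m is the index
    with (m - 1) * 2**-n < value <= m * 2**-n, as in the sets E_{m,n}.
    """
    if value > n:
        # Assumption added in this sketch: cap at n instead of returning 0,
        # so that the approximations also reach the value ∞.
        return float(n)
    if value == 0:
        return 0.0
    m = math.ceil(value * 2 ** n)   # value lies in ((m-1) 2^-n, m 2^-n]
    return (m - 1) / 2 ** n

# Approximations of the single value 0.7 for n = 1, ..., 6.
approximations = [dyadic_approx(0.7, n) for n in range(1, 7)]
print(approximations)
```

The approximation stays strictly below a positive finite value because the intervals defining Em,n are open on the left.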

A mapping f ∈ M(X) is said to be bounded if there exists b ∈ R+ such that f ≤ b (i.e., such that f(x) ≤ b for all x ∈ X) and the set of all such mappings will be denoted by MB(X). In particular, every simple mapping is bounded.

Lemma 8.4 (1) MB(X) is a normal subspace of M(X) which is ?-closed for each finite monotone operation ? on R+∞.

(2) If f ∈ MB(X) then for each ε > 0 there exists g ∈ ME(X) with g ≤ f ≤ g + ε.

Proof (1) This is more-or-less the same as Lemma 8.3 (1).

(2) Just take g = fn, where fn is as in the proof of Lemma 8.3 (3) with n chosen so that 2^{−n} < ε and f ≤ n.

Let N be a subspace of M(X). A mapping Φ : N → R+∞ will be called linear if it is additive and positive homogeneous, i.e., if

Φ(af + bg) = aΦ(f) + bΦ(g)

for all f, g ∈ N and all a, b ∈ R+, and note then that Φ(0) = 0. (The somewhat inaccurate description of being linear is employed because of its brevity.) As in the general case in Chapter 7 a linear mapping Φ : N → R+∞ is monotone if Φ(g) ≤ Φ(f) whenever f, g ∈ N with g ≤ f. Moreover, a monotone linear mapping Φ : N → R+∞ is pre-continuous if Φ(f) ≤ limn Φ(fn) whenever f ∈ N and {fn}n≥1 is an increasing sequence from N with f ≤ limn fn. Finally, if N is complete then a monotone linear mapping Φ : N → R+∞ is continuous if

Φ(lim_{n→∞} fn) = lim_{n→∞} Φ(fn)

for each increasing sequence {fn}n≥1 from N. If N is a complemented subspace of M(X) then any linear mapping is automatically monotone, since if f, g ∈ N with g ≤ f then there exists h ∈ N with f = g + h and so

Φ(g) ≤ Φ(g) + Φ(h) = Φ(g + h) = Φ(f) .

Proposition 8.3 Let N be a ∨-closed subspace of M(X) and let Φ : N → R+∞ be a monotone linear mapping. Then there exists a continuous linear mapping Φ′ : N↑ → R+∞ with Φ′(f) = Φ(f) for all f ∈ N if and only if Φ is pre-continuous.

Proof This is all contained in Proposition 7.2, except for showing that Φ′ must be linear. However this follows from the definition of Φ′. (Recall Φ′ : N↑ → R+∞ is the unique mapping with Φ′(limn fn) = limn Φ(fn) for each increasing sequence {fn}n≥1 from N and, since the operation (f, g) ↦ af + bg is continuous for each a, b ∈ R+, it follows that Φ′ is linear.)

Note that if Φ is pre-continuous then the extension Φ′ : N↑ → R+∞ is unique.

For the remainder of the chapter let N be a complete normal subspace of M(X) and let Φ : N → R+∞ be a continuous linear mapping.

Proposition 8.4 If {fn}n≥1 is a decreasing sequence from N and Φ(fm) < ∞ for some m ≥ 1 then

Φ(lim_{n→∞} fn) = lim_{n→∞} Φ(fn) .

Proof Let {fn}n≥1 be a decreasing sequence from N with Φ(fm) < ∞ and put f = limn fn. Since {Φ(fn)}n≥m is a decreasing sequence from R+∞, the limit limn Φ(fn) exists; thus it must be shown that this limit is equal to Φ(f). For each n ≥ m let hn be the unique maximal mapping with fn + hn = fm (and so hn(x) = ∞ whenever fm(x) = ∞); thus by Proposition 8.2 hn ∈ N. Now the sequence {hn}n≥m is increasing and if h = limn hn then f + h = fm (since h(x) = ∞ also holds whenever fm(x) = ∞). Moreover Φ(h) < ∞, since h ≤ fm and Φ(fm) < ∞. Therefore

Φ(f) + Φ(h) = Φ(f + h) = Φ(fm)
 = lim_{n→∞} Φ(fm) = lim_{n→∞} Φ(fn + hn) = lim_{n→∞} (Φ(fn) + Φ(hn))
 = lim_{n→∞} Φ(fn) + lim_{n→∞} Φ(hn) = lim_{n→∞} Φ(fn) + Φ(h)

and hence Φ(f) = limn Φ(fn), since Φ(h) < ∞.

Lemma 8.5 Let N′ be a subspace of M(X) and {fn}n≥1 be a sequence from N′.

(1) If N′ is complete and ∨-closed then sup{fn : n ≥ 1} ∈ N′.

(2) If N′ is co-complete and ∧-closed then inf{fn : n ≥ 1} ∈ N′.

(3) If N′ is a complete normal subspace of M(X) then lim supn fn and lim infn fn are both elements of N′.

Proof (1) Put f = sup{fn : n ≥ 1} and for each n ≥ 1 let gn = f1 ∨ · · · ∨ fn. Then {gn}n≥1 is an increasing sequence from N′ with limn gn = f and therefore f ∈ N′.

(2) is the same as (1) and (3) follows from (1) and (2), together with Proposition 8.2, which shows that N′ is also co-complete.


Proposition 8.5 For each sequence {fn}n≥1 from N

Φ(lim inf_{n→∞} fn) ≤ lim inf_{n→∞} Φ(fn) .

Moreover, if Φ(sup_{m≥n} fm) < ∞ for some n ≥ 1 then also

Φ(lim sup_{n→∞} fn) ≥ lim sup_{n→∞} Φ(fn) .

Proof Note by Lemma 8.5 all the mappings occurring here are elements of N. Now put f = lim infn fn and for each n ≥ 1 let gn = inf_{m≥n} fm; then {gn}n≥1 is an increasing sequence from N with limn gn = f and thus Φ(f) = limn Φ(gn). But gn ≤ fm and so Φ(gn) ≤ Φ(fm) for all m ≥ n ≥ 1. Hence Φ(gn) ≤ inf_{m≥n} Φ(fm) for each n ≥ 1 and this shows that

Φ(f) = lim_{n→∞} Φ(gn) ≤ lim_{n→∞} (inf_{m≥n} Φ(fm)) = lim inf_{n→∞} Φ(fn) .

The second part follows in exactly the same way, but using Proposition 8.4. Here we have lim supn fn = limn gn with gn = sup_{m≥n} fm and this time {gn}n≥1 is a decreasing sequence. The condition Φ(sup_{m≥n} fm) < ∞ (i.e., Φ(gn) < ∞) for some n ≥ 1 is what we need to apply Proposition 8.4.
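Both inequalities in Proposition 8.5 can be strict. The following is a minimal sketch of a standard example, under assumptions which are not taken from the text: X is a two-point set, N = M(X), Φ(f) is the sum of the two values of f, and {fn}n≥1 alternates between the indicator functions of the two points.

```python
X = [0, 1]

def phi(f):
    """Illustrative continuous linear mapping: the sum of the values on X."""
    return sum(f[x] for x in X)

# One full period of the alternating sequence f1, f2, f1, f2, ...
period = [{0: 1.0, 1: 0.0}, {0: 0.0, 1: 1.0}]

# For a periodic sequence every tail contains the same mappings, so the
# pointwise lim inf / lim sup are the pointwise min / max over one period.
lim_inf_f = {x: min(f[x] for f in period) for x in X}
lim_sup_f = {x: max(f[x] for f in period) for x in X}

# The same remark applies to the periodic sequence of numbers Φ(fn).
lim_inf_phi = min(phi(f) for f in period)
lim_sup_phi = max(phi(f) for f in period)

print(phi(lim_inf_f), lim_inf_phi)   # left-hand and right-hand side, first inequality
print(phi(lim_sup_f), lim_sup_phi)   # left-hand and right-hand side, second inequality
```

Here Φ(fn) is the same for every n, while lim infn fn is the constant 0 and lim supn fn is the constant 1, so neither inequality is an equality.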

Lemma 8.6 Let g ∈ N with Φ(g) < ∞ and put G = {x ∈ X : g(x) = ∞}. Then ∞IG ∈ N and Φ(∞IG) = 0.

Proof For each n ≥ 1 put gn = g ∧ n and hn = g − gn; then {hn}n≥1 is a decreasing sequence from N with Φ(h1) ≤ Φ(g) < ∞ and limn hn = ∞IG. Thus ∞IG ∈ N and by Proposition 8.4 Φ(∞IG) = limn Φ(hn) = Φ(g) − limn Φ(gn) = 0, since {gn}n≥1 is an increasing sequence from N with limn gn = g.

Proposition 8.6 Let {fn}n≥1 be a convergent sequence from N with f = limn fn and suppose there exists g ∈ N with Φ(g) < ∞ such that fn ≤ g for all n ≥ 1. Then the sequence {Φ(fn)}n≥1 converges and limn Φ(fn) = Φ(f). Moreover,

lim_{n→∞} Φ(|fn − f|) = 0 .

Proof For each n ≥ 1 put gn = sup_{k≥n} |f − fk|; then by Proposition 8.2 and Lemma 8.5 (1) gn ∈ N. Thus {gn}n≥1 is a decreasing sequence from N with gn ≤ g for all n ≥ 1 and limn gn(x) = 0 for all x ∉ G, where G = {x ∈ X : g(x) = ∞}. Therefore by Proposition 8.4 and Lemma 8.6

lim_{n→∞} Φ(gn) = Φ(lim_{n→∞} gn) ≤ Φ(∞IG) = 0 .

Hence also limn Φ(|fn − f|) = 0, since |f − fn| ≤ gn for each n ≥ 1. Finally, f ≤ |f − fn| + fn and fn ≤ |f − fn| + f, thus Φ(f) ≤ Φ(|f − fn|) + Φ(fn) and Φ(fn) ≤ Φ(|f − fn|) + Φ(f) and also Φ(fn) ≤ Φ(g) < ∞ for all n ≥ 1. This implies that |Φ(f) − Φ(fn)| ≤ Φ(|f − fn|) for each n ≥ 1 and so limn Φ(fn) = Φ(f).

9 Real valued measurable mappings

The subspace M(E) of the poset M(X) will now be studied.

In what follows let (X, E) be a measurable space. Recall that M(E) denotes the set of all measurable mappings from (X, E) to (R+∞, B+∞), thus

M(E) = {f ∈ M(X) : f^{−1}(B+∞) ⊂ E} .

If A is an algebra of subsets of X then ME(A) will denote the set of those elements f ∈ ME(X) for which {x ∈ X : f(x) = a} ∈ A for each a ∈ f(X). In particular this means that ME(E) = M(E) ∩ ME(X), i.e., ME(E) consists of those mappings f ∈ M(E) for which f(X) is a finite subset of R+. In Lemmas 9.1 and 9.2 let A ⊂ P(X) be an algebra.

Lemma 9.1 ME(A) is a normal subspace of M(X) which is ?-closed for each finite operation ? on R+∞.

Proof Let ? be a finite operation on R+∞. For each h ∈ ME(A) and a ∈ R+ put E^h_a = {x ∈ X : h(x) = a}. Let f, g ∈ ME(A); then

E^{f?g}_c = ⋃_{a?b=c} (E^f_a ∩ E^g_b)

for each c ∈ (f ? g)(X), where the union is restricted to values a, b ∈ R+ with a ∈ f(X) and b ∈ g(X). Thus E^{f?g}_c ∈ A for each c ∈ (f ? g)(X) and therefore, since f ? g ∈ ME(X) by Lemma 8.3 (1), f ? g ∈ ME(A). This shows that ME(A) is ?-closed.

In particular, af + bg ∈ ME(A) for all f, g ∈ ME(A) and all a, b ∈ R+, since (c, d) ↦ ac + bd is a finite operation on R+∞. Therefore ME(A) is a subspace of ME(X), since clearly 0 ∈ ME(A). Now let f, g ∈ ME(A) with g ≤ f and let

h = ∑_{a<b} (b − a) I_{E^g_a ∩ E^f_b} ,

where the sum is restricted to values a, b ∈ R+ with a ∈ g(X) and b ∈ f(X) and a < b. Then h ∈ ME(A), since ME(A) is a subspace of ME(X) and IA ∈ ME(A) for all A ∈ A. (If A ∈ A and f = IA then f(X) ⊂ {0, 1} with E^f_0 = X \ A ∈ A and E^f_1 = A ∈ A.) Moreover, h = f − g. This shows that ME(A) is a complemented subspace of ME(X). Finally, it is clear that 1 ∈ ME(A).

Lemma 9.2 The mapping IA is in ME(A) for all A ∈ A, and any subspace N of M(X) with IA ∈ N for each A ∈ A contains ME(A). Therefore ME(A) is the smallest subspace of M(X) containing the mappings IA, A ∈ A.

Proof It was already shown in the proof of Lemma 9.1 that IA ∈ ME(A) for all A ∈ A. The rest is the same as Lemma 8.3 (2).

Lemma 9.3 For each f ∈ M(E) there is an increasing sequence {fn}n≥1 from ME(E) such that f = limn fn. Thus M(E) = (ME(E))↑.

Proof Let f ∈ M(E); for each n ≥ 1 define fn ∈ ME(X) by

fn = ∑_{m=1}^{n2^n} (m − 1)2^{−n} IEm,n

where Em,n = {x ∈ X : (m − 1)2^{−n} < f(x) ≤ m2^{−n}} (i.e., defined as in the proof of Lemma 8.3 (3)). Then {fn}n≥1 is an increasing sequence from ME(X) with limn fn = f. But Em,n ∈ E for all n ≥ 1, 1 ≤ m ≤ n2^n, and therefore fn ∈ ME(E) for each n ≥ 1.

Lemma 9.4 If {fn}n≥1 is any sequence from M(E) and A = {fn : n ≥ 1} then sup(A) and inf(A) are both elements of M(E).

Proof Put f = sup(A) and f′ = inf(A); then for all a ∈ R+∞

{x ∈ X : f(x) ≤ a} = ⋂_{n≥1} {x ∈ X : fn(x) ≤ a}

is an element of E and hence f ∈ M(E). In the same way, for all a ∈ R+∞

{x ∈ X : f′(x) < a} = ⋃_{n≥1} {x ∈ X : fn(x) < a}

is an element of E and therefore again f′ ∈ M(E).

Proposition 9.1 M(E) is a complete normal subspace of M(X) which is also ?-closed for each finite continuous operation ? on R+∞.

Proof By Lemma 9.1 ME(E) is a normal subspace of M(X) and by Lemma 9.3 M(E) = (ME(E))↑. Moreover, by Lemma 9.4 M(E) is co-complete. Therefore by Lemma 8.2 M(E) is a complete normal subspace of M(X). Finally, it follows from Proposition 8.1 and Lemma 9.3 that M(E) is ?-closed for each finite continuous operation ? on R+∞.

By Propositions 9.1 and 8.2 |f − g| ∈ M(E) for all f, g ∈ M(E) and if g ≤ f then f − g ∈ M(E).


Proposition 9.2 Let N be a complete subspace of M(E) with IE ∈ N for each E ∈ E. Then N = M(E).

Proof Lemma 9.2 implies that ME(E) ⊂ N, since IE ∈ N for each E ∈ E. Therefore by Lemma 9.3 N = M(E), since N is complete.

By Proposition 9.1 M(E) is a complete normal subspace of M(X) and it is clear that E = {E ⊂ X : IE ∈ M(E)}. The next result states that the converse also holds.

Proposition 9.3 Let N be a complete normal subspace of M(X). Then

F = {F ⊂ X : IF ∈ N}

is a σ-algebra and N = M(F).

Proof If F ∈ F then IF ∈ N with IF ≤ 1 and IX\F is the unique element of M(X) with 1 = IF + IX\F. Thus IX\F ∈ N, since N is complemented, i.e., X \ F ∈ F for all F ∈ F.

Now let E, F ∈ F; then IE∩F = IE ∧ IF ∈ N, since N is ∧-closed, and hence E ∩ F ∈ F. This shows F is an algebra, since IX = 1 ∈ N and thus X ∈ F.

Next let {Fn}n≥1 be an increasing sequence from F and put F = ⋃_{n≥1} Fn. Then {IFn}n≥1 is an increasing sequence from N with limn IFn = IF and thus IF ∈ N. This shows F ∈ F and therefore by Lemma 2.1 (4) F is a σ-algebra.

Now let g ∈ N and a ∈ R+ with a > 0 and let Ga = {x ∈ X : g(x) > a}. Put h = a^{−1}g − (a^{−1}g) ∧ 1; then h ∈ N, since N is complemented and ∧-closed and 1 ∈ N. For each n ≥ 1 let gn = (nh) ∧ 1; then {gn}n≥1 is an increasing sequence from N with limn gn = IGa. Thus IGa ∈ N, since N is complete, i.e., Ga ∈ F, which implies that g ∈ M(F). This shows N ⊂ M(F). But by definition IF ∈ N for all F ∈ F and therefore by Proposition 9.2 N = M(F).
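The last step of the proof builds the indicator function of Ga = {x ∈ X : g(x) > a} out of g using only scalar multiplication, subtraction, ∧ and an increasing limit. The following is a minimal sketch evaluating h and gn = (nh) ∧ 1 at single points; the function names h_of and g_n and the sample values are hypothetical, and math.inf stands for ∞.

```python
import math

def h_of(g_value, a):
    """Value of h = a^{-1} g − (a^{-1} g) ∧ 1 at a point where g takes g_value."""
    scaled = g_value / a
    return scaled - min(scaled, 1.0)   # equals ∞ where g is ∞, as ∞ − 1 = ∞

def g_n(g_value, a, n):
    """Value of gn = (n h) ∧ 1 at the same point."""
    return min(n * h_of(g_value, a), 1.0)

a = 2.0
sample_values = [0.5, 2.0, 2.1, math.inf]   # illustrative values of g

# h vanishes exactly where g <= a, so gn stays 0 there and reaches 1 elsewhere.
early = [g_n(v, a, 1) for v in sample_values]
late = [g_n(v, a, 100) for v in sample_values]
print(early)
print(late)
```

For large n the values agree with the indicator function of {g > a}: they are 0 at 0.5 and 2.0 and 1 at 2.1 and ∞.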

Let MB(E) denote the set of the bounded mappings in M(E) (and therefore MB(E) = M(E) ∩ MB(X)).

Lemma 9.5 (1) MB(E) is a co-complete normal subspace of M(X) which is ?-closed for each finite continuous operation ? on R+∞.

(2) If f ∈ MB(E) then for each ε > 0 there exists a mapping g ∈ ME(E) with g ≤ f ≤ g + ε.

Proof (1) This follows immediately from Lemma 8.4 (1) and Proposition 9.1.

(2) If f ∈ MB(E) then the mapping g ∈ ME(X) in the proof of Lemma 8.4 (2) is in fact an element of ME(E).

If f, g ∈ MB(E) then |f − g| ∈ MB(E), since |f − g| is bounded.


Lemma 9.6 Let f, g ∈ M(E); then the set G♦ = {x ∈ X : f(x) ♦ g(x)} is in E whenever ♦ is one of the relations <, ≤, >, ≥, = or ≠.

Proof Let h = f ∨ g − g; thus by Proposition 9.1 h ∈ M(E). Now

{x ∈ X : f(x) > g(x)} = {x ∈ X : (f ∨ g)(x) > g(x)}
 = {x ∈ X : h(x) > 0} ∩ {x ∈ X : g(x) < ∞}

and thus G> ∈ E. Moreover, this implies also that G≥ ∈ E, since

{x ∈ X : f(x) ≥ g(x)} = {x ∈ X : f(x) = ∞} ∪ ⋂_{n≥1} {x ∈ X : (f + 1/n)(x) > g(x)} .

The other four cases now follow directly from these two.

Lemma 9.7 Let {fn}n≥1 be any sequence from M(E) and f ∈ M(E); then the sets G = {x ∈ X : limn fn(x) exists} and G′ = {x ∈ X : limn fn(x) = f(x)} are both in E.

Proof By Lemma 9.4 the mappings lim supn fn and lim infn fn are both elements of M(E) and G = {x ∈ X : lim supn fn(x) = lim infn fn(x)}. Therefore by Lemma 9.6 G ∈ E. Moreover, G′ = G ∩ {x ∈ X : lim supn fn(x) = f(x)} and so again making use of Lemmas 9.4 and 9.6 G′ ∈ E.

For the remainder of the chapter let µ be a measure on E .

If ♦ is one of the relations <, ≤, >, ≥, = or ≠ and f, g ∈ M(E) then we say that f ♦ g µ-almost everywhere (which is usually shortened to f ♦ g µ-a.e.) if µ(X \ {x ∈ X : f(x) ♦ g(x)}) = 0. The usage ‘µ-almost everywhere’ will also be applied to more complex statements, for example if {fn}n≥1 is a sequence from M(E) and f ∈ M(E) then limn fn = f µ-a.e. means that

µ(X \ {x ∈ X : lim_{n→∞} fn(x) = f(x)}) = 0 .

In general it means that if B is the set of elements for which the statement does not hold then µ(B) = 0 (with the implicit assumption that it has already been verified that B ∈ E).

A sequence {fn}n≥1 from M(E) is said to converge in µ-measure to f ∈ M(E) if

lim_{n→∞} µ({x ∈ X : |fn(x) − f(x)| > ε}) = 0

for each ε > 0. (Note that this is impossible if µ({x ∈ X : f(x) = ∞}) > 0, and we leave the reader to find a better definition which also treats convergence at ∞ properly.)


Proposition 9.4 Let µ be finite and let {fn}n≥1 be a sequence from M(E) which converges µ-a.e. to f ∈ M(E), where µ({x ∈ X : f(x) = ∞}) = 0. Then:

(1) {fn}n≥1 converges to f in µ-measure.

(2) For each ε > 0 there exists E ∈ E with µ(E) < ε such that {fn}n≥1 converges uniformly to f on X \ E.

Proof Put F = {x ∈ X : f(x) < ∞ and f(x) = limn fn(x)} and so µ(X \ F) = 0. For each ε > 0, n ≥ 1 let

E^ε_n = ⋃_{k≥n} {x ∈ X : |fk(x) − f(x)| > ε} .

Then {E^ε_n}n≥1 is a decreasing sequence from E with ⋂_{n≥1} E^ε_n ⊂ X \ F and hence by Lemma 3.2 limn µ(E^ε_n) = 0, since limn µ(E^ε_n) = µ(⋂_{n≥1} E^ε_n) ≤ µ(X \ F) = 0.

(1) This follows because {x ∈ X : |fn(x) − f(x)| > ε} ⊂ E^ε_n for each n ≥ 1.

(2) Let ε > 0; then for each m ≥ 1 there exists nm ≥ 1 with µ(E^{1/m}_{nm}) < 2^{−m}ε. Put E = ⋃_{m≥1} E^{1/m}_{nm}; then E ∈ E and µ(E) < ∑_{m≥1} 2^{−m}ε = ε. Moreover, the sequence {fn}n≥1 converges uniformly to f on X \ E, since |fn(x) − f(x)| ≤ 1/m for all n ≥ nm for each x ∈ X \ E.

Proposition 9.5 Let {fn}n≥1 be a sequence from M(E) converging to f ∈ M(E) in µ-measure. Then there exists a subsequence {nk}k≥1 so that limk fnk = f µ-a.e.

Proof Since {fn}n≥1 converges to f in µ-measure there exists a subsequence {nk}k≥1 such that µ({x ∈ X : |fnk(x) − f(x)| ≥ 1/k}) < 2^{−k} for each k ≥ 1. Let m ≥ 1; then for each k ≥ m

µ({x ∈ X : |fnk(x) − f(x)| ≥ 1/m}) ≤ µ({x ∈ X : |fnk(x) − f(x)| ≥ 1/k}) < 2^{−k}

which means that the series ∑_{k≥m} µ({x ∈ X : |fnk(x) − f(x)| ≥ 1/m}) converges. This in turn implies that

µ(⋂_{N≥1} ⋃_{k≥N} {x ∈ X : |fnk(x) − f(x)| ≥ 1/m}) = 0

for each m ≥ 1 and therefore by Lemma 3.3

µ(⋃_{m≥1} ⋂_{N≥1} ⋃_{k≥N} {x ∈ X : |fnk(x) − f(x)| ≥ 1/m}) = 0 .

But this says exactly that limk fnk = f µ-a.e.

10 The integral

We are now finally in a position to introduce the integral. The main interest isin the integral for a measure defined on a σ-algebra, but some of the results arebest stated for a measure defined only on an algebra. Thus to start with let Xbe a non-empty set and let A ⊂ P(X) be an algebra.

Recall from Chapter 8 that if N is a subspace of M(X) then a monotone linearmapping Φ : N → R+

∞ is said to be pre-continuous if Φ(f) ≤ limn Φ(fn) wheneverf ∈ N and {fn}n≥1 is an increasing sequence from N with f ≤ limn fn. Recallalso that a linear mapping Φ : ME(A) → R+

∞ is always monotone (since byLemma 9.1 ME(A) is complemented).

Proposition 10.1 (1) Let Φ : ME(A) → R+∞ be a linear mapping and define

µ : A → R+∞ by µ(A) = Φ(IA) for each A ∈ A. Then µ is a finitely additive

measure on A. Moreover, if Φ is pre-continuous then µ is a measure.

(2) For each finitely additive measure µ on A there exists a unique linear mappingΦµ : ME(A) → R+

∞ such that Φµ(IA) = µ(A) for all A ∈ A. Moreover, if µ is ameasure then Φµ is pre-continuous.

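Proof (1) If $\Phi$ is linear then $\mu(\emptyset) = \Phi(I_\emptyset) = \Phi(0) = 0$, and if $A_1, A_2 \in \mathcal{A}$ with $A_1 \cap A_2 = \emptyset$ then $I_{A_1 \cup A_2} = I_{A_1} + I_{A_2}$ and so

\[ \mu(A_1 \cup A_2) = \Phi(I_{A_1 \cup A_2}) = \Phi(I_{A_1}) + \Phi(I_{A_2}) = \mu(A_1) + \mu(A_2) . \]

This shows that $\mu$ is a finitely additive measure on $\mathcal{A}$.

Now let $\Phi$ be pre-continuous and let $\{A_n\}_{n\ge1}$ be an increasing sequence from $\mathcal{A}$ with $A = \bigcup_{n\ge1} A_n \in \mathcal{A}$. Then $\{I_{A_n}\}_{n\ge1}$ is an increasing sequence from $\mathrm{M}_E(\mathcal{A})$ with $\lim_n I_{A_n} = I_A$ and therefore

\[ \mu(A) = \Phi(I_A) \le \lim_{n\to\infty} \Phi(I_{A_n}) = \lim_{n\to\infty} \mu(A_n) . \]

But $\mu(A_n) = \Phi(I_{A_n}) \le \Phi(I_A) = \mu(A)$ for all $n \ge 1$, since $\Phi$ is monotone, and hence also $\lim_n \mu(A_n) \le \mu(A)$. Thus $\lim_n \mu(A_n) = \mu(A)$, which shows that $\mu$ is continuous. Therefore by Proposition 3.1 $\mu$ is a measure on $\mathcal{A}$.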

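(2) Define $\Phi_\mu : \mathrm{M}_E(\mathcal{A}) \to \mathbb{R}^+_\infty$ explicitly by letting

\[ \Phi_\mu(f) = \sum_{a \in f(X)} a\,\mu(\{x \in X : f(x) = a\}) . \]

In particular this means that $\Phi_\mu(I_A) = \mu(A)$ for all $A \in \mathcal{A}$. Moreover, $\Phi_\mu$ is linear: It is clear that $\Phi_\mu(af) = a\Phi_\mu(f)$ for all $f \in \mathrm{M}_E(\mathcal{A})$, $a \in \mathbb{R}^+$ and so it is enough to show that $\Phi_\mu(f + g) = \Phi_\mu(f) + \Phi_\mu(g)$ for all $f, g \in \mathrm{M}_E(\mathcal{A})$. Using the notation in the proof of Lemma 9.1 it follows that for all $c \in C = (f + g)(X)$

\[ \mu(E^{f+g}_c) = \mu\Bigl(\bigcup_{a+b=c} E^f_a \cap E^g_b\Bigr) = \sum_{a+b=c} \mu(E^f_a \cap E^g_b) , \]

with $a$ and $b$ being restricted to values in the sets $A = f(X)$ and $B = g(X)$ respectively. Therefore

\begin{align*}
\Phi_\mu(f + g) &= \sum_{c \in C} c\,\mu(E^{f+g}_c) = \sum_{c \in C} c \sum_{a+b=c} \mu(E^f_a \cap E^g_b) \\
&= \sum_{c \in C} \sum_{a+b=c} (a + b)\,\mu(E^f_a \cap E^g_b) = \sum_{a \in A} \sum_{b \in B} (a + b)\,\mu(E^f_a \cap E^g_b) \\
&= \sum_{a \in A} \sum_{b \in B} a\,\mu(E^f_a \cap E^g_b) + \sum_{a \in A} \sum_{b \in B} b\,\mu(E^f_a \cap E^g_b) \\
&= \sum_{a \in A} a\,\mu\Bigl(E^f_a \cap \bigcup_{b \in B} E^g_b\Bigr) + \sum_{b \in B} b\,\mu\Bigl(\bigcup_{a \in A} E^f_a \cap E^g_b\Bigr) \\
&= \sum_{a \in A} a\,\mu(E^f_a) + \sum_{b \in B} b\,\mu(E^g_b) = \Phi_\mu(f) + \Phi_\mu(g) .
\end{align*}

The uniqueness of $\Phi_\mu$ follows immediately from Lemma 9.2. Suppose now that $\mu$ is a measure on $\mathcal{A}$; we must show that $\Phi_\mu$ is pre-continuous and the next lemma gives the first step in this direction.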

Lemma 10.1 Let $A \in \mathcal{A}$, $a \in \mathbb{R}^+$ and let $\{f_n\}_{n\ge1}$ be an increasing sequence from $\mathrm{M}_E(\mathcal{A})$ with $aI_A \le \lim_n f_n$. Then $a\mu(A) \le \lim_n \Phi_\mu(f_n)$.

Proof This holds trivially if $a = 0$ and so we can assume that $a \ne 0$. Let $b \in \mathbb{R}^+$ with $b < a$ and for each $n \ge 1$ let $A_n = \{x \in A : f_n(x) > b\}$. Then $\{A_n\}_{n\ge1}$ is an increasing sequence from $\mathcal{A}$ with $\bigcup_{n\ge1} A_n = A$ and hence by Proposition 3.1 $\lim_n \mu(A_n) = \mu(A)$. But $bI_{A_n} \le f_n$ and hence $b\mu(A_n) = \Phi_\mu(bI_{A_n}) \le \Phi_\mu(f_n)$ for all $n \ge 1$; therefore $b\mu(A) = \lim_n b\mu(A_n) \le \lim_n \Phi_\mu(f_n)$ and since this holds for all $b < a$ it follows that $a\mu(A) \le \lim_n \Phi_\mu(f_n)$.

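Let $f \in \mathrm{M}_E(\mathcal{A})$ and let $\{f_n\}_{n\ge1}$ be an increasing sequence from $\mathrm{M}_E(\mathcal{A})$ with $f \le \lim_n f_n$. For each $a \in f(X)$ let $E_a = \{x \in X : f(x) = a\}$. Now let $a \in f(X)$; then $\{f_n I_{E_a}\}_{n\ge1}$ is an increasing sequence from $\mathrm{M}_E(\mathcal{A})$ with $aI_{E_a} \le \lim_n f_n I_{E_a}$ and hence by Lemma 10.1 $a\mu(E_a) \le \lim_n \Phi_\mu(f_n I_{E_a})$. But the sets $E_a$, $a \in f(X)$, form a partition of $X$, so $f_n = \sum_{a \in f(X)} f_n I_{E_a}$ for each $n \ge 1$ and thus

\[ \Phi_\mu(f) = \sum_{a \in f(X)} a\,\mu(E_a) \le \sum_{a \in f(X)} \Bigl(\lim_{n\to\infty} \Phi_\mu(f_n I_{E_a})\Bigr) = \lim_{n\to\infty} \sum_{a \in f(X)} \Phi_\mu(f_n I_{E_a}) = \lim_{n\to\infty} \Phi_\mu(f_n) . \]

This shows that $\Phi_\mu$ is pre-continuous.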
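The explicit formula for Φµ on elementary mappings can be checked by hand on a finite set. The following is only a sketch under stated assumptions: X is a three-point set, µ is given by point masses, and the names `phi`, `mu`, `f`, `g` are hypothetical choices made for this illustration.

```python
from fractions import Fraction

def phi(mu, f):
    """Phi_mu(f) = sum over values a of f of a * mu({x : f(x) = a}).

    mu: dict mapping each point of a finite set X to its non-negative mass,
        so the measure of a set is the sum of the masses of its points.
    f:  dict mapping each point of X to a non-negative value.
    """
    total = Fraction(0)
    for a in set(f.values()):
        level_set_mass = sum(mu[x] for x in f if f[x] == a)
        total += a * level_set_mass
    return total

X = ["p", "q", "r"]
mu = {"p": Fraction(1, 2), "q": Fraction(1, 3), "r": Fraction(2)}
f = {"p": Fraction(1), "q": Fraction(1), "r": Fraction(4)}
g = {"p": Fraction(3), "q": Fraction(0), "r": Fraction(4)}
f_plus_g = {x: f[x] + g[x] for x in X}

# Additivity: the level sets of f + g differ from those of f and g,
# but the three sums agree as in the computation in the proof.
print(phi(mu, f), phi(mu, g), phi(mu, f_plus_g))

# An indicator function recovers the measure of the set.
indicator = {"p": Fraction(1), "q": Fraction(0), "r": Fraction(1)}
print(phi(mu, indicator) == mu["p"] + mu["r"])
```

Exact rational arithmetic is used so that the additivity check is an equality rather than a floating-point approximation.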

Here is a version of Proposition 3.2 which is stated in terms of the mapping $\Phi_\mu$:

Proposition 10.2 Suppose that $\mu$ is a finite finitely additive measure on $\mathcal{A}$ and let $\Phi_\mu : \mathrm{M}_E(\mathcal{A}) \to \mathbb{R}^+_\infty$ be the unique linear mapping such that $\Phi_\mu(I_A) = \mu(A)$ for all $A \in \mathcal{A}$. Then $\mu$ is a measure if and only if $\lim_n \Phi_\mu(f_n) = 0$ for each decreasing sequence $\{f_n\}_{n\ge1}$ from $\mathrm{M}_E(\mathcal{A})$ with $\lim_n f_n = 0$.

Proof Suppose $\mu$ is a measure and let $\{f_n\}_{n\ge1}$ be a decreasing sequence from $\mathrm{M}_E(\mathcal{A})$ with $\lim_n f_n = 0$. Since $f_1$ is bounded there exists $b \in \mathbb{R}^+$ with $f_1 \le b$ (and then $f_n \le b$ for all $n \ge 1$). Let $\varepsilon > 0$, choose $\eta > 0$ with $2\mu(X)\eta < \varepsilon$ and for each $n \ge 1$ let $A_n = \{x \in X : f_n(x) \ge \eta\}$. Then $f_n \le bI_{A_n} + \eta$ and therefore

\[ \Phi_\mu(f_n) \le b\Phi_\mu(I_{A_n}) + \eta\Phi_\mu(I_X) = b\mu(A_n) + \eta\mu(X) < b\mu(A_n) + \varepsilon/2 , \]

since $\Phi_\mu$ is linear and monotone. But $\{A_n\}_{n\ge1}$ is a decreasing sequence from $\mathcal{A}$ with $\bigcap_{n\ge1} A_n = \emptyset$ and thus by Proposition 3.2 $\lim_n \mu(A_n) = 0$. Hence there exists $m \ge 1$ so that $b\mu(A_n) < \varepsilon/2$ for all $n \ge m$, which implies that $\Phi_\mu(f_n) < \varepsilon$ for all $n \ge m$. This shows that $\lim_n \Phi_\mu(f_n) = 0$. Conversely, if $\lim_n \Phi_\mu(f_n) = 0$ for each decreasing sequence $\{f_n\}_{n\ge1}$ from $\mathrm{M}_E(\mathcal{A})$ with $\lim_n f_n = 0$ then clearly $\mu$ is $\emptyset$-continuous and therefore by Proposition 3.2 $\mu$ is a measure.

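Proposition 10.3 For each measure $\mu$ on $\mathcal{A}$ there exists a unique continuous linear mapping $\Phi_\mu : (\mathrm{M}_E(\mathcal{A}))^\uparrow \to \mathbb{R}^+_\infty$ such that $\Phi_\mu(I_A) = \mu(A)$ for all $A \in \mathcal{A}$.

Proof By Proposition 10.1 (2) $\Phi_\mu : \mathrm{M}_E(\mathcal{A}) \to \mathbb{R}^+_\infty$ is pre-continuous and by Lemma 9.1 $\mathrm{M}_E(\mathcal{A})$ is a ∨-closed subspace of $\mathrm{M}(X)$. Thus by Proposition 8.4 $\Phi_\mu$ extends uniquely to a continuous linear mapping from $(\mathrm{M}_E(\mathcal{A}))^\uparrow$ to $\mathbb{R}^+_\infty$ (which will also be denoted by $\Phi_\mu$). Thus $\Phi_\mu : (\mathrm{M}_E(\mathcal{A}))^\uparrow \to \mathbb{R}^+_\infty$ is the unique continuous linear mapping such that $\Phi_\mu(I_A) = \mu(A)$ for all $A \in \mathcal{A}$.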

Now in what follows let (X, E) be a measurable space.

Proposition 10.4 Let $\Phi : \mathrm{M}(\mathcal{E}) \to \mathbb{R}^+_\infty$ be a continuous linear mapping and define $\mu : \mathcal{E} \to \mathbb{R}^+_\infty$ by $\mu(E) = \Phi(I_E)$. Then $\mu$ is a measure on $\mathcal{E}$.

Proof By Proposition 9.1 $\mathrm{M}(\mathcal{E})$ is complete and hence by Lemma 7.3 $\Phi$ is also pre-continuous. The result thus follows from Proposition 10.1 (1).

Theorem 10.1 For each measure $\mu$ on $\mathcal{E}$ there exists a unique continuous linear mapping $\Phi_\mu : \mathrm{M}(\mathcal{E}) \to \mathbb{R}^+_\infty$ such that $\Phi_\mu(I_E) = \mu(E)$ for all $E \in \mathcal{E}$.

Proof By Lemma 9.3 $\mathrm{M}(\mathcal{E}) = (\mathrm{M}_E(\mathcal{E}))^\uparrow$ and therefore the result follows from Proposition 10.3.

Proposition 10.4 implies that the mapping $\mu \mapsto \Phi_\mu$ (given in Theorem 10.1) defines a bijection between the set of measures on $\mathcal{E}$ and the set of continuous linear mappings from $\mathrm{M}(\mathcal{E})$ to $\mathbb{R}^+_\infty$.

It is usual to write something like $\int f\,d\mu$ instead of $\Phi_\mu(f)$ (at least if $\Phi_\mu(f) \ne \infty$) and to call $\Phi_\mu(f)$ the integral of $f$ with respect to $\mu$. However, we prefer to just write $\mu(f)$ instead of $\Phi_\mu(f)$. This means that a measure $\mu$ on $\mathcal{E}$ will also be considered as the unique continuous linear mapping $\mu : \mathrm{M}(\mathcal{E}) \to \mathbb{R}^+_\infty$ with $\mu(I_E) = \mu(E)$ for all $E \in \mathcal{E}$.

In terms of the mapping $\mu : \mathrm{M}(\mathcal{E}) \to \mathbb{R}^+_\infty$ a measure $\mu$ is finite if and only if $\mu(1) < \infty$ and is a probability measure if and only if $\mu(1) = 1$.

Theorem 10.2 Let $\mu$ be a measure on $\mathcal{E}$ and let $\{f_n\}_{n\ge1}$ be a decreasing sequence from $\mathrm{M}(\mathcal{E})$ such that $\mu(f_m) < \infty$ for some $m \ge 1$. Then

\[ \mu\Bigl(\lim_{n\to\infty} f_n\Bigr) = \lim_{n\to\infty} \mu(f_n) . \]

Proof This is just a special case of Proposition 8.4, since by Proposition 9.1 $\mathrm{M}(\mathcal{E})$ is a complete normal subspace of $\mathrm{M}(X)$.

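Theorem 10.3 Let $\{f_n\}_{n\ge1}$ be a sequence from $\mathrm{M}(\mathcal{E})$. Then

\[ \mu\Bigl(\liminf_{n\to\infty} f_n\Bigr) \le \liminf_{n\to\infty} \mu(f_n) . \]

Moreover, if $\mu\bigl(\sup_{m\ge n} f_m\bigr) < \infty$ for some $n \ge 1$ then also

\[ \mu\Bigl(\limsup_{n\to\infty} f_n\Bigr) \ge \limsup_{n\to\infty} \mu(f_n) . \]

Proof This is just a special case of Proposition 8.5.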
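Both inequalities in Theorem 10.3 can be strict. The sketch below is an illustration that is not taken from the notes: it assumes the counting measure on a two-point set and a sequence alternating between the two indicator functions, and the names `integral` and `f_n` are hypothetical.

```python
def integral(mu, f):
    """mu(f) = sum of f(x) * mu({x}) over a finite set X."""
    return sum(mu[x] * f[x] for x in mu)

mu = {0: 1, 1: 1}   # counting measure on X = {0, 1}

def f_n(n):
    # f_n alternates between the indicators of {0} and {1}.
    return {0: 1, 1: 0} if n % 2 == 0 else {0: 0, 1: 1}

# The sequence has period 2, so lim inf and lim sup are the
# minimum and maximum over one period.
period = [f_n(1), f_n(2)]
liminf_f = {x: min(f[x] for f in period) for x in mu}
limsup_f = {x: max(f[x] for f in period) for x in mu}
liminf_of_integrals = min(integral(mu, f) for f in period)
limsup_of_integrals = max(integral(mu, f) for f in period)

print(integral(mu, liminf_f), "<=", liminf_of_integrals)
print(integral(mu, limsup_f), ">=", limsup_of_integrals)
```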

Theorem 10.3 (or at least the first part of it) is known as Fatou's lemma. The next result is known as the dominated convergence theorem.

Theorem 10.4 Let $\{f_n\}_{n\ge1}$ with $f = \lim_n f_n$ be a convergent sequence from $\mathrm{M}(\mathcal{E})$ and suppose there exists $g \in \mathrm{M}(\mathcal{E})$ with $\mu(g) < \infty$ such that $f_n \le g$ for all $n \ge 1$. Then the sequence $\{\mu(f_n)\}_{n\ge1}$ converges and $\lim_n \mu(f_n) = \mu(f)$. Moreover, $\lim_n \mu(|f_n - f|) = 0$.

Proof This is just a special case of Proposition 8.6.

Proposition 10.5 Let $\mu$ be a measure on $\mathcal{E}$.

(1) $\mu = 0$ (i.e., $\mu(f) = 0$ for all $f \in \mathrm{M}(\mathcal{E})$) if and only if $\mu(1) = 0$.

(2) If $\mu(f) = 0$ then $\mu(fg) = 0$ for all $g \in \mathrm{M}(\mathcal{E})$.

Proof (1) Let $N = \{f \in \mathrm{M}(\mathcal{E}) : \mu(f) = 0\}$; then $N$ is a complete subspace of $\mathrm{M}(\mathcal{E})$ and so by Proposition 9.2 $N = \mathrm{M}(\mathcal{E})$ if and only if $\mu(I_E) = 0$ for all $E \in \mathcal{E}$, and this holds if and only if $\mu(1) = \mu(X) = 0$, since the mapping $\mu : \mathcal{E} \to \mathbb{R}^+_\infty$ is monotone.

(2) This follows from (1), since the mapping $g \mapsto \mu(fg)$ (from $\mathrm{M}(\mathcal{E})$ to $\mathbb{R}^+_\infty$) is linear and continuous and thus by Proposition 10.4 a measure.

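Lemma 10.2 Let $\mu$ be a measure on $\mathcal{E}$. Then

\[ a\,\mu(\{x \in X : f(x) \ge a\}) \le \mu(f) \]

for all $f \in \mathrm{M}(\mathcal{E})$ and all $a \in \mathbb{R}^+$.

Proof Put $E = \{x \in X : f(x) \ge a\}$; then $aI_E \in \mathrm{M}_E(\mathcal{E})$ with $aI_E \le f$ and thus $a\mu(E) = \mu(aI_E) \le \mu(f)$.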
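Lemma 10.2 (often called Markov's inequality) is easy to verify numerically on a finite measure space. This is a minimal sketch, assuming a measure given by point masses on three points; the helper names `integral` and `markov_sides` are hypothetical.

```python
from fractions import Fraction

def integral(mu, f):
    """mu(f) = sum of f(x) * mu({x}) over a finite set X."""
    return sum(mu[x] * f[x] for x in mu)

def markov_sides(mu, f, a):
    """Return the pair (a * mu({f >= a}), mu(f)) from Lemma 10.2."""
    level = sum(mu[x] for x in mu if f[x] >= a)
    return a * level, integral(mu, f)

mu = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
f = {1: 0, 2: 2, 3: 6}

for a in (0, 2, 6, 7):
    lhs, rhs = markov_sides(mu, f, a)
    print(a, lhs, rhs, lhs <= rhs)
```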

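Proposition 10.6 Let $\mu$ be a measure on $\mathcal{E}$ and $f \in \mathrm{M}(\mathcal{E})$. Then:

(1) $\mu(f) = 0$ if and only if $\mu(\{x \in X : f(x) > 0\}) = 0$.

(2) $\mu(fI_N) = 0$ for all $N \in \mathcal{E}$ with $\mu(N) = 0$.

(3) If $E \in \mathcal{E}$ with $\mu(X \setminus E) = 0$ then $\mu(fI_E) = \mu(f)$.

(4) If $\mu(f) < \infty$ then $\mu(\{x \in X : f(x) = \infty\}) = 0$.

(5) If $\mu(f) < \infty$ and $F \in \mathcal{E}$ is such that $\{x \in X : 0 < f(x) < \infty\} \subset F$ then $\mu(fI_F) = \mu(f)$. Moreover, $\mu(hfI_F) = \mu(hf)$ for all $h \in \mathrm{M}(\mathcal{E})$.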

Proof For each $a \in \mathbb{R}^+$ put $E_a = \{x \in X : f(x) \ge a\}$.

(1) Let $E = \{x \in X : f(x) > 0\}$. If $\mu(f) = 0$ then by Lemma 10.2 $a\mu(E_a) = 0$ for each $a \in \mathbb{R}^+$ and in particular $\mu(E_{1/n}) = 0$ for all $n \ge 1$. Thus by Lemma 3.3

\[ \mu(E) = \mu\Bigl(\bigcup_{n\ge1} E_{1/n}\Bigr) \le \sum_{n\ge1} \mu(E_{1/n}) = 0 . \]

Suppose conversely $\mu(E) = 0$ and let $g \in \mathrm{M}_E(\mathcal{E})$ with $g \le f$. Then $g \le bI_E$, where $b = \max(g(X))$, and therefore $\mu(g) \le \mu(bI_E) = b\mu(E) = 0$. It now follows from the definition of $\mu(f)$ that $\mu(f) = 0$.

(2) This follows immediately from (1), since $\{x \in X : (fI_N)(x) > 0\} \subset N$.

(3) By (2) $\mu(f) = \mu(fI_E + fI_{X\setminus E}) = \mu(fI_E) + \mu(fI_{X\setminus E}) = \mu(fI_E)$.

(4) By Lemma 10.2 $a\mu(\{x \in X : f(x) = \infty\}) \le a\mu(E_a) \le \mu(f)$ for each $a \in \mathbb{R}^+$, since $\{x \in X : f(x) = \infty\} \subset E_a$, and thus $\mu(\{x \in X : f(x) = \infty\}) = 0$.

(5) Let $F' = \{x \in X : f(x) < \infty\}$; then $fI_{F'} \le fI_F \le f$, which implies that $\mu(fI_{F'}) \le \mu(fI_F) \le \mu(f)$. But by (4) $\mu(X \setminus F') = 0$ and therefore by (3) it follows that $\mu(fI_{F'}) = \mu(f)$. Thus $\mu(fI_F) = \mu(f)$. Now let $E \in \mathcal{E}$; then $\mu(I_E f) < \infty$ and $\{x \in X : 0 < (I_E f)(x) < \infty\} \subset \{x \in X : 0 < f(x) < \infty\} \subset F$, and hence applying the first part to $I_E f$ shows that $\mu(I_E f I_F) = \mu(I_E f)$. But $N = \{h \in \mathrm{M}(\mathcal{E}) : \mu(hfI_F) = \mu(hf)\}$ is clearly a closed subspace of $\mathrm{M}(\mathcal{E})$ and hence by Proposition 9.2 $N = \mathrm{M}(\mathcal{E})$, i.e., $\mu(hfI_F) = \mu(hf)$ for all $h \in \mathrm{M}(\mathcal{E})$.

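The following result is the Cauchy-Schwarz inequality:

Proposition 10.7 Let $\mu$ be a measure on $\mathcal{E}$; then $\mu(fg)^2 \le \mu(f^2)\mu(g^2)$ for all $f, g \in \mathrm{M}(\mathcal{E})$.

Proof If either $\mu(f^2) = 0$ or $\mu(g^2) = 0$ then by Proposition 10.6 (1) $\mu(fg) = 0$ and so we can assume that $\mu(f^2) > 0$ and $\mu(g^2) > 0$. Moreover, we can then also assume that $\mu(f^2) < \infty$ and $\mu(g^2) < \infty$, and this implies that $\mu(fg) < \infty$, since $fg \le \tfrac{1}{2}(f^2 + g^2)$. Now if $\lambda \ge 0$ then $|f - \lambda g| \in \mathrm{M}(\mathcal{E})$ and thus also $|f - \lambda g|^2 \in \mathrm{M}(\mathcal{E})$. But then $|f - \lambda g|^2 + 2\lambda fg = f^2 + \lambda^2 g^2$ and hence

\[ \mu(|f - \lambda g|^2) + 2\lambda\mu(fg) = \mu(f^2) + \lambda^2\mu(g^2) . \]

It therefore follows that $2\lambda\mu(fg) \le \lambda^2\mu(g^2) + \mu(f^2)$ for all $\lambda \ge 0$. Taking $\lambda = \mu(fg)/\mu(g^2)$ gives $\mu(fg)^2/\mu(g^2) \le \mu(f^2)$, i.e., $\mu(fg)^2 \le \mu(f^2)\mu(g^2)$.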
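The last step of the proof amounts to choosing the value of λ that minimises λ²µ(g²) − 2λµ(fg), namely λ = µ(fg)/µ(g²). The following sketch checks the identity and the resulting inequality on a hypothetical three-point measure space; all data and names are assumptions made for illustration.

```python
from fractions import Fraction

def integral(mu, f):
    """mu(f) = sum of f(x) * mu({x}) over a finite set X."""
    return sum(mu[x] * f[x] for x in mu)

mu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(1, 2)}
f = {0: Fraction(1), 1: Fraction(3), 2: Fraction(0)}
g = {0: Fraction(2), 1: Fraction(1), 2: Fraction(4)}

mu_fg = integral(mu, {x: f[x] * g[x] for x in mu})
mu_f2 = integral(mu, {x: f[x] ** 2 for x in mu})
mu_g2 = integral(mu, {x: g[x] ** 2 for x in mu})

lam = mu_fg / mu_g2   # the minimising choice of lambda
residual = integral(mu, {x: (f[x] - lam * g[x]) ** 2 for x in mu})

# Identity from the proof: mu(|f - lam g|^2) + 2 lam mu(fg) = mu(f^2) + lam^2 mu(g^2)
print(residual + 2 * lam * mu_fg == mu_f2 + lam ** 2 * mu_g2)
# Cauchy-Schwarz itself
print(mu_fg ** 2, "<=", mu_f2 * mu_g2)
```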

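Recall that a measure $\mu$ on $\mathcal{E}$ is σ-finite if there exists a sequence $\{B_n\}_{n\ge1}$ from $\mathcal{E}$ with $\mu(B_n) < \infty$ for each $n \ge 1$ and $X = \bigcup_{n\ge1} B_n$. Recall also that it is always possible to choose the sequence $\{B_n\}_{n\ge1}$ here either to be increasing or to be disjoint. Let

\[ \mathrm{M}^+_F(\mathcal{E}) = \{f \in \mathrm{M}(\mathcal{E}) : 0 < f(x) < \infty \text{ for all } x \in X\} . \]

Lemma 10.3 A measure $\mu$ on $\mathcal{E}$ is σ-finite if and only if $\mu(v) < \infty$ for some $v \in \mathrm{M}^+_F(\mathcal{E})$.

Proof Suppose there exists $v \in \mathrm{M}^+_F(\mathcal{E})$ such that $\mu(v) < \infty$, and for each $n \ge 1$ let $B_n = \{x \in X : v(x) \ge 1/n\}$. Then $B_n \in \mathcal{E}$ and $\bigcup_{n\ge1} B_n = X$, since $v(x) > 0$ for all $x \in X$. But by Lemma 10.2 $\mu(B_n) \le n\mu(v) < \infty$ for each $n \ge 1$ and hence $\mu$ is σ-finite. Suppose conversely that $\mu$ is σ-finite; then there exists a disjoint sequence $\{B_n\}_{n\ge1}$ from $\mathcal{E}$ with $\mu(B_n) < \infty$ for each $n \ge 1$ and $X = \bigcup_{n\ge1} B_n$. For each $n \ge 1$ let

\[ v_n = \sum_{k=1}^{n} 2^{-k}(1 + \mu(B_k))^{-1} I_{B_k} . \]

Then $v_n \in \mathrm{M}_E(\mathcal{E}) \subset \mathrm{M}(\mathcal{E})$, the sequence $\{v_n\}_{n\ge1}$ is increasing and

\[ \mu(v_n) = \sum_{k=1}^{n} 2^{-k}(1 + \mu(B_k))^{-1}\mu(B_k) \le \sum_{k=1}^{n} 2^{-k} < 1 . \]

Let $v = \lim_n v_n$; then $v \in \mathrm{M}(\mathcal{E})$ and $\mu(v) \le 1$. But $0 < v(x) < \infty$ for all $x \in X$, since $v(x) = 2^{-k}(1 + \mu(B_k))^{-1}$ for all $x \in B_k$, $k \ge 1$, and thus $v \in \mathrm{M}^+_F(\mathcal{E})$.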
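The weights used in the construction of v can be tabulated for a concrete σ-finite measure of infinite total mass. The sketch below assumes, purely for illustration, disjoint blocks B_k with µ(B_k) = k; the function names are hypothetical.

```python
from fractions import Fraction

def mu_of_block(k):
    # Hypothetical choice: mu(B_k) = k, so mu(X) = 1 + 2 + 3 + ... is infinite.
    return Fraction(k)

def weight(k):
    """Constant value of v on the block B_k: 2**-k / (1 + mu(B_k))."""
    return Fraction(1, 2**k) / (1 + mu_of_block(k))

def mu_v_n(n):
    """mu(v_n) = sum over k = 1..n of weight(k) * mu(B_k)."""
    return sum(weight(k) * mu_of_block(k) for k in range(1, n + 1))

for n in (1, 5, 20):
    # The partial integrals increase with n but stay below 1.
    print(n, float(mu_v_n(n)))
```

Every weight is strictly positive and finite, which is exactly what places v in the cone of strictly positive finite mappings, while the partial integrals are bounded by the geometric series.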

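Lemma 10.4 Let $\mu$ be a σ-finite measure on $\mathcal{E}$ and $f, g \in \mathrm{M}(\mathcal{E})$.

(1) $f \le g$ holds $\mu$-a.e. if and only if $\mu(I_E f) \le \mu(I_E g)$ for all $E \in \mathcal{E}$, and this holds if and only if $\mu(hf) \le \mu(hg)$ for all $h \in \mathrm{M}(\mathcal{E})$.

(2) $f = g$ holds $\mu$-a.e. if and only if $\mu(I_E f) = \mu(I_E g)$ for all $E \in \mathcal{E}$, and this holds if and only if $\mu(hf) = \mu(hg)$ for all $h \in \mathrm{M}(\mathcal{E})$.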

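Proof (1) Assume first $f \le g$ $\mu$-a.e. and let $G = \{x \in X : f(x) > g(x)\}$, thus $\mu(G) = 0$. Then $I_E f I_{X\setminus G} \le I_E g I_{X\setminus G}$ for each $E \in \mathcal{E}$ and therefore

\[ \mu(I_E f) = \mu(I_E f I_{X\setminus G}) \le \mu(I_E g I_{X\setminus G}) = \mu(I_E g) \]

by Proposition 10.6 (3). But $\{h \in \mathrm{M}(\mathcal{E}) : \mu(hf) \le \mu(hg)\}$ is clearly a closed subspace of $\mathrm{M}(\mathcal{E})$ and hence by Proposition 9.2 $\mu(hf) \le \mu(hg)$ for all $h \in \mathrm{M}(\mathcal{E})$ (and this part holds for an arbitrary measure $\mu$). Suppose now conversely that $\mu(hf) \le \mu(hg)$ for all $h \in \mathrm{M}(\mathcal{E})$ (and so in particular $\mu(I_E f) \le \mu(I_E g)$ for all $E \in \mathcal{E}$). Fix $b \in \mathbb{R}^+$ and $B \in \mathcal{E}$ with $\mu(B) < \infty$; for each $\varepsilon > 0$ let

\[ A_\varepsilon = \{x \in X : f(x) \ge (g + \varepsilon)(x)\} \cap \{x \in B : g(x) \le b\} ; \]

then, since $I_{A_\varepsilon}(g + \varepsilon) \le I_{A_\varepsilon} f$, it follows that

\[ \mu(I_{A_\varepsilon} g) + \varepsilon\mu(A_\varepsilon) = \mu(I_{A_\varepsilon}(g + \varepsilon)) \le \mu(I_{A_\varepsilon} f) \le \mu(I_{A_\varepsilon} g) , \]

and so $\mu(A_\varepsilon) = 0$, since $\mu(I_{A_\varepsilon} g) \le \mu(bI_B) = b\mu(B) < \infty$. Thus by Lemma 3.3

\[ \mu(\{x \in B : f(x) > g(x) \text{ and } g(x) \le b\}) = \mu\Bigl(\bigcup_{n\ge1} A_{1/n}\Bigr) \le \sum_{n\ge1} \mu(A_{1/n}) = 0 , \]

and therefore again making use of Lemma 3.3

\[ \mu(\{x \in B : f(x) > g(x)\}) = \mu\Bigl(\bigcup_{n\ge1} \{x \in B : f(x) > g(x) \text{ and } g(x) \le n\}\Bigr) = 0 . \]

Now let $\{B_n\}_{n\ge1}$ be a sequence from $\mathcal{E}$ with $\mu(B_n) < \infty$ for each $n \ge 1$ and with $X = \bigcup_{n\ge1} B_n$. Then, once more using Lemma 3.3, we have

\[ \mu(\{x \in X : f(x) > g(x)\}) = \mu\Bigl(\bigcup_{n\ge1} \{x \in B_n : f(x) > g(x)\}\Bigr) = 0 . \]

(2) This follows immediately from (1).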

Let $\mu$ be a measure on $\mathcal{E}$; a subset $U$ of $\mathrm{M}(\mathcal{E})$ is said to be uniformly $\mu$-integrable if for each $\varepsilon > 0$ there exists $\delta > 0$ such that $\mu(fI_E) < \varepsilon$ for all $f \in U$ and all $E \in \mathcal{E}$ with $\mu(E) < \delta$.

Lemma 10.5 Let $U$ be a subset of $\mathrm{M}(\mathcal{E})$ and suppose there exists $h \in \mathrm{M}(\mathcal{E})$ with $\mu(h) < \infty$ such that $f \le h$ for all $f \in U$. Then $U$ is uniformly $\mu$-integrable.

Proof This follows from Lemma 10.6 below, since $\mu(fI_E) \le \mu(hI_E)$ for all $f \in U$ and all $E \in \mathcal{E}$.

Note that if $U$ is as in Lemma 10.5 then there exists $b \in \mathbb{R}^+$ ($= \mu(h)$) such that $\mu(f) \le b$ for all $f \in U$. If $\mu$ is finite then usually the converse is true, i.e., if $U \subset \mathrm{M}(\mathcal{E})$ is uniformly $\mu$-integrable then there will exist $b \in \mathbb{R}^+$ with $\mu(f) \le b$ for all $f \in U$. (This will be true if for each $\delta > 0$ the set $X$ can be written as a finite union $\bigcup_{k=1}^{m} E_k$ with $\mu(E_k) < \delta$ for each $k$.)

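Lemma 10.6 Let $h \in \mathrm{M}(\mathcal{E})$ with $\mu(h) < \infty$. Then for each $\varepsilon > 0$ there exists $\delta > 0$ such that $\mu(hI_E) < \varepsilon$ for all $E \in \mathcal{E}$ with $\mu(E) < \delta$. In other words, the one element set $\{h\}$ is uniformly $\mu$-integrable.

Proof For $n \ge 1$ let $E_n = \{x \in X : h(x) > n\}$; then $\{hI_{E_n}\}_{n\ge1}$ is a decreasing sequence from $\mathrm{M}(\mathcal{E})$ with $\mu(hI_{E_1}) \le \mu(h) < \infty$ and $\lim_n hI_{E_n} = hI_{E_\infty}$, where $E_\infty = \{x \in X : h(x) = \infty\}$. Therefore by Theorem 10.2 and Proposition 10.6 (2) and (4) $\lim_n \mu(hI_{E_n}) = \mu(hI_{E_\infty}) = 0$. Now let $\varepsilon > 0$; then there exists $m \ge 1$ such that $\mu(hI_{E_m}) < \varepsilon/2$. Let $\delta = \varepsilon/(2m)$; if $E \in \mathcal{E}$ with $\mu(E) < \delta$ then

\begin{align*}
\mu(hI_E) &= \mu(hI_{E\cap E_m} + hI_{E\setminus E_m}) = \mu(hI_{E\cap E_m}) + \mu(hI_{E\setminus E_m}) \\
&\le \mu(hI_{E_m}) + \mu(mI_E) = \mu(hI_{E_m}) + m\mu(E) < \varepsilon/2 + \varepsilon/2 = \varepsilon ,
\end{align*}

since $hI_{E\cap E_m} \le hI_{E_m}$ and $hI_{E\setminus E_m} \le mI_E$.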

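Lemma 10.5 implies that the following result is a generalisation of Theorem 10.4 (but only with $\mu$ finite). Put $\mathrm{M}_\mu(\mathcal{E}) = \{f \in \mathrm{M}(\mathcal{E}) : \mu(\{x \in X : f(x) = \infty\}) = 0\}$.

Proposition 10.8 Let $\mu$ be finite and let $\{f_n\}_{n\ge1}$ be a convergent sequence from $\mathrm{M}_\mu(\mathcal{E})$ with $\lim_n f_n = f \in \mathrm{M}_\mu(\mathcal{E})$. Suppose that the set $\{f_n : n \ge 1\}$ is uniformly $\mu$-integrable. Then there exists $b \in \mathbb{R}^+$ such that $\mu(f) \le b$ and $\mu(f_n) \le b$ for each $n \ge 1$ and $\lim_n \mu(|f_n - f|) = 0$.

Proof Let $F = \{x \in X : f(x) < \infty \text{ and } f_n(x) < \infty \text{ for all } n \ge 1\}$ and for each $m \ge 1$ let $E_m = \{x \in F : f_n(x) \le m \text{ for all } n \ge 1\}$; then $\{E_m\}_{m\ge1}$ is an increasing sequence from $\mathcal{E}$ with $\bigcup_{m\ge1} E_m = F$ and therefore by Proposition 3.1 $\lim_m \mu(E_m) = \mu(F) = \mu(X)$, since by Lemma 3.3 $\mu(X \setminus F) = 0$. Let $\varepsilon > 0$ and choose $\delta > 0$ so that $\mu(f_n I_E) < \varepsilon/3$ for all $n \ge 1$ whenever $\mu(E) < \delta$. Since $\mu$ is finite there then exists $p \ge 1$ such that $\mu(X \setminus E_p) < \delta$ and thus by Theorem 10.3

\[ \mu(fI_{X\setminus E_p}) = \mu\Bigl(\lim_{n\to\infty} f_n I_{X\setminus E_p}\Bigr) \le \liminf_{n\to\infty} \mu(f_n I_{X\setminus E_p}) \le \varepsilon/3 . \]

Now $f_n I_{E_p} \le p$ for each $n \ge 1$ and so $\mu(f_n) = \mu(f_n I_{E_p}) + \mu(f_n I_{X\setminus E_p}) \le p\mu(X) + \varepsilon/3$ for each $n \ge 1$; also $\mu(f) \le p\mu(X) + \varepsilon/3$, since $fI_{E_p} \le p$. Moreover,

\begin{align*}
\mu(|f - f_n|) &= \mu(|f - f_n|I_{E_p}) + \mu(|f - f_n|I_{X\setminus E_p}) \\
&\le \mu(|f - f_n|I_{E_p}) + \mu(fI_{X\setminus E_p}) + \mu(f_n I_{X\setminus E_p}) \\
&< \mu(|f - f_n|I_{E_p}) + 2\varepsilon/3 .
\end{align*}

Put $g_n = |f - f_n|I_{E_p}$; then $g_n \le p$ for all $n \ge 1$ (and $\mu(p) = p\mu(X) < \infty$ since $\mu$ is finite) and $\lim_n g_n = 0$. Thus by Theorem 10.4 $\lim_n \mu(g_n) = 0$ and this shows that $\lim_n \mu(|f_n - f|) = 0$. Finally, taking $b = p\mu(X) + \varepsilon/3$ (for some arbitrary $\varepsilon > 0$) shows that $\mu(f) \le b$ and $\mu(f_n) \le b$ for each $n \ge 1$.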

11 The Daniell integral

In this chapter we give another approach to the integral. The material presented here will not be used elsewhere, and so the chapter can be omitted. In the following let $X$ be a non-empty set.

Theorem 11.1 Let $N$ be a normal subspace of $\mathrm{M}(X)$ and let $\Phi : N \to \mathbb{R}^+_\infty$ be a pre-continuous linear mapping. Then there exists a σ-algebra $\mathcal{E} \subset \mathcal{P}(X)$ with $N \subset \mathrm{M}(\mathcal{E})$ and a measure $\mu$ on $\mathcal{E}$ such that $\Phi(f) = \mu(f)$ for all $f \in N$.

The construction involved in the proof of this result is due to Daniell (in 1917) and thus goes under the name of the Daniell integral. However, before starting with the proof we apply Theorem 11.1 to obtain another proof of the Caratheodory extension theorem (Theorem 4.1). Thus let $\mathcal{A} \subset \mathcal{P}(X)$ be an algebra and let $\mu$ be a measure on $\mathcal{A}$. We want to show that there exists a measure $\nu$ on $\sigma(\mathcal{A})$ such that $\nu(A) = \mu(A)$ for all $A \in \mathcal{A}$. By Lemma 9.1 $\mathrm{M}_E(\mathcal{A})$ is a normal subspace of $\mathrm{M}(X)$ and by Proposition 10.1 (2) the unique linear mapping $\Phi_\mu : \mathrm{M}_E(\mathcal{A}) \to \mathbb{R}^+_\infty$ with $\Phi_\mu(I_A) = \mu(A)$ for all $A \in \mathcal{A}$ is pre-continuous. Thus by Theorem 11.1 there exists a σ-algebra $\mathcal{E} \subset \mathcal{P}(X)$ with $\mathrm{M}_E(\mathcal{A}) \subset \mathrm{M}(\mathcal{E})$ and a measure $\lambda$ on $\mathcal{E}$ such that $\Phi_\mu(f) = \lambda(f)$ for all $f \in \mathrm{M}_E(\mathcal{A})$. Then $\mathcal{A} \subset \mathcal{E}$, since $I_A \in \mathrm{M}_E(\mathcal{A})$ for each $A \in \mathcal{A}$, and hence $\sigma(\mathcal{A}) \subset \mathcal{E}$. Let $\nu$ be the restriction of $\lambda$ to $\sigma(\mathcal{A})$; then $\mu(A) = \Phi_\mu(I_A) = \lambda(I_A) = \lambda(A) = \nu(A)$ for all $A \in \mathcal{A}$.

We now turn to the proof of Theorem 11.1.

A complete subspace $N$ of $\mathrm{M}(X)$ will be called weakly complemented if for all $f, g \in N$ with $g \le f$ there exists an increasing sequence $\{g_n\}_{n\ge1}$ from $N \cap \mathrm{M}_F(X)$ with $g = \lim_n g_n$ and for each $n \ge 1$ an element $h_n \in N$ with $f = g_n + h_n$. (The sequence $\{h_n\}_{n\ge1}$ is then more-or-less decreasing, and it can actually be chosen to be decreasing when $N$ is ∧-closed. However, we do not need to make use of this fact.)

Let us say that a complete subspace $N$ of $\mathrm{M}(X)$ is weakly normal if it is a weakly complemented subspace containing 1 and closed under ∧ and ∨. In particular, a complete normal subspace $N$ is weakly complemented and hence weakly normal. (Let $f, g \in N$ with $g \le f$. Then $\{g \wedge n\}_{n\ge1}$ is an increasing sequence from $N \cap \mathrm{M}_F(X)$ with $g = \lim_n (g \wedge n)$, and there exists $h_n \in N$ with $f = (g \wedge n) + h_n$, since $N$ is complemented and $g \wedge n \le f$.)

Lemma 11.1 Let $N$ be a normal subspace of $\mathrm{M}(X)$. Then $N^\uparrow$ is weakly normal.

Proof Let $f, g \in N^\uparrow$ with $g \le f$ and let $\{g'_n\}_{n\ge1}$ be an increasing sequence from $N$ with $g = \lim_n g'_n$. For $n \ge 1$ put $g_n = g'_n \wedge n$; then $\{g_n\}_{n\ge1}$ is an increasing sequence from $N \cap \mathrm{M}_F(X) \subset N^\uparrow \cap \mathrm{M}_F(X)$ with $g = \lim_n g_n$. But $g_n \in N$ and $g_n \le f$ and therefore by Lemma 8.1 there exists $h_n \in N^\uparrow$ with $f = g_n + h_n$.

Let $N$ be a normal subspace of $\mathrm{M}(X)$ and $\Phi : N \to \mathbb{R}^+_\infty$ be a pre-continuous linear mapping. Then by Lemma 11.1 $N^\uparrow$ is a complete weakly normal subspace of $\mathrm{M}(X)$ and Proposition 8.4 implies there exists a continuous linear mapping $\Phi' : N^\uparrow \to \mathbb{R}^+_\infty$ such that $\Phi'(f) = \Phi(f)$ for all $f \in N$. Theorem 11.1 thus follows immediately from the following result:

Proposition 11.1 Let $N$ be a complete weakly normal subspace of $\mathrm{M}(X)$ and let $\Phi : N \to \mathbb{R}^+_\infty$ be a continuous linear mapping. Then there exists a σ-algebra $\mathcal{E} \subset \mathcal{P}(X)$ with $N \subset \mathrm{M}(\mathcal{E})$ and a measure $\mu$ on $\mathcal{E}$ such that $\Phi(f) = \mu(f)$ for all $f \in N$.

The rest of the chapter is taken up with the proof of Proposition 11.1. Thus let $N$ be a complete weakly normal subspace of $\mathrm{M}(X)$ and let $\Phi : N \to \mathbb{R}^+_\infty$ be a continuous linear mapping. Define mappings $\Phi^*, \Phi_* : \mathrm{M}(X) \to \mathbb{R}^+_\infty$ by letting

\[ \Phi^*(f) = \inf\{\Phi(g) : g \in N \text{ with } g \ge f\} , \]

\[ \Phi_*(f) = \sup\{\Phi(h) - \Phi(g) : h, g \in N \text{ with } \Phi(g) < \infty \text{ and } h \le g + f\} \]

for each $f \in \mathrm{M}(X)$. (Note that $\Phi_*(f) \ge 0$ since $0 \le 0 + f$ and $\Phi(0) - \Phi(0) = 0$.)

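Lemma 11.2 (1) The mappings $\Phi^*$ and $\Phi_*$ are both monotone.

(2) $\Phi_*(f) \le \Phi^*(f)$ for each $f \in \mathrm{M}(X)$.

(3) $\Phi^*(af) = a\Phi^*(f)$ and $\Phi_*(af) = a\Phi_*(f)$ for all $f \in \mathrm{M}(X)$, $a \in \mathbb{R}^+$.

(4) $\Phi^*(f_1 + f_2) \le \Phi^*(f_1) + \Phi^*(f_2)$ for all $f_1, f_2 \in \mathrm{M}(X)$.

(5) $\Phi_*(f_1 + f_2) \ge \Phi_*(f_1) + \Phi_*(f_2)$ for all $f_1, f_2 \in \mathrm{M}(X)$.

(6) If $f \in \mathrm{M}(X)$, $g \in N$, $h \in N'$ with $f + h \le g$ then $\Phi^*(f) \le \Phi(g) - \Phi(h)$.

(7) $\Phi_*(f_1 + f_2) \le \Phi^*(f_1) + \Phi_*(f_2) \le \Phi^*(f_1 + f_2)$ for all $f_1, f_2 \in \mathrm{M}(X)$.

(8) $\Phi_*(g) = \Phi(g) = \Phi^*(g)$ for each $g \in N$.

(9) If $\{f_n\}_{n\ge1}$ is any sequence from $\mathrm{M}(X)$ and $f = \sum_{n\ge1} f_n$ then

\[ \Phi^*(f) \le \sum_{n\ge1} \Phi^*(f_n) . \]

(Here $f = \lim_n s_n$, where $\{s_n\}_{n\ge1}$ is the increasing sequence with $s_n = \sum_{k=1}^{n} f_k$ for each $n \ge 1$.)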


Proof Let $N' = \{g \in N : \Phi(g) < \infty\}$; thus $N'$ is a subspace of $\mathrm{M}(X)$.

(1) This is clear.

(2) Let $h \in N$, $g' \in N'$ with $h \le g' + f$ and let $g \in N$ with $g \ge f$. Then $h \le g' + g$ and thus $\Phi(h) \le \Phi(g' + g) = \Phi(g') + \Phi(g)$. Hence $\Phi(h) - \Phi(g') \le \Phi(g)$ and this implies that $\Phi_*(f) \le \Phi^*(f)$.

(3) It is clear that $\Phi^*(af) = a\Phi^*(f)$ for all $f \in \mathrm{M}(X)$, $a \in \mathbb{R}^+$. Moreover, $\Phi_*(0) = 0$, since if $h \in N$, $g \in N'$ with $h \le g + 0$ then $\Phi(h) \le \Phi(g)$ and hence $\Phi(h) - \Phi(g) \le 0$ (and we have already noted that $\Phi_*(f) \ge 0$ for all $f \in \mathrm{M}(X)$). This implies that $\Phi_*(0f) = 0\Phi_*(f)$ for all $f \in \mathrm{M}(X)$, and so it remains to show that if $a > 0$ then $\Phi_*(af) = a\Phi_*(f)$ for all $f \in \mathrm{M}(X)$. Now if $h \in N$, $g \in N'$ with $h \le g + f$ then $ah \in N$, $ag \in N'$ and $ah \le ag + af$ and hence $\Phi_*(af) \ge \Phi(ah) - \Phi(ag) = a(\Phi(h) - \Phi(g))$, from which it follows that $\Phi_*(af) \ge a\Phi_*(f)$. Applying this to the mapping $af$ and with $a$ replaced by $b = 1/a$ then also gives $\Phi_*(f) = \Phi_*(b(af)) \ge b\Phi_*(af)$, i.e., $\Phi_*(af) \le a\Phi_*(f)$.

(4) Let $g_1, g_2 \in N$ with $g_1 \ge f_1$ and $g_2 \ge f_2$. Then $g_1 + g_2 \ge f_1 + f_2$ with $g_1 + g_2 \in N$, and thus $\Phi^*(f_1 + f_2) \le \Phi(g_1 + g_2) = \Phi(g_1) + \Phi(g_2)$. It therefore follows that $\Phi^*(f_1 + f_2) \le \Phi^*(f_1) + \Phi^*(f_2)$.

(5) Let $h_1, h_2 \in N$, $g_1, g_2 \in N'$ with $h_1 \le g_1 + f_1$ and $h_2 \le g_2 + f_2$. Then $h_1 + h_2 \in N$, $g_1 + g_2 \in N'$ with $h_1 + h_2 \le g_1 + g_2 + f_1 + f_2$ and thus

\[ \Phi_*(f_1 + f_2) \ge \Phi(h_1 + h_2) - \Phi(g_1 + g_2) = \Phi(h_1) - \Phi(g_1) + \Phi(h_2) - \Phi(g_2) . \]

Therefore $\Phi_*(f_1 + f_2) \ge \Phi_*(f_1) + \Phi_*(f_2)$.

(6) Let $f \in \mathrm{M}(X)$, $g \in N$ and $h \in N'$ with $f + h \le g$. If $\Phi(g) = \infty$ then clearly $\Phi^*(f) \le \Phi(g) - \Phi(h)$ and so we can assume that $g \in N'$. Now $h \le g$ and $N$ is weakly complemented, and so there exists an increasing sequence $\{h_n\}_{n\ge1}$ from $N \cap \mathrm{M}_F(X)$ with $h = \lim_n h_n$ and for each $n \ge 1$ an element $h'_n \in N$ with $g = h_n + h'_n$. Since $\Phi$ is continuous we then have $\Phi(h) = \lim_n \Phi(h_n)$. Let $\varepsilon > 0$; there thus exists $p \ge 1$ with $\Phi(h_p) > \Phi(h) - \varepsilon$. But $f + h_p \le f + h \le g = h_p + h'_p$ and hence $f \le h'_p$, since $h_p \in \mathrm{M}_F(X)$. Therefore

\[ \Phi^*(f) \le \Phi(h'_p) = \Phi(g) - \Phi(h_p) < \Phi(g) - \Phi(h) + \varepsilon \]

and this implies that $\Phi^*(f) \le \Phi(g) - \Phi(h)$.

(7) We first show that $\Phi_*(f_1 + f_2) \le \Phi^*(f_1) + \Phi_*(f_2)$, and for this we can assume $\Phi^*(f_1) < \infty$. Let $h \in N$, $g, g_1 \in N'$ with $h \le g + f_1 + f_2$ and $g_1 \ge f_1$. Then $g + g_1 \in N'$ and $h \le g + g_1 + f_2$ and thus

\[ \Phi(g_1) + \Phi_*(f_2) \ge \Phi(g_1) + \Phi(h) - \Phi(g + g_1) = \Phi(h) - \Phi(g) . \]

Therefore $\Phi(g_1) + \Phi_*(f_2) \ge \Phi_*(f_1 + f_2)$ for all $g_1 \in N'$ with $g_1 \ge f_1$, and this implies that $\Phi_*(f_1 + f_2) \le \Phi^*(f_1) + \Phi_*(f_2)$ (since $\Phi^*(f_1) < \infty$).

We next show that $\Phi^*(f_1) + \Phi_*(f_2) \le \Phi^*(f_1 + f_2)$ and here we can assume that $\Phi^*(f_1 + f_2) < \infty$. Let $g \in N'$ with $g \ge f_1 + f_2$ and let $h_2 \in N$, $g_2 \in N'$ with $h_2 \le g_2 + f_2$ (and so in fact $h_2 \in N'$). Then

\[ h_2 + f_1 \le g_2 + f_2 + f_1 = g_2 + f_1 + f_2 \le g_2 + g , \]

i.e., $f_1 + h_2 \le g + g_2$, and hence by (6) $\Phi^*(f_1) \le \Phi(g + g_2) - \Phi(h_2)$. Therefore

\[ \Phi^*(f_1) + \Phi(h_2) - \Phi(g_2) \le \Phi(g) , \]

and thus $\Phi^*(f_1) + \Phi_*(f_2) \le \Phi(g)$ for all $g \in N'$ with $g \ge f_1 + f_2$. This in turn implies that $\Phi^*(f_1) + \Phi_*(f_2) \le \Phi^*(f_1 + f_2)$ (again since $\Phi^*(f_1 + f_2) < \infty$).
(8) This is clear.

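(9) Let $\{f_n\}_{n\ge1}$ be a sequence from $\mathrm{M}(X)$ and put $f = \sum_{n\ge1} f_n$. We need to show $\Phi^*(f) \le \sum_{n\ge1} \Phi^*(f_n)$ and for this it can be assumed that $\sum_{n\ge1} \Phi^*(f_n) < \infty$ (and so in particular $\Phi^*(f_n) < \infty$ for each $n \ge 1$). Let $\varepsilon > 0$; then for each $n \ge 1$ there exists $g_n \in N$ with $g_n \ge f_n$ and $\Phi(g_n) < \Phi^*(f_n) + 2^{-n}\varepsilon$. Let $g = \sum_{n\ge1} g_n$; then, since $N$ is a complete subspace and $\Phi$ is a continuous linear mapping it follows that $g \in N$ and

\[ \Phi(g) = \lim_{n\to\infty} \Phi\Bigl(\sum_{k=1}^{n} g_k\Bigr) = \lim_{n\to\infty} \sum_{k=1}^{n} \Phi(g_k) = \sum_{n\ge1} \Phi(g_n) . \]

But $g \ge f$ and therefore

\[ \Phi^*(f) \le \Phi(g) = \sum_{n\ge1} \Phi(g_n) \le \sum_{n\ge1} (\Phi^*(f_n) + 2^{-n}\varepsilon) = \sum_{n\ge1} \Phi^*(f_n) + \varepsilon . \]

Thus $\Phi^*(f) \le \sum_{n\ge1} \Phi^*(f_n)$, since $\varepsilon > 0$ is arbitrary.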

Now put $M = \{f \in \mathrm{M}(X) : \Phi_*(f) = \Phi^*(f)\}$ and let $\Psi : M \to \mathbb{R}^+_\infty$ be the restriction of $\Phi^*$ (or $\Phi_*$) to $M$. Also let $M' = \{f \in M : \Psi(f) < \infty\}$.

Lemma 11.3 (1) $M$ is a complete subspace of $\mathrm{M}(X)$ with $N \subset M$.

(2) $\Psi : M \to \mathbb{R}^+_\infty$ is a continuous linear mapping and $\Psi(g) = \Phi(g)$ for all $g \in N$.

(3) If $f_1 \in M'$, $f_2 \in \mathrm{M}(X)$ with $f_1 + f_2 \in M$ then $f_2 \in M$.

(4) If $f_1, f_2 \in M'$ then $f_1 \vee f_2$ and $f_1 \wedge f_2$ are both in $M'$.

(5) If $f \in M'$ and $h \in N$ then $h \wedge f \in M'$.

Proof It follows immediately from Lemma 11.2 (1), (2), (3), (4) and (5) that $M$ is a subspace of $\mathrm{M}(X)$ and that $\Psi$ is a monotone linear mapping. Moreover, by Lemma 11.2 (8) $N \subset M$ and $\Psi(g) = \Phi(g)$ for all $g \in N$.


To complete the proofs of (1) and (2) we must still show that $M$ is complete and that $\Psi$ is continuous, but before doing this we first look at part (3). Thus consider $f_1 \in M'$, $f_2 \in \mathrm{M}(X)$ with $f_1 + f_2 \in M$. Then by Lemma 11.2 (7)

\[ \Psi(f_1 + f_2) = \Phi_*(f_1 + f_2) \le \Phi^*(f_1) + \Phi_*(f_2) \le \Phi^*(f_1 + f_2) = \Psi(f_1 + f_2) \]

and (reversing the roles of $f_1$ and $f_2$)

\[ \Psi(f_1 + f_2) = \Phi_*(f_1 + f_2) \le \Phi_*(f_1) + \Phi^*(f_2) \le \Phi^*(f_1 + f_2) = \Psi(f_1 + f_2) . \]

But $\Phi_*(f_1) = \Psi(f_1) = \Phi^*(f_1)$, and therefore $\Psi(f_1) + \Phi_*(f_2) = \Psi(f_1) + \Phi^*(f_2)$; hence $\Phi_*(f_2) = \Phi^*(f_2)$, since $\Psi(f_1) < \infty$, i.e., $f_2 \in M$.

Now let $\{f_n\}_{n\ge1}$ be an increasing sequence from $M$ with $f = \lim_n f_n$. Then by Lemma 11.2 (1) $\Phi_*(f) \ge \lim_n \Phi_*(f_n) = \lim_n \Psi(f_n)$. This means that if we can show that $\Phi^*(f) \le \lim_n \Psi(f_n)$ then it will follow both that $f \in M$ and that $\Psi(f) = \lim_n \Psi(f_n)$, thus proving that $M$ is complete and also that $\Psi$ is continuous. We will thus show that $\Phi^*(f) \le \lim_n \Phi^*(f_n)$, and for this it can be assumed that $\lim_n \Phi^*(f_n) = a < \infty$. Since $f_n \le f_{n+1}$ there exists $h_n \in \mathrm{M}(X)$ with $f_{n+1} = f_n + h_n$ and then by (3) $h_n \in M$ (noting that $\Psi(f_n) \le a < \infty$). But $f_{n+1} = f_1 + \sum_{k=1}^{n} h_k$ for each $n \ge 1$ and thus $f = f_1 + \sum_{n\ge1} h_n$. Therefore by Lemma 11.2 (9)

\begin{align*}
\Phi^*(f) &\le \Phi^*(f_1) + \sum_{n\ge1} \Phi^*(h_n) \\
&= \Psi(f_1) + \sum_{n\ge1} \Psi(h_n) = \Psi(f_1) + \sum_{n\ge1} (\Psi(f_{n+1}) - \Psi(f_n)) \\
&= \lim_{n\to\infty} \Bigl(\Psi(f_1) + \sum_{k=1}^{n-1} (\Psi(f_{k+1}) - \Psi(f_k))\Bigr) = \lim_{n\to\infty} \Psi(f_n) .
\end{align*}

We have thus now shown that (1), (2) and (3) hold.

(4) For $j \in \{1, 2\}$ let $f_j \in M'$. Let $\varepsilon > 0$; then there exist $g_j, g'_j \in N'$, $h_j \in N$ with $f_j \le g_j$, $h_j \le g'_j + f_j$ and $\Phi(g_j) + \Phi(g'_j) < \Phi(h_j) + \varepsilon/2$. (Note that $\Phi(h_j) \le \Phi(g'_j) + \Phi(g_j) < \infty$, since $h_j \le g'_j + g_j$.) Then $f_1 \vee f_2 \le g_1 \vee g_2$ and by assumption $g_1 \vee g_2 \in N$; hence $\Phi^*(f_1 \vee f_2) \le \Phi(g_1 \vee g_2)$. Moreover,

\[ (h_1 + g'_2) \vee (h_2 + g'_1) \le g'_1 + g'_2 + f_1 \vee f_2 . \]

(Let $b_j, c_j, d_j \in \mathbb{R}^+$ with $d_j \le c_j + b_j$ for $j \in \{1, 2\}$; then

\begin{align*}
(d_1 - c_1) \vee (d_2 - c_2) &= (d_1 - c_1 - d_2 + c_2) \vee 0 + d_2 - c_2 \\
&= (d_1 + c_2 - d_2 - c_1) \vee 0 + d_2 - c_2 \\
&= (d_1 + c_2) \vee (d_2 + c_1) - (d_2 + c_1) + d_2 - c_2 \\
&= (d_1 + c_2) \vee (d_2 + c_1) - (c_1 + c_2)
\end{align*}

and thus $(d_1 + c_2) \vee (d_2 + c_1) \le c_1 + c_2 + b_1 \vee b_2$. But this also remains true when $\mathbb{R}^+$ is replaced by $\mathbb{R}^+_\infty$, since the right-hand side of the inequality is $\infty$ when any of the values is equal to $\infty$.) Now $g'_1 + g'_2 \in N'$ and by assumption $v = (h_1 + g'_2) \vee (h_2 + g'_1) \in N'$, since $\Phi(v) \le \Phi(h_1 + g'_2 + h_2 + g'_1) < \infty$. Then $v \le g'_1 + g'_2 + f_1 \vee f_2$, from which it follows that $\Phi_*(f_1 \vee f_2) \ge \Phi(v) - \Phi(g'_1 + g'_2)$. But

\[ g_1 \vee g_2 + h_1 + h_2 \le (h_1 + g'_2) \vee (h_2 + g'_1) + g_1 + g_2 = v + g_1 + g_2 . \]

(Let $b_j, c_j, d_j \in \mathbb{R}^+$ with $d_j \le c_j + b_j$ for $j \in \{1, 2\}$; then, using the calculation made above,

\[ b_1 \vee b_2 + c_1 + c_2 - (d_1 + c_2) \vee (d_2 + c_1) = b_1 \vee b_2 - (d_1 - c_1) \vee (d_2 - c_2) \le (b_1 - d_1 + c_1) + (b_2 - d_2 + c_2) \]

since $(y_1 \vee y_2) - (x_1 \vee x_2) \le (y_1 - x_1) + (y_2 - x_2)$ whenever $x_1, x_2, y_1, y_2 \in \mathbb{R}$ with $x_1 \le y_1$ and $x_2 \le y_2$. Thus

\[ b_1 \vee b_2 + d_1 + d_2 \le (d_1 + c_2) \vee (d_2 + c_1) + b_1 + b_2 \]

and this still holds when $\mathbb{R}^+$ is replaced by $\mathbb{R}^+_\infty$, since the right-hand side of the inequality is $\infty$ when any of the values is equal to $\infty$.) Therefore

\[ \Phi(g_1 \vee g_2) + \Phi(h_1 + h_2) = \Phi(g_1 \vee g_2 + h_1 + h_2) \le \Phi(v + g_1 + g_2) = \Phi(v) + \Phi(g_1 + g_2) \]

which implies that

\begin{align*}
\Phi^*(f_1 \vee f_2) - \Phi_*(f_1 \vee f_2) &\le \Phi(g_1 \vee g_2) - \Phi(v) + \Phi(g'_1 + g'_2) \\
&\le \Phi(g_1 + g_2) - \Phi(h_1 + h_2) + \Phi(g'_1 + g'_2) \\
&\le \Phi(g_1) - \Phi(h_1) + \Phi(g'_1) + \Phi(g_2) - \Phi(h_2) + \Phi(g'_2) \\
&< \varepsilon/2 + \varepsilon/2 = \varepsilon .
\end{align*}

This shows that $\Phi^*(f_1 \vee f_2) = \Phi_*(f_1 \vee f_2)$, i.e., $f_1 \vee f_2 \in M$, and hence $f_1 \vee f_2 \in M'$, since $\Psi(f_1 \vee f_2) \le \Psi(f_1) + \Psi(f_2) < \infty$. Finally, it now follows from (3) that also $f_1 \wedge f_2 \in M'$, since $f_1 \vee f_2 + f_1 \wedge f_2 = f_1 + f_2$ and $f_1 \vee f_2 \in M'$.

(5) Let $f \in M'$ and $h \in N$. Now let $\varepsilon > 0$; then there exist $g, g' \in N'$ and $h' \in N$ with $f \le g$, $h' \le g' + f$ and $\Phi(g) + \Phi(g') < \Phi(h') + \varepsilon$. (Note that $\Phi(h') \le \Phi(g') + \Phi(g) < \infty$, since $h' \le g' + g$.) Then $h \wedge f \le h \wedge g$ and by assumption $h \wedge g \in N$; hence $\Phi^*(h \wedge f) \le \Phi(h \wedge g)$. Moreover,

\[ h' \wedge (h + g') \le g' + h \wedge f . \]

(Let $a, b, c, d \in \mathbb{R}^+$ with $d \le c + b$; then as above $a \wedge (d - c) = (a + c) \wedge d - c$ and thus $d \wedge (a + c) \le c + a \wedge b$. But this also remains true when $\mathbb{R}^+$ is replaced by $\mathbb{R}^+_\infty$.)

Now $v = h' \wedge (h + g') \in N'$, since $\Phi(v) \le \Phi(h') < \infty$. Then $v \le g' + h \wedge f$, from which it follows that $\Phi_*(h \wedge f) \ge \Phi(v) - \Phi(g')$. But

\[ h \wedge g + h' \le (h + g') \wedge h' + g = v + g . \]

(Let $a, b, c, d \in \mathbb{R}^+$ with $d \le c + b$; then, using the calculation made above,

\[ a \wedge b + c - (a + c) \wedge d = a \wedge b - a \wedge (d - c) \le b - d + c . \]

Thus $a \wedge b + d \le (a + c) \wedge d + b$ and this still holds when $\mathbb{R}^+$ is replaced by $\mathbb{R}^+_\infty$.) Therefore $\Phi(h \wedge g) + \Phi(h') = \Phi(h \wedge g + h') \le \Phi(v + g) = \Phi(v) + \Phi(g)$ and so

\[ \Phi^*(h \wedge f) - \Phi_*(h \wedge f) \le \Phi(h \wedge g) - \Phi(v) + \Phi(g') \le \Phi(g) + \Phi(g') - \Phi(h') < \varepsilon . \]

This shows that $\Phi^*(h \wedge f) = \Phi_*(h \wedge f)$, i.e., $h \wedge f \in M$, and hence $h \wedge f \in M'$, since $\Psi(h \wedge f) \le \Psi(f) < \infty$.

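By Lemma 11.3 (1) and (2) $M'$ is a subspace of $\mathrm{M}(X)$. Put

\[ \mathcal{E} = \{E \subset X : I_E \wedge f \in M' \text{ for all } f \in M'\} . \]

Lemma 11.4 $\mathcal{E}$ is a σ-algebra and $N \cup M' \subset \mathrm{M}(\mathcal{E})$.

Proof First note that if $f \in M'$ and $g \wedge f \in M$ for some $g \in \mathrm{M}(X)$ then in fact $g \wedge f \in M'$, since $\Psi(g \wedge f) \le \Psi(f) < \infty$, and this means that

\[ \mathcal{E} = \{E \subset X : I_E \wedge f \in M \text{ for all } f \in M'\} . \]

Let $E \in \mathcal{E}$; if $f \in M'$ then $I_E \wedge f + I_{X\setminus E} \wedge f = 1 \wedge f$ and Lemma 11.3 (5) implies that $1 \wedge f \in M'$, since by assumption $1 \in N$. Thus by Lemma 11.3 (3) $I_{X\setminus E} \wedge f \in M$. This shows that $X \setminus E \in \mathcal{E}$. Now let $E, F \in \mathcal{E}$; if $f \in M'$ then $I_{E\cap F} \wedge f = (I_E \wedge f) \wedge (I_F \wedge f)$ and hence by Lemma 11.3 (4) $I_{E\cap F} \wedge f \in M'$. This shows that $E \cap F \in \mathcal{E}$. But clearly $\emptyset \in \mathcal{E}$ and therefore $\mathcal{E}$ is an algebra.

Next let $\{E_n\}_{n\ge1}$ be an increasing sequence from $\mathcal{E}$ and put $E = \bigcup_{n\ge1} E_n$. If $f \in M'$ then $\{I_{E_n} \wedge f\}_{n\ge1}$ is an increasing sequence from $M'$ with $\lim_n I_{E_n} \wedge f = I_E \wedge f$ and thus by Lemma 11.3 (1) $I_E \wedge f \in M$. This shows $E \in \mathcal{E}$ and therefore by Lemma 2.1 (4) $\mathcal{E}$ is a σ-algebra.

Now let $g \in N$ and $a \in \mathbb{R}^+$ with $a > 0$ and let $G_a = \{x \in X : g(x) > a\}$. Put $h = a^{-1}g - (a^{-1}g) \wedge 1$; then $h \in N$, since $N$ is complemented and ∧-closed and $1 \in N$. For each $n \ge 1$ let $g_n = (nh) \wedge 1$; then $\{g_n\}_{n\ge1}$ is an increasing sequence from $N$ with $\lim_n g_n = I_{G_a}$. Thus if $f \in M'$ then by Lemma 11.3 (5) $\{g_n \wedge f\}_{n\ge1}$ is an increasing sequence from $M'$ with $\lim_n g_n \wedge f = I_{G_a} \wedge f$ and so by Lemma 11.3 (1) $I_{G_a} \wedge f \in M$. This shows that $G_a \in \mathcal{E}$ and hence Lemma 2.7 implies that $g \in \mathrm{M}(\mathcal{E})$.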


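But the same proof also shows that $M' \subset \mathrm{M}(\mathcal{E})$: Let $g \in M'$; then the mapping $h$ defined above satisfies $a^{-1}g = (a^{-1}g) \wedge 1 + h$ and by Lemma 11.3 (5) $(a^{-1}g) \wedge 1$ is in $M'$, and so Lemma 11.3 (3) implies that $h \in M'$. Again using Lemma 11.3 (5) this shows $g_n \in M'$, and, with Lemma 11.3 (4) in place of Lemma 11.3 (5) for the sequence $\{g_n \wedge f\}_{n\ge1}$, it then follows exactly as before that $g \in \mathrm{M}(\mathcal{E})$.

Note that if $E \subset X$ with $I_E \in M'$ then by Lemma 11.3 (4) $E \in \mathcal{E}$. Now define a mapping $\mu : \mathcal{E} \to \mathbb{R}^+_\infty$ by letting

\[ \mu(E) = \begin{cases} \Psi(I_E) & \text{if } I_E \in M' , \\ \infty & \text{otherwise} . \end{cases} \]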

Lemma 11.5 $\mu$ is a measure on $\mathcal{E}$.

Proof Note that if $E, F \in \mathcal{E}$ with $E \subset F$ and $I_F \in M'$ then by the definition of $\mathcal{E}$ it follows that $I_E \in M'$, since $I_E = I_E \wedge I_F$.

We will show $\mu$ is countably additive. Thus let $\{E_n\}_{n\ge1}$ be a disjoint sequence from $\mathcal{E}$ and put $E = \bigcup_{n\ge1} E_n$. If $I_{E_n} \notin M'$ for some $n$ then by the above remark $I_E \notin M'$ (since $E_n \subset E$) and in this case $\mu(E) = \infty = \sum_{n\ge1} \mu(E_n)$. We can thus assume that $I_{E_n} \in M'$ for all $n \ge 1$. Put $g_n = \sum_{m=1}^{n} I_{E_m}$; then $\{g_n\}_{n\ge1}$ is an increasing sequence from $M'$ with $\lim_n g_n = I_E$. Thus by Lemma 11.3 (1) $I_E \in M$ and by Lemma 11.3 (2)

\[ \Psi(I_E) = \lim_{n\to\infty} \Psi(g_n) = \lim_{n\to\infty} \Psi\Bigl(\sum_{m=1}^{n} I_{E_m}\Bigr) = \lim_{n\to\infty} \sum_{m=1}^{n} \Psi(I_{E_m}) = \lim_{n\to\infty} \sum_{m=1}^{n} \mu(E_m) = \sum_{n\ge1} \mu(E_n) . \]

But $\Psi(I_E) = \mu(E)$: Either $I_E \in M'$, in which case $\mu(E) = \Psi(I_E)$, or $I_E \in M \setminus M'$ and here $\mu(E) = \infty = \Psi(I_E)$. This shows $\mu$ is countably additive, and therefore it is a measure, since clearly $\mu(\emptyset) = \Psi(0) = 0$.

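Lemma 11.6 Let $g \in \mathrm{M}_E(\mathcal{E})$ and $f \in \mathrm{M}(\mathcal{E})$ with $g \le f$, and where either $f \in M'$ or $\mu(f) < \infty$. Then $g \in M'$ and $\Psi(g) = \mu(g)$.

Proof The mapping $g$ has the form $\sum_{a \in A} aI_{E_a}$ with $A$ a finite subset of $\mathbb{R}^+ \setminus \{0\}$ and $E_a$, $a \in A$, disjoint elements from $\mathcal{E}$. Suppose that $I_{E_a} \in M'$ for each $a \in A$; then $g \in M'$, since $M'$ is a subspace of $\mathrm{M}(X)$, and

\[ \Psi(g) = \sum_{a \in A} a\Psi(I_{E_a}) = \sum_{a \in A} a\mu(E_a) = \mu(g) . \]

But if $f \in M'$ then $a^{-1}f \in M'$ and $I_{E_a} \wedge (a^{-1}f) = I_{E_a}$ and hence $I_{E_a} \in M'$, since $E_a \in \mathcal{E}$. On the other hand, if $\mu(f) < \infty$ then $\mu(g) < \infty$ and so $\mu(E_a) < \infty$, and then again $I_{E_a} \in M'$ for each $a \in A$.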


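Lemma 11.7 $M' = \{f \in \mathrm{M}(\mathcal{E}) : \mu(f) < \infty\}$ and $\Psi(f) = \mu(f)$ holds for all $f \in M'$.

Proof Let $f \in M'$; then by Lemma 11.4 $f \in \mathrm{M}(\mathcal{E})$ and hence by Lemma 9.3 there exists an increasing sequence $\{f_n\}_{n\ge1}$ from $\mathrm{M}_E(\mathcal{E})$ with $\lim_n f_n = f$. Thus by Lemma 11.6 $f_n \in M'$ with $\Psi(f_n) = \mu(f_n)$ for each $n \ge 1$ and therefore

\[ \Psi(f) = \lim_{n\to\infty} \Psi(f_n) = \lim_{n\to\infty} \mu(f_n) = \mu(f) , \]

since $\Psi$ and $\mu$ are both continuous. This shows that $\Psi(f) = \mu(f)$ for all $f \in M'$ and so in particular $M' \subset \{f \in \mathrm{M}(\mathcal{E}) : \mu(f) < \infty\}$.

It remains to show that if $f \in \mathrm{M}(\mathcal{E})$ with $\mu(f) < \infty$ then $f \in M'$. But again by Lemma 9.3 there is an increasing sequence $\{f_n\}_{n\ge1}$ from $\mathrm{M}_E(\mathcal{E})$ with $\lim_n f_n = f$ and by Lemma 11.6 $f_n \in M'$ with $\Psi(f_n) = \mu(f_n)$ for each $n \ge 1$. Since $M$ is complete and $\Psi$ and $\mu$ are continuous, $f \in M$ and $\Psi(f) = \lim_n \Psi(f_n) = \lim_n \mu(f_n) = \mu(f) < \infty$, i.e., $f \in M'$.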

Lemma 11.8 Φ(f) = µ(f) for all f ∈ N .

Proof By Lemmas 11.3 (1) and 11.4 N ⊂ M ∩ M(E) and by Lemma 11.3 (2) Φ(f) = Ψ(f) for all f ∈ N. Let f ∈ N; if Φ(f) < ∞ then f ∈ M′ and thus by Lemma 11.7 Φ(f) = Ψ(f) = µ(f). On the other hand, if Φ(f) = ∞ then f ∉ M′ and thus again by Lemma 11.7 µ(f) = ∞, i.e., Φ(f) = ∞ = µ(f). This shows that Φ(f) = µ(f) for all f ∈ N.

The proof of Proposition 11.1 is now complete.

12 The Radon-Nikodym theorem

In this chapter we look at the Radon-Nikodym theorem, which is one of the most useful results in measure theory. In the following let (X, E) be a measurable space.

Let µ and ν be measures on E. Then ν is said to be absolutely continuous with respect to µ if for each ε > 0 there exists δ > 0 such that ν(E) < ε for all E ∈ E with µ(E) < δ. Moreover, ν is said to be weakly absolutely continuous with respect to µ, and we then write ν ≪ µ, if ν(E) = 0 for all E ∈ E with µ(E) = 0. It follows from Proposition 10.6 (1) that ν ≪ µ if and only if ν(f) = 0 for all f ∈ M(E) with µ(f) = 0.

If ν is absolutely continuous with respect to µ then clearly also ν ≪ µ. For finite measures the converse holds:

Lemma 12.1 Let µ and ν be finite measures on E with ν ≪ µ. Then ν is absolutely continuous with respect to µ.

Proof Suppose that ν is not absolutely continuous with respect to µ. Then there exists an ε > 0 and for each n ≥ 1 an element $E_n \in \mathcal{E}$ such that $\mu(E_n) < 2^{-n}$ but with $\nu(E_n) \ge \varepsilon$. Put $F_n = \bigcup_{m\ge n} E_m$; then $\{F_n\}_{n\ge1}$ is a decreasing sequence from E with $\nu(F_n) \ge \nu(E_n) \ge \varepsilon$ for each n ≥ 1 (since $E_n \subset F_n$) and by Lemma 3.3 $\mu(F_n) \le \sum_{m\ge n} \mu(E_m) \le \sum_{m\ge n} 2^{-m} = 2^{-n+1}$. Let $F = \bigcap_{n\ge1} F_n$; then since µ and ν are finite, it follows from Lemma 3.2 that $\nu(F) = \lim_n \nu(F_n) \ge \varepsilon$ and $\mu(F) = \lim_n \mu(F_n) = 0$. But this is not possible because ν ≪ µ, and hence ν must be absolutely continuous with respect to µ.

Let µ be a measure on E and let h ∈ M(E); define $\mu{\cdot}h : M(\mathcal{E}) \to \mathbb{R}^+_\infty$ by µ·h (f) = µ(hf) for all f ∈ M(E). Then µ·h is clearly linear and continuous and thus by Proposition 10.4 it is a measure. The measure µ·h is finite if and only if µ(h) < ∞, since µ·h (1) = µ(h). Note the rather extreme case with ∞ the constant mapping with value ∞: µ·∞ is the measure with

\[ \mu{\cdot}\infty\,(E) = \begin{cases} 0 & \text{if } \mu(E) = 0 , \\ \infty & \text{otherwise.} \end{cases} \]

The simplest case of this construction is with $h = I_F$ for some F ∈ E, which results in the measure $\mu{\cdot}I_F$ with $\mu{\cdot}I_F\,(E) = \mu(F \cap E)$ for all E ∈ E.
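The construction µ·h can be illustrated on a finite set. The following is only a sketch under the assumption that X is finite and E is its power set, so that a measure is determined by its point masses; the names `integrate`, `with_density` and `measure_of` and the sample data are hypothetical and not part of these notes.

```python
from fractions import Fraction

def integrate(mu, f):
    """Return mu(f) = sum of f(x) * mu({x}) over the points x of a finite set."""
    return sum(f(x) * weight for x, weight in mu.items())

def with_density(mu, h):
    """Return the measure mu.h, whose point masses are h(x) * mu({x})."""
    return {x: h(x) * weight for x, weight in mu.items()}

def measure_of(mu, subset):
    """Return mu(E) for a subset E of the underlying finite set."""
    return sum(weight for x, weight in mu.items() if x in subset)

# Hypothetical example data: a probability measure on X = {"a", "b", "c"}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
F = {"a", "b"}
E = {"b", "c"}

def indicator_F(x):
    return 1 if x in F else 0

# mu.I_F should satisfy (mu.I_F)(E) = mu(F ∩ E).
mu_F = with_density(mu, indicator_F)
print(measure_of(mu_F, E), measure_of(mu, F & E))
```

With exact rational arithmetic the two printed values agree, which is the identity µ·I_F (E) = µ(F ∩ E) in this special case.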

Proposition 12.1 Let µ be a measure on E and let h ∈ M(E) with µ(h) < ∞. Then the measure µ·h is absolutely continuous with respect to µ.

Proof This is just Lemma 10.6.



Proposition 12.2 Let µ be a measure on E. Then µ·h ≪ µ for all h ∈ M(E).

Proof This is just Proposition 10.6 (2).

The converse of Proposition 12.2 holds for σ-finite measures; this result is the Radon-Nikodym theorem:

Theorem 12.1 Let µ, ν be σ-finite measures on E with ν ≪ µ. Then there exists h ∈ M(E) such that ν = µ·h. Moreover, if ν = µ·h′ also holds with h′ ∈ M(E) then h′ = h µ-a.e.

Proof The uniqueness is clear: If ν = µ·h′ = µ·h with h, h′ ∈ M(E) then $\mu(h'I_E) = \nu(E) = \mu(hI_E)$ for all E ∈ E, and thus Lemma 10.4 (2) implies that h′ = h µ-a.e.

To show that h exists we first reduce things to the case where the measures µ and ν are both finite. If $v \in M^+_{\mathrm{F}}(\mathcal{E})$ then the mapping $v^{-1} \in M(X)$ with $v^{-1}(x) = 1/v(x)$ for each x ∈ X is also an element of $M^+_{\mathrm{F}}(\mathcal{E})$, since

\[ \{x \in X : v^{-1}(x) > a\} = \{x \in X : v(x) < 1/a\} \in \mathcal{E} \]

for all a ∈ R+ \ {0}.

Lemma 12.2 Let ω be a measure on E and let g, h ∈ M(E) be mappings. Then (ω·g)·h = ω·(gh). In particular, if $v \in M^+_{\mathrm{F}}(\mathcal{E})$ then $(\omega{\cdot}v){\cdot}v^{-1} = \omega$.

Proof If f ∈ M(E) then ((ω·g)·h)(f) = (ω·g)(hf) = ω(ghf) = (ω·(gh))(f), and therefore (ω·g)·h = ω·(gh). The final statement follows because $vv^{-1} = 1$ and ω·1 = ω.

Let ω be a measure on E and $v \in M^+_{\mathrm{F}}(\mathcal{E})$. By Proposition 12.2 and Lemma 12.2 it follows that ω·v ≪ ω and ω ≪ ω·v and so ω and ω·v have the same sets of measure zero.

Now µ and ν are σ-finite and thus by Lemma 10.3 there exist $u, v \in M^+_{\mathrm{F}}(\mathcal{E})$ such that µ(u) < ∞ and ν(v) < ∞. This just means that the measures µ·u and ν·v are finite. Moreover, ν·v ≪ µ·u, since ν·v ≪ ν, ν ≪ µ and µ ≪ µ·u. Suppose there exists h ∈ M(E) such that ν·v = (µ·u)·h. Then by Lemma 12.2

\[ \nu = (\nu{\cdot}v){\cdot}v^{-1} = ((\mu{\cdot}u){\cdot}h){\cdot}v^{-1} = \mu{\cdot}(uhv^{-1}) . \]

Thus it is enough to consider the case of finite measures. We need the following fact (adapted from one of the standard proofs of the Hahn-Jordan decomposition for finite signed measures):

Lemma 12.3 Let µ and ν be finite measures on E with ν(X) > µ(X). Then there exists C ∈ E with ν(C) > 0 such that ν(E ∩ C) ≥ µ(E ∩ C) for all E ∈ E.

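Proof Let $\delta_1 = \sup\{\mu(E) - \nu(E) : E \in \mathcal{E}\}$ (thus $\delta_1 \ge 0$, since µ(∅) − ν(∅) = 0) and choose $E_1 \in \mathcal{E}$ with $\mu(E_1) - \nu(E_1) \ge \delta_1/2$. Then by induction on n we can define a sequence $\{\delta_n\}_{n\ge1}$ from R+ and a sequence $\{E_n\}_{n\ge1}$ from E so that

\[ \delta_n = \sup\bigl\{ \mu(E) - \nu(E) : E \in \mathcal{E} \text{ with } E \cap E_k = \varnothing \text{ for } k = 1, \dots, n-1 \bigr\} \]

and $E_n \in \mathcal{E}$ a subset of $X \setminus \bigcup_{k=1}^{n-1} E_k$ with $\mu(E_n) - \nu(E_n) \ge \delta_n/2$ for each n > 1. In particular $\{E_n\}_{n\ge1}$ is a disjoint sequence from E. Put $E_\infty = \bigcup_{n\ge1} E_n$ and $C = X \setminus E_\infty$. Then $\mu(E_\infty) - \nu(E_\infty) = \sum_{n\ge1} (\mu(E_n) - \nu(E_n)) \ge \sum_{n\ge1} \delta_n/2$, and from this it follows that both $\mu(E_\infty) - \nu(E_\infty) \ge 0$ and $\sum_{n\ge1} \delta_n < \infty$. Therefore $\nu(C) - \mu(C) = \nu(X) - \mu(X) - \nu(E_\infty) + \mu(E_\infty) \ge \nu(X) - \mu(X)$ and so in particular ν(C) > 0, and also $\lim_n \delta_n = 0$. Let E ∈ E; then

\[ E \cap C = E \cap (X \setminus E_\infty) \subset X \setminus \bigcup_{k=1}^{n-1} E_k , \]

and so $\mu(E \cap C) - \nu(E \cap C) \le \delta_n$ for each n ≥ 1, i.e., ν(E ∩ C) ≥ µ(E ∩ C).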

Now to the proof of Theorem 12.1 for finite measures, so let µ, ν be finite measures on E with ν ≪ µ. Consider the set

\[ U = \bigl\{ g \in M(\mathcal{E}) : \mu(gI_E) \le \nu(E) \text{ for all } E \in \mathcal{E} \bigr\} ; \]

then U ≠ ∅, since 0 ∈ U, and by Proposition 9.2 it easily follows that

\[ U = \bigl\{ g \in M(\mathcal{E}) : \mu(gf) \le \nu(f) \text{ for all } f \in M(\mathcal{E}) \bigr\} . \]

Let g, h ∈ U and let A = {x ∈ X : g(x) ≥ h(x)}; then for each f ∈ M(E)

\[ \mu((g \vee h)f) = \mu(gI_A f) + \mu(hI_{X\setminus A} f) \le \nu(I_A f) + \nu(I_{X\setminus A} f) = \nu(f) \]

and this shows that g ∨ h ∈ U for all g, h ∈ U. Moreover, U is a complete subset of M(E): Let $\{g_n\}_{n\ge1}$ be an increasing sequence from U and let $g = \lim_n g_n$. Then $\{g_n f\}_{n\ge1}$ is an increasing sequence from M(E) with $\lim_n g_n f = gf$ and thus $\mu(gf) = \lim_n \mu(g_n f) \le \nu(f)$, since $\mu(g_n f) \le \nu(f)$ for all n ≥ 1.

Let α = sup{µ(g) : g ∈ U}; then α ≤ ν(X) < ∞, since µ(g) ≤ ν(X) for all g ∈ U. For each n ≥ 1 choose $g_n \in U$ with $\mu(g_n) \ge \alpha - 1/n$ and let $h_n = g_1 \vee \dots \vee g_n$. Then $\{h_n\}_{n\ge1}$ is an increasing sequence from U and so $h = \lim_n h_n$ is also an element of U. But $\alpha - 1/n \le \mu(g_n) \le \mu(h_n) \le \mu(h) \le \alpha$ for each n ≥ 1 and therefore µ(h) = α.


We now show that $\nu(E) = \mu(hI_E)$ for all E ∈ E, thus suppose that this is not the case. Then there exists B ∈ E with $\mu(hI_B) < \nu(B)$ and so there also exists δ > 0 such that $\nu(B) > \mu(hI_B) + \delta\mu(B)$. Consider the measures µ′, ν′ on E given by $\mu'(E) = \mu((h + \delta)I_B I_E)$ and ν′(E) = ν(B ∩ E) for all E ∈ E. Then

\[ \mu'(X) = \mu((h + \delta)I_B) = \mu(hI_B) + \delta\mu(B) < \nu(B) = \nu'(X) . \]

Hence µ′ and ν′ are both finite and ν′(X) > µ′(X). Thus by Lemma 12.3 there exists C ∈ E with ν′(C) > 0 such that ν′(E ∩ C) ≥ µ′(E ∩ C) for all E ∈ E. Let $h' = h + \delta I_{B\cap C}$; then

\begin{align*}
\mu(h'f) + \mu(hI_{B\cap C}f) &= \mu(hf) + \delta\mu(I_{B\cap C}f) + \mu(hI_{B\cap C}f) \\
&= \mu(hf) + \mu((h + \delta)I_{B\cap C}f) \le \mu(hf) + \nu(I_{B\cap C}f) \\
&= \mu(hI_{X\setminus(B\cap C)}f) + \nu(I_{B\cap C}f) + \mu(hI_{B\cap C}f) \\
&\le \nu(I_{X\setminus(B\cap C)}f) + \nu(I_{B\cap C}f) + \mu(hI_{B\cap C}f) \\
&= \nu(f) + \mu(hI_{B\cap C}f)
\end{align*}

for all f ∈ M(E) and so in particular $\mu(h'I_E) \le \nu(E)$ for all E ∈ E, since $\mu(hI_{B\cap C}I_E) \le \mu(h) = \alpha < \infty$. Hence h′ ∈ U. But ν(B ∩ C) = ν′(C) > 0 and ν ≪ µ and so µ(B ∩ C) > 0, which means that

\[ \mu(h') = \mu(h) + \delta\mu(B \cap C) = \alpha + \delta\mu(B \cap C) > \alpha . \]

This is a contradiction and therefore $\nu(E) = \mu(hI_E)$ for all E ∈ E, i.e., ν = µ·h. This completes the proof of Theorem 12.1.

The element h ∈ M(E) in Theorem 12.1 is called a (or the) Radon-Nikodym density (or just the density) of ν with respect to µ.
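On a finite set the density can be written down directly, which may help to fix ideas. The sketch below assumes X is finite with E its power set and represents measures by their point masses; the function name `density` and the sample measures are hypothetical. It does not follow the proof above, which works through the set U, but only checks the conclusion ν = µ·h in this simple setting.

```python
from fractions import Fraction

def density(nu, mu):
    """Return h with nu({x}) = h(x) * mu({x}) for all x, assuming nu << mu."""
    h = {}
    for x, mass in mu.items():
        if mass == 0:
            if nu[x] != 0:
                raise ValueError("nu is not weakly absolutely continuous w.r.t. mu")
            h[x] = Fraction(0)  # any value would do on a mu-null point
        else:
            h[x] = nu[x] / mass
    return h

# Hypothetical example data on X = {"a", "b", "c"}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
nu = {"a": Fraction(3), "b": Fraction(0), "c": Fraction(0)}

h = density(nu, mu)
print(h)
```

The freedom in choosing h on the µ-null point "c" is the finite counterpart of the statement that the density is only unique µ-a.e.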

Measures $\tau_1$ and $\tau_2$ on E are said to be mutually singular, and we then write $\tau_1 \perp \tau_2$, if there exist $E_1, E_2 \in \mathcal{E}$ with $X = E_1 \cup E_2$ and $E_1 \cap E_2 = \varnothing$ such that $\tau_1(E_1) = \tau_2(E_2) = 0$.

Theorem 12.2 Let µ, ν be σ-finite measures. Then there exist σ-finite measures $\nu_1, \nu_2$ such that $\nu = \nu_1 + \nu_2$, $\nu_1 \ll \mu$ and $\nu_2 \perp \mu$. Moreover, this decomposition is unique: If $\nu_1', \nu_2'$ are measures with $\nu = \nu_1' + \nu_2'$, $\nu_1' \ll \mu$ and $\nu_2' \perp \mu$ then $\nu_1' = \nu_1$ and $\nu_2' = \nu_2$.

Proof Let τ = µ + ν; then τ is σ-finite and µ ≪ τ, thus by Theorem 12.1 there exists h ∈ M(E) such that µ = τ·h. Put $D_1 = \{x \in X : h(x) > 0\}$ and $D_2 = \{x \in X : h(x) = 0\}$ and define measures $\nu_1, \nu_2$ on E by letting $\nu_1(E) = \nu(E \cap D_1)$ and $\nu_2(E) = \nu(E \cap D_2)$ for all E ∈ E. Therefore $\nu_1$ and $\nu_2$ are σ-finite and $\nu = \nu_1 + \nu_2$. Now $X = D_1 \cup D_2$ and $D_1 \cap D_2 = \varnothing$, and $\nu_2(D_1) = \nu(D_1 \cap D_2) = \nu(\varnothing) = 0$ and $\mu(D_2) = \tau(hI_{D_2}) = 0$, and so $\nu_2 \perp \mu$.


Let E ∈ E with µ(E) = 0. Then $\tau(hI_E) = 0$ and thus by Proposition 10.6 (1) $\tau(E \cap D_1) = 0$, since $E \cap D_1 = \{x \in X : h(x)I_E(x) > 0\}$. It therefore follows that $\nu_1(E) = \nu(E \cap D_1) \le \tau(E \cap D_1) = 0$, and this shows that $\nu_1 \ll \mu$.

It remains to establish the uniqueness. Thus let $\nu_1', \nu_2'$ be measures on E with $\nu = \nu_1' + \nu_2'$, $\nu_1' \ll \mu$ and $\nu_2' \perp \mu$. In particular, $\nu_1'$ and $\nu_2'$ are σ-finite (since $\nu = \nu_1' + \nu_2'$ and ν is σ-finite). Now since $\nu_2 \perp \mu$ and $\nu_2' \perp \mu$, there exist $E_1, E_2, E_1', E_2' \in \mathcal{E}$ with $X = E_1 \cup E_2 = E_1' \cup E_2'$ and $E_1 \cap E_2 = E_1' \cap E_2' = \varnothing$ with $\mu(E_1) = \mu(E_1') = \nu_2(E_2) = \nu_2'(E_2') = 0$. Put $F_1 = E_1 \cup E_1'$ and $F_2 = E_2 \cap E_2'$; then $X = F_1 \cup F_2$, $F_1 \cap F_2 = \varnothing$ and $\mu(F_1) = \nu_2(F_2) = \nu_2'(F_2) = 0$. But $\nu_1 \ll \mu$ and $\nu_1' \ll \mu$ and so by Theorem 12.1 there exist h, h′ ∈ M(E) with $\nu_1 = \mu{\cdot}h$ and $\nu_1' = \mu{\cdot}h'$. It follows from Proposition 10.6 (3) (since $\mu(X \setminus F_2) = \mu(F_1) = 0$) that

\begin{align*}
\nu_1(E) &= \mu(hI_E) = \mu(hI_{E\cap F_2}) = \nu_1(E \cap F_2) = \nu_1(E \cap F_2) + \nu_2(E \cap F_2) \\
&= \nu(E \cap F_2) \\
&= \nu_1'(E \cap F_2) + \nu_2'(E \cap F_2) = \nu_1'(E \cap F_2) = \mu(h'I_{E\cap F_2}) = \mu(h'I_E) \\
&= \nu_1'(E)
\end{align*}

for all E ∈ E, i.e., $\nu_1' = \nu_1$. Hence also $\nu_2' = \nu_2$: If E ∈ E with ν(E) < ∞ then

\[ \nu_2(E) = \nu(E) - \nu_1(E) = \nu(E) - \nu_1'(E) = \nu_2'(E) \]

and thus $\nu_2(E) = \nu_2'(E)$ for all E ∈ E since ν is σ-finite.

The representation of ν as the sum of $\nu_1$ and $\nu_2$ in Theorem 12.2 is called the Lebesgue decomposition of ν with respect to µ. The proof of the theorem shows that this decomposition has a very simple form: There exist $D_1, D_2 \in \mathcal{E}$ with $X = D_1 \cup D_2$ and $D_1 \cap D_2 = \varnothing$ such that $\nu_1(E) = \nu(E \cap D_1)$ and $\nu_2(E) = \nu(E \cap D_2)$ for all E ∈ E.
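The simple form of the decomposition is easy to see on a finite set. The sketch below assumes X is finite with E its power set; there the density h of µ with respect to τ = µ + ν can be taken to be positive exactly on the points with positive µ-mass, so that the set D1 is {x : µ({x}) > 0}. The function name and the sample measures are hypothetical.

```python
from fractions import Fraction

def lebesgue_decomposition(nu, mu):
    """Split nu into (nu1, nu2) with nu1 << mu and nu2 singular to mu."""
    # D1 = {h > 0}, where h is a density of mu w.r.t. mu + nu.
    d1 = {x for x in mu if mu[x] > 0}
    nu1 = {x: (nu[x] if x in d1 else Fraction(0)) for x in nu}
    nu2 = {x: (Fraction(0) if x in d1 else nu[x]) for x in nu}
    return nu1, nu2

# Hypothetical example data on X = {"a", "b", "c"}.
mu = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(0)}
nu = {"a": Fraction(5), "b": Fraction(0), "c": Fraction(7)}

nu1, nu2 = lebesgue_decomposition(nu, mu)
print(nu1, nu2)
```

Here ν1 keeps the mass of ν at "a", while ν2 carries the mass at "c", a point that µ does not charge.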

In what follows let E′ be a sub-σ-algebra of E, i.e., a σ-algebra E′ with E′ ⊂ E. Then we can consider M(E′) as a subset of M(E), and it is then in fact a closed subspace of M(E).

Let $\mu : M(\mathcal{E}) \to \mathbb{R}^+_\infty$ be a measure and let $\mu' : M(\mathcal{E}') \to \mathbb{R}^+_\infty$ be the restriction of µ to M(E′). Then µ′ is clearly linear and continuous and so by Proposition 10.4 it is a measure on E′. As a mapping from E′ to $\mathbb{R}^+_\infty$ it is of course just the restriction to E′ of the mapping $\mu : \mathcal{E} \to \mathbb{R}^+_\infty$.

Theorem 12.3 Let µ be a finite measure on E and let f ∈ M(E) be a mapping with µ(f) < ∞. Then there exists g ∈ M(E′) such that µ(gh) = µ(fh) for all h ∈ M(E′) (and so µ(g) = µ(f) < ∞). If g′ ∈ M(E′) also satisfies µ(g′h) = µ(fh) for all h ∈ M(E′) then g′ = g µ-a.e.


Proof Let $\nu : M(\mathcal{E}') \to \mathbb{R}^+_\infty$ be the restriction to M(E′) of the finite measure $\mu{\cdot}f : M(\mathcal{E}) \to \mathbb{R}^+_\infty$. Then ν is a finite measure on (X, E′) and ν ≪ µ′, where µ′ is again the restriction of µ to M(E′), since if E ∈ E′ with µ′(E) = 0 then µ(E) = µ′(E) = 0 and so by Proposition 12.1 ν(E) = µ·f (E) = 0. Thus by Theorem 12.1 there exists g ∈ M(E′) such that ν = µ′·g and then

\[ \mu(gh) = \mu'(gh) = \mu'{\cdot}g\,(h) = \nu(h) = \mu{\cdot}f\,(h) = \mu(fh) \]

for all h ∈ M(E′). Finally, if g′ ∈ M(E′) also satisfies µ(g′h) = µ(fh) for all h ∈ M(E′) then by Lemma 10.4 (2) g′ = g µ-a.e.

Here is a version of Theorem 12.3 for bounded mappings:

Theorem 12.4 Let µ be a finite measure on E and $f \in M_{\mathrm{B}}(\mathcal{E})$ with f ≤ a. Then there exists $g \in M_{\mathrm{B}}(\mathcal{E}')$ with g ≤ a such that µ(gh) = µ(fh) for all h ∈ M(E′).

Proof By Theorem 12.3 there exists g ∈ M(E′) such that µ(gh) = µ(fh) for all h ∈ M(E′). For n ≥ 1 let $B_n = \{x \in X : g(x) > a + 1/n\}$; then

\[ (a + 1/n)\,\mu(B_n) \le \mu(gI_{B_n}) = \mu(fI_{B_n}) \le a\,\mu(B_n) \]

and thus $\mu(B_n) = 0$. Put $B = \bigcup_{n\ge1} B_n$ and $g' = gI_{X\setminus B}$; then g′ ∈ M(E′) with g′ ≤ a and µ(B) = 0. Therefore by Proposition 10.6 (3) µ(g′h) = µ(gh) = µ(fh) for all h ∈ M(E′).

If µ is a probability measure on E then the mapping g ∈ M(E′) in Theorem 12.3 is called a version of the conditional expectation of f with respect to E′ and it is usually denoted by $E_\mu(f \mid \mathcal{E}')$.
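For a finite set the conditional expectation is just a block average. The sketch below assumes X is finite, E is its power set, and the sub-σ-algebra E′ is generated by a partition of X into blocks; these assumptions, the function name and the sample data are hypothetical and serve only as an illustration of the defining property µ(gh) = µ(fh).

```python
from fractions import Fraction

def conditional_expectation(mu, f, blocks):
    """Return g, constant on each block, with mu(g I_B) = mu(f I_B) for every block B."""
    g = {}
    for block in blocks:
        mass = sum(mu[x] for x in block)
        # On a mu-null block any constant works; 0 is chosen here.
        mean = sum(f[x] * mu[x] for x in block) / mass if mass != 0 else Fraction(0)
        for x in block:
            g[x] = mean
    return g

# Hypothetical example data: uniform probability measure on X = {0, 1, 2, 3}.
mu = {x: Fraction(1, 4) for x in range(4)}
f = {0: Fraction(1), 1: Fraction(3), 2: Fraction(0), 3: Fraction(8)}
blocks = [{0, 1}, {2, 3}]

g = conditional_expectation(mu, f, blocks)
print(g)
```

Since every h ∈ M(E′) is here a non-negative combination of block indicators, checking the identity on the blocks is enough.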

The next result shows that the operation of taking the conditional expectation behaves in some sense like an orthogonal projection.

Lemma 12.4 Let µ be a probability measure on E and let $\mathcal{E}_1, \mathcal{E}_2$ be sub-σ-algebras of E with $\mathcal{E}_2 \subset \mathcal{E}_1$. Let $f \in M_{\mathrm{B}}(\mathcal{E})$ and for k = 1, 2 let $f_k \in M_{\mathrm{B}}(\mathcal{E}_k)$ be a version of the conditional expectation of f with respect to $\mathcal{E}_k$. Then

\[ \mu((f - f_1)^2) + \mu((f_1 - f_2)^2) = \mu((f - f_2)^2) . \]

In particular, $\mu((f - f_1)^2) \le \mu((f - f_2)^2)$.

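Proof Since $f_2 \in M_{\mathrm{B}}(\mathcal{E}_2) \subset M_{\mathrm{B}}(\mathcal{E}_1)$ it follows that $\mu(f_2 f) = \mu(f_2 f_1)$, and also $\mu(f_1^2) = \mu(f_1 f_1) = \mu(f_1 f)$, since $f_1 \in M_{\mathrm{B}}(\mathcal{E}_1)$. Therefore

\begin{align*}
\mu((f - f_1)^2) &+ \mu((f_1 - f_2)^2) + 2\mu(f_1^2) + 2\mu(ff_2) \\
&= \mu((f - f_1)^2) + \mu((f_1 - f_2)^2) + 2\mu(f_1 f) + 2\mu(f_1 f_2) \\
&= \mu((f - f_1)^2 + (f_1 - f_2)^2 + 2f_1 f + 2f_1 f_2) \\
&= \mu(f^2 + 2f_1^2 + f_2^2) = \mu(f^2) + 2\mu(f_1^2) + \mu(f_2^2) \\
&= \mu((f - f_2)^2) + 2\mu(f_1^2) + 2\mu(ff_2) ,
\end{align*}

i.e., $\mu((f - f_1)^2) + \mu((f_1 - f_2)^2) = \mu((f - f_2)^2)$.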
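The identity of Lemma 12.4 can be checked numerically in a small case. The sketch below assumes a finite set X with the uniform probability measure and two nested partitions generating the sub-σ-algebras (the finer one for the larger σ-algebra); all names and data are hypothetical.

```python
from fractions import Fraction

def conditional_expectation(mu, f, blocks):
    """Block averages of f with respect to mu (0 on mu-null blocks)."""
    g = {}
    for block in blocks:
        mass = sum(mu[x] for x in block)
        mean = sum(f[x] * mu[x] for x in block) / mass if mass != 0 else Fraction(0)
        for x in block:
            g[x] = mean
    return g

def mean_square_distance(mu, u, v):
    """Return mu((u - v)^2)."""
    return sum((u[x] - v[x]) ** 2 * mu[x] for x in mu)

# Hypothetical example data on X = {0, 1, 2, 3}.
mu = {x: Fraction(1, 4) for x in range(4)}
f = {0: Fraction(1), 1: Fraction(3), 2: Fraction(0), 3: Fraction(8)}
fine = [{0}, {1}, {2, 3}]    # generates the larger sub-sigma-algebra
coarse = [{0, 1}, {2, 3}]    # generates the smaller one

f1 = conditional_expectation(mu, f, fine)
f2 = conditional_expectation(mu, f, coarse)
lhs = mean_square_distance(mu, f, f1) + mean_square_distance(mu, f1, f2)
rhs = mean_square_distance(mu, f, f2)
print(lhs, rhs)
```

Both sides evaluate to 17/2 for this data, in line with the Pythagoras-type identity.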

13 Image and pre-image measures

If (X, E) and (Y, F) are measurable spaces and f : X → Y is a measurable mapping then for each measure µ on E there is a measure $\mu \circ f^*$ defined on F which is called the image of µ under f. In this chapter we look at its properties. In particular we are interested in determining which measures on F can occur as image measures, i.e., which have the form $\mu \circ f^*$ for some measure µ on E.

Let X and Y be non-empty sets and let N be a subspace of M(Y). A mapping Φ : N → M(X) is linear if it is additive and positive homogeneous, i.e., if

\[ \Phi(af + bg) = a\Phi(f) + b\Phi(g) \]

for all f, g ∈ N and all a, b ∈ R+. Note then that Φ(0) = 0. The mapping Φ is said to be monotone if Φ(g) ≤ Φ(f) whenever f, g ∈ N with g ≤ f; in this case $\{\Phi(f_n)\}_{n\ge1}$ is an increasing sequence in M(X) for each increasing sequence $\{f_n\}_{n\ge1}$ from N. Finally, if N is a complete subspace of M(Y) then a linear mapping Φ : N → M(X) is called continuous if it is monotone and

\[ \Phi\bigl(\lim_{n\to\infty} f_n\bigr) = \lim_{n\to\infty} \Phi(f_n) \]

for each increasing sequence $\{f_n\}_{n\ge1}$ of elements from N. If N is a complemented subspace of M(Y) (such as M(F)) then any linear mapping Φ : N → M(X) is automatically monotone.

Lemma 13.1 Let N be a subspace of M(Y) and let Φ : N → M(X) be a linear mapping. Then:

(1) The image Φ(N) = {g ∈ M(X) : g = Φ(f) for some f ∈ N} of N under Φ is a subspace of M(X).

(2) If M is a subspace of M(X) then $\Phi^{-1}(M) = \{f \in N : \Phi(f) \in M\}$ is a subspace of N. Moreover, if N and M are both complete and Φ is continuous then $\Phi^{-1}(M)$ is also complete.

Proof (1) Let $g_1, g_2 \in \Phi(N)$ and $a_1, a_2 \in \mathbb{R}^+$; there thus exist $f_1, f_2 \in N$ with $g_1 = \Phi(f_1)$ and $g_2 = \Phi(f_2)$. Then $a_1 f_1 + a_2 f_2 \in N$, since N is a subspace of M(Y), and $\Phi(a_1 f_1 + a_2 f_2) = a_1\Phi(f_1) + a_2\Phi(f_2) = a_1 g_1 + a_2 g_2$, since Φ is linear. Thus Φ(N) is a subspace of M(X), since 0 = Φ(0) ∈ Φ(N).

(2) Let $f_1, f_2 \in \Phi^{-1}(M)$ and $a_1, a_2 \in \mathbb{R}^+$; then since Φ is linear and M is a subspace $\Phi(a_1 f_1 + a_2 f_2) = a_1\Phi(f_1) + a_2\Phi(f_2) \in M$, and so $a_1 f_1 + a_2 f_2 \in \Phi^{-1}(M)$. Thus $\Phi^{-1}(M)$ is a subspace of N, since Φ(0) = 0 ∈ M. Now suppose N and M are both complete and that Φ is continuous; let $\{f_n\}_{n\ge1}$ be an increasing sequence from $\Phi^{-1}(M)$ and put $f = \lim_n f_n$ (and so f ∈ N). Then $\Phi(f) = \lim_n \Phi(f_n)$, since Φ is continuous, and $\{\Phi(f_n)\}_{n\ge1}$ is an increasing sequence from M. Hence Φ(f) ∈ M, since M is complete, i.e., $f \in \Phi^{-1}(M)$. This implies that $\Phi^{-1}(M)$ is complete.

If M is a subspace of M(X) then the statement that Φ : N → M is a linear mapping means Φ : N → M(X) is a linear mapping such that Φ(N) ⊂ M. A similar interpretation is also to be made for monotone linear and continuous linear mappings.

In what follows let (X, E) and (Y, F) be measurable spaces and let f : X → Y be a measurable mapping from (X, E) to (Y, F) (i.e., with $f^{-1}(\mathcal{F}) \subset \mathcal{E}$). Then for each g ∈ M(F) the mapping g ∘ f is in M(E) and so we have a mapping $f^* : M(\mathcal{F}) \to M(\mathcal{E})$ given by $f^*(g) = g \circ f$ for each g ∈ M(F). The mapping $f^*$ is clearly linear and continuous.

Lemma 13.2 (1) The mapping $f^* : M(\mathcal{F}) \to M(\mathcal{E})$ is surjective if and only if $f^{-1}(\mathcal{F}) = \mathcal{E}$.

(2) The mapping $f^* : M(\mathcal{F}) \to M(\mathcal{E})$ is injective if and only if ∅ is the only element F ∈ F with F ∩ f(X) = ∅. In particular, $f^*$ is injective whenever f is surjective.

Proof (1) Let $N = \{f^*(g) : g \in M(\mathcal{F})\}$; then by Lemma 13.1 (1) N is a subspace of M(E). Let $\{h_n\}_{n\ge1}$ be an increasing sequence from N and for each n ≥ 1 let $g_n \in M(\mathcal{F})$ with $f^*(g_n) = h_n$. Let $g_n' = g_1 \vee \dots \vee g_n$; then

\[ f^*(g_n')(x) = g_n'(f(x)) = g_1(f(x)) \vee \dots \vee g_n(f(x)) = h_1(x) \vee \dots \vee h_n(x) = h_n(x) \]

for each x ∈ X, i.e., $f^*(g_n') = h_n$. Now $\{g_n'\}_{n\ge1}$ is an increasing sequence from M(F) and hence by Proposition 9.1 $g = \lim_n g_n' \in M(\mathcal{F})$. But

\[ f^*(g)(x) = g(f(x)) = \lim_{n\to\infty} g_n'(f(x)) = \lim_{n\to\infty} h_n(x) \]

for each x ∈ X, thus $\lim_n h_n = f^*(g) \in N$, which shows that N is complete. It therefore follows from Proposition 9.2 (1) that N = M(E) (i.e., that $f^*$ is surjective) if and only if $I_E \in N$ for all E ∈ E.

Suppose first that $f^{-1}(\mathcal{F}) = \mathcal{E}$ and let E ∈ E; there thus exists F ∈ F with $f^{-1}(F) = E$ and then $I_E = f^*(I_F) \in N$. This implies $f^*$ is surjective. Suppose conversely that $f^*$ is surjective and let E ∈ E; there thus exists g ∈ M(F) with $I_E = f^*(g)$. Let F = {y ∈ Y : g(y) = 1}; then F ∈ F and $I_E = f^*(I_F)$ and so $E = f^{-1}(F)$. This shows that $f^{-1}(\mathcal{F}) = \mathcal{E}$.

(2) Let $g_1, g_2 \in M(\mathcal{F})$ and let $D = \{y \in Y : g_1(y) \ne g_2(y)\}$; thus by Lemma 9.6 D ∈ F. Now $f^*(g_1) = f^*(g_2)$ if and only if $g_1(f(x)) = g_2(f(x))$ for all x ∈ X, i.e., if and only if D ∩ f(X) = ∅. Thus if ∅ is the only element F ∈ F with F ∩ f(X) = ∅ then $f^*$ is injective. Conversely, if there exists F ∈ F with F ≠ ∅ and F ∩ f(X) = ∅ then $I_F \ne 0$ but $f^*(I_F) = 0 = f^*(0)$ and so in this case $f^*$ is not injective.

A measurable space (Y, F) is said to be separable if {y} ∈ F for each y ∈ Y. If (Y, F) is separable then $f^* : M(\mathcal{F}) \to M(\mathcal{E})$ is injective if and only if f is surjective (since if f is not surjective then {y} ∩ f(X) = ∅ for some y ∈ Y).

Let $\mu : M(\mathcal{E}) \to \mathbb{R}^+_\infty$ be a measure. The mapping $\mu \circ f^* : M(\mathcal{F}) \to \mathbb{R}^+_\infty$ is then linear and continuous and is thus a measure on F which will be denoted by $f_*\mu$. As a mapping from F to $\mathbb{R}^+_\infty$ we have $(f_*\mu)(F) = \mu(f^{-1}(F))$ for each F ∈ F. The measure $f_*\mu$ is called the image of µ under f. Note that

\[ (f_*\mu)(g) = \mu(f^*g) = \mu(g \circ f) \]

for all g ∈ M(F).
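As a small illustration of the image measure, the following sketch assumes X and Y are finite sets with their power sets as σ-algebras and represents measures by point masses; the names `image_measure` and `integrate` and the sample data are hypothetical. It checks the change-of-variables identity for one particular g.

```python
from fractions import Fraction

def image_measure(mu, f):
    """Return the image of mu under f: its mass at y is mu(f^{-1}({y}))."""
    out = {}
    for x, weight in mu.items():
        out[f(x)] = out.get(f(x), 0) + weight
    return out

def integrate(mu, g):
    """Return mu(g) for a measure given by point masses."""
    return sum(g(x) * weight for x, weight in mu.items())

# Hypothetical example data: X = {0, 1, 2, 3}, Y = {0, 1}, f(x) = x mod 2.
mu = {0: Fraction(1, 8), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 2)}

def f(x):
    return x % 2

def g(y):
    return 10 * y + 1

nu = image_measure(mu, f)
print(integrate(nu, g), integrate(mu, lambda x: g(f(x))))
```

Both integrals come out as 17/2, as the displayed identity predicts.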

Let us now look at the converse of this concept. Let ν be a measure on F; then we call a measure µ on E a pre-image of ν under f if $\nu = f_*\mu$. When discussing the existence of pre-image measures it is sensible to assume that $f^{-1}(\mathcal{F}) = \mathcal{E}$. If this is not the case then E can always be replaced by $f^{-1}(\mathcal{F})$. However, even if the pre-image exists as a measure on $f^{-1}(\mathcal{F})$, this still leaves the problem of extending a measure from $f^{-1}(\mathcal{F})$ to E, and in this generality there are no non-trivial results about such extensions.

If $f^{-1}(\mathcal{F}) = \mathcal{E}$ and a pre-image measure exists then it is clearly unique, since if $f_*\mu_1 = f_*\mu_2$ then $\mu_1(f^{-1}(F)) = \mu_2(f^{-1}(F))$ for all F ∈ F.

Let ν be a measure on F; then a subset B ⊂ Y is said to be thick with respect to ν if ν(F) = 0 for all F ∈ F with F ∩ B = ∅. Of course, an element B ∈ F is thick with respect to ν if and only if ν(Y \ B) = 0.

Theorem 13.1 Suppose $f^{-1}(\mathcal{F}) = \mathcal{E}$ and let ν be a measure on F. Then the pre-image of ν under f exists if and only if f(X) is thick with respect to ν. In particular, if f(X) ∈ F then the condition is that ν(Y \ f(X)) = 0.

Proof The condition is clearly necessary, since if $\nu = f_*\mu$ and F ∈ F with F ∩ f(X) = ∅ then $f^{-1}(F) = \varnothing$ and hence $\nu(F) = \mu(f^{-1}(F)) = \mu(\varnothing) = 0$.

Suppose conversely that f(X) is thick with respect to ν. Let $g_1, g_2 \in M(\mathcal{F})$ with $g_1 \circ f = g_2 \circ f$, and let $D = \{y \in Y : g_1(y) \ne g_2(y)\}$; then by Lemma 9.6 D ∈ F, and D ∩ f(X) = ∅, and so ν(D) = 0. Hence by Proposition 10.6 (3)

\[ \nu(g_1) = \nu(g_1 I_{Y\setminus D}) = \nu(g_2 I_{Y\setminus D}) = \nu(g_2) . \]


There thus exists a mapping $\mu : M(\mathcal{E}) \to \mathbb{R}^+_\infty$ such that µ(g ∘ f) = ν(g) for all g ∈ M(F), i.e., such that $f_*\mu = \nu$ (and µ is unique since $f^*$ is surjective). It remains to show that µ is a measure. Let $h_1, h_2 \in M(\mathcal{E})$ and $a_1, a_2 \in \mathbb{R}^+$; since $f^*$ is surjective there exist $g_1, g_2 \in M(\mathcal{F})$ with $h_j = f^*(g_j) = g_j \circ f$ for j = 1, 2. Then $(a_1 g_1 + a_2 g_2) \circ f = a_1(g_1 \circ f) + a_2(g_2 \circ f) = a_1 h_1 + a_2 h_2$ and so

\[ \mu(a_1 h_1 + a_2 h_2) = \nu(a_1 g_1 + a_2 g_2) = a_1\nu(g_1) + a_2\nu(g_2) = a_1\mu(h_1) + a_2\mu(h_2) . \]

This shows µ is additive. Let $\{h_n\}_{n\ge1}$ be an increasing sequence from M(E) with $h = \lim_n h_n$ and for each n ≥ 1 let $g_n \in M(\mathcal{F})$ with $h_n = g_n \circ f$. Put $g_n' = g_1 \vee \dots \vee g_n$; then the sequence $\{g_n'\}_{n\ge1}$ is increasing and

\[ h_n = h_1 \vee \dots \vee h_n = (g_1 \circ f) \vee \dots \vee (g_n \circ f) = (g_1 \vee \dots \vee g_n) \circ f = g_n' \circ f \]

for each n ≥ 1. Let $g' = \lim_n g_n'$; then h = g′ ∘ f and thus

\[ \lim_{n\to\infty} \mu(h_n) = \lim_{n\to\infty} \mu(g_n' \circ f) = \lim_{n\to\infty} \nu(g_n') = \nu(g') = \mu(g' \circ f) = \mu(h) . \]

This shows µ is continuous and therefore it is a measure.

Theorem 13.1 is most often applied to the case in which f : X → Y is a surjective mapping with $f^{-1}(\mathcal{F}) = \mathcal{E}$; for each measure ν on F there then exists a unique measure µ on E with $\nu = f_*\mu$.
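The thickness condition can be seen at work in a finite example. The sketch below assumes Y is finite with F its power set and takes E to be the pre-image σ-algebra of F under f, whose atoms are the non-empty fibres of f; a measure on E is then stored by its values on these fibres. The function names and data are hypothetical.

```python
from fractions import Fraction

def preimage_measure(nu, f, X):
    """Return the pre-image of nu under f as a dict {fibre: mass}, if it exists."""
    image = {f(x) for x in X}
    # f(X) must be thick: nu may not charge any point outside the image.
    if any(weight != 0 for y, weight in nu.items() if y not in image):
        raise ValueError("f(X) is not thick with respect to nu")
    return {frozenset(x for x in X if f(x) == y): nu.get(y, 0) for y in image}

def measure_of_preimage(mu_atoms, f, F):
    """Return mu(f^{-1}(F)), summing over the fibres contained in f^{-1}(F)."""
    return sum(w for fibre, w in mu_atoms.items() if all(f(x) in F for x in fibre))

# Hypothetical example data: X = {0, 1, 2, 3}, Y = {"a", "b", "c"}, "c" not in f(X).
X = range(4)

def f(x):
    return "a" if x < 2 else "b"

nu = {"a": Fraction(1, 3), "b": Fraction(2, 3), "c": Fraction(0)}
mu_atoms = preimage_measure(nu, f, X)
print(measure_of_preimage(mu_atoms, f, {"a", "c"}))
```

If ν gave positive mass to the point "c", which lies outside f(X), the function would raise an error, matching the necessity part of Theorem 13.1.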

Here is a well-known fact which is a simple application of Theorem 13.1:

Proposition 13.1 Let ν be a measure on F and let B ⊂ Y be thick with respect to ν. Then there exists a unique measure µ on $\mathcal{F}_{|B}$ (with $\mathcal{F}_{|B}$ the trace σ-algebra on B) such that µ(F ∩ B) = ν(F) for all F ∈ F.

Proof Let i : B → Y be the inclusion mapping (with i(y) = y for all y ∈ B). Then $i^{-1}(F) = F \cap B$ for each F ⊂ Y and hence $i^{-1}(\mathcal{F}) = \mathcal{F}_{|B}$. Moreover, i(B) = B and so by Theorem 13.1 there exists a unique measure µ on $\mathcal{F}_{|B}$ such that $\nu = i_*\mu$, i.e., µ is the unique measure such that µ(F ∩ B) = ν(F) for all F ∈ F.

We end the chapter with a construction which is needed in [16] when dealing with particle models.

If (X, E) is a measurable space then let X/ denote the set of all measures on E taking only values in the set N (and so each measure p ∈ X/ is finite, since p(X) ∈ N); put E/ = σ(E♦), where E♦ is the set of all subsets of X/ having the form {p ∈ X/ : p(E) = k} with E ∈ E and k ∈ N.

Now let (Y, F) be a second measurable space and let f : (X, E) → (Y, F) be a measurable mapping. If p ∈ X/ and f_*p is the image measure on F then (f_*p)(F) = p(f⁻¹(F)) ∈ N for all F ∈ F and so f_*p ∈ Y/. Thus there is a mapping f/ : X/ → Y/ given by f/(p) = f_*p for each p ∈ X/.


Proposition 13.2 (1) The mapping f/ : (X/, E/) → (Y/, F/) is measurable.

(2) If f⁻¹(F) = E then (f/)⁻¹(F/) = E/.

(3) If f⁻¹(F) = E and f(X) ∈ F then f/(X/) ∈ F/.

Proof (1) Let F ∈ F and k ∈ N; then

(f/)⁻¹({q ∈ Y/ : q(F) = k}) = {p ∈ X/ : f/(p)(F) = k} = {p ∈ X/ : p(f⁻¹(F)) = k} .

Thus (f/)⁻¹(F♦) ⊂ E♦ and therefore by Lemma 2.4 (f/)⁻¹(F/) ⊂ E/.

(2) Let E ∈ E and k ∈ N; then there exists F ∈ F with f⁻¹(F) = E and the calculation in (1) shows that

(f/)⁻¹({q ∈ Y/ : q(F) = k}) = {p ∈ X/ : p(E) = k} .

This implies (f/)⁻¹(F♦) = E♦ (since in (1) we showed that (f/)⁻¹(F♦) ⊂ E♦). Therefore by Proposition 2.4 (f/)⁻¹(F/) = E/.

(3) Put f(X) = D and so D ∈ F. If p ∈ X/ then

(f_*p)(D) = p(f⁻¹(D)) = p(X) = p(f⁻¹(Y)) = (f_*p)(Y) .

On the other hand, if q ∈ Y/ with q(D) = q(Y) then by Theorem 13.1 there exists a measure p on E with f_*p = q, and since p(f⁻¹(F)) = q(F) ∈ N for all F ∈ F and f⁻¹(F) = E it follows that p ∈ X/. Therefore

f/(X/) = {f/(p) : p ∈ X/} = {f_*p : p ∈ X/} = ⋃_{n∈N} {q ∈ Y/ : q(D) = n} ∩ {q ∈ Y/ : q(Y) = n}

and hence f/(X/) ∈ F/.

It is often useful to partition the space X/ into components consisting of those measures having the same total measure. Thus for each n ∈ N let X^n_/ denote the set of all measures p on E taking only values in the set N_n = {0, 1, . . . , n} and with p(X) = n; put E^n_/ = σ(E^n_♦), where E^n_♦ is the set of all subsets of X^n_/ having the form {p ∈ X^n_/ : p(E) = k} with E ∈ E and k ∈ N_n. Thus X/ is the disjoint union of the sets X^n_/, n ∈ N.

Proposition 13.3 We have E/ = {A ⊂ X/ : A ∩ X^n_/ ∈ E^n_/ for each n ∈ N} and therefore the measurable space (X/, E/) is the disjoint union of the measurable spaces (X^n_/, E^n_/), n ∈ N.


Proof Put D = {A ⊂ X/ : A ∩ X^n_/ ∈ E^n_/ for each n ∈ N}, so D is the σ-algebra in the definition of the disjoint union.

Let D^n_/ = {A ∩ X^n_/ : A ∈ E/}; then D^n_/ is the trace σ-algebra of E/ on X^n_/ and thus D^n_/ = σ(D^n_♦), where D^n_♦ = {A ∩ X^n_/ : A ∈ E♦}. But D^n_♦ = E^n_♦ and hence D^n_/ = E^n_/, i.e., E^n_/ = {A ∩ X^n_/ : A ∈ E/}. Therefore if A ∈ E/ then A ∩ X^n_/ ∈ E^n_/ for each n ∈ N, which implies that A ∈ D. This shows E/ ⊂ D.

Conversely, let A ∈ D; then A ∩ X^n_/ ∈ E^n_/ and thus there exists A_n ∈ E/ with A ∩ X^n_/ = A_n ∩ X^n_/ and this implies that A ∩ X^n_/ ∈ E/ for each n ∈ N, since X^n_/ ∈ E/. Finally, we then have A = ⋃_{n∈N} (A ∩ X^n_/) ∈ E/, i.e., D ⊂ E/, and hence D = E/.

The result corresponding to Proposition 13.2 also holds for each n ∈ N. Let f : (X, E) → (Y, F) be a measurable mapping. If p ∈ X^n_/ then f_*p ∈ Y^n_/. Thus there is a mapping f^n_/ : X^n_/ → Y^n_/ given by f^n_/(p) = f_*p for each p ∈ X^n_/.

Proposition 13.4 (1) The mapping f^n_/ : (X^n_/, E^n_/) → (Y^n_/, F^n_/) is measurable.

(2) If f⁻¹(F) = E then (f^n_/)⁻¹(F^n_/) = E^n_/.

(3) If f⁻¹(F) = E and f(X) ∈ F then f^n_/(X^n_/) ∈ F^n_/.

Proof This is the same as the proof of Proposition 13.2.

14 Kernels

Kernels can be thought of as measurable families of measures. In general there are two measurable spaces (X, E) and (Y, F) involved in the definition of a kernel and it turns out that kernels can be identified with the set of continuous linear mappings π : M(F) → M(E). This is the direct analogue of the fact that measures can be regarded as continuous linear mappings $\mu : M(\mathcal{F}) \to \mathbb{R}^+_\infty$.

Recall from Chapter 13 that if X and Y are non-empty sets and N is a subspace of M(Y) then a mapping Φ : N → M(X) is linear if Φ(af + bg) = aΦ(f) + bΦ(g) for all f, g ∈ N and all a, b ∈ R+. Moreover, Φ is monotone if Φ(g) ≤ Φ(f) whenever f, g ∈ N with g ≤ f, and if N is a complete subspace of M(Y) then a linear mapping Φ : N → M(X) is continuous if it is monotone and

\[ \Phi\bigl(\lim_{n\to\infty} f_n\bigr) = \lim_{n\to\infty} \Phi(f_n) \]

for each increasing sequence $\{f_n\}_{n\ge1}$ of elements from N.

To start with let (Y, F) be a measurable space and X be a non-empty set. We say that a mapping $\pi : X \times \mathcal{F} \to \mathbb{R}^+_\infty$ is a pre-kernel if π(x, ·) is a measure on F for each x ∈ X.

Proposition 14.1 Let Φ : M(F) → M(X) be a continuous linear mapping and define $\pi : X \times \mathcal{F} \to \mathbb{R}^+_\infty$ by $\pi(x, F) = \Phi(I_F)(x)$. Then π is a pre-kernel.

Proof Let x ∈ X and define $\Phi_x : M(\mathcal{F}) \to \mathbb{R}^+_\infty$ by $\Phi_x(f) = \Phi(f)(x)$ for each f ∈ M(F). Then, since Φ is a continuous linear mapping, $\Phi_x : M(\mathcal{F}) \to \mathbb{R}^+_\infty$ is also a continuous linear mapping and therefore by Proposition 10.4 the mapping $F \mapsto \Phi_x(I_F)$ is a measure on F. But we have $\Phi_x(I_F) = \Phi(I_F)(x) = \pi(x, F)$ and hence π(x, ·) is a measure on F.

Theorem 14.1 For each pre-kernel π there exists a unique continuous linear mapping $\Phi_\pi : M(\mathcal{F}) \to M(X)$ such that $\Phi_\pi(I_F) = \pi(\cdot, F)$ for all F ∈ F.

Proof In order to emphasise the structure of the proof it is convenient to again employ the notation from Theorem 10.1: Thus if µ is a measure on F then $\Phi_\mu : M(\mathcal{F}) \to \mathbb{R}^+_\infty$ is the unique continuous linear mapping with $\Phi_\mu(I_F) = \mu(F)$ for all F ∈ F. Define a mapping $\Phi_\pi : M(\mathcal{F}) \to M(X)$ by

\[ \Phi_\pi(f)(x) = \Phi_{\pi(x,\cdot)}(f) \]

for all f ∈ M(F), x ∈ X. In particular, $\Phi_\pi(I_F)(x) = \Phi_{\pi(x,\cdot)}(I_F) = \pi(x, F)$ for all F ∈ F, x ∈ X, i.e., $\Phi_\pi(I_F) = \pi(\cdot, F)$ for all F ∈ F.


Let f, g ∈ M(F) and a, b ∈ R+; then, since $\Phi_{\pi(x,\cdot)}$ is linear,

\begin{align*}
\Phi_\pi(af + bg)(x) &= \Phi_{\pi(x,\cdot)}(af + bg) = a\Phi_{\pi(x,\cdot)}(f) + b\Phi_{\pi(x,\cdot)}(g) \\
&= a\Phi_\pi(f)(x) + b\Phi_\pi(g)(x) = (a\Phi_\pi(f) + b\Phi_\pi(g))(x)
\end{align*}

for all x ∈ X and thus $\Phi_\pi(af + bg) = a\Phi_\pi(f) + b\Phi_\pi(g)$. This shows $\Phi_\pi$ is linear, and so it is also monotone, since by Proposition 9.1 M(F) is complemented.

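Now let $\{f_n\}_{n\ge1}$ be an increasing sequence from M(F) with $\lim_n f_n = f$. Then, since $\Phi_{\pi(x,\cdot)}$ is continuous,

\[ \Phi_\pi(f)(x) = \Phi_{\pi(x,\cdot)}(f) = \lim_{n\to\infty} \Phi_{\pi(x,\cdot)}(f_n) = \lim_{n\to\infty} \Phi_\pi(f_n)(x) = \bigl(\lim_{n\to\infty} \Phi_\pi(f_n)\bigr)(x) \]

for all x ∈ X and thus $\Phi_\pi(f) = \lim_n \Phi_\pi(f_n)$. This shows $\Phi_\pi$ is continuous.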

Putting this together gives us that $\Phi_\pi : M(\mathcal{F}) \to M(X)$ is a continuous linear mapping with $\Phi_\pi(I_F) = \pi(\cdot, F)$ for all F ∈ F. Finally, if $\Phi'_\pi : M(\mathcal{F}) \to M(X)$ is any continuous linear mapping with $\Phi'_\pi(I_F) = \pi(\cdot, F)$ for all F ∈ F then $N = \{f \in M(\mathcal{F}) : \Phi_\pi(f) = \Phi'_\pi(f)\}$ is a complete subspace of M(F) with $I_F \in N$ for all F ∈ F and therefore by Proposition 9.2 N = M(F), which means that $\Phi_\pi = \Phi'_\pi$.

By Proposition 14.1 and Theorem 14.1 the mapping $\pi \mapsto \Phi_\pi$ defines a bijection between the set of pre-kernels and the set of continuous linear mappings from M(F) to M(X).

If $\pi : X \times \mathcal{F} \to \mathbb{R}^+_\infty$ is a pre-kernel then we just write π(f) instead of $\Phi_\pi(f)$. This means that π will also be considered as the unique continuous linear mapping π : M(F) → M(X) with $\pi(I_F) = \pi(\cdot, F)$ for all F ∈ F.

Proposition 14.2 Let π : M(F) → M(X) be a pre-kernel. Then

π(f)(x) = π(x, ·)(f)

for all x ∈ X, f ∈ M(F).

Proof This follows directly from the proof of Theorem 14.1, since this is how the unique mapping $\Phi_\pi$ was defined. It also follows from Proposition 9.2 (without appealing to the proof of Theorem 14.1) since it is easily checked that

N = {f ∈ M(F) : π(f)(x) = π(x, ·)(f) for all x ∈ X}

is a complete subspace of M(F) with IF ∈ N for all F ∈ F .

Theorem 14.2 Let π be a pre-kernel and let $\{f_n\}_{n\ge1}$ be a decreasing sequence from M(F) such that $\pi(f_m)(x) < \infty$ for all x ∈ X for some m ≥ 1. Then

\[ \pi\bigl(\lim_{n\to\infty} f_n\bigr) = \lim_{n\to\infty} \pi(f_n) . \]


Proof This follows immediately from Theorem 10.2.

Let (X, E) be a measurable space (so we now have two measurable spaces (X, E) and (Y, F)).

Lemma 14.1 Let Φ : M(F) → M(X) be a continuous linear mapping. Then Φ(f) ∈ M(E) for all f ∈ M(F) if and only if $\Phi(I_F) \in M(\mathcal{E})$ for all F ∈ F.

Proof By Lemma 13.1 N = {f ∈ M(F) : Φ(f) ∈ M(E)} is a complete subspace of M(F). Therefore Proposition 9.2 implies that N = M(F) if and only if $I_F \in N$ for all F ∈ F. In other words, Φ(f) ∈ M(E) for all f ∈ M(F) if and only if $\Phi(I_F) \in M(\mathcal{E})$ for all F ∈ F.

The statement that Φ : M(F) → M(E) is a linear mapping will be used to express the fact that Φ : M(F) → M(X) is a linear mapping such that Φ(f) ∈ M(E) for all f ∈ M(F).

A mapping $\pi : X \times \mathcal{F} \to \mathbb{R}^+_\infty$ is called an (X, E)|(Y, F)-kernel if it is a pre-kernel and π(·, F) ∈ M(E) for each F ∈ F. If the measurable spaces (X, E) and (Y, F) can be inferred from the context then π will simply be called a kernel. For example, this is the case if it is known that π is a mapping from X × F to $\mathbb{R}^+_\infty$ and E is the only σ-algebra of subsets of X which has been introduced.

Theorem 14.3 (1) If Φ : M(F) → M(E) is a continuous linear mapping and $\pi : X \times \mathcal{F} \to \mathbb{R}^+_\infty$ is defined by $\pi(x, F) = \Phi(I_F)(x)$ for all x ∈ X, F ∈ F, then π is a kernel.

(2) If $\pi : X \times \mathcal{F} \to \mathbb{R}^+_\infty$ is a kernel and $\Phi_\pi : M(\mathcal{F}) \to M(X)$ is the unique continuous linear mapping (given in Theorem 14.1) with $\Phi_\pi(I_F) = \pi(\cdot, F)$ for all F ∈ F then $\Phi_\pi(f) \in M(\mathcal{E})$ for all f ∈ M(F). This means that there is a mapping $\Phi_\pi : M(\mathcal{F}) \to M(\mathcal{E})$.

Proof (1) By Proposition 14.1 π is a pre-kernel and $\pi(\cdot, F) = \Phi(I_F) \in M(\mathcal{E})$ for each F ∈ F. Thus π is a kernel.

(2) By definition $\Phi_\pi(I_F) = \pi(\cdot, F) \in M(\mathcal{E})$ holds for all F ∈ F and therefore by Lemma 14.1 $\Phi_\pi(f) \in M(\mathcal{E})$ for all f ∈ M(F).

By Theorem 14.3 the mapping $\pi \mapsto \Phi_\pi$ defines a bijection between the set of (X, E)|(Y, F)-kernels and the set of continuous linear mappings from M(F) to M(E).

As with pre-kernels, we write π(f) instead of $\Phi_\pi(f)$ when $\pi : X \times \mathcal{F} \to \mathbb{R}^+_\infty$ is a kernel. Thus π will also be considered as the unique continuous linear mapping π : M(F) → M(E) with $\pi(I_F) = \pi(\cdot, F)$ for all F ∈ F.


A kernel π : M(F) → M(E) is said to be finite if π(1) ∈ MF(E), i.e., if the measureπ(x, ·) is finite for each x ∈ X. Moreover, π is a probability kernel if π(1) = 1,i.e., if π(x, ·) is a probability measure for each x ∈ X.

Let us point out that in most applications (including Specifications and their Gibbs States [16]) the kernels involved are defined over a single space, i.e., with (X, E) = (Y,F). However, the results typical to this special situation will be dealt with in [16] when they are needed.

In Chapter 13 we already saw a very special kind of kernel: If f : (X, E) → (Y,F) is a measurable mapping then the induced mapping f∗ : M(F) → M(E) (with f∗g = g ◦ f) is continuous and linear and it is thus a kernel. As a mapping f∗ : X × F → R+∞ we have f∗(x, F) = If−1(F)(x). (This should not be regarded as a typical example of a kernel.)
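The kernel induced by a measurable mapping can be checked by hand on finite sets. The following is only a minimal sketch under assumptions not taken from the notes: X = {0, ..., 5} and Y = {0, 1, 2} carry their power sets, the mapping f(x) = x mod 3 and all function names are hypothetical. It compares f∗(IF)(x) = (IF ◦ f)(x) with the indicator of f−1(F) at x for every subset F of Y.

```python
from itertools import combinations

# Hypothetical finite example: X = {0,...,5}, Y = {0,1,2}, both with their power sets,
# and f(x) = x mod 3 (every mapping is measurable for power-set sigma-algebras).
X = range(6)
Y = range(3)


def f(x):
    return x % 3


def pullback(g):
    """Return f*g = g o f, a mapping on X."""
    return lambda x: g(f(x))


def pullback_kernel(x, F):
    """Return f*(x, F), the indicator of f^{-1}(F) evaluated at x."""
    return 1 if f(x) in F else 0


def indicator(F):
    """Return the indicator mapping of the subset F of Y."""
    return lambda y: 1 if y in F else 0


# All subsets of Y, i.e. the whole sigma-algebra on Y in this toy setting.
subsets_of_Y = [set(c) for r in range(len(Y) + 1) for c in combinations(Y, r)]

# Check f*(I_F)(x) == f*(x, F) for every F and every x.
agree = all(pullback(indicator(F))(x) == pullback_kernel(x, F)
            for F in subsets_of_Y for x in X)
```

Each measure f∗(x, ·) here is the point mass at f(x), which is why this example is far from typical.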

Now let π : X × F → R+∞ be a kernel and µ be a measure on E. Then the mappings π : M(F) → M(E) and µ : M(E) → R+∞ can be composed, which results in the mapping µπ : M(F) → R+∞ with (µπ)(f) = µ(π(f)) for all f ∈ M(F). Now this is clearly a continuous linear mapping and hence µπ is a measure on F. Considered as a mapping µπ : F → R+∞ it is given by

(µπ)(F ) = µ(π(IF )) = µ(π(·, F ))

for each F ∈ F . For those who insist on using integral signs this means that

(µπ)(F) = ∫ π(·, F) dµ

for each F ∈ F. Next let (Z,G) be a further measurable space and consider kernels π : X × F → R+∞ and ϱ : Y × G → R+∞. Then composing the mappings ϱ : M(G) → M(F) and π : M(F) → M(E) gives the mapping πϱ : M(G) → M(E) with (πϱ)(f) = π(ϱ(f)) for all f ∈ M(G). Now πϱ is clearly a continuous linear mapping and so it is a kernel, and considered as a mapping πϱ : X × G → R+∞ it is given by

(πϱ)(x,G) = (πϱ)(IG)(x) = π(ϱ(IG))(x) = π(ϱ(·, G))(x) = π(x, ·)(ϱ(·, G))

for all x ∈ X, G ∈ G, with the final equality following from Proposition 14.2. In terms of integral signs this means that

(πϱ)(x,G) = ∫ ϱ(y,G) π(x, dy)

for all x ∈ X, G ∈ G.

Suppose now in addition that we also have a measure µ on E . Then

µ(πϱ) = (µπ)ϱ

clearly holds, since the composition of mappings is associative.
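On finite sets with their power sets a kernel is determined by its point masses, so the compositions above reduce to matrix products. The sketch below makes that concrete under assumptions that are not part of the notes: the spaces X, Y, Z have 2, 3 and 2 points, the numerical values are arbitrary, and the helper names `apply_kernel`, `integrate`, `compose` and `push` are hypothetical.

```python
from fractions import Fraction as Fr

# Hypothetical finite spaces X (2 points), Y (3 points), Z (2 points) with power sets.
# A kernel is stored through its point masses: pi[x][y] = pi(x, {y}).
pi = [[Fr(1, 2), Fr(1, 2), Fr(0)],
      [Fr(0), Fr(1, 3), Fr(2, 3)]]
rho = [[Fr(1), Fr(0)],
       [Fr(1, 4), Fr(3, 4)],
       [Fr(0), Fr(1)]]
mu = [Fr(1, 5), Fr(4, 5)]  # point masses of a probability measure on X


def apply_kernel(kernel, f):
    """Return the mapping kernel(f): x -> sum_y kernel(x, {y}) f(y)."""
    return [sum(p * v for p, v in zip(row, f)) for row in kernel]


def integrate(measure, f):
    """Return measure(f) = sum_x measure({x}) f(x)."""
    return sum(m * v for m, v in zip(measure, f))


def compose(kernel1, kernel2):
    """Return the composed kernel with point masses sum_y kernel1(x,{y}) kernel2(y,{z})."""
    cols = range(len(kernel2[0]))
    return [[sum(row[y] * kernel2[y][z] for y in range(len(row))) for z in cols]
            for row in kernel1]


def push(measure, kernel):
    """Return the measure with point masses sum_x measure({x}) kernel(x, {y})."""
    cols = range(len(kernel[0]))
    return [sum(measure[x] * kernel[x][y] for x in range(len(measure))) for y in cols]


lhs = push(mu, compose(pi, rho))   # mu (pi rho)
rhs = push(push(mu, pi), rho)      # (mu pi) rho
```

In this representation the associativity of composition is just the associativity of matrix multiplication; exact fractions are used so that the two sides can be compared with `==`.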


Lemma 14.2 Let π : X × F → R+∞ and ϱ : Y × G → R+∞ be kernels. Then

(πϱ)(x, ·) = π(x, ·)ϱ

for all x ∈ X. (This states that the measures (πϱ)(x, ·) and π(x, ·)ϱ on G are equal for each x ∈ X.)

Proof Let x ∈ X, G ∈ G; then as above (and thus making use of Proposition 14.2) (πϱ)(x,G) = π(x, ·)(ϱ(·, G)) = (π(x, ·)ϱ)(G) and hence (πϱ)(x, ·) = π(x, ·)ϱ for all x ∈ X.

We end the chapter by presenting a couple of rather technical results which will be important later. These deal with topics such as conditions which uniquely determine a kernel and criteria for showing that a pre-kernel is actually a kernel.

Proposition 14.3 Let π : M(F) → M(X) be a pre-kernel, let f ∈ M(F) and x ∈ X. Then:

(1) If π(f)(x) = 0 then π(fg)(x) = 0 for all g ∈ M(F).

(2) If π(f)(x) < ∞ then π(fIF)(x) = π(f)(x) holds for any F ∈ F such that {y ∈ Y : 0 < f(y) < ∞} ⊂ F. Moreover, then also π(hfIF)(x) = π(hf)(x) for all h ∈ M(F).

Proof Since x ∈ X is fixed both parts are just statements about measures. In fact, (1) is just Proposition 10.5 (2) and (2) is Proposition 10.6 (5).

As with a kernel, a pre-kernel π : M(F) → M(X) is finite if π(1) ∈ MF(X), i.e., if the measure π(x, ·) is finite for each x ∈ X.

Proposition 14.4 Let S ⊂ F be closed under finite intersections with Y ∈ S and σ(S) = F, and let π1, π2 be finite pre-kernels such that π1(IA) = π2(IA) for all A ∈ S. Then π1 = π2.

Proof This follows immediately from Proposition 3.3.

Lemma 14.3 Let π be a finite pre-kernel such that π(1) ∈ M(E). Then the set D = {A ∈ F : π(IA) ∈ M(E)} is both a d-system and a monotone class.

Proof Let A1, A2 ∈ D with A1 ⊂ A2; then π(IA2) = π(IA1) + π(IA2\A1) and so by Proposition 9.1 π(IA2\A1) ∈ M(E) (since here π(IA2)(x) < ∞ for each x ∈ X). Therefore D is a d-system, since by assumption Y ∈ D (because π(IY) = π(1) ∈ M(E)). The proof that D is a monotone class is the same as the corresponding part of Proposition 9.2 (making use of Theorems 14.1 and 14.2).


Proposition 14.5 Let S ⊂ F be closed under finite intersections with Y ∈ S and σ(S) = F, and let π : M(F) → M(X) be a finite pre-kernel with π(IA) ∈ M(E) for all A ∈ S. Then π is a kernel, i.e., π(f) ∈ M(E) for all f ∈ M(F).

Proof This is the same as the proof of Proposition 3.3 (using Lemma 14.3 instead of Lemma 3.4).

The requirement in Proposition 14.5 that Y ∈ S just means π(1) ∈ M(E). In many applications this holds trivially because π is a probability pre-kernel, i.e., π(1) = 1.

Here is a version of Proposition 14.5 which also works for pre-kernels which are, in a certain sense, only σ-finite. This will be needed in Chapter 15.

Lemma 14.4 Let S ⊂ F be closed under finite intersections with σ(S) = F. Let π : M(F) → M(X) be a pre-kernel and v ∈ M+F(F) with π(v) ∈ MF(E), and suppose that π(vIA) ∈ M(E) for all A ∈ S. Then π is a kernel, i.e., π(f) ∈ M(E) for all f ∈ M(F).

Proof Define π′ : M(F) → M(X) by π′(f) = π(vf) for all f ∈ M(F). Then π′ is a finite pre-kernel and by Proposition 14.5 (applied to S′ = S ∪ {Y}) π′ is a kernel, i.e., π(vf) ∈ M(E) for all f ∈ M(F). But v−1f ∈ M(F) for all f ∈ M(F) and vv−1 = 1 and this implies that π(f) ∈ M(E) for all f ∈ M(F).

Finally, the following result shows how it is sometimes possible to show that a pre-kernel is a kernel by restricting things to suitable subsets of X.

Lemma 14.5 Let π : M(F) → M(X) be a pre-kernel and let {En}n≥1 be a sequence from E with X = ⋃_{n≥1} En; for each n ≥ 1 there is then the restriction πn : M(F) → M(En) defined by πn(f)(x) = π(f)(x) for each x ∈ En, and clearly πn is a pre-kernel. Suppose each πn is a kernel, i.e., that πn(f) ∈ M(En) for all f ∈ M(F), n ≥ 1, where En is the trace σ-algebra of E on En. Then π is a kernel.

Proof Let f ∈ M(F), a ∈ R+∞ and put Dn = {x ∈ En : πn(f)(x) < a}. Then Dn ∈ En and so Dn ∈ E (with Dn here considered as a subset of X). Therefore {x ∈ X : π(f)(x) < a} = ⋃_{n≥1} Dn ∈ E and so by Lemma 2.7 π(f) ∈ M(E). This shows that π is a kernel.

15 Product measures

In this chapter we first discuss the product of two σ-finite measures. This measure is unique and has several useful properties (Fubini’s theorem). We then show that a product of two arbitrary measures exists (although in general it is neither unique nor has any reasonable properties). The product of finitely many σ-finite measures is considered as well as a countable product of probability measures. Finally we look at something which we call an implicit product: This can be seen as trying to construct a product measure without explicitly knowing the underlying product structure.

To start with let (X, E) and (Y,F) be measurable spaces, let µ be a σ-finite measure on E and ν a σ-finite measure on F. We show that there is a unique measure µ × ν on E × F such that

(µ × ν)(E × F) = µ(E)ν(F)

for all E ∈ E, F ∈ F. This measure is called the product of µ and ν. It is σ-finite, and it is finite if both µ and ν are finite.

Recall from Chapter 2 that the product σ-algebra E × F is defined to be σ(R), where R is the set of measurable rectangles, which in the present case (with just two factors) are the sets of the form E × F with E ∈ E and F ∈ F.

For the moment make no assumptions about µ and ν. Let f ∈ M(E × F); by Proposition 2.7 fx ∈ M(F) for each x ∈ X and f y ∈ M(E) for each y ∈ Y, recalling that the sections fx and f y are defined by fx(y′) = f(x, y′) for all y′ ∈ Y and f y(x′) = f(x′, y) for all x′ ∈ X. There is thus a mapping ν�(f) : X → R+∞ defined by

ν�(f)(x) = ν(fx)

for each x ∈ X, and a mapping µ�(f) : Y → R+∞ defined for each y ∈ Y by

µ�(f)(y) = µ(f y) .

This results in mappings ν� : M(E × F) → M(X) and µ� : M(E × F) → M(Y ).

Lemma 15.1 The mappings ν� and µ� are linear and continuous, and thus they are both pre-kernels. Moreover, these pre-kernels are finite if µ and ν are finite, since ν�(1)(x) = ν(1) for all x ∈ X and µ�(1)(y) = µ(1) for all y ∈ Y.

Proof Let f, g ∈ M(E × F) and a, b ∈ R+; then for each x ∈ X

ν�(af + bg)(x) = ν((af + bg)x)

= ν(afx + bgx) = aν(fx) + bν(gx) = aν�(f)(x) + bν�(g)(x) ,



i.e., ν�(af + bg) = aν�(f) + bν�(g). Thus ν� is linear. Now if g ≤ f then gx ≤ fx and so ν�(g)(x) = ν(gx) ≤ ν(fx) = ν�(f)(x) for each x ∈ X, i.e., ν�(g) ≤ ν�(f), and hence ν� is monotone. Next, let {fn}n≥1 be an increasing sequence from M(E × F) and put f = limn fn. Then {(fn)x}n≥1 is an increasing sequence from M(F) with limn(fn)x = fx and thus with

ν�(f)(x) = ν(fx) = lim_{n→∞} ν((fn)x) = lim_{n→∞} ν�(fn)(x)

for each x ∈ X. Therefore ν�(f) = limn ν�(fn), and so ν� is continuous. Finally ν�(1)(x) = ν(1x) = ν(1) for each x ∈ X. The proof for µ� is of course exactly the same.

Now suppose µ and ν are σ-finite measures.

Lemma 15.2 The mappings ν� and µ� are both kernels: ν�(f) ∈ M(E) and µ�(f) ∈ M(F) for each f ∈ M(E × F).

Proof By Lemma 10.3 there exists v ∈ M+F(F) such that ν(v) < ∞. Define a mapping v′ : X × Y → R+∞ by v′(x, y) = v(y) for all x ∈ X, y ∈ Y; thus v′ = v ◦ p2 (with p2 : X × Y → Y the projection onto the second component) and hence v′ ∈ M+F(E × F).

Let R = E × F ∈ R; then ν�(v′IR)(x) = ν(IE(x)vIF) = IE(x)ν(vIF) for each x ∈ X, i.e., ν�(v′IR) = ν(vIF)IE, and thus ν�(v′IR) ∈ M(E) for each R ∈ R. Hence by Lemma 14.4 ν� is a kernel, since R is closed under finite intersections, X × Y ∈ R and E × F = σ(R). The proof for µ� is exactly the same.

By Lemma 15.2 ν� and µ� are kernels and so we now have the measures µν� and νµ� on E × F. Note that

µ(ν�(1)) = µ(ν(1)) = µ(X)ν(Y ) = ν(µ(1)) = ν(µ�(1))

and so these measures are finite if µ and ν are finite.

Lemma 15.3 µν� = νµ�, and in particular

(µν�)(E × F ) = µ(E)ν(F ) = (νµ�)(E × F )

for all E ∈ E , F ∈ F . Moreover, the measure µν� (= νµ�) is σ-finite.

Proof By Lemma 10.3 there exist u ∈ M+F(E) and v ∈ M+F(F) with µ(u) < ∞ and ν(v) < ∞. Define w : X × Y → R+∞ by w(x, y) = u(x)v(y) for all x ∈ X, y ∈ Y; thus w ∈ M+F(E × F). Put ω� = (µν�)·w and ω� = (νµ�)·w. If R = E × F ∈ R then

ω�(E × F ) = (µν�)(wIE×F ) = µ(ν�(wIE×F ))

= µ(ν(vIF )uIE) = µ(uIE)ν(vIF )

= ν(µ(uIE)vIF ) = ν(µ�(wIE×F )) = (νµ�)(wIE×F ) = ω�(E × F ) .

This shows (with E = X and F = Y) that the measures ω� and ω� are finite and ω�(R) = ω�(R) for all R ∈ R. But X × Y ∈ R, R is closed under finite intersections and σ(R) = E × F, and thus by Proposition 3.3 ω� = ω�. It now follows from Lemma 12.2 that

µν� = ((µν�)·w)·w−1 = ω�·w−1 = ω�·w−1 = ((νµ�)·w)·w−1 = νµ� ,

and in particular

(µν�)(E × F ) = µ(ν�(IE×F )) = µ(ν(IF )IE)

= µ(E)ν(F ) = ν(µ(IE)IF ) = ν(µ�(IE×F )) = (νµ�)(E × F )

for all E ∈ E, F ∈ F. Finally, by Lemma 10.3 the measure µν� is σ-finite, since (µν�)(w) = µ(u)ν(v) < ∞.

The measure µν� (= νµ�) will be denoted by µ × ν. As already mentioned, it is called the product of µ and ν. If µ and ν are finite then µ × ν is also finite, since (µ × ν)(X × Y) = µ(X)ν(Y).

Theorem 15.1 µ× ν is the unique measure on E × F satisfying

(µ× ν)(E × F ) = µ(E)ν(F )

for all E ∈ E, F ∈ F. Moreover, (µ × ν)(f) = µ(ν�(f)) = ν(µ�(f)) holds for all f ∈ M(E × F).

Proof We only have to show the uniqueness, since the rest is Lemma 15.3. Thus let ω1, ω2 be measures on E × F with ω1(E × F) = µ(E)ν(F) = ω2(E × F) for all E ∈ E, F ∈ F. Let R′ = {E × F ∈ R : µ(E) < ∞ and ν(F) < ∞}. Now since µ and ν are σ-finite there exists an increasing sequence {En}n≥1 from E with µ(En) < ∞ for each n ≥ 1 and X = ⋃_{n≥1} En and an increasing sequence {Fn}n≥1 from F with ν(Fn) < ∞ for each n ≥ 1 and Y = ⋃_{n≥1} Fn. Then {En × Fn}n≥1 is an increasing sequence from R′ with X × Y = ⋃_{n≥1} En × Fn. Moreover, R′ is closed under finite intersections and σ(R′) = E × F (since R ⊂ σ(R′)). But the numbers ω1(R) and ω2(R) are finite and equal for all R ∈ R′ and therefore by Lemma 3.5 ω1 = ω2.


The final statement in Theorem 15.1 (that (µ × ν)(f) = µ(ν�(f)) = ν(µ�(f)) for all f ∈ M(E × F)) is more-or-less what is known as Fubini’s theorem.
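For finite sets with their power sets the identity in Theorem 15.1 can be verified by direct summation. The following is a minimal sketch under assumptions that do not come from the notes: the sets X and Y, the point masses of µ and ν, the mapping f and the helper names `nu_section` and `mu_section` are all hypothetical choices made for illustration.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical finite example with power-set sigma-algebras.
X = ["a", "b"]
Y = [0, 1, 2]
mu = {"a": Fr(2), "b": Fr(1, 2)}          # point masses of a finite measure on X
nu = {0: Fr(1, 3), 1: Fr(1), 2: Fr(3)}    # point masses of a finite measure on Y


def f(x, y):
    """An arbitrary non-negative mapping on X x Y."""
    return Fr(y * y + (1 if x == "a" else 4))


def nu_section(g):
    """Return the mapping x -> nu(g_x), integrating the section g_x over Y."""
    return {x: sum(nu[y] * g(x, y) for y in Y) for x in X}


def mu_section(g):
    """Return the mapping y -> mu(g^y), integrating the section g^y over X."""
    return {y: sum(mu[x] * g(x, y) for x in X) for y in Y}


# Point masses of the product measure: (mu x nu)({(x, y)}) = mu({x}) nu({y}).
product_weights = {(x, y): mu[x] * nu[y] for x, y in product(X, Y)}

direct = sum(w * f(x, y) for (x, y), w in product_weights.items())
x_outer = sum(mu[x] * v for x, v in nu_section(f).items())   # integrate over Y first
y_outer = sum(nu[y] * v for y, v in mu_section(f).items())   # integrate over X first
```

The three quantities `direct`, `x_outer` and `y_outer` correspond to the three expressions in the final statement of the theorem; in the finite case their equality is just a rearrangement of a finite double sum.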

We next show that there always exists a measure µ × ν on E × F such that (µ × ν)(E × F) = µ(E)ν(F) for all E ∈ E, F ∈ F, even when µ and ν are not σ-finite. However, the analogue of Fubini’s theorem does not hold in general and if this is the case then the measure is not very useful. Moreover, it need not be unique.

Thus in what follows let µ and ν be arbitrary measures on E and F respectively. The mappings ν� and µ� are defined as before and Lemma 15.1 shows that ν� and µ� are pre-kernels.

Again denote by A (resp. by A′) the set of all finite unions (resp. all finite disjoint unions) of elements of R. Then by Lemma 2.8 A is an algebra and A′ = A. Moreover, σ(A) = σ(R). Let

N� = {f ∈ M(E × F) : ν�(f) ∈ M(E)} ,

N� = {f ∈ M(E × F) : µ�(f) ∈ M(F)} .

Lemma 15.4 N� and N� are complete subspaces of M(E × F) and they both contain ME(A).

Proof By Lemmas 13.1 (2) and 15.1 it follows that N� and N� are complete subspaces of M(E × F). Also if R = E × F ∈ R then, as in the proof of Lemma 15.2, ν�(IR) = ν(F)IE ∈ M(E) and so IR ∈ N� for all R ∈ R. But if A ∈ A = A′ then there exist R1, . . . , Rn ∈ R such that IA = IR1 + · · · + IRn and hence IA ∈ N� for all A ∈ A. Therefore by Lemma 9.2 (1) ME(A) ⊂ N�. In the same way ME(A) ⊂ N�.

By Lemma 15.4 we now have continuous linear mappings µν� : N� → R+∞ and νµ� : N� → R+∞. Define mappings ω1, ω2 : A → R+∞ by ω1(A) = (µν�)(IA) and ω2(A) = (νµ�)(IA) for all A ∈ A. Then Proposition 10.1 (1) shows that ω1 and ω2 are measures on A and therefore by Theorem 4.1 they extend to measures on σ(A) = E × F which will also be denoted by ω1 and ω2 respectively (although these extensions are, in general, not unique). Moreover,

ω1(E × F ) = (µν�)(IE×F ) = µ(E)ν(F ) = (νµ�)(IE×F ) = ω2(E × F )

for all E ∈ E, F ∈ F. This means that ω1 and ω2 are both candidates for a ‘product measure’, but in general they will not be equal (even if ω1 and ω2 are uniquely determined by their restrictions to A).

There is no problem in extending Theorem 15.1 from two factors to the product of finitely many factors. Let n ≥ 2 and for each k = 1, . . . , n let (Xk, Ek) be a measurable space. Put X = X1 × · · · × Xn and E = E1 × · · · × En.


Theorem 15.2 For each k = 1, . . . , n let µk be a σ-finite measure on Ek. Then there exists a unique measure µ on E such that

µ(E1 × · · · × En) = µ1(E1) × · · · × µn(En)

for all Ek ∈ Ek, k = 1, . . . , n. The measure µ is σ-finite and it is finite if the measures µ1, . . . , µn are all finite.

Proof If X is considered as the product of X1 × · · · × Xn−1 and Xn then it is easy to see that E = (E1 × · · · × En−1) × En. The existence thus follows by applying Theorem 15.1 n − 1 times. The uniqueness then follows exactly as in the proof of Theorem 15.1.

We next show that a countable product of probability measures exists. Let S be a countably infinite set and for each s ∈ S let (Xs, Es) be a measurable space. Put X = ∏_{s∈S} Xs and E = ∏_{s∈S} Es. Let R be the set of measurable rectangles, i.e., sets of the form ∏_{s∈S} Es with Es ∈ Es for each s and Es ≠ Xs for only finitely many s ∈ S; thus by definition E = σ(R).

Theorem 15.3 For each s ∈ S let µs be a probability measure on Es. Then there exists a unique probability measure µ on E such that

µ(∏_{s∈S} Es) = ∏_{s∈S} µs(Es)

for each measurable rectangle ∏_{s∈S} Es ∈ R (and note that there is no problem with the product ∏_{s∈S} µs(Es) since µs(Es) = 1 for all but finitely many s).
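The remark about the product in Theorem 15.3 can be illustrated with coin-tossing factors. The sketch below rests on assumptions that are not in the notes: every Xs is {0, 1} with its power set, the marginal µs is a Bernoulli measure with the hypothetical parameter 1/(s + 2), and a rectangle is described only through its finitely many constrained coordinates (all other factors being the whole space).

```python
from fractions import Fraction as Fr


def bernoulli(p):
    """Return the point masses of the probability measure on {0, 1} with mass p at 1."""
    return {0: 1 - p, 1: p}


def marginal(s):
    """Hypothetical choice of the factor measure mu_s on X_s = {0, 1}."""
    return bernoulli(Fr(1, s + 2))


def rectangle_measure(rectangle):
    """Return the product of mu_s(E_s) over the coordinates listed in `rectangle`.

    `rectangle` maps an index s to the subset E_s of {0, 1}; indices that are
    not listed stand for E_s = X_s and contribute the factor 1.
    """
    value = Fr(1)
    for s, E_s in rectangle.items():
        value *= sum(marginal(s)[z] for z in E_s)
    return value


constrained = {0: {1}, 3: {0}}                # E_0 = {1}, E_3 = {0}, E_s = X_s otherwise
padded = {0: {1}, 3: {0}, 7: {0, 1}}          # same rectangle with a trivial factor listed
```

Listing an extra coordinate with Es = Xs does not change the value, which is the finite-product reading of the infinite product in the theorem.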

Proof The uniqueness follows immediately from Proposition 3.2, since R is closed under finite intersections and E = σ(R). The proof of the existence which follows is somewhat sketchy; the reader is left to fill in the details. Recall from Lemma 2.8 that if A (resp. A′) is the set of all finite unions (resp. all finite disjoint unions) of elements of R then A is an algebra and A′ = A; moreover, σ(A) = σ(R).

By enumerating the elements of S we can assume without loss of generality that S = N+. For each n ≥ 1 we thus have a measurable space (Xn, En) and a probability measure µn on En, and X = ∏_{n≥1} Xn and E = ∏_{n≥1} En.

For each n ≥ 1 let Yn = ∏_{k≥n} Xk and Fn = ∏_{k≥n} Ek. Then (Y1,F1) = (X, E) and if Yn is identified with Xn × Yn+1 in the usual way then Fn = En × Fn+1.

For each n ≥ 1 define µ�n : M(Fn) → M(Yn+1) by µ�n(f)(y) = µn(f y). Then by Lemmas 15.1 and 15.2 µ�n is a continuous linear mapping from M(Fn) to M(Fn+1) and so µ�n is a probability kernel.


Let n ≥ 1; denote by Rn the set of measurable rectangles occurring in the definition of Fn and let An be the set of all finite unions of elements from Rn. Thus by Lemma 2.8 An is an algebra and σ(An) = Fn. In particular R = R1 and A = A1.

Now µ�n(f) ∈ ME(An+1) for each f ∈ ME(An) and so we can consider µ�n as a linear mapping from ME(An) to ME(An+1). Moreover, for each f ∈ ME(An) there exists p ≥ 1 such that (µ�n+p−1 ◦ · · · ◦ µ�n)(f) is a constant mapping in ME(An+p) and, since µ�m(1) = 1, we still obtain a constant mapping in ME(An+q) with the same value if p is replaced by q ≥ p. This allows us to define a mapping νn : ME(An) → R+∞, where νn(f) is the value of the constant mapping (µ�n+p−1 ◦ · · · ◦ µ�n)(f) with p large enough.

The mapping νn : ME(An) → R+∞ is linear and thus by Proposition 10.1 (1) the mapping A ↦ νn(IA) from An to R+∞ (which we also denote by νn) is a finite finitely additive measure on An. In particular ν1 is a finite finitely additive measure on A1 = A and a direct calculation shows that

ν1(∏_{n≥1} En) = ∏_{n≥1} µn(En)

for each measurable rectangle ∏_{n≥1} En. Therefore it is enough to show that ν1 is a measure on A1, since the extension of ν1 to σ(A1) is then a measure on σ(A1) = F1 = E with the required property.

Fix n ≥ 1 and note that νn = νn+1 ◦ µ�n. Consider Yn as Xn × Yn+1; then the section fx is in ME(An+1) for each f ∈ ME(An), x ∈ Xn, and so there is a mapping ν◁n+1 : ME(An) → M(Xn) given by ν◁n+1(f)(x) = νn+1(fx). Moreover, ν◁n+1(f) ∈ M(En) and µn(ν◁n+1(f)) = νn+1(µ�n(f)) = νn(f) for all f ∈ ME(An) (this holds when f = IR with R ∈ Rn and hence by linearity it holds for all f ∈ ME(An)). Let ε > 0 and let {fm}m≥1 be a decreasing sequence from ME(An) with νn(fm) ≥ ε for all m ≥ 1. Then {ν◁n+1(fm)}m≥1 is a decreasing sequence from M(En) with µn(ν◁n+1(fm)) = νn(fm) ≥ ε; also µn(ν◁n+1(f1)) = νn(f1) < ∞. Thus by Theorem 10.2

µn(lim_{m→∞} ν◁n+1(fm)) = lim_{m→∞} µn(ν◁n+1(fm)) ≥ ε

and so there exists x ∈ Xn with (limm ν◁n+1(fm))(x) ≥ ε/2. This means there exists x ∈ Xn such that νn+1((fm)x) ≥ ε/2 for all m ≥ 1; moreover {(fm)x}m≥1 is a decreasing sequence from ME(An+1).

We now iterate this process starting with a decreasing sequence {fm}m≥1 from ME(A1) with ν1(fm) ≥ ε for all m ≥ 1. For each n ≥ 1 there then exists xn ∈ Xn such that νn+1(fm(x1, . . . , xn)) ≥ 2−nε for all m ≥ 1, where g(x1) = gx1 and g(x1, . . . , xn) = (g(x1, . . . , xn−1))xn.


Finally, consider a decreasing sequence {Am}m≥1 from A1 with ν1(Am) ≥ ε > 0 for each m ≥ 1 and apply the above with fm = IAm. There therefore exists an element x = {xn}n≥1 ∈ X = Y1 such that νn+1(IAm(x1, . . . , xn)) ≥ 2−nε for all n ≥ 1 and all m ≥ 1.

Fix m ≥ 1; now the mapping IAm(x1, . . . , xn) ∈ ME(An+1) only takes on the values 0 and 1, and hence (since νn+1(IAm(x1, . . . , xn)) > 0) there exists for each n ≥ 1 at least one point yn+1 ∈ Yn+1 with IAm(x1, . . . , xn)(yn+1) = 1. But this means that the element (x1, . . . , xn, yn+1) of Y1 lies in Am for each n ≥ 1, and since Am ∈ A, this in turn implies that x ∈ Am.

We have thus shown that x ∈ ⋂_{m≥1} Am and so ⋂_{m≥1} Am ≠ ∅. Hence if {Am}m≥1 is a decreasing sequence from A1 with ⋂_{m≥1} Am = ∅ then limm ν1(Am) = 0, i.e., ν1 is ∅-continuous and therefore by Proposition 3.2 ν1 is a measure on A1. This completes the proof of Theorem 15.3.

We now turn to the final topic of this chapter and look at what will be called an implicit product. If (Y,F) is a measurable space then P(F) will denote the set of probability measures on F.

Let (X, E) be the product of measurable spaces (X1, E1) and (X2, E2) and for each k = 1, 2 let Fk = p_k^{−1}(Ek) with pk : X → Xk the projection onto the kth component. Thus F1 consists of all sets of the form E1 × X2 with E1 ∈ E1 and F2 of all sets of the form X1 × E2 with E2 ∈ E2, and in particular E = F1 ∨ F2, where F1 ∨ F2 = σ(F1 ∪ F2) is the smallest σ-algebra containing both F1 and F2.

Now consider probability measures µ1 ∈ P(E1) and µ2 ∈ P(E2) and let µ be the product of µ1 and µ2. Also let ν1 be the measure on F1 with ν1(E1 × X2) = µ1(E1) for all E1 ∈ E1 and ν2 the measure on F2 with ν2(X1 × E2) = µ2(E2) for all E2 ∈ E2; thus in fact µk = (pk)∗νk is the image of νk under pk for k = 1, 2. Then µ(F1 ∩ F2) = ν1(F1)ν2(F2) for all F1 ∈ F1, F2 ∈ F2.

We are interested in the following converse: Let (X, E) be a measurable space and F1, F2 be sub-σ-algebras of E with E = F1 ∨ F2. Then when is it the case that for any probability measures ν1 ∈ P(F1) and ν2 ∈ P(F2) there exists a probability measure µ on E such that µ(F1 ∩ F2) = ν1(F1)ν2(F2) for all F1 ∈ F1, F2 ∈ F2? This can be seen as being able to construct a product measure without explicitly knowing the underlying product structure on the set X.

Note that in the situation above the following property holds: If F1 ∈ F1, F2 ∈ F2 with F1 ≠ ∅, F2 ≠ ∅ then F1 ∩ F2 ≠ ∅. This is because F1 has the form E1 × X2 and F2 the form X1 × E2, and then F1 ∩ F2 = E1 × E2 ≠ ∅ (since E1 ≠ ∅ and E2 ≠ ∅).

In order to avoid subscripts as much as possible let us state the problem again, just renaming the σ-algebras and measures. Thus let X be a set and E and F be two σ-algebras of subsets of X. Now if µ ∈ P(E) and ν ∈ P(F) then by Proposition 3.3 there is at most one probability measure ω on E ∨ F such that ω(E ∩ F) = µ(E)ν(F) for all E ∈ E, F ∈ F and if this measure exists then it will be called the implicit product of µ and ν. The question now becomes: When does the implicit product of µ and ν exist for all µ ∈ P(E), ν ∈ P(F)?

It turns out that this is the case if and only if E and F have the property enjoyed by the σ-algebras F1 and F2 in the original situation. More precisely, let us say that E and F are weakly independent if E ∩ F ≠ ∅ for all E ∈ E, F ∈ F with E ≠ ∅, F ≠ ∅. Theorem 15.4 below states in part that the implicit product of µ and ν exists for all µ ∈ P(E), ν ∈ P(F) if and only if E and F are weakly independent.
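On a finite set every σ-algebra is generated by a partition into atoms, and two σ-algebras are weakly independent exactly when every atom of the first meets every atom of the second. The sketch below uses this finite reformulation, which is an assumption of the example rather than a statement from the notes; the set X, the partitions, the weights and the function names are all hypothetical.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical finite example: X = {0,...,5}; each sigma-algebra is given by its atoms.
X = set(range(6))
E_atoms = [frozenset({0, 1, 2}), frozenset({3, 4, 5})]
F_atoms = [frozenset({0, 3}), frozenset({1, 4}), frozenset({2, 5})]
G_atoms = [frozenset({0, 1, 2, 3}), frozenset({4, 5})]   # a second partition, for contrast


def weakly_independent(atoms1, atoms2):
    """True if every atom of the first partition meets every atom of the second."""
    return all(a & b for a, b in product(atoms1, atoms2))


# Probability measures on the two sigma-algebras, given on atoms.
mu = dict(zip(E_atoms, [Fr(1, 3), Fr(2, 3)]))
nu = dict(zip(F_atoms, [Fr(1, 2), Fr(1, 4), Fr(1, 4)]))

# Candidate implicit product: the atoms of the join are the sets A & B,
# and each receives the weight mu(A) nu(B).
omega = {a & b: mu[a] * nu[b] for a, b in product(E_atoms, F_atoms)}


def measure(weights, B):
    """Sum the weights of the atoms contained in the set B."""
    return sum(w for atom, w in weights.items() if atom <= B)


E = E_atoms[1]                       # a set in the first sigma-algebra
F = F_atoms[0] | F_atoms[2]          # a set in the second sigma-algebra
lhs = measure(omega, E & F)          # omega(E n F)
rhs = mu[E_atoms[1]] * (nu[F_atoms[0]] + nu[F_atoms[2]])   # mu(E) nu(F)
```

For the pair `E_atoms`, `G_atoms` the atom {0, 1, 2} does not meet {4, 5}, so a weight µ(A)ν(B) > 0 would have to sit on the empty set; this is the finite shadow of why weak independence is needed.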

In many applications what is really needed is not just a measure but a kernel. Let π : M(E ∨ F) → M(E) be a probability kernel, so π(x, ·) is a probability measure on E ∨ F for each x ∈ X and π(·, B) ∈ M(E) for each B ∈ E ∨ F. Then π is called an implicit product kernel for ν ∈ P(F) if

π(x, E ∩ F) = IE(x)ν(F)

for all E ∈ E, F ∈ F, x ∈ X. Again, if π exists then by Proposition 14.4 it is unique and will thus be referred to as the implicit product kernel for ν.

For each x ∈ X let εx be the element of P(F) with εx(F ) = IF (x) for all F ∈ F .

Theorem 15.4 The following statements are equivalent:

(1) The implicit product of µ and ν exists for all µ ∈ P(E), ν ∈ P(F).

(2) The implicit product kernel exists for all ν ∈ P(F).

(3) The implicit product of εx and εy exists for all x, y ∈ X (with εx considered as an element of P(E) and εy as an element of P(F)).

(4) The σ-algebras E and F are weakly independent.

Proof (1) ⇒ (3): This is clear.

(3) ⇒ (2): For each x, y ∈ X let εx,y be the implicit product of εx and εy, i.e., εx,y is the unique element of P(E ∨ F) such that εx,y(E ∩ F) = IE(x)IF(y) for all E ∈ E, F ∈ F. Fix x ∈ X and define a pre-kernel πx : M(E ∨ F) → M(X) by

πx(y, B) = εx,y(B) .

Now πx(·, B) ∈ M(F) for each B ∈ R, where R = {E ∩ F : E ∈ E, F ∈ F} and therefore by Proposition 14.5 πx : M(E ∨ F) → M(F) is a probability kernel. Let ν ∈ P(F) and define π : X × (E ∨ F) → R+∞ by

π(x,B) = ν(πx(·, B)) ;

it will be shown that π is the implicit product kernel for ν. Note first that π(x, ·) = νπx and so in particular π(x, ·) ∈ P(E ∨ F) for each x ∈ X (which means that π is a pre-kernel). Moreover,

π(x, E ∩ F ) = ν(εx,·(E ∩ F )) = ν(IE(x)IF ) = IE(x)ν(F )

for all E ∈ E, F ∈ F, and in particular π(·, R) ∈ M(E) for each R ∈ R. Thus by Proposition 14.5 π(·, B) ∈ M(E) for each B ∈ E ∨ F.

(2) ⇒ (1): Let µ ∈ P(E) and ν ∈ P(F); by assumption there exists an implicit product kernel π for ν. Let ω = µπ, i.e., ω ∈ P(E ∨ F) is the measure defined by ω(B) = µ(π(·, B)) for all B ∈ E ∨ F. Then

ω(E ∩ F ) = µ(π(·, E ∩ F )) = µ(IEν(F )) = µ(E)ν(F )

for all E ∈ E , F ∈ F , and so ω is the implicit product of µ and ν.

(3) ⇒ (4): Let E ∈ E, F ∈ F with E ≠ ∅ and F ≠ ∅. Choose x ∈ E, y ∈ F and let εx,y be the implicit product of εx and εy. Then εx,y(E ∩ F) = IE(x)IF(y) = 1 and so in particular E ∩ F ≠ ∅.

(4) ⇒ (3): This is the only part which is not trivial and its proof will occupy the rest of the chapter.

Let I denote the set of all subsets of X having the form E ∩ F with E ∈ E and F ∈ F. Then E ∪ F ⊂ I ⊂ E ∨ F and hence E ∨ F = σ(I). Moreover, denote by C (resp. by C′) the set of all finite unions (resp. all finite disjoint unions) of elements of I. The following result is the analogue of Lemma 2.8 (and the proof is essentially just copied from the proof of Lemma 2.8):

Lemma 15.5 C is an algebra and C ′ = C. Moreover, σ(C) = E ∨ F .

Proof It is immediate that σ(C) = E ∨ F since I ⊂ C and C ⊂ σ(I). Now C′ ⊂ C and C is closed under finite unions, and hence it will follow that C is an algebra with C′ = C once we show that if C ∈ C then C and X \ C are both elements of C′. Thus consider C ∈ C, so C has the form ⋃_{k=1}^{n} Ek ∩ Fk with E1, . . . , En ∈ E and F1, . . . , Fn ∈ F. Let S = {E1, . . . , En} and T = {F1, . . . , Fn} and let U be the subset of P(X) consisting of all elements of the form S ∩ T with S ∈ p(S) and T ∈ p(T) (where p(S) and p(T) are as in Lemma 2.2). Then U ⊂ I (since p(S) ⊂ E and p(T) ⊂ F) and by Lemma 2.2 (1) the elements of U form a partition of X. Furthermore, by Lemma 2.2 (2) it follows that if U ∈ U then either U ⊂ C or U ∩ C = ∅, and this implies that C is the (disjoint) union of the elements of U it contains, and the same holds true of X \ C. In particular, C and X \ C are both elements of C′.

Suppose now that E and F are weakly independent and let x, y ∈ X. We want to construct a measure ω on E ∨ F such that ω(E ∩ F) = IE(x)IF(y) for all E ∈ E, F ∈ F, and by Lemma 15.5 and Theorem 11.2 it is enough to construct a measure ω on C with this property.


Lemma 15.6 Let {En}n≥1 be a sequence from E and {Fn}n≥1 a sequence from F and let E ∈ E, F ∈ F with E ≠ ∅, F ≠ ∅ and E ∩ F ⊂ ⋃_{n≥1} En ∩ Fn. Then E ⊂ ⋃_{n≥1} En and F ⊂ ⋃_{n≥1} Fn.

Proof Since E \ ⋃_{n≥1} En ∈ E, F ≠ ∅ and

(E \ ⋃_{n≥1} En) ∩ F = (E ∩ F) \ ⋃_{n≥1} En ⊂ (E ∩ F) \ ⋃_{n≥1} En ∩ Fn = ∅

it follows (since E and F are weakly independent) that E \ ⋃_{n≥1} En = ∅, and thus E ⊂ ⋃_{n≥1} En. The same argument shows that F ⊂ ⋃_{n≥1} Fn.

Lemma 15.7 Let {En}n≥1 be a sequence from E and {Fn}n≥1 a sequence from F such that the sequence {En ∩ Fn}n≥1 is disjoint; also let E ∈ E, F ∈ F with x ∈ E, y ∈ F and E ∩ F ⊂ ⋃_{n≥1} En ∩ Fn. Then there is exactly one index p ≥ 1 such that x ∈ Ep and y ∈ Fp.

Proof Suppose x ∈ Ep, y ∈ Fp and also x ∈ Eq, y ∈ Fq. Then Ep ∩ Eq ≠ ∅ and Fp ∩ Fq ≠ ∅ and hence (Ep ∩ Eq) ∩ (Fp ∩ Fq) = (Ep ∩ Fp) ∩ (Eq ∩ Fq) ≠ ∅. But this is only possible if q = p, since the sequence {En ∩ Fn}n≥1 is disjoint. Hence there is at most one index p ≥ 1 such that x ∈ Ep and y ∈ Fp.

Let M = {n ≥ 1 : x ∈ En}; then M ≠ ∅, since by Lemma 15.6 E ⊂ ⋃_{n≥1} En. Put E′ = ⋂_{n∈M} En \ ⋃_{m∉M} Em; then E′ ∈ E, x ∈ E′ (and so E ∩ E′ ≠ ∅) and

(E ∩ E′) ∩ F ⊂ ⋃_{n≥1} (En ∩ E′) ∩ Fn = ⋃_{n∈M} (En ∩ E′) ∩ Fn .

Thus by Lemma 15.6 F ⊂ ⋃_{n∈M} Fn and so there exists p ∈ M with y ∈ Fp (and x ∈ Ep since p ∈ M).

Lemma 15.8 Let E1, . . . , Em, E′1, . . . , E′n ∈ E, F1, . . . , Fm, F′1, . . . , F′n ∈ F with (Ej ∩ Fj) ∩ (Ek ∩ Fk) = ∅ and (E′j ∩ F′j) ∩ (E′k ∩ F′k) = ∅ whenever j ≠ k and such that ⋃_{j=1}^{m} Ej ∩ Fj = ⋃_{k=1}^{n} E′k ∩ F′k. Then

∑_{j=1}^{m} IEj(x)IFj(y) = ∑_{k=1}^{n} IE′k(x)IF′k(y) .

Moreover, the sums occurring here can only take on the values 0 and 1.


Proof Suppose ∑_{j=1}^{m} IEj(x)IFj(y) > 0; then there exists an index j with x ∈ Ej and y ∈ Fj. Thus, since Ej ∩ Fj ⊂ ⋃_{k=1}^{n} E′k ∩ F′k, it follows from Lemma 15.7 that there is exactly one index k such that x ∈ E′k and y ∈ F′k, which implies that ∑_{k=1}^{n} IE′k(x)IF′k(y) = 1. In particular, ∑_{k=1}^{n} IE′k(x)IF′k(y) > 0 and so the same argument now shows that ∑_{j=1}^{m} IEj(x)IFj(y) = 1. Therefore if one of these sums is not zero then they are both equal to one; hence they are always equal (and can only take the values 0 and 1).

By Lemmas 15.5 and 15.8 there is a unique mapping ω : C → R+∞ such that

ω(⋃_{k=1}^{n} Ek ∩ Fk) = ∑_{k=1}^{n} IEk(x)IFk(y)

whenever E1, . . . , En ∈ E, F1, . . . , Fn ∈ F with E1 ∩ F1, . . . , En ∩ Fn disjoint, and ω can only take on the values 0 and 1.

Lemma 15.9 ω is a measure on C.

Proof It is clear that ω(∅) = 0 and that ω is additive. Let {Cn}n≥1 be a disjoint sequence from C with C = ⋃_{n≥1} Cn ∈ C. If ω(C) = 0 then ω(Cn) = 0 for all n ≥ 1 (since ω(Cn) + ω(C \ Cn) = ω(C) = 0) and then ω(C) = ∑_{n≥1} ω(Cn) holds trivially. We can thus assume that ω(C) = 1, and so there exist E ∈ E and F ∈ F with E ∩ F ⊂ C and x ∈ E, y ∈ F. Now for each n ≥ 1 there exist En1, . . . , Enpn ∈ E, Fn1, . . . , Fnpn ∈ F with En1 ∩ Fn1, . . . , Enpn ∩ Fnpn disjoint such that Cn = ⋃_{k=1}^{pn} Enk ∩ Fnk. Hence the elements Enk ∩ Fnk, k = 1, . . . , pn, n ≥ 1 are all disjoint and

E ∩ F ⊂ C = ⋃_{n≥1} Cn = ⋃_{n≥1} ⋃_{k=1}^{pn} Enk ∩ Fnk .

Therefore by Lemma 15.7 there exist m ≥ 1 and 1 ≤ k ≤ pm such that x ∈ Emk and y ∈ Fmk, which implies that ω(Cm) = 1. But

∑_{k=1}^{n} ω(Ck) = ω(⋃_{k=1}^{n} Ck) ≤ ω(C) = 1

for each n ≥ 1 and thus ∑_{n≥1} ω(Cn) = 1 = ω(C). This shows that ω is a measure on C.

This completes the proof of (4) ⇒ (3), and hence the proof of Theorem 15.4.

16 Countably generated measurable spaces

A measurable space (X, E) is said to be countably generated if the σ-algebra E is, where a σ-algebra E is countably generated if E = σ(I) for some countable subset I of E. In the present chapter we look at some of the properties enjoyed by such spaces. This chapter can be seen as a preparation for Chapter 18, where we study substandard Borel spaces (our substitute for standard Borel spaces). However, we also apply the results in Chapter 17 to give a simple proof of the Dunford-Pettis theorem.

In our treatment of both countably generated measurable spaces and substandard Borel spaces the following ‘nice’ measurable space (M,B) plays a crucial role: Let M = {0, 1}N (the space of all sequences {zn}n≥0 of 0’s and 1’s), considered as a compact metric space with respect to the metric d : M × M → R+ given by

d({zn}n≥0, {z′n}n≥0) = ∑_{n≥0} 2^{−n} |zn − z′n|

(or any equivalent metric the reader might prefer), and let B be the σ-algebra of Borel subsets of M.

For each m ≥ 0 let qm : M → {0, 1}m+1 be given by qm({zn}n≥0) = (z0, . . . , zm) and let Cm = q_m^{−1}(P({0, 1}m+1)). Then Cm is a finite algebra and each of the sets in Cm is both open and closed; also Cm ⊂ Cm+1. Let C = ⋃_{m≥0} Cm; then C is a countable algebra (the algebra of cylinder sets) and each of the sets in C is both open and closed. Also for each m ≥ 0 let pm : M → {0, 1} be the projection mapping defined by letting pm({zn}n≥0) = zm for each {zn}n≥0 ∈ M and let

Λm = p_m^{−1}({1}) = {{zn}n≥0 ∈ M : zm = 1} .

Lemma 16.1 B = σ(C) = σ({Λm : m ≥ 0}) and hence in particular (M,B) is countably generated.

Proof Let O be the set of open subsets of M. Then the countable set C is a base for the topology on M, and so each O ∈ O can be written as a countable union of elements from C. Hence O ⊂ σ(C) and thus B = σ(O) ⊂ σ(σ(C)) = σ(C); since also C ⊂ O ⊂ B, it follows that B = σ(C). Moreover, each element of C can be written as a finite union of finite intersections of elements from the set {Λm : m ≥ 0} ∪ {M \ Λm : m ≥ 0} and it therefore follows that C ⊂ σ({Λm : m ≥ 0}). This implies that B = σ({Λm : m ≥ 0}).

The next result appears as Theorem 2.1 in Mackey [13].

Proposition 16.1 A measurable space (X, E) is countably generated if and only if there exists a mapping f : X → M with f−1(B) = E.



Proof Suppose first (X, E) is countably generated; then there exists a sequence {En}n≥0 from E such that E = σ({En : n ≥ 0}). Define a mapping f : X → M by f(x) = {IEn(x)}n≥0. Then f−1(Λn) = En for each n ≥ 0 and therefore by Proposition 2.4 and Lemma 16.1

f−1(B) = f−1(σ({Λn : n ≥ 0})) = σ({En : n ≥ 0}) = E .

Suppose conversely there exists f : X → M with f−1(B) = E and for each n ≥ 0 put En = f−1(Λn). Then again by Proposition 2.4 and Lemma 16.1

σ({En : n ≥ 0}) = σ({f−1(Λn) : n ≥ 0}) = f−1(σ({Λn : n ≥ 0})) = f−1(B) = E

and thus E is countably generated.
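The coding map used in this proof can be tried out on a toy example. The following Python sketch assumes a finite set X and finitely many generators (so that sequences in M are truncated to tuples); the names `coding_map`, `X` and `generators` are hypothetical and chosen only for illustration.

```python
def coding_map(generators):
    """Return f with f(x) = (I_{E_0}(x), ..., I_{E_{k-1}}(x)).

    Assumption: only finitely many generators, so the 0-1 sequence
    f(x) in M is represented by a tuple of length k.
    """
    def f(x):
        return tuple(1 if x in E else 0 for E in generators)
    return f


X = set(range(6))
generators = [{0, 1, 2}, {2, 3}, {0, 1}]  # hypothetical E_0, E_1, E_2
f = coding_map(generators)

# Lambda_n = {z : z_n = 1}, so f^{-1}(Lambda_n) is the set of x whose
# n-th coordinate under f equals 1; the proof claims this is E_n.
preimages = [{x for x in X if f(x)[n] == 1} for n in range(len(generators))]
```

In this finite setting `preimages` coincides with `generators`, which is the identity $f^{-1}(\Lambda_n) = E_n$ on which both directions of the proof rest.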

Proposition 16.2 If (X, E) is a countably generated measurable space then thereexists a countable algebra A with E = σ(A).

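Proof By Proposition 16.1 there exists a mapping $f : X \to M$ with $f^{-1}(\mathcal B) = \mathcal E$ and therefore Proposition 2.4 implies that $\mathcal A = f^{-1}(\mathcal C)$ is a countable algebra with $\sigma(\mathcal A) = \sigma(f^{-1}(\mathcal C)) = f^{-1}(\sigma(\mathcal C)) = \mathcal E$.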

A topological space is separable if it possesses a countable dense set. A metric space is separable if and only if its topology has a countable base.

Proposition 16.3 (1) Let X be a topological space having a countable base forits topology and let BX be the σ-algebra of Borel subsets of X. Then (X,BX) iscountably generated. In particular, this is the case when X is a separable metricspace.

(2) Let (X, E) be a countably generated measurable space and Y be a non-emptysubset of X. Then (Y, E|Y ) is countably generated (with E|Y the trace σ-algebra).

(3) Let (X, E) and (Y,F) be measurable spaces with (Y,F) countably generated.If there exists a mapping f : X → Y with f−1(F) = E then (X, E) is countablygenerated.

(4) Let S be a non-empty countable set and for each s ∈ S let (Xs, Es) be acountably generated measurable space. Then the product measurable space (X, E)is countably generated.

(5) Let S be a non-empty countable set and for each s ∈ S let (Xs, Es) be acountably generated measurable space; assume the sets Xs, s ∈ S, are disjoint.Then the disjoint union measurable space (X, E) is countably generated.

(6) Let (X, E) be a countably generated measurable space. Then the measurablespace (X/, E/) is countably generated. (Recall X/ denotes the set of all measureson E taking only values in the set N and that E/ = σ(E♦), where E♦ is the set ofall subsets of X/ having the form {p ∈ X/ : p(E) = k} with E ∈ E and k ∈ N.)


Proof (1) If OX is the set of open subsets of X and U is a countable base forthe topology then each U ∈ OX can be written as a countable union of elementsfrom U and thus OX ⊂ σ(U). Hence BX = σ(OX) ⊂ σ(U) and so BX = σ(U).Therefore (X,BX) is countably generated.

(2) If I is a countable subset of E with E = σ(I) then I|Y is a countable subsetof E|Y and by Proposition 2.3 E|Y = σ(I)|Y = σ(I|Y ).

(3) If I is a countable subset of F with F = σ(I) then f−1(I) is a countablesubset of E and by Proposition 2.4 E = f−1(F) = f−1(σ(I)) = σ(f−1(I)).

Note: The proofs of (4), (5) and (6) require Propositions 16.4, 16.5 and 16.7respectively. These results are dealt with below.

(4) By Proposition 16.1 there exists for each $s \in S$ a mapping $f_s : X_s \to M$ with $f_s^{-1}(\mathcal B) = \mathcal E_s$. Define $f : X \to M^S$ by letting $f(\{x_s\}_{s\in S}) = \{f_s(x_s)\}_{s\in S}$ for each $\{x_s\}_{s\in S} \in X$. Then by Proposition 2.6 (2) $f^{-1}(\mathcal B^S) = \mathcal E$. But by Propositions 16.1 and 16.4 $(M^S,\mathcal B^S)$ is countably generated and therefore by (3) $(X,\mathcal E)$ is countably generated.

(5) By Proposition 16.1 there exists for each $s \in S$ a mapping $f_s : X_s \to M$ with $f_s^{-1}(\mathcal B) = \mathcal E_s$. Define $f : X \to S \times M$ by letting $f(x) = (s, f_s(x))$ for each $x \in X_s$, $s \in S$. Then by Proposition 2.9 (2) $f^{-1}(\mathcal P(S) \times \mathcal B) = \mathcal E$. But by (1) and Proposition 16.5 $(S \times M, \mathcal P(S) \times \mathcal B)$ is countably generated and thus by (3) $(X,\mathcal E)$ is countably generated.

(6) By Proposition 16.1 there exists a mapping $f : X \to M$ with $f^{-1}(\mathcal B) = \mathcal E$. Let $f_/ : X_/ \to M_/$ be the mapping defined in Chapter 13 with $f_/(p) = f_*p$ for all $p \in X_/$. Then by Proposition 13.2 (2) $(f_/)^{-1}(\mathcal B_/) = \mathcal E_/$. But by Proposition 13.3 $(M_/,\mathcal B_/)$ is the disjoint union of the measurable spaces $(M^n_/,\mathcal B^n_/)$, $n \ge 1$, and by (1) and Proposition 16.7 $(M^n_/,\mathcal B^n_/)$ is countably generated for each $n \in \mathbb N$. Thus by (5) $(M_/,\mathcal B_/)$ is countably generated and so by (3) $(X_/,\mathcal E_/)$ is countably generated.

We now present the results about the various spaces constructed out of M whichwere used in the proof of Proposition 16.3 and which will be required in the proofof Propositions 18.1 and 18.2.

Proposition 16.4 Let S be a non-empty countable set. Then MS (with theproduct topology) is homeomorphic to M . If h : MS → M is a homeomorphismthen also h−1(B) = BS (with BS the product σ-algebra on MS).

Proof The set $S \times \mathbb N$ is countably infinite, so let $\varphi : \mathbb N \to S \times \mathbb N$ be a bijective mapping. If $\{w_s\}_{s\in S} \in M^S$ and $s \in S$ then the element $w_s$ of $M = \{0,1\}^{\mathbb N}$ will be denoted by $\{w_{s,n}\}_{n\ge 0}$. Now define a mapping $g : M^S \to M$ by letting $g(\{w_s\}_{s\in S}) = \{z_n\}_{n\ge 0}$, where $z_n = w_{s,k}$ and $(s,k) = \varphi(n)$. Then it is easy to see that $g$ is bijective, and it is continuous, since $p_n \circ g = p_k \circ p'_s$ for each $n \ge 0$, where again $(s,k) = \varphi(n)$ and $p'_s : M^S \to M$ is the projection onto the $s$-th component.

Thus $g$ is a homeomorphism, since $M^S$ is compact and compact subsets of $M$ are closed. Now by Proposition 2.10 $\mathcal B^S$ is the Borel σ-algebra and therefore if $h : M^S \to M$ is a homeomorphism then by Lemma 2.5 $h^{-1}(\mathcal B) = \mathcal B^S$.
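The mapping g simply interleaves the component sequences according to the bijection φ. As a rough illustration, the Python sketch below assumes a finite index set S = {0, ..., r-1}, takes φ(n) = (n mod r, n // r) as one possible bijection, and works with finite prefixes of equal length; the function names are hypothetical.

```python
def phi(n, r):
    """One possible bijection N -> S x N for S = {0, ..., r-1}."""
    return (n % r, n // r)


def interleave(w):
    """Prefix version of g: w is a list of r 0-1 lists of equal length L.

    The result z has length r*L with z[n] = w[s][k], where (s, k) = phi(n).
    """
    r, L = len(w), len(w[0])
    z = []
    for n in range(r * L):
        s, k = phi(n, r)
        z.append(w[s][k])
    return z


def deinterleave(z, r):
    """Inverse of interleave: component s collects the entries n with n % r == s."""
    return [z[s::r] for s in range(r)]


w = [[1, 0, 1], [0, 0, 1]]
z = interleave(w)
```

Each coordinate of `z` is a single coordinate of one component of `w`, which is the finite counterpart of the identity $p_n \circ g = p_k \circ p'_s$ used to show continuity.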

Proposition 16.5 Let S be a non-empty countable set considered as a topologicalspace with the discrete topology (in which every subset of S is open); then thetopological space S ×M is separable and its topology can be given by a completemetric. Moreover, P(S) × B is the Borel σ-algebra of S ×M .

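Proof Since $M$ is separable there exists a countable dense subset $D$ of $M$. Then $S \times D$ is countable and it is clearly a dense subset of $S \times M$; hence $S \times M$ is separable. Now a metric $\varrho$ can be defined on $S \times M$ by letting

$$\varrho((s_1,z_1),(s_2,z_2)) = \max\{\delta(s_1,s_2),\, d(z_1,z_2)\},$$

where $\delta(s,s) = 0$ and $\delta(s,t) = 1$ if $s \ne t$, and it is easily checked that this metric is complete and that it generates the topology on $S \times M$. It remains to show that $\mathcal P(S) \times \mathcal B$ is the Borel σ-algebra of $S \times M$, which we denote by $\mathcal E$. Let $\mathcal D$ denote the set of all sets having the form $\{s\} \times C$ with $s \in S$ and $C \in \mathcal C$. Then $\mathcal D$ is a countable base for the topology on $S \times M$ and so $\sigma(\mathcal D) = \mathcal E$. But each element of $\mathcal D$ is a measurable rectangle and hence $\sigma(\mathcal D) \subset \mathcal P(S) \times \mathcal B$, and this shows $\mathcal E \subset \mathcal P(S) \times \mathcal B$. Conversely, for each $s \in S$ the set $\mathcal B_s = \{B \in \mathcal B : \{s\} \times B \in \mathcal E\}$ is a monotone class containing $\mathcal C$ (since $\{s\} \times C$ is open for each $C \in \mathcal C$) and hence by Proposition 2.2 $\mathcal B_s = \mathcal B$, i.e., $\{s\} \times B \in \mathcal E$ for all $B \in \mathcal B$, $s \in S$. Thus $\mathcal P(S) \times \mathcal B \subset \mathcal E$, since if $E \in \mathcal P(S) \times \mathcal B$ then $E = \bigcup_{s\in S} (\{s\} \times E_s)$ and by Proposition 2.7 the section $E_s$ is in $\mathcal B$ for each $s \in S$.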

In the proof of Proposition 16.7 below (and several times later) we need thefollowing remarkable property of the space (M,B):

Proposition 16.6 Let µ be a finite finitely additive measure on C. Then µ is ameasure and so by Theorem 4.1 it extends to a unique measure on B.

Proof If $\{C_n\}_{n\ge 1}$ is a decreasing sequence from $\mathcal C$ with $\bigcap_{n\ge 1} C_n = \emptyset$ then, since the elements of $\mathcal C$ are compact, there exists $m \ge 1$ so that $C_n = \emptyset$ for all $n \ge m$. Thus $\mu(C_n) = 0$ for all $n \ge m$ and so in particular $\lim_n \mu(C_n) = 0$. Therefore $\mu$ is ∅-continuous, and hence by Proposition 3.3 it is a measure.

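For each $n \in \mathbb N$ let $M^n_/$ denote the set of all measures $p$ on $\mathcal B$ taking only values in the set $\mathbb N_n = \{0,1,\dots,n\}$ and with $p(M) = n$; put $\mathcal B^n_/ = \sigma(\mathcal B^n_\diamond)$, where $\mathcal B^n_\diamond$ is the set of all subsets of $M^n_/$ having the form $\{p \in M^n_/ : p(B) = k\}$ with $B \in \mathcal B$ and $k \in \mathbb N_n$. We also consider $M^n_/$ as a topological space: Let $\mathcal U^n_/$ be the set of all non-empty subsets of $M^n_/$ having the form

$$\{p \in M^n_/ : p(C) = v_C \text{ for all } C \in \mathcal N\}$$

with $\mathcal N$ a finite subset of $\mathcal C$ and $\{v_C\}_{C\in\mathcal N}$ a sequence from $\mathbb N_n$. Clearly for each $p \in M^n_/$ there exists $U \in \mathcal U^n_/$ with $p \in U$ and if $U_1, U_2 \in \mathcal U^n_/$ and $p \in U_1 \cap U_2$ then there exists $U \in \mathcal U^n_/$ with $p \in U \subset U_1 \cap U_2$. Thus $\mathcal U^n_/$ is the base for a topology $\mathcal O^n_/$ on $M^n_/$. This means that $U \in \mathcal O^n_/$ if and only if for each $p \in U$ there exists a finite subset $\mathcal N$ of $\mathcal C$ such that

$$\{q \in M^n_/ : q(C) = p(C) \text{ for all } C \in \mathcal N\} \subset U.$$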

Proposition 16.7 The topological space $M^n_/$ is compact and metrisable and $\mathcal B^n_/$ is the Borel σ-algebra of $M^n_/$.

Proof We start by showing that the topology $\mathcal O^n_/$ on $M^n_/$ is given by a metric. Let $\{C_k\}_{k\ge 1}$ be an enumeration of the elements in the countable set $\mathcal C$ and define a mapping $\varrho : M^n_/ \times M^n_/ \to \mathbb R^+$ by

$$\varrho(p,q) = \sum_{k\ge 1} 2^{-k}\,|p(C_k) - q(C_k)|.$$

If $\varrho(p,q) = 0$ then $p(C) = q(C)$ for all $C \in \mathcal C$ and hence by Proposition 3.4 $p = q$. Thus $\varrho$ is a metric since by definition it is symmetric and it is clear that the triangle inequality holds. Moreover, if $p \in M^n_/$ then for each $\varepsilon > 0$ there exists a finite subset $\mathcal N$ of $\mathcal C$ such that

$$\{q \in M^n_/ : q(C) = p(C) \text{ for all } C \in \mathcal N\} \subset \{q \in M^n_/ : \varrho(q,p) < \varepsilon\}$$

and for each finite subset $\mathcal N$ of $\mathcal C$ there exists $\varepsilon > 0$ such that

$$\{q \in M^n_/ : \varrho(q,p) < \varepsilon\} \subset \{q \in M^n_/ : q(C) = p(C) \text{ for all } C \in \mathcal N\}.$$

This means that $\mathcal O^n_/$ is the topology given by the metric $\varrho$. Note that if $\{p_k\}_{k\ge 1}$ is a sequence from $M^n_/$ and $p \in M^n_/$ then $\lim_k p_k = p$ (i.e., $\lim_k \varrho(p_k,p) = 0$) if and only if $\lim_k p_k(C) = p(C)$ for each $C \in \mathcal C$.

In order to show that $M^n_/$ is compact it is enough to show that the metric space $M^n_/$ is sequentially compact. Let $\{p_k\}_{k\ge 1}$ be a sequence of elements of $M^n_/$. By the usual diagonal argument (Theorem 23.1) there exists a subsequence $\{k_j\}_{j\ge 1}$ such that $\lim_j p_{k_j}(C)$ exists for each $C \in \mathcal C$. Define $p : \mathcal C \to \mathbb R^+$ by $p(C) = \lim_j p_{k_j}(C)$.

Then $p$ is clearly finitely additive and $p(M) = n$ and so by Proposition 16.6 $p$ is a measure on $\mathcal C$ which has a unique extension to a measure (also denoted by $p$) on $\mathcal B$. But $\mathcal D = \{B \in \mathcal B : p(B) \in \mathbb N_n\}$ is a monotone class containing the algebra $\mathcal C$ and thus $\mathcal D = \mathcal B$. Therefore $p \in M^n_/$ and $\lim_j \varrho(p_{k_j},p) = 0$ and this shows that the metric space $M^n_/$ is sequentially compact.

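It remains to show that $\mathcal B^n_/$ is the Borel σ-algebra of $M^n_/$. First, the set $\mathcal U^n_/$ is countable and so each element of $\mathcal O^n_/$ can be written as a countable union of elements from $\mathcal U^n_/$. Thus $\mathcal O^n_/ \subset \sigma(\mathcal U^n_/)$, which implies that $\sigma(\mathcal O^n_/) = \sigma(\mathcal U^n_/)$, since $\mathcal U^n_/ \subset \mathcal O^n_/$. Second, each element of $\mathcal U^n_/$ is a finite intersection of elements from $\mathcal B^n_\diamond$ and hence $\mathcal U^n_/ \subset \mathcal B^n_/$. This shows that $\sigma(\mathcal O^n_/) = \sigma(\mathcal U^n_/) \subset \mathcal B^n_/$. Finally, let $k \in \mathbb N_n$ and let $\mathcal D$ be the set of those $B \in \mathcal B$ for which $\{p \in M^n_/ : p(B) = k\} \in \sigma(\mathcal O^n_/)$. Then $\mathcal C \subset \mathcal D$ and $\mathcal D$ is a monotone class, and so by Proposition 2.2 $\mathcal D = \mathcal B$, and this means that $\{p \in M^n_/ : p(B) = k\} \in \sigma(\mathcal O^n_/)$ for all $B \in \mathcal B$, $k \in \mathbb N_n$, i.e., $\mathcal B^n_\diamond \subset \sigma(\mathcal O^n_/)$. Therefore $\mathcal B^n_/ = \sigma(\mathcal B^n_\diamond) \subset \sigma(\mathcal O^n_/)$, and this shows $\mathcal B^n_/ = \sigma(\mathcal O^n_/)$.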

A measurable space $(X,\mathcal E)$ is separable if $\{x\} \in \mathcal E$ for all $x \in X$. We next look at separable countably generated measurable spaces, and first need the following almost trivial but useful fact.

Lemma 16.2 If f : X → Y is any mapping then f(f−1(F )) = F ∩ f(X) holdsfor all F ⊂ Y .

Proof If $y \in f(f^{-1}(F))$ then there exists $x \in f^{-1}(F)$ with $y = f(x)$ and then $y \in F$. Hence $y \in F \cap f(X)$, i.e., $f(f^{-1}(F)) \subset F \cap f(X)$. On the other hand, if $y \in F \cap f(X)$ then there exists $x \in X$ with $y = f(x)$, thus $x \in f^{-1}(F)$ and so $y \in f(f^{-1}(F))$, i.e., $F \cap f(X) \subset f(f^{-1}(F))$.
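The identity of Lemma 16.2 is easy to confirm by brute force on a small example. The Python sketch below uses a hypothetical finite domain, codomain and mapping; it checks the identity for every subset F of the codomain.

```python
from itertools import combinations


def image(f, A):
    """f(A) for a finite set A."""
    return {f(x) for x in A}


def preimage(f, X, F):
    """f^{-1}(F) as a subset of the finite domain X."""
    return {x for x in X if f(x) in F}


def subsets(Y):
    """All subsets of the finite set Y."""
    items = sorted(Y)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


X = {0, 1, 2, 3, 4}          # hypothetical domain
Y = {0, 1, 2, 3, 4}          # hypothetical codomain


def f(x):
    return x % 3             # not surjective onto Y: f(X) = {0, 1, 2}


# f(f^{-1}(F)) == F ∩ f(X) for every F ⊂ Y
identity_holds = all(
    image(f, preimage(f, X, F)) == F & image(f, X) for F in subsets(Y)
)
```

Since this f is not surjective, the intersection with $f(X)$ really matters: for $F = \{1, 2, 4\}$ one gets $f(f^{-1}(F)) = \{1, 2\}$, not $F$.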

Proposition 16.8 Let (X, E) be a separable and countably generated measurablespace and let f : X →M be a mapping with f−1(B) = E ; put A = f(X). Thenf is injective, and if f is considered as a bijective mapping from X onto A thenf−1(B|A) = E .

Proof Let x1, x2 ∈ X with f(x1) = f(x2) = z. Since f−1(B) = E and {x} ∈ Efor each x ∈ X there exist B1, B2 ∈ B with f−1(B1) = {x1} and f−1(B2) = {x2}.Therefore by Lemma 16.2

B1 ∩ f(X) = f(f−1(B1)) = f({x1}) = {z} = f({x2}) = f(f−1(B2)) = B2 ∩ f(X)

and thus (since f−1(B) = f−1(B ∩ f(X)) holds for all B ⊂M)

{x1} = f−1(B1 ∩ f(X)) = f−1({z}) = f−1(B2 ∩ f(X)) = {x2} ;

i.e., x1 = x2. This shows f is injective. Now let B ∈ B|A; then B = B′ ∩ f(X)for some B′ ∈ B, which implies that f−1(B) = f−1(B′) ∈ E . But if E ∈ E then


E = f−1(B) for some B ∈ B, thus B ∩ f(X) ∈ B|A and f−1(B ∩ f(X)) = E.Hence f−1(B|A) = E (and note that this part is true without assuming (X, E) isseparable).

Let us say that measurable spaces (X, E) and (Y,F) are isomorphic if thereexists a bijective mapping f : X → Y such that f−1(F) = E . In this case themapping E 7→ f(E) maps E bijectively onto F . Proposition 16.8 thus says thatany separable countably generated measurable space is isomorphic to (B,B|B) forsome B ⊂M . Conversely, by Proposition 16.3 (2) the measurable space (B,B|B)is separable and countably generated for each B ⊂M .

We now look at what are called atoms in a measurable space. Let $(X,\mathcal E)$ be a measurable space and for each $x \in X$ let $a_x$ be the intersection of all the elements in $\mathcal E$ containing $x$. Thus $x \in a_x$ and for all $x, y \in X$ either $a_x = a_y$ or $a_x \cap a_y = \emptyset$. A subset $A$ of $X$ is called an atom of $\mathcal E$ if $A = a_x$ for some $x \in X$ and the set of all atoms of $\mathcal E$ will be denoted by $A(\mathcal E)$. Thus $A(\mathcal E)$ defines a partition of $X$: For each $x \in X$ there is a unique atom $A \in A(\mathcal E)$ with $x \in A$. (Of course, if $(X,\mathcal E)$ is separable then $a_x = \{x\}$ for each $x \in X$.) In general atoms need not be measurable, i.e., it will not always be the case that $A(\mathcal E) \subset \mathcal E$. However, this problem does not arise if $(X,\mathcal E)$ is countably generated:

Lemma 16.3 Let (X, E) be a countably generated measurable space. If (Y,F) isa separable measurable space and f : X → Y a mapping with f−1(F) = E then

A(E) = {f−1({z}) : z ∈ f(X)} .

In particular, A(E) = {f−1({z}) : z ∈ f(X)} for each mapping f : X →M withf−1(B) = E which, together with Proposition 16.1, shows that A(E) ⊂ E .

Proof Let $z \in f(X)$, put $A = f^{-1}(\{z\})$ and consider $x \in A$. If $E \in \mathcal E$ with $x \in E$ then there exists $F \in \mathcal F$ with $f^{-1}(F) = E$ and so $z = f(x) \in F$, from which it follows that $A = f^{-1}(\{z\}) \subset f^{-1}(F) = E$. Since $A \in \mathcal E$ (because $\{z\} \in \mathcal F$), this shows that $A = a_x$ for all $x \in A$. Therefore $A(\mathcal E) = \{f^{-1}(\{z\}) : z \in f(X)\}$, since clearly $x \in f^{-1}(\{f(x)\})$ for each $x \in X$.
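Lemma 16.3 can be checked directly in a finite setting, where the generated σ-algebra can be enumerated. The Python sketch below assumes a finite set X with finitely many generators and compares the atoms computed from the definition (intersection of all measurable sets containing x) with the fibres of the coding map; all names are hypothetical.

```python
def generated_algebra(X, generators):
    """Smallest family containing the generators that is closed under
    complement and (finite) union; for finite X this is the generated σ-algebra."""
    X = frozenset(X)
    sets = {frozenset(), X} | {frozenset(G) for G in generators}
    while True:
        new = {X - A for A in sets} | {A | B for A in sets for B in sets}
        if new <= sets:
            return sets
        sets |= new


def atom(x, algebra):
    """a_x: the intersection of all elements of the algebra containing x."""
    a = None
    for A in algebra:
        if x in A:
            a = A if a is None else a & A
    return a


X = set(range(6))
generators = [{0, 1, 2}, {2, 3}, {0, 1}]   # hypothetical generators of E


def f(x):
    """Coding map x -> (I_{E_0}(x), I_{E_1}(x), I_{E_2}(x))."""
    return tuple(1 if x in E else 0 for E in generators)


algebra = generated_algebra(X, generators)
atoms = {atom(x, algebra) for x in X}
fibres = {frozenset(y for y in X if f(y) == f(x)) for x in X}
```

Here `atoms` and `fibres` are the same partition of X, and every atom is itself a member of `algebra`, in line with $A(\mathcal E) \subset \mathcal E$.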

Lemma 16.4 Let (X, E) and (Y,F) be countably generated measurable spacesand let f : X → Y be a surjective mapping with f−1(F) = E . Then

A(E) = {f−1(A) : A ∈ A(F)} .

Proof By Proposition 16.1 there exists a mapping g : Y →M with g−1(B) = F .Then g ◦ f : X →M with (g ◦ f)−1(B) = E and thus by Lemma 16.3

A(E) = {(g ◦ f)−1({z}) : z ∈ (g ◦ f)(X)}

= {f−1(g−1({z})) : z ∈ g(Y )} = {f−1(A) : A ∈ A(F)} .


Proposition 16.9 Let (X, E), (Y,F) be countably generated measurable spaces,and so by Proposition 16.3 (4) (X × Y, E ×F) is also countably generated. Then

A(E × F) = {A×B : A ∈ A(E), B ∈ A(F)} .

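Proof Let $\mathcal R$ (resp. $\mathcal R'$) denote the set of measurable rectangles in $X \times Y$ (resp. in $M \times M$). By Proposition 16.1 there exist mappings $f : X \to M$ and $g : Y \to M$ with $f^{-1}(\mathcal B) = \mathcal E$ and $g^{-1}(\mathcal B) = \mathcal F$. Define a mapping $h : X \times Y \to M \times M$ by $h(x,y) = (f(x), g(y))$. If $B \times C \in \mathcal R'$ then $h^{-1}(B \times C) = f^{-1}(B) \times g^{-1}(C) \in \mathcal R$. On the other hand, if $E \times F \in \mathcal R$ then there exist $B, C \in \mathcal B$ with $E = f^{-1}(B)$ and $F = g^{-1}(C)$ and then $E \times F = h^{-1}(B \times C) \in h^{-1}(\mathcal R')$. Hence $\mathcal R = h^{-1}(\mathcal R')$ and therefore by Proposition 2.4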

E × F = σ(R) = σ(h−1(R′)) = h−1(σ(R′)) = h−1(B × B) .

Moreover, (M ×M,B × B) is separable and therefore by Lemma 16.3

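$$A(\mathcal E \times \mathcal F) = \{h^{-1}(\{(z_1,z_2)\}) : (z_1,z_2) \in h(X \times Y)\}$$

$$= \{f^{-1}(\{z_1\}) \times g^{-1}(\{z_2\}) : z_1 \in f(X),\ z_2 \in g(Y)\}$$

$$= \{A \times B : A \in A(\mathcal E),\ B \in A(\mathcal F)\}.$$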

Here is another result of this type, whose proof is almost identical with the proofof Proposition 16.9.

Proposition 16.10 Let X be a non-empty set and let E ⊂ P(X), F ⊂ P(X) becountably generated σ-algebras. Then E ∨ F is also countably generated and

A(E ∨ F) = {A ∩ B : A ∈ A(E), B ∈ A(F)} \ {∅} .

Thus each atom of E ∨ F is the intersection of an atom of E with an atom ofF and, conversely, the intersection of an atom of E with an atom of F is eitherempty or an atom of E ∨ F .

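Proof Let $\mathcal D = \{E \cap F : E \in \mathcal E, F \in \mathcal F\}$; thus $\mathcal E \cup \mathcal F \subset \mathcal D \subset \mathcal E \vee \mathcal F$ which implies that $\mathcal E \vee \mathcal F = \sigma(\mathcal D)$. Now by Proposition 16.1 there exist mappings $f : X \to M$, $g : X \to M$ with $f^{-1}(\mathcal B) = \mathcal E$ and $g^{-1}(\mathcal B) = \mathcal F$. Define a mapping $h : X \to M \times M$ by $h(x) = (f(x), g(x))$ for all $x \in X$. Then $h^{-1}(B \times C) = f^{-1}(B) \cap g^{-1}(C) \in \mathcal D$ for all $B, C \in \mathcal B$. On the other hand, if $E \in \mathcal E$ and $F \in \mathcal F$ then there exist $B, C \in \mathcal B$ with $E = f^{-1}(B)$ and $F = g^{-1}(C)$ and then $E \cap F = h^{-1}(B \times C)$. Thus $\mathcal D = h^{-1}(\mathcal R)$ where $\mathcal R = \{B \times C : B, C \in \mathcal B\}$, and therefore by Proposition 2.4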

E ∨ F = σ(D) = σ(h−1(R)) = h−1(σ(R)) = h−1(B × B) .


In particular, by Proposition 16.3 (3) (X, E ∨ F) is countably generated, sinceby Proposition 16.3 (4) (M × M,B × B) is countably generated. Moreover,(M ×M,B × B) is separable and therefore by Lemma 16.3

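$$A(\mathcal E \vee \mathcal F) = \{h^{-1}(\{(z_1,z_2)\}) : (z_1,z_2) \in h(X)\}$$

$$= \{f^{-1}(\{z_1\}) \cap g^{-1}(\{z_2\}) : (z_1,z_2) \in h(X)\}$$

$$= \{A \cap B : A \in A(\mathcal E),\ B \in A(\mathcal F)\} \setminus \{\emptyset\}.$$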
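A small finite example makes the statement of Proposition 16.10 concrete. The Python sketch below takes the description of atoms as fibres of a coding map (Lemma 16.3) for granted and assumes each σ-algebra has a single, hypothetical generator; it compares the atoms of the join with the non-empty pairwise intersections.

```python
def fibres(X, code):
    """Partition X into the fibres code^{-1}({z}), z in code(X)."""
    groups = {}
    for x in X:
        groups.setdefault(code(x), set()).add(x)
    return {frozenset(v) for v in groups.values()}


X = set(range(6))
gens_E = [{0, 1, 2}]        # hypothetical generator of E
gens_F = [{0, 1, 2, 3}]     # hypothetical generator of F


def f(x):
    return tuple(1 if x in G else 0 for G in gens_E)


def g(x):
    return tuple(1 if x in G else 0 for G in gens_F)


def h(x):
    return (f(x), g(x))     # the mapping h = (f, g) from the proof


atoms_E = fibres(X, f)
atoms_F = fibres(X, g)
atoms_join = fibres(X, h)

# Non-empty intersections of an atom of E with an atom of F.
intersections = {A & B for A in atoms_E for B in atoms_F if A & B}
```

In this example the atom {0, 1, 2} of E and the atom {4, 5} of F are disjoint, which shows why the empty set has to be removed on the right-hand side.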

17 The Dunford-Pettis theorem

Let $(X,\mathcal E)$ be a measurable space and again denote the set of probability measures on $(X,\mathcal E)$ by $P(\mathcal E)$. A subset $Q$ of $P(\mathcal E)$ is equicontinuous if for each decreasing sequence $\{E_n\}_{n\ge 1}$ from $\mathcal E$ with $\bigcap_{n\ge 1} E_n = \emptyset$ and each $\varepsilon > 0$ there exists $p \ge 1$ so that $\mu(E_p) < \varepsilon$ for all $\mu \in Q$.

The following is the elementary (but more useful) half of the Dunford-Pettis theorem. (The proof of the converse can be found in Dunford and Schwartz [5], Chapter IV.9.) This result plays a fundamental role in Specifications and their Gibbs states [16].

Theorem 17.1 Let $Q \subset P(\mathcal E)$ be equicontinuous; then for each sequence $\{\mu_n\}_{n\ge 1}$ from $Q$ there exists a subsequence $\{n_j\}_{j\ge 1}$ and a measure $\mu \in P(\mathcal E)$ such that $\mu(E) = \lim_j \mu_{n_j}(E)$ for all $E \in \mathcal E$.

The rest of the chapter is taken up with a proof of this result. Before startinghowever, note the following simple fact about the convergence in Theorem 17.1:

Lemma 17.1 Let {µn}n≥1 be a sequence from P(E) and let µ ∈ P(E). Thenµ(E) = limn µn(E) holds for all E ∈ E if and only if µ(f) = limn µn(f) for allf ∈ MB(E).

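Proof Suppose $\mu(E) = \lim_n \mu_n(E)$ for all $E \in \mathcal E$; then also $\mu(g) = \lim_n \mu_n(g)$ for all $g \in M_E(\mathcal E)$. Let $f \in M_B(\mathcal E)$; then for each $\varepsilon > 0$ there exists by Lemma 9.6 (2) $g \in M_E(\mathcal E)$ with $g \le f \le g + \varepsilon$. Thus

$$\limsup_{n\to\infty} \mu_n(f) \le \limsup_{n\to\infty} \mu_n(g + \varepsilon) = \lim_{n\to\infty} \mu_n(g) + \varepsilon = \mu(g) + \varepsilon \le \mu(f) + \varepsilon$$

and in the same way

$$\mu(f) \le \mu(g) + \varepsilon = \lim_{n\to\infty} \mu_n(g) + \varepsilon \le \liminf_{n\to\infty} \mu_n(f) + \varepsilon.$$

From this it follows that $\mu(f) = \lim_n \mu_n(f)$. The converse is clear.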

We say that $(X,\mathcal E)$ has the weak sequential compactness property if whenever $Q \subset P(\mathcal E)$ is equicontinuous then for each sequence $\{\mu_n\}_{n\ge 1}$ from $Q$ there exists a subsequence $\{n_j\}_{j\ge 1}$ and $\mu \in P(\mathcal E)$ such that $\mu(E) = \lim_j \mu_{n_j}(E)$ for all $E \in \mathcal E$. Theorem 17.1 therefore states that every measurable space has this property. We first show that $(M,\mathcal B)$ has the property (with $(M,\mathcal B)$ as in Chapter 16), then use this to show that Theorem 17.1 holds when $(X,\mathcal E)$ is countably generated, and finally deal with the general case.



Lemma 17.2 The space (M,B) has the weak sequential compactness property.

Proof We use the notation from Chapter 16, so $\mathcal C$ is the algebra of cylinder sets. Let $Q \subset P(\mathcal B)$ be equicontinuous and let $\{\mu_n\}_{n\ge 1}$ be a sequence from $Q$. Since $\mathcal C$ is countable and the values $\mu_n(C)$ all lie in the compact interval $[0,1]$ the usual diagonal argument (Theorem 23.1) implies there exists a subsequence $\{n_j\}_{j\ge 1}$ and $\nu : \mathcal C \to \mathbb R^+$ so that $\nu(C) = \lim_j \mu_{n_j}(C)$ for all $C \in \mathcal C$. But $\nu$ is clearly finitely additive and $\nu(M) = 1$ and hence by Proposition 16.6 there exists $\mu \in P(\mathcal B)$ with $\mu(C) = \nu(C)$ for all $C \in \mathcal C$; thus $\mu(C) = \lim_j \mu_{n_j}(C)$ for all $C \in \mathcal C$. Let

$$\mathcal K = \{B \in \mathcal B : \mu(B) = \lim_{j\to\infty} \mu_{n_j}(B)\};$$

then $\mathcal C \subset \mathcal K$ and $\sigma(\mathcal C) = \mathcal B$, and so by Proposition 2.1 it is enough to show that $\mathcal K$ is a monotone class. Let $\{B_n\}_{n\ge 1}$ be an increasing sequence from $\mathcal K$ and put $B = \bigcup_{n\ge 1} B_n$. For each $p \ge 1$ let $A_p = B \setminus B_p$; then $\{A_p\}_{p\ge 1}$ is a decreasing sequence from $\mathcal B$ with $\bigcap_{p\ge 1} A_p = \emptyset$. Let $\varepsilon > 0$; there thus exists $p \ge 1$ so that $\mu(A_p) < \varepsilon/3$ and so that $\omega(A_p) < \varepsilon/3$ for all $\omega \in Q$. Moreover, since $B_p \in \mathcal K$, there exists $m \ge 1$ so that $|\mu(B_p) - \mu_{n_j}(B_p)| < \varepsilon/3$ for all $j \ge m$. Hence

$$|\mu(B) - \mu_{n_j}(B)| \le |\mu(B_p) - \mu_{n_j}(B_p)| + \mu(A_p) + \mu_{n_j}(A_p) < \varepsilon$$

for all $j \ge m$, and so $\mu(B) = \lim_j \mu_{n_j}(B)$, i.e., $B \in \mathcal K$. The case of a decreasing sequence from $\mathcal K$ is almost exactly the same.

Now let (X, E) and (Y,F) be measurable spaces.

Lemma 17.3 If there exists a mapping f : X → Y with f−1(F) = E and (Y,F)has the weak sequential compactness property then so does (X, E).

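Proof For each $\mu \in P(\mathcal E)$ denote the image measure $f_*\mu \in P(\mathcal F)$ by $\mu'$. Let $Q \subset P(\mathcal E)$ be equicontinuous; then the subset $Q' = \{\mu' : \mu \in Q\}$ of $P(\mathcal F)$ is also equicontinuous: If $\{F_n\}_{n\ge 1}$ is a decreasing sequence from $\mathcal F$ with $\bigcap_{n\ge 1} F_n = \emptyset$ and $E_n = f^{-1}(F_n)$ for each $n \ge 1$ then $\{E_n\}_{n\ge 1}$ is a decreasing sequence from $\mathcal E$ with $\bigcap_{n\ge 1} E_n = \emptyset$; thus, given $\varepsilon > 0$, there exists $p \ge 1$ so that $\mu(E_p) < \varepsilon$ for all $\mu \in Q$ and hence $\mu'(F_p) = \mu(E_p) < \varepsilon$ for all $\mu' \in Q'$. Let $\{\mu_n\}_{n\ge 1}$ be a sequence from $Q$; thus $\{\mu'_n\}_{n\ge 1}$ is a sequence from $Q'$, and so if $(Y,\mathcal F)$ has the weak sequential compactness property then there exists a subsequence $\{n_j\}_{j\ge 1}$ and $\nu \in P(\mathcal F)$ such that $\nu(F) = \lim_j \mu'_{n_j}(F)$ for all $F \in \mathcal F$. In particular, if $F \in \mathcal F$ with $F \cap f(X) = \emptyset$ then

$$\nu(F) = \lim_{j\to\infty} \mu'_{n_j}(F) = \lim_{j\to\infty} \mu_{n_j}(f^{-1}(F)) = \lim_{j\to\infty} \mu_{n_j}(\emptyset) = 0$$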


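and so by Theorem 13.1 there exists a $\mu \in P(\mathcal E)$ with $\nu = f_*\mu$. Let $E \in \mathcal E$; then $E = f^{-1}(F)$ for some $F \in \mathcal F$ and therefore

$$\mu(E) = \mu(f^{-1}(F)) = \nu(F) = \lim_{j\to\infty} \mu'_{n_j}(F) = \lim_{j\to\infty} \mu_{n_j}(f^{-1}(F)) = \lim_{j\to\infty} \mu_{n_j}(E).$$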

This shows that (X, E) has the weak sequential compactness property.

Proposition 17.1 A countably generated measurable space (X, E) has the weaksequential compactness property.

Proof This follows from Proposition 16.1 and Lemmas 17.2 and 17.3.

We turn to the general case, so now let (X, E) be an arbitrary measurable space.The following standard device (to be found, for example, in Dunford and Schwartz[5], Chapter IV.9.) will be made use of:

Lemma 17.4 Let $\{\mu_n\}_{n\ge 1}$ be any sequence from $P(\mathcal E)$; then there exists $\nu \in P(\mathcal E)$ such that $\mu_n \ll \nu$ for all $n \ge 1$.

Proof Just take, for example, $\nu = \sum_{n\ge 1} 2^{-n}\mu_n$.
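The construction in Lemma 17.4 can be illustrated with finitely many measures on a finite set. In the Python sketch below the weights are $2^{-n}$ except that the last measure receives the remaining mass (an adjustment made here only so that finitely many weights sum to 1); measures are represented as dictionaries of point masses, and all names and data are hypothetical.

```python
from fractions import Fraction


def dominating_measure(mus):
    """Mix finitely many probability measures with weights 2^{-n}.

    Assumption: the last measure gets weight 2^{-(N-1)} instead of 2^{-N},
    so that the N weights sum to exactly 1.
    """
    N = len(mus)
    weights = [Fraction(1, 2 ** n) for n in range(1, N)] + [Fraction(1, 2 ** (N - 1))]
    points = set().union(*mus)
    return {
        x: sum(w * mu.get(x, Fraction(0)) for w, mu in zip(weights, mus))
        for x in points
    }


mus = [
    {"a": Fraction(1)},
    {"b": Fraction(1, 2), "c": Fraction(1, 2)},
    {"a": Fraction(1, 3), "c": Fraction(2, 3)},
]
nu = dominating_measure(mus)

# Absolute continuity on a finite set: nu({x}) = 0 forces mu_n({x}) = 0.
dominated = all(
    mu.get(x, 0) == 0 for mu in mus for x in nu if nu[x] == 0
)
```

Because every weight is strictly positive, a set of ν-measure zero has measure zero under each $\mu_n$, which is all that $\mu_n \ll \nu$ requires.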

Let $Q$ be an equicontinuous subset of $P(\mathcal E)$ and let $\{\mu_n\}_{n\ge 1}$ be a sequence from $Q$. By Lemma 17.4 there exists $\nu \in P(\mathcal E)$ such that $\mu_n \ll \nu$ for each $n \ge 1$; hence by Theorem 12.1 there exists $h_n \in M(\mathcal E)$ such that $\mu_n = \nu{\cdot}h_n$. Now consider any countably generated σ-algebra $\mathcal F \subset \mathcal E$. For each $\mu \in P(\mathcal E)$ let $\mu' \in P(\mathcal F)$ be the restriction of $\mu$ to $\mathcal F$. Then $Q' = \{\mu' : \mu \in Q\}$ is an equicontinuous subset of $P(\mathcal F)$ and $\{\mu'_n\}_{n\ge 1}$ is a sequence from $Q'$. Thus by Proposition 17.1 there exists a subsequence $\{n_j\}_{j\ge 1}$ and $\omega \in P(\mathcal F)$ such that $\omega(F) = \lim_j \mu'_{n_j}(F)$ for all $F \in \mathcal F$. But if $F \in \mathcal F$ with $\nu'(F) = 0$ then $\nu(F) = 0$, thus $\omega(F) = \lim_j \mu'_{n_j}(F) = 0$ and hence $\omega \ll \nu'$. Therefore by Theorem 12.1 there exists $h \in M(\mathcal F)$ such that $\omega = \nu'{\cdot}h$. Now define $\mu \in P(\mathcal E)$ by $\mu = \nu{\cdot}h$, and so $\mu(E) = \nu(hI_E)$ for all $E \in \mathcal E$. In particular $\mu(F) = \lim_j \mu_{n_j}(F)$ for all $F \in \mathcal F$.

Lemma 17.5 Suppose $h_n \in M(\mathcal F)$ for each $n \ge 1$. Then $\mu(E) = \lim_j \mu_{n_j}(E)$ for all $E \in \mathcal E$.

Proof Let $E \in \mathcal E$; then $I_E \le 1$ and therefore by Theorem 12.4 there exists $g \in M_B(\mathcal F)$ with $g \le 1$ such that $\nu(fg) = \nu(fI_E)$ for all $f \in M(\mathcal F)$. Then $\mu(E) = \nu(hI_E) = \nu(hg) = \mu(g)$ and $\mu_n(E) = \nu(h_nI_E) = \nu(h_ng) = \mu_n(g)$ for each $n \ge 1$ and therefore by Lemma 17.1

$$\mu(E) = \mu(g) = \lim_{j\to\infty} \mu_{n_j}(g) = \lim_{j\to\infty} \mu_{n_j}(E).$$

The next result completes the proof of Theorem 17.1.


Lemma 17.6 There exists a countably generated σ-algebra F ⊂ E such thathn ∈ M(F) for each n ≥ 1.

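Proof Let $\mathcal J$ be the countable set consisting of all elements of $\mathcal E$ of the form $\{x \in X : h_n(x) > r\}$ with $n \ge 1$ and $r \in \mathbb Q^+$, and put $\mathcal F = \sigma(\mathcal J)$. Then $\mathcal F \subset \mathcal E$ and $\mathcal F$ is countably generated. But if $a \in \mathbb R^+$ then there is a decreasing sequence $\{r_m\}_{m\ge 1}$ from $\mathbb Q^+$ with $\lim_m r_m = a$ and thus

$$\{x \in X : h_n(x) > a\} = \bigcup_{m\ge 1} \{x \in X : h_n(x) > r_m\} \in \mathcal F.$$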

This implies that hn ∈ M(F) for each n ≥ 1.

18 Substandard Borel spaces

There are several important results, including some which play an importantrole in [16] (such as the Kolmogorov extension theorem), which do not hold ingeneral for arbitrary measurable spaces and not even for arbitrary countablygenerated measurable spaces. In such situations it is usual to resort to standardBorel spaces, since in most applications the measurable spaces will be standardBorel and the results which are needed can be proved within this framework. (Ameasurable space (X, E) is standard Borel if there exists a metric on X whichmakes it a complete separable metric space in such a way that E is then the Borelσ-algebra.)

The theory of standard Borel spaces has therefore established itself as a kind ofstandard ‘advanced’ measure theory. Unfortunately, a first acquaintance with thistheory can be a bit off-putting, and many tend to regard it as simply providing acollection of very useful tools whose validity has to be taken for granted. We thusprefer to work with a cheap alternative which we call substandard Borel spaces.The term ‘substandard’ should here be considered in the sense of indicating a(pattern of linguistic) usage which does not conform to that of the prestige groupin a (speech) community.

Substandard Borel spaces have the advantage of being easier to understand whileat the same time being good enough to be a replacement for standard Borelspaces in results such as the Kolmogorov extension theorem.

Their definition involves the space $(M,\mathcal B)$ introduced in Chapter 16. We say that a measurable space $(X,\mathcal E)$ is a substandard Borel space if there exists a mapping $f : X \to M$ with $f^{-1}(\mathcal B) = \mathcal E$ such that $f(X) \in \mathcal B$. (This additional condition is nowhere near as harmless as it might first appear.) Thus by Proposition 16.1 a substandard Borel space is countably generated.

The notion of a substandard Borel space already occurs implicitly in Chapter V of Parthasarathy [15] in the proof of the Kolmogorov extension theorem for the inverse limit of standard Borel spaces.

The relationship between standard Borel and substandard Borel spaces will belooked at in Chapter 22. In fact, Proposition 18.1 below implies that a standardBorel space is substandard Borel, and for a separable measurable space (i.e., ameasurable space (X, E) with {x} ∈ E for all x ∈ X) the converse is true, but aproof requires the typical machinery associated with standard Borel spaces.

Lemma 18.1 Let (X, E) be substandard Borel and let f : X →M be a mappingwith f−1(B) = E and f(X) ∈ B. Then f(E) ∈ B for all E ∈ E .

Proof Let E ∈ E ; then since f−1(B) = E there exists B ∈ B with f−1(B) = E,and thus by Lemma 16.2 f(E) = f(f−1(B)) = B ∩ f(X) ∈ B.



Lemma 18.2 Let $(X,\mathcal E)$ be a measurable space and let $(Y,\mathcal F)$ be substandard Borel. If there exists a mapping $g : X \to Y$ with $g^{-1}(\mathcal F) = \mathcal E$ and $g(X) \in \mathcal F$ then $(X,\mathcal E)$ is a substandard Borel space.

Proof Since (Y,F) is substandard Borel there exists a mapping f : Y →M withf−1(B) = F and f(Y ) ∈ B; put h = f ◦ g. Then h−1(B) = g−1(F) = E and byLemma 18.1 h(X) = f(g(X)) ∈ B. Therefore (X, E) is substandard Borel.

Proposition 18.1 Let X be a complete separable metric space and BX be theσ-algebra of Borel subsets of X. Then (X,BX) is a substandard Borel space.

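Proof To start with consider the closed interval $I = [0,1]$ and let $b : M \to I$ be the mapping with $b(\{z_n\}_{n\ge 0}) = \sum_{n\ge 0} 2^{-n-1} z_n$; then $b$ is continuous and hence by Lemma 2.5 $b^{-1}(\mathcal B_I) \subset \mathcal B$, with $\mathcal B_I$ the σ-algebra of Borel subsets of $I$. Now for each $C \in \mathcal C$ there is a dyadic interval $J$ such that $b^{-1}(J) = C$ and hence $b^{-1}(\mathcal B_I) \supset \mathcal C$, which implies that $b^{-1}(\mathcal B_I) \supset \sigma(\mathcal C) = \mathcal B$, i.e., $b^{-1}(\mathcal B_I) = \mathcal B$. Put

$$N = \{\{z_n\}_{n\ge 0} \in M : z_0 = 0 \text{ and } z_n = 1 \text{ for all } n \ge m \text{ for some } m \ge 1\};$$

then $N$ is countable and $b$ maps $M_o = M \setminus N$ bijectively onto $I$. Let $v : I \to M$ be the unique mapping with $v(b(z)) = z$ for all $z \in M_o$; then $v(I) = M_o$ and so in particular $v(I) \in \mathcal B$. Let $B \in \mathcal B$; then $B \cap M_o \in \mathcal B$ and thus there exists $A \in \mathcal B_I$ with $b^{-1}(A) = B \cap M_o$, which implies $v^{-1}(B) = v^{-1}(B \cap M_o) = A \in \mathcal B_I$. This shows $v^{-1}(\mathcal B) \subset \mathcal B_I$. But if $A \in \mathcal B_I$ then $b^{-1}(A) \in \mathcal B$ and $v^{-1}(b^{-1}(A)) = A$, and hence $v^{-1}(\mathcal B) = \mathcal B_I$. We therefore have a mapping $v : I \to M$ with $v^{-1}(\mathcal B) = \mathcal B_I$ and $v(I) \in \mathcal B$.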
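The mappings b and v are the binary expansion of a number in [0, 1] and a choice of digits inverting it. The Python sketch below works with finite prefixes only and assumes one concrete digit convention (greedy digits for x < 1, all ones for x = 1); this convention and the function names are illustrative assumptions, not a transcription of the definition of N above.

```python
from fractions import Fraction


def b_prefix(z):
    """Partial sum of b: sum of z_n * 2^{-(n+1)} over the given prefix."""
    return sum(Fraction(zn, 2 ** (n + 1)) for n, zn in enumerate(z))


def v_prefix(x, k):
    """First k binary digits of x in [0, 1].

    Assumed convention: greedy digits for x < 1 (so a dyadic rational
    gets its terminating expansion) and the all-ones sequence for x = 1.
    """
    x = Fraction(x)
    if x == 1:
        return [1] * k
    digits = []
    for _ in range(k):
        x *= 2
        d = 1 if x >= 1 else 0
        digits.append(d)
        x -= d
    return digits


digits_three_quarters = v_prefix(Fraction(3, 4), 4)
digits_one_third = v_prefix(Fraction(1, 3), 4)
```

For a dyadic rational such as 3/4 the prefix reproduces the number exactly, while for 1/3 the k-digit prefix approximates it from below with error less than $2^{-k}$.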

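Now define $\bar v : I^{\mathbb N} \to M^{\mathbb N}$ by $\bar v(\{x_n\}_{n\ge 0}) = \{v(x_n)\}_{n\ge 0}$. Then parts (2) and (3) of Proposition 2.6 imply that $\bar v^{-1}(\mathcal B^{\mathbb N}) = \mathcal B_I^{\mathbb N}$ and $\bar v(I^{\mathbb N}) = v(I)^{\mathbb N} \in \mathcal B^{\mathbb N}$, where $\mathcal B_I^{\mathbb N}$ and $\mathcal B^{\mathbb N}$ are the product σ-algebras on $I^{\mathbb N}$ and $M^{\mathbb N}$. But by Proposition 16.4 the topological space $M^{\mathbb N}$ (with the product topology) is homeomorphic to $M$ and if $u : M^{\mathbb N} \to M$ is a homeomorphism then $u^{-1}(\mathcal B) = \mathcal B^{\mathbb N}$, and thus also $u(\mathcal B^{\mathbb N}) = \mathcal B$. Put $g = u \circ \bar v$; then $g : I^{\mathbb N} \to M$ is a mapping with $g^{-1}(\mathcal B) = \mathcal B_I^{\mathbb N}$ and $g(I^{\mathbb N}) \in \mathcal B$. This shows that $(I^{\mathbb N}, \mathcal B_I^{\mathbb N})$ is a substandard Borel space. Moreover, by Proposition 2.10 the product σ-algebra $\mathcal B_I^{\mathbb N}$ is the Borel σ-algebra of $I^{\mathbb N}$ with the product topology (since $\mathbb N$ is countable and $I$ has a countable base for its topology).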

Finally, let $(X,d)$ be a complete separable metric space. Then there is a standard construction (given below) producing a continuous injective mapping $h : X \to I^{\mathbb N}$ such that $h$ is a homeomorphism from $X$ to $h(X)$ (with the relative topology) and such that $h(X)$ is the intersection of a sequence of open subsets of $I^{\mathbb N}$, and so in particular with $h(X) \in \mathcal B_I^{\mathbb N}$. By Lemma 2.5 $h^{-1}(\mathcal B_I^{\mathbb N}) \subset \mathcal B_X$. On the other hand, let $U \subset X$ be open; since $h : X \to h(X)$ is a homeomorphism there exists an open subset $V$ of $I^{\mathbb N}$ with $h^{-1}(h(X) \cap V) = U$. But then $h^{-1}(V) = U$, and this shows that $\mathcal O_X \subset h^{-1}(\mathcal B_I^{\mathbb N})$, where $\mathcal O_X$ is the set of open subsets of $X$. Hence by Proposition 2.4 $\mathcal B_X = \sigma(\mathcal O_X) \subset \sigma(h^{-1}(\mathcal B_I^{\mathbb N})) = h^{-1}(\sigma(\mathcal B_I^{\mathbb N})) = h^{-1}(\mathcal B_I^{\mathbb N})$.

We thus have a mapping $h : X \to I^{\mathbb N}$ with $h^{-1}(\mathcal B_I^{\mathbb N}) = \mathcal B_X$ and $h(X) \in \mathcal B_I^{\mathbb N}$ and $(I^{\mathbb N}, \mathcal B_I^{\mathbb N})$ is substandard Borel. Therefore by Lemma 18.2 $(X,\mathcal B_X)$ is also substandard Borel.

Here is how the mapping $h$ can be constructed: Choose a dense sequence of elements $\{x_n\}_{n\ge 0}$ from $X$ and for each $n \ge 0$ let $h_n : X \to I$ be the continuous mapping given by $h_n(x) = \min\{d(x,x_n), 1\}$ for each $x \in X$. If $x, y \in X$ with $x \ne y$ then there exists $n \ge 0$ such that $h_n(x) \ne h_n(y)$ (since if $n \ge 0$ is such that $d(x,x_n) < \varepsilon$, where $\varepsilon = \tfrac12 \min\{d(x,y), 1\}$, then $h_n(x) < \varepsilon < h_n(y)$). Define $h : X \to I^{\mathbb N}$ by letting $h(x) = \{h_n(x)\}_{n\ge 0}$ for each $x \in X$. Then $h$ is continuous (since $p_n \circ h = h_n$ is continuous for each $n \ge 0$, with $p_n : I^{\mathbb N} \to I$ the projection onto the $n$-th component) and injective. Moreover, for each $x \in X$ and each $0 < \varepsilon < 1$ there exists $n \ge 0$ so that $|h_n(y) - h_n(x)| > \varepsilon/2$ for all $y \in X$ with $d(y,x) > \varepsilon$. (Just take $n \ge 0$ so that $d(x,x_n) < \varepsilon/4$.) This implies that the bijective mapping $h : X \to h(X)$ is a homeomorphism from $X$ to $h(X)$ with the relative topology. (Note that the completeness of $X$ was not needed here.)

The topological space IN is metrisable (since N is countable and I is metrisable).Let δ be any metric generating the topology on IN, and for each n ≥ 0 let

Un = {y ∈ IN : δ(y, y′) < 2−n for some y′ ∈ h(X)} .

Then $\{U_n\}_{n\ge 0}$ is a decreasing sequence of open subsets of $I^{\mathbb N}$ with $h(X) \subset U_n$ for each $n \ge 0$. In fact $h(X) = \bigcap_{n\ge 0} U_n$: Let $y \in \bigcap_{n\ge 0} U_n$; then for each $n \ge 0$ there exists $y_n \in h(X)$ with $\delta(y,y_n) < 2^{-n}$ and so $\{y_n\}_{n\ge 0}$ is a Cauchy sequence in $h(X)$. Thus $\{x_n\}_{n\ge 0}$ is a Cauchy sequence in $X$, where $x_n$ now denotes the unique element with $h(x_n) = y_n$. Since $X$ is complete the sequence $\{x_n\}_{n\ge 0}$ has a limit $x \in X$ and then $h(x) = y$, i.e., $y \in h(X)$. This shows that $h(X)$ is the intersection of a sequence of open subsets of $I^{\mathbb N}$.

Proposition 18.2 (1) If (X, E) is substandard Borel and Y ∈ E is non-emptythen (Y, E|Y ) is a substandard Borel space.

(2) Let S be a countable set and for each s ∈ S let (Xs, Es) be a substandardBorel space. Then the product measurable space (X, E) is also substandard Borel.

(3) Let S be a non-empty countable set and for each s ∈ S let (Xs, Es) be asubstandard Borel space; assume that the sets Xs, s ∈ S, are disjoint. Then thedisjoint union measurable space (X, E) is substandard Borel.

(4) Let (X, E) be a substandard Borel space. Then the measurable space (X/, E/)is substandard Borel. (Recall that X/ denotes the set of all measures on (X, E)


taking only values in the set N and that E/ = σ(E♦), where E♦ is the set of allsubsets of X/ having the form {p ∈ X/ : p(E) = k} with E ∈ E and k ∈ N.)

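Proof (1) Since $(X,\mathcal E)$ is substandard Borel there exists a mapping $f : X \to M$ with $f^{-1}(\mathcal B) = \mathcal E$ and $f(X) \in \mathcal B$; let $f_{|Y}$ be the restriction of $f$ to $Y$. If $B \in \mathcal B$ then $f_{|Y}^{-1}(B) = f^{-1}(B) \cap Y \in \mathcal E_{|Y}$ and so $f_{|Y}^{-1}(\mathcal B) \subset \mathcal E_{|Y}$. On the other hand, if $A \in \mathcal E_{|Y}$ then $A = E \cap Y$ with $E \in \mathcal E$ and there exists $B \in \mathcal B$ with $f^{-1}(B) = E$. Thus $f_{|Y}^{-1}(B) = f^{-1}(B) \cap Y = E \cap Y = A$, which implies $f_{|Y}^{-1}(\mathcal B) = \mathcal E_{|Y}$. This means that $f_{|Y} : Y \to M$ is a mapping with $f_{|Y}^{-1}(\mathcal B) = \mathcal E_{|Y}$. But $f_{|Y}(Y) = f(Y)$ and so by Lemma 18.1 $f_{|Y}(Y) \in \mathcal B$. Hence $(Y,\mathcal E_{|Y})$ is substandard Borel.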

(2) This is trivially true if $S = \emptyset$, so we can assume $S$ is non-empty. For each $s \in S$ there exists a mapping $f_s : X_s \to M$ with $f_s^{-1}(\mathcal B) = \mathcal E_s$ and $f_s(X_s) \in \mathcal B$. Define $f : X \to M^S$ by letting $f(\{x_s\}_{s\in S}) = \{f_s(x_s)\}_{s\in S}$ for each $\{x_s\}_{s\in S} \in X$. Then by Proposition 2.6 (2) and (3) $f^{-1}(\mathcal B^S) = \mathcal E$ and $f(X) \in \mathcal B^S$. But by Lemma 18.2 and Proposition 16.4 $(M^S,\mathcal B^S)$ is substandard Borel and thus by Lemma 18.2 $(X,\mathcal E)$ is substandard Borel.

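(3) For each $s \in S$ there exists a mapping $f_s : X_s \to M$ with $f_s^{-1}(\mathcal B) = \mathcal E_s$ and $f_s(X_s) \in \mathcal B$. Define $f : X \to S \times M$ by letting $f(x) = (s, f_s(x))$ for each $x \in X_s$, $s \in S$. Then by Proposition 2.9 (2) and (3) $f^{-1}(\mathcal P(S) \times \mathcal B) = \mathcal E$ and $f(X) \in \mathcal P(S) \times \mathcal B$. But by Propositions 18.1 and 16.5 $(S \times M, \mathcal P(S) \times \mathcal B)$ is substandard Borel and thus by Lemma 18.2 $(X,\mathcal E)$ is substandard Borel.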

(4) There exists a mapping f : X → M with f−1(B) = E and f(X) ∈ B. Let f/ : X/ → M/ be the mapping defined in Chapter 13 with f/(p) = f∗p for all p ∈ X/. Then by Proposition 13.2 (2) and (3) (f/)−1(B/) = E/ and f/(X/) ∈ B/. But by Proposition 13.3 (M/, B/) is the disjoint union of the measurable spaces (Mn/, Bn/), n ≥ 1, and by Propositions 18.1 and 16.7 (Mn/, Bn/) is substandard Borel for each n ∈ N. Thus by (3) (M/, B/) is substandard Borel and so by Lemma 18.2 (X/, E/) is substandard Borel.

Proposition 16.8 says that any separable substandard Borel space is isomorphic to (B, B|B) for some B ∈ B. Conversely, by Proposition 18.2 (1) the measurable space (B, B|B) is a separable substandard Borel space for each B ∈ B.

19 The Kolmogorov extension property

In the following let (X, E) be a measurable space and {En}n≥0 be an increasing sequence of countably generated sub-σ-algebras of E with E = σ(⋃n≥0 En).

A sequence of measures {µn}n≥0 with µn ∈ P(En) for each n ≥ 0 is said to be consistent if µn(E) = µn+1(E) for all E ∈ En, n ≥ 0. The sequence {En}n≥0 has the Kolmogorov extension property if for each consistent sequence {µn}n≥0 there exists µ ∈ P(E) with µ(E) = µn(E) for all E ∈ En, n ≥ 0. (This measure µ is then unique, since it is uniquely determined by the sequence {µn}n≥0 on the algebra A = ⋃n≥0 En and σ(A) = E.)

Finally, the σ-algebra E is called the inverse limit of the sequence {En}n≥0 if ⋂n≥0 An ≠ ∅ holds whenever {An}n≥0 is a decreasing sequence of atoms with An ∈ A(En) for each n ≥ 0.

Theorem 19.1 Let (X, En) be substandard Borel for each n ≥ 0 and let E be the inverse limit of the sequence {En}n≥0. Then (X, E) is also substandard Borel and the sequence {En}n≥0 has the Kolmogorov extension property.

Theorem 19.1 (for standard Borel spaces) is the form of the Kolmogorov extension theorem occurring in Chapter V of Parthasarathy [15]. In fact the proof in [15] is implicitly a proof for substandard Borel spaces.

Theorem 19.1 can be combined with Theorem 17.1 to give the useful result which follows. This also deals with the set-up described above, thus E is the inverse limit of the sequence {En}n≥0 with (X, En) substandard Borel for each n ≥ 0. The algebra ⋃n≥0 En will be denoted by A.

A subset Q of P(E) is said to be locally equicontinuous if for each n ≥ 0 the restrictions of the measures in Q to En are equicontinuous, i.e., if for each n ≥ 0, each decreasing sequence {Ek}k≥1 from En with ⋂k≥1 Ek = ∅ and each ε > 0 there exists p ≥ 1 so that µ(Ep) < ε for all µ ∈ Q.

Theorem 19.2 Let Q be a locally equicontinuous subset of P(E) and {µn}n≥1 be a sequence of elements from Q. Then there exists a subsequence {nj}j≥1 and µ ∈ P(E) such that µ(E) = limj µ_{nj}(E) for all E ∈ A.

Proof Applying Theorem 17.1 to the restrictions of the measures in Q to Ek for each k ≥ 0 and employing the usual diagonal argument (Theorem 23.1) there exists a subsequence {nj}j≥1 and for each k ≥ 0 a probability measure νk ∈ P(Ek) such that νk(E) = limj µ_{nj}(E) for all E ∈ Ek, k ≥ 0. But the sequence of measures {νk}k≥0 is then clearly consistent and so by Theorem 19.1 there exists µ ∈ P(E) such that µ(E) = νk(E) for all E ∈ Ek, k ≥ 0. Thus µ(E) = limj µ_{nj}(E) for all E ∈ A.

Before coming to the proof of Theorem 19.1 let us first consider an example. Let S be a countably infinite set and for each s ∈ S let (Xs, Es) be a separable substandard Borel space; put X = ∏s∈S Xs and E = ∏s∈S Es. It thus follows from Proposition 18.2 (2) that (X, E) is substandard Borel.

For each non-empty finite subset Λ ⊂ S put X_Λ^◦ = ∏s∈Λ Xs and E_Λ^◦ = ∏s∈Λ Es and let pΛ : X → X_Λ^◦ be the projection mapping with pΛ({xs}s∈S) = {xs}s∈Λ; again by Proposition 18.2 (2) (X_Λ^◦, E_Λ^◦) is a substandard Borel space, which is separable (since each point is a measurable rectangle). Put EΛ = p_Λ^{−1}(E_Λ^◦); then EΛ is a sub-σ-algebra of E and by Lemma 18.2 (X, EΛ) is substandard Borel, since pΛ is surjective. Now choose an increasing sequence {Λn}n≥0 of non-empty finite subsets of S with S = ⋃n≥0 Λn. Then {EΛn}n≥0 is an increasing sequence of sub-σ-algebras of E with E = σ(⋃n≥0 EΛn) (since every measurable rectangle lies in ⋃n≥0 EΛn), and (X, EΛn) is a substandard Borel space for each n ≥ 0.

Moreover, E is the inverse limit of the sequence {EΛn}n≥0: Let {An}n≥0 be a decreasing sequence of atoms with An ∈ A(EΛn) for each n ≥ 0. By Lemma 16.4 there exists for each n ≥ 0 a (unique) point xn ∈ X_{Λn}^◦ with An = p_{Λn}^{−1}({xn}), and since An+1 ⊂ An it follows that xn = τn(xn+1), with τn the projection of X_{Λn+1}^◦ onto X_{Λn}^◦. There therefore exists a unique point x ∈ X with xn = pΛn(x) for each n ≥ 0, which means that x ∈ ⋂n≥0 An, and so in particular ⋂n≥0 An ≠ ∅. Thus by Theorem 19.1 (X, E) is substandard Borel (which we knew already) and the sequence {EΛn}n≥0 has the Kolmogorov extension property. (In fact, the assumption about the separability of the spaces (Xs, Es) was not really necessary; it was only made to simplify the proof that E is the inverse limit of the sequence {EΛn}n≥0.)

Now for each s ∈ S let µs be a probability measure on Es and for each n ≥ 0 let µΛn be the corresponding product measure on E_{Λn}^◦. By Theorem 13.1 there exists a unique νn ∈ P(EΛn) with µΛn = (pΛn)∗νn (since the mapping pΛn is surjective and EΛn = p_{Λn}^{−1}(E_{Λn}^◦)). Moreover, by checking on rectangles it is easy to see that the sequence of measures {νn}n≥0 is consistent and hence by the Kolmogorov extension property there exists a unique measure µ on E such that µ(E) = νn(E) for all E ∈ EΛn, n ≥ 0. Again by checking on rectangles it follows that µ is the product of the measures µs, s ∈ S, whose existence was established in Theorem 15.3.
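The consistency condition in this example can be checked by hand in the simplest possible case. The following is only a sketch under assumptions that are not part of the text: every Xs is taken to be {0, 1}, the one-site measure `site` is a hypothetical choice, and Λn is identified with the first n coordinates, so that restricting a measure to the smaller σ-algebra amounts to summing out the last coordinate.

```python
from fractions import Fraction
from itertools import product

# Hypothetical one-site probability measure on {0, 1} (an assumption of this sketch).
site = {0: Fraction(1, 3), 1: Fraction(2, 3)}

def product_measure(n):
    """Product measure on {0, 1}^n, stored as a dict from points to masses."""
    out = {}
    for x in product(site, repeat=n):
        mass = Fraction(1)
        for xi in x:
            mass *= site[xi]
        out[x] = mass
    return out

def marginal(mu, n):
    """Image of mu under the projection onto the first n coordinates."""
    out = {}
    for x, mass in mu.items():
        out[x[:n]] = out.get(x[:n], Fraction(0)) + mass
    return out

def is_consistent(max_n):
    """Check that the (n+1)-coordinate measure restricts to the n-coordinate one."""
    return all(marginal(product_measure(n + 1), n) == product_measure(n)
               for n in range(1, max_n))

print(is_consistent(5))
```

Exact rational arithmetic is used so that the equality of measures is tested without rounding error.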

This gives another proof of the existence of the product measure µ, but only under additional assumptions on the measurable spaces (Xs, Es). However, the framework considered here can be used to show the existence of measures which are far from being product measures, and which the method used in the proof of Theorem 15.3 cannot deal with.


Proof of Theorem 19.1: The proof is based on that of Theorem 4.1 in Chapter V of Parthasarathy [15]. In what follows let (X, E) be a measurable space and {En}n≥0 be an increasing sequence of countably generated sub-σ-algebras of E. Suppose that E is the inverse limit of the sequence {En}n≥0 and that (X, En) is substandard Borel for each n ≥ 0.

In the next two lemmas let E ′ be a countably generated sub-σ-algebra of E .

Lemma 19.1 Let f, g : X → M be mappings with f−1(B) = E′ and g−1(B) ⊂ E′; put A = f(X). Then there exists a unique mapping h : A → M such that h ◦ f = g, and then h−1(B) ⊂ B|A.

Proof Let z ∈ A; then by Lemma 16.3 f−1({z}) ∈ A(E′) and so g(x1) = g(x2) for all x1, x2 ∈ f−1({z}), since g−1({w}) ∈ E′ for each w ∈ M. There thus exists a unique mapping h : A → M such that h ◦ f = g. Now let B ∈ B; then, since f−1(B) = E′, there exists B′ ∈ B with f−1(B′) = g−1(B). It follows that

h−1(B) = f(f−1(h−1(B))) = f(g−1(B)) = f(f−1(B′)) = f(X) ∩ B′ = A ∩ B′

(since f : X → A is surjective) and therefore h−1(B) ∈ B|A. This shows that h−1(B) ⊂ B|A.

Lemma 19.2 Let f, g : X → M be mappings with f−1(B) = E′ and g−1(B) ⊂ E′; let θ : X → M × M be the mapping with θ = (g, f). Then θ−1(B × B) = E′ and θ(X) ∈ B × B|A, where A = f(X). In particular, if f(X) ∈ B then θ(X) ∈ B × B.

Proof If B1, B2 ∈ B then θ−1(B1 × B2) = g−1(B1) ∩ f−1(B2) ∈ E′ and hence θ−1(B × B) ⊂ E′ (since B × B is the smallest σ-algebra containing all products of the form B1 × B2 with B1, B2 ∈ B). On the other hand, if E ∈ E′ then E = f−1(B) for some B ∈ B and then θ−1(M × B) = E. Thus θ−1(B × B) = E′.

Now put A = f(X); by Lemma 19.1 there exists a unique mapping h : A → M such that h ◦ f = g, and then h−1(B) ⊂ B|A. Let r : M × A → M × M be the mapping with r(z1, z2) = (h(z2), z1); then r−1(B × B) ⊂ B × B|A and

θ(X) = {(g(x), f(x)) : x ∈ X} = {(h(f(x)), f(x)) : x ∈ X}

= {(h(z), z) : z ∈ A} = {(z1, z2) ∈M × A : h(z2) = z1} = r−1(D) ,

where D = {(z, z) : z ∈ M} is the diagonal in M × M. But D ∈ B × B, since D is closed and B × B is the σ-algebra of Borel subsets of M × M, and hence θ(X) ∈ B × B|A.

Let ∆ : M → M be given by ∆({zn}n≥0) = {z′n}n≥0, where z′n = z2n; thus ∆ is continuous and surjective. The key to the whole proof is the following ‘trick’, which appears in Theorem 2.2 of Mackey [13] and is also used in Lemma 4.1 in Chapter V of Parthasarathy [15]:


Lemma 19.3 Let E′1 and E′2 be sub-σ-algebras of E with E′1 ⊂ E′2 and with (X, E′2) substandard Borel. Let f1 : X → M with f_1^{−1}(B) = E′1. Then there exists a mapping f2 : X → M with f_2^{−1}(B) = E′2 and f2(X) ∈ B such that f1 = ∆ ◦ f2.

Proof Since (X, E′2) is substandard Borel there exists a mapping f : X → M with f−1(B) = E′2 and f(X) ∈ B. Define θ : X → M × M by θ = (f1, f). Then by Lemma 19.2 (since f_1^{−1}(B) = E′1 ⊂ E′2) θ−1(B × B) = E′2 and θ(X) ∈ B × B. Now let h : M × M → M be the homeomorphism given by

h({zn}n≥0, {z′n}n≥0) = {wn}n≥0 ,

where w2n = zn and w2n+1 = z′n for each n ≥ 0, and let f2 = h ◦ θ. Then it follows that f_2^{−1}(B) = θ−1(B × B) = E′2 and f2(X) = h(θ(X)) ∈ B. Finally, f1 = ∆ ◦ f2 holds directly from the definition of h and ∆.
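The last step of this proof rests on the identity ∆(h(z, z′)) = z. It can be illustrated on finite prefixes of sequences; this is only a sketch, with the elements of M replaced by Python lists of equal finite length and the function names `delta`, `interleave` and `split` chosen here for illustration.

```python
def delta(w):
    """Finite-prefix version of the map Delta: keep the even-indexed entries."""
    return w[0::2]

def interleave(z, z_prime):
    """Finite-prefix version of h: w[2n] = z[n] and w[2n+1] = z_prime[n]."""
    assert len(z) == len(z_prime)
    w = []
    for a, b in zip(z, z_prime):
        w.extend([a, b])
    return w

def split(w):
    """Inverse of interleave on even-length prefixes."""
    return w[0::2], w[1::2]

z = [0, 1, 1, 0]
z_prime = [1, 1, 0, 0]
w = interleave(z, z_prime)
print(w)         # [0, 1, 1, 1, 1, 0, 0, 0]
print(delta(w))  # [0, 1, 1, 0], which is z again
```

With z playing the role of f1(x) and z′ that of f(x), the first print corresponds to f2(x) = h(θ(x)) and the second to (∆ ◦ f2)(x) = f1(x).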

Since each (X, En) is a substandard Borel space Lemma 19.3 implies there exists a sequence of mappings {fn}n≥0 from X to M with f_n^{−1}(B) = En and fn(X) ∈ B and such that fn = ∆ ◦ fn+1 for each n ≥ 0.

Consider the product M^N as a compact metric space in the usual way; then the product σ-algebra B^N is also the Borel σ-algebra. Let

$$M_\Delta = \{\{z_n\}_{n\ge 0} \in M^{\mathbb N} : z_n = \Delta(z_{n+1}) \text{ for all } n \ge 0\};$$

M∆ is a closed (and thus compact) subset of M^N. Let B∆ be the trace σ-algebra (and so B∆ is the Borel σ-algebra of M∆). For each m ≥ 0 let πm : M∆ → M be the mapping given by πm({zn}n≥0) = zm for each {zn}n≥0 ∈ M∆. Then each πn is continuous and surjective (since ∆ is surjective). Moreover, πn = ∆ ◦ πn+1 and this implies that π_n^{−1}(B) = π_{n+1}^{−1}(∆−1(B)) ⊂ π_{n+1}^{−1}(B) for each n ≥ 0. Thus {π_n^{−1}(B)}n≥0 is an increasing sequence of σ-algebras and B∆ = σ(⋃n≥0 π_n^{−1}(B)) (since πn is the restriction of the projection mapping pn : M^N → M). Denote the algebra ⋃n≥0 π_n^{−1}(B) by A∆.

Now {fn(x)}n≥0 ∈ M∆ for each x ∈ X, since fn = ∆ ◦ fn+1 for each n ≥ 0, and therefore a mapping f : X → M∆ can be defined by letting f(x) = {fn(x)}n≥0 for each x ∈ X. This means f : X → M∆ is the unique mapping such that πn ◦ f = fn for each n ≥ 0. In particular f−1(π_n^{−1}(B)) = f_n^{−1}(B) = En and hence

$$f^{-1}(\mathcal B_\Delta) = f^{-1}(\sigma(\mathcal A_\Delta)) = \sigma\Big(\bigcup_{n\ge 0} f^{-1}(\pi_n^{-1}(\mathcal B))\Big) = \sigma\Big(\bigcup_{n\ge 0} \mathcal E_n\Big) = \mathcal E,$$

i.e., f−1(B∆) = E. Note that π_n^{−1}(fn(X)) ∈ B∆ for each n ≥ 0 (since fn(X) ∈ B) and that the sequence {π_n^{−1}(fn(X))}n≥0 is decreasing (since πn = ∆ ◦ πn+1 and fn = ∆ ◦ fn+1 for each n ≥ 0).


Lemma 19.4 f(X) = ⋂n≥0 π_n^{−1}(fn(X)) and so in particular f(X) ∈ B∆.

Proof Clearly f(X) ⊂ ⋂n≥0 π_n^{−1}(fn(X)), since πn ◦ f = fn for each n ≥ 0. Thus consider z ∈ ⋂n≥0 π_n^{−1}(fn(X)), so z = {zn}n≥0 ∈ M^N with zn ∈ fn(X) and zn = ∆(zn+1) for each n ≥ 0. Put An = f_n^{−1}({zn}); by Lemma 16.3 An ∈ A(En) and, since zn = ∆(zn+1),

$$A_{n+1} = f_{n+1}^{-1}(\{z_{n+1}\}) \subset f_{n+1}^{-1}(\Delta^{-1}(\{z_n\})) = f_n^{-1}(\{z_n\}) = A_n$$

for each n ≥ 0. Therefore by assumption A∞ = ⋂n≥0 An ≠ ∅, and f(x) = z for each x ∈ A∞, i.e., z ∈ f(X).

By Proposition 18.1 (M∆, B∆) is substandard Borel and it thus follows from Lemma 19.4 and Lemma 18.2 that (X, E) is substandard Borel.

It remains to show that {En}n≥0 has the Kolmogorov extension property and for this the following fact will be needed:

Proposition 19.1 Let {νn}n≥0 be a sequence from P(B) with νn = ∆∗νn+1 for each n ≥ 0. Then there exists a unique measure ν ∈ P(B∆) such that νn = (πn)∗ν for all n ≥ 0.

Proof Again let C ⊂ B be the algebra of cylinder sets. Now ∆−1(C) ⊂ C, since ∆ is continuous, and thus π_n^{−1}(C) = π_{n+1}^{−1}(∆−1(C)) ⊂ π_{n+1}^{−1}(C) for each n ≥ 0, and therefore {π_n^{−1}(C)}n≥0 is an increasing sequence of countable algebras. Put C∆ = ⋃n≥0 π_n^{−1}(C); then C∆ is a countable algebra with C∆ ⊂ A∆ and

$$\sigma(\mathcal C_\Delta) = \sigma\Big(\bigcup_{n\ge 0} \pi_n^{-1}(\mathcal C)\Big) = \sigma\Big(\bigcup_{n\ge 0} \pi_n^{-1}(\sigma(\mathcal C))\Big) = \sigma(\mathcal A_\Delta) = \mathcal B_\Delta.$$

Moreover, each element of C∆ is compact (since the mappings πn are continuous and M∆ is compact), and hence C∆ has the finite intersection property.

Let n ≥ 0; then (since πn is surjective) Theorem 13.1 implies there is a unique ν′n ∈ P(π_n^{−1}(B)) with νn = (πn)∗ν′n, and the sequence {ν′n}n≥0 is consistent in that ν′n+1(D) = ν′n(D) for all D ∈ π_n^{−1}(B), n ≥ 0. (Let D ∈ π_n^{−1}(B) with π_n^{−1}(B) = D = π_{n+1}^{−1}(B′); then B′ = ∆−1(B), since πn = ∆ ◦ πn+1 and πn+1 is surjective, and so ν′n+1(D) = νn+1(B′) = νn+1(∆−1(B)) = νn(B) = ν′n(D).) There is therefore a unique mapping ν′ : A∆ → R+ such that ν′(D) = ν′n(D) for all D ∈ π_n^{−1}(B), n ≥ 0, and it is clear that ν′ is finitely additive. Now the restriction of ν′ to C∆ is also finitely additive and hence (as in the proof of Proposition 16.6) there exists a unique ν ∈ P(B∆) with ν(D) = ν′(D) for all D ∈ C∆. But then the restriction of ν to π_n^{−1}(B) is a probability measure which is an extension of the restriction of ν′n to π_n^{−1}(C), and π_n^{−1}(C) is an algebra with σ(π_n^{−1}(C)) = π_n^{−1}(B). This means that ν is an extension of ν′n and from this it immediately follows that νn = (πn)∗ν for each n ≥ 0. The uniqueness of ν is clear, since the requirement that νn = (πn)∗ν for all n ≥ 0 determines ν on the algebra A∆.

Now let {µn}n≥0 be a consistent sequence of measures (with µn ∈ P(En) for each n ≥ 0) and for each n ≥ 0 let νn = (fn)∗µn ∈ P(B). Then, since the sequence is consistent and f_n^{−1}(B) = f_{n+1}^{−1}(∆−1(B)),

$$\nu_n(B) = \mu_n(f_n^{-1}(B)) = \mu_{n+1}(f_{n+1}^{-1}(\Delta^{-1}(B))) = \nu_{n+1}(\Delta^{-1}(B))$$

for each B ∈ B, and hence νn = ∆∗νn+1 for each n ≥ 0. By Proposition 19.1 there thus exists a unique measure ν ∈ P(B∆) such that νn = (πn)∗ν for all n ≥ 0. But ν(π_n^{−1}(fn(X))) = νn(fn(X)) = µn(X) = 1 for each n ≥ 0 and hence by Lemma 19.4 ν(f(X)) = 1. Therefore by Theorem 13.1 there exists a measure µ ∈ P(E) such that ν = f∗µ. Let n ≥ 0 and E ∈ En; there thus exists B ∈ B with E = f_n^{−1}(B) and then

$$\mu(E) = \mu(f_n^{-1}(B)) = \mu(f^{-1}(\pi_n^{-1}(B))) = \nu(\pi_n^{-1}(B)) = \nu_n(B) = \mu_n(f_n^{-1}(B)) = \mu_n(E);$$

i.e., µ(E) = µn(E) for all E ∈ En, n ≥ 0.

This shows that the sequence {En}n≥0 has the Kolmogorov extension property, which completes the proof of Theorem 19.1.

20 Convergence of conditional expectations

Let (X, E) be a measurable space and µ ∈ P(E) be a probability measure (both considered to be fixed in what follows). Recall from Theorem 12.3 that if E′ is a sub-σ-algebra of E and if f ∈ M(E) with µ(f) < ∞ then there exists g ∈ M(E′) such that µ(hg) = µ(hf) for all h ∈ M(E′), and that if g′ ∈ M(E′) is a further mapping with this property then g′ = g µ-a.e. Since µ is a probability measure we call g a version of the conditional expectation of f with respect to E′. If f is bounded above by b ∈ R+ then by Theorem 12.4 g can also be chosen so that g ≤ b.

In the next chapter and several times in Specifications and their Gibbs States [16] we need the two results below about the convergence of conditional expectations. We note that these results are stated and proved only for bounded mappings (since that is all we will require) but in fact they hold for all mappings f ∈ M(E) with µ(f) < ∞.

Proposition 20.1 Let {En}n≥1 be a decreasing sequence of sub-σ-algebras of E and put E∞ = ⋂n≥1 En. Let f ∈ MB(E), for each n ≥ 1 let fn ∈ MB(En) be a version of the conditional expectation of f with respect to En and let f∞ ∈ MB(E∞) be a version of the conditional expectation of f with respect to E∞. Then

$$\mu\big(\{x \in X : \lim_{n\to\infty} f_n(x) = f_\infty(x)\}\big) = 1,$$

i.e., limn fn = f∞ µ-a.e. Moreover, limn µ(|fn − f∞|) = 0.

Proposition 20.2 Let {En}n≥1 be an increasing sequence of sub-σ-algebras of E and put E′ = σ(⋃n≥1 En). Let f ∈ MB(E), for each n ≥ 1 let fn ∈ MB(En) be a version of the conditional expectation of f with respect to En and let f′ ∈ MB(E′) be a version of the conditional expectation of f with respect to E′. Then

$$\mu\big(\{x \in X : \lim_{n\to\infty} f_n(x) = f'(x)\}\big) = 1,$$

i.e., limn fn = f′ µ-a.e. Moreover, limn µ(|fn − f′|) = 0.

Recall that if {fn}n≥1 is any sequence from M(E) and f ∈ M(E) then by Lemma 9.7 the sets {x ∈ X : limn fn(x) exists} and {x ∈ X : limn fn(x) = f(x)} are both in E. Moreover, if f, g ∈ MB(E) then |f − g| ∈ MB(E), since by Lemma 9.5 (1) MB(E) is a normal subspace of M(X) and |f − g| = (f ∨ g − g) + (f ∨ g − f).

The proofs of Propositions 20.1 and 20.2 make use of the corresponding simple versions of the martingale convergence theorem. This theorem is one of the most important results in probability theory and we presume that the reader has seen a proof of it. (A very good presentation can be found in Chapter 5 of Breiman [2].) In order to prove Propositions 20.1 and 20.2 we only need a version for uniformly bounded martingales. Thus we give a simple proof here of the martingale convergence theorem for this special case. The proof is adapted from the proof of Baez-Duarte and Isaac [9] which can be found in Garsia [7].

We start by looking at the version of martingales needed in Proposition 20.1.

Let {En}n≥1 be a decreasing sequence of sub-σ-algebras of E. A sequence {fn}n≥1 from M(E) is said to be adapted to {En}n≥1 if fn ∈ M(En) for all n ≥ 1. An adapted sequence {fn}n≥1 is said to be a martingale if µ(fn) < ∞ for each n ≥ 1 and µ(IE fn) = µ(IE fm) for all E ∈ En whenever n ≥ m. If {fn}n≥1 is a martingale then it immediately follows from Proposition 9.2 that also µ(h fn) = µ(h fm) for all h ∈ M(En) whenever n ≥ m.

Usually martingales are defined with an increasing sequence of σ-algebras, and what we are dealing with here are often called backward martingales. The ‘usual’ martingales enter the picture when we look at Proposition 20.2.

Proposition 20.3 (1) Let {fn}n≥1 be a martingale; then fn is a version of the conditional expectation of f1 with respect to En for each n ≥ 1.

(2) Let f ∈ M(E) with µ(f) < ∞ and for each n ≥ 1 let fn be a version of the conditional expectation of f with respect to En. Then {fn}n≥1 is a martingale.

Proof (1) This is clear, since by definition fn ∈ M(En) and µ(h fn) = µ(h f1) for all h ∈ M(En), n ≥ 1.

(2) For each n ≥ 1 let fn be a version of the conditional expectation of f with respect to En. Then fn ∈ M(En) and so the sequence {fn}n≥1 is adapted. Moreover µ(fn) = µ(f) < ∞ for each n ≥ 1. Let n ≥ m and E ∈ En; then also E ∈ Em and therefore µ(IE fn) = µ(IE f) = µ(IE fm). This shows that {fn}n≥1 is a martingale.

Put E∞ = ⋂n≥1 En; thus E∞ is a σ-algebra which is called the tail σ-algebra. Here is the martingale convergence theorem for a backward martingale:

Theorem 20.1 Let {fn}n≥1 be a martingale and f∞ ∈ M(E∞) be a version of the conditional expectation of f1 with respect to E∞. Then

$$\mu\big(\{x \in X : \lim_{n\to\infty} f_n(x) = f_\infty(x)\}\big) = 1.$$

We are only going to prove this result for uniformly bounded martingales, i.e., under the additional assumption that there exists b ∈ R+ such that fn ≤ b for each n ≥ 1 (and so in particular fn ∈ MB(En) for each n ≥ 1). If f, g ∈ MB(E′) then |f − g| ∈ MB(E′) and thus also (f − g)² ∈ MB(E′).

We start the proof with some results about finite sequences of mappings. Let F1, . . . , Fp be sub-σ-algebras of E with Fk+1 ⊂ Fk for k = 1, . . . , p − 1. A finite sequence {f_k}_{k=1}^p from MB(E) is adapted to {F_k}_{k=1}^p if fk ∈ MB(Fk) for k = 1, . . . , p. An adapted sequence {f_k}_{k=1}^p is said to be a martingale (resp. a submartingale) if µ(IF fk) = µ(IF fj) (resp. µ(IF fk) ≤ µ(IF fj)) for all F ∈ Fk whenever k ≥ j. (Note that all the mappings occurring here are assumed to be bounded.) If {f_k}_{k=1}^p is a martingale (resp. submartingale) then we clearly also have µ(h fk) = µ(h fj) (resp. µ(h fk) ≤ µ(h fj)) for all h ∈ MB(Fk) whenever k ≥ j.

Lemma 20.1 Let {g_k}_{k=1}^p be a submartingale and put g∗ = g1 ∨ · · · ∨ gp. Then

$$\mu(\{x \in X : g^*(x) > a\}) \le a^{-1}\mu(g_1)$$

for all a > 0.

Proof Put F = {x ∈ X : g∗(x) > a} and for k = 1, . . . , p let

$$F_k = \{x \in X : g_k(x) > a \text{ and } g_j(x) \le a \text{ for } j = k+1, \dots, p\}.$$

Then Fk ∈ Fk for each k and F is the disjoint union of the sets F1, . . . , Fp. Thus

$$\mu(I_F g_1) = \sum_{k=1}^{p} \mu(I_{F_k} g_1) \ge \sum_{k=1}^{p} \mu(I_{F_k} g_k) \ge \sum_{k=1}^{p} \mu(a I_{F_k}) = a\,\mu(F)$$

and therefore µ(F) ≤ a^{−1}µ(IF g1) ≤ a^{−1}µ(g1).

Lemma 20.2 If {f_k}_{k=1}^p is a martingale then {f_k^2}_{k=1}^p is a submartingale.

Proof Let j ≤ k and F ∈ Fk. Then µ(IF fk²) = µ(IF fk fk) = µ(IF fk fj) and so by the Cauchy-Schwarz inequality (Proposition 10.7) applied to the measure µ·IF

$$\mu(I_F f_k^2)^2 = \mu(I_F f_k f_j)^2 \le \mu(I_F f_k^2)\,\mu(I_F f_j^2).$$

Thus µ(IF fk²) ≤ µ(IF fj²) (since this holds trivially when µ(IF fk²) = 0).

Lemma 20.3 Let {f_k}_{k=1}^p be a martingale and for each k put gk = (fk − fp)². Then {g_k}_{k=1}^p is a submartingale.


Proof Since Fp ⊂ Fk it follows that gk ∈ MB(Fk). Let j ≤ k and F ∈ Fk. Then by Lemma 20.2

$$\begin{aligned}
\mu(I_F g_k) + 2\mu(I_F f_k f_p) &= \mu(I_F (f_k - f_p)^2 + 2 I_F f_k f_p) = \mu(I_F f_k^2 + I_F f_p^2) \\
&= \mu(I_F f_k^2) + \mu(I_F f_p^2) \le \mu(I_F f_j^2) + \mu(I_F f_p^2) \\
&= \mu(I_F f_j^2 + I_F f_p^2) = \mu(I_F (f_j - f_p)^2 + 2 I_F f_j f_p) \\
&= \mu(I_F g_j) + 2\mu(I_F f_j f_p).
\end{aligned}$$

But µ(IF fk fp) = µ(IF fp fk) = µ(IF fp fj) = µ(IF fj fp), since fp ∈ MB(Fk), and therefore µ(IF gk) ≤ µ(IF gj).

Lemma 20.4 Let {f_k}_{k=1}^p be a martingale; then for each a > 0

$$\mu(\{x \in X : f^\circ(x) > a\}) \le a^{-2}\mu((f_1 - f_p)^2)$$

where f^◦ = |f1 − fp| ∨ · · · ∨ |fp−1 − fp|.

Proof This follows immediately from Lemmas 20.1 and 20.3.

Now let {fn}n≥1 be a martingale with fn ∈ MB(En) for each n ≥ 1 (though at the moment we do not assume that they are uniformly bounded). Let n ≥ m; then fn ∈ MB(Em) and so µ(fn²) = µ(fn fn) = µ(fn fm). Thus

$$\begin{aligned}
\mu((f_n - f_m)^2) + 2\mu(f_n^2) &= \mu((f_n - f_m)^2 + 2 f_n^2) = \mu((f_n - f_m)^2 + 2 f_n f_m) \\
&= \mu(f_n^2 + f_m^2) = \mu(f_n^2) + \mu(f_m^2)
\end{aligned}$$

and hence µ((fn − fm)²) + µ(fn²) = µ(fm²). In particular µ(fm²) − µ(fn²) ≥ 0, which implies that the sequence {µ(fn²)}n≥1 is decreasing.
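The identity µ((fn − fm)²) + µ(fn²) = µ(fm²) for n ≥ m can be checked exactly on a small example. The following is a sketch under assumptions that are not part of the text: X = {0, . . . , 7} carries the uniform probability measure, the decreasing σ-algebras are generated by consecutive blocks of length 1, 2, 4 and 8, the values of `f` are arbitrary, and conditional expectations are then just block averages.

```python
from fractions import Fraction

# Hypothetical bounded mapping on X = {0, ..., 7} (uniform probability measure).
f = [Fraction(v) for v in (5, 0, 2, 7, 1, 1, 9, 3)]

def cond_exp(values, size):
    """Average over consecutive blocks of length `size`."""
    out = []
    for start in range(0, len(values), size):
        block = values[start:start + size]
        out.extend([sum(block) / len(block)] * len(block))
    return out

def mu(values):
    """Integral with respect to the uniform probability measure."""
    return sum(values) / len(values)

# fs[0], ..., fs[3] use block sizes 1, 2, 4, 8, so the sigma-algebras decrease.
fs = [cond_exp(f, 2 ** k) for k in range(4)]
energy = [mu([v * v for v in fn]) for fn in fs]

def identity_holds(m, n):
    """Check mu((f_n - f_m)^2) + mu(f_n^2) == mu(f_m^2) for n >= m (0-based)."""
    diff = mu([(a - b) ** 2 for a, b in zip(fs[n], fs[m])])
    return diff + energy[n] == energy[m]

print(all(identity_holds(m, n) for m in range(4) for n in range(m, 4)))
print([float(e) for e in energy])
```

The list `energy` holds the values µ(fn²), which are non-increasing in n, in line with the statement above.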

Let a = limn µ(fn²), and choose a subsequence {nj}j≥1 such that µ(f_{n_j}²) < a + 2^{−2j} for all j ≥ 1. Then µ((f_{n_{j+1}} − f_{n_j})²) = µ(f_{n_j}²) − µ(f_{n_{j+1}}²) < 2^{−2j} for all j ≥ 1 and hence by the Cauchy-Schwarz inequality (Proposition 10.7)

$$\mu(|f_{n_{j+1}} - f_{n_j}|)^2 = \mu(1 \cdot |f_{n_{j+1}} - f_{n_j}|)^2 \le \mu(1)\,\mu((f_{n_{j+1}} - f_{n_j})^2) = \mu((f_{n_{j+1}} - f_{n_j})^2)$$

which means that µ(|f_{n_{j+1}} − f_{n_j}|) < 2^{−j} for all j ≥ 1. Thus

$$\mu\Big(\sum_{j\ge 1} |f_{n_{j+1}} - f_{n_j}|\Big) = \sum_{j\ge 1} \mu(|f_{n_{j+1}} - f_{n_j}|) < \infty;$$

hence if E = {x ∈ X : ∑j≥1 |f_{n_{j+1}} − f_{n_j}|(x) < ∞} then by Proposition 10.6 (4) µ(E) = 1. But clearly E is a subset of the set

$$G' = \{x \in X : \lim_{j\to\infty} f_{n_j}(x) \text{ exists}\}$$

and therefore µ(G′) = 1. Now for each j ≥ 1 let

$$f_j^\circ = |f_{n_j} - f_{n_{j+1}}| \vee \cdots \vee |f_{n_{j+1}-1} - f_{n_{j+1}}|;$$

then by Lemma 20.4 µ({x ∈ X : f_j^◦(x) > a}) ≤ a^{−2}µ((f_{n_j} − f_{n_{j+1}})²) < a^{−2}2^{−2j} for each a > 0 and hence

$$\begin{aligned}
\mu\Big(\bigcap_{j\ge N} \{x \in X : f_j^\circ(x) \le a\}\Big) &= 1 - \mu\Big(\bigcup_{j\ge N} \{x \in X : f_j^\circ(x) > a\}\Big) \\
&\ge 1 - \sum_{j\ge N} \mu(\{x \in X : f_j^\circ(x) > a\}) \ge 1 - a^{-2}\sum_{j\ge N} 2^{-2j} \ge 1 - a^{-2}2^{-N}
\end{aligned}$$

for all N ≥ 1, a > 0, which implies that

$$\mu\Big(\bigcup_{N\ge 1}\bigcap_{j\ge N} \{x \in X : f_j^\circ(x) \le a\}\Big) = 1$$

for each a > 0, which in turn implies that

$$\mu\Big(\bigcap_{m\ge 1}\bigcup_{N\ge 1}\bigcap_{j\ge N} \{x \in X : f_j^\circ(x) \le 1/m\}\Big) = 1.$$

Thus if G = {x ∈ X : limn fn(x) exists} then

$$G \supset G' \cap \bigcap_{m\ge 1}\bigcup_{N\ge 1}\bigcap_{j\ge N} \{x \in X : f_j^\circ(x) \le 1/m\},$$

and we have therefore shown that µ(G) = 1. Define a mapping f′ : X → R+∞ by

$$f'(x) = \begin{cases} \lim_n f_n(x) & \text{if this limit exists,} \\ 0 & \text{otherwise.} \end{cases}$$

Then f′ ∈ M(E), since f′ = IG lim supn fn and lim supn fn ∈ M(E), G ∈ E. But f′ does not change if the sequence {fn}n≥1 is replaced by {fn}n≥m and so the same argument implies that f′ ∈ M(Em) for all m ≥ 1, i.e., f′ ∈ M(E∞).

We now assume that fn ≤ b for all n ≥ 1 for some b ∈ R+. Then by Theorem 10.4 and Proposition 10.6 (3) µ(IE f′) = limn µ(IE IG fn) = limn µ(IE fn) for all E ∈ E, since f′ = limn IG fn and µ(X \ G) = 0. In particular, if E ∈ E∞ then

$$\mu(I_E f') = \lim_{n\to\infty} \mu(I_E f_n) = \lim_{n\to\infty} \mu(I_E f_1) = \mu(I_E f_1),$$

since E ∈ En for each n ≥ 1 and this implies f′ is a version of the conditional expectation of f1 with respect to E∞.

Finally, let f∞ be any version of the conditional expectation of f1 with respect to E∞, let F = {x ∈ X : f∞(x) = f′(x)} and G∞ = {x ∈ X : limn fn(x) = f∞(x)}. Then G∞ ⊃ G ∩ F and µ(G) = µ(F) = 1; hence µ(G∞) = 1. This completes the proof of Theorem 20.1 for a uniformly bounded martingale.


Proposition 20.4 Let {fn}n≥1 be a uniformly bounded martingale and f∞ be a bounded version of the conditional expectation of f1 with respect to E∞. Then

$$\lim_{n\to\infty} \mu(|f_n - f_\infty|) = 0.$$

Proof Let G = {x ∈ X : limn fn(x) = f∞(x)}; then by Theorem 20.1 µ(G) = 1, and limn IG fn = IG f∞. Moreover IG fn ≤ b, where b ∈ R+ is such that fn ≤ b for all n and µ(b) = b < ∞. Thus by Theorem 10.4 and Proposition 10.6 (3)

$$\lim_{n\to\infty} \mu(|f_n - f_\infty|) = \lim_{n\to\infty} \mu(I_G |f_n - f_\infty|) = \lim_{n\to\infty} \mu(|I_G f_n - I_G f_\infty|) = 0.$$

Proof of Proposition 20.1: Let b ∈ R+ be such that f ≤ b; then by Theorem 12.4 (together with the uniqueness in Theorem 12.3) fn ≤ b µ-a.e., and hence fn ∧ b is also a version of the conditional expectation of f with respect to En. Thus by Proposition 20.3 (2), Theorem 20.1 and Proposition 20.4 limn fn ∧ b = f∞ µ-a.e. and limn µ(|fn ∧ b − f∞|) = 0, and so the same holds with fn replacing fn ∧ b.

We now turn to the proof of Proposition 20.2, so now let {En}n≥1 be an increasing sequence of sub-σ-algebras of E. A sequence {fn}n≥1 from M(E) is again said to be adapted to {En}n≥1 if fn ∈ M(En) for all n ≥ 1. An adapted sequence {fn}n≥1 is a martingale if µ(fn) < ∞ for each n ≥ 1 and µ(IE fm) = µ(IE fn) for all E ∈ Em whenever m ≤ n.

Proposition 20.5 Let f ∈ M(E) with µ(f) < ∞ and for each n ≥ 1 let fn be a version of the conditional expectation of f with respect to En. Then {fn}n≥1 is a martingale.

Proof This is the same as the proof of Proposition 20.3 (2). (In general, however, there is no result corresponding to Proposition 20.3 (1).)

Here is the martingale convergence theorem for such a (forward) martingale:

Theorem 20.2 Let {fn}n≥1 be a martingale with supn µ(fn) < ∞. Then

$$\mu\big(\{x \in X : \lim_{n\to\infty} f_n(x) \text{ exists}\}\big) = 1.$$

Proof For uniformly bounded martingales the proof is essentially the same as the proof of the analogous statement in Theorem 20.1. The only real difference is that the sequence {µ(fn²)}n≥1, which was decreasing for a backward martingale, is now increasing. We thus need the assumption that the martingale is uniformly bounded at this point to ensure that the limit a = limn µ(fn²) is finite. (In the previous case the uniform bound was only required later.) The details of the proof are left to the reader.

Proof of Proposition 20.2: Let b ∈ R+ be such that f ≤ b; then as in the proof of Proposition 20.1 we can assume that fn ≤ b for each n ≥ 1. Put

$$G = \{x \in X : \lim_{n\to\infty} f_n(x) \text{ exists}\};$$

then by Lemma 9.7 G ∈ E′, since fn ∈ M(E′) for each n ≥ 1, and by Theorem 20.2 and Proposition 20.5 µ(G) = 1. Then limn(IG fn)(x) exists for each x ∈ X and IG fn ∈ MB(E′) for each n ≥ 1; hence if g = limn IG fn then g ∈ MB(E′) (with g ≤ b) and limn fn = g µ-a.e. Now if E ∈ Em for some m ≥ 1 then E ∈ En for all n ≥ m and so by Proposition 10.6 (3) µ(IE f) = µ(IE fn) = µ(IE IG fn) for all n ≥ m. Therefore by Theorem 10.4 µ(IE g) = limn µ(IE IG fn) = µ(IE f), since limn IE IG fn = IE g and IE IG fn ≤ b for each n ≥ 1, and this shows that µ(IE g) = µ(IE f) for all E ∈ A = ⋃m≥1 Em. But D = {E ∈ E : µ(IE g) = µ(IE f)} is clearly a monotone class and A is an algebra and so µ(IE g) = µ(IE f) for all E ∈ E′ = σ(A). Thus g is a version of the conditional expectation of f with respect to E′, which implies that limn fn = f′ µ-a.e., since by Theorem 12.3 f′ = g µ-a.e. Finally, the proof that limn µ(|fn − f′|) = 0 is exactly the same as the proof of Proposition 20.4.

21 Existence of conditional distributions

Let us say that conditional distributions exist for measurable spaces (X, E) and (Y, F) if for each µ ∈ P(E × F) there exists a probability kernel π : X × F → R+∞ such that

$$\mu(E \times F) = \mu_1(I_E\,\pi(I_F))$$

for all E ∈ E, F ∈ F, where µ1 = (p1)∗µ is the image measure of µ under the projection p1 : X × Y → X onto the first component. (Beware that this definition is not symmetric in (X, E) and (Y, F).)
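When X and Y are finite the kernel in this definition can be written down directly, which may help to fix ideas. The following is only a sketch: the joint measure `mu`, the sets `X` and `Y`, and the choice of a point mass at "a" on µ1-null points are all assumptions made for the illustration.

```python
from fractions import Fraction

# Hypothetical joint probability measure on X x Y (all subsets measurable).
X = [0, 1, 2]
Y = ["a", "b"]
mu = {
    (0, "a"): Fraction(1, 4), (0, "b"): Fraction(1, 4),
    (1, "a"): Fraction(1, 2), (1, "b"): Fraction(0),
    (2, "a"): Fraction(0),    (2, "b"): Fraction(0),
}

# mu1 is the image of mu under the projection onto the first component.
mu1 = {x: sum(mu[(x, y)] for y in Y) for x in X}

def kernel(x, F):
    """pi(x, F); on mu1-null points any fixed probability measure will do."""
    if mu1[x] == 0:
        return Fraction(1) if "a" in F else Fraction(0)  # point mass at "a"
    return sum(mu[(x, y)] for y in F) / mu1[x]

def lhs(E, F):
    """mu(E x F)."""
    return sum(mu[(x, y)] for x in E for y in F)

def rhs(E, F):
    """mu1(I_E * pi(I_F)), i.e. the sum over x in E of mu1(x) * pi(x, F)."""
    return sum(mu1[x] * kernel(x, F) for x in E)

print(lhs([0, 1], ["a"]), rhs([0, 1], ["a"]))
```

The value chosen on the µ1-null point x = 2 does not affect the defining identity, which is why a kernel of this kind is only determined up to µ1-null sets.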

Conditional distributions do not exist in general. However, they do exist if (X, E) is countably generated and (Y, F) is substandard Borel. Proofs of this fact with substandard Borel replaced by standard Borel can be found in Chapter 1 of Doob [4], Chapter V of Parthasarathy [15], and also in Appendix 4 of Dynkin and Yushkevich [6].

Theorem 21.1 Conditional distributions exist for (X, E) and (Y, F) if (X, E) is countably generated and (Y, F) is substandard Borel.

Proof We reduce things to the case in which (X, E) = (Y,F) = (M,B).

Lemma 21.1 Conditional distributions exist for (M,B) and (M,B).

Proof For m ≥ 1 again let Cm = q_m^{−1}(P({0, 1}^m)), where qm : M → {0, 1}^m is given by qm({zn}n≥1) = (z1, . . . , zm). Thus {Cm}m≥1 is an increasing sequence of finite algebras with C = ⋃m≥1 Cm. For each z ∈ M and each n ≥ 1 let an(z) be the atom of Cn containing z. Let N ⊂ B be the trivial σ-algebra with N = {∅, M}.

Let µ ∈ P(B × B) and µ1 = (p1)∗µ with p1 : M × M → M projecting onto the first component. For each n ≥ 1 define γn : M × B → R+ by

$$\gamma_n(z, B) = \frac{\mu(a_n(z) \times B)}{\mu_1(a_n(z))}$$

with 0/0 taken to be 0. Then γn(z, ·) is either 0 or an element of P(B) for each z ∈ M, γn(·, B) ∈ M(Cn) for each B ∈ B and µ(C × B) = µ1(IC γn(IB)) for all C ∈ Cn, B ∈ B. Consider the mapping γ′n : (M × M) × B → R+ with γ′n((z1, z2), B) = γn(z1, B); then γ′n(·, B) ∈ M(Cn × N) and

$$\mu(I_{C\times N}\,\gamma'_n(I_B)) = \mu((C \times N) \cap (M \times B)) = \mu(I_{C\times N} I_{M\times B})$$

for all C ∈ Cn, N ∈ N, B ∈ B. Thus γ′n(IB) is a version of the conditional expectation of I_{M×B} with respect to Cn × N for each n ≥ 1 and it therefore follows from Proposition 20.2 that

$$\mu_1\big(\{z \in M : \lim_{n\to\infty} \gamma_n(z, B) \text{ exists}\}\big) = \mu\big(\{(z_1, z_2) \in M \times M : \lim_{n\to\infty} \gamma'_n((z_1, z_2), B) \text{ exists}\}\big) = 1$$

for each B ∈ B. Put

$$M_{\mathcal C} = \{z \in M : \lim_{n\to\infty} \gamma_n(z, C) \text{ exists for all } C \in \mathcal C\};$$

since C is countable it follows that MC ∈ B and µ1(MC) = 1. Choose z0 ∈ MC and define a mapping γ : M × C → R+ by letting

$$\gamma(z, C) = \begin{cases} \lim_n \gamma_n(z, C) & \text{if } z \in M_{\mathcal C}, \\ \lim_n \gamma_n(z_0, C) & \text{if } z \in M \setminus M_{\mathcal C}. \end{cases}$$

Then γ(·, C) ∈ M(B) for each C ∈ C and by Theorem 10.4

$$\mu(C_1 \times C_2) = \lim_{n\to\infty} \mu_1(I_{C_1}\gamma_n(I_{C_2})) = \mu_1(I_{C_1}\gamma(I_{C_2}))$$

for all C1, C2 ∈ C. Hence by Proposition 2.1 µ(B1 × C2) = µ1(I_{B1} γ(I_{C2})) for all B1 ∈ B, C2 ∈ C. Now it is clear that the mapping γ(z, ·) : C → R+∞ is additive with γ(z, M) = 1 for each z ∈ M and so by Proposition 16.6 it has a unique extension to an element of P(B) which will also be denoted by γ(z, ·). Thus γ : M × B → R+ is a pre-kernel which by Proposition 14.5 is a probability kernel satisfying µ(B1 × B2) = µ1(I_{B1} γ(I_{B2})) for all B1, B2 ∈ B.

Lemma 21.2 If (Y, F) is substandard Borel then conditional distributions exist for (M, B) and (Y, F).

Proof Since (Y, F) is substandard Borel there exists a mapping f : Y → M with f−1(B) = F such that f(Y) ∈ B. Let µ ∈ P(B × F) and let µ1 = (p1)∗µ with p1 : M × Y → M the projection onto the first component. Put ν = g∗µ, where g = idM × f : M × Y → M × M, so ν ∈ P(B × B). Then by Lemma 21.1 there exists a probability kernel γ : M × B → R+ such that

$$\nu(B_1 \times B_2) = \nu_1(I_{B_1}\gamma(I_{B_2}))$$

for all B1, B2 ∈ B, where ν1 = (p1)∗ν with p1 : M × M → M projecting onto the first component, and note that ν1 = µ1, since p1 ◦ g = p1 ◦ (idM × f) = p1. Now consider M0 = {z ∈ M : γ(z, f(Y)) = 1}; then M0 ∈ B and ν1(M0) = 1, since

$$1 = \mu(M \times Y) = \mu(g^{-1}(M \times f(Y))) = \nu(M \times f(Y)) = \nu_1(\gamma(\,\cdot\,, f(Y))).$$

Choose some point z0 ∈ M0 and define γo : M × B → R+∞ by

$$\gamma_o(z, B) = \begin{cases} \gamma(z, B) & \text{if } z \in M_0, \\ \gamma(z_0, B) & \text{if } z \in M \setminus M_0; \end{cases}$$

then γo is a probability kernel with γo(z, f(Y)) = 1 for all z ∈ M and

$$\nu(B_1 \times B_2) = \nu_1(I_{B_1}\gamma_o(I_{B_2}))$$

for all B1, B2 ∈ B. Now by Theorem 13.1 there exists for each z ∈ M a probability measure τ(z, ·) ∈ P(F) so that τ(z, f−1(B)) = γo(z, B) for all B ∈ B, and then τ : M × F → R+ is clearly a probability kernel. Let B ∈ B and F ∈ F; then F = f−1(B′) for some B′ ∈ B and so

$$\begin{aligned}
\mu(B \times F) &= \mu(B \times f^{-1}(B')) = \mu(g^{-1}(B \times B')) = \nu(B \times B') \\
&= \nu_1(I_B\,\gamma_o(I_{B'})) = \mu_1(I_B\,\tau(I_{f^{-1}(B')})) = \mu_1(I_B\,\tau(I_F))
\end{aligned}$$

and this shows that conditional distributions exist for (M, B) and (Y, F).

Proof of Theorem 21.1: Since the measurable space (X, E) is countably generatedthere exists by Proposition 16.1 a mapping f : X → M with f−1(B) = E . Letµ ∈ P(X × Y, E × F) and µ1 = (p1)∗µ with p1 : X × Y → X the projection ontothe first component. Put ν = g∗µ, where g = f × idY : X × Y → M × Y , soν ∈ P(M × Y,B × F). Then by Lemma 21.2 there exists a probability kernelτ : M ×F → R+ such that ν(B×E) = ν1(IBτ(IF )) for all B ∈ B, F ∈ F , whereν1 = (p1)∗ν with p1 : M × Y →M the projection onto the first component, andν1 = f∗µ1, since p1 ◦ g = p1 ◦ (f × idY ) = f ◦ p1. Now define π : X ×F → R+ byletting π(x, F ) = τ(f(x), F ) for all x ∈ X, F ∈ F , thus π is clearly a probabilitykernel. Let E ∈ E , F ∈ F ; then E = f−1(B) for some B ∈ B and so

µ(E × F) = µ(f⁻¹(B) × F) = µ(g⁻¹(B × F)) = ν(B × F)

= ν₁(I_B τ(I_F)) = (f∗µ₁)(I_B τ(I_F))

= µ₁(I_{f⁻¹(B)} τ(f(·), F)) = µ₁(I_E π(I_F))

and therefore conditional distributions exist for (X, E) and (Y, F). This completes the proof of Theorem 21.1.
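For finite spaces the kernel can be written down explicitly, which may help to see what the defining identity µ(E × F) = µ₁(I_E π(I_F)) says. The following Python sketch is only an illustration under assumptions which are not part of these notes: the sets X and Y, the point masses of mu and the helper names mu1, pi, lhs, rhs and subsets are all hypothetical, and every subset is taken to be measurable. On the µ₁-null point a fixed default measure is used, much as γ(z₀, ·) is used in the proof of Lemma 21.2.

```python
from fractions import Fraction

# Hypothetical finite stand-ins for (X, E) and (Y, F); every subset is measurable.
X = ["a", "b", "c"]
Y = [0, 1]

# Point masses of a probability measure mu on X x Y (illustrative values only).
mu = {
    ("a", 0): Fraction(1, 4), ("a", 1): Fraction(1, 4),
    ("b", 0): Fraction(1, 2), ("b", 1): Fraction(0),
    ("c", 0): Fraction(0),    ("c", 1): Fraction(0),
}

def mu1(x):
    """Point mass at x of the first marginal mu_1 = (p_1)_* mu."""
    return sum(mu[(x, y)] for y in Y)

def pi(x, F):
    """Probability kernel pi(x, F), with a fixed default on the mu_1-null set."""
    if mu1(x) == 0:
        # Any fixed probability measure will do here; we take the point mass at Y[0].
        return Fraction(1) if Y[0] in F else Fraction(0)
    return sum(mu[(x, y)] for y in F) / mu1(x)

def lhs(E, F):
    """mu(E x F)."""
    return sum(mu[(x, y)] for x in E for y in F)

def rhs(E, F):
    """mu_1(I_E pi(I_F)), i.e. the sum over x in E of pi(x, F) * mu_1({x})."""
    return sum(pi(x, F) * mu1(x) for x in E)

def subsets(items):
    """All subsets of a finite list."""
    out = [[]]
    for item in items:
        out += [s + [item] for s in out]
    return out

# The defining identity, checked for every pair of subsets E and F.
identity_holds = all(lhs(E, F) == rhs(E, F) for E in subsets(X) for F in subsets(Y))
```

Exact fractions are used so that the comparison lhs(E, F) == rhs(E, F) is not affected by rounding. The choice of default measure at the null point "c" does not influence the identity, since that point carries no µ₁-mass.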

22 Standard Borel spaces

In this chapter we discuss the relationship between standard and substandard Borel spaces. We start by giving (in Proposition 22.1) some further properties of the measurable space (M, B). They form the basis for most of the results which follow. However, we offer no proof of these facts, which would involve the typical machinery associated with standard Borel spaces.

Proposition 22.1 (1) If f : M → M is injective and B-measurable (meaning that f⁻¹(B) ⊂ B) then f(B) ∈ B for each B ∈ B (and so in particular f(M) ∈ B).

(2) If A ∈ B is uncountable then there exists an injective B-measurable mapping f : M → M with f(M) = A.

Proof Part (1) is a special case of a theorem of Kuratowski. Part (2) is contained in what goes under the name of the isomorphism theorem. For a treatment of these results see, for example, Chapter I of Parthasarathy [15].

A measurable space (X, E) is standard Borel if there exists a metric on X which makes it a complete separable metric space in such a way that E is then the Borel σ-algebra (the smallest σ-algebra containing the open subsets of X). The name ‘standard Borel’ was given to such spaces by Mackey in [13]. In particular, by Proposition 16.3 (1) a standard Borel space is countably generated. Moreover, a standard Borel space (X, E) is separable, i.e., {x} ∈ E for each x ∈ X.

Proposition 22.2 A measurable space (X, E) is standard Borel if and only if it is separable and substandard Borel.

Proof By Proposition 18.1 a standard Borel space is substandard Borel and clearly it is separable. Suppose then that (X, E) is a separable substandard Borel space. Then there exists a mapping f : X → M with f⁻¹(B) = E such that f(X) ∈ B and by Proposition 16.8 f is injective and if f is considered as a bijective mapping from X onto A = f(X) then f⁻¹(B|A) = E.

If A is countable then so is X and then E = P(X), since (X, E) is separable. In this case P(X) is the Borel σ-algebra of X considered as a topological space with the discrete topology. But the discrete topology is generated by the discrete metric δ (with δ(x, x) = 0 and δ(x, y) = 1 if x ≠ y) and the metric space (X, δ) is separable and complete. Hence (X, E) is standard Borel.

Suppose then that A = f(X) is uncountable, i.e., A is an uncountable element of B. By Proposition 22.1 (2) there exists an injective B-measurable mapping h : M → M with h(M) = A and by Proposition 22.1 (1) h(B) ∈ B for each B ∈ B. Define g : X → M by letting g(x) = h⁻¹(f(x)) for each x ∈ X; thus g is bijective, being the composition of the bijective mappings f : X → A and h⁻¹ : A → M.

Let B ∈ B; then g⁻¹(B) = f⁻¹(h(B)) ∈ E, since h(B) ∈ B, and so g⁻¹(B) ⊂ E. On the other hand, for each E ∈ E there exists B ∈ B with f⁻¹(B) = E and then h⁻¹(B) ∈ B with g⁻¹(h⁻¹(B)) = E. This shows that g⁻¹(B) = E.

We now have a bijective mapping g : X → M with g⁻¹(B) = E and the mapping g can be used to pull the metric on M back to a metric on X; then g becomes a homeomorphism between the metric spaces X and M. Thus X is a separable complete metric space with respect to this metric and E = g⁻¹(B) is the Borel σ-algebra. This shows that (X, E) is standard Borel.
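Pulling a metric back along a bijection is a purely formal step, and a small Python sketch may make it concrete. Everything in it is an assumption made for illustration: the finite set of 0-1 words standing in for M, the weighted metric d_M, the set X and the bijection g are hypothetical and are not the objects used in these notes.

```python
from itertools import product

# Hypothetical stand-in for M: all 0-1 words of length 3.
M = list(product([0, 1], repeat=3))

def d_M(z, w):
    """Assumed metric on the stand-in: a difference in coordinate n costs 2^-(n+1)."""
    return sum(abs(a - b) / 2 ** (n + 1) for n, (a, b) in enumerate(zip(z, w)))

# Hypothetical set X and bijection g : X -> M, stored as a dictionary.
X = list("abcdefgh")
g = dict(zip(X, M))

def d_X(x, y):
    """The metric on X obtained by pulling d_M back along g."""
    return d_M(g[x], g[y])

# Brute-force check of the metric axioms for d_X.
is_metric = all(
    (d_X(x, y) == 0) == (x == y)
    and d_X(x, y) == d_X(y, x)
    and d_X(x, z) <= d_X(x, y) + d_X(y, z)
    for x in X for y in X for z in X
)
```

By construction d_X(x, y) = d_M(g(x), g(y)), so g preserves distances; injectivity of g is what guarantees that d_X(x, y) = 0 only if x = y.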

Proposition 22.1 can be exploited to give a lot more information about standard Borel and substandard Borel spaces. A couple of these results are given below.

Note that if (X, E) is a countably generated measurable space and f : X → M is a mapping with f⁻¹(B) = E then by Lemma 16.3 f(X) has the same cardinality as A(E). In particular, the cardinality of f(X) does not depend on f.

Lemma 22.1 Let (X, E) be countably generated and let f₁, f₂ : X → M be mappings with f₁⁻¹(B) = f₂⁻¹(B) = E; put A₁ = f₁(X) and A₂ = f₂(X). Then there exists a bijective mapping v : A₁ → A₂ with v⁻¹(B|A₂) = B|A₁.

Proof Let z ∈ A₁; then by Lemma 16.3 f₁⁻¹({z}) ∈ A(E) and so f₂(x₁) = f₂(x₂) for all x₁, x₂ ∈ f₁⁻¹({z}), since f₂⁻¹({w}) ∈ E for each w ∈ M. There thus exists a unique mapping v : A₁ → M such that v ◦ f₁ = f₂, and since v(A₁) = f₂(X) = A₂ we can consider v as a surjective mapping from A₁ onto A₂. Reversing the roles of f₁ and f₂ there is also a unique surjective mapping u : A₂ → A₁ with u ◦ f₂ = f₁ which implies that v is injective (since u ◦ v ◦ f₁ = f₁). This gives us a bijective mapping v : A₁ → A₂ and u : A₂ → A₁ must be the inverse mapping to v.

Now let B₂ ∈ B|A₂ and so B₂ = B ∩ A₂ for some B ∈ B. Then, since f₁⁻¹(B) = E, there exists B₁ ∈ B with f₁⁻¹(B₁) = f₂⁻¹(B) = f₂⁻¹(B₂). It follows that

v⁻¹(B₂) = f₁(f₁⁻¹(v⁻¹(B₂))) = f₁(f₂⁻¹(B₂))

= f₁(f₁⁻¹(B₁)) = f₁(X) ∩ B₁ = A₁ ∩ B₁

(since f₁ : X → A₁ is surjective) and therefore v⁻¹(B₂) ∈ B|A₁. This shows that v⁻¹(B|A₂) ⊂ B|A₁, and reversing the roles of f₁ and f₂ gives u⁻¹(B|A₁) ⊂ B|A₂, which together implies that v⁻¹(B|A₂) = B|A₁.
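The construction of v in the first part of the proof is a factorisation through fibres, and it can be tried out on a finite example. The Python sketch below is only an illustration: the set X, the two mappings f1 and f2 and the helper factor are hypothetical, and the only property used is that f1 and f2 have the same fibres (which is what f₁⁻¹(B) = f₂⁻¹(B) = E provides in the lemma).

```python
# Hypothetical finite example: f1 and f2 induce the same partition of X.
X = range(6)

def f1(x):
    return x // 2          # values 0, 1, 2

def f2(x):
    return 10 - (x // 2)   # values 10, 9, 8

def factor(f_a, f_b, points):
    """Return the mapping v (as a dict) with v[f_a(x)] == f_b(x) for all x.

    Raises ValueError if f_b is not constant on some fibre of f_a,
    in which case no such mapping exists.
    """
    v = {}
    for x in points:
        z, w = f_a(x), f_b(x)
        if v.setdefault(z, w) != w:
            raise ValueError("f_b is not constant on a fibre of f_a")
    return v

v = factor(f1, f2, X)   # v o f1 = f2
u = factor(f2, f1, X)   # u o f2 = f1
```

Here u turns out to be the inverse of v, in line with the argument that u ◦ v ◦ f₁ = f₁ forces v to be injective.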

Proposition 22.3 Let (X, E) be a substandard Borel space. Then f(X) ∈ B for every mapping f : X → M with f⁻¹(B) = E.

Proof Let f₁, f₂ : X → M with f₁⁻¹(B) = f₂⁻¹(B) = E, put A₁ = f₁(X) and A₂ = f₂(X) and suppose A₁ ∈ B. Then by Lemma 22.1 there exists a bijective mapping v : A₁ → A₂ with v⁻¹(B|A₂) = B|A₁. In particular, if A₁ is countable then so is A₂ and in this case A₂ ∈ B, since B contains every countable subset of M. We can thus assume that A₁ is uncountable and then by Proposition 22.1 (2) there exists an injective B-measurable mapping h : M → M with h(M) = A₁. Define a mapping g : M → M by letting g(z) = v(h(z)) for each z ∈ M; thus g is injective and g(M) = A₂. Now consider B ∈ B; then A₂ ∩ B ∈ B|A₂, hence v⁻¹(A₂ ∩ B) ∈ B|A₁ and so v⁻¹(A₂ ∩ B) = A₁ ∩ B′ for some B′ ∈ B. But this implies that g⁻¹(B) = h⁻¹(v⁻¹(A₂ ∩ B)) = h⁻¹(A₁ ∩ B′) ∈ B, since A₁ ∩ B′ ∈ B and h is B-measurable, and therefore g is B-measurable. Thus by Proposition 22.1 (1) A₂ = g(M) ∈ B. Finally, since (X, E) is substandard Borel there exists a mapping f : X → M with f⁻¹(B) = E and f(X) ∈ B and so the above shows that f(X) ∈ B for every mapping f : X → M with f⁻¹(B) = E.

Proposition 22.4 Let (X, E) be a substandard Borel space.

(1) If A(E) is uncountable then there exists a surjective mapping f : X → M with f⁻¹(B) = E.

(2) If A(E) is countable then there exists a countable set N and a surjective mapping f : X → N with f⁻¹(P(N)) = E.

Proof Since (X, E) is substandard Borel there exists a mapping g : X → M with g⁻¹(B) = E such that g(X) ∈ B.

(1) Let A = g(X); then A is uncountable since, as was already noted, it has the same cardinality as A(E). By Proposition 22.1 (2) there thus exists an injective B-measurable mapping h : M → M with h(M) = A and then by Proposition 22.1 (1) h(B) ∈ B for each B ∈ B. Define f : X → M by letting f(x) = h⁻¹(g(x)) for each x ∈ X; thus f is surjective.

Let B ∈ B; then f⁻¹(B) = g⁻¹(h(B)) ∈ E, since h(B) ∈ B, and so f⁻¹(B) ⊂ E. On the other hand, for each E ∈ E there exists B ∈ B with g⁻¹(B) = E and then h⁻¹(B) ∈ B with f⁻¹(h⁻¹(B)) = E. This shows that f⁻¹(B) = E.

(2) Here N = g(X) is countable. Thus f = g : X → N is surjective and it now follows immediately from Lemma 16.3 that f⁻¹(P(N)) = E.

Proposition 22.5 If (X, E) is a standard Borel space with X uncountable then there exists a bijective mapping f : X → M with f⁻¹(B) = E. The measurable spaces (X, E) and (M, B) are therefore isomorphic.

Proof Since (X, E) is separable A(E) has the same cardinality as X and is thus uncountable. By Propositions 22.2 and 22.4 (1) there therefore exists a surjective mapping f : X → M with f⁻¹(B) = E. But by Proposition 16.8 f is then injective and hence bijective.

We have treated the Kolmogorov extension property as well as the existence of conditional distributions just using substandard Borel spaces. The common characteristic of these two examples is that they involve constructing measures, and such problems seem to be relatively simple. However, there are plenty of more difficult problems which cannot be dealt with directly using substandard Borel spaces. A typical example concerns the existence of measurable selectors:

Let (X, E) and (Y, F) be measurable spaces and h : X → Y be a surjective mapping with h⁻¹(F) ⊂ E. By the axiom of choice there then exists a selector for h, i.e., a mapping ϕ : Y → X such that h ◦ ϕ = id_Y. If in addition ϕ⁻¹(E) ⊂ F then ϕ is called a measurable selector. Unfortunately, measurable selectors do not always exist, even when (X, E) and (Y, F) are standard Borel spaces: Let I = [0, 1] and let p₁ : I × I → I be the projection onto the first component. Then there exists a Borel subset A of I × I with p₁(A) = I for which there does not exist a Borel measurable mapping g : I → I × I with g(I) ⊂ A such that p₁(g(x)) = x for all x ∈ I. (See, for example, Blackwell [1].) However, so-called universally measurable selectors exist:

Proposition 22.6 Let (X, E) and (Y, F) be standard Borel and let h : X → Y be a surjective mapping with h⁻¹(F) ⊂ E. Then there exists a selector ϕ : Y → X for h with ϕ⁻¹(E) ⊂ F∗. Here F∗ is the intersection of all σ-algebras F_µ with µ a finite measure on F and with F_µ the completion of the σ-algebra F with respect to µ. (F∗ is called the σ-algebra of universally measurable sets.)

Proof Proofs of equivalent results can be found in Cohn [3], Theorem 8.5.3, and Dynkin and Yushkevich [6], Appendix 3.

We will show how to reduce the proof of Proposition 22.6 to a similar statement involving (M, B). However, this time it doesn’t help and we end up with a task which is no easier than the original one.

Let (X, E), (Y, F) and h : X → Y be as in the statement of Proposition 22.6, and as usual consider mappings f : X → M and g : Y → M such that f⁻¹(B) = E and g⁻¹(B) = F with A = f(X) and B = g(Y) both in B.

Let θ : X → M × M be the mapping with θ = (g ◦ h, f). By Lemma 19.2 (applied to f and g ◦ h) it follows that θ⁻¹(B × B) = E and θ(X) ∈ B × B.

Let D = θ(X) and p₁ : M × M → M be the projection onto the first component. Then p₁ ◦ θ = g ◦ h and p₁(D) = B, since h is surjective.

By Proposition 16.8 θ is injective; let q : D → X be the inverse mapping. It then follows (again by Proposition 16.8) that q⁻¹(E) = (B × B)|D ⊂ B × B.

Suppose now there exists a mapping ψ : B → M × M with ψ(B) ⊂ D such that p₁ ◦ ψ = id_B. Define a mapping ϕ : Y → X by ϕ = q ◦ ψ ◦ g. Then

g ◦ h ◦ ϕ = g ◦ h ◦ q ◦ ψ ◦ g = p₁ ◦ θ ◦ q ◦ ψ ◦ g = p₁ ◦ ψ ◦ g = id_B ◦ g = g

and hence h ◦ ϕ = id_Y, since by Proposition 16.8 g is injective. Suppose in addition ψ⁻¹(B × B) ⊂ B∗. Then ϕ⁻¹(E) ⊂ F∗, since it is not difficult to see that g⁻¹(B∗) ⊂ F∗. The proof of Proposition 22.6 is therefore reduced to showing that the following holds:

Proposition 22.7 Let D be a non-empty subset of M × M such that D ∈ B × B and B = p₁(D) ∈ B. Then there exists a mapping ψ : B → M × M with ψ(B) ⊂ D and ψ⁻¹(B × B) ⊂ B∗ such that p₁ ◦ ψ = id_B.

Proof As already indicated, in this case the reduction to the statement involving (M, B) does not make the problem any easier. For the proof of Proposition 22.7 the reader will have to look at the references mentioned in Proposition 22.6.

23 The usual diagonal argument

In a couple of places we have used a result which is typically invoked as an appeal to ‘the usual diagonal argument’, and so we here give a precise statement of what this means.

For each non-empty set X denote by Σ(X) the set of all sequences {x_n}_{n≥0} of elements from X, thus Σ(X) is the set of all mappings from N to X. (We index the sequences with N, but it would make no difference if we used N⁺ instead of N.) Denote by Σ.(N) the set of all strictly increasing mappings in Σ(N); the elements of Σ.(N) are called subsequences. Note that if τ ∈ Σ.(N) then τ(n) ≥ n for all n ≥ 0 and that τ_1 ◦ τ_2 ∈ Σ.(N) for all τ_1, τ_2 ∈ Σ.(N).

If s = {x_n}_{n≥0} ∈ Σ(X) is a sequence and τ = {n_j}_{j≥0} ∈ Σ.(N) is a subsequence then there is the sequence s ◦ τ = {x_{n_j}}_{j≥0} ∈ Σ(X) (which is often also called a subsequence).

For s, s′ ∈ Σ(X) we write s ∼ s′ if there exists m ≥ 0 such that s(k) = s′(k) for all k ≥ m. Thus ∼ is an equivalence relation on Σ(X).

Let ∆ be a subset of Σ(X); we say that ∆ is ∼-invariant if s′ ∈ ∆ for all s′ ∈ Σ(X) such that s′ ∼ s for some s ∈ ∆. Moreover, we say that ∆ is closed under subsequences if s ◦ τ ∈ ∆ for all s ∈ ∆, τ ∈ Σ.(N).

Now let ∆_0 and ∆_1 be subsets of Σ(X) with ∆_0 ⊂ ∆_1; we call (∆_0, ∆_1) an admissible pair in Σ(X) if ∆_0 is ∼-invariant and closed under subsequences, ∆_1 is closed under subsequences and if for each s ∈ ∆_1 there exists τ ∈ Σ.(N) such that s ◦ τ ∈ ∆_0.

The prototypical example here is with X = R, with ∆_1 the set of bounded real sequences and with ∆_0 the set of convergent real sequences: This results in an admissible pair in Σ(R), since the Heine-Borel theorem implies that every bounded sequence possesses a convergent subsequence.

Theorem 23.1 For each n ≥ 0 let X_n be a non-empty set, let (∆_0^n, ∆_1^n) be an admissible pair in Σ(X_n) and let s_n ∈ ∆_1^n. Then there exists a subsequence τ ∈ Σ.(N) such that s_n ◦ τ ∈ ∆_0^n for all n ≥ 0.

Proof The following lemma (involving only the set Σ.(N)) is in some sense the real diagonal argument:

Lemma 23.1 Let {τ_n}_{n≥0} be a sequence of elements from Σ.(N) and for each n ≥ 0 put γ_n = τ_0 ◦ ··· ◦ τ_n (and so γ_n ∈ Σ.(N)). Define τ : N → N by letting τ(n) = γ_n(n) for all n ≥ 0. Then τ ∈ Σ.(N). Moreover, for each n ≥ 0 there exists η_n ∈ Σ.(N) such that τ ∼ γ_n ◦ η_n.


Proof The mapping τ : N → N is strictly increasing since

τ(n + 1) = (τ_0 ◦ ··· ◦ τ_{n+1})(n + 1) = (τ_0 ◦ ··· ◦ τ_n)(τ_{n+1}(n + 1))

> (τ_0 ◦ ··· ◦ τ_n)(τ_{n+1}(n)) ≥ (τ_0 ◦ ··· ◦ τ_n)(n) = τ(n)

for all n ≥ 0, and hence τ ∈ Σ.(N). Fix n ≥ 0; then for all m > n

τ(m) = (τ_0 ◦ ··· ◦ τ_m)(m)

= (τ_0 ◦ ··· ◦ τ_n)((τ_{n+1} ◦ ··· ◦ τ_m)(m)) = γ_n((τ_{n+1} ◦ ··· ◦ τ_m)(m));

thus if we define η_n : N → N by

η_n(m) = m if m ≤ n,
η_n(m) = (τ_{n+1} ◦ ··· ◦ τ_m)(m) if m > n,

then τ(m) = (γ_n ◦ η_n)(m) for all m ≥ n, and so τ ∼ γ_n ◦ η_n. But η_n ∈ Σ.(N): If m < n then η_n(m + 1) = m + 1 > m = η_n(m), if m > n then

η_n(m + 1) = (τ_{n+1} ◦ ··· ◦ τ_{m+1})(m + 1) = (τ_{n+1} ◦ ··· ◦ τ_m)(τ_{m+1}(m + 1))

> (τ_{n+1} ◦ ··· ◦ τ_m)(τ_{m+1}(m)) ≥ (τ_{n+1} ◦ ··· ◦ τ_m)(m) = η_n(m)

and finally η_n(n + 1) = τ_{n+1}(n + 1) ≥ n + 1 > n = η_n(n).
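The objects occurring in Lemma 23.1 can be computed directly for a concrete choice of the τ_n. The Python sketch below is only an illustration under an assumption which is not part of the lemma: the particular maps τ_n(k) = k + n + 1 are hypothetical and were chosen just because they are strictly increasing. The function names compose, tau_n, gamma, tau and eta are likewise introduced here for illustration.

```python
def compose(*maps):
    """Return the composition maps[0] o maps[1] o ... o maps[-1]."""
    def composed(k):
        for f in reversed(maps):
            k = f(k)
        return k
    return composed

def tau_n(n):
    """Hypothetical strictly increasing map tau_n(k) = k + n + 1."""
    return lambda k: k + n + 1

def gamma(n):
    """gamma_n = tau_0 o ... o tau_n."""
    return compose(*[tau_n(i) for i in range(n + 1)])

def tau(n):
    """The diagonal subsequence tau(n) = gamma_n(n)."""
    return gamma(n)(n)

def eta(n):
    """eta_n(m) = m for m <= n and (tau_{n+1} o ... o tau_m)(m) for m > n."""
    def eta_n(m):
        if m <= n:
            return m
        return compose(*[tau_n(i) for i in range(n + 1, m + 1)])(m)
    return eta_n
```

With this choice γ_n(k) = k + (n + 1)(n + 2)/2, so the first values of τ are 1, 4, 8, 13. The identity τ(m) = (γ_n ◦ η_n)(m) for m ≥ n can then be checked numerically for small n and m.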

Now to the proof of Theorem 23.1. By induction we define a sequence {τ_n}_{n≥0} of elements from Σ.(N) such that s_n ◦ τ_0 ◦ ··· ◦ τ_n ∈ ∆_0^n for each n ≥ 0: To start with there exists τ_0 ∈ Σ.(N) so that s_0 ◦ τ_0 ∈ ∆_0^0, since s_0 ∈ ∆_1^0. Now let n ≥ 0 and suppose that there exist τ_0, ..., τ_n ∈ Σ.(N) such that s_k ◦ τ_0 ◦ ··· ◦ τ_k ∈ ∆_0^k for each k = 0, ..., n. Then s_{n+1} ◦ τ_0 ◦ ··· ◦ τ_n ∈ ∆_1^{n+1}, since ∆_1^{n+1} is closed under subsequences, and hence there exists τ_{n+1} ∈ Σ.(N) such that s_{n+1} ◦ τ_0 ◦ ··· ◦ τ_{n+1} ∈ ∆_0^{n+1}.

For n ≥ 0 put γ_n = τ_0 ◦ ··· ◦ τ_n. Then by Lemma 23.1 there exists τ ∈ Σ.(N) and for each n ≥ 0 a subsequence η_n ∈ Σ.(N) such that τ ∼ γ_n ◦ η_n. Now since ∆_0^n is closed under subsequences and s_n ◦ γ_n = s_n ◦ τ_0 ◦ ··· ◦ τ_n ∈ ∆_0^n it follows that s_n ◦ γ_n ◦ η_n ∈ ∆_0^n. But s_n ◦ τ ∼ s_n ◦ γ_n ◦ η_n, since τ ∼ γ_n ◦ η_n, and therefore s_n ◦ τ ∈ ∆_0^n, since ∆_0^n is ∼-invariant.
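A toy instance of Theorem 23.1 shows why ∼-invariance is needed. The Python sketch below rests on assumptions made only for illustration: X_n = {0, 1} for every n, ∆_1^n is the set of all 0-1 valued sequences, ∆_0^n is the set of eventually constant ones (this is an admissible pair, since every 0-1 valued sequence has a constant subsequence), and s_n(k) is the n-th binary digit of k. With these choices τ_n(j) = 2j is a valid choice at every step of the induction, because s_n(2^{n+1} j) = 0 for all j.

```python
def s(n):
    """s_n(k) = n-th binary digit of k, a 0-1 valued sequence."""
    return lambda k: (k >> n) & 1

def gamma(n, j):
    """gamma_n(j) = (tau_0 o ... o tau_n)(j) with tau_i(j) = 2j, i.e. 2^(n+1) * j."""
    for _ in range(n + 1):
        j = 2 * j
    return j

def tau(m):
    """The diagonal subsequence tau(m) = gamma_m(m)."""
    return gamma(m, m)

def along_tau(n, length):
    """The first `length` terms of the sequence s_n o tau."""
    return [s(n)(tau(m)) for m in range(length)]
```

For example along_tau(2, 6) is [0, 1, 0, 0, 0, 0]: the sequence s_2 ◦ τ is constant only from the index 2 onwards, so it lies in ∆_0^2 because ∆_0^2 is ∼-invariant, not because it is constant.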

References

[1] Blackwell, D. (1968): A Borel set not containing a graph. Ann. Math. Stats., 39, 1345-1347.

[2] Breiman, L. (1968): Probability. Addison-Wesley, Reading.

[3] Cohn, D.L. (1980): Measure Theory. Birkhäuser, Boston.

[4] Doob, J.L. (1953): Stochastic Processes. Wiley, New York.

[5] Dunford, N., Schwartz, J.T. (1958): Linear Operators, Part I. Interscience, New York.

[6] Dynkin, E.B., Yushkevich, A.A. (1979): Controlled Markov Processes. Springer-Verlag, Berlin.

[7] Garsia, A.M. (1970): Topics in Almost Everywhere Convergence. Markham, Chicago.

[8] Halmos, P.R. (1974): Measure Theory. Springer.

[9] Isaac, R. (1965): A proof of the martingale convergence theorem. Proc. Am. Math. Soc., 16, 842-844.

[10] Kingman, J., Taylor, S.J. (1966): Introduction to Measure and Probability. Cambridge University Press.

[11] Kolmogorov, A.N. (1933): Grundbegriffe der Wahrscheinlichkeitsrechnung. Springer-Verlag, Berlin.

[12] Kuratowski, K. (1966): Topology, Volume 1. Academic Press, New York.

[13] Mackey, G.W. (1957): Borel structure in groups and their duals. Trans. Am. Math. Soc., 85, 134-165.

[14] Meyer, P.A. (1966): Probability and Potentials. Blaisdell, Toronto.

[15] Parthasarathy, K.R. (1967): Probability Measures on Metric Spaces. Academic Press, New York.

[16] Preston, C. (1980): Specifications and their Gibbs States. Current version to be found at: http://www.mathematik.uni-bielefeld.de/~preston/

[17] Taylor, S.J. (1973): Introduction to Measure and Integration. Cambridge University Press. (This is the first half of [10].)


