
Set Theory and the Construction of Numbers

Robert Andersen

Department of Mathematics

University of Wisconsin - Eau Claire


I INTRODUCTION

Modern mathematics is couched in the language and steeped in the theory

of sets. Sets are the heart and soul of mathematics. Sets should be well

understood by any serious student of the subject. What we attempt to do

here is to present the Zermelo Fraenkel Axioms for Set Theory and develop a

model for those axioms. We call this model the number model, as our usual

concept of numbers is a consequence of this model.

There is an initial temptation when developing a theory to define all of the

terms. This temptation is quickly extinguished when we realize its futility.

To define a new term we must define it in terms of previous terms; thus we

are faced with two possibilities, either having circular definitions or having

an infinite regression of definitions. Either case is unsatisfactory. Therefore

we begin with undefined terms, or as I prefer “dictionary” defined terms.

That is, definitions taken directly from a standard dictionary.

A set is a group of persons or things classed or belonging together. We

may paraphrase this as: A set is a collection of objects. However, as we shall

see later, not every collection can be regarded as a set. Collections, either

sets or non-sets, are often referred to as spaces to avoid repetitive rhetoric.

The objects, persons, or things that make up the set we shall call ele-

ments. A single one of these objects is of course an element.

The aggregate of elements is the set. If a particular element is a member

of the aggregate we say it is an element of the set. Let a represent a set and

x one of its elements. We express this fact with the notation x ∈ a, and we

read this as x is an element of a.

In developing an axiomatic discussion of sets it may be tempting to posit


the existence of a set. This is troublesome for the following reason. In any

well defined axiom system the axioms are chosen to be independent. An

axiom is independent of the other axioms if a model can be constructed,

using all other axioms, and replacing the axiom in question with another

that is its denial or implies its denial. An axiom system is independent if all

of its axioms are independent. If our axiom system includes an axiom that

asserts the existence of a set, then to demonstrate its independence, we must

construct a model that assumes the validity of all other axioms, which are

statements about sets, and an axiom that denies the existence of a set. It is

not necessary to posit the existence of a set since examples of concrete sets

are ubiquitous. An examination of my right front pocket reveals a collection

of keys; indeed, a set exists.


II THE AXIOMS

The classical collection of axioms for set theory is the Zermelo Fraenkel

Axioms (ZF Axioms).

The Axiom of Extension

ZF1 Axiom of extension. Two sets are equal if and only if they have

identically the same elements. ∀a∀b(a = b ⇔ ∀x(x ∈ a ⇔ x ∈ b)).

This axiom not only defines the term equal and its associated symbol, =,

but also tells us that it is the elements that uniquely define a set. The set

may however have different descriptions. For example consider those planets

that orbit the sun closer than the earth, and those planets of our solar system

with no natural satellite. A simple check of an almanac reveals that exactly

the same collection of planets satisfies both descriptions.

Let a be any set. Let P (x) be a proposition about an arbitrary element

x from a; that is, for every x ∈ a, P (x) is a statement that is either true or

false. An example of a non-proposition in the variable x is x ∈ b. In this

example b is also an indeterminate. Without specifying b we have no way of

knowing whether any particular element of our initial set a is in b or not.

The Axiom Schema of Specification

ZF2 Axiom schema of specification. For every set a and proposition P (x)

there is a set b that consists of those elements of a where P (x) is true.

∀a∃b∀x(x ∈ b ⇔ (x ∈ a ∧ P(x))).

ZF2 is not regarded as a single axiom but rather as a collection of ax-

ioms, hence the term axiom schema is used. Each possible proposition P (x)


produces a separate axiom. Several of the Zermelo Fraenkel Axioms are in

fact axiom schemas and will be identified as such.

By convention, braces, { and }, are used when representing a set, with the
elements listed between the braces. The pair of braces should be considered
a single symbol; a single brace without its complement is meaningless in
set theory. The braces can be thought of as the purse that holds the coins.
The purse is by no means a member of the collection of coins, and the braces
are not elements of a set. The braces simply mean that the elements listed
between them are to be considered a set. Thus {x, y, z} is the set that consists of
the letters x, y, z (the commas, of course, are punctuation).

When we use an axiom of specification we express the set b (of ZF2) by
{x ∈ a | P(x)}, which is read: the set of all x in a such that P(x). The set a

is called the Universal Set and P (x) is the proposition that must be satisfied

for an element to be included in the set b.

Let a be an arbitrary set, for example let a be the collection of nations

in the United Nations. Now let P (x) be the statement: x is not x, e.g.,

Canada is not Canada. At first glance this statement may seem absurd, but

in reality it is not absurd but simply false for every element of our set. Using

our notation we write the set specified as:

{x ∈ a | x ≠ x}.

By virtue of the axiom schema of specification this set exists, but no

element of a satisfies the proposition x ≠ x. Hence we must conclude that

the set is vacuous and we call it the Empty or Null Set. The empty set is not

just an interesting or pathological footnote to the axiom of specification, but

is absolutely crucial to the study of sets. In fact we may at this time choose


to disregard any other set (and I choose to do so) and focus our attention

solely on the empty set and other sets that may be derived from it by the

axioms. We reserve the symbols ∅ and { } for the empty set. We may state

this result as a theorem.

Theorem 2.1 There exists a set that has no elements.

Here is the formal proof.

Proof Let ∅ = {x ∈ a | x ≠ x}. Since for any set a no element satisfies the
proposition x ≠ x, we conclude ∅ has no elements.
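For finite collections, the specification schema can be imitated directly. The following is a minimal Python sketch, assuming sets are modeled by `frozenset` and propositions by Python predicates; the helper name `specify` and the sample set of nations are hypothetical illustrations, not part of the text.

```python
def specify(a, P):
    """Return {x in a | P(x)} for a finite set a and a predicate P."""
    return frozenset(x for x in a if P(x))

# A small stand-in for "the collection of nations in the United Nations".
a = frozenset({"Canada", "Chile", "Kenya"})

# The proposition "x is not x" is false for every element, so the result is empty.
empty = specify(a, lambda x: x != x)

# An ordinary proposition selects the elements for which it is true.
starts_with_c = specify(a, lambda x: x.startswith("C"))
```

Whatever finite set is chosen for `a`, the proposition `x != x` yields the same empty `frozenset`, which mirrors the role of ∅ in Theorem 2.1.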

An interesting result known as Russell’s Paradox can now be presented.

Since sets themselves can be regarded as objects, it seems quite reasonable

to consider a collection of sets as a set itself. Let s be the collection of all

sets and consider the following set.

r = {x ∈ s | x ∉ x}.

Now the set r is either an element of itself or it is not. Assume that it is,

i.e. we assume r ∈ r. Then by the defining proposition of the set r we see

that it must be the case that r ∉ r, contrary to the assumption. Thus we
may assume that the other case is true, that r is not an element of itself, i.e.
r ∉ r, but we see that if that is true we must necessarily have r ∈ r. Again

this presents a contradiction. Clearly the collection, s, is much too large or

encompassing to be considered a set. This unfortunate result is handled by

an appropriate axiom yet to be stated. (See ZF9)

At this stage the only sets that we have are real, tangible objects and the

empty set. We wish to present axioms that allow us to expand our collection

of sets to more abstract objects. The next few axioms allow us to construct

new sets from preexisting sets.


The Axiom of Pairing

ZF3 Axiom of pairing. If a and b are sets then there exists a set c such

that a ∈ c and b ∈ c. ∀a∀b∃c(a ∈ c ∧ b ∈ c).

There is an important fact to note here. The axiom specifies two elements

that are in c but says nothing about what else may be in c, in this sense c is

not a well defined set. But by the use of an axiom of specification we may

construct a well defined set. Let c be a set guaranteed to exist by the axiom

of pairing and consider the following set:

d = {x ∈ c | x = a or x = b}.

Clearly d = {a, b}.

Another point to note here is that a and b may be the same set, that is

a = b. In this case we have the following:

d = {a, b} = {a, a} = {a}

The last equation is true by virtue of the axiom of extension. Since

both sets on each side of the equal sign have exactly the same elements the

statement is true.

Using only the empty set, we can now construct a new set. Let a = b = ∅, and we have d = {∅}.

We may continue ad infinitum constructing new sets from previously constructed
sets; for example, let a = ∅ and b = {∅}, and we now have

d′ = {∅, {∅}}.
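The pairing construction above can be tried out on hereditarily finite sets. This is a small Python sketch, assuming sets are modeled by nested `frozenset` values; the names `EMPTY` and `pair` are hypothetical helpers introduced only for illustration.

```python
# Model pure sets as nested frozensets; EMPTY plays the role of the empty set.
EMPTY = frozenset()

def pair(a, b):
    """Return the set {a, b}; when a == b this collapses to the singleton {a}."""
    return frozenset({a, b})

d = pair(EMPTY, EMPTY)      # the set containing only the empty set
d_prime = pair(EMPTY, d)    # the set containing the empty set and d
```

Because a `frozenset` ignores repeated members, `pair(a, a)` is automatically the singleton, which matches the appeal to the axiom of extension in the text.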


The Axiom of Unions

ZF4 Axiom of unions. For every set c there is a set a where, if b ∈ c

and x ∈ b, then x ∈ a. ∀c∃a∀b∀x((x ∈ b ∧ b ∈ c) ⇒ x ∈ a).

Here we regard the set c as a collection of sets. We form a new set by

including every element of each of the member sets of c. If c is not a collection

of sets then the statement is vacuously true and a is possibly empty. Again a

is not a well defined set, the axiom does not exclude other possible elements

that do not satisfy our stated condition. An application of the axiom of

specification (ZF2) will form a set that contains only those elements that are

contained in the sets of c.

Let a′ represent that set whose elements consist of only those elements of

a that are elements of members of c. We write

a′ = ⋃_{b∈c} b, or simply a′ = ⋃c,

which is read: a prime is the union of c.

If c consists of only two elements, b1, b2, then we write b1 ∪ b2.

A closely related concept to the union of sets is the intersection of sets.

Let c be a collection of sets. We define the intersection of this collection of
sets by

⋂_{b∈c} b = {x ∈ ⋃c | x ∈ b for every b in c}.

The reader may question why we choose the Union of c for our Universal

set when any element of c would suffice. By choosing the Union we eliminate

the problem of devising some mechanism to choose some element of c to be

our Universal set.
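The definitions of union and intersection of a collection translate directly to finite sets. The sketch below is in Python and assumes a collection is a `frozenset` of `frozenset` values; the function names `big_union` and `big_intersection` are hypothetical. As in the text, the intersection is specified inside the union, so no particular member of c has to be singled out as the universal set.

```python
def big_union(c):
    """Union of a collection c of frozensets: every element of every member."""
    return frozenset(x for b in c for x in b)

def big_intersection(c):
    """Elements of the union of c that belong to every member b of c."""
    return frozenset(x for x in big_union(c) if all(x in b for b in c))

c = frozenset({
    frozenset({1, 2, 3}),
    frozenset({2, 3, 4}),
    frozenset({3, 5}),
})
u = big_union(c)
i = big_intersection(c)
```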


If we are constructing the intersection of only two sets, b1, b2, then we

write b1 ∩ b2.

Definition We say the sets a and b are disjoint, if a ∩ b = ∅.

Definition For two sets a and b we say a is a subset of b, written a ⊂ b,

if and only if every element of a is an element of b; i.e. a ⊂ b ⇔ ∀x(x ∈ a ⇒ x ∈ b).

Theorem 2.2 a = b if and only if a ⊂ b and b ⊂ a.

This theorem follows immediately from the definition of subset and the

axiom of extension (ZF1).

Note that it is vacuously true that ∅ ⊂ a for every set a.

If a ⊂ b and a ≠ b, we say that a is a proper subset of b, and we write
a ⊊ b.

Definition For sets a and b, the complement of b in a, written
a − b, is the set

a − b = {x ∈ a | x ∉ b}.

Theorem 2.3 De Morgan laws If c is a set and b is a collection of subsets
of c, then

c − ⋃_{a∈b} a = ⋂_{a∈b} (c − a) and c − ⋂_{a∈b} a = ⋃_{a∈b} (c − a).

Proof

[x ∈ c − ⋃_{a∈b} a ⇒ x ∈ c and x ∉ ⋃_{a∈b} a

⇒ x ∈ c and ∀a ∈ b, x ∉ a

⇒ ∀a ∈ b, x ∈ (c − a)

⇒ x ∈ ⋂_{a∈b} (c − a)]

⇒ c − ⋃_{a∈b} a ⊂ ⋂_{a∈b} (c − a)


On the other hand

[x ∈ ⋂_{a∈b} (c − a) ⇒ ∀a ∈ b, x ∈ (c − a)

⇒ x ∈ c and ∀a ∈ b, x ∉ a

⇒ x ∈ c and x ∉ ⋃_{a∈b} a

⇒ x ∈ c − ⋃_{a∈b} a]

⇒ ⋂_{a∈b} (c − a) ⊂ c − ⋃_{a∈b} a

By Theorem 2.2 we have the result

c − ⋃_{a∈b} a = ⋂_{a∈b} (c − a).

The proof that

c − ⋂_{a∈b} a = ⋃_{a∈b} (c − a)

is similar and left to the reader.
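Both De Morgan laws can be spot-checked on a small finite example. The following Python sketch is only a sanity check under the assumption that c and the members of b are finite `frozenset` values; it is not a substitute for the proof, and the helper names and the sample sets are hypothetical.

```python
def big_union(family):
    """Union of a family of frozensets."""
    return frozenset(x for a in family for x in a)

def big_intersection(family):
    """Elements of the union of the family that lie in every member."""
    universe = big_union(family)
    return frozenset(x for x in universe if all(x in a for a in family))

c = frozenset(range(6))
b = [frozenset({0, 1, 2}), frozenset({1, 2, 3}), frozenset({2, 4})]

# First law: complement of the union equals intersection of the complements.
lhs1 = c - big_union(b)
rhs1 = big_intersection([c - a for a in b])

# Second law: complement of the intersection equals union of the complements.
lhs2 = c - big_intersection(b)
rhs2 = big_union([c - a for a in b])
```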

The Axiom of Power Sets

ZF5 Axiom of power sets. For any set a there is a set b such that if

x ⊂ a, then x ∈ b. ∀a∃b∀x(x ⊂ a ⇒ x ∈ b).

Again we must apply an axiom of specification (ZF2) to construct a well

defined set whose elements are only the subsets of a. This set is called the

power set of a and is written P(a).

By virtue of the axiom of pairing (ZF3), for any set a we may construct
the singleton set {a}. Now by virtue of the axiom of unions (ZF4) we may
construct the set a ∪ {a}. This set is known as the successor of a and is
written a + 1. We formally define the successor by:

Definition For any set a the successor of a is the set a + 1 given by:

a + 1 = a ∪ {a}.

Examples:

∅ + 1 = ∅ ∪ {∅} = {∅}

∅ + 1 + 1 = {∅} ∪ {{∅}} = {∅, {∅}}

For convenience we name these sets: ∅ = 0, {∅} = 1, {∅, {∅}} = 2.

Thus we see:

{ } = 0

{0} = 1

{0, 1} = 2

⋮

{0, 1, …, n − 1} = n

⋮

Remark: The notation a + 1 + 1 + ··· + 1 is unambiguous, as its only
interpretation is (…((a + 1) + 1) + ··· + 1). To associate differently, e.g.
a + (1 + 1 + ··· + 1), has no meaning. The notation a + n refers to the nth
successor of a; that is, a + n = a + 1 + 1 + ··· + 1, with n summands equal to 1.
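The successor operation and the naming of the sets 0, 1, 2, … can be reproduced for small n. This is a minimal Python sketch, assuming pure sets are modeled by nested `frozenset` values; `successor` and `numeral` are hypothetical helper names.

```python
EMPTY = frozenset()

def successor(a):
    """Return a + 1, the union of a with the singleton containing a."""
    return a | frozenset({a})

def numeral(n):
    """Return the set named n, built by applying the successor n times to the empty set."""
    a = EMPTY
    for _ in range(n):
        a = successor(a)
    return a

three = numeral(3)
```

In this model `numeral(n)` has exactly n elements, namely `numeral(0)` through `numeral(n - 1)`, in agreement with the list above.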


The Axiom of Infinity

The following axiom is a powerful statement that allows for the construction

of arbitrarily large sets and allows us to regard unbounded classes of numbers

as sets.

ZF6 Axiom of infinity. There exists a set, a, that contains ∅, and the

successor of each of its elements. ∃a(∅ ∈ a ∧ ∀x(x ∈ a ⇒ x + 1 ∈ a)).

A set that satisfies the axiom of infinity is called a successor set.

Theorem 2.4 If a and b are successor sets, then a ∩ b is a successor set.

Proof (a, b successor sets) ⇒ (∅ ∈ a and ∅ ∈ b) ⇒ (∅ ∈ a ∩ b). (α ∈ a ∩ b) ⇒ (α ∈ a and α ∈ b) ⇒ (α + 1 ∈ a and α + 1 ∈ b) ⇒ (α + 1 ∈ a ∩ b).

We generalize this theorem to include arbitrary intersections and hence

the above theorem becomes a corollary to the following theorem.

Theorem 2.5 If A is an arbitrary collection of successor sets, then ⋂A is a
successor set.

Proof Let A be a collection of successor sets. ∅ ∈ a ∀a ∈ A ⇒ ∅ ∈ ⋂_{a∈A} a.
α ∈ ⋂_{a∈A} a ⇒ α ∈ a ∀a ∈ A ⇒ α + 1 ∈ a ∀a ∈ A ⇒ α + 1 ∈ ⋂_{a∈A} a.

Let Ω be a successor set, and let A = {a ∈ P(Ω) | a is a successor set}; that is, A is the collection of all successor subsets of Ω. We let ω = ⋂A.

We notice ω is unique regardless of the initial choice of Ω since if we

let Ω1 and Ω2 be two successor sets we have Ω1 ∩ Ω2 is a successor set and

Ω1 ∩ Ω2 ⊂ Ω1 and Ω1 ∩ Ω2 ⊂ Ω2.


We also see ω is the minimal successor set. Thus we see ω = {0, 1, 2, …}.

We can now construct the successors of ω: ω + 1 = {0, 1, 2, …, ω} and

ω + 1 + 1 = {0, 1, 2, …, ω, ω + 1}.

We name ω + 1 + 1 = ω + 2 and ω + 1 + ··· + 1 (with n summands equal to 1) = ω + n.

It is important to observe here that the successor of a set is not a successor

set.

The Ordering of Sets, Cartesian Products and Functions

Definition We say that a is less than b, written a < b, iff a ∈ b.

We should note here that sets and elements are somewhat synonymous,

they differ only in their relationship to each other.

Now we wish to deal with the situation where we have a set with two

or more elements and we wish to designate one element as the first element,

and another as the second and so forth. Let us begin with the easiest case,

a set with two elements.

Let x and y be elements. By virtue of the axiom of pairing (ZF3) we can
construct the two sets {x, y} and {x}. Again using the axiom of pairing we
construct the set {{x}, {x, y}}. For simplicity of notation we use (x, y) to
represent the set {{x}, {x, y}}.

Definition The ordered pair, (x, y), is the set {{x}, {x, y}}.

We regard the ordered triple (x, y, z) as the ordered pair

((x, y), z) = {{{{x}, {x, y}}}, {{{x}, {x, y}}, z}}.


Thus we may inductively define the ordered n-tuple by

(x_1, x_2, …, x_n) = ((x_1, x_2, …, x_{n−1}), x_n).
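The set-theoretic ordered pair and the inductive n-tuple can be built explicitly for small examples. The Python sketch below assumes elements are hashable values and sets are `frozenset` values; `kpair` and `ktuple` are hypothetical names for the constructions just defined.

```python
def kpair(x, y):
    """Ordered pair (x, y) as the set containing {x} and {x, y}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def ktuple(*xs):
    """Ordered n-tuple (n >= 2), defined inductively as ((x1, ..., x_{n-1}), xn)."""
    result = kpair(xs[0], xs[1])
    for x in xs[2:]:
        result = kpair(result, x)
    return result

p = kpair(1, 2)
q = kpair(2, 1)
t = ktuple(1, 2, 3)
```

Unlike the unordered set {1, 2}, the values `p` and `q` differ, which is the whole point of the construction; the degenerate pair `kpair(x, x)` collapses to a set with the single member {x}.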

We now want to consider certain subsets of the collection of all ordered

pairs formed from two sets.

Definition The Cartesian product of two sets a and b is

a × b = {z ∈ P(P(a ∪ b)) | z = {{x}, {x, y}} where x ∈ a, y ∈ b}.

For simplicity of notation we write

a × b = {(x, y) | x ∈ a, y ∈ b}.

Definition A function from a set a to a set b is a subset, f , of a× b that

satisfies the following two conditions:

1. ∀x ∈ a ∃y ∈ b such that (x, y) ∈ f, and

2. if (x, y) ∈ f and (x, z) ∈ f then y = z.

To work with functions efficiently, it helps to name the related sets. The

set a is called the domain of f, and b is called the codomain of f. The
range of f is the set {y ∈ b | (x, y) ∈ f for some x ∈ a}. If c ⊂ a, then the image of c under
f is the set {y ∈ b | (x, y) ∈ f for some x ∈ c}. If d ⊂ b, then the preimage
of d under f is the set {x ∈ a | (x, y) ∈ f for some y ∈ d}. We write f(c) for the
image of c under f, and f⁻¹(d) for the preimage of d under f. Also, if the

range of a function f is equal to its codomain, we say the function is onto

its codomain.
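The definition of a function as a set of ordered pairs, together with range, image and preimage, can be exercised on a finite example. In this Python sketch, ordered pairs are represented by Python tuples rather than by the set-theoretic pair, purely for readability; the names `is_function`, `image`, `preimage` and the sample sets are hypothetical.

```python
def is_function(f, a, b):
    """Check that f is a subset of a x b pairing each x in a with exactly one y in b."""
    if not all(x in a and y in b for (x, y) in f):
        return False
    return all(sum(1 for (u, _) in f if u == x) == 1 for x in a)

def image(f, c):
    """Image of c under f: all y with (x, y) in f for some x in c."""
    return {y for (x, y) in f if x in c}

def preimage(f, d):
    """Preimage of d under f: all x with (x, y) in f for some y in d."""
    return {x for (x, y) in f if y in d}

a = {1, 2, 3}
b = {"p", "q", "r"}
f = {(1, "p"), (2, "p"), (3, "q")}
range_of_f = image(f, a)   # the range is the image of the whole domain
```

Here the range is a proper subset of the codomain, so this particular f is not onto its codomain.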


We express a function, f , with domain a and codomain b by f : a → b.

Also for a function f , if (x, y) ∈ f , then we express y as f(x). We may write

y = f(x), or (x, f(x)) to represent the element (x, y) of f .

We wish to generalize the cartesian product to arbitrary collections of

sets. To do so we introduce the concept of indexing one set by another.

Definition A function I from a set Λ onto a set a is said to index the set

a by Λ. The set Λ is called the index set and a is the indexed set. If I(λ) = b,
then we write b_λ for b.

Definition Let a be a non-empty set (we remind the reader here, that the

elements of sets are considered to be sets) indexed by a set Λ. The cartesian

product of a is defined to be the collection, Π, of all functions f with domain
Λ and codomain ⋃_{b∈a} b, satisfying the condition f(λ) ∈ b_λ for every λ ∈ Λ. We write

c = ∏_{λ∈Λ} b_λ.

For clarity we present some examples here.

Let a = {{1, 2}, {2, 3}}. Then we have the four functions

f1 = {({1, 2}, 1), ({2, 3}, 2)}

f2 = {({1, 2}, 1), ({2, 3}, 3)}

f3 = {({1, 2}, 2), ({2, 3}, 2)}

f4 = {({1, 2}, 2), ({2, 3}, 3)}

This notation is cumbersome but we can represent these functions by the

following ordered pairs without loss of any information.

f1 = (1, 2), f2 = (1, 3), f3 = (2, 2), f4 = (2, 3).


Hence the cartesian product is

∏ = {(1, 2), (1, 3), (2, 2), (2, 3)}.

In this example where there are only two elements in a we may write

{1, 2} × {2, 3} = {(1, 2), (1, 3), (2, 2), (2, 3)}.

We leave as an exercise for the reader to demonstrate that the cartesian

product of the following collection

a = {{1, 2}, {2, 3}, {3, 4}}

can be represented by

∏_{λ∈3} α_λ = {(1, 2, 3), (1, 2, 4), (1, 3, 3), (1, 3, 4), (2, 2, 3), (2, 2, 4), (2, 3, 3), (2, 3, 4)}.
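For a finite indexed family, the cartesian product as a collection of functions can be enumerated mechanically. The following Python sketch assumes the family is given as a dictionary from indices to finite sets, with the index set 3 = {0, 1, 2}; the name `choice_functions` is hypothetical, and `itertools.product` from the standard library does the enumeration.

```python
from itertools import product

def choice_functions(family):
    """Yield every function f on the index set with f(i) in family[i], as a dict."""
    indices = list(family)
    for values in product(*(sorted(family[i]) for i in indices)):
        yield dict(zip(indices, values))

# The three-member collection of the exercise, indexed by 0, 1, 2.
family = {0: {1, 2}, 1: {2, 3}, 2: {3, 4}}

# Represent each function by the tuple of its values, as the text does.
tuples = sorted(tuple(f[i] for i in sorted(family)) for f in choice_functions(family))
```

If any member of the family is empty, the generator yields nothing, so the product is empty; for finite families of nonempty sets it is always nonempty without any appeal to an axiom.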

Let a = {{0, 1}} be indexed by ω, so that every factor is {0, 1}. The cartesian product of a with respect

to this indexing can be represented by the set of infinite strings of zeros and

ones. That is

∏_{i∈ω} {0, 1}_i = {(b_0, b_1, …) | b_i = 0 or 1 ∀i ∈ ω}.

Any function, f , that satisfies the conditions in the previous definition

is called a choice function. The rationale for this name is clear, as the

function chooses an element from each set.

Let a be a set and c = ∏_{λ∈Λ} b_λ. The projection map p_{b_λ} : c → b_λ is the
function defined by p_{b_λ}(x) = x(λ).

For our original example we may compute

p_{{1, 2}}(f1) = f1({1, 2}) = 1

and

p_{{1, 2}}(f3) = f3({1, 2}) = 2.

We leave as an exercise to compute p_{{1, 2}}(f2) and p_{{1, 2}}(f4).

The Axiom of Choice

The next axiom is known as the Axiom of Choice. We give three formulations

of the statement of this axiom.

ZF7 Axiom of Choice.

I. For every nonempty set whose elements are nonempty sets there

exists a choice function.

II. If {a_i} is a family of nonempty sets, indexed by a nonempty set I,
then there exists a family {x_i} with i ∈ I such that x_i ∈ a_i for each
i ∈ I.

III. The cartesian product of a nonempty collection of nonempty sets is
nonempty. ∀a((a ≠ ∅ ∧ ∀x(x ∈ a ⇒ x ≠ ∅)) ⇒ ∏_{x∈a} x ≠ ∅).

Theorem 2.6 The three previous statements are equivalent.

Proof I. ⇒ II. Let A be a collection of disjoint nonempty sets. We have
A ⊂ P(⋃_{i∈I} a_i). By I. there exists a choice function f on P(⋃_{i∈I} a_i). Let b be
the image of A. Pick an element a ∈ A; then f(a) ∈ a ∩ b since f(a) ∈ a. Let y ∈ b
where y ≠ f(a); thus we have y = f(a′) where a′ ≠ a, and thus y ∈ a′. Since
a and a′ are disjoint, y ∉ a. Thus the only element of b ∩ a is f(a).

II. ⇒ I. Define the choice function to be {(a_i, x_i) | i ∈ I}.


I ⇐⇒ III. Since the cartesian product is the collection of choice func-

tions, if a choice function exists then the cartesian product is non-empty.

Conversely if the cartesian product is non-empty its elements are choice func-

tions, thus a choice function exists.

The Axiom Schema of Replacement

ZF8 Axiom schema of replacement. If P (x, y) is a proposition such that

for each x in a set a, P (x, y) and P (x, z) implies that y = z, then there exists

a set b such that y ∈ b if and only if there exists an x in a such that P (x, y).

∀x ∈ a ∀y ∀z((P(x, y) ∧ P(x, z)) ⇒ y = z) ⇒ ∃b ∀y(y ∈ b ⇔ ∃x ∈ a P(x, y)).

ZF8 allows the construction of new sets in the following way. Given a set a
and a rule that assigns to elements of a other pre-existing sets, at most one
to each element (these sets may or may not be elements of other sets), there
exists a set b that contains exactly those assigned sets. This axiom schema
will be heavily relied upon in the next chapter.

The Axiom Schema of Restriction

ZF9 Axiom schema of restriction. Let S(x) be any proposition involv-

ing x that does not involve y or z. If there exists an x such that S(x) is true,

then there exists a y such that S(y) is true and, for all z, if z ∈ y then S(z)

is false.

If we take S(x) to be the statement x ∈ a, then we have the following

statement, which we call the axiom of regularity.

Axiom of regularity. Every nonempty set a contains an element b such

that a ∩ b = ∅. ∀a(a ≠ ∅ ⇒ ∃b(b ∈ a ∧ a ∩ b = ∅)).
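The element b promised by the axiom of regularity can be searched for directly in a hereditarily finite set. The Python sketch below assumes pure sets are modeled by nested `frozenset` values, which are well founded by construction, so a witness always exists for nonempty input; the name `regular_witness` is hypothetical.

```python
EMPTY = frozenset()
ONE = frozenset({EMPTY})         # the set named 1
TWO = frozenset({EMPTY, ONE})    # the set named 2

def regular_witness(a):
    """Return some b in a whose intersection with a is empty, or None if there is none."""
    for b in a:
        if not (a & b):
            return b
    return None

# In the set containing 1 and 2, only 1 is disjoint from the set itself,
# because 2 contains 1.
witness = regular_witness(frozenset({ONE, TWO}))
```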


Two important lemmas follow from the axiom of regularity.

Lemma 2.7 For each set a, a ∉ a.

Proof The proof is indirect. We assume that there exists a set a such that
a ∈ a. We thus have a ∈ {a} ∩ a. However, by the axiom of regularity, {a}
contains an element whose intersection with {a} is empty. Since a is the only
element of {a}, we have {a} ∩ a = ∅, which contradicts our original assumption.

Corollary There does not exist a set of all sets.

Proof If there existed a set of all sets it would have to be an element of

itself, which would contradict the previous lemma.

Thus we see the axiom of regularity is the response to Russell’s paradox.

Lemma 2.8 No two sets can be elements of each other.

Proof Again the proof is indirect. Assume that a and b are sets such that
a ∈ b and b ∈ a. We thus have a ∈ {a, b} ∩ b and b ∈ {a, b} ∩ a. By the axiom
of regularity we must have an element x ∈ {a, b} such that x ∩ {a, b} = ∅.
Since our only two choices are a or b, we must have either {a, b} ∩ a = ∅ or
{a, b} ∩ b = ∅, which contradicts our assumption.

These two lemmas can be replaced by a more general theorem (Theorem
3.7) that will be stated and proved in the next chapter. Meanwhile we can

use Lemma 2.8 to prove the following “cancellation” law.

Theorem 2.9 If x + 1 = y + 1, then x = y.

Proof x + 1 = y + 1 ⇒ x ∪ {x} = y ∪ {y} ⇒ either x = y or x ∈ y and y ∈ x. The latter case is a contradiction to Lemma 2.8.

Since the collection of all sets is an object that can be contemplated the

study of collections can be extended to include collections that are not sets.


Collections that may or may not be sets are called classes. It is not within

the scope of this book to study classes but we give the following definition.

Definition A collection that is not a set is called a proper class.

Exercises Prove the following “distributive” laws.

1. a ∪ (⋂_{λ∈Λ} b_λ) = ⋂_{λ∈Λ} (a ∪ b_λ)

2. a ∩ (⋃_{λ∈Λ} b_λ) = ⋃_{λ∈Λ} (a ∩ b_λ)

Also show

3. (a, b) = (c, d) ⇒ a = c and b = d.


III ORDINAL NUMBERS

Order Relations

Definition For any set a, each subset of a× a is called a relation on a.

If R is a relation on a set a and if (x, y) ∈ R then we write xRy.

Definition A relation that satisfies the following conditions:

R xRx ∀x ∈ a

A xRy & yRx ⇒ x = y.

T xRy & yRz ⇒ xRz

is called an order relation, or simply an order.

Condition R is called reflexive, A is called antisymmetric, and T is called

transitive.

Let a be a set. In our context the elements of a are themselves sets.

We see that R = {(x, y) | x ⊂ y} is an order relation on a. We leave the

verification as an exercise for the reader.
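The three conditions R, A and T can be checked by brute force on a small set of sets. This Python sketch is an illustration for one finite example, not a proof of the exercise; it assumes the elements are `frozenset` values, and the names `is_order` and `subset` are hypothetical.

```python
from itertools import product

def is_order(elements, rel):
    """Check reflexivity, antisymmetry and transitivity of rel on a finite collection."""
    reflexive = all(rel(x, x) for x in elements)
    antisymmetric = all(x == y for x, y in product(elements, repeat=2)
                        if rel(x, y) and rel(y, x))
    transitive = all(rel(x, z) for x, y, z in product(elements, repeat=3)
                     if rel(x, y) and rel(y, z))
    return reflexive and antisymmetric and transitive

a = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]

def subset(x, y):
    """The relation x is a subset of y (equality allowed)."""
    return x <= y

subset_is_order = is_order(a, subset)
```

On the same elements, the proper-subset relation fails the check because it is not reflexive, and the relation that holds for every pair fails because it is not antisymmetric.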

With an order relation we use the symbol x ≼ y instead of xRy. If x ≼ y
we say that x precedes y or x is less than or equal to y.

Definition If x ≼ y and y ⋠ x, then we say x is strictly less than (or simply
less than) y, and we write x ≺ y.

A set together with an order is said to be an ordered set. We often refer

to an ordered set as a partially ordered set to differentiate it from linearly

ordered and well ordered sets which we now define.


Definition An ordered set a in which, for every x, y ∈ a, either x ≼ y or y ≼ x

is said to be linearly ordered or totally ordered.

Theorem 3.1 Trichotomy If a is a linearly ordered set and x, y ∈ a, then exactly

one of the following statements is true:

i) x ≺ y. ii) y ≺ x. iii) x = y.

Proof If a is linearly ordered then we have

x ≼ y or y ≼ x ⇒

i) (x ≼ y and y ≼ x) ⇒ x = y or

ii) (x ≼ y and y ⋠ x) ⇒ x ≺ y or

iii) (x ⋠ y and y ≼ x) ⇒ y ≺ x.

We observe that neither x ≼ y and x ⋠ y, nor y ≼ x and y ⋠ x can be
true at the same time, thus the statements are pairwise inconsistent.

If ∃x ∈ a such that x ≼ y ∀y ∈ a, then we say x is the least element in

a.

Definition An ordered set a is said to be well ordered if and only if

whenever b is any nonempty subset of a, then b has a least element.

Theorem 3.2 Every well ordered set is linearly ordered.

Proof Let a be a well ordered set. Let x, y ∈ a. Since a is well ordered,
the subset {x, y} has a least element; thus either x ≼ y or y ≼ x. Hence a is

linearly ordered.

Ordinal Numbers

Definition Let a be a set and let x ∈ a, the section of x with respect to

the set a is

S(x) = {y ∈ a | y ≺ x}.


The weak section of x is

S̄(x) = {y ∈ a | y ≼ x}.

Definition An ordinal number is a well ordered set a where for all x ∈ a,

S(x) = x.

In the last section we constructed the sets

{ } = 0

{0} = 1

{0, 1} = 2

⋮

{0, 1, …, n − 1} = n

⋮

By virtue of the axiom of infinity we may construct the sets

ω = {0, 1, …}

ω + 1 = {0, 1, …, ω}

ω + 2 = ω + 1 + 1 = {0, 1, …, ω, ω + 1}

⋮

We can easily verify that these sets satisfy the definition of ordinal num-

bers where the order relation is x ⊂ y.
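For the finite sets 0, 1, 2, … the defining property S(x) = x can be verified by direct computation. The Python sketch below assumes pure sets are modeled by nested `frozenset` values and uses proper inclusion as the strict order; the names `numeral`, `section` and `is_finite_ordinal` are hypothetical. Since every linearly ordered finite set is well ordered, the check for linearity stands in for well ordering here.

```python
EMPTY = frozenset()

def numeral(n):
    """Return the set named n, obtained from the empty set by n successor steps."""
    a = EMPTY
    for _ in range(n):
        a = a | frozenset({a})
    return a

def section(a, x):
    """S(x): the members y of a that are proper subsets of x."""
    return frozenset(y for y in a if y < x)

def is_finite_ordinal(a):
    """Check that a is linearly ordered by inclusion and S(x) == x for every x in a."""
    linear = all(x <= y or y <= x for x in a for y in a)
    return linear and all(section(a, x) == x for x in a)

checks = [is_finite_ordinal(numeral(n)) for n in range(6)]
```

A set such as the singleton containing 1 fails the test, because the section of its only element is empty while the element itself is not.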

We now want to construct ‘higher order’ ordinal numbers. The axiom of

infinity will not work for us, since it only guarantees the existence of ω. We

now must appeal to the axiom schema of replacement (ZF8) to continue our

constructions.


Let P (x, y) be the proposition: For x ∈ ω, y is the xth successor of ω, i.e.

y = ω + x. Since ordinal successors are unique if z = ω + x we must have

y = z. Thus by the axiom schema of replacement (ZF8) there exists a set b,

such that ω + n ∈ b for every n ∈ ω, and conversely b contains only those

elements. We see that b = {ω, ω + 1, ω + 2, …}. We now construct the union

of ω and b.

ω ∪ b = {0, 1, 2, …, ω, ω + 1, ω + 2, …}

We name this set ω2.

We may repeat this process and construct the set ω3 where P (x, y) is

the proposition: y is the xth successor of ω2. We continue constructing sets

ω4, ω5, · · · . For clarity of discussion we shall refer to ωn as the nth multiple

of ω. We let ω0 = 0 and ω1 = ω.

We now let P (x, y) be the proposition: y is the xth multiple of ω. Since

successors are unique and multiples are unique collections of successors we

have unique multiples. Thus we apply the axiom of replacement (ZF8) and

construct the set b = {0, ω, ω2, …}.

We now apply the axiom of unions (ZF4) to construct the union of all
sets of b. We designate this set by ω².

⋃b = ω²

We may visualize ω² by the following array.

ω² = { 0, 1, 2, …,
       ω, ω + 1, ω + 2, …,
       ω2, ω2 + 1, ω2 + 2, …,
       ⋮ }


We note here that an element of ω² is of the form ωn + m where n, m ∈ ω.

We may now construct successors of ω², ω² + n, and by the axiom of
replacement form the set b = {ω² + n | n ∈ ω}. The set ⋃b we call ω² + ω. We
continue as above to form the set b = {ω² + ωn | n ∈ ω}, and the set ⋃b we call
ω²2. We continue and construct the multiples of ω², ω²n. Again by virtue of
the axiom of replacement we construct a set b = {0, ω², ω²2, ω²3, …}. And
by virtue of the axiom of unions we construct the set ω³ = ⋃b.

We may continue in this fashion constructing the sets ω⁴, ω⁵, …. We
refer to the set ωⁿ as the nth power of ω. We again apply the axiom of
replacement and construct the set

b = {0, ω, ω², ω³, …}

and the union of the sets of b forms the set

ω^ω = ⋃b.

We observe here that an element of ω^ω can be expressed in the form

ω^n A_n + ω^(n−1) A_(n−1) + ··· + A_0 ≡ ∑_{k=0}^{n} ω^k A_k, where n ∈ ω and each A_k ∈ ω.

This process of course does not stop here but continues. We may form
the sets ω^(ω^ω), …. The set ω^(ω^(ω^···)) we call ε_0. We of course have no reason to believe

that we have exhausted all ordinal numbers, and may continue in this fashion

ad infinitum.

Transfinite Induction

Theorem 3.3 The principle of transfinite induction. If a is an ordinal

number and b ⊂ a such that, for x ∈ a, S(x) ⊂ b ⇒ x ∈ b, then b = a.


Proof Suppose to the contrary that a is an ordinal number and b ⊂ a such

that for x ∈ a, S(x) ⊂ b ⇒ x ∈ b but, there exists c ∈ a such that c ∉ b.
Then there exists a nonempty set y = {x ∈ a | x ∉ b}. Since y is a nonempty
subset of a well ordered set it must have a least element, a0, and S(a0) ⊂ b,

thus by our hypothesis a0 ∈ b which contradicts our assumption. Thus we

must conclude b = a.

Properties of Ordinal Numbers

Lemma 3.4 The elements of ordinal numbers are ordinal numbers.

Proof If a is an ordinal number and b ∈ a, then b = S(b) ⊂ a. We notice

that any subset of b is a subset of a, thus b is well ordered, furthermore for

any c ∈ b we have c ∈ a and thus c = S(c). Therefore b is an ordinal number.

Theorem 3.5 Let a, b be ordinal numbers, then either a ⊊ b, or b ⊊ a, or

a = b.

Proof Either a = b, or a ≠ b. If a ≠ b, then either ∃x ∈ a where x ∉ b or
∃x ∈ b where x ∉ a. If ∃x ∈ a where x ∉ b, let t = {x ∈ b | x ∈ a} ⊂ b ∩ a ⊂ a.
We show t = b by transfinite induction, and hence b ⊂ a. Let x ∈ b such
that S(x) ⊂ t, thus S(x) ⊊ a. Since S(x) is a proper subset of a we have
{y ∈ a | y ∉ S(x)} ≠ ∅. Let r be the least element of {y ∈ a | y ∉ S(x)};
then r = S(x) = x. Since r ∈ a we have x ∈ a and thus x ∈ t. Hence by
transfinite induction t = b. By the symmetric argument, if ∃x ∈ b where
x ∉ a we have a ⊂ b.

Definition An upper bound for an ordered set C is an element β such

that x ≼ β ∀x ∈ C.


Definition A supremum or least upper bound for an ordered set C is

an element α such that α is an upper bound, and if γ is an upper bound,

then α ≼ γ. We indicate the supremum of an ordered set C by sup C.

When the elements of an ordered set are regarded as numbers we will use
the symbols ≤ and < for ≼ and ≺. Thus for ordinal numbers the symbols
≤, ≼ and ⊂ are equivalent, as are <, ≺ and ⊊.

Theorem 3.6 If C is a set of ordinal numbers, then C has a supremum.

Proof Let α = ⋃C. We claim that α is an ordinal number. Let A ⊂ α and
A ≠ ∅. Pick a ∈ A; if a ≤ b ∀b ∈ A, then a is the least element. If a is not
the least element, then ∃ b ∈ A such that b < a ⇒ b ∈ a. Thus a ∩ A ≠ ∅.
The element a is an ordinal number and is well ordered. Let a0 be the least
element of a ∩ A. Let c be an arbitrary element in A; then either a ≤ c or
c < a. If a ≤ c, then a0 ≤ c. If c < a, then c ∈ a ∩ A, and thus a0 ≤ c. Thus
a0 is the least element of A. If ξ ∈ α, then ξ ∈ c for some c ∈ C ⇒ ξ = S(ξ).
Thus α is an ordinal number. Now α is an upper bound for C, since if c ∈ C,
then c ⊂ α, that is, c ≤ α. Now suppose ζ is an upper bound for C. Then for
all c ∈ C we have c ⊂ ζ, and thus α ⊂ ζ ⇒ α ≤ ζ. Thus α is the supremum.

Corollary The collection of all ordinal numbers is a proper class.

Proof If the collection were a set, then a supremum would exist. Let α
be the supremum; but α ⊂ α + 1 and α + 1 is an ordinal number. Thus
α + 1 ∈ α ∈ α + 1, which contradicts Lemma 2.8.

As promised at the end of chapter II, we now state and prove a gener-

alization of the final two lemmas of that chapter. As you will note we need

the concept of ordinal numbers to state the theorem in its generality.


Definition An ordinal number greater than 0 that is not the successor of

any other ordinal number is said to be a limit ordinal.

Theorem 3.7 For any collection of sets C that can be indexed by a
successor ordinal α + 1 (that is, not by a limit ordinal), say
C = {x_λ | λ ∈ α + 1}, we can never have x_0 ∈ x_1 ∈ · · · ∈ x_α ∈ x_0.

Proof If we had x_0 ∈ x_1 ∈ · · · ∈ x_α ∈ x_0, then we would have
x_ν ∈ C ∩ x_{ν+1} for every ν ≠ α, and x_α ∈ C ∩ x_0. Thus every element
of C would meet C, which contradicts the axiom of regularity (ZF9).

Corollary For any two sets a and b, a ∩ (b × {a}) = ∅.

Proof Every element of b × {a} is of the form {{b′}, {b′, a}}, which cannot
be in a, by virtue of Theorem 3.7.

The Transfinite Recursion Theorem.

Let W be a well-ordered set and α ∈ W. An α-sequence in a set X is
a function φ : S(α) → X. Recall that S(α) is the initial section of α.

A sequence function of type W in X is a function

f : {φ : S(α) → X | α ∈ W} → X.

That is, f maps α-sequences into X.

Let Υ : W → X where W is a well ordered set and X is a set. We
observe that Υ|S(α) : S(α) → X is an α-sequence for all α ∈ W. (Υ|S(α) is the
restriction of Υ to S(α).)

Theorem 3.8 Transfinite Recursion Theorem If W is a well ordered set

and if f is a sequence function of type W in a set X, then there exists a

unique function Υ : W → X such that Υ(α) = f(Υ|S(α)) for each α ∈ W .
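For a finite well-ordered set the recursion can be carried out directly. The following Python sketch is only an illustration under that finiteness assumption; the names `recurse` and `doubling`, and the particular sequence function, are hypothetical and not part of the text.

```python
def recurse(w, f):
    """Build Y on the finite well-ordered list w with Y(alpha) = f(Y restricted to S(alpha))."""
    upsilon = {}
    for alpha in w:                      # w is listed in increasing order
        # Here upsilon holds exactly the values on the initial section S(alpha).
        upsilon[alpha] = f(dict(upsilon))
    return upsilon

# Hypothetical sequence function: one more than the sum of all earlier values.
doubling = recurse(range(6), lambda t: 1 + sum(t.values()))
print(doubling)   # {0: 1, 1: 2, 2: 4, 3: 8, 4: 16, 5: 32}
```

Each value depends on the whole restriction to the initial section, not just on the immediate predecessor, which is the feature that distinguishes this form of recursion.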


Proof To prove uniqueness, let Υ and Ψ be two such functions such that

Υ(β) = Ψ(β) ∀β ∈ S(α). That is Υ|S(α) = Ψ|S(α). Then we have

Υ(α) = f(Υ|S(α)) = f(Ψ|S(α)) = Ψ(α).

Thus by Transfinite induction we have Υ(α) = Ψ(α) ∀α ∈ W .

To prove existence we explicitly construct Υ as a subset of W ×X.

We say a subset A of W × X is f-closed if for α ∈ W and t an α-sequence
in A, i.e. {(c, t(c)) | c ∈ S(α)} ⊂ A, we have (α, f(t)) ∈ A. W × X is f-closed,
thus such subsets do exist.

Let Υ = ⋂{A | A is f-closed}. Υ is f-closed, since any α-sequence t in Υ is
in every f-closed A. Thus (α, f(t)) ∈ A for every such A, and (α, f(t)) ∈ Υ.

We now show that Υ is a function. That is ∀γ ∈ W ∃ ! ξ ∈ X such that

(γ, ξ) ∈ Υ.

We proceed by transfinite induction on W .

Let S = {γ ∈ W | (γ, ξ) ∈ Υ & (γ, ζ) ∈ Υ ⇒ ξ = ζ}. Also let
S(α) ⊂ S for some α. Thus if γ < α, then ∃ ! ξ ∈ X such that (γ, ξ) ∈ Υ.
The function t : S(α) → X, thus defined, is an α-sequence and t ⊂ Υ.

Now assume α ∉ S. Then (α, y) ∈ Υ for some y ≠ f(t). Now consider the
set Υ − {(α, y)}. Let β ∈ W and r be a β-sequence in Υ − {(α, y)}. If β = α,
then r = t by the uniqueness of Υ on S(α). Also (β, f(r)) = (α, f(t)) ∈ Υ − {(α, y)},
since (α, f(t)) ≠ (α, y) and (α, f(t)) ∈ Υ. If β ≠ α, we have
(β, f(r)) ∈ Υ − {(α, y)} since Υ is f-closed and (β, f(r)) ≠ (α, y). Thus we have, if
β ∈ W and if r is a β-sequence in Υ − {(α, y)}, then (β, f(r)) ∈ Υ − {(α, y)}.
That is to say Υ − {(α, y)} is f-closed. This contradicts the fact that Υ is
the smallest f-closed set. We must conclude that α ∈ S. The hypothesis
for transfinite induction has been verified. Thus the existence of Υ has been
demonstrated.


IV CARDINAL NUMBERS

Cardinality

Definition Let a and b be sets. A subset, m, of a × b that satisfies the

following conditions is called a bijection:

I. ∀x ∈ a ∃y ∈ b such that (x, y) ∈ m.

II. If (x, y) and (x, z) are in m, then y = z.

III. ∀y ∈ b ∃x ∈ a such that (x, y) ∈ m.

IV. If (x, y) and (z, y) are in m, then x = z.

Conditions I and II are the conditions necessary for the subset m to be a
function as defined in Chapter II. We often refer to functions as maps. Thus
we say m is a map or function, and we write m : a → b.

Condition III is called the onto condition. Condition IV is called the
one to one condition. Hence we often say a bijection is a map that is one
to one and onto.
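For finite sets the four conditions can be checked mechanically. The sketch below is an illustration only; the function name `is_bijection` and the representation of m as a Python set of pairs are assumptions made for the example.

```python
def is_bijection(m, a, b):
    """Check conditions I-IV for a relation m, given as a set of pairs from a x b."""
    if not all(x in a and y in b for x, y in m):
        return False                                   # m must be a subset of a x b
    firsts = [x for x, _ in m]
    seconds = [y for _, y in m]
    total = set(firsts) == set(a)                      # I:   every x in a occurs
    single_valued = len(firsts) == len(set(firsts))    # II:  no x occurs twice
    onto = set(seconds) == set(b)                      # III: every y in b occurs
    one_to_one = len(seconds) == len(set(seconds))     # IV:  no y occurs twice
    return total and single_valued and onto and one_to_one

print(is_bijection({(0, "p"), (1, "q")}, {0, 1}, {"p", "q"}))   # True
print(is_bijection({(0, "p"), (1, "p")}, {0, 1}, {"p", "q"}))   # False
```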

Definition Let a, b, c, be sets such that there are functions, f : a → b and

g : b → c. The composition of g with f, written g ∘ f, is that function that
satisfies

(g ∘ f)(x) = z ⇐⇒ g(y) = z where f(x) = y.

Definition Let a and b be sets. If there exists a subset of a × b that is a

bijection then we say a and b are cardinally equivalent, or a and b have

the same cardinality.


Definition A relation on a set a that satisfies the following conditions:

R: xRx ∀x ∈ a

S: xRy ⇒ yRx.

T: xRy & yRz ⇒ xRz

is called an equivalence relation.

Condition R is called reflexive, S is called symmetric, and T is transitive.

We note here that the equivalence relation differs from the order relation

by replacing antisymmetry with symmetry.

Definition A Partition of a set a, which we shall indicate by p(a), is a
subset of the power set of a, P(a), such that ⋃p(a) = a and for b, c ∈ p(a),
b ≠ c ⇒ b ∩ c = ∅.

We leave it as an exercise to the reader to verify that an equivalence

relation on a set induces a partition of the set.

We say an equivalence relation partitions a set. That is the set is parti-

tioned into disjoint subsets where each element of each subset is equivalent

to each other but not to any element of any other subset.
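On a finite set the induced partition can be computed by sorting elements into classes. This is a sketch under the assumption that the supplied relation really is an equivalence relation; the names `partition_by` and `classes` are hypothetical.

```python
def partition_by(a, related):
    """Group the elements of a into classes of the equivalence relation `related`."""
    classes = []
    for x in a:
        for cls in classes:
            # By symmetry and transitivity, testing one representative suffices.
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            classes.append({x})          # x starts a new class
    return classes

# Congruence modulo 3 on {0, ..., 9}.
classes = partition_by(range(10), lambda x, y: (x - y) % 3 == 0)
print(sorted(sorted(c) for c in classes))   # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

The classes are pairwise disjoint and their union is the whole set, which is exactly the definition of a partition given above.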

Theorem 4.1 Cardinal equivalence is an equivalence relation.

Proof Let c be a set.

For reflexivity we note that ∀a ∈ c the identity map I : a → a defined by

I(x) = x is a bijection.

For symmetry we note that bijections are one-to-one and onto, thus the
reversed relation is also a bijection.

For transitivity we note that the composition of bijections is also a
bijection.

Definition A Cardinal Number is the least ordinal of that cardinality.

We say a set is finite if it is cardinally equivalent to an element of ω
(that is, to a natural number); otherwise we say it is infinite. For finite
sets there is a unique ordinal number to which that set is cardinally
equivalent, thus for finite sets, ordinal and cardinal numbers are the same.
This is not true for infinite sets.

For ω + 1 = {0, 1, · · · , ω} we can form the bijection

b : ω + 1 ↔ ω

defined by

b(x) = x + 1 for x ≠ ω and b(ω) = 0.

For ω + n = {0, 1, · · · , ω, ω + 1, · · · , ω + (n − 1)} we can form the bijection

b : ω + n ↔ ω

defined by

b(x) = x + n for x < ω and b(ω + k) = k for 0 ≤ k ≤ n − 1.
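The rule above can be tried out on a finite window of ω. In this sketch the encoding of ω + k as the tuple ("w", k) is an assumption made purely for illustration, as are the function names.

```python
def b(x, n):
    """Map an element of omega + n to omega, following the rule in the text."""
    if isinstance(x, tuple):         # x = ("w", k) stands for omega + k, 0 <= k < n
        return x[1]
    return x + n                     # x is an ordinary natural number

def b_inverse(m, n):
    """Inverse map: the first n naturals come from the tail omega, ..., omega + (n - 1)."""
    return ("w", m) if m < n else m - n

print([b_inverse(m, 3) for m in range(6)])   # [('w', 0), ('w', 1), ('w', 2), 0, 1, 2]
```

Every natural number has exactly one preimage, so the n elements appended after ω are absorbed without changing the cardinality.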

Definition Any set that is cardinally equivalent to a subset of ω is said to

be countable, otherwise we say it is uncountable.

Every finite set is cardinally equivalent to a subset of ω, thus all finite

sets are countable. If a set is cardinally equivalent to ω then we say the set

is countably infinite.


Notation: The cardinality of a set, a, is indicated by C(a). The cardinality

associated with a countably infinite set is denoted by the cardinal number ℵ0

(aleph naught, ℵ is the first letter of the Hebrew Alphabet). Thus C(ω) = ℵ0.

Cantor’s Theorem

Theorem 4.2 Cantor’s Theorem For any set a, C(P(a)) ≠ C(a).

Proof We prove Cantor’s theorem by contradiction.

Assume there exists a bijection b : a ↔ P(a), and let y ∈ P(a) be defined

as

y = {x ∈ a | x ∉ b(x)}.

The set y exists by the axiom of specification, ZF2. Let c = b⁻¹(y). We thus
must have the absurd implications

c ∈ y ⇒ c ∉ b(c) = y ⇒ c ∈ b(c) = y.

We conclude no such bijection can possibly exist.
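For a small finite set the diagonal set y can be computed for every possible map from a into P(a), confirming that y is never in the image. This is an exhaustive check for one three-element set, not a proof; the helper names are hypothetical.

```python
from itertools import product

def diagonal_set(a, b):
    """Return y = {x in a | x not in b(x)} for a map b (a dict) from a to subsets of a."""
    return frozenset(x for x in a if x not in b[x])

def powerset(a):
    """All subsets of a, as frozensets."""
    elems = sorted(a)
    return [frozenset(e for e, keep in zip(elems, bits) if keep)
            for bits in product([False, True], repeat=len(elems))]

a = {0, 1, 2}
subsets = powerset(a)
# Try all 8**3 maps b : a -> P(a); the diagonal set is missed by each of them.
missed_every_time = all(
    diagonal_set(a, dict(zip(sorted(a), images))) not in images
    for images in product(subsets, repeat=len(a))
)
print(missed_every_time)   # True
```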

Since there is a natural bijection from a set a to the subset of P(a) that

consists of all the singletons it is natural to believe that we in fact have

the inequality C(a) < C(P(a)). Since we have defined cardinal numbers in

terms of ordinal numbers we wish to delay making this statement until we

have demonstrated that every set is cardinally equivalent to some ordinal

number.

Corollary There exists an uncountable set.

Proof P(ω) is not cardinally equivalent to any subset of ω and thus must

be uncountable.


The Schroder-Bernstein Theorem

If two sets, A and B, are cardinally equivalent, i.e. a bijection exists

between them then we write

A ↔ B.

Now suppose that for two sets A and B, the set B has a subset C where
A is cardinally equivalent to C, i.e. A ↔ C ⊂ B. We then write

A → B.

Equivalently we may write

B ← A.

Theorem 4.3 Schroder-Bernstein For any two sets X and Y if X → Y

and X ← Y then X ↔ Y .

Proof By assumption we have a bijection from X onto a subset of Y. Call this
bijection f, i.e. f : X → Y, or we may write f : X ↔ f(X) ⊂ Y. We also
have by assumption a bijection g from Y onto a subset of X, i.e.
g : Y ↔ g(Y) ⊂ X. Our goal is to construct a bijection from X to Y. We will
proceed as follows. We will partition both X and Y into three disjoint subsets
and produce bijections between the subsets of X and Y.

First consider the elements of X that are not in the image of g, i.e., the
set X − g(Y). We enlarge this set by including all of its descendants that
are in X under the maps (g ∘ f)^n and call this set X_X. The set X_X can be
specified by

X_X = {z ∈ X | z = (g ∘ f)^n(x) for some x ∈ X − g(Y) and n ∈ ω}.

When n = 0, z would be in X − g(Y). We now consider the elements of Y
that are descendants of X − g(Y) under the maps f ∘ (g ∘ f)^n. We call this
set Y_X. Y_X can be specified by

Y_X = {w ∈ Y | w = f ∘ (g ∘ f)^n(x) for some x ∈ X − g(Y) and n ∈ ω}.

We note that Y_X = f(X_X).

We similarly construct the subsets Y_Y and X_Y, which can be specified as

Y_Y = {t ∈ Y | t = (f ∘ g)^n(y) for some y ∈ Y − f(X) and n ∈ ω}

and

X_Y = {u ∈ X | u = g ∘ (f ∘ g)^n(y) for some y ∈ Y − f(X) and n ∈ ω}.

Again we note X_Y = g(Y_Y).

We now define the subset X_∞ as those elements of X that are neither in
X_X nor X_Y, i.e.

X_∞ = {s ∈ X | s ∉ X_X ∪ X_Y}.

Similarly we define Y_∞ by

Y_∞ = {r ∈ Y | r ∉ Y_Y ∪ Y_X}.

We point out here the rationale for the symbols for these sets. X_X is the
collection of elements in X whose most distant or primitive ancestor (under
the maps g ∘ f) is in X. X_Y is the collection of elements in X whose most
distant ancestor is in Y. X_∞ is the collection of elements of X that have
no most distant ancestor, i.e. their lineage can be traced back infinitely far.
Similarly for Y_Y, Y_X, and Y_∞. We also note here that the sets X_X, X_Y and
X_∞ are mutually disjoint, as are the sets Y_Y, Y_X and Y_∞, and

X = X_X ∪ X_Y ∪ X_∞ and Y = Y_Y ∪ Y_X ∪ Y_∞.


To complete the proof we demonstrate that f restricted to X_X is a
bijection onto Y_X, g restricted to Y_Y is a bijection onto X_Y, and f restricted
to X_∞ is a bijection onto Y_∞. We note here that g restricted to Y_∞ would also
be a bijection onto X_∞. For brevity we will abbreviate a function f restricted
to a subset A of its natural domain by f|A.

We must first show that f(X_X) is in Y_X. Let z ∈ X_X, then z = (g ∘ f)^n(x)
for some x ∈ X − g(Y) and n ∈ ω. Thus f(z) = f ∘ (g ∘ f)^n(x) for some
x ∈ X − g(Y) and n ∈ ω. Hence f(z) ∈ Y_X. Clearly f|X_X is one to one since
f is one to one. If y ∈ Y_X then y = f ∘ (g ∘ f)^n(x) for some x ∈ X − g(Y)
and n ∈ ω. Thus y is the image under f of an element of the form (g ∘ f)^n(x)
for some x ∈ X − g(Y), which is in X_X; hence f|X_X is onto, and thus f|X_X is
a bijection.

Similarly g|Y_Y is a bijection from Y_Y to X_Y, and thus g⁻¹|X_Y is a bijection
from X_Y to Y_Y.

To demonstrate that f(X_∞) is in Y_∞ we note that for any x ∈ X_∞, if f(x)
were not in Y_∞ it would be in either Y_Y or Y_X. Thus f(x) would be of the
form (f ∘ g)^n(y) for some y ∈ Y − f(X) and n ∈ ω, or f ∘ (g ∘ f)^n(z) for some
z ∈ X − g(Y) and n ∈ ω. In the first case n ≥ 1, since f(x) ∈ f(X). Thus
x = f⁻¹ ∘ (f ∘ g)^n(y) = g ∘ (f ∘ g)^{n−1}(y) for some y ∈ Y − f(X) and n ≥ 1,
or x = f⁻¹ ∘ f ∘ (g ∘ f)^n(z) = (g ∘ f)^n(z) for some z ∈ X − g(Y) and n ∈ ω.
Thus x would be in either X_Y or X_X, which contradicts our assumption; hence
f(X_∞) is in Y_∞. Again f|X_∞ is one to one since f is one to one. Now let
y ∈ Y_∞; then there exists an x ∈ X such that f(x) = y, for if not then
y ∈ Y − f(X) ⊂ Y_Y. If that x were not in X_∞ then it would be in either X_X
or X_Y, which would mean that y would be in either Y_X or Y_Y and thus not in
Y_∞, which contradicts our assumption. Hence we conclude that f|X_∞ is onto
and is thus a bijection.


We can now formally define our bijection, b : X ↔ Y, as follows:

b(x) = f(x) if x ∈ X_X,
b(x) = f(x) if x ∈ X_∞,
b(x) = g⁻¹(x) if x ∈ X_Y.
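The construction can be traced on a simple example in which every lineage terminates. The sketch below assumes X = Y = ω with the injections f(n) = n + 1 and g(n) = n + 1, chosen only for illustration; all function names are hypothetical. For an element whose lineage never terminates the loop in `origin` would not halt, so the sketch applies only when no such elements exist, as in this example.

```python
def f(x): return x + 1                      # assumed injection X -> Y
def g(y): return y + 1                      # assumed injection Y -> X

def f_inv(y): return y - 1 if y >= 1 else None   # None: y is not in f(X)
def g_inv(x): return x - 1 if x >= 1 else None   # None: x is not in g(Y)

def origin(x):
    """Trace the ancestry of x in X back to its most distant ancestor."""
    while True:
        y = g_inv(x)
        if y is None:
            return "X"                      # lineage starts in X - g(Y)
        x = f_inv(y)
        if x is None:
            return "Y"                      # lineage starts in Y - f(X)

def b(x):
    """The combined map: f on elements originating in X, g inverse otherwise."""
    return f(x) if origin(x) == "X" else g_inv(x)

print([b(x) for x in range(8)])   # [1, 0, 3, 2, 5, 4, 7, 6]
```

Here the even numbers originate in X and the odd numbers in Y, so b swaps 2k with 2k + 1, which is visibly one to one and onto.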

Exercise In chapter 8 we develop the real numbers; however, assuming an a
priori knowledge of the real numbers, consider the closed interval [0, 1] ≡ X
and the half-open interval [0, 1) ≡ Y. Define injective maps f : X → Y by
f(x) = x/2, and g : Y → X by g(x) = x. Determine the sets
X_X, Y_X, Y_Y, X_Y, X_∞, Y_∞, and the bijection b : X ↔ Y that the
Schroder-Bernstein theorem guarantees to exist.

Some Countable Sets

Theorem 4.4 The finite cartesian product of countable sets is countable.

Proof If the cartesian product of a collection of sets is countable the

product can be reindexed to be a single countable set, thus all that needs

to be shown is that the cartesian product of two countable sets is countable.
This argument is made by the classical diagonalization process.

Let A and B be two countable sets then we can represent the cartesian

product by

A × B = {(a_i, b_j) | a_i ∈ A, b_j ∈ B, i, j ∈ ω}.

We define a bijection b : ω → A × B by the following recursion:

b : 1 → (a_1, b_1)

b : n → (a_{i−1}, b_{j+1}), where b : n − 1 → (a_i, b_j), where n > 1, i ≠ 1

b : n → (a_{j+1}, b_1), where b : n − 1 → (a_1, b_j), where n > 1.

To verify that the map defined by this recursion is a bijection consider the

following diagram. The arrows indicate the “order” in which the elements

are to be “counted”.

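(a_1, b_1) (a_1, b_2) (a_1, b_3) (a_1, b_4) · · ·
(a_2, b_1) (a_2, b_2) (a_2, b_3) (a_2, b_4) · · ·
(a_3, b_1) (a_3, b_2) (a_3, b_3) (a_3, b_4) · · ·
(a_4, b_1) (a_4, b_2) (a_4, b_3) (a_4, b_4) · · ·
    ⋮          ⋮          ⋮          ⋮

Each diagonal is counted from (a_k, b_1) up to (a_1, b_k), after which the
count moves to (a_{k+1}, b_1).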

If b(n) = b(m), then the set of predecessors of b(n) is equal to the set of

predecessors of b(m). Thus the section of n is equal to the section of m, and

thus n = m. So b is 1 to 1.

If (a, b) ∈ A × B, then (a, b) = (a_k, b_l) for some k and l in ω. Then by
“counting” backwards through the recursion we may determine an n where
b(n) = (a, b).
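The recursion can be run directly on index pairs (i, j) standing for (a_i, b_j). The following sketch is an illustration with hypothetical function names; it checks only finitely many steps and so supports, rather than replaces, the argument above.

```python
def next_pair(i, j):
    """One step of the recursion: from (a_i, b_j) to the next pair counted."""
    if i != 1:
        return (i - 1, j + 1)        # move up along the current diagonal
    return (j + 1, 1)                # diagonal finished: start the next one

def enumerate_pairs(count):
    """The first `count` index pairs in the order given by the recursion."""
    pairs = [(1, 1)]
    while len(pairs) < count:
        pairs.append(next_pair(*pairs[-1]))
    return pairs

print(enumerate_pairs(6))   # [(1, 1), (2, 1), (1, 2), (3, 1), (2, 2), (1, 3)]
```

The first 55 pairs produced are distinct and fill exactly the first ten diagonals, so within that range no pair is counted twice and none is skipped.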

Corollary ∏_{i∈n} ω is countable for each n ∈ ω.

Theorem 4.5 The countable union of countable sets is countable.

Proof Without loss of generality we may assume the sets are disjoint. This

is in fact the most extreme case. We may now let aij represent the ith element

of the jth set. The argument is now identical to the proof of theorem 4.4.


Corollary ω^ω is countable.

Proof ω^ω = ⋃b, where b = {0, ω, ω^2, · · · }.

From the above arguments it is easily seen that all the ordinal numbers

that have explicitly been constructed by the method outlined in chapter III

must be countable. Not all ordinals are countable as we shall see in the next

chapter.

Some Uncountable Sets

Theorem 4.6 The countably infinite product of countable sets each of

which has cardinality of at least 2 is not countable.

Proof Assume the conclusion is false. That is, assume there exists a
bijection b from ω to ∏_{i∈ω} A_i, where each A_i is countable and has at
least two elements.

Let p_n be the projection map onto the nth co-ordinate space. For all
n ∈ ω pick a_n ∈ A_n − {(p_n ∘ b)(n)}. Then the sequence
(a_0, a_1, · · · ) ∈ ∏_{i∈ω} A_i. We have for all n ∈ ω,
p_n((a_0, a_1, · · · )) = a_n, but (p_n ∘ b)(n) ≠ a_n. Thus
b(n) ≠ (a_0, a_1, · · · ) ∀n ∈ ω, which is a contradiction. Thus no bijection
exists.
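The choice of a_n can be made explicit when every A_n is the two point set {0, 1}. In the sketch below the enumeration b, which lists the binary digits of n, is a hypothetical stand-in for an arbitrary proposed enumeration; sequences are represented as Python functions of the index.

```python
def diagonal(b):
    """Given an enumeration b of 0/1-valued sequences, return a sequence it misses."""
    return lambda n: 1 - b(n)(n)     # differ from the nth sequence in coordinate n

# Hypothetical enumeration: b(n) is the sequence of binary digits of n.
def b(n):
    return lambda k: (n >> k) & 1

a = diagonal(b)
print(all(a(n) != b(n)(n) for n in range(100)))   # True
```

Whatever enumeration is supplied, the returned sequence disagrees with the nth listed sequence in its nth coordinate, which is the contradiction used in the proof.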

Exercises

The following set of exercises leads to an alternate proof of the Schroder-

Bernstein Theorem.

Let X be a partially ordered set and A a subset of X. An element x ∈ X
is said to be an upper bound for A if x ≥ a ∀a ∈ A. Similarly, an
element y ∈ X is said to be a lower bound for A if y ≤ a ∀a ∈ A. An upper
bound u of A is the supremum or least upper bound of A if u ≤ x for
all upper bounds x of A. A lower bound l of A is the infimum or greatest
lower bound of A if l ≥ y for all lower bounds y of A.

We say that a partially ordered set X is complete if there exists a supre-

mum and infimum for every subset of X.

Let X and Y be partially ordered sets. A function f : X → Y is order

preserving if x ≤ y ⇒ f(x) ≤ f(y).

1. If L is a complete partially ordered set and f : L → L is an order

preserving function, show that there exists a ∈ L such that f(a) = a.

2. Let A be any set. Show that P(A) is a complete partially ordered set,

where X ≤ Y if X ⊆ Y (we say that P(A) is ordered by inclusion).

For simplicity of notation define X −A = A′, that is A′ is the comple-

ment of A in X.

3. Let A and B be sets and f and g be functions such that f : A → B and

g : B → A. Let h : P(A) → P(A) be defined by h(S) = [g(f(S)′)]′.

Show that T ⊂ S ⇒ h(T ) ⊂ h(S).

4. Observe that h is an order preserving map, thus there exists S ⊂ A
such that h(S) = S, that is [g(f(S)′)]′ = S, and g(f(S)′) = S′. Now
assume f and g are one to one and demonstrate the bijection from A
to B.


V ZORN’S LEMMA AND WELLORDERING

We have shown that there are sets with cardinality greater than ℵ0 but we

have yet to demonstrate that there exist ordinal numbers with cardinality

greater than ℵ0. We will accomplish this task by showing that every set can

be well ordered, and that every well ordered set is order isomorphic to an

ordinal number. By order isomorphic we shall mean the following.

Definition Two partially ordered sets a and b are said to be order iso-

morphic if there exists a bijection between them that preserves order. That

is, if β : a → b is the order preserving bijection, then x ≤ y if and only if
β(x) ≤ β(y). We write a ≃ b.

Zorn’s Lemma

Theorem 5.1 Zorn’s Lemma: If X is a partially ordered set such that every

chain in X has an upper bound in X, then X contains a maximal element.

By chain we mean a totally or linearly ordered subset. In the hypothesis

of Zorn’s Lemma the upper bound need not be in the chain. In the conclusion
the maximal element is simply an element with no superior, that is, if
x ≤ y ⇒ x = y for every y ∈ X, then x is maximal. There may in fact be
elements in X that are not comparable to x, yet x may still be maximal.

Proof Let X be a partially ordered set. For each x ∈ X,
S(x) = {y ∈ X | y ≤ x} is the weak section of x. S is a function from X to
P(X) since the section of any element is unique. The range R of S is a
collection of subsets that are partially ordered by inclusion, i.e. A ≤ B if
A ⊆ B. S is one to one, since S(x) ⊆ S(y) if and only if x ≤ y. Thus if we
find a maximal element S(z) in R then z is maximal in X. Also C is a chain in
X if and only if S(C) is a chain in R.

Let 𝒳 be the set of all chains in X. Let Γ ∈ 𝒳. Since Γ is a chain it
has an upper bound x by hypothesis, thus Γ ⊆ S(x) for some x ∈ X. 𝒳 is
a non-empty collection of sets partially ordered by inclusion. Now if C is a
chain in 𝒳 then G = ⋃_{Γ∈C} Γ ∈ 𝒳. G is an upper bound of C in 𝒳 as each
element of C is dominated by G.

Now let 𝒳 be an arbitrary non-empty collection of subsets of a non-empty
set X subject to

1. if A ∈ 𝒳 and B ⊆ A then B ∈ 𝒳, and

2. if C is a chain in 𝒳 then ⋃_{Γ∈C} Γ ∈ 𝒳.

Notice that these are the exact conditions that our set 𝒳 had in the previous
discussion. Also notice that the first condition implies that ∅ ∈ 𝒳. Our task
is to show that 𝒳 has a maximal element.

Now let φ be a choice function from the non-empty subsets of X to X, i.e.
φ : (P(X) − {∅}) → X. We note that φ is a function such that φ(A) ∈ A for all
non-empty subsets A of X. For each set A ∈ 𝒳 let
Â = {x ∈ X | A ∪ {x} ∈ 𝒳}. Define a function γ : 𝒳 → 𝒳 by the following:

γ(A) = A ∪ {φ(Â − A)} if Â − A ≠ ∅,
γ(A) = A if Â − A = ∅.

We observe that Â − A = ∅ if and only if A is maximal. Our task is now
to show there exists a set A in 𝒳 such that γ(A) = A. Since φ(Â − A) is a
single element we notice that γ(A) contains at most one more element than
A.

We now define a tower as a subcollection T of 𝒳 that satisfies the
following conditions:

1. ∅ ∈ T,

2. if A ∈ T, then γ(A) ∈ T, and

3. if C is a chain in T, then ⋃_{A∈C} A ∈ T.

We notice here that 𝒳 satisfies the conditions for a tower, and thus towers
exist. We can also easily verify that the intersection of a collection of
towers is a tower. Let {Tλ} be a collection of towers. Since ∅ ∈ Tλ for all λ,
∅ ∈ ⋂_λ Tλ. If A ∈ ⋂_λ Tλ, then A ∈ Tλ for all λ, thus γ(A) ∈ Tλ for all λ
and thus γ(A) ∈ ⋂_λ Tλ. And finally if C is a chain in ⋂_λ Tλ, then C is a
chain in Tλ for all λ, thus ⋃_{A∈C} A ∈ Tλ for all λ, and thus
⋃_{A∈C} A ∈ ⋂_λ Tλ. It follows that the intersection T0 of all towers is the
smallest tower. We now wish to show that T0 is a chain.

We say that a set B in T0 is comparable if it is comparable with every

set in T0, that is, for all A ∈ T0 either A ⊆ B or B ⊆ A. To show that T0 is

a chain we show that every set in T0 is comparable.

We now let B be an arbitrary comparable set in T0. Comparable sets do

exist since ∅ is clearly comparable. Suppose A ∈ T0 and A is a proper subset

of B. Since B is comparable we have γ(A) ⊆ B or B is a proper subset of

γ(A). If B is a proper subset of γ(A), we have A as a proper subset of B and
B a proper subset of γ(A), but γ(A) − A is a singleton, thus there cannot be a
set strictly between them. We conclude γ(A) ⊆ B.

Now consider the collection U of all sets A in T0 where either A ⊆ B or

γ(B) ⊆ A. The collection U is no larger than the collection of sets in T0

that are comparable with γ(B) since, if A ∈ U and since B ⊆ γ(B) we have

either A ⊆ γ(B) or γ(B) ⊆ A.

We now claim that U is a tower. We verify the three conditions:

1) ∅ ∈ U .

2) To show A ∈ U ⇒ γ(A) ∈ U, consider three cases:

i. A is a proper subset of B.

ii. A = B.

iii. γ(B) ⊆ A.

For i., γ(A) ⊆ B by the preceding argument, thus γ(A) ∈ U.

For ii., γ(A) = γ(B), thus γ(B) ⊆ γ(A), therefore γ(A) ∈ U.

For iii., γ(B) ⊆ A ⇒ γ(B) ⊆ γ(A), therefore γ(A) ∈ U.

3) Let C be a chain in U. If γ(B) ⊆ D for some D ∈ C, then
γ(B) ⊆ ⋃_{D∈C} D, hence ⋃_{D∈C} D ∈ U. If however D ⊆ B for all D ∈ C, then
⋃_{D∈C} D ⊆ B. Thus we may conclude ⋃_{D∈C} D ∈ U.

We thus conclude that U is a tower and is a subset of T0 which is the

smallest tower hence we have U = T0.


Now let B be a comparable set, we form U as above and since U = T0 for

any A ∈ T0 we have either A ⊆ B ⊆ γ(B) or γ(B) ⊆ A. We thus conclude

that if B is a comparable set, then γ(B) is comparable also.

We have that ∅ is comparable and γ maps comparable sets to comparable
sets. Now since the union of a chain of comparable sets is comparable we
may conclude that the comparable sets constitute a tower, and hence they
exhaust all of T0.

T0 is a chain, thus A = ⋃_{B∈T0} B ∈ T0, and we have γ(A) ⊆ A, since the
union A includes all the sets in T0. We always have A ⊆ γ(A), thus we conclude
that A = γ(A). This is the condition that we noted earlier needed to be
shown to complete the proof.

Definition A well ordered set A is a continuation of a well ordered set

B if

i) B ⊂ A

ii) B = S(a) for some a ∈ A

iii) For a, b ∈ B, a ≤_B b iff a ≤_A b.

The reason for the third condition is that a set may have more than one

ordering, and for continuation we want B to have the same ordering as A

when restricted to B.

The Well Ordering and Counting Theorems

Theorem 5.2 The Well Ordering Theorem Every set can be well ordered.


An important point to be noted here is that the set may be presented

with an ordering that is not a well ordering. The Well Ordering Theorem

says we may disregard any previously assigned ordering that the set may

have and endow it with a new ordering that is a well ordering.

Before we prove this theorem we make this note. We shall regard an

ordered set X as the pair (X,<) where < is an order relation.

Proof Let X be a set. Let W be the collection of all well ordered subsets
of X under every possible ordering, i.e.

W = {(A, <) | A ∈ P(X), < ∈ P(A × A), and < is a well ordering of A}.

We partially order W by continuation, i.e. (A, <) <_W (B, <) if B is a
continuation of A. W is not empty since, for any x ∈ X, ({x}, <) with
< = {(x, x)} is an element of W.

Now let C be a chain in W,

C = {(A_λ, <_λ) | (A_{λ_i}, <_{λ_i}) <_W (A_{λ_k}, <_{λ_k}) for λ_i < λ_k}.

Since the A_λ are nested sets, ⋃_{A_λ∈C} A_λ is an upper bound for C. It is in
W since any non-empty subset of ⋃_{A_λ∈C} A_λ meets A_{λ_i} for some λ_i, and
its least element within A_{λ_i} is least in the whole subset, because every
larger member of the chain is a continuation of A_{λ_i}. The union is therefore
well ordered.

Hence the condition for the hypothesis of Zorn’s lemma has been satisfied
and we may conclude that there exists a maximal element M in W. We claim
M = X. If not then there exists x ∈ X such that x ∉ M. Thus we may
construct (M′, <) = (M ∪ {x}, <) where y < x for all y ∈ M. M′ is clearly well
ordered and is a continuation of M, thus M′ > M, which is a contradiction
since M is maximal. We conclude M = X and thus X is well ordered.


Theorem 5.3 Counting Theorem Every well ordered set is order isomor-

phic to a unique ordinal number.

Proof Uniqueness is virtually trivial: since order isomorphism is clearly
transitive, if a well ordered set were order isomorphic to two different
ordinal numbers, those ordinal numbers would be order isomorphic, a
contradiction.

Now let X be a well ordered set, let S = {x ∈ X | S(x) ≃ α for some
ordinal number α}, and let a ∈ X be an element such that for each predecessor
b of a, S(b) is order isomorphic to some ordinal number (clearly the ordinal
number is unique and such an element does exist, since the initial segment of
the least element of X is the empty set, which is order isomorphic to 0; thus
the least element of the set X − {l | l is the least element of X} satisfies
the condition for a). Now let P(x, α) be the proposition “α is an ordinal
number and S(x) ≃ α”. Since α is unique we have P(x, α) and P(x, β) ⇒ α = β.
We may now apply the Axiom Schema of Replacement and verify the existence
of the set T = {α | x ∈ S(a) & P(x, α)}. T is the set of ordinal numbers
order isomorphic to the initial segments determined by the predecessors of a.
T is clearly an ordinal number and is order isomorphic to S(a). We have thus
satisfied the hypothesis for transfinite induction and we may conclude that
for all x ∈ X, S(x) ≃ α for some ordinal α. We may annex another element z
to X and make it maximal, that is, z > x for all x ∈ X. Then in the new set
X ∪ {z}, S(z) = X, and by the previous argument X = S(z) ≃ α for some
ordinal number α.


The Equivalences to the Axiom of Choice

Since every well ordered set is order isomorphic to a unique ordinal num-

ber, we present the following definition.

Definition The unique ordinal number to which a well ordered set is order

isomorphic is its order type, which we shall abbreviate by OT .

We can in fact regard ordinal numbers to be the order types of well

ordered sets. However when developing the properties of well ordered sets it

is cognitively much easier to work with the narrowly defined ordinal numbers.

Lemma If A ⊂ B are well ordered sets, then OT (A) ≤ OT (B).

Proof Let a and b be ordinal numbers and α : a → A and β : b → B
be order preserving bijections. Either a ⊆ b or b ⊆ a. If a ⊆ b, then
OT(A) = a ≤ b = OT(B). If b ⊆ a, then let φ = β⁻¹ ∘ α, an order preserving
bijection from a to b. Let c ⊆ a such that φ(x) = x ∀x ∈ c. Now let y ∈ a
such that S(y) ⊂ c. Assume y ∉ c, then φ(y) = z > y, and ∃t ∈ a such
that φ(t) = y and t > y. But φ(t) = y < z = φ(y), which contradicts order
preserving. Thus we must have y ∈ c, and by transfinite induction we have
c = a. Thus φ : a → b is the identity map, and since b ⊂ a we have b = a.

Corollary If a and b are ordinal numbers that are order isomorphic and

a ⊆ b, then a = b.

If we take the Well ordering theorem as an axiom we may prove the Axiom

of Choice as a theorem.

Theorem 5.4 For any non-empty collection of non-empty sets there exists

a choice function.


Proof Let C be a non-empty collection of non-empty sets. Well order each

member of C and let the choice function choose the least element of each set.

We may summarize our results by the following:

Theorem 5.5 The following are equivalent:

1. The Axiom of Choice.

2. Zorn’s Lemma.

3. The Well Ordering Theorem.


VI ARITHMETIC

In this chapter we will develop the concept of arithmetic for ordinal and

cardinal numbers. It is conceptually easier to define an arithmetic for car-

dinal numbers, so we will do that first and extend those concepts to ordinal

numbers.

Cardinal Arithmetic

Definition A Binary Operation on a set a is a function from a× a to a.

Definition An Arithmetic on a set a is a collection of binary operations

on the set a. The collection is usually finite.

We extend these two definitions to include arbitrary collections that are

not necessarily sets.

Definition A Binary Operation on any collection is a rule that assigns

to every ordered pair of elements of the collection a unique element of the

collection.

The definition for an arithmetic on an arbitrary collection is identical

except the term set is replaced with collection.

We will represent every binary operation by a symbol, and indicate the

element to which any ordered pair of elements is associated by the two el-

ements juxtaposed with the operation symbol inserted between them. For

Example, if ∗ represents the binary operation and a and b are elements of

the collection then a ∗ b will represent the element to which the ordered pair

(a, b) is associated.


We now define an arithmetic for the collection of Cardinal Numbers. The

first operation we will define we will call addition, which will be represented

by the symbol +, and the associated element to the ordered pair will be

known as the sum. The motivation for the definition is naive and intuitive.

We would like to say that the sum of the cardinality of two sets is the car-

dinality of the union of the two sets, but this is a little too naive. The two

sets may have non-empty intersection, and we wish to avoid this situation.

Let a and b be two arbitrary sets and {0, 1} be a two point space. The
set a × {0} is cardinally equivalent to a by the obvious bijection, as is
b × {1} with b. a × {0} and b × {1} are disjoint sets, as the second member
of each element of one set is different from the second member of each
element of the other set.

Definition The Disjoint Union of two sets a and b, which will be denoted
a ⊎ b, is (a × {0}) ∪ (b × {1}).

We may now properly define cardinal addition.

Definition Let A and B be cardinal numbers. There are two sets a and b
such that C(a) = A and C(b) = B, and we define A + B to be C(a ⊎ b).

The second binary operation we define on cardinal numbers we will call

multiplication. We will nominally use the symbol · for multiplication, and

the associated element to the ordered pair will be known as the product.

When indicating the product it is unambiguous to omit the symbol for mul-

tiplication and simply indicate the product by the juxtaposition of the two

elements of the ordered pair, e.g., AB will represent A ·B.

The motivation for the definition for cardinal multiplication is nearly as

intuitive as that for addition. We imagine that we have a collection of sets


of equal cardinality and we wish to determine the cardinality of the total

collection. Every element of the total collection can be represented by an

ordered pair, the first member is the symbol for that element within its set,

and the second member is the symbol for the set to which it belongs.

Definition Let A and B be cardinal numbers. There are sets a and b such

that C(a) = A and C(b) = B, and we define AB = C(a × b).

The third binary operation that we shall define on cardinal numbers we

shall call exponentiation. We will nominally use the symbol ∧ for exponen-

tiation and the associated element to the ordered pair will be known as the

power. Again we choose to use a different but still unambiguous notation

for common use, we will use the second member as a superscript to indicate

the power. I.e., a^b = a ∧ b.

The motivation for the definition of cardinal exponentiation is that we

imagine that we have an arbitrary collection of arbitrary collections of sets of

equal cardinality, and we wish to determine its cardinality. Recall that the

cartesian product of a collection of sets is the collection of all maps from the

collection of sets to the union of all elements of the sets, where the image of a

set is restricted to the elements of itself. Thus our model for exponentiation

is a collection of duplications of a given set, and we wish to compute the

cardinality of the cartesian product of this collection.

Definition Let A and B be cardinal numbers. There are sets a and b such

that C(a) = A and C(b) = B, and we define A^B = C({f | f : b → a}).

Ordinal Arithmetic

We will now define an arithmetic for ordinal numbers. We wish to extend

the concept of addition relating to the union of two sets, but we also wish to

preserve the order properties of ordinal numbers.

Recall that in Chapter V we defined order type to be the unique ordinal

number to which a well ordered set is order isomorphic, and we use the

abbreviation OT to refer to the order type.

The first operation we will define will again be called addition and its

associated symbol will also be +.

Let a and b be ordinal numbers, and {0, 1} be a two point space. The sets a × {0} and b × {1} are similar to a and b by the obvious bijection. We write a ⊎ b for the disjoint union (a × {0}) ∪ (b × {1}).

Definition If a and b are ordinal numbers, then a + b = OT (a ⊎ b), where (x, 0) < (y, 0) iff x < y, (x, 1) < (y, 1) iff x < y, and (x, 0) < (y, 1) ∀ x ∈ a and ∀ y ∈ b.

The second operation defined will again be called multiplication and its associated symbol will also be ·.

We want to extend the concept of cardinal multiplication so we will con-

cern ourselves with the cartesian product of ordinal numbers. We wish to

extend the order of the ordinal numbers in a fashion that will well order

the cartesian product. Let a and b be ordinal numbers. We define an order

relation on a× b by

(x, y) < (z, w) iff y < w, or y = w and x < z.

This order is called reverse lexicographic order.

Definition If a and b are ordinal numbers, then a · b = OT (a × b), where

(a× b) is well ordered by reverse lexicographic order.

Verifying that reverse lexicographic order is a well ordering is straightforward. Pick any non-empty subset of the cartesian product a × b. The collection of second members of the elements of this subset has a least element. Pick all those elements that have that least second member; from those pick the element that has the least first member. This will be the least element of the chosen subset. Hence reverse lexicographic ordering is a well ordering of the cartesian product of two ordinal numbers.
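
For finite ordinals the two orders can be listed explicitly. The sketch below is only an illustration under the assumption that the finite ordinal n is modelled by range(n); the function names are hypothetical. The last function follows the least-element argument just given.

```python
def ordinal_sum_order(a, b):
    """Elements of the disjoint union of a and b, in the order defining a + b."""
    tagged = [(x, 0) for x in range(a)] + [(y, 1) for y in range(b)]
    # Compare the tag first, then the element: every (x, 0) precedes every (y, 1).
    return sorted(tagged, key=lambda pair: (pair[1], pair[0]))

def ordinal_product_order(a, b):
    """Elements of a x b listed in reverse lexicographic order."""
    pairs = [(x, y) for x in range(a) for y in range(b)]
    # Compare second members first, then first members.
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))

def least_element(subset):
    """Least element of a non-empty subset of a x b in reverse lexicographic order."""
    least_second = min(y for (_, y) in subset)
    return min((x, y) for (x, y) in subset if y == least_second)

print(ordinal_product_order(2, 3))
# [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 2)]
```

The position of a pair in the sorted list is the finite ordinal it corresponds to under the order isomorphism.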

Lemma For ordinal numbers a, b, c, if a = b, then a + c = b + c and

ac = bc.

Before we prove the lemma we give the following definition and observa-

tion.

Definition The function from a set a to itself defined by I : x → x ∀ x ∈ a

is called the Identity map.

We observe that if a = b, then a ⊎ c = b ⊎ c. We can now prove the lemma.

Proof The identity maps I1 : a ⊎ c → b ⊎ c and I2 : a × c → b × c are order preserving bijections. Let β1 : OT (a ⊎ c) → a ⊎ c, γ1 : OT (b ⊎ c) → b ⊎ c, β2 : OT (a × c) → a × c, γ2 : OT (b × c) → b × c be order preserving bijections. We now have that γ1⁻¹ I1 β1 : OT (a ⊎ c) → OT (b ⊎ c) and γ2⁻¹ I2 β2 : OT (a × c) → OT (b × c) are order preserving bijections. Since order isomorphic ordinal numbers are equal, a + c = b + c and ac = bc.

We now define the third operation which we call exponentiation and we

also choose the associated symbol ∧, and again we abbreviate with super-

scripting.

Exponentiation is defined recursively by

i. a ∧ 0 ≡ a^0 = 1,

ii. a ∧ (b + 1) ≡ a^(b+1) = a^b · a,

iii. a ∧ b ≡ a^b = sup{a^c | c < b} if b is a limit ordinal.

Exercises Let a, b, c be ordinal numbers, show

1. a^b · a^c = a^(b+c).

2. (a^b)^c = a^(b·c).

Hint: Let d be any ordinal number containing c, e.g. c + 1. For 1 let A = {y ∈ d | a^b · a^y = a^(b+y)}, and use transfinite induction to show A = d. For 2 let A = {y ∈ d | (a^b)^y = a^(b·y)}.
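
On finite ordinals only clauses i and ii of the recursion are ever used, so the two exercises can be spot-checked numerically. The sketch below is a sanity check, not a proof; it assumes that Python integers with their built-in product stand in for finite ordinals and ordinal multiplication, and the name ordinal_power is our own.

```python
def ordinal_power(a, b):
    """a ∧ b for finite ordinals a and b, using clauses i and ii only."""
    if b == 0:
        return 1                          # clause i:  a^0 = 1
    return ordinal_power(a, b - 1) * a    # clause ii: a^(b+1) = a^b · a

# Spot-check both exercises for small finite ordinals.
for a in range(5):
    for b in range(5):
        for c in range(4):
            assert ordinal_power(a, b) * ordinal_power(a, c) == ordinal_power(a, b + c)
            assert ordinal_power(ordinal_power(a, b), c) == ordinal_power(a, b * c)
```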

Since Cardinal numbers and Ordinal numbers are the same in ω the reader

should verify that ordinal and cardinal arithmetic agree.

We also leave it to the reader to verify these arithmetic facts.

1. ℵ0 + ℵ0 = ℵ0.

2. ℵ0 · ℵ0 = ℵ0.

3. ℵ0 ∧ ℵ0 > ℵ0.

4. n + ω = ω ∀n ∈ ω.

5. ω + n = ω + n ∀n ∈ ω, where the left side of the equation refers to

ordinal addition while the right side refers to the ordinal number ω+n.

6. n · ω = ω ∀n ∈ ω, n ≠ 0.

7. ω · n = ωn ∀n ∈ ω.

We invite the reader to either confirm or deny any arithmetic fact that he

may hypothesize; i.e. add or multiply a few numbers and see what happens.

Definition Let b and c be elements of ω. If b = c + 1, then we define

b− 1 ≡ c.

Lemma If a, b ∈ ω and C(a) = C(b), then a = b.

Proof Let c = {y ∈ ω | C(y) < C(y + 1)}, and let x ∈ ω such that S(x) ⊂ c. If x = 0, we then have C(0) < C(1), thus 0 ∈ c. If x ≠ 0, we then have x − 1 ∈ c, and thus C(x − 1) < C(x).

We can clearly establish that there is no bijection between 0 and 1, or 1

and 2. Now for x > 1 assume there exists a bijection β : x → x + 1. We may

then create a bijection γ : x− 1 → x by

γ(y) = β(y) if y ≠ β⁻¹(x),

γ(y) = β(x − 1) if y = β⁻¹(x).

Thus C(x−1) = C(x), which is a contradiction. Thus C(x) < C(x+1), and

thus x ∈ c, and thus by transfinite induction C(x) < C(x + 1) ∀ x ∈ ω. By

transitivity we have a < b ⇒ C(a) < C(b).

Thus C(a) = C(b) ⇒ a = b.

An alternate definition for addition and multiplication on ω is

a + 0 = a

a + b = (a + 1) + (b− 1)

and

a · 0 = 0

a · b = a + a · (b− 1).
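
The two recursions translate directly into code. In the sketch below, offered only as an illustration, a + 1 and b − 1 are taken as the successor and predecessor on Python's non-negative integers (an assumption standing in for ω), and the function names add and mul are our own.

```python
def add(a, b):
    """a + b on ω via  a + 0 = a  and  a + b = (a + 1) + (b − 1)."""
    if b == 0:
        return a
    return add(a + 1, b - 1)

def mul(a, b):
    """a · b on ω via  a · 0 = 0  and  a · b = a + a · (b − 1)."""
    if b == 0:
        return 0
    return add(a, mul(a, b - 1))

print(add(3, 4), mul(3, 4))  # 7 12
```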

Corollary This definition for addition and multiplication agrees with or-

dinal arithmetic on ω.

Proof Using the previous lemma we need only demonstrate equivalent cardinality. Define a bijection β : a ⊎ b → (a + 1) ⊎ (b − 1) by

β(x) = x if x ∈ a × {0},

β(x) = x if x ∈ (b − 1) × {1},

β(x) = (a, 0) if x = (b − 1, 1).

We also note that C(a + 0) = C(a ∪ ∅) = C(a).

To show a · b = a + a · (b − 1) we define the bijection β : a × b → a ⊎ (a × (b − 1)) by

β(x, y) = (x, 0) if y = b − 1,

β(x, y) = ((x, y), 1) if y ∈ b − 1.

Also a · 0 = C(a× ∅) = C(∅) = 0.

Exercises

For a, b, c ∈ ω show the following:

1. (a + b) + c = a + (b + c).

2. a + b = b + a.

3. (a · b) · c = a · (b · c).

4. a · b = b · a.

5. a · 1 = a.

6. (a + b) · c = a · c + b · c.

Solution to 1 Define the bijection

β : (((a × {0}) ∪ (b × {1})) × {0}) ∪ (c × {1}) → (a × {0}) ∪ (((b × {0}) ∪ (c × {1})) × {1})

by

β((x, n, 0)) = (x, 0) if n = 0 (so x ∈ a),

β((x, n, 0)) = (x, 0, 1) if n = 1 (so x ∈ b),

and β((x, 1)) = (x, 1, 1) if x ∈ c.

VII INTEGER AND RATIONAL

NUMBERS

Natural Numbers

Definition The natural or counting numbers are the elements of ω.

We use the blackboard bold face letter N to represent the set of natural

numbers. As the definition indicates they are synonymous with the ordinal

number ω.

We note here that when the natural or counting numbers are developed

via the Peano postulates, the set of natural numbers begins with 1 and does not

include the number 0. The Whole numbers, indicated by W, are considered

to be the set of numbers that are the union of the natural numbers and

0. However in the set theoretic development of numbers it is much more

convenient to consider the natural numbers as the ordinal number ω, and not

specify any particular set as being whole numbers. (The Peano postulates

can be found in most elementary number theory texts, and also in Naive Set

Theory by Paul R. Halmos.)

Integers

The next set of numbers we develop we shall call Integer Numbers

and we will indicate this set by the bold face letter Z (from the German word for numbers, Zahlen). For the purpose of brevity the integer numbers

are referred to as integers. The rationale for the development of this set is

that we may wish to answer questions such as: What number added to 2 is

0? This can be expressed symbolically by x + 2 = 0. We realize of course

that {x ∈ N | x + 2 = 0} = ∅, thus the question has a vacuous answer in the

natural numbers. We extend the natural numbers to a larger set in which

this question and other questions like it have non-vacuous answers.

We define an equivalence relation on the cartesian product of the natural

numbers with themselves, i.e., N× N, by

(a, b) ≡ (c, d) ⇔ a + d = b + c.

The rationale for this definition is that each ordered pair represents a

difference. Using our previous (to the study of set theory) concept of sub-

traction we see

a + d = b + c ⇔ a− b = c− d.

We leave it to the reader to verify that this relation is an equivalence

relation. Also the reader should note and verify that this relation is not an

equivalence relation for α× α where α > ω.

Definition The integers are the collection of equivalence classes of N × N with respect to the equivalence relation

(a, b) ≡ (c, d) if a + d = b + c.

We will indicate the equivalence class of (a, b) by [a, b], that is

[a, b] = {(x, y) | (x, y) ≡ (a, b)}.

We now wish to define an order and an arithmetic for the integers. First

we need a pair of lemmas.

Lemma 7.1 a + n = b + n ⇒ a = b ∀ n ∈ N.

Proof Let A = {n ∈ N | a + n = b + n ⇒ a = b ∀ a, b ∈ N} and let k ∈ N be such that S(k) ⊂ A. If k ≥ 2, then S(k) = {0, 1, 2, · · · , k − 1} ⊂ A. Thus a + 1 = b + 1 ⇒ a = b, and

a+k− 1 = b+k− 1 ⇒ a = b, for all a, b ∈ N. We now assume a+k = b+k,

which implies a+k−1+1 = b+k−1+1, which implies a+k−1 = b+k−1,

which implies a = b. For k = 1 we have a + 1 = b + 1 ⇒ a = b by Theorem

2.8. For k = 0, we have a + 0 = a and b + 0 = b, thus a + 0 = b + 0 ⇒ a = b.

All three cases imply k ∈ A. Thus by transfinite induction we have the

desired result.

Lemma 7.2 a + n < b + n ⇒ a < b ∀ n ∈ N.

Proof: The proof is identical to the proof of the previous lemma by re-

placing “=” with “<”, except possibly the case k = 1. For the case k = 1,

a + 1 < b + 1 ⇒ a ∪ {a} ⊂ b ∪ {b}. If a ⊄ b, then there exists x ∈ a such that x ∉ b. Since x ∈ b ∪ {b} we must have x = b. But either a ∈ b or a = b, which contradicts the axiom of regularity. We thus conclude a ⊆ b. If a = b we also have the obvious contradiction to the axiom of regularity, thus a ⊂ b ⇒ a < b.

There is a natural order that may be defined on the integers.

Definition [a, b] < [c, d] iff a + d < b + c.

Since [a, b] and [c, d] represent equivalence classes, and the numbers a, b, c, d

are specific values we must verify that the definition is valid regardless of what

pair is chosen to represent each equivalence class. That is, we must show that

the ordering is well defined.

Theorem 7.3 The ordering of the integers is well defined.

Proof Let [a, b] < [c, d], [a, b] = [x, y], and [c, d] = [z, w]. We have

a + d < b + c, a + y = b + x, and c + w = d + z

⇒ a + d + x + w < b + c + x + w = a + d + y + z

⇒ x + w < y + z

⇒ [x, y] < [z, w].

We define addition and multiplication of integers as follows

[a, b] + [c, d] = [a + c, b + d]

[a, b] · [c, d] = [ac + bd, ad + bc].

We now demonstrate that these operations are well defined.

Theorem 7.4 Addition and multiplication of integers are well defined.

Proof Let (a, b) ≡ (x, y), and (c, d) ≡ (z, w). We thus have

a + c + y + w = b + d + x + z

⇒ (a + c, b + d) ≡ (x + z, y + w)

⇒ [a, b] + [c, d] = [x, y] + [z, w].

Hence addition is well defined.

For multiplication we have

a + y = b + x and c + w = d + z

⇒ ac + cy = bc + cx, bd + dx = ad + dy,

xw + cx = dx + xz and yz + dy = cy + yw

⇒ ac + bd + xw + yz + cy + dx + cx + dy =

ad + bc + xz + yw + cy + dx + cx + dy

⇒ ac + bd + xw + yz = ad + bc + xz + yw

⇒ (ac + bd, ad + bc) ≡ (xz + yw, xw + yz)

⇒ [a, b] · [c, d] = [x, y] · [z, w].

Hence multiplication is well defined.
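
The content of Theorems 7.3 and 7.4 can be tested by brute force on small representatives. The sketch below is a finite sanity check and not a substitute for the proofs; it assumes a pair of non-negative Python integers stands for a representative (a, b), and the function names are our own.

```python
from itertools import product

def equivalent(p, q):
    """(a, b) ≡ (c, d)  iff  a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c

def less(p, q):
    """[a, b] < [c, d]  iff  a + d < b + c."""
    (a, b), (c, d) = p, q
    return a + d < b + c

def add(p, q):
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

def well_defined(op, bound=3):
    """Check that equivalent inputs give equivalent outputs for entries below bound."""
    pairs = list(product(range(bound), repeat=2))
    for p, p2, q, q2 in product(pairs, repeat=4):
        if equivalent(p, p2) and equivalent(q, q2):
            if not equivalent(op(p, q), op(p2, q2)):
                return False
    return True

print(well_defined(add), well_defined(mul))  # True True
```

For example, mul((0, 2), (0, 3)) returns (6, 0), the pair representing (−2)(−3) = 6.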

We must now develop the usual properties of the integers.

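Theorem 7.5 If a, b, c, d ∈ Z with c ≠ 0 and d > 0, then

i. a = b ⇔ a + c = b + c.

ii. a = b ⇔ ac = bc.

iii. a < b ⇔ a + c < b + c.

iv. a < b ⇔ ad < bd.

Proof Let a = [a1, a2], b = [b1, b2], c = [c1, c2], and d = [d1, d2]. We note that d > 0 ⇒ d1 > d2.
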
i.

a = b ⇔ [a1, a2] = [b1, b2]

⇔ a1 + b2 = a2 + b1

⇔ a1 + b2 + c1 + c2 = a2 + b1 + c1 + c2

⇔ [a1 + c1, a2 + c2] = [b1 + c1, b2 + c2]

⇔ [a1, a2] + [c1, c2] = [b1, b2] + [c1, c2]

⇔ a + c = b + c

ii.

a = b ⇔ [a1, a2] = [b1, b2]

⇔ a1 + b2 = a2 + b1

⇔ a1c1 + b2c1 = a2c1 + b1c1 & a1c2 + b2c2 = a2c2 + b1c2

⇔ a1c1 + b2c1 + a2c2 + b1c2 = a1c2 + b2c2 + a2c1 + b1c1

⇔ [a1c1 + a2c2, a1c2 + a2c1] = [b1c1 + b2c2, b1c2 + b2c1]

⇔ ac = bc

iii.

a < b ⇔ [a1, a2] < [b1, b2]

⇔ a1 + b2 < a2 + b1

⇔ a1 + b2 + c1 + c2 < a2 + b1 + c1 + c2

⇔ [a1 + c1, a2 + c2] < [b1 + c1, b2 + c2]

⇔ [a1, a2] + [c1, c2] < [b1, b2] + [c1, c2]

⇔ a + c < b + c

iv.

a < b ⇔ [a1, a2] < [b1, b2]

⇔ a1 + b2 < a2 + b1

⇔ a1d1 + b2d1 < a2d1 + b1d1 & a1d2 + b2d2 < a2d2 + b1d2

also a1d2 + b2d2 < a1d1 + b2d1 & a2d2 + b1d2 < a2d1 + b1d1

since d2 < d1

⇔ a1d1 + b2d1 + a2d2 + b1d2 < a1d2 + b2d2 + a2d1 + b1d1

⇔ [a1d1 + a2d2, a1d2 + a2d1] < [b1d1 + b2d2, b1d2 + b2d1]

⇔ ad < bd

Definition An injection of a set a into a set b is a bijection from a to a

subset of b. We will use the symbol → to indicate that a map is an injection.

There is a natural injection, J , from N to Z defined by

J : x → [x, 0].

When there exists an injection from one set to another that preserves

order and arithmetic properties, we say the first set is embedded into the

second. It is easy to verify that the natural injection is an embedding.

Lemma 7.6 If [a, b] is an integer, then there exists a natural number c,

such that [a, b] = [c, 0], or [a, b] = [0, c].

Proof By trichotomy, either a > b, a = b, or a < b. If a > b let c be such

that b + c = a, thus [a, b] = [c, 0]. If a = b let c = 0 (recall for us that 0

is a natural number), thus [a, b] = [0, 0] = [c, 0]. If a < b let c be such that

a + c = b, thus [a, b] = [0, c].

When a + c = b, and a, b, c ∈ N, we express c as b− a.

It is convenient to represent an equivalence class by one of its elements.

When a choice function is defined to choose from each of the equivalence

classes a representative element, that element is known as the canonical

representative.

For the integers we define our choice function to be

φ([a, b]) = (a − b, 0) if a ≥ b,

φ([a, b]) = (0, b − a) if b > a.

Thus every integer can be represented by [a, 0] or [0, a]. When the num-

bers are understood to be integers we will use a to represent [a, 0] and −a to

represent [0, a]. As an exercise the reader may wish to show that a > 0, and

−a < 0, that is [a, 0] > [0, 0], and [0, a] < [0, 0].
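
The choice function φ is easy to compute. The following sketch, given only as an illustration, takes any representative (a, b) of non-negative Python integers and returns the canonical one; the helper as_signed, which reads off the usual signed value, is our own addition.

```python
def canonical(pair):
    """The canonical representative chosen by φ: (a − b, 0) if a ≥ b, else (0, b − a)."""
    a, b = pair
    return (a - b, 0) if a >= b else (0, b - a)

def as_signed(pair):
    """Read [c, 0] as c and [0, c] as −c."""
    c, d = canonical(pair)
    return c if d == 0 else -d

print(canonical((5, 2)), canonical((2, 5)))  # (3, 0) (0, 3)
print(as_signed((2, 5)))                     # -3
```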

Definition The set of integers strictly greater than 0 is called the Positive

Integers and are denoted by Z+. Those integers strictly less than 0 are called

the Negative Integers and are denoted by Z−.

An Integral Domain

We leave it to the reader to verify the following properties for Z.

1. a + b ∈ Z ∀a, b ∈ Z.

2. ab ∈ Z ∀a, b ∈ Z.

3. (a + b) + c = a + (b + c)∀a, b, c ∈ Z.

4. (ab)c = a(bc)∀a, b, c ∈ Z.

5. a + b = b + a∀a, b ∈ Z.

6. ab = ba∀a, b ∈ Z.

7. a(b + c) = ab + ac∀a, b, c ∈ Z.

8. ∃e ∈ Z such that a + e = e + a = a∀a ∈ Z

9. ∃u ∈ Z such that au = ua = a∀a ∈ Z

10. ∀a ∈ Z ∃(−a) ∈ Z such that a + (−a) = (−a) + a = 0

11∗. If ab = 0, then either a = 0, or b = 0.

Properties 1 and 2 are called the closure properties, for addition and mul-

tiplication respectively, properties 3 and 4 are the associative properties, 5

and 6 are the commutative properties. Property 7 is the distributive prop-

erty, we say multiplication distributes over addition. In properties 8 and 9

e and u are called the identities (again additive identity and multiplicative

identity respectively). The −a in property 10 is called the additive inverse,

or opposite. We say that a number a is a zero divisor if ab = 0 for some b, where neither a nor b equals 0 (of course b is then also a zero divisor). Property 11∗ is called the

“no zero divisors” property.

Definition Any set with two binary operations satisfying these 11 proper-

ties is called an Integral Domain.

Rational Numbers

Another question we may wish to answer is: Two times what number is

1? This can be represented symbolically by 2x = 1. Again this question has

a vacuous answer in the set of integers.

We extend the integers to the set of rational numbers by defining the

appropriate equivalence relation on the cartesian product of the integers with

themselves. We use the bold face letter Q to represent rational numbers.

The letter Q comes from the term quotient, i.e., the rational numbers are a

collection of quotients.

Definition The rational numbers are the collection of equivalence classes

of Z × (Z − {0}) with respect to the equivalence relation

(x, y) ≡ (z, w) ⇔ xw = yz.

From the above comment we can see the rationale for this definition.

Using our previous notion of quotients we see xw = yz ⇔ x/y = z/w provided y, w ≠ 0. The reader should again verify that the relation is an equivalence relation.

If we let a, b ∈ Z such that a ≥ 0 and b > 0, then the reader should verify that (−a, −b) ≡ (a, b) and (a, −b) ≡ (−a, b). Thus we may (and shall) assume for any rational number [a, b] that b > 0. Let d be the least element of {y | (x, y) ∈ [a, b] and y > 0}. We then let the unique element (c, d) ∈ [a, b] be the canonical representative of the rational number [a, b].

We again have the natural order defined on the rational numbers given

by

[x, y] < [z, w] iff xw < yz.

Theorem 7.7 The ordering of rational numbers is well defined.

Proof Let [x, y] = [a, b], [z, w] = [c, d], and [x, y] < [z, w] with b, d, y, w >

0. We thus have

ay = bx, dz = cw and xw < yz

⇒ bdxyw² < bdy²zw ⇒ ady²w² < bcy²w² ⇒ ad < bc ⇒ [a, b] < [c, d].

Hence the ordering is well defined.

We define addition and multiplication as follows,

[x, y] + [z, w] = [xw + zy, yw]

[x, y] · [z, w] = [xz, yw].

Theorem 7.8 Addition and multiplication of rational numbers are well

defined.

Proof Let [x, y] = [a, b] and [z, w] = [c, d].

For addition we have

ay = bx and dz = cw

⇒ aydw = bxdw and bydz = bycw

⇒ bdxw + bdyz = adyw + bcyw

⇒ [xw + yz, yw] = [ad + bc, bd].

Hence addition is well defined.

For multiplication we have

ay = bx and dz = cw

⇒ acyw = bdxz

⇒ [ac, bd] = [xz, yw].

Hence multiplication is well defined.
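
The construction of the rational numbers can be mirrored with pairs of Python integers. The sketch below is illustrative only and the function names are our own. It relies on the standard fact that dividing both entries by their greatest common divisor gives the representative with the least positive second entry, and it compares the result with Python's fractions.Fraction.

```python
from fractions import Fraction
from math import gcd

def equivalent(p, q):
    """(x, y) ≡ (z, w)  iff  xw = yz."""
    (x, y), (z, w) = p, q
    return x * w == y * z

def less(p, q):
    """[x, y] < [z, w]  iff  xw < yz; both second entries are assumed positive."""
    (x, y), (z, w) = p, q
    return x * w < y * z

def add(p, q):
    (x, y), (z, w) = p, q
    return (x * w + z * y, y * w)

def mul(p, q):
    (x, y), (z, w) = p, q
    return (x * z, y * w)

def canonical(p):
    """Representative with the least positive second entry."""
    x, y = p
    if y < 0:                 # use (−x, −y) ≡ (x, y) to make the second entry positive
        x, y = -x, -y
    g = gcd(x, y)
    return (x // g, y // g)

print(canonical(add((1, 2), (1, 3))))  # (5, 6)
print(canonical((2, -4)))              # (-1, 2)
```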

Just as there is a natural embedding of the natural numbers into the

integers, there is the natural embedding of the integers into the rational

numbers given by the injection,

J : x → [x, 1].

It is easy to verify that this injection is an embedding.

A Field

We leave it to the reader to verify the following properties of Q.

1. a + b ∈ Q ∀a, b ∈ Q.

2. ab ∈ Q ∀a, b ∈ Q.

3. (a + b) + c = a + (b + c) ∀a, b, c ∈ Q.

4. (ab)c = a(bc) ∀a, b, c ∈ Q.

5. a + b = b + a ∀a, b ∈ Q.

6. ab = ba ∀a, b ∈ Q.

7. a(b + c) = ab + ac ∀a, b, c ∈ Q.

8. ∃e ∈ Q such that a + e = e + a = a ∀a ∈ Q

9. ∃u ∈ Q such that au = ua = a ∀a ∈ Q

10. ∀a ∈ Q ∃(−a) ∈ Q such that a + (−a) = (−a) + a = 0

11. ∀a ∈ Q, a ≠ 0, ∃a⁻¹ such that aa⁻¹ = a⁻¹a = 1

The first 10 properties are identical to the properties of an Integral Domain.

Property 11∗ is replaced by property 11 where a⁻¹ is called the multiplicative

inverse, or reciprocal.

Any set with two binary operations satisfying these 11 properties is called

a Field.

Exercise Show that every field is an integral domain. That is to say, every

field has no zero divisors.

Differences and Quotients

Definition The difference between integers or rational numbers a and b

is a + (−b), which is written a− b.

Definition The quotient of two rational numbers a and b, with b ≠ 0, is a · b⁻¹, which is written a/b.

We can see that difference and quotient can be regarded as binary operations; we also notice that neither operation is commutative nor associative.

Exercise For any two rational numbers p < q, show that p […]

1. If p, q, r are rational numbers where p ≤ q and r > 0, then pr ≤ qr.

2. If p, q, r are rational numbers where p ≤ q and r < 0, then pr ≥ qr.

3. If p and q are positive rational numbers, then p/q ≥ 1 ⇒ q/p ≤ 1.

Mathematical Induction

The next theorem is a special case of transfinite induction, that is widely

used in many situations. Before we state and prove the theorem we need two

small lemmas that we present as exercises.

Exercise 1. Define the map φ : N → Z+ by φ(a) = [a + 1, 0]. Show that

φ(a) + 1 = φ(a + 1).

Exercise 2. Show that Z+ is order isomorphic to ω.

Theorem 7.9 The Principle of Mathematical Induction If T ⊆ Z+ such

that the following conditions are true:

i. 1 ∈ T

ii. if k ∈ T , then k + 1 ∈ T ,

then T = Z+.

Proof Consider the order preserving bijection φ : ω → Z+ defined by

φ(a) = [a + 1, 0]. Let A = φ⁻¹(T). Let x ∈ ω such that S(x) ⊂ A.

If x = 0, then φ(x) = 1 ∈ T ⇒ x ∈ A. If x ≠ 0, then

S(x) ⊂ A ⇒ x− 1 ∈ A

⇒ φ(x− 1) ∈ T

⇒ φ(x− 1) + 1 ∈ T

⇒ φ(x) ∈ T

⇒ x ∈ A.

Thus by Transfinite Induction A = ω ⇒ T = φ(A) = φ(ω) = Z+.

The Cardinality of Integers and Rational Numbers

Theorem 7.10 Both the integers and the rational numbers are countable.

Proof By Theorem 4.4 we know that N × N is countable. There exists the

natural embedding of Z into N × N by identifying [a, b] with its canonical

representative (a− b, 0) or (0, b− a). Thus there exists a bijection from Z to

a subset of a countable set, hence Z is countable. Again by Theorem 4.4 we

know that Z× Z is countable, and there exists the natural embedding of Q

into Z×Z by identifying [a, b] with its canonical representative, (c, d) where

d is positive and minimal. Thus there exists a bijection from Q to a subset

of a countable set, hence Q is countable.
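
The argument can be made concrete by writing down one explicit enumeration. The sketch below lists N × N diagonal by diagonal (the usual Cantor pairing, which is not necessarily the method used to prove Theorem 4.4) and composes it with the canonical representative of an integer. The function names are our own and the code is an illustration only.

```python
def pair_index(a, b):
    """Position of (a, b) when N x N is listed diagonal by diagonal."""
    s = a + b
    return s * (s + 1) // 2 + b

def integer_index(n):
    """Index of the integer n, via its canonical representative (n, 0) or (0, −n)."""
    rep = (n, 0) if n >= 0 else (0, -n)
    return pair_index(*rep)

# Distinct integers receive distinct natural numbers.
indices = [integer_index(n) for n in range(-50, 51)]
assert len(set(indices)) == len(indices)
```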

Let Z∗ represent the image of the embedding of Z into Q. Also let Z+∗

and Z−∗ represent respectively the images of the embeddings of the positive

and negative integers into the rationals.

The Archimedean Property

Theorem 7.11 Archimedean Property ∀r ∈ Q ∃n ∈ Z+∗ such that r < n.

Proof Let r = [a, b], if r ≤ 1, then r < 2 and we are done. If r > 1, then

without loss of generality, both a, and b are positive integers, and

b(a + 1) ≥ a + 1 > a

⇒ [a, b] < [a + 1, 1] ∈ Z+∗ .

Lemma For positive rational numbers r and s, if r > s, then r⁻¹ < s⁻¹.

Proof First we note that if r = [a, b], then r⁻¹ = [b, a]; this is immediate by computing [a, b] · [b, a] = [ab, ab] = [1, 1]. Now let r = [a, b] and s = [c, d].

[a, b] > [c, d] ⇔ ad > bc

⇔ [d, c] > [b, a]

⇔ s⁻¹ > r⁻¹.

For any integer, a, the product of a with itself b times, where b is a positive integer, is denoted a^b.

Exercise Prove that for any positive integer, n, there exists an integer of the form 2^m for some positive integer m, such that 2^m > n. (Hint: use induction).

Solution For n = 1, 1 < 2 = 2^1. Now assume n < 2^m for some m, then n + 1 < 2^m + 1 < 2^m + 2^m = 2^(m+1).
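
The exercise guarantees that a simple search terminates. As a small illustration (the function name is our own), the following sketch finds the least positive m with 2^m > n.

```python
def power_of_two_above(n):
    """Least positive integer m with 2**m > n, for a positive integer n."""
    m = 1
    while 2 ** m <= n:
        m += 1
    return m

print(power_of_two_above(1), power_of_two_above(8), power_of_two_above(1000))  # 1 4 10
```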

Exercise Prove that for any positive rational number, q, there exists a rational number of the form 2^n, where 2^n > q, and where n is the embedded image of a positive integer.

Solution Let q = [a, b] where a and b are positive integers. We then have [a, b] ≤ [a, 1] < [2^n, 1] for some n.

The Division Algorithm

Theorem 7.12 The division algorithm If a and d are integers with d > 0,

then there exist unique integers q and r such that a = dq + r and 0 ≤ r < d.

Proof This result is a consequence of the well ordering property.

Let S = {x ∈ Z | x = a − dn for some n ∈ Z}, and let S′ = {x ∈ S | x ≥ 0}. S′ is thus the embedded image of some subset of N, and thus if it is non-empty it must have a least element.

If a ≥ 0, then let n = 0, and thus x = a − 0 = a ≥ 0. Thus a ∈ S′. If a < 0, then let n = a, thus x = a − ad = a(1 − d) ≥ 0. Thus a − ad ∈ S′. Thus S′ ≠ ∅. Since S′ ≠ ∅, and is the embedded image of a subset of N, S′ has a least element. Let the least element be r, and let q be such that r = a − dq. Thus we have r = a − dq ≤ s ∀s ∈ S′ and a = dq + r where r ≥ 0.

We now must show r < d. We have a − d(q + 1) = a − dq − d = r − d,

thus r − d ∈ S. Since r is the least element in S ′ and r − d < r we have

r − d < 0 ⇒ r < d. We thus have a = dq + r with 0 ≤ r < d.

Now we have to show that q and r are unique. Suppose a = dq1 + r1 and

a = dq2 + r2 where 0 ≤ r1 < d and 0 ≤ r2 < d. Without loss of generality

we may assume r1 ≤ r2. We thus have 0 ≤ r2 − r1 ≤ r2 < d. We note that

0 ≤ r2 − r1 = a− dq2 − a + dq1 = d(q1 − q2). Thus r2 − r1 is a multiple of d

and non-negative. We thus have 0 ≤ r2 − r1 < d and r2 − r1 = d(q1 − q2)

⇒ 0 ≤ d(q1 − q2) < d ⇒ 0 ≤ q1 − q2 < 1 ⇒ q1 − q2 = 0. Thus q1 = q2,

and thus r1 − r2 = 0 ⇒ r1 = r2.
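
The existence part of the proof is constructive enough to run. The sketch below starts from the element of S′ exhibited in the proof and steps down to the least one; it is an illustration only, the name divide is our own, and the result is compared with Python's built-in divmod, which for d > 0 returns the same q and r.

```python
def divide(a, d):
    """Return (q, r) with a = d*q + r and 0 <= r < d, for an integer a and d > 0."""
    if d <= 0:
        raise ValueError("d must be positive")
    # An element of S' as in the proof: n = 0 if a >= 0, otherwise n = a.
    n = 0 if a >= 0 else a
    r = a - d * n
    # Step down through S' until the least non-negative element is reached.
    while r - d >= 0:
        n += 1
        r -= d
    return n, r

print(divide(7, 3))   # (2, 1)
print(divide(-7, 3))  # (-3, 2)
```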

Exercises

1. Verify that the relation (a, b) ≡ (c, d) ⇔ a+d = b+c is an equivalence

relation on N× N, but not on α× α where α > ω.

2. Verify that the relation (x, y) ≡ (z, w) ⇔ xw = yz is an equivalence

relation on Z × (Z − {0}).

For any Integral Domain D show:

3. a · e = e ∀a ∈ D and where e is the additive identity.

4. u · (−u) = −u, where u is the multiplicative identity.

5. (−u) · (−u) = u, where u is the multiplicative identity.

6. If a, z ∈ D and a + z = a, then show z = e.

Solution to 3: a = a · u = a · (u + e) = a + a · e ⇒ a · e = e.

7. If v, a ∈ F , where F is a field and a · v = a, then show v = u.

8. Show that every Field is an Integral Domain.

Mathematical Induction is often used to prove certain identities. Exer-

cise 9 and its solution exemplifies how this is done. Exercise 10 is left

as practice.

9. Show that ∑_{k=1}^{n} k = n(n + 1)/2 ∀n ∈ Z+.

Solution to 9: Let A = {n ∈ Z+ | ∑_{k=1}^{n} k = n(n + 1)/2}.

i. ∑_{k=1}^{1} k = 1 = 1(2)/2, thus 1 ∈ A.

ii. Assume m ∈ A, then

∑_{k=1}^{m+1} k = m + 1 + ∑_{k=1}^{m} k = m + 1 + m(m + 1)/2 = (m + 1)(m + 2)/2.

Thus m + 1 ∈ A. Therefore by Mathematical Induction A = Z+, and ∑_{k=1}^{n} k = n(n + 1)/2 ∀n ∈ Z+.

10. Show that ∑_{k=1}^{n} (2k − 1) = n² ∀n ∈ Z+.
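
Before attempting an induction proof it can be reassuring to check an identity for the first few values of n. The sketch below does this for Exercises 9 and 10; it is a numerical check only and proves nothing for all n. The function names are our own.

```python
def sum_first(n):
    """1 + 2 + ... + n, computed term by term."""
    return sum(range(1, n + 1))

def sum_odd(n):
    """1 + 3 + ... + (2n − 1), computed term by term."""
    return sum(2 * k - 1 for k in range(1, n + 1))

for n in range(1, 101):
    assert sum_first(n) == n * (n + 1) // 2   # Exercise 9
    assert sum_odd(n) == n * n                # Exercise 10
```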

VIII REAL NUMBERS

It is a well known fact, and a standard exercise, that √2 is not rational. This means the equation x² − 2 = 0 has no solutions in the rational numbers,

hence the need to extend the rational numbers to a larger set that would

include the solutions to such equations.

The next set we develop is the set of real numbers, which we will indicate

by R.

Dedekind Cuts and Real Numbers

Definition A Cut or Dedekind cut of the rational numbers is a subset

A of the rational numbers such that

1. A ≠ ∅, and A ≠ Q.

2. If a ∈ A, and b < a, then b ∈ A.

3. If a ∈ A, then ∃ a′ ∈ A such that a < a′.

From this definition we see that a cut forms a partition of the rational num-

bers into two non-empty subsets, the cut and its complement, where every

element of the cut is less than every element of its complement. The cut is

called the lower set and its complement is called the upper set.

Definition The set of Real Numbers, R, is the collection of all cuts of

the rational numbers.

We define a natural ordering of the real numbers by the following.

Definition For two real numbers, A and A′, we say A < A′ if A ⊂ A′. We

emphasize the inclusion is proper, i.e., A ≠ A′.

Theorem 8.1 For any rational number r the set z = {p | p < r} is a cut,

and hence a real number.

Proof We demonstrate that z satisfies the three conditions of the definition

of a cut.

1) r − 1 < r ⇒ r − 1 ∈ z. Thus z ≠ ∅. Also r ≮ r ⇒ r ∉ z ⇒ z ≠ Q.

2) If q ∈ z and p < q, then p < r, thus p ∈ z.

3) If q ∈ z, then q < r ⇒ q < (q + r)/2 < r, thus (q + r)/2 ∈ z.

There is a natural injection of the rational numbers into the real numbers,

the injection is given by

J : q → A where A = {p | p < q}.

We will use the notation q̂ to represent the real number {p | p < q}; in particular the rational number 0 embeds as J : 0 → {p | p < 0} = 0̂. Throughout

the remainder of this chapter real numbers will be marked by the symbol ˆ.

In subsequent chapters the ˆ will be omitted, and a number will be known

to be real by the context.
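
A cut can be handled on a computer as a membership test on rational numbers. The sketch below is an illustration under our own assumptions: the names embed, sqrt2_cut and satisfies_first_two_conditions are hypothetical, and the set {p | p < 0 or p² < 2} is the customary cut attached to √2 rather than something defined in the text. Only conditions 1 and 2 of the definition are tested, and only on a finite sample; condition 3 cannot be confirmed from finitely many points.

```python
from fractions import Fraction

def embed(q):
    """The cut {p | p < q} attached to the rational number q, as a membership test."""
    return lambda p: p < q

def sqrt2_cut(p):
    """Membership test for the set {p | p < 0 or p*p < 2}."""
    return p < 0 or p * p < 2

def satisfies_first_two_conditions(member, samples):
    """Check conditions 1 and 2 of the definition of a cut on a finite sample."""
    inside = [p for p in samples if member(p)]
    outside = [p for p in samples if not member(p)]
    proper = bool(inside) and bool(outside)
    # Downward closure on the sample: every member lies below every non-member.
    downward_closed = all(a < b for a in inside for b in outside)
    return proper and downward_closed

samples = [Fraction(n, 8) for n in range(-40, 41)]
print(satisfies_first_two_conditions(sqrt2_cut, samples))               # True
print(satisfies_first_two_conditions(embed(Fraction(1, 3)), samples))   # True
```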

Exercise Show that if r ∈ Q and r̂ < x, then r ∈ x.

Solution r̂ = {p | p < r} ⊂ x, thus ∃ t ∈ x such that t ∉ r̂, thus t ≥ r. If t = r we are done, if t > r, then r ∈ x.

Trichotomy

Lemma 8.2 The trichotomy property for real numbers For any real num-

ber x exactly one of the following is true:

x > 0̂, x = 0̂, or x < 0̂.

Proof Let x be a real number, then exactly one of the following is true:

0̂ ⊂ x, x = 0̂, or x ⊂ 0̂.

Definition If x > 0̂ then we say x is a positive real number. If x < 0̂ then we say x is a negative real number.

Addition

We define addition of real numbers by the following:

Definition x + y = {r | r < p + q, p ∈ x, and q ∈ y}.

Theorem 8.3 The sum x + y is a real number.

Proof We show that x + y satisfies the definition of a cut.

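1) Since x and y are not empty there exist p ∈ x and q ∈ y, and p + q − 1 < p + q, thus p + q − 1 ∈ x + y, and so x + y ≠ ∅. Since x ≠ Q and y ≠ Q there exist c ∉ x and d ∉ y, thus c > t ∀t ∈ x and d > w ∀w ∈ y. We have c + d > t + w ∀t ∈ x and ∀w ∈ y. Thus c + d ∉ x + y, and so x + y ≠ Q.

2) Let a ∈ x + y and b < a. We have b < a < p + q for some p ∈ x and q ∈ y, thus b ∈ x + y.

3) Let a ∈ x + y, thus a < p + q for some p ∈ x and q ∈ y. There exists r > p such that r ∈ x, thus p + q < r + q. Thus a < p + q ∈ x + y.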

Corollary x + y = {p + q | p ∈ x, and q ∈ y}.

Proof From part 2) we have

{p + q | p ∈ x, and q ∈ y} ⊆ {r | r < p + q, p ∈ x, and q ∈ y}.

Now if r < p + q where p ∈ x and q ∈ y, then r − p < q, thus r − p ∈ y and so r = p + (r − p) where p ∈ x and (r − p) ∈ y, thus

{r | r < p + q, p ∈ x, and q ∈ y} ⊆ {p + q | p ∈ x, and q ∈ y}.

Thus x + y = {p + q | p ∈ x, and q ∈ y}.

We leave to the reader as an exercise to show addition of real numbers is

commutative and associative; i.e., x+ y = y+ x, and x+(y+ z) = (x+ y)+ z

for all real numbers x, y, z.

Exercises For p, q ∈ Q, and A ∈ N show

i) (p + q)ˆ = p̂ + q̂.

ii) (∑_{i∈A} p_i)ˆ = ∑_{i∈A} p̂_i.

Theorem 8.4 For any real number x and the real number 0̂, x + 0̂ = x.

Proof: If r ∈ x + 0̂, then r = p + q where p ∈ x and q < 0, thus r = p + q < p, thus r ∈ x, and so x + 0̂ ⊆ x. If r ∈ x, then ∃ s ∈ x such that s > r, thus r − s < 0, thus r − s ∈ 0̂, and (r − s) + s = r, thus x ⊆ x + 0̂. Hence x + 0̂ = x.

We say 0̂ is the additive identity.

Definition If for two real numbers x, and y we have x + y = 0, then we

say y is the additive inverse or opposite of x. We use the notation −x to

indicate the additive opposite of x.

Theorem 8.5 Every real number has an additive inverse.

Proof Let x be a real number and let y = {q ∈ Q | q + p < 0 ∀ p ∈ x}. We first show that y is a real number, i.e., y is a cut.

1) Since x ≠ Q ∃ d ∈ Q such that d ∉ x. Thus d > p ∀ p ∈ x. Thus −d < −p ∀ p ∈ x. Thus p + (−d) < 0 ∀ p ∈ x. Thus −d ∈ y. Thus y ≠ ∅. Also we notice for any p ∈ x, p + (−p) = 0, thus −p ∉ y, thus y ≠ Q.

2) Let q ∈ y and a < q. We thus have a + p < q + p < 0 ∀ p ∈ x. Thus

a ∈ y.

3) Let q ∈ y, thus q + p < 0 ∀p ∈ x. Now assume p + q + r ≥ 0 for some

p ∈ x and ∀r ∈ Q+. Thus p + r 6∈ x ∀r ∈ Q+. But ∀p ∈ x ∃s > p such that

s ∈ x. Let r = s− p, and we thus have p + r = s ∈ x, which contradicts our

assumption. Thus ∀p ∈ x ∃r ∈ Q+ such that q + p + r < 0, thus q + r ∈ y,

and q < q + r.

Thus y is a cut and we write y = y.

We now show that x + y = 0. If r ∈ x + y, then r = p + q < 0, thus

x + y ⊆ 0.

If s ∈ 0, and p ∈ x is arbitrary, then s − p ∈ Q and p + s − p = s < 0,

thus s − p ∈ y. Thus s = p + s − p ∈ x + y, and thus 0 ⊆ x + y, and thus

x + y = 0.

Lemma 8.6 −(−x) = x.

Proof We have

x + (−x) = 0 = −(−x) + (−x)

⇒ x + (−x) + x = −(−x) + (−x) + x

⇒ x = −(−x)

Lemma 8.7 x > 0 if and only if −x < 0.

Proof Assume x > 0̂, then 0 ∈ x, so ∃p ∈ x such that p > 0. Now assume −x ≥ 0̂, then ∃q ∈ −x such that q > −p. Thus q + p > 0, and thus x + (−x) > 0̂, which is a contradiction.

Now assume −x < 0̂, then ∃q < 0 such that q > p ∀p ∈ −x. So −q < −p ∀p ∈ −x. Thus p + (−q) < 0 ⇒ −q ∈ x. Since we have 0 < −q, we have 0 ∈ x, and thus x > 0̂.

Corollary −0 = 0.

Lemma 8.8 −(x + y) = −x + (−y).

Proof We have −x + (−y) = {r + s | r ∈ −x and s ∈ −y}. Thus for r ∈ −x and s ∈ −y, r + p < 0 ∀p ∈ x and s + q < 0 ∀q ∈ y. Thus r + s + p + q < 0 ∀(p + q) ∈ x + y. Thus −x + (−y) = {r + s | (r + s) + (p + q) < 0 ∀(p + q) ∈ x + y, r ∈ −x, s ∈ −y}.

Now we also have −(x + y) = {t | t + (p + q) < 0 ∀(p + q) ∈ x + y}. Since

(r + s) ∈ Q ∀r ∈ −x and s ∈ −y we have (r + s) ∈ −(x + y). Thus we have

−x + (−y) ⊆ −(x + y).

Now let t ∈ −(x + y), then t + (p + q) < 0 ∀p ∈ x and q ∈ y. Also t + (p + q) < (t + (p + q))/2 < 0. So (t + (p + q))/2 − p + p < 0 ∀p ∈ x and q ∈ y. Thus (t + (p + q))/2 − p ∈ −x. Also (t + (p + q))/2 − q + q < 0 ∀p ∈ x and q ∈ y. Thus (t + (p + q))/2 − q ∈ −y. And we have

t = ((t + (p + q))/2 − p) + ((t + (p + q))/2 − q).

Thus t ∈ −x + (−y). Thus −(x + y) ⊆ −x + (−y).

Thus we have −(x + y) = −x + (−y).

Multiplication

We now define multiplication of real numbers. First we define the product

of two positive real numbers by:

Definition For x, y > 0̂, x · y = {r | r < pq, p ∈ x, q ∈ y, and p > 0, q > 0}.

For ease of notation we will often use juxtaposition to indicate the oper-

ation of multiplication. That is xy ≡ x · y.

Theorem 8.9 If x and y are positive real numbers, then xy is a positive

real number.

Proof We show xy is a cut.

1) x, y > 0̂ ⇒ ∃p > 0, q > 0, p ∈ x and q ∈ y, thus 0 < pq, thus 0 ∈ xy, thus xy ≠ ∅. Now ∃a ∉ x and ∃b ∉ y, thus a > p ∀p ∈ x and b > q ∀q ∈ y, thus ab > pq ∀p ∈ x and q ∈ y. Hence ab ∉ xy. Hence xy ≠ Q.

2) Let r ∈ xy, r < pq for some p ∈ x and q ∈ y. If s < r, then s < pq

and thus s ∈ xy.

3) Let r ∈ xy. We have r < pq for some positive p ∈ x and q ∈ y. Since

∃s ∈ x such that p < s and ∃t ∈ y such that q < t we have r < pq < st, thus

pq ∈ xy.

Since xy satisfies the definition of a cut we conclude xy is a real number.

To show that xy is positive we notice that x > 0̂ ⇒ 0̂ ⊂ x ⇒ 0 ∈ x ⇒ ∃p > 0, p ∈ x. Similarly ∃q > 0, q ∈ y. Now let r ∈ x and r > 0, s ∈ y and s > 0 be arbitrary. We thus have 0 < rs ⇒ 0 ∈ xy ⇒ 0̂ ⊂ xy ⇒ xy > 0̂.


We complete the definition of multiplication by:

xy = 0 if x = 0 or y = 0,
xy = −((−x)y) if x < 0 and y > 0,
xy = −(x(−y)) if x > 0 and y < 0,
xy = (−x)(−y) if x < 0 and y < 0.

Corollary If x and y are real numbers, then xy is a real number.

Associativity and commutativity of multiplication follow immediately

from the definition of multiplication and the associativity and commutativity

of rational numbers.

Exercise For p, q ∈ Q, A ∈ N show

i) p · q = p · q.

ii)∏i∈A

pi =∏i∈A

pi.

Theorem 8.10 The real number 1 = {p | p < 1} is the multiplicative identity.

Proof First consider a positive real number x.

x · 1 = {r | r < pq, p ∈ x, q ∈ 1, p > 0, q > 0}.

If t ∈ x · 1, then ∃p ∈ x, p > 0 and q ∈ 1, q > 0 such that t < pq < p, thus t ∈ x. Hence x · 1 ⊆ x.

Conversely, if t ∈ x, then ∃q ∈ x, q > 0, such that t < q. We
now note that t < (t + q)/2 < q and (t + q)/(2q) < 1. We thus have t < q · (t + q)/(2q), thus
t ∈ x · 1. Hence x · 1 = x.


For x = 0 we always have 0 · 1 = 0.

For x < 0 we have x · 1 = −((−x) · 1) = −(−x) = x.

Theorem 8.11 x ≠ 0, if and only if ∃y such that xy = 1.

Proof In the converse direction we simply note that 0y = 0 ∀y ∈ R. Thus,

if xy = 1, then neither x nor y can be 0.

In the forward direction, let x > 0, and let

y = {s | ∃ p, q > 0, q ∈ x, pq < 1 and s < p}.

We first show that y is a cut.

1. x > 0 ⇒ 0 ∈ x ⇒ ∃q ∈ x where q > 0. Since 0 · q = 0 < 1 ∀q we
have 0 ∈ y. Thus y ≠ ∅. To show that y ≠ Q, pick q ∈ x, q > 0, then
q^{−1} · q = 1, thus q^{−1} ∉ y.

2. If s ∈ y and t < s, then t < p, thus t ∈ y.

3. If s ∈ y, then s < p, thus s < (s + p)/2 < p, thus (s + p)/2 ∈ y.

We conclude that y is a cut and thus is a real number.

For any s ∈ y, q ∈ x and p > 0 where pq < 1 we have s < p, thus sq < pq < 1.

Hence yx = {r | r < sq < pq < 1, s, p, q > 0} = {r | r < 1} = 1.

To complete the proof, assume x < 0, then ∃y such that (−x) · y = 1,

thus

x · (−y) = −xy = (−x) · y = 1.


The set of Real Numbers is a Field

To complete the verification that the real numbers form a field we demon-

strate that multiplication distributes over addition.

Theorem 8.12 x · (y + z) = x · y + x · z.

To prove Theorem 8.12 we need the following lemma.

Lemma If p ∈ x and q ∈ y where p, q > 0, then pq ∈ xy.

Proof ∃ r ∈ x such that p < r, thus pq < rq, thus pq ∈ xy

Proof of Theorem 8.12

i) Assume x, y + z > 0, then either y > 0 or z > 0.

x · (y + z) = {t | t < ps, p ∈ x, s ∈ y + z, p > 0, s > 0}.

If y and z are positive, then for s ∈ (y + z) we can write s = q + r
where q ∈ y, r ∈ z and q, r > 0. Thus we have

x · (y + z) = {t | t < p · s = p(q + r) = pq + pr, p, q, r > 0 with p ∈ x, q ∈ y, r ∈ z}.

Now

xy + xz = {t | t < pq + p′r, p, p′, q, r > 0, p, p′ ∈ x, q ∈ y, r ∈ z}.

Thus x · (y + z) ⊆ xy + xz.

Now let p, p′ ∈ x, p, p′ > 0, then pq + p′r = p(q + (p′/p)r) = p′((p/p′)q + r)
and either p′/p ≤ 1 or p/p′ ≤ 1, thus either (p/p′)q ∈ y or (p′/p)r ∈ z.

Thus xy + xz ⊆ x · (y + z), and thus xy + xz = x · (y + z).

Now without loss of generality, assume y > 0 and z ≤ 0. Then we have

x · (y + z) = {t | t < ps, p > 0, s > 0, p ∈ x and s ∈ y + z}.

Now, s = q − r, where q ∈ y, −r ∈ z, q, r > 0. Thus

x · (y + z) = {t | t < ps = p(q − r) = pq − pr, p, q, r > 0}.

Now

xy + xz = xy − x(−z) = {t | t < p − q, p ∈ xy, and q ∈ x(−z)}.

We have p = su, q = rv, where s ∈ x, u ∈ y, r ∈ x, v ∈ −z. Thus

xy + xz = xy − x(−z) = {t | t < su − rv}.

Thus we again have x · (y + z) ⊆ xy + xz.

And again either r/s < 1 or s/r < 1. Thus either (s/r)u ∈ y or (r/s)v ∈ −z.

Thus su − rv = s(u − (r/s)v) = r((s/r)u − v), and one is in x · (y + z).

Thus xy + xz ⊆ x · (y + z), and thus xy + xz = x · (y + z).

ii) Now assume x > 0, y + z < 0. Then

x · (y + z) = −(x · (−(y + z)))

= −(x(−y − z))

= −(−xy − xz) = xy + xz.

iii) Assume x < 0, y + z > 0. Then

x · (y + z) = −(−x · (y + z))

= −(−xy − xz) = xy + xz.


iv) Assume x < 0, y + z < 0. Then

x · (y + z) = (−x · (−(y + z)))

= (−x · (−y − z))

= (−x(−y)− x(−z))

= −(x(−y) + x(−z))

= −(−(xy + xz)) = xy + xz

v) If y + z = 0, then z = −y and we have x · (y + z) = x · 0 = 0 and

xy + xz = xy + x(−y) = xy − xy = 0.

vi) Finally, if x = 0, then

x · (y + z) = 0 · (y + z) = 0 = 0 · y + 0 · z = xy + xz

The Least Upper Bound Property

The real numbers enjoy a property that is not shared by the rational

numbers, known as the least upper bound property.

Definition If X is a set of real numbers and the real number u satisfies the

condition that x ≤ u ∀x ∈ X, then u is said to be an upper bound for X.

Similarly, if the real number l satisfies the condition that x ≥ l ∀x ∈ X,

then l is said to be a lower bound for X. Any set that has an upper bound

is said to be bounded above, if the set has a lower bound it is said to be

bounded below. If a set has both an upper bound and a lower bound, we say

it is bounded.

Definition An upper bound u of a set X that satisfies the condition, if v
is also an upper bound, then u ≤ v, is said to be the supremum, or the least
upper bound of X. A similar definition can be made for the greatest
lower bound or infimum.

We can now state and prove the following theorem.

Theorem 8.13 The Supremum Property Every non-empty set of real

numbers bounded above has a supremum.

Proof Let A be a non-empty collection of real numbers that is bounded

above by u. We will show that ⋃_{x∈A} x is a real number and is the supremum.

1. For any x ∈ A ∃p ∈ x ⇒ p ∈ ⋃_{x∈A} x ⇒ ⋃_{x∈A} x ≠ ∅. Also x ≤ u ∀x ∈ A and
∃q ∉ u ⇒ q ∉ x ∀x ∈ A ⇒ q ∉ ⋃_{x∈A} x ⇒ ⋃_{x∈A} x ≠ Q.

2. If q < p and p ∈ ⋃_{x∈A} x, then p ∈ x for some x ∈ A, thus q ∈ x ⇒ q ∈ ⋃_{x∈A} x.

3. If p ∈ ⋃_{x∈A} x, then p ∈ x for some x ∈ A, thus ∃q > p such that q ∈ x ⇒ q ∈ ⋃_{x∈A} x.

Thus ⋃_{x∈A} x is a cut and hence a real number.

Now we show that ⋃_{x∈A} x is the supremum. If x ∈ A, then x ⊂ ⋃_{x∈A} x, thus
x ≤ ⋃_{x∈A} x. Hence ⋃_{x∈A} x is an upper bound. Now let y < ⋃_{x∈A} x ⇒ ∃p ∈ ⋃_{x∈A} x
such that p ∉ y. Since p ∈ x for some x we have y < p < x, thus y is not an
upper bound. Thus if z is an upper bound, we must have z ≥ ⋃_{x∈A} x. Thus
⋃_{x∈A} x is the supremum.


Exercise Show that {q ∈ Q | q^2 < 2} is bounded above in Q, and does not

have a supremum in Q. We conclude that the rational numbers do not have

the supremum property.

The Cardinality of the Real Numbers

Theorem 7.10 asserts that the integer and rational numbers are countable.

We now investigate the cardinality of the real numbers.

To facilitate our investigation we will develop an alternate representation

of the real numbers. We will show that every real number can be represented

as the sum of integral powers of 2. However, the computation

x = ∑_{i∈ω} 2^{n−i} = 2^n + 2^{n−1} + · · ·

⇒ 2x = 2^{n+1} + 2^n + · · · ⇒ x = 2x − x = 2^{n+1}

shows that the representation need not be unique. Thus we must take care

not to allow the duplications.

Definition Any function whose domain is an ordinal number is called a

sequence. If the domain is the ordinal number α we say the sequence is an

α-sequence, if the image of a sequence is in a set a, we say that the sequence is

an a-valued α-sequence. We use the notation (s_n) to represent the sequence

s : α → s(α) where n ∈ α.

We begin with some terminology.

1. A sequence k_n is decreasing if k_n < k_m whenever n > m.

2. Let k_n be a decreasing sequence of integers. We say k_n is inessential
if there exists N ∈ N such that k_{n+1} = k_n − 1 ∀n ≥ N. We say k_n is
essential if it is not inessential.

3. We will say any rational number of the form 2^k, k ∈ Z is a binary.

Consider the set of all binary-valued sequences on α ⊆ ω of the form s_n = 2^{k_n}
where k_n is an essential decreasing integer-valued sequence, i.e.,

B = {(2^{k_n}) | (k_n) is an essential decreasing integer-valued sequence}.

We will construct a bijection, b, between B and the positive real numbers,
i.e. we construct b : R^+ ↔ B. If b(x) = (2^{k_n}), we will say (2^{k_n}) is the binary
representation of x.

We construct b by demonstrating how to compute the image of x ∈ R+.

x > 0 ⇒ 0 ∈ x ⇒ ∃p > 0, p ∈ x.

Thus

∃k ∈ Z such that 2^k < p.

Now

∃q ∉ x and ∃m such that 2^{k+m} ≥ q.

Thus

∃n such that 2^{k+n} > x.

Consider {m | 2^{k+m} > x} ⊆ ω. Pick the least element, n, thus 2^{k+n} > x and
2^{k+n−1} ≤ x, thus 2^{k+n−1} is the largest element of the form 2^m in x. Set
k + n − 1 = k_0.

Now consider x − 2^{k_0}. There exists a largest element of the form 2^m in
x − 2^{k_0}, call that element 2^{k_1}.


We show 2^{k_1} < 2^{k_0}.

2^{k_1} ≤ x − 2^{k_0} ⇒ 2^{k_1} + 2^{k_0} ≤ x

If 2^{k_0} ≤ 2^{k_1}, then we have

2^{k_0} + 2^{k_0} ≤ 2^{k_1} + 2^{k_0} < x.

Thus

2^{k_0+1} = 2 · 2^{k_0} < x.

Since 2^{k_0} is the largest binary in x we have a contradiction, thus 2^{k_1} < 2^{k_0}.

We pick 2^{k_2} to be the largest binary in x − 2^{k_0} − 2^{k_1} = x − (2^{k_0} + 2^{k_1}).

Again 2^{k_2} < 2^{k_1}. We continue in this fashion to construct the sequence. This

construction process defines our map b. We now need to show that b is well

defined, one to one and onto.

To demonstrate that b is well defined we need only show that the maximal

binary in any real number is unique. If 2^m and 2^n are maximal binaries in x,

then 2^m = 2^n ⇒ m = n. Thus the maximal binary is unique and b is well

defined.
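
The greedy construction of b can be tried out on rational inputs. The following is a minimal Python sketch and not part of the original development: it assumes x is a positive rational supplied as a Fraction, reads "the largest binary in x" as the largest power of 2 that is at most x, and stops after a fixed number of terms. The names largest_binary_exponent and binary_exponents are hypothetical.

```python
from fractions import Fraction


def largest_binary_exponent(x):
    """Return the largest integer k with 2**k <= x, for a positive Fraction x."""
    k = 0
    # Walk down while 2**k is still too large.
    while Fraction(2) ** k > x:
        k -= 1
    # Walk up while the next power of 2 still fits.
    while Fraction(2) ** (k + 1) <= x:
        k += 1
    return k


def binary_exponents(x, max_terms=8):
    """Greedily peel off the largest power of 2 from what remains of x.

    Returns the exponents k_0 > k_1 > ... found within max_terms steps.
    """
    exponents = []
    remainder = Fraction(x)
    while remainder > 0 and len(exponents) < max_terms:
        k = largest_binary_exponent(remainder)
        exponents.append(k)
        remainder -= Fraction(2) ** k
    return exponents


thirteen_eighths = binary_exponents(Fraction(13, 8))
one_third = binary_exponents(Fraction(1, 3), max_terms=4)
```

Here thirteen_eighths is the list of exponents 0, −1, −3, since 13/8 = 2^0 + 2^{−1} + 2^{−3}, while one_third is only the first four exponents of a sequence that never terminates.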

To show that b is onto, we need to define some notation and prove a

lemma.

Let (a_n) be an α-sequence, where α ⊆ ω.

Let ∑_{k∈n} a_k = a_0 + a_1 + · · · + a_n for n ∈ α. Then (∑_{k∈n} a_k) is an
α-sequence, which we call the sequence of partial sums of (a_n).

Now let (a_n) be an α-sequence of non-negative rational numbers, α ⊂ ω.
Each element of (∑_{k∈n} a_k) is a rational number since each sum is finite. Also
∑_{k∈n} a_k ≤ ∑_{k∈m} a_k if n < m, since each a_i ≥ 0.

If the associated sequence of real numbers (∑_{k∈n} a_k) is bounded above,
then we define ∑_{n∈α} a_n = sup {∑_{k∈n} a_k | n ∈ α}.

Lemma If (a_n) = (2^{k_n}) where k_n ∈ Z, n ∈ α ⊂ ω and k_n < k_m if n > m,
then (∑_{i∈n} 2^{k_i}) is bounded above.

Proof ∑_{i∈n} 2^{k_i} ≤ ∑_{i∈n} 2^{k_0−i} ≤ ∑_{i∈ω} 2^{k_0−i} = 2^{k_0+1}.

Thus ∑_{i∈n} 2^{k_i} < 2^{k_0+1} ∀n ∈ α ⇒ ∑_{i∈n} 2^{k_i} ≤ 2^{k_0+1} ∀n ∈ α.

Thus (∑_{i∈n} 2^{k_i}) is bounded above.

Now let (2^{k_n}) be a sequence where (k_n) is a decreasing essential sequence
of integers. Since k_n is essential, ∑_{n∈N−{0}} 2^{k_{m+n}} < 2^{k_m} ∀m ∈ N. Thus we see
that (2^{k_n}) is the binary representation of the real number ∑ 2^{k_n}.

To show that b is one to one, if b(x) = b(y), then 2^{k_n} = 2^{j_n} only if
2^{k_n} = 2^{j_n} ∀n.

Theorem 8.14 The real numbers are uncountable.

To prove this theorem we need two lemmas and a corollary.

Lemma 1 The set of all bi-valued ω-sequences is uncountable.

That is, S = {s | s : ω → {0, 1}} is uncountable.

Proof The demonstration is by contradiction.

Assume there exists a bijection b : ω → S. Define a bi-valued ω-sequence
s by

s(n) = 0 if (b(n))(n) = 1,
s(n) = 1 if (b(n))(n) = 0.

Since b is a bijection, ∃ k where b(k) = s. Then we have

s(k) = 0 if (b(k))(k) = s(k) = 1,
s(k) = 1 if (b(k))(k) = s(k) = 0,

which is a contradiction.
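
The diagonal construction in this proof can be watched on a finite window. The Python sketch below is only an illustration under stated assumptions: the listing binary_digits is a hypothetical stand-in for the supposed bijection b (it sends n to the sequence of binary digits of n), and only the first few values of the flipped sequence s are computed.

```python
def flipped_diagonal(listing, length):
    """Return the first `length` values of s, where s(n) = 1 - (listing(n))(n)."""
    return [1 - listing(n)(n) for n in range(length)]


def binary_digits(n):
    """Hypothetical sample listing: b(n) is the sequence of binary digits of n."""
    return lambda k: (n >> k) & 1


# First six values of the diagonal sequence s for this particular listing.
s = flipped_diagonal(binary_digits, 6)

# s disagrees with the n-th listed sequence at position n, for every n checked.
disagreements = [s[n] != binary_digits(n)(n) for n in range(6)]
```

Whatever listing is supplied, s differs from the n-th listed sequence in position n, which is exactly why no listing can contain s.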

Lemma 2 The set of all bi-valued ω-sequences where s(k) = 0 for only a

finite number of times is countable.

That is, A = {s | ∃N ∈ ω such that s(n) = 1 ∀n ≥ N} is countable.

Proof There is a reasonably obvious bijective map

j : A ↔ ⋃_{n∈ω} {s : n → {0, 1} | n ∈ ω}.

Thus C(A) ≤ C(⋃_{n∈ω} {s : n → {0, 1} | n ∈ ω}), and the countable union of
finite sets is countable.

Corollary The set of all bi-valued ω-sequences where s(k) = 0 infinitely

often is uncountable.

Proof of Theorem 8.14

Let A = {s : ω → {0, 1}, where s_n = 0 infinitely often}. There exists a
bijection

b : A ↔ {∑_{n∈ω} s_n · 2^{−n} | s ∈ A} ⊂ R

defined by b(s) = ∑_{n∈ω} s_n · 2^{−n}. Since A is uncountable, so is R.


Exercise

Show that {s : ω → {0, 1}, where s_n = 0 infinitely often} = {x | 0 ≤ x ≤ 1}.

The Cauchy Sequence Construction of the Real Numbers

We conclude the chapter with a set of exercises that leads to an alternate

construction of the real numbers.

Definition The function Abs : R → R defined by

Abs(x) = x if x ≥ 0,
Abs(x) = −x if x < 0

is the absolute value function. We abbreviate Abs(x) by |x|, i.e. Abs(x) = |x|.

Definition A rational-valued sequence, s_n, that satisfies the following:

∀ε > 0, ε ∈ Q, ∃N ∈ N such that ∀n, m > N ⇒ |s_n − s_m| < ε

is called a Cauchy Sequence.

We define an equivalence relation on the set of all rational-valued Cauchy
Sequences by

s_n ≡ t_n iff ∀ε > 0 ∃N ∈ N such that ∀n ≥ N ⇒ |s_n − t_n| < ε.

We define an order on the equivalence classes of Cauchy Sequences by

S ≥ T iff ∃N ∈ N such that ∀n > N ⇒ s_n − t_n ≥ 0 ∀(s_n) ∈ S and ∀(t_n) ∈ T.
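
To get a feel for the Cauchy condition, it may help to look at a concrete rational-valued sequence. The Python sketch below is an illustration only: the sequence chosen (Newton iterates for the square root of 2, starting at 2) and the helper names newton_sqrt2 and tail_is_within are assumptions made for the example, and checking finitely many terms does not prove that a sequence is Cauchy.

```python
from fractions import Fraction


def newton_sqrt2(n_terms):
    """First n_terms of the rational sequence s_0 = 2, s_{n+1} = (s_n + 2/s_n)/2."""
    s = Fraction(2)
    terms = []
    for _ in range(n_terms):
        terms.append(s)
        s = (s + 2 / s) / 2
    return terms


def tail_is_within(terms, start, eps):
    """Check |s_n - s_m| < eps for all computed terms with n, m >= start."""
    tail = terms[start:]
    return all(abs(a - b) < eps for a in tail for b in tail)


terms = newton_sqrt2(7)
eps = Fraction(1, 1000)
whole_sequence_ok = tail_is_within(terms, 0, eps)   # early terms are far apart
late_tail_ok = tail_is_within(terms, 3, eps)        # later terms cluster together
```

Every term is an exact rational number, so the comparison with ε involves no rounding.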


Exercises

1. Show that the relation as defined above is an equivalence relation.

2. Show that the order defined for the equivalence classes of Cauchy Se-

quences is a linear order.

3. Show that the set of equivalence classes of rational-valued Cauchy se-

quences is order isomorphic to the Real Numbers.


IX Complex numbers, Quaternions and Octonions

Since the product of two positive real numbers is positive, and the product

of any two negative real numbers is also positive, the equation

x^2 + 1 = 0 has no solution in the Real numbers. However by extending the real

numbers to a larger set of numbers we can create solutions to equations such

as the example given above. The construction of this larger set of numbers

from the Real numbers is far easier than the construction of the Reals from

the Rationals.

Complex Numbers

Definition The Complex Numbers, C, is the cartesian product of the

Real Numbers with themselves, R× R, with the following arithmetic.

1. (a, b) + (c, d) = (a + c, b + d).

2. (a, b) · (c, d) = (ac− bd, ad + bc).

The real numbers are embedded into the Complex numbers by the following

injection map

J : x → (x, 0).
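
The pair arithmetic just defined is easy to experiment with. The following Python sketch is an illustration under stated assumptions: real numbers are approximated by exact rationals (Fraction or int), and the function names c_add, c_mul, c_inv and embed are hypothetical. The inverse formula used in c_inv is the one verified in the proof of Theorem 9.1.

```python
from fractions import Fraction


def c_add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)


def c_mul(z, w):
    """(a, b) . (c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def c_inv(z):
    """Multiplicative inverse (a/(a^2+b^2), -b/(a^2+b^2)) of a nonzero pair."""
    a, b = z
    n = Fraction(a * a + b * b)
    return (a / n, -b / n)


def embed(x):
    """The injection J : x -> (x, 0)."""
    return (x, 0)


i_squared = c_mul((0, 1), (0, 1))
z = (Fraction(3), Fraction(4))
z_times_inverse = c_mul(z, c_inv(z))
```

With these definitions i_squared is the pair (−1, 0), the embedded image of −1, and z_times_inverse is the multiplicative identity (1, 0).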

Theorem 9.1 The complex numbers are a field.

Proof From the definition of addition and multiplication, the results of

the binary operation produces an ordered pair of real numbers hence the

Complex numbers are closed with respect to the binary operations of addition

and multiplication.


(a, b) + (c, d) = (a + c, b + d) = (c + a, d + b) = (c, d) + (a, b), and

(a, b) · (c, d) = (ac − bd, ad + bc) = (ca − db, cb + da) = (c, d) · (a, b). Hence

addition and multiplication are commutative.

((a, b)+(c, d))+(e, f) = (a+ c, b+d)+(e, f) = ((a+ c)+e, (b+d)+f) =

(a + (c + e), b + (d + f)) = (a, b) + (c + e, d + f) = (a, b) + ((c, d) + (e, f)),

and ((a, b) · (c, d)) · (e, f) = (ac − bd, ad + bc) · (e, f) = ((ac − bd)e − (ad +

bc)f, (ac−bd)f +(ad+bc)e) = (ace−bde−adf−bcf, acf−bdf +ade+bce) =

(a(ce− df)− b(de + cf), a(cf + de) + b(ce− df)) = (a, b) · (ce− df, cf + de) =

(a, b) · ((c, d) · (e, f)). Hence addition and multiplication are associative.

(a, b) · ((c, d)+(e, f)) = (a, b) · (c+e, d+f) = (ac+ae− bd− bf, ad+af +

bc+ be) = (ac− bd, ad+ bc)+ (ae− bf, af + be) = (a, b) · (c, d)+ (a, b) · (e, f).

Hence multiplication distributes over addition.

(a, b) + (0, 0) = (a + 0, b + 0) = (a, b). Thus (0, 0) is the additive identity.

(a, b)·(1, 0) = (a·1−b·0, a·0+b·1) = (a, b). Thus (1, 0) is the multiplicative

identity.

(a, b)+(−a,−b) = (a−a, b−b) = (0, 0). Thus any complex number (a, b)

has an additive inverse (−a,−b).

(a, b) · (a/(a^2 + b^2), −b/(a^2 + b^2)) = (a^2/(a^2 + b^2) − (−b^2)/(a^2 + b^2), −ab/(a^2 + b^2) + ab/(a^2 + b^2)) =
((a^2 + b^2)/(a^2 + b^2), (ab − ab)/(a^2 + b^2)) = (1, 0). Thus for any complex number (a, b) ≠ (0, 0) the
complex number (a/(a^2 + b^2), −b/(a^2 + b^2)) is the multiplicative inverse.

We have thus verified that the complex numbers form a field.

We note that the product (0, 1) · (0, 1) = (−1, 0). Since (−1, 0) is the
embedding of the real number −1, we have (0, 1) as the square root of −1,
and we have a solution to the equation x^2 + 1 = 0.


The simplest and standard way to represent Complex numbers is to represent
them as the formal expression a + bi, where i^2 = (0, 1)(0, 1) = (−1, 0) =
−1. Now standard arithmetic on binomial expressions yields the appropriate
sums and products.

(a + bi) + (a′ + b′i) = (a + a′) + (b + b′)i

(a + bi)(a′ + b′i) = aa′ + ab′i + a′bi + bb′i^2 = (aa′ − bb′) + (ab′ + a′b)i.

For the complex number (a, b) we say (a,−b) is its complex conjugate.

We denote the complex conjugate of a complex number c by c∗, thus if

c = (a, b), then c∗ = (a, b)∗ = (a,−b).

Definition For x = (a_1, a_2, · · · , a_n) ∈ R^n = Π_{i∈n} R, the real value

√(∑_{i=1}^{n} a_i^2)

is the norm of x. We say that the norm of a complex number is the norm

of the pair of real numbers that represents it.

We see that for any complex number c, we have c · c∗ = (a + bi)(a − bi) =
a^2 + b^2 which is the norm squared of c.

Quaternions

Definition The Quaternions is the set C × C with the following arith-

metic:

1. (a, b) + (c, d) = (a + c, b + d)

2. (a, b) · (c, d) = (ac− d∗b, ad + bc∗).


We designate the Quaternions with the blackboard bold face capital H, H.

Let h ∈ H, then h = (a + bi, c + di). But again it is common practice to

write h = a + bi + cj + dk.

We leave it to the reader to verify i^2 = j^2 = k^2 = −1.

We also leave it to the reader to verify ij = −ji = k, jk = −kj =
i, ki = −ik = j. Thus we see ij ≠ ji, hence quaternions fail to have the
commutative property.
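
The identities left to the reader can be spot-checked numerically. The Python sketch below is a minimal illustration and not a proof: it represents a quaternion as a pair of Python complex numbers, uses the product rule (a, b) · (c, d) = (ac − d∗b, ad + bc∗) from the definition above, and takes i = (i, 0), j = (0, 1), k = (0, i), matching h = (a + bi, c + di) = a + bi + cj + dk. The names q_mul, q_neg, ONE, I, J and K are hypothetical.

```python
def q_mul(p, q):
    """(a, b)(c, d) = (ac - d*b, ad + bc*), where * is the complex conjugate."""
    (a, b), (c, d) = p, q
    return (a * c - d.conjugate() * b, a * d + b * c.conjugate())


def q_neg(p):
    """Additive inverse, taken componentwise."""
    return (-p[0], -p[1])


# Basis quaternions as pairs of complex numbers.
ONE = (1 + 0j, 0j)
I = (1j, 0j)
J = (0j, 1 + 0j)
K = (0j, 1j)

squares_are_minus_one = all(q_mul(u, u) == q_neg(ONE) for u in (I, J, K))
ij_is_k = q_mul(I, J) == K
ji_is_minus_k = q_mul(J, I) == q_neg(K)
```

All entries involved are 0, ±1 or ±i, so the floating-point arithmetic behind Python's complex type is exact here.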

Octonions

The quaternion conjugate of a quaternion h = (a, b) is h∗ = (a∗,−b).

Definition The Octonions is the set H×H with the following arithmetic:

1. (a, b) + (c, d) = (a + c, b + d)

2. (a, b) · (c, d) = (ac− d∗b, ad + bc∗).

We designate the Octonions with a bold face capital K, K.

Again it is common practice to designate Octonions by k = a + be1 +

ce2 + de3 + ee4 + fe5 + ge6 + he7.

We leave it again to the reader to verify e_n^2 = −1 and that for n ≠ m we
have anticommutativity, e_n e_m = −e_m e_n.

We could continue constructing numbers in this fashion but more alge-

braic properties fail. In particular in the next constructions not every non

zero number has a multiplicative inverse.


X TRANSFINITE AND INFINITESIMAL NUMBERS

In his development of the Calculus, Isaac Newton introduced the concept

of a “fluxion”, a number that was smaller than any positive number and yet

was greater than zero. This concept seemed to be contrary to the Archimedean

principle, and led one of the critics of early analysis, Bishop Berkeley, to

comment “And what are these fluxions? The velocities of evanescent incre-

ments. And what are these same evanescent increments? They are neither

finite quantities, nor quantities infinitely small, nor yet nothing. May we

not call them ghosts of departed quantities?” At this time in history, when

Newton’s genius is greatly admired we may be inclined to dismiss Bishop

Berkeley’s comments as those of a pompous fool. However this would be unfair;

his criticisms were well founded, as the rigorous structure of Newton’s Calculus

had not been well established. Criticism such as Bishop Berkeley’s is

good for the development of any scholarly field of study, as it forces those

involved in that study to construct a firm basis for their claims. Of course

the concept of fluxions has been replaced with the limit, and all seems well

with the world, and Calculus.

With Cantor’s development of transfinite ordinal and cardinal numbers

it has become possible to make the concept of fluxions rigorous, and without

contradicting Archimedes. We will not use the term fluxion, but will call such

numbers infinitesimal. In this chapter we will construct transfinite Integers,

Rational and Real Numbers. The construction will be a simple extension of

our development of finite numbers.

We begin our construction of transfinite and infinitesimal numbers by noting that

our construction of the integers, rational and real numbers used the ordinal


number ω as a basis on which to build the appropriate collection of sets that

would be our numbers. An attempt to use a larger ordinal than ω will fail, as

ordinal arithmetic is not commutative, and commutativity is crucial in the

construction of the equivalence classes that represent the different numbers.

Consider the equivalence relation defined on ω×ω that yielded the integers.

(a, b) ≡ (c, d) ⇔ a + d = b + c.

This relation is not an equivalence relation on ordinal numbers greater than

ω as (ω, 1) ≢ (ω, 1) since ω + 1 ≠ 1 + ω, hence the relation does not satisfy

the reflexive property.

Transfinite Arithmetic

Our choices of binary operations were made from our experiences with tangi-

ble collection of objects (e.g. boxes of apples), and these tangible collections

must be finite by their very nature. The extension of these binary operations

to abstract and infinite sets, although well motivated is quite arbitrary. So

our strategy is to define a new arithmetic, that agrees with ordinal arithmetic

on finite sets, but satisfies the appropriate field axioms on infinite sets. As

noted in Chapter 3, the ordinal number ω^ω can be expressed as

{x | x = ∑_{i=0}^{n} ω^i α_i where n, α_i ∈ ω}.

Thus every element of ω^ω is of the form of a polynomial of indeterminate ω
with coefficients and exponents in ω. We thus define an arithmetic on ω^ω to
be polynomial arithmetic.

∑_{i=0}^{n} ω^i α_i + ∑_{i=0}^{m} ω^i β_i = ∑_{i=0}^{max{n,m}} ω^i (α_i + β_i)

and

∑_{i=0}^{n} ω^i α_i · ∑_{j=0}^{m} ω^j β_j = ∑_{i=0}^{n} ∑_{j=0}^{m} ω^{i+j} (α_i · β_j),

where the arithmetic on ω is ordinal arithmetic.

We may now easily verify that these two operations are commutative

and associative, and that multiplication distributes over addition. Thus we

may repeat our construction of integers, rational and real numbers with this

arithmetic to form transfinite integers, transfinite rational, and transfinite

real numbers. Both the transfinite rational numbers and transfinite real

numbers contain elements that satisfy the conditions of Newton’s fluxions.

We will call those numbers infinitesimal.
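
The polynomial arithmetic on ω^ω can be modelled directly. The Python sketch below is an illustration under stated assumptions: an element ∑ ω^i α_i is stored as the list of its coefficients [α_0, α_1, ...] with each α_i a non-negative integer, and the names trim, poly_add, poly_mul, OMEGA and ONE are hypothetical. It shows in particular that ω + 1 and 1 + ω agree in this arithmetic, whereas in ordinal arithmetic 1 + ω = ω ≠ ω + 1.

```python
from itertools import zip_longest


def trim(coeffs):
    """Drop trailing zero coefficients so each element has one representation."""
    out = list(coeffs)
    while out and out[-1] == 0:
        out.pop()
    return out


def poly_add(x, y):
    """Add coefficientwise: the coefficient of omega^i is alpha_i + beta_i."""
    return trim(a + b for a, b in zip_longest(x, y, fillvalue=0))


def poly_mul(x, y):
    """Multiply as polynomials: omega^i alpha_i times omega^j beta_j
    contributes alpha_i * beta_j to the coefficient of omega^(i+j)."""
    if not x or not y:
        return []
    out = [0] * (len(x) + len(y) - 1)
    for i, a in enumerate(x):
        for j, b in enumerate(y):
            out[i + j] += a * b
    return trim(out)


OMEGA = [0, 1]   # the element omega
ONE = [1]        # the element 1

omega_plus_one = poly_add(OMEGA, ONE)
one_plus_omega = poly_add(ONE, OMEGA)
square = poly_mul(omega_plus_one, omega_plus_one)   # (omega + 1)^2
```

Here square is the coefficient list of ω^2 + 2ω + 1, listed from the constant term upward.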

If we let n ∈ ω be an arbitrary element , and we let ω represent the trans-

finite integer (ω, 0), 1 represent the transfinite integer (1, 0) and n represent

the transfinite integer (n, 0) we observe that 1 · n < 1 · ω ⇒ [1, ω] < [1, n],

where [1, ω] and [1, n] are transfinite rational numbers. Since the restriction

of the construction of transfinite rational numbers is the construction of the

rational numbers, we may say that [1, n] is a rational number. Thus for any

positive real number ε, by the Archimedean property there exists an integer

n such that [1, n] < ε, and since [1, ω] < [1, n] we have [1, ω] < ε. And thus

[1, ω] satisfies the condition of Newton’s fluxions.

Transfinite Numbers

We define the ω-Transfinite Natural Numbers, N_ω, to be ω^ω, with
polynomial arithmetic. We define the ω-Transfinite Integers, Z_ω, to be
the collection of equivalence classes

{[x, y] | x, y ∈ N_ω} where (x, y) ≡ (z, w) ⇔ x + w = y + z.

We define the ω-Transfinite Rational numbers, Q_ω, to be the collection
of equivalence classes

{[x, y] | x, y ∈ Z_ω} where (x, y) ≡ (z, w) ⇔ xw = yz.

And we define the ω-Transfinite Real Numbers, R_ω, to be the collection
of all Dedekind cuts of transfinite rational numbers.

We now observe that the ordinal number ω^(ω^ω) can be expressed as

{x | x = ∑_{i=0}^{n} (ω^ω)^i α_i where n, α_i ∈ ω^ω}.

Thus every element of ω^(ω^ω) is of the form of a polynomial of indeterminate ω^ω
with coefficients and exponents in ω^ω. We again define arithmetic as polynomial
arithmetic where the arithmetic on ω^ω is the arithmetic previously
defined.

We now define the ω^ω-Transfinite Natural Numbers, ω^ω-Transfinite Integers,
ω^ω-Transfinite Rational Numbers, and ω^ω-Transfinite Real Numbers
in the analogous fashion. We may continue indefinitely constructing Transfinite
numbers in this fashion. We thus define any number constructed in
this fashion to be a Transfinite Number. Thus a Transfinite Integer is an
α-Transfinite Integer for some appropriate ordinal number α, and similarly
for Transfinite Natural Numbers, Transfinite Rational Numbers, and
Transfinite Real numbers.

We will adopt the convention, that when referring to an element of the

transfinite real numbers that is the embedded image of either a transfinite

natural number, transfinite integer, or transfinite rational number, we will

simply refer to that number as being an element of that respective embedded


set, that is, a transfinite natural number, transfinite integer or transfinite

rational number.

Let ξ be an arbitrary transfinite real number. Consider the cut {α | α^2 < ξ}.
We may consider this cut to represent the transfinite real number √ξ.

Questions What is √ω? What is √(ω + 1)? What is ∛ω?


XI SURREAL NUMBERS

The origin of Surreal numbers is credited to John Conway, however the

name was coined by Don Knuth. For consistency we may use the symbol S

for surreal numbers, however the need to do so seldom arises.

The Constructive Definition of Surreal Numbers

John Conway defined Surreal numbers by formulating two simple rules

for the construction of Surreal numbers plus the definitions for addition and

multiplication. With these two rules and two definitions, a collection of num-

bers is constructed that includes all Real numbers, and all Ordinal numbers.

The collection although not a set, but rather a proper class, forms a field.

Rule 1: Every number is represented by a pair of sets of previously con-

structed numbers, a left set and a right set, where no number in the left set

is greater than or equal to any number of the right set.

Rule 2: A number, a, is less than or equal to a number, b, if and only if
no member of a’s left set is greater than or equal to b, and no member of b’s
right set is less than or equal to a.

The first rule tells us how to construct new Surreal numbers from previ-

ously constructed numbers. The second rule defines the order relation of the

collection of surreal numbers that is necessary for the construction.

We will develop the surreal numbers using an alternate definition, that

uses some of the principles already developed in Set Theory.


The Function Definition of Surreal Numbers

Definition A Surreal number is a function from an ordinal number to a

two point space. The two point space is designated by {+, −}. The domain
of a surreal number will be called its length.

The function from the ordinal number 0 = {} is of course the empty
function, {}, and is considered to be a surreal number, and we call it 0.

Example 5 = {0, 1, 2, 3, 4}. Thus

0 1 2 3 4
↓ ↓ ↓ ↓ ↓
+ + − + −

is a surreal number.

Recall that we defined a sequence as a function whose domain is an ordinal

number. If the domain of a sequence is the ordinal number α then we say

that the sequence is an α-sequence.

Thus a surreal number, a, is a binary valued α-sequence for some ordinal

α, and its length is α, which we indicate by l(a) = α. The 0-sequence is of

course the empty set.

Definition If a is an α-sequence, and b is a β-sequence, such that a ∩ b = b,
then b is an initial segment of a. If b ≠ a, then b is a proper initial
segment. We see that if b is a proper initial segment of a, then l(b) < l(a).

Two surreal numbers are equal if they are equal as sets.

We define a linear order on the class of surreal numbers by the following:

Let a and b be surreal numbers, and let c be the maximal initial segment
that is in both a and b, where c is a γ-sequence. We say a > b if

a(γ) = + and b(γ) = −, or
a(γ) = + and b(γ) is undefined, or
a(γ) is undefined and b(γ) = −.

The Canonical Representation of Surreal Numbers

Let a be a surreal number. If A′ = {a′ | a′ is an initial segment of a and
a′ < a}, and A′′ = {a′′ | a′′ is an initial segment of a and a < a′′}, then we
say {A′|A′′} is the canonical representation of a.

Example (+ + − − +) = 1 3/8 = {0, 1, 1 1/4 | 2, 1 1/2}
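
For surreal numbers of finite length, the order and the canonical representation can be computed mechanically. The Python sketch below is an illustration under stated assumptions: a surreal number is written as a string of the ASCII characters '+' and '-', only finite lengths are handled, and the names value, greater and canonical are hypothetical. The function value uses the standard reading of a finite sign sequence as a dyadic rational (unit steps until the sign first changes, then steps that halve each time), which is consistent with the example (+ + − − +) = 1 3/8 = 11/8.

```python
from fractions import Fraction


def value(signs):
    """Dyadic rational named by a finite sign sequence such as '++--+'."""
    total, step, turned = Fraction(0), Fraction(1), False
    for s in signs:
        if s != signs[0]:
            turned = True          # the sign has changed at least once
        if turned:
            step /= 2              # after the first change, steps halve
        total += step if s == '+' else -step
    return total


def greater(a, b):
    """a > b, decided at the first position gamma where a and b differ."""
    g = 0
    while g < len(a) and g < len(b) and a[g] == b[g]:
        g += 1
    rank = {'+': 1, None: 0, '-': -1}   # None stands for "undefined"
    a_g = a[g] if g < len(a) else None
    b_g = b[g] if g < len(b) else None
    return rank[a_g] > rank[b_g]


def canonical(a):
    """Proper initial segments of a, split into those below and those above a."""
    segments = [a[:n] for n in range(len(a))]
    left = [s for s in segments if greater(a, s)]
    right = [s for s in segments if greater(s, a)]
    return left, right


left, right = canonical('++--+')
left_values = [value(s) for s in left]
right_values = [value(s) for s in right]
```

For the example above, left_values is 0, 1, 5/4 and right_values is 2, 3/2.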

Addition of Surreal Numbers

We define addition in the following way.

If a = {A′|A′′} and b = {B′|B′′} are in canonical form, then

a + b = {a + b′, b + a′ | a + b′′, b + a′′}

∀ a′ ∈ A′, b′ ∈ B′, a′′ ∈ A′′, b′′ ∈ B′′.

We now verify that the elements of the left set are truly less than the

elements of the right set. But first we must verify that addition is commu-

tative.

a + b = {a + b′, b + a′ | a + b′′, b + a′′} = {b + a′, a + b′ | b + a′′, a + b′′} = b + a

To complete the verification we induct on the ordinal sum of the lengths
of a and b. We make the inductive hypothesis that if l(c) + l(d) < l(a) + l(b),
l(c′) + l(d′) < l(a) + l(b), c < c′, and d < d′, then c + d < c′ + d′.

We have a′ < a < a′′, b′ < b < b′′. Thus

a + b′ < a + b′′, a + b′ < a′′ + b = b + a′′

and

b + a′ < b + a′′, b + a′ < b′′ + a = a + b′′.

Thus a + b does represent a Surreal number. However we notice that a + b is

not in canonical form. Is it well defined?

If A and B are sets of surreal numbers such that a < b ∀a ∈ A and b ∈ B,
then {A|B} is the “first” surreal number c, such that a < c < b ∀a ∈ A and b ∈ B.
That is, c is the surreal number with the minimal domain such that
a < c < b ∀a and ∀b. Hence the sum is well defined.

The way to think of why this is true is this way. Start at 0 and try to

get into the gap between the two sets as quickly as possible. If there is an

element of the lower set greater or equal to zero step towards the upper set,

if the opposite is true step toward the lower set. If you arrive between the

two sets then you have defined that “youngest” surreal number. If you are

still “amid” one of the two sets, step towards the other set and continue until

you arrive between them. If at any stage you were to step away from the set

in which you are not amid, then you can never arrive between the sets, thus

there is only one way to arrive between the sets, and when you first get there

you stop.

Example Compute 1+1.

To add 1+1 we need to know 0+1, to add 0+1 we need to know 0+0.

We have 0 = {∅|∅}, thus 0 + 0 = {0 + b′, 0 + a′ | 0 + b′′, 0 + a′′}, where
a′, a′′, b′, b′′ ∈ ∅ →←. Since there are no elements in the empty set to add to
0 each of those sums does not exist and thus

0 + 0 = {∅|∅} = 0.

We now have 1 = {0|∅}, thus

0 + 1 = {0 + 0, 1 + a′ | 0 + b′′, 1 + a′′} where a′, a′′, b′′ ∈ ∅ →←.

Thus we have 0 + 1 = {0|∅} = 1. Finally

1 + 1 = {1 + 0, 1 + 0 | 1 + b′′, 1 + a′′} where a′′, b′′ ∈ ∅ →←.

Thus 1 + 1 = {1|∅} = 2.

We can see that 0 = {∅|∅} is the additive identity, since if we let a = {A′|A′′},
then

a + 0 = {0 + a′ | 0 + a′′} = a.

Now we can define the additive inverses of surreal numbers.

Let a be a surreal number. Define −a as:

−a(α) = + if a(α) = −,
−a(α) = − if a(α) = +.

Example Let a = (+ − +) = 3/4 = {0, 1/2 | 1},
thus −a = (− + −) = −3/4 = {−1 | 0, −1/2}. With b = −a,

a + b = {−1/4, −3/4 | 3/4, 1/4} = 0.

The canonical representation of the opposite of a given surreal number

a = {A′|A′′} has the opposites of A′′ as its lower set and the opposites of A′
as its upper set. Thus if a = {A′|A′′}, then −a = {−A′′ | −A′} where −A′′ =
{−a′′ | a′′ ∈ A′′} and −A′ = {−a′ | a′ ∈ A′}.

Now a + (−a) = {a + (−a′′), (−a) + a′ | a + (−a′), −a + a′′}.

Since a′ < a < a′′ ⇒ −a′′ < −a < −a′, we have

a + (−a′′) < a′′ + (−a′′) = 0

−a + (a′) < a′ + (−a′) = 0

a + (−a′) > a′ + (−a′) = 0

−a + a′′ > −a′′ + a′′ = 0.

Thus a + (−a) = 0.

We can verify that addition of surreal numbers is associative, and thus the
surreal numbers satisfy the axioms for an abelian group under addition. Let a = {A′|A′′}, b =
{B′|B′′}, and c = {C′|C′′}.

(a + b) + c = {a + b′, b + a′ | a + b′′, b + a′′} + c
= {(a + b) + c′, c + (a + b′), c + (b + a′) | (a + b) + c′′, c + (a + b′′), c + (b + a′′)}
= {a + (b + c′), a + (c + b′), (b + c) + a′ | a + (b + c′′), a + (c + b′′), (b + c) + a′′}
= a + {b + c′, c + b′ | b + c′′, c + b′′}
= a + (b + c).

The Multiplication of Surreal Numbers

We now define multiplication.

As a motivation for the following definition consider real numbers a, b, c, d

where a < b, and c < d. We then have b− a > 0 and d− c > 0. Thus

(b−a)(d−c) > 0 ⇒ bd+ac−ad−bc > 0 ⇒ bd > ad+bc−ac, and bc < bd+ac−ad.


Definition Let a = {A′|A′′}, b = {B′|B′′}, then

ab = {a′b + ab′ − a′b′, a′′b + ab′′ − a′′b′′ | a′b + ab′′ − a′b′′, a′′b + ab′ − a′′b′}

where a′ ∈ A′, b′ ∈ B′, a′′ ∈ A′′ and b′′ ∈ B′′.

Exercise Consider ε = (+ − − − · · · ) = {0 | 1, 1/2, 1/4, · · · } and
ω = (+ · · · ) = {0, 1, · · · | ∅}. Find ε · ω.

The Field Properties of Surreal Numbers

We have confirmed that the surreal numbers with the binary operation of addition form an abelian group. We now want to complete our verification that the surreal numbers with the operations of addition and multiplication form a field. Of course we must verify that the definition of multiplication does indeed define a surreal number; to do that we must first verify the commutativity and associativity of multiplication and the distributive property.

We again make the inductive hypothesis that the necessary inequalities are valid for products of surreal numbers whose lengths sum to less than the sum of the lengths of a and b, so that multiplication is well defined.

Let a = A′|A′′, b = B′|B′′ and c = C′|C′′. We now have

ab = {a′b + ab′ − a′b′, a′′b + ab′′ − a′′b′′}|{a′b + ab′′ − a′b′′, a′′b + ab′ − a′′b′}

= {b′a + ba′ − b′a′, b′′a + ba′′ − b′′a′′}|{b′a + ba′′ − b′a′′, b′′a + ba′ − b′′a′} = ba

where a′ ∈ A′, a′′ ∈ A′′, b′ ∈ B′, and b′′ ∈ B′′. Thus multiplication is commutative.

(ab)c = {(ab)′c + (ab)c′ − (ab)′c′, (ab)′′c + (ab)c′′ − (ab)′′c′′}|{(ab)′c + (ab)c′′ − (ab)′c′′, (ab)′′c + (ab)c′ − (ab)′′c′}

= {(a′b + ab′ − a′b′)c + (ab)c′ − (a′b + ab′ − a′b′)c′,
(a′′b + ab′′ − a′′b′′)c + (ab)c′ − (a′′b + ab′′ − a′′b′′)c′,
(a′b + ab′′ − a′b′′)c + (ab)c′′ − (a′b + ab′′ − a′b′′)c′′,
(a′′b + ab′ − a′′b′)c + (ab)c′′ − (a′′b + ab′ − a′′b′)c′′}
|{(a′b + ab′ − a′b′)c + (ab)c′′ − (a′b + ab′ − a′b′)c′′,
(a′′b + ab′′ − a′′b′′)c + (ab)c′′ − (a′′b + ab′′ − a′′b′′)c′′,
(a′b + ab′′ − a′b′′)c + (ab)c′ − (a′b + ab′′ − a′b′′)c′,
(a′′b + ab′ − a′′b′)c + (ab)c′ − (a′′b + ab′ − a′′b′)c′}

Expanding each term by the distributive property, which is verified below, gives

= {a′bc + ab′c − a′b′c + abc′ − a′bc′ − ab′c′ + a′b′c′,
a′′bc + ab′′c − a′′b′′c + abc′ − a′′bc′ − ab′′c′ + a′′b′′c′,
a′bc + ab′′c − a′b′′c + abc′′ − a′bc′′ − ab′′c′′ + a′b′′c′′,
a′′bc + ab′c − a′′b′c + abc′′ − a′′bc′′ − ab′c′′ + a′′b′c′′}
|{a′bc + ab′c − a′b′c + abc′′ − a′bc′′ − ab′c′′ + a′b′c′′,
a′′bc + ab′′c − a′′b′′c + abc′′ − a′′bc′′ − ab′′c′′ + a′′b′′c′′,
a′bc + ab′′c − a′b′′c + abc′ − a′bc′ − ab′′c′ + a′b′′c′,
a′′bc + ab′c − a′′b′c + abc′ − a′′bc′ − ab′c′ + a′′b′c′}

= {a′(bc) + a(b′c + bc′ − b′c′) − a′(b′c + bc′ − b′c′),
a′′(bc) + a(b′′c + bc′ − b′′c′) − a′′(b′′c + bc′ − b′′c′),
a′(bc) + a(b′′c + bc′′ − b′′c′′) − a′(b′′c + bc′′ − b′′c′′),
a′′(bc) + a(b′c + bc′′ − b′c′′) − a′′(b′c + bc′′ − b′c′′)}
|{a′(bc) + a(b′c + bc′′ − b′c′′) − a′(b′c + bc′′ − b′c′′),
a′′(bc) + a(b′′c + bc′′ − b′′c′′) − a′′(b′′c + bc′′ − b′′c′′),
a′(bc) + a(b′′c + bc′ − b′′c′) − a′(b′′c + bc′ − b′′c′),
a′′(bc) + a(b′c + bc′ − b′c′) − a′′(b′c + bc′ − b′c′)}

= {a′(bc) + a(bc)′ − a′(bc)′, a′′(bc) + a(bc)′′ − a′′(bc)′′}|{a′(bc) + a(bc)′′ − a′(bc)′′, a′′(bc) + a(bc)′ − a′′(bc)′} = a(bc)

where a′ ∈ A′, a′′ ∈ A′′, b′ ∈ B′, b′′ ∈ B′′, c′ ∈ C′, and c′′ ∈ C′′. Thus multiplication is associative.

a(b + c) = {a′(b + c) + a(b + c)′ − a′(b + c)′, a′′(b + c) + a(b + c)′′ − a′′(b + c)′′}|{a′(b + c) + a(b + c)′′ − a′(b + c)′′, a′′(b + c) + a(b + c)′ − a′′(b + c)′}

= {a′(b + c) + a(b + c′) − a′(b + c′), a′(b + c) + a(b′ + c) − a′(b′ + c),
a′′(b + c) + a(b + c′′) − a′′(b + c′′), a′′(b + c) + a(b′′ + c) − a′′(b′′ + c)}
|{a′(b + c) + a(b + c′′) − a′(b + c′′), a′(b + c) + a(b′′ + c) − a′(b′′ + c),
a′′(b + c) + a(b + c′) − a′′(b + c′), a′′(b + c) + a(b′ + c) − a′′(b′ + c)}

= {a′b + a′c + ab + ac′ − a′b − a′c′, a′b + a′c + ab′ + ac − a′b′ − a′c,
a′′b + a′′c + ab + ac′′ − a′′b − a′′c′′, a′′b + a′′c + ab′′ + ac − a′′b′′ − a′′c}
|{a′b + a′c + ab + ac′′ − a′b − a′c′′, a′b + a′c + ab′′ + ac − a′b′′ − a′c,
a′′b + a′′c + ab + ac′ − a′′b − a′′c′, a′′b + a′′c + ab′ + ac − a′′b′ − a′′c}

= {ab + a′c + ac′ − a′c′, ac + a′b + ab′ − a′b′,
ab + a′′c + ac′′ − a′′c′′, ac + a′′b + ab′′ − a′′b′′}
|{ab + a′c + ac′′ − a′c′′, ac + a′b + ab′′ − a′b′′,
ab + a′′c + ac′ − a′′c′, ac + a′′b + ab′ − a′′b′}

= {ab + (ac)′, (ab)′ + ac}|{ab + (ac)′′, (ab)′′ + ac} = ab + ac

where a′ ∈ A′, a′′ ∈ A′′, b′ ∈ B′, b′′ ∈ B′′, c′ ∈ C′, and c′′ ∈ C′′. Thus multiplication distributes over addition.

We now verify that the surreal number 1 = {0}|{} is the multiplicative identity. First we must verify that any surreal number multiplied by 0 is 0. Let a = A′|A′′ and b = 0 = {}|{}. Then

a · 0 = {a′b + ab′ − a′b′, a′′b + ab′′ − a′′b′′}|{a′b + ab′′ − a′b′′, a′′b + ab′ − a′′b′}

where a′ ∈ A′, b′ ∈ {}, a′′ ∈ A′′ and b′′ ∈ {}.

Since there are no elements in {}, every term above requires a b′ or b′′ that does not exist. We conclude that the left and right sets of the product are empty, and thus the product is the surreal number 0.

Now consider a · 1. Since 1 = {0}|{}, the only choice is b′ = 0 and there is no b′′, so

a · 1 = {a′ · 1 + a · 0 − a′ · 0}|{a′′ · 1 + a · 0 − a′′ · 0} = {a′}|{a′′} = a,

where a′ · 1 = a′ and a′′ · 1 = a′′ by the inductive hypothesis.

Hence 1 is the multiplicative identity.

Demonstrating that all non-zero surreal numbers have multiplicative inverses is considerably more technical than the calculation arguments used for the associative and distributive properties. A complete development of inverses is given in the book “An Introduction to the Theory of Surreal Numbers” by Harry Gonshor. We give here an intuitive argument as to why inverses should exist.

Given a non-zero surreal number x we naively pick a candidate y0 = B′|B′′ for its inverse, with y0 > 0 if x > 0 and y0 < 0 if x < 0. If x · y0 > 1, then we construct our next candidate by y1 = B′|(B′′ ∪ {y0}) if x > 0, or y1 = (B′ ∪ {y0})|B′′ if x < 0. If x · y0 < 1, then we construct our next candidate by y1 = (B′ ∪ {y0})|B′′ if x > 0, or y1 = B′|(B′′ ∪ {y0}) if x < 0.

We proceed in this fashion until the product is in fact 1. We naively believe that the procedure will eventually end, as we will exhaust all possible numbers that could lie between the sets of x · yα except for 1. Then yα will be the multiplicative inverse of x.
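The procedure can be traced by hand in two small cases. This is a sketch only: it assumes the starting candidate y0 = 1 = {0}|{} (our choice, not prescribed by the text) and uses ordinary arithmetic on dyadic rationals for the products. For x = 2 the procedure stops after one step. For x = 3 every finite candidate is a dyadic rational, so no finite stage can give the product 1; the candidates only approach 1/3 from both sides, which suggests why the index α must be allowed to run past the finite ordinals.

```latex
\begin{align*}
x = 2:\quad & y_0 = \{0\} \mid \{\,\} = 1, && x y_0 = 2 > 1, \\
            & y_1 = \{0\} \mid \{1\} = \tfrac12, && x y_1 = 1 . \\[4pt]
x = 3:\quad & y_0 = \{0\} \mid \{\,\} = 1, && x y_0 = 3 > 1, \\
            & y_1 = \{0\} \mid \{1\} = \tfrac12, && x y_1 = \tfrac32 > 1, \\
            & y_2 = \{0\} \mid \{1, \tfrac12\} = \tfrac14, && x y_2 = \tfrac34 < 1, \\
            & y_3 = \{0, \tfrac14\} \mid \{1, \tfrac12\} = \tfrac38, && x y_3 = \tfrac98 > 1, \\
            & y_4 = \{0, \tfrac14\} \mid \{1, \tfrac12, \tfrac38\} = \tfrac5{16}, && x y_4 = \tfrac{15}{16} < 1, \;\dots
\end{align*}
```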

APPENDIX 1

The Continuum Hypothesis

A major question arising from the very beginning of Set Theory is one regarding the relationship between ordinal numbers and cardinal numbers. Namely, which ordinal is the first uncountable ordinal? We know that ordinal numbers and cardinal numbers are identical until the ordinal ω + 1, which has the same cardinality as ω, namely ℵ0. Since we know that 2^ℵ0 has cardinality greater than ℵ0, could that be the next cardinal number? That is, is there no ordinal number whose cardinality is strictly greater than ℵ0 and strictly less than 2^ℵ0? We call ℵ1 the next cardinal greater than ℵ0, and we formally state the hypothesis ℵ1 = 2^ℵ0, which is known as the continuum hypothesis.

Recall that cardinal numbers are ordinal numbers, so the existence of the next larger cardinal number is guaranteed by the well ordering of ordinal numbers. If we let ℵ be an arbitrary infinite cardinal number, we designate the next larger cardinal number by ℵ+. We can thus generalize the continuum hypothesis by

ℵ+ = 2^ℵ  ∀ℵ ≥ ℵ0.

Paul Cohen demonstrated in 1963 that the continuum hypothesis cannot be proved from the Zermelo-Fraenkel axioms. Together with Kurt Gödel's earlier result that it cannot be disproved from them either, this means that the answer to the above question cannot be decided with the Zermelo-Fraenkel axioms. That is, the continuum hypothesis is independent of the ZF axioms! A sketch of the argument that establishes this fact is given in Keith Devlin’s book The Joy of Sets, and a rigorous account is given in Bell’s book Boolean-Valued Models and Independence Proofs in Set Theory.

APPENDIX 2

The number 1

The symbol that we use to represent a set is the pair of braces { and }. The two braces should be considered a single symbol, since in representing sets the left or right brace alone is meaningless. If we replace these symbols with a simple closed curve, that is, a circle, then we can essentially illustrate numbers. Here we illustrate the various numbers 1.

The ordinal number 1 is also the cardinal number 1 and the natural

number 1.

The integer 1 and rational 1 are represented by their canonical represen-

tative from their respective equivalence classes.

The real 1 is an infinite collection of rational numbers and does not lend

itself to a concise picture, and hence we do not represent it here.

[Figures: ORDINAL 1, INTEGER 1, RATIONAL 1, SURREAL 1]

APPENDIX 3

Quantifiers

∀ for all or for any

∃ there exists

! Unique

Logical Connectives

−   not

∨   or

∧   and

⇒   implies; a ⇒ b: if a, then b

⇔   if and only if; a ⇔ b: if a, then b, and if b, then a

⇏   does not imply; a ⇏ b ≡ −(a ⇒ b)

Logical Contradiction

→←   The preceding statement is self contradictory.

e.g. (a ⇒ b) ∧ (a ⇏ b) →←

Truth Tables

a   −a
T   F
F   T

a   b   a ∨ b
T   T   T
T   F   T
F   T   T
F   F   F

a   b   a ∧ b
T   T   T
T   F   F
F   T   F
F   F   F

a   b   a ⇒ b
T   T   T
T   F   F
F   T   T
F   F   T

a   b   a ⇔ b
T   T   T
T   F   F
F   T   F
F   F   T
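The tables can be combined to check equivalences row by row. As one illustration (a sketch, not part of the original appendix), the column for −a ∨ b agrees with the column for a ⇒ b in every row, so the two statements are logically equivalent. The symbol − is the negation used above.

```latex
\[
\begin{array}{cc|c|c|c}
a & b & -a & -a \lor b & a \Rightarrow b \\ \hline
\mathrm{T} & \mathrm{T} & \mathrm{F} & \mathrm{T} & \mathrm{T} \\
\mathrm{T} & \mathrm{F} & \mathrm{F} & \mathrm{F} & \mathrm{F} \\
\mathrm{F} & \mathrm{T} & \mathrm{T} & \mathrm{T} & \mathrm{T} \\
\mathrm{F} & \mathrm{F} & \mathrm{T} & \mathrm{T} & \mathrm{T}
\end{array}
\]
```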

Bibliography

[1] Halmos, Paul R. Naive Set Theory. D. Van Nostrand Company, Inc., Princeton, NJ, 1960.

[2] Stoll, Robert R. Set Theory and Logic. W. H. Freeman and Company, San Francisco, CA, 1963.

[3] Rudin, Walter. The Principles of Mathematical Analysis, third edition. McGraw-Hill, Inc., New York, NY, 1964.

[4] Landau, Edmund. Foundations of Analysis. Chelsea Publishing Company, New York, NY, 1951.

[5] Gonshor, Harry. An Introduction to the Theory of Surreal Numbers. Cambridge University Press, Cambridge, U.K., 1986.

[6] Knuth, Donald. Surreal Numbers. Addison-Wesley Publishing Company, Menlo Park, CA, 1974.

[7] Devlin, Keith. The Joy of Sets, Second Edition. Springer-Verlag, New York, NY, 1993.

[8] Bell, J.L. Boolean-Valued Models and Independence Proofs in Set Theory. Oxford University Press, London, 1977.

[9] Webster’s New World Dictionary. Warner Books Inc., New York, 1990.

Index

Term Page

Absolute Value 96
addition 51
additive identity 67
additive inverse 67
Aleph 33
antisymmetric 20
Archimedes 102
Archimedean Property 73
Arithmetic 50
Axiom of Choice 16
Axiom of extension 3
Axiom of infinity 10
Axiom of pairing 6
Axiom of power sets 9
Axiom of regularity 17
Axiom of unions 7
Axiom schema of replacement 17
Axiom schema of restriction 17
Axiom schema of specification 3
Bijection 30
binary operation 50
Binary representation 92
Bishop Berkeley 102
bound 39
C 98
canonical representation 109
Cantor 102
Cantor’s Theorem 33
Cardinal Arithmetic 50
Cardinal Number 32
cardinality 30
Cartesian Product 13
Cauchy Sequence 96
chain 41
Choice Function 15
codomain 13
comparable 43
complete 40
Complex Numbers 98
Complement 8
Composition 30
continuation 45
continuum hypothesis 118
Conway 107
countable 32
countably infinite 32
counting numbers 59
Counting Theorem 47
Cut 78
De Morgan laws 8
Dedekind cut 78
disjoint 8
disjoint union 51
division algorithm 74
domain 13
element 1
embedding 65
Empty Set 4
equivalence relation 31
exponentiation 52, 54
Field 71
fluxion 102
Function 13, 30
greatest lower bound 40, 90
H 101
image 13
induction 72
infimum 40, 90
initial segment 108
injection 65
integer addition 62
integer multiplication 62
Integers 60
Integral Domain 67
Intersection 7
Isaac Newton 102
K 75
Knuth 101
least upper bound 40, 89
Limit Ordinal 27
linear order 21
lower bound 40
Mathematical Induction 72
map 30
multiplication 51
N 59
natural numbers 59
Negative Integers 66
Octonions 101
One to One 30
Onto 30
order 20
order isomorphic 41
order preserving 40
order type 48
Ordinal Arithmetic 52
Ordinal Number 22
partial order 20
Partition 31
Positive Integers 66
Power 52
Power set 9
preimage 13
product 51
Projection Map 15
Proper Class 19
proper initial segment 108
Proposition 3
Q 68
Quaternions 100
R 78
range 13
Rational numbers 68
Real Number 78
reflexive 20
relation 20
Russell’s Paradox 5
S 107
Schroder-Bernstein Theorem 34
section 21
sequence 91
Set 1
simple closed curve 119
spaces 1
subset 8
successor 10
successor set 11
sum 51
supremum 26, 40, 89
Supremum Property 90
Surreal numbers 107
total order 21
tower 43
Transfinite Induction 24
Transfinite numbers 102
Transfinite Recursion Theorem 27
transitive 20
Trichotomy 21
Trichotomy Property 80
uncountable 32
union 7
upper bound 26, 39
weak section 22
well ordered 21
Well ordering theorem 45
Z 59
Zermelo Fraenkel 1
Zorn’s Lemma 41
