Set Theory and the Construction of Numbers
Robert Andersen
Department of Mathematics
University of Wisconsin - Eau Claire
I INTRODUCTION
Modern mathematics is couched in the language and steeped in the theory
of sets. Sets are the heart and soul of mathematics. Sets should be well
understood by any serious student of the subject. What we attempt to do
here is to present the Zermelo Fraenkel Axioms for Set Theory and develop a
model for those axioms. We call this model the number model, as our usual
concept of numbers is a consequence of this model.
There is an initial temptation when developing a theory to define all of the
terms. This temptation is quickly extinguished when we realize its futility.
To define a new term we must define it in terms of previous terms; thus we
are faced with two possibilities, either having circular definitions or having
an infinite regression of definitions. Either case is unsatisfactory. Therefore
we begin with undefined terms, or as I prefer “dictionary” defined terms.
That is, definitions taken directly from a standard dictionary.
A set is a group of persons or things classed or belonging together. We
may paraphrase this as: A set is a collection of objects. However, as we shall
see later, not every collection can be regarded as a set. Collections, either
sets or non-sets, are often referred to as spaces to avoid repetitive rhetoric.
The objects, persons, or things that make up the set we shall call ele-
ments. A single one of these objects is of course an element.
The aggregate of elements is the set. If a particular element is a member
of the aggregate we say it is an element of the set. Let a represent a set and
x one of its elements. We express this fact with the notation x ∈ a, and we
read this as: x is an element of a.
In developing an axiomatic discussion of sets it may be tempting to posit
the existence of a set. This is troublesome for the following reason. In any
well defined axiom system the axioms are chosen to be independent. An
axiom is independent of the other axioms if a model can be constructed,
using all other axioms, and replacing the axiom in question with another
that is its denial or implies its denial. An axiom system is independent if all
of its axioms are independent. If our axiom system includes an axiom that
asserts the existence of a set, then to demonstrate its independence, we must
construct a model that assumes the validity of all other axioms, which are
statements about sets, and an axiom that denies the existence of a set. It is
not necessary to posit the existence of a set since examples of concrete sets
are ubiquitous. An examination of my right front pocket reveals a collection
of keys; indeed, a set exists.
II THE AXIOMS
The classical collection of axioms for set theory is the Zermelo Fraenkel
Axioms (ZF Axioms).
The Axiom of Extension
ZF1 Axiom of extension. Two sets are equal if and only if they have
identically the same elements. ∀a∀b(a = b ⇔ ∀x(x ∈ a ⇔ x ∈ b)).
This axiom not only defines the term equal and its associated symbol, =,
but also tells us that it is the elements that uniquely define a set. The set
may however have different descriptions. For example consider those planets
that orbit the sun closer than the earth, and those planets of our solar system
with no natural satellite. A simple check of an almanac reveals that exactly
the same collection of planets satisfies both descriptions.
Let a be any set. Let P (x) be a proposition about an arbitrary element
x from a; that is, for every x ∈ a, P (x) is a statement that is either true or
false. An example of a non-proposition in the variable x is x ∈ b. In this
example b is also an indeterminate. Without specifying b we have no way of
knowing whether any particular element of our initial set a is in b or not.
The Axiom Schema of Specification
ZF2 Axiom schema of specification. For every set a and proposition P (x)
there is a set b that consists of those elements of a where P (x) is true.
∀a∃b∀x(x ∈ b ⇔ (x ∈ a ∧ P (x))).
ZF2 is not regarded as a single axiom but rather as a collection of axioms,
hence the term axiom schema is used. Each possible proposition P (x)
produces a separate axiom. Several of the Zermelo Fraenkel Axioms are in
fact axiom schemas and will be identified as such.
By convention, braces { and } are used when representing a set, with the
elements listed between the braces. The pair of braces should be considered
as a single symbol; a single brace without its complement is meaningless in
set theory. The braces can be thought of as the purse that holds the coins.
The purse is by no means a member of the collection of coins, and the braces
are not elements of a set. The braces simply mean that the elements listed
between them are to be considered a set. {x, y, z} is the set that consists of
the letters x, y, z (the commas, of course, are punctuation).
When we use an axiom of specification we express the set b (of ZF2) by
{x ∈ a | P (x)}, which is read: The set of all x in a such that P (x). The set a
is called the Universal Set and P (x) is the proposition that must be satisfied
for an element to be included in the set b.
Let a be an arbitrary set, for example let a be the collection of nations
in the United Nations. Now let P (x) be the statement: x is not x, e.g.,
Canada is not Canada. At first glance this statement may seem absurd, but
in reality it is not absurd but simply false for every element of our set. Using
our notation we write the set specified as:
{x ∈ a | x ≠ x}.
By virtue of the axiom schema of specification this set exists, but no
element of a satisfies the proposition x ≠ x. Hence we must conclude that
the set is vacuous and we call it the Empty or Null Set. The empty set is not
just an interesting or pathological footnote to the axiom of specification, but
is absolutely crucial to the study of sets. In fact we may at this time choose
to disregard any other set (and I choose to do so) and focus our attention
solely on the empty set and other sets that may be derived from it by the
axioms. We reserve the symbols ∅ and { } for the empty set. We may state
this result as a theorem.
Theorem 2.1 There exists a set that has no elements.
Here is the formal proof.
Proof Let ∅ = {x ∈ a | x ≠ x}. Since for any set a no element satisfies the
proposition x ≠ x, we conclude ∅ has no elements.
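The axiom schema of specification behaves much like a filtered comprehension over a finite set. The following is a minimal sketch in Python, assuming finite sets modelled as frozensets; the helper name specify and the sample nations are illustrative assumptions, not part of the theory.

```python
def specify(a, P):
    """Return {x in a | P(x)} as a frozenset (finite model of ZF2)."""
    return frozenset(x for x in a if P(x))

# A small stand-in for the set a of the text (hypothetical sample data).
a = frozenset({"Canada", "France", "Japan"})

# A proposition that is true for some elements of a.
starts_with_c = specify(a, lambda x: x.startswith("C"))

# The proposition "x is not x" is false for every element, so the result is empty.
empty = specify(a, lambda x: x != x)
```

Whatever finite set a we start from, specifying with the proposition x ≠ x yields the same empty frozenset, which mirrors Theorem 2.1.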
An interesting result known as Russell’s Paradox can now be presented.
Since sets themselves can be regarded as objects, it seems quite reasonable
to consider a collection of sets as a set itself. Let s be the collection of all
sets and consider the following set.
r = {x ∈ s | x ∉ x}.
Now the set r is either an element of itself or it is not. Assume that it is,
i.e., we assume r ∈ r. Then by the defining proposition of the set r we see
that it must be the case that r ∉ r, contrary to the assumption. Thus we
may assume that the other case is true, that r is not an element of itself, i.e.,
r ∉ r, but we see that if that is true we must necessarily have r ∈ r. Again
this presents a contradiction. Clearly the collection, s, is much too large or
encompassing to be considered a set. This unfortunate result is handled by
an appropriate axiom yet to be stated. (See ZF9)
At this stage the only sets that we have are real, tangible objects and the
empty set. We wish to present axioms that allow us to expand our collection
of sets to more abstract objects. The next few axioms allow us to construct
new sets from preexisting sets.
The Axiom of Pairing
ZF3 Axiom of pairing. If a and b are sets then there exists a set c such
that a ∈ c and b ∈ c. ∀a∀b∃c(a ∈ c ∧ b ∈ c).
There is an important fact to note here. The axiom specifies two elements
that are in c but says nothing about what else may be in c; in this sense c is
not a well defined set. But by the use of an axiom of specification we may
construct a well defined set. Let c be a set guaranteed to exist by the axiom
of pairing and consider the following set:
d = {x ∈ c | x = a or x = b}.
Clearly d = {a, b}.
Another point to note here is that a and b may be the same set, that is
a = b. In this case we have the following:
d = {a, b} = {a, a} = {a}.
The last equation is true by virtue of the axiom of extension. Since
the sets on each side of the equal sign have exactly the same elements the
statement is true.
Using only the empty set, we can now construct a new set. Let a = b = ∅, and we have d = {∅}.
We may continue ad infinitum constructing new sets from previously
constructed sets; for example let a = ∅ and b = {∅}, and we now have
d′ = {∅, {∅}}.
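The two constructions above can be imitated with Python frozensets, which are hashable and so may be elements of other frozensets. This is only a sketch; the names EMPTY, pair, d and d_prime are assumptions made for illustration.

```python
EMPTY = frozenset()

def pair(a, b):
    """Return the set {a, b}; when a == b it collapses to the singleton {a}."""
    return frozenset({a, b})

d = pair(EMPTY, EMPTY)    # {∅}
d_prime = pair(EMPTY, d)  # {∅, {∅}}
```

The collapse of {a, a} to {a} happens automatically here because frozensets, like sets under the axiom of extension, are determined by their elements alone.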
The Axiom of Unions
ZF4 Axiom of unions. For every set c there is a set a where, if b ∈ c
and x ∈ b, then x ∈ a. ∀c∃a∀b∀x((x ∈ b ∧ b ∈ c) ⇒ x ∈ a).
Here we regard the set c as a collection of sets. We form a new set by
including every element of each of the member sets of c. If c is not a collection
of sets then the statement is vacuously true and a is possibly empty. Again a
is not a well defined set; the axiom does not exclude other possible elements
that do not satisfy our stated condition. An application of the axiom of
specification (ZF2) will form a set that contains only those elements that are
contained in the sets of c.
Let a′ represent that set whose elements consist of only those elements of
a that are elements of members of c. We write
a′ = ⋃_{b∈c} b, or simply a′ = ⋃c,
which is read: a prime is the union of c.
If c consists of only two elements, b1, b2, then we write b1 ∪ b2.
A closely related concept to the union of sets is the intersection of sets.
Let c be a collection of sets. We define the intersection of this collection of
sets by
⋂_{b∈c} b = {x ∈ ⋃c | x ∈ b for every b in c}.
The reader may question why we choose the Union of c for our Universal
set when any element of c would suffice. By choosing the Union we eliminate
the problem of devising some mechanism to choose some element of c to be
our Universal set.
If we are constructing the intersection of only two sets, b1, b2, then we
write b1 ∩ b2.
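For a finite collection c, the union and the intersection can be computed directly from their definitions. The sketch below assumes frozensets of integers; big_union and big_intersection are hypothetical helper names. Note that the intersection is specified inside ⋃c, exactly as in the definition above.

```python
def big_union(c):
    """Return the set of all x with x in b for some b in c."""
    return frozenset(x for b in c for x in b)

def big_intersection(c):
    """Return {x in union(c) | x in b for every b in c}."""
    return frozenset(x for x in big_union(c) if all(x in b for b in c))

c = frozenset({
    frozenset({1, 2, 3}),
    frozenset({2, 3, 4}),
    frozenset({3, 4, 5}),
})
u = big_union(c)         # every element of every member of c
i = big_intersection(c)  # elements common to all members of c
```

Because the universal set is ⋃c, no member of c has to be singled out, and an empty collection simply yields the empty set.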
Definition We say the sets a and b are disjoint, if a ∩ b = ∅.
Definition For two sets a and b we say a is a subset of b, written a ⊂ b,
if and only if every element of a is an element of b; i.e., a ⊂ b ⇔ ∀x(x ∈ a ⇒ x ∈ b).
Theorem 2.2 a = b if and only if a ⊂ b and b ⊂ a.
This theorem follows immediately from the definition of subset and the
axiom of extension (ZF1).
Note that it is vacuously true that ∅ ⊂ a for every set a.
If a ⊂ b and a ≠ b, we say that a is a proper subset of b, and we write
a ⊊ b.
Definition For sets a and b, the complement of b in a, written
a − b, is the set
a − b = {x ∈ a | x ∉ b}.
Theorem 2.3 De Morgan laws If c is a set and b is a collection of subsets
of c, then
c − ⋃_{a∈b} a = ⋂_{a∈b} (c − a) and c − ⋂_{a∈b} a = ⋃_{a∈b} (c − a).
Proof
[x ∈ c − ⋃_{a∈b} a ⇒ x ∈ c and x ∉ ⋃_{a∈b} a
⇒ x ∈ c and ∀a ∈ b, x ∉ a
⇒ ∀a ∈ b, x ∈ (c − a)
⇒ x ∈ ⋂_{a∈b} (c − a)]
⇒ c − ⋃_{a∈b} a ⊂ ⋂_{a∈b} (c − a).
On the other hand
[x ∈ ⋂_{a∈b} (c − a) ⇒ ∀a ∈ b, x ∈ (c − a)
⇒ x ∈ c and ∀a ∈ b, x ∉ a
⇒ x ∈ c and x ∉ ⋃_{a∈b} a
⇒ x ∈ c − ⋃_{a∈b} a]
⇒ ⋂_{a∈b} (c − a) ⊂ c − ⋃_{a∈b} a.
By Theorem 2.2 we have the result
c − ⋃_{a∈b} a = ⋂_{a∈b} (c − a).
The proof that
c − ⋂_{a∈b} a = ⋃_{a∈b} (c − a)
is similar and left to the reader.
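A finite check is no substitute for the proof, but it can make the two De Morgan laws concrete. The sketch below assumes a non-empty collection b of subsets of a small set c; all names and sample values are illustrative.

```python
c = frozenset(range(8))
b = [frozenset({0, 1, 2}), frozenset({2, 3}), frozenset({2, 5, 6})]

union_b = frozenset().union(*b)          # union of the members of b
inter_b = frozenset.intersection(*b)     # intersection of the members of b
complements = [c - a for a in b]         # the sets c − a

# First law: c − ⋃ a = ⋂ (c − a)
lhs1 = c - union_b
rhs1 = frozenset.intersection(*complements)

# Second law: c − ⋂ a = ⋃ (c − a)
lhs2 = c - inter_b
rhs2 = frozenset().union(*complements)
```

Both pairs agree for this sample, as the theorem predicts.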
The Axiom of Power Sets
ZF5 Axiom of power sets. For any set a there is a set b such that if
x ⊂ a, then x ∈ b. ∀a∃b∀x(x ⊂ a ⇒ x ∈ b).
Again we must apply an axiom of specification (ZF2) to construct a well
defined set whose elements are only the subsets of a. This set is called the
power set of a and is written P(a).
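For a finite set the power set can be enumerated outright. The following sketch uses itertools.combinations from the Python standard library; the function name power_set is an assumption chosen for illustration.

```python
from itertools import combinations

def power_set(a):
    """Return the set of all subsets of the finite set a."""
    elems = list(a)
    return frozenset(
        frozenset(combo)
        for r in range(len(elems) + 1)
        for combo in combinations(elems, r)
    )

p = power_set(frozenset({1, 2, 3}))
```

A set with n elements has 2^n subsets, so p has eight elements, among them ∅ and the set {1, 2, 3} itself.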
By virtue of the axiom of pairing (ZF3) for any set a we may construct
the singleton set {a}. Now by virtue of the axiom of unions (ZF4) we may
construct the set a ∪ {a}. This set is known as the successor of a and is
written a + 1. We formally define the successor by:
Definition For any set a the successor of a is the set a + 1 given by:
a + 1 = a ∪ {a}.
Examples:
∅ + 1 = ∅ ∪ {∅} = {∅}
∅ + 1 + 1 = {∅} ∪ {{∅}} = {∅, {∅}}
For convenience we name these sets: ∅ = 0, {∅} = 1, {∅, {∅}} = 2.
Thus we see:
{ } = 0
{0} = 1
{0, 1} = 2
...
{0, 1, . . . , n − 1} = n
...
Remark: The notation a + 1 + 1 + · · · + 1 is unambiguous as its only
interpretation is (. . . ((a + 1) + 1) + · · · + 1). To associate differently, e.g.
a + (1 + 1 + · · · + 1), has no meaning. The notation a + n refers to the nth
successor of a. That is, a + n = a + 1 + 1 + · · · + 1, with n ones.
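The successor operation and the sets named 0, 1, 2, . . . can be built mechanically. The sketch below again models sets as frozensets; successor and numeral are hypothetical helper names.

```python
def successor(a):
    """Return a + 1 = a ∪ {a}."""
    return a | frozenset({a})

def numeral(n):
    """Return the set named n: the successor applied n times to the empty set."""
    s = frozenset()
    for _ in range(n):
        s = successor(s)
    return s

zero, one, two, three = (numeral(k) for k in range(4))
```

Each numeral(n) has exactly n elements, namely numeral(0), . . . , numeral(n − 1), which matches the listing {0, 1, . . . , n − 1} = n.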
The Axiom of Infinity
The following axiom is a powerful statement that allows for the construction
of arbitrarily large sets and allows us to regard unbounded classes of numbers
as sets.
ZF6 Axiom of infinity. There exists a set, a, that contains ∅, and the
successor of each of its elements. ∃a(∅ ∈ a ∧ ∀x(x ∈ a ⇒ x + 1 ∈ a)).
A set that satisfies the axiom of infinity is called a successor set.
Theorem 2.4 If a and b are successor sets, then a ∩ b is a successor set.
Proof (a, b successor sets) ⇒ (∅ ∈ a and ∅ ∈ b) ⇒ (∅ ∈ a ∩ b). Also, (α ∈ a ∩ b) ⇒ (α ∈ a and α ∈ b) ⇒ (α + 1 ∈ a and α + 1 ∈ b) ⇒ (α + 1 ∈ a ∩ b).
We generalize this theorem to include arbitrary intersections and hence
the above theorem becomes a corollary to the following theorem.
Theorem 2.5 If A is an arbitrary collection of successor sets, then ⋂A is a
successor set.
Proof Let A be a collection of successor sets. ∅ ∈ a ∀a ∈ A ⇒ ∅ ∈ ⋂_{a∈A} a.
α ∈ ⋂_{a∈A} a ⇒ α ∈ a ∀a ∈ A ⇒ α + 1 ∈ a ∀a ∈ A ⇒ α + 1 ∈ ⋂_{a∈A} a.
Let Ω be a successor set, and let A = {a ∈ P(Ω) | a is a successor set}; that is, A is the collection of all successor subsets of Ω. We let ω = ⋂A.
We notice ω is unique regardless of the initial choice of Ω since if we
let Ω1 and Ω2 be two successor sets we have Ω1 ∩ Ω2 is a successor set and
Ω1 ∩ Ω2 ⊂ Ω1 and Ω1 ∩ Ω2 ⊂ Ω2.
We also see ω is the minimal successor set. Thus we see ω = {0, 1, 2, . . .}.
We can now construct the successors of ω: ω + 1 = {0, 1, 2, . . . , ω} and
ω + 1 + 1 = {0, 1, 2, . . . , ω, ω + 1}.
We name ω + 1 + 1 = ω + 2 and ω + 1 + · · · + 1 (n ones) = ω + n.
It is important to observe here that the successor of a set is not a successor
set.
The Ordering of Sets, Cartesian Products and Functions
Definition We say that a is less than b, written a < b, iff a ∈ b.
We should note here that sets and elements are somewhat synonymous,
they differ only in their relationship to each other.
Now we wish to deal with the situation where we have a set with two
or more elements and we wish to designate one element as the first element,
and another as the second and so forth. Let us begin with the easiest case,
a set with two elements.
Let x and y be elements. By virtue of the axiom of pairing (ZF3) we can
construct the two sets {x, y} and {x}. Again using the axiom of pairing we
construct the set {{x}, {x, y}}. For simplicity of notation we use (x, y) to
represent the set {{x}, {x, y}}.
Definition The ordered pair, (x, y), is the set {{x}, {x, y}}.
We regard the ordered triple (x, y, z) as the ordered pair
((x, y), z) = {{{{x}, {x, y}}}, {{{x}, {x, y}}, z}}.
Thus we may inductively define the ordered n-tuple by
(x1, x2, · · · , xn) = ((x1, x2, · · · , xn−1), xn).
We now want to consider certain subsets of the collection of all ordered
pairs formed from two sets.
Definition The Cartesian product of two sets a and b is
a × b = {z ∈ P(P(a ∪ b)) | z = {{x}, {x, y}} where x ∈ a, y ∈ b}.
For simplicity of notation we write
a × b = {(x, y) | x ∈ a, y ∈ b}.
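The set-theoretic ordered pair and the Cartesian product built from it can be written out for finite sets. This is a sketch under the assumption that elements are hashable Python values; ordered_pair and cartesian are illustrative names.

```python
def ordered_pair(x, y):
    """Return the set {{x}, {x, y}} that represents (x, y)."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def cartesian(a, b):
    """Return a × b as a set of set-theoretic ordered pairs."""
    return frozenset(ordered_pair(x, y) for x in a for y in b)

prod = cartesian({1, 2}, {2, 3})
```

Unlike the unordered pair {x, y}, the pair (x, y) distinguishes its first element: ordered_pair(1, 2) and ordered_pair(2, 1) are different sets, while ordered_pair(1, 1) collapses to {{1}}.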
Definition A function from a set a to a set b is a subset, f , of a × b that
satisfies the following two conditions:
1. ∀x ∈ a ∃y ∈ b such that (x, y) ∈ f , and
2. if (x, y) ∈ f and (x, z) ∈ f then y = z.
To work with functions efficiently, it helps to name the related sets. The
set a is called the domain of f , and b is called the codomain of f . The
range of f is the set {y ∈ b | (x, y) ∈ f for some x ∈ a}. If c ⊂ a, then the image of c under
f is the set {y ∈ b | x ∈ c and (x, y) ∈ f}. If d ⊂ b, then the preimage
of d under f is the set {x ∈ a | (x, y) ∈ f and y ∈ d}. We write f(c) for the
image of c under f , and f⁻¹(d) for the preimage of d under f . Also, if the
range of a function f is equal to its codomain, we say the function is onto
its codomain.
We express a function, f , with domain a and codomain b by f : a → b.
Also for a function f , if (x, y) ∈ f , then we express y as f(x). We may write
y = f(x), or (x, f(x)) to represent the element (x, y) of f .
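The definition of a function as a set of pairs can be tested directly on finite examples. For readability the sketch below uses Python tuples in place of the set-theoretic pairs; is_function, image and preimage are hypothetical helper names.

```python
def is_function(f, a, b):
    """Check that f ⊂ a × b and each x in a has exactly one partner."""
    if not all(x in a and y in b for (x, y) in f):
        return False
    return all(sum(1 for (u, _) in f if u == x) == 1 for x in a)

def image(f, c):
    """Return f(c) = {y | x in c and (x, y) in f}."""
    return {y for (x, y) in f if x in c}

def preimage(f, d):
    """Return the preimage of d: {x | (x, y) in f and y in d}."""
    return {x for (x, y) in f if y in d}

a = {1, 2, 3}
b = {"p", "q", "r"}
f = {(1, "p"), (2, "p"), (3, "q")}
```

Here f is a function from a to b whose range image(f, a) is {"p", "q"}, so f is not onto its codomain. Adding the pair (1, "q") would violate condition 2, and removing (3, "q") would violate condition 1.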
We wish to generalize the cartesian product to arbitrary collections of
sets. To do so we introduce the concept of indexing one set by another.
Definition A function I from a set Λ onto a set a is said to index the set
a by Λ. The set Λ is called the index set and a is the indexed set. For λ ∈ Λ
we write a_λ for I(λ).
Definition Let a be a non-empty set (we remind the reader here that the
elements of sets are considered to be sets) indexed by a set Λ, and write b_λ
for the element of a indexed by λ. The cartesian
product of a is defined to be the collection, ∏, of all functions f with domain
Λ and codomain ⋃_{b∈a} b, satisfying the condition f(λ) ∈ b_λ for every λ ∈ Λ. We write
c = ∏_{λ∈Λ} b_λ.
For clarity we present some examples here.
Let a = {{1, 2}, {2, 3}}. Then we have the four functions
f1 = {({1, 2}, 1), ({2, 3}, 2)}
f2 = {({1, 2}, 1), ({2, 3}, 3)}
f3 = {({1, 2}, 2), ({2, 3}, 2)}
f4 = {({1, 2}, 2), ({2, 3}, 3)}
This notation is cumbersome but we can represent these functions by the
following ordered pairs without loss of any information.
f1 = (1, 2), f2 = (1, 3), f3 = (2, 2), f4 = (2, 3).
Hence the cartesian product is
∏ = {(1, 2), (1, 3), (2, 2), (2, 3)}.
In this example where there are only two elements in a we may write
{1, 2} × {2, 3} = {(1, 2), (1, 3), (2, 2), (2, 3)}.
We leave as an exercise for the reader to demonstrate that the cartesian
product of the following collection
a = {{1, 2}, {2, 3}, {3, 4}}
can be represented by
∏_{λ∈3} a_λ = {(1, 2, 3), (1, 2, 4), (1, 3, 3), (1, 3, 4), (2, 2, 3), (2, 2, 4), (2, 3, 3), (2, 3, 4)}.
Let a = {0, 1} be indexed by ω. The cartesian product of a with respect
to this indexing can be represented by the set of infinite strings of zeros and
ones. That is
∏_{i∈ω} {0, 1}_i = {(b0, b1, · · · ) | bi = 0 or 1 ∀i ∈ ω}.
Any function, f , that satisfies the conditions in the previous definition
is called a choice function. The rationale for this name is clear, as the
function chooses an element from each set.
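For a finite indexed family the choice functions can be listed exhaustively, which recovers the four functions of the first example. The sketch below assumes the family is given as a Python dict from index to set, with the indices 0 and 1 standing in for the index set; choice_functions is a hypothetical name.

```python
from itertools import product

def choice_functions(indexed):
    """Return every function f on the index set with f(i) in indexed[i], as dicts."""
    keys = list(indexed)
    return [dict(zip(keys, values))
            for values in product(*(indexed[k] for k in keys))]

family = {0: {1, 2}, 1: {2, 3}}
fs = choice_functions(family)

# Represent each choice function by the tuple of its values, as in the text.
as_tuples = {(f[0], f[1]) for f in fs}

# The projection onto index 0 is simply evaluation at 0.
first_coordinates = {f[0] for f in fs}
```

If any member of the family is empty, the list of choice functions is empty as well; for infinite families, the existence of even one choice function is what the next axiom asserts.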
Let a be a set and c = ∏_{λ∈Λ} b_λ. The projection map p_{b_λ} : c → b_λ is the
function defined by p_{b_λ}(x) = x(λ).
For our original example we may compute
p_{{1,2}}(f1) = f1({1, 2}) = 1
and
p_{{1,2}}(f3) = f3({1, 2}) = 2.
We leave as an exercise to compute p_{{1,2}}(f2) and p_{{1,2}}(f4).
The Axiom of Choice
The next axiom is known as the Axiom of Choice. We give three formulations
of the statement of this axiom.
ZF7 Axiom of Choice.
I. For every nonempty set whose elements are nonempty sets there
exists a choice function.
II. If {a_i} is a family of nonempty sets, indexed by a nonempty set I,
then there exists a family {x_i} with i ∈ I such that x_i ∈ a_i for each
i ∈ I.
III. The cartesian product of a nonempty collection of nonempty sets is
nonempty. (a ≠ ∅ ∧ ∀x(x ∈ a ⇒ x ≠ ∅)) ⇒ ∏_{x∈a} x ≠ ∅.
Theorem 2.6 The three previous statements are equivalent.
Proof I. ⇒ II. Let A be a collection of disjoint nonempty sets. We have
A ⊂ P(⋃_{i∈I} a_i). By I. there exists a choice function f on P(⋃_{i∈I} a_i). Let b be
the image of A. Pick an element a ∈ A; then f(a) ∈ a ∩ b since f(a) ∈ a. Let y ∈ b
where y ≠ f(a); thus we have y = f(a′) where a′ ≠ a, and thus y ∈ a′. Since
a and a′ are disjoint, y ∉ a. Thus the only element of b ∩ a is f(a).
II. ⇒ I. Define the choice function to be {(a_i, x_i) | i ∈ I}.
I ⇐⇒ III. Since the cartesian product is the collection of choice func-
tions, if a choice function exists then the cartesian product is non-empty.
Conversely if the cartesian product is non-empty its elements are choice func-
tions, thus a choice function exists.
The Axiom Schema of Replacement
ZF8 Axiom schema of replacement. If P (x, y) is a proposition such that
for each x in a set a, P (x, y) and P (x, z) implies that y = z, then there exists
a set b such that y ∈ b if and only if there exists an x in a such that P (x, y).
(∀x ∈ a ∀y ∀z((P (x, y) ∧ P (x, z)) ⇒ y = z)) ⇒ ∃b ∀y(y ∈ b ⇔ ∃x ∈ a P (x, y)).
ZF8 allows the construction of new sets in the following way. If a set a
exists, together with a rule that assigns to each element of a at most one
pre-existing element or set (which may or may not be an element of some other
set), then there exists a set b that contains only those assigned elements.
This axiom schema will be heavily relied upon in the next chapter.
The Axiom Schema of Restriction
ZF9 Axiom schema of restriction. Let S(x) be any proposition involv-
ing x that does not involve y or z. If there exists an x such that S(x) is true,
then there exists a y such that S(y) is true and, for all z, if z ∈ y then S(z)
is false.
If we take S(x) to be the statement x ∈ a, then we have the following
statement, which we call the axiom of regularity.
Axiom of regularity. Every nonempty set a contains an element b such
that a ∩ b = ∅. ∀a(a ≠ ∅ ⇒ ∃b(b ∈ a ∧ a ∩ b = ∅)).
Two important lemmas follow from the axiom of regularity.
Lemma 2.7 For each set a, a ∉ a.
Proof The proof is indirect. We assume that there exists a set a such that
a ∈ a. We thus have a ∈ {a} ∩ a. However by the axiom of regularity {a} contains an element whose intersection with {a} is empty. Since a is the only
element of {a} we have {a} ∩ a = ∅, which contradicts our original assumption.
Corollary There does not exist a set of all sets.
Proof If there existed a set of all sets it would have to be an element of
itself, which would contradict the previous lemma.
Thus we see the axiom of regularity is the response to Russell’s paradox.
Lemma 2.8 No two sets can be elements of each other.
Proof Again the proof is indirect. Assume that a and b are sets such that
a ∈ b and b ∈ a. We thus have a ∈ {a, b} ∩ b and b ∈ {a, b} ∩ a. By the axiom
of regularity we must have an element x ∈ {a, b} such that x ∩ {a, b} = ∅. Since our only two choices are a or b we must have either {a, b} ∩ a = ∅ or
{a, b} ∩ b = ∅, which contradicts our assumption.
These two lemmas can be replaced by a more general theorem (Theorem
3.7) that will be stated and proved in the next chapter. Meanwhile we can
use Lemma 2.8 to prove the following “cancellation” law.
Theorem 2.9 If x + 1 = y + 1, then x = y.
Proof x + 1 = y + 1 ⇒ x ∪ {x} = y ∪ {y} ⇒ either x = y or (x ∈ y and y ∈ x). The latter case is a contradiction to Lemma 2.8.
Since the collection of all sets is an object that can be contemplated the
study of collections can be extended to include collections that are not sets.
Collections that may or may not be sets are called classes. It is not within
the scope of this book to study classes but we give the following definition.
Definition A collection that is not a set is called a proper class.
Exercises Prove the following “distributive” laws.
1. a ∪ (⋂_{λ∈Λ} b_λ) = ⋂_{λ∈Λ} (a ∪ b_λ)
2. a ∩ (⋃_{λ∈Λ} b_λ) = ⋃_{λ∈Λ} (a ∩ b_λ)
Also show
3. (a, b) = (c, d) ⇒ a = c and b = d.
III ORDINAL NUMBERS
Order Relations
Definition For any set a, each subset of a× a is called a relation on a.
If R is a relation on a set a and if (x, y) ∈ R then we write xRy.
Definition A relation that satisfies the following conditions:
R xRx ∀x ∈ a
A xRy & yRx ⇒ x = y.
T xRy & yRz ⇒ xRz
is called an order relation, or simply an order.
Condition R is called reflexive, A is called antisymmetric, and T is called
transitive.
Let a be a set. In our context the elements of a are themselves sets.
We see that R = {(x, y) | x ⊂ y} is an order relation on a. We leave the
verification as an exercise for the reader.
With an order relation we use the symbol x ≼ y instead of xRy. If x ≼ y
we say that x precedes y or x is less than or equal to y.
Definition If x ≼ y and y ⋠ x, then we say x is strictly less than (or simply
less than) y, and we write x ≺ y.
A set together with an order is said to be an ordered set. We often refer
to an ordered set as a partially ordered set to differentiate it from linearly
ordered and well ordered sets which we now define.
Definition An ordered set a in which, for every x, y ∈ a, either x ≼ y or y ≼ x
is said to be linearly ordered or totally ordered.
Theorem 3.1 Trichotomy If a is a linearly ordered set and x, y ∈ a, then exactly
one of the following statements is true:
i) x ≺ y. ii) y ≺ x. iii) x = y.
Proof If a is linearly ordered then we have
x ≼ y or y ≼ x ⇒
i) (x ≼ y and y ≼ x) ⇒ x = y or
ii) (x ≼ y and y ⋠ x) ⇒ x ≺ y or
iii) (x ⋠ y and y ≼ x) ⇒ y ≺ x.
We observe that neither x ≼ y and x ⋠ y, nor y ≼ x and y ⋠ x can be
true at the same time, thus the statements are pairwise inconsistent.
If ∃x ∈ a such that x ≼ y ∀y ∈ a, then we say x is the least element in
a.
Definition An ordered set a is said to be well ordered if and only if
whenever b is any nonempty subset of a, then b has a least element.
Theorem 3.2 Every well ordered set is linearly ordered.
Proof Let a be a well ordered set. Let x, y ∈ a. Since a is well ordered
the subset {x, y} has a least element; thus either x ≼ y or y ≼ x. Hence a is
linearly ordered.
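The three order axioms and the linearity condition can be verified by brute force on a finite set. The sketch below takes the inclusion relation on a small collection of frozensets; is_order and is_linear are hypothetical helper names, and relations are modelled as sets of Python tuples.

```python
def is_order(R, a):
    """Check that R is reflexive, antisymmetric and transitive on a."""
    reflexive = all((x, x) in R for x in a)
    antisymmetric = all(x == y for (x, y) in R if (y, x) in R)
    transitive = all((x, z) in R
                     for (x, y) in R for (w, z) in R if w == y)
    return reflexive and antisymmetric and transitive

def is_linear(R, a):
    """Check that any two elements of a are comparable under R."""
    return all((x, y) in R or (y, x) in R for x in a for y in a)

a = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]
R = {(x, y) for x in a for y in a if x <= y}   # x ⊂ y (inclusion)

chain = [frozenset(), frozenset({1}), frozenset({1, 2})]
R_chain = {(x, y) for x in chain for y in chain if x <= y}
```

Inclusion orders the four subsets of {1, 2} but not linearly, since {1} and {2} are incomparable; restricted to the chain ∅ ⊂ {1} ⊂ {1, 2} the order becomes linear.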
Ordinal Numbers
Definition Let a be a set and let x ∈ a. The section of x with respect to
the set a is
S(x) = {y ∈ a | y ≺ x}.
The weak section of x is
S̄(x) = {y ∈ a | y ≼ x}.
Definition An ordinal number is a well ordered set a where for all x ∈ a,
S(x) = x.
In the last section we constructed the sets
{ } = 0
{0} = 1
{0, 1} = 2
...
{0, 1, . . . , n − 1} = n
...
By virtue of the axiom of infinity we may construct the sets
ω = {0, 1, · · · }
ω + 1 = {0, 1, · · · , ω}
ω + 2 = ω + 1 + 1 = {0, 1, · · · , ω, ω + 1}
...
We can easily verify that these sets satisfy the definition of ordinal num-
bers where the order relation is x ⊂ y.
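For the finite sets n this verification can be carried out by machine. The sketch below rebuilds the numerals as frozensets and checks the defining condition S(x) = x, taking ≺ to be proper inclusion; successor, numeral, section and the counterexample not_ordinal are illustrative assumptions.

```python
def successor(a):
    """Return a + 1 = a ∪ {a}."""
    return a | frozenset({a})

def numeral(n):
    """Return the set named n."""
    s = frozenset()
    for _ in range(n):
        s = successor(s)
    return s

def section(x, a):
    """Return S(x) = {y in a | y ≺ x}, with ≺ read as proper inclusion."""
    return frozenset(y for y in a if y < x)

five = numeral(5)
five_is_ordinal = all(section(x, five) == x for x in five)

# A chain under inclusion that fails the condition: {∅, {{∅}}}.
odd = frozenset({numeral(1)})
not_ordinal = frozenset({numeral(0), odd})
fails = section(odd, not_ordinal) != odd
```

The set {∅, {{∅}}} is linearly ordered by inclusion, yet the section of {{∅}} is {∅}, which differs from {{∅}}; so well ordering alone does not make a set an ordinal number.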
We now want to construct ‘higher order’ ordinal numbers. The axiom of
infinity will not work for us, since it only guarantees the existence of ω. We
now must appeal to the axiom schema of replacement (ZF8) to continue our
constructions.
Let P (x, y) be the proposition: For x ∈ ω, y is the xth successor of ω, i.e.
y = ω + x. Since ordinal successors are unique if z = ω + x we must have
y = z. Thus by the axiom schema of replacement (ZF8) there exists a set b,
such that ω + n ∈ b for every n ∈ ω, and conversely b contains only those
elements. We see that b = {ω, ω + 1, ω + 2, · · · }. We now construct the union
of ω and b.
ω ∪ b = {0, 1, 2, · · · , ω, ω + 1, ω + 2, · · · }
We name this set ω2.
We may repeat this process and construct the set ω3 where P (x, y) is
the proposition: y is the xth successor of ω2. We continue constructing sets
ω4, ω5, · · · . For clarity of discussion we shall refer to ωn as the nth multiple
of ω. We let ω0 = 0 and ω1 = ω.
We now let P (x, y) be the proposition: y is the xth multiple of ω. Since
successors are unique and multiples are unique collections of successors we
have unique multiples. Thus we apply the axiom of replacement (ZF8) and
construct the set b = {0, ω, ω2, · · · }.
We now apply the axiom of unions (ZF4) to construct the union of all
sets of b. We designate this set by ω².
⋃b = ω²
We may visualize ω² by the following array.
ω² = { 0, 1, 2, · · ·
       ω, ω + 1, ω + 2, · · ·
       ω2, ω2 + 1, ω2 + 2, · · ·
       ... }
We note here that an element of ω² is of the form ωn + m where n, m ∈ ω.
We may now construct successors of ω², ω² + n, and by the axiom of
replacement form the set b = {ω² + n | n ∈ ω}. The set ⋃b we call ω² + ω. We
continue as above to form the set b = {ω² + ωn | n ∈ ω}, and the set ⋃b we call
ω²2. We continue and construct the multiples of ω², ω²n. Again by virtue of
the axiom of replacement we construct a set b = {0, ω², ω²2, ω²3, · · · }. And
by virtue of the axiom of unions we construct the set ω³ = ⋃b.
We may continue in this fashion constructing the sets ω⁴, ω⁵, · · · . We
refer to the set ωⁿ as the nth power of ω. We again apply the axiom of
replacement and construct the set
b = {0, ω, ω², ω³, · · · }
and the union of the sets of b forms the set
ω^ω = ⋃b.
We observe here that an element of ω^ω can be expressed in the form
ω^n A_n + ω^{n−1} A_{n−1} + · · · + A_0 ≡ Σ_{k=0}^{n} ω^k A_k, where n ∈ ω and each A_k ∈ ω.
This process of course does not stop here but continues. We may form
sets ω^(ω^ω), · · · . The set ω^(ω^(ω^···)) we call ε0. We of course have no reason to believe
that we have exhausted all ordinal numbers, and may continue in this fashion
ad infinitum.
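The elements of ω^ω are infinite sets and cannot be stored directly, but the expression ω^n A_n + · · · + A_0 gives each one a finite name. As a sketch, and assuming we represent such an element by its tuple of coefficients (A_n, . . . , A_0), the order can be decided by comparing degrees first and coefficients second. The names normalize and less_than are hypothetical.

```python
def normalize(coeffs):
    """Drop leading zero coefficients from (A_n, ..., A_0)."""
    coeffs = tuple(coeffs)
    i = 0
    while i < len(coeffs) and coeffs[i] == 0:
        i += 1
    return coeffs[i:]

def less_than(p, q):
    """Compare two elements of ω^ω given by their coefficient tuples."""
    p, q = normalize(p), normalize(q)
    # A shorter tuple has lower degree; equal lengths compare lexicographically.
    return (len(p), p) < (len(q), q)

five = (5,)               # the finite ordinal 5
omega = (1, 0)            # ω
omega_plus_7 = (1, 7)     # ω + 7
omega_2 = (2, 0)          # the second multiple of ω
omega_squared = (1, 0, 0) # the second power of ω
```

Under this representation every finite ordinal precedes ω, every ω + m precedes the second multiple of ω, and every ωn + m precedes the second power of ω, in agreement with the array displayed earlier.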
Transfinite Induction
Theorem 3.3 The principle of transfinite induction. If a is an ordinal
number and b ⊂ a such that, for x ∈ a, S(x) ⊂ b ⇒ x ∈ b, then b = a.
Proof Suppose to the contrary that a is an ordinal number and b ⊂ a such
that for x ∈ a, S(x) ⊂ b ⇒ x ∈ b but, there exists c ∈ a such that c ∉ b.
Then there exists a nonempty set y = {x ∈ a | x ∉ b}. Since y is a nonempty
subset of a well ordered set it must have a least element, a0, and S(a0) ⊂ b,
thus by our hypothesis a0 ∈ b which contradicts our assumption. Thus we
must conclude b = a.
Properties of Ordinal Numbers
Lemma 3.4 The elements of ordinal numbers are ordinal numbers.
Proof If a is an ordinal number and b ∈ a, then b = S(b) ⊂ a. We notice
that any subset of b is a subset of a, thus b is well ordered, furthermore for
any c ∈ b we have c ∈ a and thus c = S(c). Therefore b is an ordinal number.
Theorem 3.5 Let a, b be ordinal numbers, then either a ⊊ b, or b ⊊ a, or
a = b.
Proof Either a = b, or a ≠ b. If a ≠ b, then either ∃x ∈ a where x ∉ b or
∃x ∈ b where x ∉ a. If ∃x ∈ a where x ∉ b, let t = {x ∈ b | x ∈ a} ⊂ b ∩ a ⊂ a.
We show t = b by transfinite induction, and hence b ⊂ a. Let x ∈ b such
that S(x) ⊂ t, thus S(x) ⊊ a. Since S(x) is a proper subset of a we have
{y ∈ a | y ∉ S(x)} ≠ ∅. Let r be the least element of {y ∈ a | y ∉ S(x)}; then r = S(x) = x. Since r ∈ a we have x ∈ a and thus x ∈ t. Hence by
transfinite induction t = b. By the symmetric argument if ∃x ∈ b where
x ∉ a we have a ⊂ b.
Definition An upper bound for an ordered set C is an element β such
that x ≼ β ∀x ∈ C.
Definition A supremum or least upper bound for an ordered set C is
an element α such that α is an upper bound, and if γ is an upper bound,
then α ≼ γ. We indicate the supremum of an ordered set C by sup C.
When the elements of an ordered set are regarded as numbers we will use
the symbols ≤ and < for ≼ and ≺. Thus for ordinal numbers the symbols
≤, ≼ and ⊂ are equivalent, as are <, ≺ and ⊊.
Theorem 3.6 If C is a set of ordinal numbers, then C has a supremum.
Proof Let α = ⋃C. We claim that α is an ordinal number. Let A ⊂ α and
A ≠ ∅. Pick a ∈ A; if a ≤ b ∀b ∈ A, then a is the least element. If a is not
the least element, then ∃ b ∈ A such that b < a ⇒ b ∈ a. Thus a ∩ A ≠ ∅. The element a is an ordinal number and is well ordered. Let a0 be the least
element of a ∩ A. Let c be an arbitrary element in A, then either a ≤ c or
c < a. If a ≤ c, then a0 ≤ c. If c < a, then c ∈ a ∩ A, and thus a0 ≤ c. Thus
a0 is the least element of A. If ξ ∈ α, then ξ ∈ c for some c ∈ C ⇒ ξ = S(ξ).
Thus α is an ordinal number. Now α is an upper bound for C, since if c ∈ C,
then c ⊂ α, that is, c ≤ α. Now suppose ζ is an upper bound for C. Then for
all c ∈ C we have c ⊂ ζ, and thus α ⊂ ζ ⇒ α ≤ ζ. Thus α is the supremum.
Corollary The collection of all ordinal numbers is a proper class.
Proof If the collection were a set, then a supremum would exist. Let α
be the supremum, but α ⊂ α + 1 and α + 1 is an ordinal number. Thus
α + 1 ∈ α ∈ α + 1, which contradicts Lemma 2.7.
As promised at the end of chapter II, we now state and prove a gener-
alization of the final two lemmas of that chapter. As you will note we need
the concept of ordinal numbers to state the theorem in its generality.
Definition An ordinal number greater than 0 that is not the successor of
any other ordinal number is said to be a limit ordinal.
Theorem 3.7 For any collection of sets C that can be indexed by an
ordinal α + 1 that is not a limit ordinal, we can never have, for C = {x_λ | λ ∈ α + 1}, x_0 ∈ x_1 ∈ · · · ∈ x_α ∈ x_0.
Proof If we had x_0 ∈ x_1 ∈ · · · ∈ x_α ∈ x_0, then we would always have for
every ν ≠ α, x_ν ∈ C ∩ x_{ν+1} and x_α ∈ C ∩ x_0, which contradicts the axiom
of regularity (ZF9).
Corollary For any two sets a and b, a ∩ (b × {a}) = ∅.
Proof Every element of b × {a} is of the form {{b′}, {b′, a}}, which cannot
be in a, by virtue of Theorem 3.7.
The Transfinite Recursion Theorem.
Let W be a well-ordered set and α ∈ W. An α-sequence in a set X is
a function φ : S(α) → X. Recall that S(α) is the initial section of α.
A sequence function of type W in X is a function
f : {φ : S(α) → X | α ∈ W} → X.
That is, f maps α-sequences into X.
Let Υ : W → X where W is a well ordered set and X is a set. We
observe that Υ|S(α) : S(α) → X is an α-sequence for all α ∈ W. (Υ|S(α) is the
restriction of Υ to S(α).)
Theorem 3.8 Transfinite Recursion Theorem If W is a well ordered set
and if f is a sequence function of type W in a set X, then there exists a
unique function Υ : W → X such that Υ(α) = f(Υ|S(α)) for each α ∈ W .
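For a finite well ordered set the recursion in Theorem 3.8 can be carried out directly. The following is a minimal sketch in Python, under the assumption that W is given as a list whose order of listing is its well ordering. The names recursive_function and one_plus_sum are hypothetical and do not appear in the text. The dictionary built so far plays the role of Υ|S(α).

```python
def recursive_function(W, f):
    """Build Υ : W → X with Υ(α) = f(Υ|S(α)).

    Assumption: W is a finite list, listed in increasing order.
    """
    upsilon = {}
    for alpha in W:
        # At this point upsilon has exactly the keys in S(alpha),
        # so a copy of it is the alpha-sequence Υ|S(α).
        restriction = dict(upsilon)
        upsilon[alpha] = f(restriction)
    return upsilon


def one_plus_sum(t):
    # A sample sequence function: one more than the sum of all earlier values.
    return 1 + sum(t.values())


values = recursive_function([0, 1, 2, 3, 4], one_plus_sum)
print(values)
```

With this sample sequence function each value is one more than the sum of its predecessors, so the values double at each step.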
Proof To prove uniqueness, let Υ and Ψ be two such functions, and suppose that
Υ(β) = Ψ(β) ∀β ∈ S(α), that is, Υ|S(α) = Ψ|S(α). Then we have
Υ(α) = f(Υ|S(α)) = f(Ψ|S(α)) = Ψ(α).
Thus by transfinite induction we have Υ(α) = Ψ(α) ∀α ∈ W.
To prove existence we explicitly construct Υ as a subset of W ×X.
We say a subset A of W × X is f-closed if, whenever α ∈ W and t is an α-sequence
in A, i.e. {(c, t(c)) | c ∈ S(α)} ⊂ A, then (α, f(t)) ∈ A. W × X is f-closed,
thus such subsets do exist.
Let Υ = ⋂ {A | A is f-closed}. Υ is f-closed, since any α-sequence t in Υ is in every f-closed A. Thus
(α, f(t)) ∈ A for every such A, and (α, f(t)) ∈ Υ.
We now show that Υ is a function. That is ∀γ ∈ W ∃ ! ξ ∈ X such that
(γ, ξ) ∈ Υ.
We proceed by transfinite induction on W .
Let S = {γ ∈ W | (γ, ξ) ∈ Υ & (γ, ζ) ∈ Υ ⇒ ξ = ζ}. Also let
S(α) ⊂ S for some α. Thus if γ < α, then ∃ ! ξ ∈ X such that (γ, ξ) ∈ Υ.
The function t : S(α) → X, thus defined, is an α-sequence and t ⊂ Υ.
Now assume α ∉ S. Then (α, y) ∈ Υ for some y ≠ f(t). Now consider the
set Υ − {(α, y)}. Let β ∈ W and r be a β-sequence in Υ − {(α, y)}. If β = α,
then r = t, since Υ is single valued on S(α). Also (β, f(r)) = (α, f(t)) ∈ Υ − {(α, y)},
since (α, f(t)) ≠ (α, y) and (α, f(t)) ∈ Υ. If β ≠ α, we have (β, f(r)) ∈ Υ − {(α, y)},
since Υ is f-closed and (β, f(r)) ≠ (α, y). Thus, if
β ∈ W and r is a β-sequence in Υ − {(α, y)}, then (β, f(r)) ∈ Υ − {(α, y)}.
That is to say, Υ − {(α, y)} is f-closed. This contradicts the fact that Υ is
the smallest f-closed set. We must conclude that α ∈ S. The hypothesis
for transfinite induction has been verified. Thus the existence of Υ has been
demonstrated.
IV CARDINAL NUMBERS
Cardinality
Definition Let a and b be sets. A subset, m, of a × b that satisfies the
following conditions is called a bijection:
I. ∀x ∈ a ∃y ∈ b such that (x, y) ∈ m.
II. If (x, y) and (x, z) are in m, then y = z.
III. ∀y ∈ b ∃x ∈ a such that (x, y) ∈ m.
IV. If (x, y) and (z, y) are in m, then x = z.
Conditions I and II are the conditions necessary for the subset m to be a
function as defined in Chapter II. We often refer to functions as maps. Thus
we say m is a map or function, and we write m : a → b.
Condition III is called the onto condition. Condition IV is called the
one to one condition. Hence we often say a bijection is a map that is one
to one and onto.
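For finite sets the four conditions can be checked mechanically. The sketch below is an illustration only; the function name is_bijection is hypothetical, and m is assumed to be a Python set of ordered pairs.

```python
def is_bijection(m, a, b):
    """Check conditions I-IV for a set m of ordered pairs drawn from a × b."""
    if not all(x in a and y in b for (x, y) in m):
        return False  # m is not even a subset of a × b
    firsts = [x for (x, _) in m]
    seconds = [y for (_, y) in m]
    total = set(firsts) == set(a)                     # condition I
    single_valued = len(firsts) == len(set(firsts))   # condition II
    onto = set(seconds) == set(b)                     # condition III
    one_to_one = len(seconds) == len(set(seconds))    # condition IV
    return total and single_valued and onto and one_to_one


a = {1, 2, 3}
b = {"p", "q", "r"}
print(is_bijection({(1, "p"), (2, "q"), (3, "r")}, a, b))
print(is_bijection({(1, "p"), (2, "p"), (3, "r")}, a, b))
```

The second call fails conditions III and IV, since q is never reached and p is reached twice.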
Definition Let a, b, c be sets such that there are functions f : a → b and
g : b → c. The composition of g with f, written g ∘ f, is that function that
satisfies
(g ∘ f)(x) = z ⇐⇒ g(y) = z where f(x) = y.
Definition Let a and b be sets. If there exists a subset of a × b that is a
bijection then we say a and b are cardinally equivalent, or a and b have
the same cardinality.
Definition A relation on a set a that satisfies the following conditions:
R: xRx ∀x ∈ a
S: xRy ⇒ yRx.
T: xRy & yRz ⇒ xRz
is called an equivalence relation.
Condition R is called reflexive, S is called symmetric, and T is transitive.
We note here that the equivalence relation differs from the order relation
by replacing antisymmetry with symmetry.
Definition A Partition of a set a, which we shall indicate by p(a), is a
subset of the power set of a, P(a), such that ⋃ p(a) = a and, for b, c ∈ p(a),
b ≠ c ⇒ b ∩ c = ∅.
We leave it as an exercise to the reader to verify that an equivalence
relation on a set induces a partition of the set.
We say an equivalence relation partitions a set. That is the set is parti-
tioned into disjoint subsets where each element of each subset is equivalent
to each other but not to any element of any other subset.
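The way an equivalence relation partitions a finite set can be illustrated computationally. This is a sketch under the assumption that the relation passed in really is reflexive, symmetric and transitive; the name partition_by and the sample relation (congruence modulo 3) are our own choices.

```python
def partition_by(a, related):
    """Group the elements of a finite collection a into equivalence classes.

    Assumption: related(x, y) is an equivalence relation on a.
    """
    classes = []
    for x in a:
        for cls in classes:
            representative = next(iter(cls))
            if related(x, representative):
                cls.add(x)  # x joins the class of an equivalent element
                break
        else:
            classes.append({x})  # x starts a new class
    return classes


classes = partition_by(range(10), lambda x, y: (x - y) % 3 == 0)
print(sorted(sorted(c) for c in classes))
```

Comparing against a single representative suffices only because the relation is assumed transitive and symmetric.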
Theorem 4.1 Cardinal equivalence is an equivalence relation.
Proof Let c be a set.
For reflexivity we note that ∀a ∈ c the identity map I : a → a defined by
I(x) = x is a bijection.
For symmetry we note that bijections are one-to-one and onto, thus the
reversed relation is also a bijection.
For Transitivity we note that the composition of bijections is also a bi-
jection.
Definition A Cardinal Number is the least ordinal of a given cardinality.
We say a set is finite if it is cardinally equivalent to some element n of
ω (each such n is a proper initial section of ω), otherwise we say it is infinite. For finite sets there is a unique ordinal
number to which that set is cardinally equivalent, thus for finite sets, ordinal
and cardinal numbers are identically the same. This is not true for infinite
sets.
For ω + 1 = {0, 1, · · · , ω} we can form the bijection
b : ω + 1 ↔ ω
defined by
b(x) = x + 1 for x ≠ ω and b(ω) = 0.
For ω + n = {0, 1, · · · , ω, ω + 1, · · · , ω + (n − 1)} we can form the bijection
b : ω + n ↔ ω
defined by
b(x) = x + n for x < ω and b(ω + k) = k for 0 ≤ k ≤ n − 1.
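The second map can be spot-checked on an initial portion of ω + n. In this sketch the encoding of elements is an assumption made for illustration: ("fin", k) stands for the natural number k and ("omega", k) stands for ω + k. The name shift_bijection is hypothetical.

```python
def shift_bijection(x, n):
    """Image of x under b : ω + n → ω.

    Assumed encoding: ("fin", k) is the natural number k,
    ("omega", k) is ω + k with 0 <= k < n.
    """
    kind, k = x
    if kind == "omega":
        if not 0 <= k < n:
            raise ValueError("ω + k is not an element of ω + n")
        return k          # b(ω + k) = k
    return k + n          # b(x) = x + n for x < ω


n = 3
sample = [("omega", k) for k in range(n)] + [("fin", k) for k in range(20)]
images = [shift_bijection(x, n) for x in sample]
print(sorted(images))
```

On this sample the images are exactly 0 through 22 with no repeats, as a bijection requires.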
Definition Any set that is cardinally equivalent to a subset of ω is said to
be countable, otherwise we say it is uncountable.
Every finite set is cardinally equivalent to a subset of ω, thus all finite
sets are countable. If a set is cardinally equivalent to ω then we say the set
is countably infinite.
Notation: The cardinality of a set, a, is indicated by C(a). The cardinality
associated with a countably infinite set is denoted by the cardinal number ℵ0
(aleph naught, ℵ is the first letter of the Hebrew Alphabet). Thus C(ω) = ℵ0.
Cantor’s Theorem
Theorem 4.2 Cantor’s Theorem For any set a, C(P(a)) ≠ C(a).
Proof We prove Cantor’s theorem by contradiction.
Assume there exists a bijection b : a ↔ P(a), and let y ∈ P(a) be defined
as
y = {x ∈ a | x ∉ b(x)}.
The set y exists by the axiom of specification, ZF2. Let c = b^{-1}(y). We thus
must have the absurd implications
c ∈ y ⇒ c ∉ b(c) = y ⇒ c ∈ b(c) = y.
We conclude no such bijection can possibly exist.
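The diagonal set y can be computed explicitly for a small finite set. The sketch below, with hypothetical names diagonal_set and power_set, runs through every map b from a three element set into its power set and confirms that y is never a value of b. It illustrates the argument and is not a substitute for the proof.

```python
from itertools import combinations, product


def power_set(a):
    """All subsets of the finite set a, as frozensets."""
    items = list(a)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def diagonal_set(a, b):
    """y = {x in a | x not in b(x)} for a map b given as a dict."""
    return frozenset(x for x in a if x not in b[x])


a = {0, 1, 2}
subsets = power_set(a)
elements = sorted(a)

# Try every map b : a -> P(a); there are 8**3 = 512 of them.
missed_every_time = all(
    diagonal_set(a, dict(zip(elements, choice))) not in choice
    for choice in product(subsets, repeat=len(elements))
)
print(missed_every_time)
```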
Since there is a natural bijection from a set a to the subset of P(a) that
consists of all the singletons it is natural to believe that we in fact have
the inequality C(a) < C(P(a)). Since we have defined cardinal numbers in
terms of ordinal numbers we wish to delay making this statement until we
have demonstrated that every set is cardinally equivalent to some ordinal
number.
Corollary There exists an uncountable set.
Proof P(ω) is not cardinally equivalent to any subset of ω and thus must
be uncountable.
The Schroder-Bernstein Theorem
If two sets, A and B, are cardinally equivalent, i.e. a bijection exists
between them, then we write
A ↔ B.
Now suppose that, for two sets A and B, the set B has a subset C such that
A is cardinally equivalent to C, i.e. A ↔ C ⊂ B. We then write
A → B.
Equivalently we may write
B ← A.
Theorem 4.3 Schroder-Bernstein For any two sets X and Y if X → Y
and X ← Y then X ↔ Y .
Proof By assumption we have a one to one map from X into Y. Call this
map f, i.e. f : X → Y; it is a bijection onto its image, so we may write f : X ↔ f(X) ⊂ Y. We also
have by assumption a one to one map g from Y into X, i.e. g : Y ↔ g(Y) ⊂ X.
Our goal is to construct a bijection from X to Y. We will proceed as follows.
We will partition both X and Y into three disjoint subsets and produce
bijections between the subsets of X and Y.
First consider the elements of X that are not in the image of g, i.e., the
set X − g(Y). We enlarge this set by including all of its descendants that
are in X under the maps (g ∘ f)^n and call this set X_X. The set X_X can be
specified by
X_X = {z ∈ X | z = (g ∘ f)^n(x) for some x ∈ X − g(Y) and n ∈ ω}.
When n = 0, z would be in X − g(Y). We now consider the elements of Y
that are descendants of X − g(Y) under the maps f ∘ (g ∘ f)^n. We call this
set Y_X. Y_X can be specified by
Y_X = {w ∈ Y | w = f ∘ (g ∘ f)^n(x) for some x ∈ X − g(Y) and n ∈ ω}.
We note that Y_X = f(X_X).
We similarly construct the subsets Y_Y and X_Y, which can be specified as
Y_Y = {t ∈ Y | t = (f ∘ g)^n(y) for some y ∈ Y − f(X) and n ∈ ω}
and
X_Y = {u ∈ X | u = g ∘ (f ∘ g)^n(y) for some y ∈ Y − f(X) and n ∈ ω}.
Again we note X_Y = g(Y_Y).
We now define the subset X_∞ as those elements of X that are neither in
X_X nor X_Y, i.e.
X_∞ = {s ∈ X | s ∉ X_X ∪ X_Y}.
Similarly we define Y_∞ by
Y_∞ = {r ∈ Y | r ∉ Y_Y ∪ Y_X}.
We point out here the rationale for the symbols for these sets. X_X is the
collection of elements in X whose most distant or primitive ancestor (under
the maps g ∘ f) is in X. X_Y is the collection of elements in X whose most
distant ancestor is in Y. X_∞ is the collection of elements of X that have
no most distant ancestor, i.e. their lineage can be traced back infinitely far.
Similarly for Y_Y, Y_X, and Y_∞. We also note here that the sets X_X, X_Y and
X_∞ are mutually disjoint, as are the sets Y_Y, Y_X and Y_∞, and
X = X_X ∪ X_Y ∪ X_∞ and Y = Y_Y ∪ Y_X ∪ Y_∞.
To complete the proof we demonstrate that f restricted to X_X is a bijection
onto Y_X, g restricted to Y_Y is a bijection onto X_Y, and f restricted to
X_∞ is a bijection onto Y_∞. We note here that g restricted to Y_∞ would also
be a bijection onto X_∞. For brevity we will abbreviate a function f restricted
to a subset A of its natural domain by f|A.
We must first show that f(X_X) is in Y_X. Let z ∈ X_X; then z = (g ∘ f)^n(x)
for some x ∈ X − g(Y) and n ∈ ω. Thus f(z) = f ∘ (g ∘ f)^n(x) for some
x ∈ X − g(Y) and n ∈ ω. Hence f(z) ∈ Y_X. Clearly f|X_X is one to one since
f is one to one. If y ∈ Y_X then y = f ∘ (g ∘ f)^n(x) for some x ∈ X − g(Y)
and n ∈ ω. Thus y is the image under f of an element of the form (g ∘ f)^n(x)
for some x ∈ X − g(Y), which is in X_X; hence f|X_X is onto, and thus f|X_X is
a bijection.
Similarly g|Y_Y is a bijection from Y_Y to X_Y, and thus g^{-1}|X_Y is a bijection
from X_Y to Y_Y.
To demonstrate that f(X_∞) is in Y_∞ we note that for any x ∈ X_∞, if f(x)
were not in Y_∞ it would be in either Y_Y or Y_X. Thus f(x) would be of the
form (f ∘ g)^n(y) for some y ∈ Y − f(X) and n ∈ ω, or f ∘ (g ∘ f)^n(z) for some
z ∈ X − g(Y) and n ∈ ω. In the first case n ≥ 1, since f(x) ∈ f(X), and x = f^{-1} ∘ (f ∘ g)^n(y) = g ∘ (f ∘ g)^{n−1}(y)
for some y ∈ Y − f(X); in the second case x = f^{-1} ∘ f ∘ (g ∘ f)^n(z) = (g ∘ f)^n(z) for some
z ∈ X − g(Y) and n ∈ ω. Thus x would be in either X_Y or X_X, which
contradicts our assumption; hence f(X_∞) is in Y_∞. Again f|X_∞ is one to one
since f is one to one. Now let y ∈ Y_∞; then there exists an x ∈ X such that
f(x) = y, for if not then y ∈ Y − f(X) ⊂ Y_Y. If that x were not in X_∞ then it
would either be in X_X or X_Y, which would mean that y would be in either
Y_X or Y_Y and thus not in Y_∞, which contradicts our assumption. Hence we
conclude that f|X_∞ is onto and is thus a bijection.
We can now formally define our bijection, b : X ↔ Y, as follows:
b(x) = f(x) if x ∈ X_X,
b(x) = f(x) if x ∈ X_∞,
b(x) = g^{-1}(x) if x ∈ X_Y.
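The construction of b can be sketched as a procedure that traces the ancestry of x until it stops. Everything concrete below is an assumption made for illustration: the example X = Y = ω with f(x) = x + 1 and g(y) = y + 1, the convention that an inverse returns None outside the image, and the cutoff max_steps, after which x is treated as lying in X_∞. The names are hypothetical.

```python
def sb_bijection(x, f, f_inv, g_inv, max_steps=1000):
    """Value b(x) of the bijection built in the proof.

    f_inv(y) returns the x with f(x) = y, or None if y is not in f(X);
    g_inv(x) returns the y with g(y) = x, or None if x is not in g(Y).
    """
    current, in_X = x, True
    for _ in range(max_steps):
        parent = g_inv(current) if in_X else f_inv(current)
        if parent is None:
            # The lineage stops here: in X - g(Y) if in_X, else in Y - f(X).
            return f(x) if in_X else g_inv(x)
        current, in_X = parent, not in_X
    # Assumption: a lineage this long is treated as infinite (x in X_∞).
    return f(x)


def f(x):
    return x + 1


def shift_inverse(z):
    # Inverse of z -> z + 1 on ω; 0 has no preimage.
    return z - 1 if z >= 1 else None


values = [sb_bijection(x, f, shift_inverse, shift_inverse) for x in range(6)]
print(values)
```

In this example the even numbers form X_X and the odd numbers form X_Y, so b exchanges 0 with 1, 2 with 3, and so on.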
Exercise In chapter 8 we develop the real numbers; however, assuming an a
priori knowledge of the real numbers, consider the closed interval [0, 1] ≡ X
and the half-open interval [0, 1) ≡ Y. Define injective maps f : X → Y by f(x) = (1/2)x, and g : Y → X by g(x) = x. Determine the sets
X_X, Y_X, Y_Y, X_Y, X_∞, Y_∞, and the bijection b : X ↔ Y that the Schroder-Bernstein
theorem guarantees to exist.
Some Countable Sets
Theorem 4.4 The finite cartesian product of countable sets is countable.
Proof If the cartesian product of a collection of sets is countable, the
product can be reindexed to be a single countable set; thus all that needs
to be shown is that the cartesian product of two countable sets is countable.
This argument is made by the classical diagonalization process.
Let A and B be two countable sets; then we can represent the cartesian
product by
A × B = {(a_i, b_j) | a_i ∈ A, b_j ∈ B, i, j ∈ ω}.
We define a bijection b : ω → A × B by the following recursion:
b : 1 → (a_1, b_1)
b : n → (a_{i−1}, b_{j+1}), where b : n − 1 → (a_i, b_j), where n > 1, i ≠ 1
b : n → (a_{j+1}, b_1), where b : n − 1 → (a_1, b_j), where n > 1.
To verify that the map defined by this recursion is a bijection consider the
following diagram. The arrows indicate the “order” in which the elements
are to be “counted”.
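(a_1, b_1) (a_1, b_2) (a_1, b_3) (a_1, b_4) · · ·
(a_2, b_1) (a_2, b_2) (a_2, b_3) (a_2, b_4) · · ·
(a_3, b_1) (a_3, b_2) (a_3, b_3) (a_3, b_4) · · ·
(a_4, b_1) (a_4, b_2) (a_4, b_3) (a_4, b_4) · · ·
⋮
(Arrows: each anti-diagonal is traversed from (a_k, b_1) to (a_1, b_k), after which the count moves to (a_{k+1}, b_1).)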
If b(n) = b(m), then the set of predecessors of b(n) is equal to the set of
predecessors of b(m). Thus the section of n is equal to the section of m, and
thus n = m. So b is 1 to 1.
If (a, b) ∈ A × B, then (a, b) = (a_k, b_l) for some k and l in ω. Then by
“counting” backwards through the recursion we may determine an n where
b(n) = (a, b).
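The recursion can be run directly on index pairs. In this sketch a pair (i, j) stands for (a_i, b_j); the function name diagonal_enumeration is hypothetical.

```python
def diagonal_enumeration(count):
    """First `count` index pairs (i, j) produced by the recursion, starting at (1, 1)."""
    pairs = [(1, 1)]
    while len(pairs) < count:
        i, j = pairs[-1]
        if i != 1:
            pairs.append((i - 1, j + 1))   # move along the anti-diagonal
        else:
            pairs.append((j + 1, 1))       # start the next anti-diagonal
    return pairs


first_ten = diagonal_enumeration(10)
print(first_ten)
```

The first ten values are pairwise distinct and are exactly the pairs with i + j ≤ 5, which is the pattern the diagram depicts.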
Corollary ∏_{i∈n} ω is countable for n ∈ ω.
Theorem 4.5 The countable union of countable sets is countable.
Proof Without loss of generality we may assume the sets are disjoint. This
is in fact the most extreme case. We may now let aij represent the ith element
of the jth set. The argument is now identical to the proof of theorem 4.4.
Corollary ω^ω is countable.
Proof ω^ω = ⋃ b, where b = {0, ω, ω^2, · · · }.
From the above arguments it is easily seen that all the ordinal numbers
that have explicitly been constructed by the method outlined in chapter III
must be countable. Not all ordinals are countable as we shall see in the next
chapter.
Some Uncountable Sets
Theorem 4.6 The countably infinite product of countable sets each of
which has cardinality of at least 2 is not countable.
Proof Assume the conclusion is false. That is, assume there exists a bijection
b from ω to ∏_{i∈ω} A_i, where each A_i is countable.
Let p_n be the projection map onto the nth co-ordinate space. For all
n ∈ ω pick a_n ∈ A_n − {(p_n ∘ b)(n)}; this is possible since A_n has at least two elements. Then the sequence (a_0, a_1, · · · ) ∈ ∏_{i∈ω} A_i.
We have for all n ∈ ω p_n((a_0, a_1, · · · )) = a_n, but (p_n ∘ b)(n) ≠ a_n. Thus
b(n) ≠ (a_0, a_1, · · · ) ∀n ∈ ω, which is a contradiction. Thus no bijection
exists.
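The choice of the a_n can be illustrated for one particular attempted enumeration. The following sketch assumes A_i = {0, 1} for every i and takes, as a sample b, the map sending n to its sequence of binary digits. Both choices are ours and are not part of the theorem. The names enumeration and diagonal are hypothetical.

```python
def enumeration(n):
    """A sample map b: b(n) is the 0/1 sequence i -> i-th binary digit of n."""
    return lambda i: (n >> i) & 1


def diagonal(i):
    # a_i is chosen in A_i - {(p_i ∘ b)(i)}; with A_i = {0, 1} the choice is forced.
    return 1 - enumeration(i)(i)


differs_everywhere = all(diagonal(n) != enumeration(n)(n) for n in range(100))
print(differs_everywhere)
```

For this sample b the diagonal sequence is constantly 1, because n < 2^n makes the n-th binary digit of n equal to 0, and the constant sequence 1 is not the digit sequence of any natural number.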
Exercises
The following set of exercises leads to an alternate proof of the Schroder-
Bernstein Theorem.
Let X be a partially ordered set and A a subset of X. An element x ∈ X
is said to be an upper bound for A if x ≥ a ∀a ∈ A. Similarly, an
element y ∈ X is said to be a lower bound for A if y ≤ a ∀a ∈ A. An upper
bound u of A is the Supremum or least upper bound of A if u ≤ x for
all upper bounds x of A. A lower bound l of A is the infimum or greatest
lower bound of A if l ≥ y for all lower bounds y of A.
We say that a partially ordered set X is complete if there exists a supremum
and infimum for every subset of X.
Let X and Y be partially ordered sets. A function f : X → Y is order
preserving if x ≤ y ⇒ f(x) ≤ f(y).
1. If L is a complete partially ordered set and f : L → L is an order
preserving function, show that there exists a ∈ L such that f(a) = a.
2. Let A be any set. Show that P(A) is a complete partially ordered set,
where X ≤ Y if X ⊆ Y (we say that P(A) is ordered by inclusion).
For simplicity of notation define X −A = A′, that is A′ is the comple-
ment of A in X.
3. Let A and B be sets and f and g be functions such that f : A → B and
g : B → A. Let h : P(A) → P(A) be defined by h(S) = [g(f(S)′)]′.
Show that T ⊂ S ⇒ h(T ) ⊂ h(S).
4. Observe that h is an order preserving map, thus there exists S ⊂ A
such that h(S) = S, that is, [g(f(S)′)]′ = S, and g(f(S)′) = S′. Now
assume f and g are one to one and demonstrate the bijection from A
to B.
V ZORN’S LEMMA AND WELLORDERING
We have shown that there are sets with cardinality greater than ℵ0 but we
have yet to demonstrate that there exist ordinal numbers with cardinality
greater than ℵ0. We will accomplish this task by showing that every set can
be well ordered, and that every well ordered set is order isomorphic to an
ordinal number. By order isomorphic we shall mean the following.
Definition Two partially ordered sets a and b are said to be order isomorphic
if there exists a bijection between them that preserves order. That
is, if β : a → b is the order preserving bijection then x ≤ y if and only if
β(x) ≤ β(y). We write a ≃ b.
Zorn’s Lemma
Theorem 5.1 Zorn’s Lemma: If X is a partially ordered set such that every
chain in X has an upper bound in X, then X contains a maximal element.
By chain we mean a totally or linearly ordered subset. In the hypothesis
of Zorn’s Lemma the upper bound need not be in the chain. In the conclusion
the maximal element is simply an element with no superior, that is, if x ≤ y ⇒ x = y then x is maximal. There may in fact be elements in X that are
not comparable to x, yet x may still be maximal.
Proof Let X be a partially ordered set. For each x ∈ X, S(x) = {y ∈ X | y ≤ x} is the weak section of x. S is a function from X to P(X) since the
section of any element is unique. The range R of S is a collection of subsets
that are partially ordered by inclusion, i.e. A ≤ B if A ⊆ B. S is one to one,
since S(x) ⊆ S(y) if and only if x ≤ y. Thus if we find a maximal element
S(z) in R then z is maximal in X. Also C is a chain in X if and only if S(C)
is a chain in R.
Let 𝒳 be the set of all chains in X. Let Γ ∈ 𝒳. Since Γ is a chain it
has an upper bound x by hypothesis, thus Γ ⊆ S(x) for some x ∈ X. 𝒳 is
a non-empty collection of sets partially ordered by inclusion. Now if C is a
chain in 𝒳 then G = ⋃_{Γ∈C} Γ ∈ 𝒳. G is an upper bound of C in 𝒳, as each
element of C is dominated by G.
Now let 𝒳 be an arbitrary non-empty collection of subsets of a non-empty
set X subject to
1. if A ∈ 𝒳 and B ⊆ A then B ∈ 𝒳, and
2. if C is a chain in 𝒳 then ⋃_{Γ∈C} Γ ∈ 𝒳.
Notice that these are the exact conditions that our collection 𝒳 had in the previous
discussion. Also notice that the first condition implies that ∅ ∈ 𝒳. Our task
is to show that 𝒳 has a maximal element.
Now let φ be a choice function from the non-empty subsets of X to X, i.e.
φ : (P(X) − {∅}) → X. We note that φ is a function such that φ(A) ∈ A for all
non-empty subsets A of X. For each set A ∈ 𝒳 let Â = {x ∈ X | A ∪ {x} ∈ 𝒳}. Define a function γ : 𝒳 → 𝒳 by the following:
γ(A) = A ∪ {φ(Â − A)} if Â − A ≠ ∅,
γ(A) = A if Â − A = ∅.
We observe that Â − A = ∅ if and only if A is maximal. Our task is now
to show there exists a set A in 𝒳 such that γ(A) = A. Since φ(Â − A) is a
single element we notice that γ(A) contains at most one more element than
A.
We now define a tower as a subcollection T of 𝒳 that satisfies the following
conditions:
1. ∅ ∈ T,
2. if A ∈ T, then γ(A) ∈ T, and
3. if C is a chain in T, then ⋃_{A∈C} A ∈ T.
We notice here that 𝒳 satisfies the conditions for a tower, and thus towers
exist. We can also easily verify that the intersection of a collection of towers is
a tower. Let Tλ be a collection of towers; since ∅ ∈ Tλ for all λ, ∅ ∈ ⋂_λ Tλ.
If A ∈ ⋂_λ Tλ, then A ∈ Tλ for all λ, thus γ(A) ∈ Tλ for all λ and thus
γ(A) ∈ ⋂_λ Tλ. And finally if C is a chain in ⋂_λ Tλ, then C is a chain
in Tλ for all λ, thus ⋃_{A∈C} A ∈ Tλ for all λ, and thus ⋃_{A∈C} A ∈ ⋂_λ Tλ. It follows
that the intersection of all towers T0 is the smallest tower. We now wish to
show that T0 is a chain.
We say that a set B in T0 is comparable if it is comparable with every
set in T0, that is, for all A ∈ T0 either A ⊆ B or B ⊆ A. To show that T0 is
a chain we show that every set in T0 is comparable.
We now let B be an arbitrary comparable set in T0. Comparable sets do
exist since ∅ is clearly comparable. Suppose A ∈ T0 and A is a proper subset
of B. Since B is comparable we have γ(A) ⊆ B or B is a proper subset of
γ(A). If B ⊂ γ(A) we have A as a proper subset of B and B a proper subset
of γ(A), but γ(A)−A is a singleton thus there cannot be a set between them.
We conclude γ(A) ⊆ B.
Now consider the collection U of all sets A in T0 where either A ⊆ B or
γ(B) ⊆ A. The collection U is no larger than the collection of sets in T0
that are comparable with γ(B) since, if A ∈ U and since B ⊆ γ(B) we have
either A ⊆ γ(B) or γ(B) ⊆ A.
We now claim that U is a tower. We verify the three conditions:
1) ∅ ∈ U .
2) To show A ∈ U ⇒ γ(A) ∈ U Consider three cases
i. A ⊂ B.
ii. A = B.
iii. γ(B) ⊆ A.
For i. γ(A) ⊆ B by the preceding argument thus γ(A) ∈ U .
For ii. γ(A) = γ(B) thus γ(B) ⊆ γ(A), therefore γ(A) ∈ U .
For iii. γ(B) ⊆ A ⇒ γ(B) ⊆ γ(A), therefore γ(A) ∈ U .
3) Let C be a chain in U. If γ(B) ⊆ D for some D ∈ C, then γ(B) ⊆ ⋃_{D∈C} D,
hence ⋃_{D∈C} D ∈ U. If however D ⊆ B for all D ∈ C, then ⋃_{D∈C} D ⊆ B. Thus
we may conclude ⋃_{D∈C} D ∈ U.
We thus conclude that U is a tower and is a subset of T0 which is the
smallest tower hence we have U = T0.
Now let B be a comparable set, we form U as above and since U = T0 for
any A ∈ T0 we have either A ⊆ B ⊆ γ(B) or γ(B) ⊆ A. We thus conclude
that if B is a comparable set, then γ(B) is comparable also.
We have that ∅ is comparable and γ maps comparable sets to comparable
sets. Now since the union of a chain of comparable sets is comparable we
may conclude that the comparable sets constitute a tower, and hence they
exhaust all of T0.
T0 is a chain, thus, if A = ⋃_{B∈T0} B, then A ∈ T0 and we have γ(A) ⊆ A, since γ(A) ∈ T0 and the union
A includes all the sets in T0. We always have A ⊆ γ(A), thus we conclude
that A = γ(A). This is the condition that we noted earlier needed to be
shown to complete the proof.
Definition A well ordered set A is a continuation of a well ordered set
B if
i) B ⊂ A
ii) B = S(a) for some a ∈ A
iii) For a, b ∈ B, a ≤B b iff a ≤A b.
The reason for the third condition is that a set may have more than one
ordering, and for continuation we want B to have the same ordering as A
when restricted to B.
The Well Ordering and Counting Theorems
Theorem 5.2 The Well Ordering Theorem Every set can be well ordered.
An important point to be noted here is that the set may be presented
with an ordering that is not a well ordering. The Well Ordering Theorem
says we may disregard any previously assigned ordering that the set may
have and endow it with a new ordering that is a well ordering.
Before we prove this theorem we make this note. We shall regard an
ordered set X as the pair (X,<) where < is an order relation.
Proof Let X be a set. Let W be the collection of all well ordered subsets
of X under every possible ordering, i.e.
W = {(A, <) | A ∈ P(X), < ∈ P(A × A), and < is a well ordering of A}.
We partially order W by continuation, i.e. (A, <) <W (B, <) if B is a
continuation of A. W is not empty since (∅, ∅) is an element of W, as is ({x}, <) with < = {(x, x)} for each x ∈ X.
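Now let C be a chain in W,
C = {(Aλ, <λ) | (Aλi, <λi) <W (Aλk, <λk) for λi < λk}.
Since the Aλ are nested sets, ⋃ Aλ (the union taken over the chain C) is an upper bound for C, and is in
W, since any non-empty subset of ⋃ Aλ must meet Aλi for some λi; its intersection with Aλi
has a least element, and this element is least in the whole subset because every later set in the chain is a continuation of Aλi.
The union is therefore well ordered.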
Hence the condition for the hypothesis of Zorn’s lemma has been satisfied
and we may conclude that there exists a maximal element M in W. We claim
M = X. If not then there exists x ∈ X such that x ∉ M. Thus we may
construct (M′, <) = (M ∪ {x}, <) where y < x for all y ∈ M. M′ is clearly well
ordered and a continuation of M, thus M′ > M, which is a contradiction
since M is maximal. We conclude M = X and thus X is well ordered.
Theorem 5.3 Counting Theorem Every well ordered set is order isomor-
phic to a unique ordinal number.
Proof Uniqueness is virtually trivial: since order isomorphism is clearly transitive,
if a well ordered set were order isomorphic to two different ordinal
numbers, those ordinal numbers would be order isomorphic, a contradiction.
Now let X be a well ordered set, let S = {x ∈ X | S(x) ≃ α for some
ordinal number α}, and let a ∈ X be an element such that for each predecessor
b of a, S(b) is order isomorphic to some ordinal number (clearly the ordinal
number is unique and such an element does exist, since the initial segment of
the least element of X is the empty set, which is order isomorphic to 0; thus
the least element of the set X − {l | l is the least element of X} satisfies the
condition for a). Now let P(x, α) be the proposition “α is an ordinal number
and S(x) ≃ α”. Since α is unique we have P(x, α) and P(x, β) ⇒ α = β. We
may now apply the Axiom Schema of Replacement and verify the existence
of the set T = {α | x ∈ S(a) & P(x, α)}. T is the set of ordinal numbers
order isomorphic to the initial segments determined by the predecessors of a.
T is clearly an ordinal number and is order isomorphic to S(a). We have thus
satisfied the hypothesis for transfinite induction and we may conclude that
for all x ∈ X, S(x) ≃ α for some ordinal α. We may annex another element z
to X and make it maximal, that is, z > x for all x ∈ X. Then in the new set
X ∪ {z}, S(z) = X, and by the previous argument X = S(z) ≃ α for some
ordinal number α.
The Equivalences to the Axiom of Choice
Since every well ordered set is order isomorphic to a unique ordinal num-
ber, we present the following definition.
Definition The unique ordinal number to which a well ordered set is order
isomorphic is its order type, which we shall abbreviate by OT .
We can in fact regard ordinal numbers to be the order types of well
ordered sets. However when developing the properties of well ordered sets it
is cognitively much easier to work with the narrowly defined ordinal numbers.
Lemma If A ⊂ B are well ordered sets, then OT (A) ≤ OT (B).
Proof Let a and b be ordinal numbers and α : a → A and β : b → B
be order preserving bijections. Either a ⊆ b or b ⊆ a. If a ⊆ b, then
OT(A) = a ≤ b = OT(B). If b ⊆ a, then let φ = β^{-1} ∘ α, an order preserving
bijection from a to b. Let c ⊆ a such that φ(x) = x ∀x ∈ c. Now let y ∈ a
such that S(y) ⊂ c. Assume y ∉ c, then φ(y) = z > y, and ∃t ∈ a such
that φ(t) = y and t > y. But φ(t) = y < z = φ(y), which contradicts order
preserving. Thus we must have y ∈ c, and by transfinite induction we have
c = a. Thus φ : a → b is the identity map, and since b ⊂ a we have b = a.
Corollary If a and b are ordinal numbers that are order isomorphic and
a ⊆ b, then a = b.
If we take the Well ordering theorem as an axiom we may prove the Axiom
of Choice as a theorem.
Theorem 5.4 For any non-empty collection of non-empty sets there exists
a choice function.
Proof Let C be a non-empty collection of non-empty sets. Well order each
member of C and let the choice function choose the least element of each set.
We may summarize our results by the following:
Theorem 5.5 The following are equivalent:
1. The Axiom of Choice.
2. Zorn’s Lemma.
3. The Well Ordering Theorem.
VI ARITHMETIC
In this chapter we will develop the concept of arithmetic for ordinal and
cardinal numbers. It is conceptually easier to define an arithmetic for car-
dinal numbers, so we will do that first and extend those concepts to ordinal
numbers.
Cardinal Arithmetic
Definition A Binary Operation on a set a is a function from a× a to a.
Definition An Arithmetic on a set a is a collection of binary operations
on the set a. The collection is usually finite.
We extend these two definitions to include arbitrary collections that are
not necessarily sets.
Definition A Binary Operation on any collection is a rule that assigns
to every ordered pair of elements of the collection a unique element of the
collection.
The definition for an arithmetic on an arbitrary collection is identical
except the term set is replaced with collection.
We will represent every binary operation by a symbol, and indicate the
element to which any ordered pair of elements is associated by the two elements
juxtaposed with the operation symbol inserted between them. For
example, if ∗ represents the binary operation and a and b are elements of
the collection then a ∗ b will represent the element to which the ordered pair
(a, b) is associated.
We now define an arithmetic for the collection of Cardinal Numbers. The
first operation we will define we will call addition, which will be represented
by the symbol +, and the associated element to the ordered pair will be
known as the sum. The motivation for the definition is naive and intuitive.
We would like to say that the sum of the cardinality of two sets is the car-
dinality of the union of the two sets, but this is a little too naive. The two
sets may have non-empty intersection, and we wish to avoid this situation.
Let a and b be two arbitrary sets and {0, 1} be a two point space. The
set a × {0} is cardinally equivalent to a by the obvious bijection, as is b × {1} with b. a × {0} and b × {1} are disjoint sets, as the second member of each
element of one set is different from the second member of each element of
the other set.
Definition The Disjoint Union of two sets a and b, which will be denoted
a ⊎ b, is (a × {0}) ∪ (b × {1}).
We may now properly define cardinal addition.
Definition Let A and B be cardinal numbers. There are two sets a and b
such that C(a) = A and C(b) = B, and we define A + B to be C(a ⊎ b).
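For finite sets the difference between the union and the disjoint union is easy to see. This is a small sketch in which the name disjoint_union is hypothetical; the two sample sets share the element 3.

```python
def disjoint_union(a, b):
    """(a × {0}) ∪ (b × {1}), with ordered pairs as Python tuples."""
    return {(x, 0) for x in a} | {(y, 1) for y in b}


a = {1, 2, 3}
b = {3, 4}
print(len(a | b))                 # the plain union loses the shared element
print(len(disjoint_union(a, b)))  # the disjoint union counts it twice
```

The plain union has 4 elements, while the disjoint union has 3 + 2 = 5, which is the count that cardinal addition is meant to capture.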
The second binary operation we define on cardinal numbers we will call
multiplication. We will nominally use the symbol · for multiplication, and
the associated element to the ordered pair will be known as the product.
When indicating the product it is unambiguous to omit the symbol for mul-
tiplication and simply indicate the product by the juxtaposition of the two
elements of the ordered pair, e.g., AB will represent A ·B.
The motivation for the definition for cardinal multiplication is nearly as
intuitive as that for addition. We imagine that we have a collection of sets
of equal cardinality and we wish to determine the cardinality of the total
collection. Every element of the total collection can be represented by an
ordered pair, the first member is the symbol for that element within its set,
and the second member is the symbol for the set to which it belongs.
Definition Let A and B be cardinal numbers. There are sets a and b such
that C(a) = A and C(b) = B, and we define AB = C(a × b).
The third binary operation that we shall define on cardinal numbers we
shall call exponentiation. We will nominally use the symbol ∧ for exponen-
tiation and the associated element to the ordered pair will be known as the
power. Again we choose to use a different but still unambiguous notation
for common use: we will use the second member as a superscript to indicate
the power, i.e., a^b = a ∧ b.
The motivation for the definition of cardinal exponentiation is that we
imagine that we have an arbitrary collection of arbitrary collections of sets of
equal cardinality, and we wish to determine its cardinality. Recall that the
cartesian product of a collection of sets is the collection of all maps from the
collection of sets to the union of all elements of the sets, where the image of a
set is restricted to the elements of itself. Thus our model for exponentiation
is a collection of duplications of a given set, and we wish to compute the
cardinality of the cartesian product of this collection.
Definition Let A and B be cardinal numbers. There are sets a and b such
that C(a) = A and C(b) = B, and we define A^B = C({f | f : b → a}).
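On finite sets the definitions of the product and the power can be checked by counting. The sketch below enumerates a × b and the set of all functions from b to a; the name functions is hypothetical, and each function is represented as a Python dict.

```python
from itertools import product


def functions(b, a):
    """All functions f : b -> a for finite sets, each represented as a dict."""
    domain = sorted(b)
    return [dict(zip(domain, values))
            for values in product(sorted(a), repeat=len(domain))]


a = {0, 1, 2}
b = {0, 1}
product_size = len(set(product(a, b)))   # C(a × b)
power_size = len(functions(b, a))        # C({f | f : b -> a})
print(product_size, power_size)
```

With C(a) = 3 and C(b) = 2 the counts are 3 · 2 and 3^2. When b is empty there is exactly one function, the empty one, in agreement with the convention that A^0 = 1.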
Ordinal Arithmetic
We will now define an arithmetic for ordinal numbers. We wish to extend
the concept of addition relating to the union of two sets, but we also wish to
preserve the order properties of ordinal numbers.
Recall that in Chapter V we defined order type to be the unique ordinal
number to which a well ordered set is order isomorphic, and we use the
abbreviation OT to refer to the order type.
The first operation we will define will again be called addition and its
associated symbol will also be +.
Let a and b be ordinal numbers, and {0, 1} be a two point space. The
sets a × {0} and b × {1} are similar to a and b by the obvious bijection.
Definition If a and b are ordinal numbers, then a + b = OT(a ⊎ b), where
(x, 0) < (y, 0) iff x < y, (x, 1) < (y, 1) iff x < y, and (x, 0) < (y, 1) ∀ x ∈ a
and ∀ y ∈ b.
The second operation defined will again be called multiplication and its
associated symbol will also be ·.
We want to extend the concept of cardinal multiplication so we will con-
cern ourselves with the cartesian product of ordinal numbers. We wish to
extend the order of the ordinal numbers in a fashion that will well order
the cartesian product. Let a and b be ordinal numbers. We define an order
relation on a× b by
(x, y) < (z, w) iff y < w, or y = w and x < z.
This order is called reverse lexicographic order.
Definition If a and b are ordinal numbers, then a · b = OT (a × b), where
(a× b) is well ordered by reverse lexicographic order.
Verifying that reverse lexicographic order is a well ordering is straight-
forward. Pick any non-empty subset of the cartesian product a × b. The
collection of second members of the elements of this subset has a least
element. Pick all those elements that have that least second member, and from
those pick the element that has the least first member. This will be the least
element of the chosen subset. Hence reverse lexicographic ordering is a well
ordering of the cartesian product of two ordinal numbers.
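For finite ordinals the selection procedure just described can be carried out mechanically. The following Python sketch is only an illustration: it assumes that finite ordinals are modelled by Python integers, and the names reverse_lex_less and least_element are ours, not part of the text.

```python
def reverse_lex_less(p, q):
    """(x, y) < (z, w) iff y < w, or y = w and x < z."""
    (x, y), (z, w) = p, q
    return y < w or (y == w and x < z)


def least_element(subset):
    """Least element of a non-empty finite set of pairs, found as in the text."""
    # First take the least second member ...
    least_second = min(y for (_, y) in subset)
    # ... then, among pairs with that second member, the least first member.
    least_first = min(x for (x, y) in subset if y == least_second)
    return (least_first, least_second)


subset = {(2, 1), (0, 2), (1, 1), (3, 4)}
m = least_element(subset)
```

Here m is strictly below every other element of subset with respect to reverse_lex_less, which is what the argument above asserts in general.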
Lemma For ordinal numbers a, b, c, if a = b, then a + c = b + c and
ac = bc.
Before we prove the lemma we give the following definition and observa-
tion.
Definition The function from a set a to itself defined by I : x → x ∀ x ∈ a
is called the Identity map.
We observe that if a = b, then a ⊎ c = b ⊎ c. We can now prove the lemma.
Proof The identity maps I_1 : a ⊎ c → b ⊎ c and I_2 : a × c → b × c are order
preserving bijections. Let β_1 : OT(a ⊎ c) → a ⊎ c, γ_1 : OT(b ⊎ c) → b ⊎ c,
β_2 : OT(a × c) → a × c, γ_2 : OT(b × c) → b × c be order preserving
bijections. We now have that γ_1^{-1} I_1 β_1 : OT(a ⊎ c) → OT(b ⊎ c) and
γ_2^{-1} I_2 β_2 : OT(a × c) → OT(b × c) are order preserving bijections.
Since order isomorphic ordinal numbers are equal, a + c = b + c and a · c = b · c.
We now define the third operation, which we call exponentiation. We
choose the associated symbol ∧, and again we abbreviate with superscripting.
Exponentiation is defined recursively by
i. a ∧ 0 ≡ a^0 = 1,
ii. a ∧ (b + 1) ≡ a^(b+1) = a^b · a,
iii. a ∧ b ≡ a^b = sup{a^c | c < b} if b is a limit ordinal.
Exercises Let a, b, c be ordinal numbers, show
1. a^b · a^c = a^(b+c).
2. (a^b)^c = a^(b·c).
Hint: Let d be any ordinal number containing c, e.g. c + 1. For 1 let
A = {y ∈ d | a^b · a^y = a^(b+y)}, and use transfinite induction to show A = d. For
2 let A = {y ∈ d | (a^b)^y = a^(b·y)}.
Since Cardinal numbers and Ordinal numbers are the same in ω the reader
should verify that ordinal and cardinal arithmetic agree.
We also leave it to the reader to verify these arithmetic facts.
1. ℵ0 + ℵ0 = ℵ0.
2. ℵ0 · ℵ0 = ℵ0.
3. ℵ0 ∧ ℵ0 > ℵ0.
4. n + ω = ω ∀n ∈ ω.
5. ω + n = ω + n ∀n ∈ ω, where the left side of the equation refers to
ordinal addition while the right side refers to the ordinal number ω+n.
6. n · ω = ω ∀n ∈ ω.
7. ω · n = ωn ∀n ∈ ω.
We invite the reader to either confirm or deny any arithmetic fact that he
may hypothesize; i.e. add or multiply a few numbers and see what happens.
Definition Let b and c be elements of ω. If b = c + 1, then we define
b − 1 ≡ c.
Lemma If a, b ∈ ω and C(a) = C(b), then a = b.
Proof Let c = {y ∈ ω | C(y) < C(y + 1)}, and let x ∈ ω such that S(x) ⊂ c.
If x = 0, we then have C(0) < C(1), thus 0 ∈ c. If x ≠ 0, we then have
x − 1 ∈ c, and thus C(x − 1) < C(x).
We can clearly establish that there is no bijection between 0 and 1, or 1
and 2. Now for x > 1 assume there exists a bijection β : x → x + 1. We may
then create a bijection γ : x− 1 → x by
γ(y) = β(y) if y ≠ β^{-1}(x),
γ(y) = β(x − 1) if y = β^{-1}(x).
Thus C(x−1) = C(x), which is a contradiction. Thus C(x) < C(x+1), and
thus x ∈ c, and thus by transfinite induction C(x) < C(x + 1) ∀ x ∈ ω. By
transitivity we have a < b ⇒ C(a) < C(b).
Thus C(a) = C(b) ⇒ a = b.
An alternate definition for addition and multiplication on ω is
a + 0 = a
a + b = (a + 1) + (b− 1)
and
a · 0 = 0
a · b = a + a · (b− 1).
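The two recursions can be transcribed almost literally. The sketch below is a hedged illustration, assuming elements of ω are modelled by non-negative Python integers and that the built-in a + 1 and b - 1 play the roles of successor and predecessor; the function names add and mul are ours.

```python
def add(a, b):
    """a + 0 = a;  a + b = (a + 1) + (b - 1) when b is not 0."""
    if b == 0:
        return a
    return add(a + 1, b - 1)


def mul(a, b):
    """a * 0 = 0;  a * b = a + a * (b - 1) when b is not 0."""
    if b == 0:
        return 0
    return add(a, mul(a, b - 1))
```

Each call reduces the second argument by one, so the recursion terminates for every pair of elements of ω (small inputs only, given Python's recursion limit).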
Corollary This definition for addition and multiplication agrees with or-
dinal arithmetic on ω.
Proof Using the previous lemma we need only demonstrate equivalent
cardinality. Define a bijection β : a ⊎ b → (a + 1) ⊎ (b − 1) by
β(x) = x if x ∈ a × {0},
β(x) = x if x ∈ (b − 1) × {1},
β(x) = (a, 0) if x = ((b − 1), 1).
We also note that C(a + 0) = C(a ∪ ∅) = C(a).
To show a · b = a + a · (b − 1) we define the bijection
β : a × b → a ⊎ (a × (b − 1)) by
β(x, y) = (x, 0) if y = b − 1,
β(x, y) = ((x, y), 1) if y ∈ b − 1.
Also a · 0 = C(a × ∅) = C(∅) = 0.
Exercises
For a, b, c ∈ ω show the following:
1. (a + b) + c = a + (b + c).
2. a + b = b + a.
3. (a · b) · c = a · (b · c).
4. a · b = b · a.
5. a · 1 = a.
6. (a + b) · c = a · c + b · c.
Solution to 1: define the bijection
β : (a×{0} ⊎ b×{1})×{0} ⊎ c×{1} → a×{0} ⊎ (b×{0} ⊎ c×{1})×{1}
by
β((x, n, 0)) = (x, 0) if x ∈ a,
β((x, n, 0)) = (x, 0, 1) if x ∈ b,
and β((x, 1)) = (x, 1, 1) if x ∈ c.
VII INTEGER AND RATIONAL
NUMBERS
Natural Numbers
Definition The natural or counting numbers are the elements of ω.
We use the blackboard bold face letter N to represent the set of natural
numbers. As the definition indicates they are synonymous with the ordinal
number ω.
We note here that when the natural or counting numbers are developed
via the Peano postulates, the set of natural numbers begins with 1 and does not
include the number 0. The Whole numbers, indicated by W, are considered
to be the union of the natural numbers and
{0}. However in the set theoretic development of numbers it is much more
convenient to consider the natural numbers as the ordinal number ω, and not
specify any particular set as being whole numbers. (The Peano postulates
can be found in most elementary number theory texts, and also in Naive Set
Theory by Paul R. Halmos.)
Integers
The next set of numbers we develop we shall call Integer Numbers,
and we will indicate this set by the bold face letter Z (from the German
word for numbers, Zahlen). For the purpose of brevity the integer numbers
are referred to as integers. The rationale for the development of this set is
that we may wish to answer questions such as: What number added to 2 is
0? This can be expressed symbolically by x + 2 = 0. We realize of course
that {x ∈ N | x + 2 = 0} = ∅, thus the question has no answer in the
natural numbers. We extend the natural numbers to a larger set in which
this question and other questions like it have answers.
We define an equivalence relation on the cartesian product of the natural
numbers with themselves, i.e., N× N, by
(a, b) ≡ (c, d) ⇔ a + d = b + c.
The rationale for this definition is that each ordered pair represents a
difference. Using our previous (to the study of set theory) concept of sub-
traction we see
a + d = b + c ⇔ a− b = c− d.
We leave it to the reader to verify that this relation is an equivalence
relation. Also the reader should note and verify that this relation is not an
equivalence relation for α× α where α > ω.
Definition The integers are the collection of equivalence classes of N × N
with respect to the equivalence relation
(a, b) ≡ (c, d) if a + d = b + c.
We will indicate the equivalence class of (a, b) by [a, b], that is
[a, b] = {(x, y) | (x, y) ≡ (a, b)}.
We now wish to define an order and an arithmetic for the integers. First
we need a pair of lemmas.
Lemma 7.1 a + n = b + n ⇒ a = b ∀ n ∈ N.
Proof Let A = {n ∈ N | a + n = b + n ⇒ a = b ∀ a, b ∈ N} and let k ∈ N be
such that S(k) ⊂ A. If k ≥ 2, then S(k) = {0, 1, 2, · · · , k − 1} ⊂ A. Thus
a + 1 = b + 1 ⇒ a = b, and a + (k − 1) = b + (k − 1) ⇒ a = b, for all a, b ∈ N.
We now assume a + k = b + k, which implies
(a + (k − 1)) + 1 = (b + (k − 1)) + 1, which implies a + (k − 1) = b + (k − 1),
which implies a = b. For k = 1 we have a + 1 = b + 1 ⇒ a = b by Theorem
2.8. For k = 0, we have a + 0 = a and b + 0 = b, thus a + 0 = b + 0 ⇒ a = b.
All three cases imply k ∈ A. Thus by transfinite induction we have the
desired result.
Lemma 7.2 a + n < b + n ⇒ a < b ∀ n ∈ N.
Proof: The proof is identical to the proof of the previous lemma by
replacing “=” with “<”, except possibly the case k = 1. For the case k = 1,
a + 1 < b + 1 ⇒ a ∪ {a} ⊂ b ∪ {b}. If a ⊄ b, then there exists x ∈ a such
that x ∉ b. Since x ∈ b ∪ {b} we must have x = b, so b ∈ a. But a ∈ b ∪ {b},
so either a ∈ b or a = b, which contradicts the axiom of regularity. We thus
conclude a ⊆ b. If a = b we also have the obvious contradiction to the axiom
of regularity, thus a ⊂ b ⇒ a < b.
There is a natural order that may be defined on the integers.
Definition [a, b] < [c, d] iff a + d < b + c.
Since [a, b] and [c, d] represent equivalence classes, and the numbers a, b, c, d
are specific values we must verify that the definition is valid regardless of what
pair is chosen to represent each equivalence class. That is, we must show that
the ordering is well defined.
Theorem 7.3 The ordering of the integers is well defined.
Proof Let [a, b] < [c, d], [a, b] = [x, y], and [c, d] = [z, w]. We have
a + d < b + c, a + y = b + x, and c + w = d + z
⇒ a + d + x + w < b + c + x + w = a + d + y + z
⇒ x + w < y + z
⇒ [x, y] < [z, w].
We define addition and multiplication of integers as follows
[a, b] + [c, d] = [a + c, b + d]
[a, b] · [c, d] = [ac + bd, ad + bc].
We now demonstrate that these operations are well defined.
Theorem 7.4 Addition and multiplication of integers are well defined.
Proof Let (a, b) ≡ (x, y), and (c, d) ≡ (z, w). We thus have
a + c + y + w = b + d + x + z
⇒ (a + c, b + d) ≡ (x + z, y + w)
⇒ [a, b] + [c, d] = [x, y] + [z, w].
Hence addition is well defined.
For multiplication we have
a + y = b + x and c + w = d + z
⇒ ac + cy = bc + cx, bd + dx = ad + dy,
xw + cx = dx + xz and yz + dy = cy + yw
⇒ ac + bd + xw + yz + cy + dx + cx + dy =
ad + bc + xz + yw + cy + dx + cx + dy
⇒ ac + bd + xw + yz = ad + bc + xz + yw
⇒ (ac + bd, ad + bc) ≡ (xz + yw, xw + yz)
⇒ [a, b] · [c, d] = [x, y] · [z, w].
Hence multiplication is well defined.
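A small numerical check can make the well-definedness statements concrete. The Python sketch below is illustrative only: it assumes an integer is represented by any pair (a, b) of non-negative Python integers, and the names int_equiv, int_less, int_add and int_mul are ours.

```python
def int_equiv(p, q):
    """(a, b) ≡ (c, d) iff a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def int_less(p, q):
    """[a, b] < [c, d] iff a + d < b + c."""
    (a, b), (c, d) = p, q
    return a + d < b + c


def int_add(p, q):
    """[a, b] + [c, d] = [a + c, b + d]."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)


def int_mul(p, q):
    """[a, b] · [c, d] = [ac + bd, ad + bc]."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)


# Two representatives of the same class, and two of another class.
p1, p2 = (2, 5), (0, 3)
q1, q2 = (4, 1), (3, 0)
```

Replacing p1 by p2 and q1 by q2 changes the resulting pairs but not their equivalence classes, as Theorems 7.3 and 7.4 assert.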
We must now develop the usual properties of the integers.
Theorem 7.5 If a, b, c, d ∈ Z with c ≠ 0 and d > 0, then
i. a = b ⇔ a + c = b + c.
ii. a = b ⇔ ac = bc.
iii. a < b ⇔ a + c < b + c.
iv. a < b ⇔ ad < bd.
Proof Let a = [a1, a2], b = [b1, b2], c = [c1, c2], d = [d1, d2] > 0 ⇒ d1 > d2.
i.
a = b ⇔ [a1, a2] = [b1, b2]
⇔ a1 + b2 = a2 + b1
⇔ a1 + b2 + c1 + c2 = a2 + b1 + c1 + c2
⇔ [a1 + c1, a2 + c2] = [b1 + c1, b2 + c2]
⇔ [a1, a2] + [c1, c2] = [b1, b2] + [c1, c2]
⇔ a + c = b + c
ii.
a = b ⇔ [a1, a2] = [b1, b2]
⇔ a1 + b2 = a2 + b1
⇔ a1c1 + b2c1 = a2c1 + b1c1 & a1c2 + b2c2 = a2c2 + b1c2
⇔ a1c1 + b2c1 + a2c2 + b1c2 = a1c2 + b2c2 + a2c1 + b1c1
⇔ [a1c1 + a2c2, a1c2 + a2c1] = [b1c1 + b2c2, b1c2 + b2c1]
⇔ ac = bc
iii.
a < b ⇔ [a1, a2] < [b1, b2]
⇔ a1 + b2 < a2 + b1
⇔ a1 + b2 + c1 + c2 < a2 + b1 + c1 + c2
⇔ [a1 + c1, a2 + c2] < [b1 + c1, b2 + c2]
⇔ [a1, a2] + [c1, c2] < [b1, b2] + [c1, c2]
⇔ a + c < b + c
iv.
a < b ⇔ [a1, a2] < [b1, b2]
⇔ a1 + b2 < a2 + b1
⇔ a1d1 + b2d1 < a2d1 + b1d1 & a1d2 + b2d2 < a2d2 + b1d2
also a1d2 + b2d2 < a1d1 + b2d1 & a2d2 + b1d2 < a2d1 + b1d1
since d2 < d1
⇔ a1d1 + b2d1 + a2d2 + b1d2 < a1d2 + b2d2 + a2d1 + b1d1
⇔ [a1d1 + a2d2, a1d2 + a2d1] < [b1d1 + b2d2, b1d2 + b2d1]
⇔ ad < bd
Definition An injection of a set a into a set b is a bijection from a to a
subset of b. We will use the symbol → to indicate that a map is an injection.
There is a natural injection, J , from N to Z defined by
J : x → [x, 0].
When there exists an injection from one set to another that preserves
order and arithmetic properties, we say the first set is embedded into the
second. It is easy to verify that the natural injection is an embedding.
Lemma 7.6 If [a, b] is an integer, then there exists a natural number c,
such that [a, b] = [c, 0], or [a, b] = [0, c].
Proof By trichotomy, either a > b, a = b, or a < b. If a > b let c be such
that b + c = a, thus [a, b] = [c, 0]. If a = b let c = 0 (recall for us that 0
is a natural number), thus [a, b] = [0, 0] = [c, 0]. If a < b let c be such that
a + c = b, thus [a, b] = [0, c].
When a + c = b, and a, b, c ∈ N, we express c as b− a.
It is convenient to represent an equivalence class by one of its elements.
When a choice function is defined to choose from each of the equivalence
classes a representative element, that element is known as the canonical
representative.
For the integers we define our choice function to be
φ([a, b]) =
(a− b, 0) if a ≥ b
(0, b− a) if b > a
Thus every integer can be represented by [a, 0] or [0, a]. When the num-
bers are understood to be integers we will use a to represent [a, 0] and −a to
represent [0, a]. As an exercise the reader may wish to show that a > 0, and
−a < 0, that is [a, 0] > [0, 0], and [0, a] < [0, 0].
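The choice function φ is easy to compute. The following Python sketch is an illustration under the assumption that a class [a, b] is handed to us through any one of its representatives (a, b) with a, b non-negative integers; the name canonical is ours.

```python
def canonical(a, b):
    """Canonical representative of [a, b]: (a - b, 0) if a >= b, else (0, b - a)."""
    if a >= b:
        return (a - b, 0)
    return (0, b - a)
```

Two pairs are equivalent exactly when canonical returns the same pair for both, so comparing canonical representatives is one way to test equality of integers.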
Definition The set of integers strictly greater than 0 is called the Positive
Integers and are denoted by Z+. Those integers strictly less than 0 are called
the Negative Integers and are denoted by Z−.
An Integral Domain
We leave it to the reader to verify the following properties for Z.
1. a + b ∈ Z ∀a, b ∈ Z.
2. ab ∈ Z ∀a, b ∈ Z.
3. (a + b) + c = a + (b + c)∀a, b, c ∈ Z.
4. (ab)c = a(bc)∀a, b, c ∈ Z.
5. a + b = b + a∀a, b ∈ Z.
6. ab = ba∀a, b ∈ Z.
7. a(b + c) = ab + ac∀a, b, c ∈ Z.
8. ∃e ∈ Z such that a + e = e + a = a∀a ∈ Z
9. ∃u ∈ Z such that au = ua = a∀a ∈ Z
10. ∀a ∈ Z ∃(−a) ∈ Z such that a + (−a) = (−a) + a = 0
11∗. If ab = 0, then either a = 0, or b = 0.
Properties 1 and 2 are called the closure properties, for addition and mul-
tiplication respectively, properties 3 and 4 are the associative properties, 5
and 6 are the commutative properties. Property 7 is the distributive prop-
erty, we say multiplication distributes over addition. In properties 8 and 9
e and u are called the identities (again additive identity and multiplicative
identity respectively). The −a in property 10 is called the additive inverse,
or opposite. We say that a number a is a zero divisor if ab = 0 for some b, where
neither a nor b equals 0 (of course b is also a zero divisor). Property 11∗ is called the
“no zero divisors” property.
Definition Any set with two binary operations satisfying these 11 proper-
ties is called an Integral Domain.
Rational Numbers
Another question we may wish to answer is: Two times what number is
1? This can be represented symbolically by 2x = 1. Again this question has
no answer in the set of integers.
We extend the integers to the set of rational numbers by defining the
appropriate equivalence relation on the cartesian product of the integers with
themselves. We use the bold face letter Q to represent rational numbers.
The letter Q comes from the term quotient, i.e., the rational numbers are a
collection of quotients.
Definition The rational numbers are the collection of equivalence classes
of Z × (Z − {0}) with respect to the equivalence relation
(x, y) ≡ (z, w) ⇔ xw = yz.
From the above comment we can see the rationale for this definition.
Using our previous notion of quotients we see xw = yz ⇔ x/y = z/w, provided
y, w ≠ 0. The reader should again verify that the relation is an equivalence
relation.
If we let a, b ∈ Z such that a ≥ 0 and b > 0, then the reader should
verify that (−a, −b) ≡ (a, b) and (a, −b) ≡ (−a, b). Thus we may (and shall)
assume for any rational number [a, b] that b > 0. Let d be the least element
of {y | (x, y) ∈ [a, b] and y > 0}. We then let the unique element (c, d) ∈ [a, b]
be the canonical representative of the rational number [a, b].
We again have the natural order defined on the rational numbers given
by
[x, y] < [z, w] iff xw < yz.
Theorem 7.7 The ordering of rational numbers is well defined.
Proof Let [x, y] = [a, b], [z, w] = [c, d], and [x, y] < [z, w] with b, d, y, w >
0. We thus have
ay = bx, dz = cw and xw < yz
⇒ bdxyw^2 < bdy^2zw ⇒ ady^2w^2 < bcy^2w^2 ⇒ ad < bc ⇒ [a, b] < [c, d].
Hence the ordering is well defined.
We define addition and multiplication as follows,
[x, y] + [z, w] = [xw + zy, yw]
[x, y] · [z, w] = [xz, yw].
Theorem 7.8 Addition and multiplication of rational numbers are well
defined.
Proof Let [x, y] = [a, b] and [z, w] = [c, d].
For addition we have
ay = bx and dz = cw
⇒ aydw = bxdw and bydz = bycw
⇒ bdxw + bdyz = adyw + bcyw
⇒ [xw + yz, yw] = [ad + bc, bd].
Hence addition is well defined.
For multiplication we have
ay = bx and dz = cw
⇒ acyw = bdxz
⇒ [ac, bd] = [xz, yw].
Hence multiplication is well defined.
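As with the integers, the definitions can be tried out on concrete pairs. The Python sketch below is illustrative only: it assumes a rational number is given by a pair (x, y) of Python integers with y ≠ 0, and it computes the canonical representative (least positive second member) by dividing out the greatest common divisor, which is our choice of method rather than something stated in the text. All function names are ours.

```python
from math import gcd


def rat_equiv(p, q):
    """(x, y) ≡ (z, w) iff xw = yz."""
    (x, y), (z, w) = p, q
    return x * w == y * z


def rat_add(p, q):
    """[x, y] + [z, w] = [xw + zy, yw]."""
    (x, y), (z, w) = p, q
    return (x * w + z * y, y * w)


def rat_mul(p, q):
    """[x, y] · [z, w] = [xz, yw]."""
    (x, y), (z, w) = p, q
    return (x * z, y * w)


def rat_canonical(p):
    """Representative (c, d) of [x, y] with d positive and minimal."""
    x, y = p
    if y < 0:                 # (x, y) ≡ (-x, -y), so we may take y > 0
        x, y = -x, -y
    g = gcd(x, y)             # g > 0 because y > 0
    return (x // g, y // g)
```

For example, (2, 4) and (1, 2) represent the same rational number, and sums and products computed from either representative land in the same class.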
Just as there is a natural embedding of the natural numbers into the
integers, there is the natural embedding of the integers into the rational
numbers given by the injection,
J : x → [x, 1].
It is easy to verify that this injection is an embedding.
A Field
We leave it to the reader to verify the following properties of Q.
1. a + b ∈ Q ∀a, b ∈ Q.
2. ab ∈ Q ∀a, b ∈ Q.
3. (a + b) + c = a + (b + c) ∀a, b, c ∈ Q.
4. (ab)c = a(bc) ∀a, b, c ∈ Q.
5. a + b = b + a ∀a, b ∈ Q.
6. ab = ba ∀a, b ∈ Q.
7. a(b + c) = ab + ac ∀a, b, c ∈ Q.
8. ∃e ∈ Q such that a + e = e + a = a ∀a ∈ Q
9. ∃u ∈ Q such that au = ua = a ∀a ∈ Q
10. ∀a ∈ Q ∃(−a) ∈ Q such that a + (−a) = (−a) + a = 0
11. ∀a ∈ Q, a ≠ 0, ∃a^{-1} ∈ Q such that aa^{-1} = a^{-1}a = 1
The first 10 properties are identical to the properties of an Integral Domain.
Property 11∗ is replaced by property 11 where a^{-1} is called the multiplicative
inverse, or reciprocal.
Any set with two binary operations satisfying these 11 properties is called
a Field.
Exercise Show that every field is an integral domain. That is to say, every
field has no zero divisors.
Differences and Quotients
Definition The difference between integers or rational numbers a and b
is a + (−b), which is written a− b.
Definition The quotient of two rational numbers a and b, with b ≠ 0, is
a · b^{-1}, which is written a/b.
We can see that difference and quotient can be regarded as binary
operations. We also notice that neither operation is commutative or associative.
Exercise For any two rational numbers p < q, show that p < (p + q)/2 < q.
Exercise Show that
1. If p, q, r are rational numbers where p ≤ q and r > 0, then pr ≤ qr.
2. If p, q, r are rational numbers where p ≤ q and r < 0, then pr ≥ qr.
3. If p and q are positive rational numbers, then p/q ≥ 1 ⇒ q/p ≤ 1.
Mathematical Induction
The next theorem is a special case of transfinite induction, that is widely
used in many situations. Before we state and prove the theorem we need two
small lemmas that we present as exercises.
Exercise 1. Define the map φ : N → Z+ by φ(a) = [a + 1, 0]. Show that
φ(a) + 1 = φ(a + 1).
Exercise 2. Show that Z+ is order isomorphic to ω.
Theorem 7.9 The Principle of Mathematical Induction If T ⊆ Z+ such
that the following conditions are true:
i. 1 ∈ T
ii. if k ∈ T , then k + 1 ∈ T ,
then T = Z+.
Proof Consider the order preserving bijection φ : ω → Z+ defined by
φ(a) = [a + 1, 0]. Let A = φ^{-1}(T). Let x ∈ ω such that S(x) ⊂ A.
If x = 0, then φ(x) = 1 ∈ T ⇒ x ∈ A. If x ≠ 0, then
S(x) ⊂ A ⇒ x− 1 ∈ A
⇒ φ(x− 1) ∈ T
⇒ φ(x− 1) + 1 ∈ T
⇒ φ(x) ∈ T
⇒ x ∈ A.
Thus by Transfinite Induction A = ω ⇒ T = φ(A) = φ(ω) = Z+.
The Cardinality of Integers and Rational Numbers
Theorem 7.10 Both the integers and the rational numbers are countable.
Proof By Theorem 4.4 we know that N × N is countable. There exists the
natural embedding of Z into N × N by identifying [a, b] with its canonical
representative (a− b, 0) or (0, b− a). Thus there exists a bijection from Z to
a subset of a countable set, hence Z is countable. Again by Theorem 4.4 we
know that Z× Z is countable, and there exists the natural embedding of Q
into Z×Z by identifying [a, b] with its canonical representative, (c, d) where
d is positive and minimal. Thus there exists a bijection from Q to a subset
of a countable set, hence Q is countable.
Let Z∗ represent the image of the embedding of Z into Q. Also let Z+∗
and Z−∗ represent respectively the images of the embeddings of the positive
and negative integers into the rationals.
The Archimedean Property
Theorem 7.11 Archimedean Property ∀r ∈ Q ∃n ∈ Z+∗ such that r < n.
Proof Let r = [a, b], if r ≤ 1, then r < 2 and we are done. If r > 1, then
without loss of generality, both a, and b are positive integers, and
b(a + 1) ≥ a + 1 > a
⇒ [a, b] < [a + 1, 1] ∈ Z+∗ .
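The proof is constructive, and the witness it produces can be computed directly. The Python sketch below is an illustration under the assumption that a rational number is modelled by fractions.Fraction, whose numerator and denominator give a representative [a, b] with b > 0; the name archimedean_witness is ours.

```python
from fractions import Fraction


def archimedean_witness(r):
    """Return a positive integer n with r < n, following the proof above."""
    if r <= 1:
        return 2              # r ≤ 1 < 2
    # r > 1, so r = [a, b] with a, b positive and [a, b] < [a + 1, 1].
    return r.numerator + 1
```

The witness is not the least possible integer; the proof only needs some integer above r.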
Lemma For positive rational numbers r and s, if r > s, then r^{-1} < s^{-1}.
Proof First we note that if r = [a, b], then r^{-1} = [b, a]; this is immediate
by computing [a, b] · [b, a] = [ab, ab] = [1, 1]. Now let r = [a, b] and s = [c, d],
where a, b, c, d are positive integers.
[a, b] > [c, d] ⇔ ad > bc
⇔ [d, c] > [b, a]
⇔ s^{-1} > r^{-1}.
For any integer a, the product of a with itself b times, where b is a
positive integer, is denoted a^b.
Exercise Prove that for any positive integer, n, there exists an integer of
the form 2^m for some positive integer m, such that 2^m > n. (Hint: use
induction).
Solution For n = 1, 1 < 2 = 2^1. Now assume n < 2^m for some m, then
n + 1 < 2^m + 1 < 2^m + 2^m = 2^(m+1).
Exercise Prove that for any positive rational number, q, there exists a
rational number of the form 2^n, where 2^n > q, and where n is the embedded
image of a positive integer.
Solution Let q = [a, b] where a and b are positive integers. We then have
[a, b] ≤ [a, 1] < [2^n, 1] for some n.
The Division Algorithm
Theorem 7.12 The division algorithm If a and d are integers with d > 0,
then there exist unique integers q and r such that a = dq + r and 0 ≤ r < d.
Proof This result is a consequence of the well ordering property.
Let S = {x ∈ Z | x = a − dn for some n ∈ Z}, and let S′ = {x ∈ S | x ≥ 0}. S′ is
thus the embedded image of some subset of N, and thus if it is non-empty it
must have a least element.
If a ≥ 0, then let n = 0, and thus x = a − 0 = a ≥ 0. Thus a ∈ S′. If
a < 0, then let n = a, thus x = a − ad = a(1 − d) ≥ 0. Thus a − ad ∈ S′. Thus
S′ ≠ ∅. Since S′ ≠ ∅, and is the embedded image of a subset of N, S′ has a least
element. Let the least element be r = a − dq. Thus we have r = a − dq ≤ s ∀s ∈ S′
and a = dq + r where r ≥ 0.
We now must show r < d. We have a − d(q + 1) = a − dq − d = r − d,
thus r − d ∈ S. Since r is the least element in S′ and r − d < r we have
r − d ∉ S′, so r − d < 0 ⇒ r < d. We thus have a = dq + r with 0 ≤ r < d.
Now we have to show that q and r are unique. Suppose a = dq_1 + r_1 and
a = dq_2 + r_2 where 0 ≤ r_1 < d and 0 ≤ r_2 < d. Without loss of generality
we may assume r_1 ≤ r_2. We thus have 0 ≤ r_2 − r_1 ≤ r_2 < d. We note that
r_2 − r_1 = a − dq_2 − a + dq_1 = d(q_1 − q_2). Thus r_2 − r_1 is a multiple of d
and non-negative. We thus have 0 ≤ r_2 − r_1 < d and r_2 − r_1 = d(q_1 − q_2)
⇒ 0 ≤ d(q_1 − q_2) < d ⇒ 0 ≤ q_1 − q_2 < 1 ⇒ q_1 − q_2 = 0. Thus q_1 = q_2,
and thus r_1 − r_2 = 0 ⇒ r_1 = r_2.
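The existence half of the proof suggests a simple, if inefficient, procedure: move through the numbers a − dn one step of size d at a time until the least non-negative one is reached. The Python sketch below is only an illustration; it assumes a and d are Python integers with d > 0, and the name divide is ours.

```python
def divide(a, d):
    """Return (q, r) with a = d*q + r and 0 <= r < d, for d > 0."""
    if d <= 0:
        raise ValueError("d must be positive")
    q, r = 0, a               # invariant: a == d*q + r
    while r < 0:              # climb into the non-negative part of S
        q, r = q - 1, r + d
    while r >= d:             # descend to the least non-negative element
        q, r = q + 1, r - d
    return q, r
```

For d > 0 the result coincides with Python's built-in divmod(a, d), which is consistent with the uniqueness half of the theorem.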
Exercises
1. Verify that the relation (a, b) ≡ (c, d) ⇔ a+d = b+c is an equivalence
relation on N× N, but not on α× α where α > ω.
2. Verify that the relation (x, y) ≡ (z, w) ⇔ xw = yz is an equivalence
relation on Z × (Z − {0}).
For any Integral Domain D show:
3. a · e = e ∀a ∈ D and where e is the additive identity.
4. u · (−u) = −u, where u is the multiplicative identity.
5. (−u) · (−u) = u, where u is the multiplicative identity.
6. If a, z ∈ D and a + z = a, then show z = e.
Solution to 3: a = a · u = a · (u + e) = a + a · e ⇒ a · e = e.
7. If v, a ∈ F , where F is a field and a · v = a, then show v = u.
8. Show that every Field is an Integral Domain.
Mathematical Induction is often used to prove certain identities.
Exercise 9 and its solution exemplify how this is done. Exercise 10 is left
as practice.
9. Show that ∑_{k=1}^{n} k = n(n + 1)/2 ∀n ∈ Z+.
Solution to 9: Let A = {n ∈ Z+ | ∑_{k=1}^{n} k = n(n + 1)/2}.
i. ∑_{k=1}^{1} k = 1 = 1(2)/2, thus 1 ∈ A.
ii. Assume m ∈ A, then
∑_{k=1}^{m+1} k = m + 1 + ∑_{k=1}^{m} k = m + 1 + m(m + 1)/2 = (m + 1)(m + 2)/2.
Thus m + 1 ∈ A. Therefore by Mathematical Induction A = Z+,
and ∑_{k=1}^{n} k = n(n + 1)/2 ∀n ∈ Z+.
10. Show that ∑_{k=1}^{n} (2k − 1) = n^2 ∀n ∈ Z+.
VIII REAL NUMBERS
It is a well known fact, and a standard exercise, that √2 is not rational.
This means the equation x^2 − 2 = 0 has no solutions in the rational numbers,
hence the need to extend the rational numbers to a larger set that would
include the solutions to such equations.
The next set we develop is the set of real numbers, which we will indicate
by R.
Dedekind Cuts and Real Numbers
Definition A Cut or Dedekind cut of the rational numbers is a subset
A of the rational numbers such that
1. A ≠ ∅, and A ≠ Q.
2. If a ∈ A, and b < a, then b ∈ A.
3. If a ∈ A, then ∃ a′ ∈ A such that a < a′.
From this definition we see that a cut forms a partition of the rational num-
bers into two non-empty subsets, the cut and its complement, where every
element of the cut is less than every element of its complement. The cut is
called the lower set and its complement is called the upper set.
Definition The set of Real Numbers, R, is the collection of all cuts of
the rational numbers.
We define a natural ordering of the real numbers by the following.
Definition For two real numbers, A and A′, we say A < A′ if A ⊂ A′. We
emphasize the inclusion is proper, i.e., A ≠ A′.
Theorem 8.1 For any rational number r the set z = {p | p < r} is a cut,
and hence a real number.
Proof We demonstrate that z satisfies the three conditions of the definition
of a cut.
1) r − 1 < r ⇒ r − 1 ∈ z. Thus z ≠ ∅. Also r ≮ r ⇒ r ∉ z ⇒ z ≠ Q.
2) If q ∈ z and p < q, then p < r, thus p ∈ z.
3) If q ∈ z, then q < r ⇒ q < (q + r)/2 < r, thus (q + r)/2 ∈ z.
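Since a cut is an infinite set, a program cannot list its elements, but it can represent the cut by its membership test. The Python sketch below is a hedged illustration: it assumes rational numbers are modelled by fractions.Fraction, and the names rational_cut and larger_member are ours.

```python
from fractions import Fraction


def rational_cut(r):
    """Membership test for the cut {p | p < r} of Theorem 8.1."""
    return lambda p: p < r


def larger_member(q, r):
    """Given q in {p | p < r}, return the member (q + r)/2, which exceeds q."""
    return (q + r) / 2


r = Fraction(3, 4)
in_z = rational_cut(r)
q = Fraction(1, 2)
m = larger_member(q, r)       # the witness used in part 3) of the proof
```

Here in_z(r - 1) holds and in_z(r) fails, matching part 1), while m lies in the cut and is larger than q, matching part 3).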
There is a natural injection of the rational numbers into the real numbers,
the injection is given by
J : q → A where A = {p | p < q}.
We will use the notation q̂ to represent the real number {p | p < q}; in
particular the rational number 0 embeds as J : 0 → {p | p < 0} = 0̂. Throughout
the remainder of this chapter real numbers will be marked by the symbol ˆ.
In subsequent chapters the ˆ will be omitted, and a number will be known
to be real by the context.
Exercise Show that if r ∈ Q and r̂ < x̂, then r ∈ x̂.
Solution r̂ = {p | p < r} ⊂ x̂, thus ∃ t ∈ x̂ such that t ∉ r̂, thus t ≥ r. If
t = r we are done, if t > r, then r ∈ x̂.
Trichotomy
Lemma 8.2 The trichotomy property for real numbers For any real number
x̂ exactly one of the following is true:
x̂ > 0̂, x̂ = 0̂, or x̂ < 0̂.
Proof Let x̂ be a real number, then exactly one of the following is true:
0̂ ⊂ x̂, x̂ = 0̂, or x̂ ⊂ 0̂.
Definition If x̂ > 0̂ then we say x̂ is a positive real number. If x̂ < 0̂ then
we say x̂ is a negative real number.
Addition
We define addition of real numbers by the following:
Definition x̂ + ŷ = {r | r < p + q, p ∈ x̂, and q ∈ ŷ}.
Theorem 8.3 The sum x̂ + ŷ is a real number.
Proof We show that x̂ + ŷ satisfies the definition of a cut.
1) Since x̂ and ŷ are not empty there exist p ∈ x̂ and q ∈ ŷ, and
p + q − 1 < p + q ⇒ p + q − 1 ∈ x̂ + ŷ ⇒ x̂ + ŷ ≠ ∅. Also since ∃c ∉ x̂ and
∃d ∉ ŷ we have c > t ∀t ∈ x̂ and d > w ∀w ∈ ŷ. We have c + d > t + w ∀t ∈ x̂
and w ∈ ŷ. Thus c + d ∉ x̂ + ŷ, and so x̂ + ŷ ≠ Q.
2) Let a ∈ x̂ + ŷ and b < a. We have b < a < p + q where p ∈ x̂ and
q ∈ ŷ. Thus b ∈ x̂ + ŷ.
3) Let a ∈ x̂ + ŷ, thus a < p + q where p ∈ x̂ and q ∈ ŷ. There exists
r > p such that r ∈ x̂, thus p + q < r + q. Thus a < p + q ∈ x̂ + ŷ.
Corollary x̂ + ŷ = {p + q | p ∈ x̂, and q ∈ ŷ}.
Proof From part 3) we have
{p + q | p ∈ x̂, and q ∈ ŷ} ⊆ {r | r < p + q, p ∈ x̂, and q ∈ ŷ}.
Now if r < p + q where p ∈ x̂ and q ∈ ŷ, then r − p < q, thus r − p ∈ ŷ and
so r = p + (r − p) where p ∈ x̂ and (r − p) ∈ ŷ, thus
{r | r < p + q, p ∈ x̂, and q ∈ ŷ} ⊆ {p + q | p ∈ x̂, and q ∈ ŷ}.
Thus x̂ + ŷ = {p + q | p ∈ x̂, and q ∈ ŷ}.
We leave to the reader as an exercise to show addition of real numbers is
commutative and associative; i.e., x̂ + ŷ = ŷ + x̂, and x̂ + (ŷ + ẑ) = (x̂ + ŷ) + ẑ
for all real numbers x̂, ŷ, ẑ.
Exercises For p, q ∈ Q, and A ∈ N show
i) p̂ + q̂ = (p + q)ˆ.
ii) ∑_{i∈A} p̂_i = (∑_{i∈A} p_i)ˆ.
Theorem 8.4 For any real number x̂ and the real number 0̂, x̂ + 0̂ = x̂.
Proof: If r ∈ x̂ + 0̂, then r = p + q where p ∈ x̂ and q < 0, thus r = p + q <
p ⇒ r ∈ x̂, thus x̂ + 0̂ ⊆ x̂. If r ∈ x̂, then there is a rational number s ∈ x̂,
such that s > r, thus r − s < 0, thus r − s ∈ 0̂, and (r − s) + s = r, thus
x̂ ⊆ x̂ + 0̂. Hence x̂ + 0̂ = x̂.
We say 0̂ is the additive identity.
Definition If for two real numbers x̂ and ŷ we have x̂ + ŷ = 0̂, then we
say ŷ is the additive inverse or opposite of x̂. We use the notation −x̂ to
indicate the additive opposite of x̂.
Theorem 8.5 Every real number has an additive inverse.
Proof Let x̂ be a real number and let y = {q ∈ Q | q + p < 0 ∀ p ∈ x̂}. We
first show that y is a real number, i.e., y is a cut.
1) Since x̂ ≠ Q ∃ d ∈ Q such that d ∉ x̂. Thus d > p ∀ p ∈ x̂. Thus
−d < −p ∀ p ∈ x̂. Thus p + (−d) < 0 ∀ p ∈ x̂. Thus −d ∈ y. Thus y ≠ ∅.
Also we notice for any p ∈ x̂, p + (−p) = 0, thus −p ∉ y, thus y ≠ Q.
2) Let q ∈ y and a < q. We thus have a + p < q + p < 0 ∀ p ∈ x̂. Thus
a ∈ y.
3) Let q ∈ y, thus q + p < 0 ∀p ∈ x̂. Now assume p + q + r ≥ 0 for some
p ∈ x̂ and ∀r ∈ Q+. Thus p + r ∉ x̂ ∀r ∈ Q+. But ∀p ∈ x̂ ∃s > p such that
s ∈ x̂. Let r = s − p, and we thus have p + r = s ∈ x̂, which contradicts our
assumption. Thus ∀p ∈ x̂ ∃r ∈ Q+ such that q + p + r < 0, thus q + r ∈ y,
and q < q + r.
Thus y is a cut and we write y = ŷ.
We now show that x̂ + ŷ = 0̂. If r ∈ x̂ + ŷ, then r = p + q < 0, thus
x̂ + ŷ ⊆ 0̂.
If s ∈ 0̂, and p ∈ x̂ is arbitrary, then s − p ∈ Q and p + s − p = s < 0,
thus s − p ∈ ŷ. Thus s = p + s − p ∈ x̂ + ŷ, and thus 0̂ ⊆ x̂ + ŷ, and thus
x̂ + ŷ = 0̂.
Lemma 8.6 −(−x̂) = x̂.
Proof We have
x̂ + (−x̂) = 0̂ = −(−x̂) + (−x̂)
⇒ x̂ + (−x̂) + x̂ = −(−x̂) + (−x̂) + x̂
⇒ x̂ = −(−x̂)
Lemma 8.7 x̂ > 0̂ if and only if −x̂ < 0̂.
Proof Assume x̂ > 0̂, then 0 ∈ x̂, so ∃p ∈ x̂ such that p > 0. Now assume
−x̂ ≥ 0̂, then ∃q ∈ −x̂ such that q > −p. Thus q + p > 0, and thus
x̂ + (−x̂) > 0̂, which is a contradiction.
Now assume −x̂ < 0̂, then ∃q < 0 such that q > p ∀p ∈ −x̂. So
−q < −p ∀p ∈ −x̂. Thus p + (−q) < 0 ⇒ −q ∈ x̂. Since we have 0 < −q, we
have 0 ∈ x̂, and thus x̂ > 0̂.
Corollary −0̂ = 0̂.
Lemma 8.8 −(x̂ + ŷ) = −x̂ + (−ŷ).
Proof We have −x̂ + (−ŷ) = {r + s | r ∈ −x̂ and s ∈ −ŷ}. Thus for
r ∈ −x̂ and s ∈ −ŷ, r + p < 0 ∀p ∈ x̂ and s + q < 0 ∀q ∈ ŷ. Thus
r + s + p + q < 0 ∀(p + q) ∈ x̂ + ŷ. Thus
−x̂ + (−ŷ) = {r + s | (r + s) + (p + q) < 0 ∀(p + q) ∈ x̂ + ŷ, r ∈ −x̂, s ∈ −ŷ}.
Now we also have −(x̂ + ŷ) = {t | t + (p + q) < 0 ∀(p + q) ∈ x̂ + ŷ}. Since
(r + s) ∈ Q ∀r ∈ −x̂ and s ∈ −ŷ we have (r + s) ∈ −(x̂ + ŷ). Thus we have
−x̂ + (−ŷ) ⊆ −(x̂ + ŷ).
Now let t ∈ −(x̂ + ŷ), then t + (p + q) < 0 ∀p ∈ x̂ and q ∈ ŷ. Also
t + (p + q) < (t + (p + q))/2 < 0. So (t + (p + q))/2 − p + p < 0 ∀p ∈ x̂ and q ∈ ŷ.
Thus (t + (p + q))/2 − p ∈ −x̂. Also (t + (p + q))/2 − q + q < 0 ∀p ∈ x̂ and q ∈ ŷ.
Thus (t + (p + q))/2 − q ∈ −ŷ. And we have
t = (t + (p + q))/2 − p + (t + (p + q))/2 − q.
Thus t ∈ −x̂ + (−ŷ). Thus −(x̂ + ŷ) ⊆ −x̂ + (−ŷ).
Thus we have −(x̂ + ŷ) = −x̂ + (−ŷ).
Multiplication
We now define multiplication of real numbers. First we define the product
of two positive real numbers by:
Definition For x̂, ŷ > 0̂, x̂ · ŷ = {r | r < pq, p ∈ x̂, q ∈ ŷ, and p > 0, q > 0}.
For ease of notation we will often use juxtaposition to indicate the
operation of multiplication. That is x̂ŷ ≡ x̂ · ŷ.
Theorem 8.9 If x̂ and ŷ are positive real numbers, then x̂ŷ is a positive
real number.
Proof We show x̂ŷ is a cut.
1) x̂, ŷ > 0̂ ⇒ ∃p > 0, q > 0, p ∈ x̂ and q ∈ ŷ, thus 0 < pq, thus 0 ∈ x̂ŷ,
thus x̂ŷ ≠ ∅. Now ∃a ∉ x̂ and ∃b ∉ ŷ, thus a > p ∀p ∈ x̂ and b > q ∀q ∈ ŷ,
thus ab > pq ∀ positive p ∈ x̂ and q ∈ ŷ. Hence ab ∉ x̂ŷ. Hence x̂ŷ ≠ Q.
2) Let r ∈ x̂ŷ, r < pq for some p ∈ x̂ and q ∈ ŷ. If s < r, then s < pq
and thus s ∈ x̂ŷ.
3) Let r ∈ x̂ŷ. We have r < pq for some positive p ∈ x̂ and q ∈ ŷ. Since
∃s ∈ x̂ such that p < s and ∃t ∈ ŷ such that q < t we have r < pq < st, thus
pq ∈ x̂ŷ.
Since x̂ŷ satisfies the definition of a cut we conclude x̂ŷ is a real number.
To show that x̂ŷ is positive we notice that x̂ > 0̂ ⇒ 0̂ ⊂ x̂ ⇒ 0 ∈ x̂ ⇒
∃p > 0, p ∈ x̂. Similarly ∃q > 0, q ∈ ŷ. Now let r ∈ x̂ and r > 0, s ∈ ŷ and
s > 0 be arbitrary. We thus have 0 < rs ⇒ 0 ∈ x̂ŷ ⇒ 0̂ ⊂ x̂ŷ ⇒ x̂ŷ > 0̂.
We complete the definition of multiplication by:
x̂ŷ = 0̂ if x̂ = 0̂ or ŷ = 0̂,
x̂ŷ = −((−x̂)ŷ) if x̂ < 0̂ and ŷ > 0̂,
x̂ŷ = −(x̂(−ŷ)) if x̂ > 0̂ and ŷ < 0̂,
x̂ŷ = (−x̂)(−ŷ) if x̂ < 0̂ and ŷ < 0̂.
Corollary If x̂ and ŷ are real numbers, then x̂ŷ is a real number.
Associativity and commutativity of multiplication follow immediately
from the definition of multiplication and the associativity and commutativity
of rational numbers.
Exercise For p, q ∈ Q, A ∈ N show
i) p̂ · q̂ = (p · q)ˆ.
ii) ∏_{i∈A} p̂_i = (∏_{i∈A} p_i)ˆ.
Theorem 8.10 The real number 1̂ = {p | p < 1} is the multiplicative
identity.
Proof First consider a positive real number x̂.
x̂ · 1̂ = {r | r < pq, p ∈ x̂, q ∈ 1̂, p > 0, q > 0}.
If t ∈ x̂ · 1̂, then ∃p ∈ x̂, p > 0 and q ∈ 1̂, q > 0 such that t < pq < p ∈ x̂,
thus t ∈ x̂. Now if t ∈ x̂, then there exists q ∈ x̂, q > 0, such that t < q. We
now note that t < (t + q)/2 < q and (t + q)/(2q) < 1. We thus have
t < q · (t + q)/(2q), thus t ∈ x̂ · 1̂. Hence x̂ · 1̂ = x̂.
For x̂ = 0̂ we always have 0̂ · 1̂ = 0̂.
For x̂ < 0̂ we have x̂ · 1̂ = −((−x̂) · 1̂) = −(−x̂) = x̂.
Theorem 8.11 x 6= 0, if and only if ∃y such that xy = 1.
Proof In the converse direction we simple note that 0y = 0 ∀y ∈ R. Thus,
if xy = 1, then neither x nor y can be 0.
In the forward direction, let x > 0, and let
y = s|∃ p, q > 0, q ∈ x, pq < 1 and s < p.
We first show that y is a cut.
1. x > 0 ⇒ 0 ∈ x ⇒ ∃q ∈ x where q > 0. Since 0 · q = 0 < 1 ∀q we
have 0 ∈ y. Thus y ≠ ∅. To show that y ≠ Q, pick q ∈ x, q > 0, then
$q^{-1} \cdot q = 1$, thus $q^{-1} \notin y$.
2. If s ∈ y and t < s, then t < p, thus t ∈ y.
3. If s ∈ y, then s < p, thus $s < \frac{s+p}{2} < p$, thus $\frac{s+p}{2} \in y$.
We conclude that y is a cut and thus is a real number.
For any s ∈ y, q ∈ x and p > 0 where pq < 1 we have s < p, thus sq < pq < 1.
Hence $yx = \{r \mid r < sq < pq < 1,\ s, p, q > 0\} = \{r \mid r < 1\} = 1$.
To complete the proof, assume x < 0, then ∃y such that (−x) · y = 1,
thus
x · (−y) = −xy = (−x) · y = 1.
The set of Real Numbers is a Field
To complete the verification that the real numbers form a field we demon-
strate that multiplication distributes over addition.
Theorem 8.12 x · (y + z) = x · y + x · z.
To prove Theorem 8.12 we need the following lemma.
Lemma If p ∈ x and q ∈ y where p, q > 0, then pq ∈ xy.
Proof ∃ r ∈ x such that p < r, thus pq < rq, thus pq ∈ xy.
Proof of Theorem 8.12
i) Assume x, y + z > 0, then either y > 0 or z > 0.
\[ x \cdot (y + z) = \{t \mid t < p \cdot s \text{ where } p \in x,\ s \in y + z \text{ and } p, s > 0\}. \]
If y and z are positive, then for s ∈ (y + z) we can write s = q + r
where q ∈ y, r ∈ z and q, r > 0. Thus we have
\[ x \cdot (y+z) = \{t \mid t < p \cdot s = p(q+r) = pq + pr,\ p, q, r > 0 \text{ with } p \in x,\ q \in y,\ r \in z\}. \]
Now
\[ xy + xz = \{t \mid t < u + v,\ u \in xy \text{ and } v \in xz\} = \{t \mid t < pq + p'r,\ p, q, p', r > 0,\ p, p' \in x,\ q \in y,\ r \in z\}. \]
Thus x · (y + z) ⊆ xy + xz.
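Now let p, p′ ∈ x, p, p′ > 0, then
\[ pq + p'r = p\left(q + \frac{p'}{p}\, r\right) = p'\left(\frac{p}{p'}\, q + r\right) \]
and either $\frac{p'}{p} \le 1$ or $\frac{p}{p'} \le 1$, thus either $\frac{p}{p'}\, q \in y$ or $\frac{p'}{p}\, r \in z$.
Thus xy + xz ⊆ x · (y + z), and thus xy + xz = x · (y + z).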
Now without loss of generality, assume y > 0 and z ≤ 0. Then we have
\[ x \cdot (y + z) = \{t \mid t < ps,\ p > 0,\ s > 0,\ p \in x \text{ and } s \in y + z\}. \]
Now, s = q − r, where q ∈ y, −r ∈ z, q, r > 0. Thus
\[ x \cdot (y + z) = \{t \mid t < ps = p(q - r) = pq - pr,\ p, q, r > 0\}. \]
Now
\[ xy + xz = xy - x(-z) = \{t \mid t < p - q,\ p \in xy \text{ and } q \in x(-z)\}. \]
We have p = su, q = rv, where s ∈ x, u ∈ y, r ∈ x, v ∈ −z. Thus
\[ xy + xz = xy - x(-z) = \{t \mid t < su - rv\}. \]
Thus we again have x · (y + z) ⊆ xy + xz.
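And again either $\frac{r}{s} < 1$ or $\frac{s}{r} < 1$. Thus either $\frac{s}{r}\, u \in y$ or $\frac{r}{s}\, v \in -z$.
Thus $su - rv = s\left(u - \frac{r}{s}\, v\right) = r\left(\frac{s}{r}\, u - v\right)$, and one is in x · (y + z).
Thus xy + xz ⊆ x · (y + z), and thus xy + xz = x · (y + z).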
ii) Now assume x > 0, y + z < 0. Then
\[
\begin{aligned}
x \cdot (y + z) &= -(x \cdot (-(y + z))) \\
&= -(x(-y - z)) \\
&= -(-xy - xz) = xy + xz.
\end{aligned}
\]
iii) Assume x < 0, y + z > 0. Then
\[
\begin{aligned}
x \cdot (y + z) &= -((-x) \cdot (y + z)) \\
&= -(-xy - xz) = xy + xz.
\end{aligned}
\]
iv) Assume x < 0, y + z < 0. Then
\[
\begin{aligned}
x \cdot (y + z) &= (-x) \cdot (-(y + z)) \\
&= (-x) \cdot (-y - z) \\
&= -x(-y) - x(-z) \\
&= -(x(-y) + x(-z)) \\
&= -(-(xy + xz)) = xy + xz.
\end{aligned}
\]
v) If y + z = 0, then z = −y and we have x · (y + z) = x · 0 = 0 and
xy + xz = xy + x(−y) = xy − xy = 0.
vi) Finally, if x = 0, then
x · (y + z) = 0 · (y + z) = 0 = 0 · y + 0 · z = xy + xz
The Least Upper Bound Property
The real numbers enjoy a property that is not shared by the rational
numbers, known as the least upper bound property.
Definition If X is a set of real numbers and the real number u satisfies the
condition that x ≤ u ∀x ∈ X, then u is said to be an upper bound for X.
Similarly, if the real number l satisfies the condition that x ≥ l ∀x ∈ X,
then l is said to be a lower bound for X. Any set that has an upper bound
is said to be bounded above; if the set has a lower bound it is said to be
bounded below. If a set has both an upper bound and a lower bound, we say
it is bounded.
Definition An upper bound u of a set X that satisfies the condition that
u ≤ v whenever v is also an upper bound of X is said to be the supremum,
or the least upper bound, of X. A similar definition can be made for the
greatest lower bound or infimum.
We can now state and prove the following theorem.
Theorem 8.13 The Supremum Property Every non-empty set of real
numbers bounded above has a supremum.
Proof Let A be a non-empty collection of real numbers that is bounded
above by u. We will show that $\bigcup_{x \in A} x$ is a real number and is the supremum.
1. For any x ∈ A, ∃p ∈ x ⇒ $p \in \bigcup_{x \in A} x$ ⇒ $\bigcup_{x \in A} x \neq \emptyset$. Also x ≤ u ∀x ∈ A and
∃q ∉ u ⇒ q ∉ x ∀x ∈ A ⇒ $q \notin \bigcup_{x \in A} x$ ⇒ $\bigcup_{x \in A} x \neq \mathbb{Q}$.
2. If q < p where $p \in \bigcup_{x \in A} x$ ⇒ p ∈ x for some x ⇒ q ∈ x ⇒ $q \in \bigcup_{x \in A} x$.
3. If $p \in \bigcup_{x \in A} x$ ⇒ p ∈ x for some x ⇒ ∃q > p such that q ∈ x ⇒ $q \in \bigcup_{x \in A} x$.
Thus $\bigcup_{x \in A} x$ is a cut and hence a real number.
Now we show that $\bigcup_{x \in A} x$ is the supremum. If x ∈ A, then $x \subset \bigcup_{x \in A} x$, thus
$x \le \bigcup_{x \in A} x$. Hence $\bigcup_{x \in A} x$ is an upper bound. Now let $y < \bigcup_{x \in A} x$ ⇒ $\exists p \in \bigcup_{x \in A} x$
such that p ∉ y. Since p ∈ x for some x we have y < p < x, thus y is not an
upper bound. Thus if z is an upper bound, we must have $z \ge \bigcup_{x \in A} x$. Thus
$\bigcup_{x \in A} x$ is the supremum.
Exercise Show that $\{q \in \mathbb{Q} \mid q^2 < 2\}$ is bounded above in Q, and does not
have a supremum in Q. We conclude that the rational numbers do not have
the supremum property.
The Cardinality of the Real Numbers
Theorem 7.10 asserts that the integer and rational numbers are countable.
We now investigate the cardinality of the real numbers.
To facilitate our investigation we will develop an alternate representation
of the real numbers. We will show that every real number can be represented
as the sum of integral powers of 2. However, the computation
\[ x = \sum_{i \in \omega} 2^{n-i} = 2^n + 2^{n-1} + \cdots \;\Rightarrow\; 2x = 2^{n+1} + 2^n + \cdots \;\Rightarrow\; x = 2x - x = 2^{n+1} \]
shows that the representation need not be unique. Thus we must take care
not to allow the duplications.
Definition Any function whose domain is an ordinal number is called a
sequence. If the domain is the ordinal number α we say the sequence is an
α-sequence, if the image of a sequence is in a set a, we say that the sequence is
an a-valued α-sequence. We use the notation (sn) to represent the sequence
s : α → s(α) where n ∈ α.
We begin with some terminology.
1. A sequence $k_n$ is decreasing if $k_n < k_m$ whenever n > m.
2. Let $k_n$ be a decreasing sequence of integers. We say $k_n$ is inessential
if there exists N ∈ N such that $k_{n+1} = k_n - 1$ ∀n ≥ N. We say $k_n$ is
essential if it is not inessential.
3. We will say any rational number of the form $2^k$, k ∈ Z, is a binary.
Consider the set of all binary-valued sequences on α ⊆ ω of the form $s_n = 2^{k_n}$
where $k_n$ is an essential decreasing integer-valued sequence, i.e.,
\[ B = \{(2^{k_n}) \mid (k_n) \text{ is an essential decreasing integer-valued sequence}\}. \]
We will construct a bijection, b, between B and the positive real numbers,
i.e. we construct $b : \mathbb{R}^+ \leftrightarrow B$. If $b(x) = (2^{k_n})$, we will say $(2^{k_n})$ is the binary
representation of x.
We construct b by demonstrating how to compute the image of $x \in \mathbb{R}^+$.
x > 0 ⇒ 0 ∈ x ⇒ ∃p > 0, p ∈ x.
Thus
∃k ∈ Z such that $2^k < p$.
Now
∃q ∉ x and ∃m such that $2^{k+m} \ge q$.
Thus
∃n such that $2^{k+n} > x$.
Consider $\{m \mid 2^{k+m} > x\} \subseteq \omega$. Pick the least element, n; thus $2^{k+n} > x$ and
$2^{k+n-1} \le x$, thus $2^{k+n-1}$ is the largest element of the form $2^m$ in x. Set
$k + n - 1 = k_0$.
Now consider $x - 2^{k_0}$. There exists a largest element of the form $2^m$ in
$x - 2^{k_0}$; call that element $2^{k_1}$.
We show $2^{k_1} < 2^{k_0}$.
\[ 2^{k_1} \le x - 2^{k_0} \;\Rightarrow\; 2^{k_1} + 2^{k_0} \le x. \]
If $2^{k_0} \le 2^{k_1}$, then we have
\[ 2^{k_0} + 2^{k_0} \le 2^{k_1} + 2^{k_0} \le x. \]
Thus
\[ 2^{k_0+1} = 2 \cdot 2^{k_0} \le x. \]
Since $2^{k_0}$ is the largest binary in x we have a contradiction, thus $2^{k_1} < 2^{k_0}$.
We pick $2^{k_2}$ to be the largest binary in $x - 2^{k_0} - 2^{k_1} = x - (2^{k_0} + 2^{k_1})$.
Again $2^{k_2} < 2^{k_1}$. We continue in this fashion to construct the sequence. This
construction process defines our map b. We now need to show that b is well
defined, one to one and onto.
To demonstrate that b is well defined we need only show that the maximal
binary in any real number is unique. If $2^m$ and $2^n$ are maximal binaries in x,
then $2^m = 2^n$ ⇒ m = n. Thus the maximal binary is unique and b is well
defined.
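The greedy construction of b can be tried out on rational inputs. The following Python sketch is only an illustration: the names `largest_binary_exponent` and `binary_exponents` and the cutoff `max_terms` are assumptions introduced here, and a positive rational x stands in for the real number it determines.

```python
from fractions import Fraction

def largest_binary_exponent(x):
    """Return the largest integer k with 2**k <= x, for a rational x > 0."""
    k = 0
    while Fraction(2) ** (k + 1) <= x:   # climb while the next binary still fits
        k += 1
    while Fraction(2) ** k > x:          # descend while the current binary is too big
        k -= 1
    return k

def binary_exponents(x, max_terms=20):
    """Greedy exponents k_0 > k_1 > ... with x = sum of 2**k_n (truncated)."""
    remainder = Fraction(x)
    exponents = []
    while remainder > 0 and len(exponents) < max_terms:
        k = largest_binary_exponent(remainder)
        exponents.append(k)
        remainder -= Fraction(2) ** k    # pass to x - 2^{k_0}, then x - 2^{k_0} - 2^{k_1}, ...
    return exponents

print(binary_exponents(Fraction(11, 8)))     # 11/8 = 2^0 + 2^-2 + 2^-3
print(binary_exponents(Fraction(1, 3), 4))   # first four terms of a non-terminating expansion
```

For 1/3 the expansion never terminates, which is why the sketch truncates; the exponents it returns are strictly decreasing, as the argument above requires.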
To show that b is onto, we need to define some notation and prove a
lemma.
Let $(a_n)$ be an α-sequence, where α ⊆ ω.
Let $\sum_{k \in n} a_k = a_0 + a_1 + \cdots + a_n$ for n ∈ α. Then $\left(\sum_{k \in n} a_k\right)$ is an
α-sequence, which we call the sequence of partial sums of $(a_n)$.
Now let $(a_n)$ be an α-sequence of non-negative rational numbers, α ⊂ ω.
Each element of $\left(\sum_{k \in n} a_k\right)$ is a rational number since each sum is finite. Also
$\sum_{k \in n} a_k \le \sum_{k \in m} a_k$ if n < m, since each $a_i \ge 0$.
If the associated sequence of real numbers $\left(\sum_{k \in n} a_k\right)$ is bounded above,
then we define
\[ \sum_{n \in \alpha} a_n = \sup\left\{\sum_{k \in n} a_k\right\}. \]
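Lemma If $(a_n) = (2^{k_n})$ where $k_n \in \mathbb{Z}$, n ∈ α ⊂ ω and $k_n < k_m$ if n > m,
then $\left(\sum_{i \in n} 2^{k_i}\right)$ is bounded above.
Proof
\[ \sum_{i \in n} 2^{k_i} \le \sum_{i \in n} 2^{k_0 - i} \le \sum_{i \in \omega} 2^{k_0 - i} = 2^{k_0+1}. \]
Thus
\[ \sum_{i \in n} 2^{k_i} < 2^{k_0+1}\ \ \forall n \in \alpha \;\Rightarrow\; \sum_{i \in n} 2^{k_i} \le 2^{k_0+1}\ \ \forall n \in \alpha. \]
Thus $\left(\sum_{i \in n} 2^{k_i}\right)$ is bounded above.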
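Now let $(2^{k_n})$ be a sequence where $(k_n)$ is a decreasing essential sequence
of integers. Since $k_n$ is essential, $\sum_{n \in \mathbb{N} - \{0\}} 2^{k_{m+n}} < 2^{k_m}$ ∀m ∈ N. Thus we see
that $(2^{k_n})$ is the binary representation of the real number $\sum 2^{k_n}$.
To show that b is one to one, let $b(x) = (2^{k_n})$ and $b(y) = (2^{j_n})$. If b(x) = b(y), then
$2^{k_n} = 2^{j_n}$ ∀n, and thus $x = \sum 2^{k_n} = \sum 2^{j_n} = y$.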
Theorem 8.14 The real numbers are uncountable.
To prove this theorem we need two lemmas and a corollary.
Lemma 1 The set of all bi-valued ω-sequences is uncountable.
That is, $S = \{s : \omega \to \{0, 1\}\}$ is uncountable.
Proof The demonstration is by contradiction.
Assume there exists a bijection b : ω → S. Define a bi-valued ω-sequence
s by
\[
s(n) = \begin{cases}
0 & \text{if } (b(n))(n) = 1 \\
1 & \text{if } (b(n))(n) = 0.
\end{cases}
\]
Since b is a bijection, ∃ k where b(k) = s. Then we have
\[
s(k) = \begin{cases}
0 & \text{if } (b(k))(k) = s(k) = 1 \\
1 & \text{if } (b(k))(k) = s(k) = 0
\end{cases}
\]
which is a contradiction.
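The diagonal sequence in this proof can be written down directly. The Python sketch below is an illustration under stated assumptions: `diagonal` and `sample_enumeration` are hypothetical names, and `sample_enumeration` is just one arbitrary map from ω into the bi-valued sequences, chosen so there is something to diagonalize against.

```python
def diagonal(b):
    """Given b: n -> (a sequence m -> 0 or 1), return s with s(n) = 1 - b(n)(n)."""
    def s(n):
        return 1 - b(n)(n)
    return s

def sample_enumeration(n):
    """A hypothetical enumeration: b(n) sends m to the n-th binary digit of m."""
    return lambda m: (m >> n) & 1

s = diagonal(sample_enumeration)

# s differs from the k-th listed sequence at position k, so s is not b(k) for any k.
print(all(s(k) != sample_enumeration(k)(k) for k in range(50)))
```

Only finitely many positions are checked here, so this illustrates the construction rather than proving the lemma.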
Lemma 2 The set of all bi-valued ω-sequences where s(k) = 0 for only a
finite number of times is countable.
That is, $A = \{s \mid \exists N \in \omega \text{ such that } s(n) = 1\ \forall n \ge N\}$ is countable.
Proof There is a reasonably obvious bijective map
\[ j : A \leftrightarrow \bigcup_{n \in \omega} \{s : n \to \{0, 1\} \mid n \in \omega\}. \]
Thus $C(A) \le C\left(\bigcup_{n \in \omega} \{s : n \to \{0, 1\} \mid n \in \omega\}\right)$, and the countable union of
finite sets is countable.
Corollary The set of all bi-valued ω-sequences where s(k) = 0 infinitely
often is uncountable.
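Proof of Theorem 8.14
Let $A = \{s : \omega \to \{0, 1\}, \text{ where } s_n = 0 \text{ infinitely often}\}$. There exists a
bijection
\[ b : A \leftrightarrow \left\{\sum_{n \in \omega} s_n \cdot 2^{-n} \;\middle|\; s \in A\right\} \subset \mathbb{R} \]
defined by $b(s) = \sum_{n \in \omega} s_n \cdot 2^{-n}$. Since A is uncountable, so is R.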
Exercise
Show that $\{s : \omega \to \{0, 1\}, \text{ where } s_n = 0 \text{ infinitely often}\} = \{x \mid 0 \le x \le 1\}$.
The Cauchy Sequence Construction of the Real Numbers
We conclude the chapter with a set of exercises that leads to an alternate
construction of the real numbers.
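Definition The function Abs : R → R defined by
\[
\mathrm{Abs}(x) = \begin{cases}
x & \text{if } x \ge 0 \\
-x & \text{if } x < 0
\end{cases}
\]
is the absolute value function. We abbreviate Abs(x) by |x|, i.e. Abs(x) = |x|.
Definition A rational-valued sequence, $s_n$, that satisfies the following:
\[ \forall \varepsilon > 0,\ \varepsilon \in \mathbb{Q},\ \exists N \in \mathbb{N} \text{ such that } n, m > N \Rightarrow |s_n - s_m| < \varepsilon \]
is called a Cauchy Sequence.
We define an equivalence relation on the set of all rational-valued Cauchy
Sequences by
\[ s_n \equiv t_n \text{ iff } \forall \varepsilon > 0\ \exists N \in \mathbb{N} \text{ such that } n \ge N \Rightarrow |s_n - t_n| < \varepsilon. \]
We define an order on the equivalence classes of Cauchy Sequences by
\[ S \ge T \text{ iff } \exists N \in \mathbb{N} \text{ such that } n > N \Rightarrow s_n - t_n \ge 0\ \ \forall (s_n) \in S \text{ and } \forall (t_n) \in T. \]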
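As a concrete instance of the Cauchy condition, the sketch below builds a rational-valued sequence by Newton's iteration for the square root of 2 and checks the inequality on a finite stretch of its tail. The choice of sequence and the names `newton_sqrt2` and `is_cauchy_witness` are assumptions made for illustration; a finite check of this kind is suggestive only and is not a proof that the sequence is Cauchy.

```python
from fractions import Fraction

def newton_sqrt2(n_terms):
    """Rational sequence s_0 = 1, s_{n+1} = (s_n + 2/s_n) / 2."""
    s = [Fraction(1)]
    for _ in range(n_terms - 1):
        s.append((s[-1] + 2 / s[-1]) / 2)
    return s

def is_cauchy_witness(s, N, eps):
    """Check |s_n - s_m| < eps for all computed indices n, m > N."""
    tail = s[N + 1:]
    return all(abs(x - y) < eps for x in tail for y in tail)

s = newton_sqrt2(8)
print(is_cauchy_witness(s, 2, Fraction(1, 1000)))   # N = 2 works for eps = 1/1000
print(is_cauchy_witness(s, 0, Fraction(1, 1000)))   # N = 0 is too small
```

Every term is rational and none squares to 2, so the sequence has no rational limit even though its terms cluster together.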
Exercises
1. Show that the relation as defined above is an equivalence relation.
2. Show that the order defined for the equivalence classes of Cauchy
Sequences is a linear order.
3. Show that the set of equivalence classes of rational-valued Cauchy
sequences is order isomorphic to the Real Numbers.
IX Complex numbers, Quaternions and Octonions
Since the product of two positive real numbers is positive, and the product
of any two negative real numbers is also positive, the equation
$x^2 + 1 = 0$ has no solution in the Real numbers. However by extending the real
numbers to a larger set of numbers we can create solutions to equations such
as the example given above. The construction of this larger set of numbers
from the Real numbers is far easier than the construction of the Reals from
the Rationals.
Complex Numbers
Definition The Complex Numbers, C, is the cartesian product of the
Real Numbers with themselves, R× R, with the following arithmetic.
1. (a, b) + (c, d) = (a + c, b + d).
2. (a, b) · (c, d) = (ac− bd, ad + bc).
The real numbers are embedded into the Complex numbers by the following
injection map
J : x → (x, 0).
Theorem 9.1 The complex numbers are a field.
Proof From the definitions of addition and multiplication, each binary
operation produces an ordered pair of real numbers, hence the
Complex numbers are closed with respect to the binary operations of addition
and multiplication.
(a, b) + (c, d) = (a + c, b + d) = (c + a, d + b) = (c, d) + (a, b), and
(a, b) · (c, d) = (ac − bd, ad + bc) = (ca − db, cb + da) = (c, d) · (a, b). Hence
addition and multiplication are commutative.
((a, b) + (c, d)) + (e, f) = (a + c, b + d) + (e, f) = ((a + c) + e, (b + d) + f) =
(a + (c + e), b + (d + f)) = (a, b) + (c + e, d + f) = (a, b) + ((c, d) + (e, f)),
and ((a, b) · (c, d)) · (e, f) = (ac − bd, ad + bc) · (e, f) = ((ac − bd)e − (ad +
bc)f, (ac − bd)f + (ad + bc)e) = (ace − bde − adf − bcf, acf − bdf + ade + bce) =
(a(ce − df) − b(de + cf), a(cf + de) + b(ce − df)) = (a, b) · (ce − df, cf + de) =
(a, b) · ((c, d) · (e, f)). Hence addition and multiplication are associative.
(a, b) · ((c, d)+(e, f)) = (a, b) · (c+e, d+f) = (ac+ae− bd− bf, ad+af +
bc+ be) = (ac− bd, ad+ bc)+ (ae− bf, af + be) = (a, b) · (c, d)+ (a, b) · (e, f).
Hence multiplication distributes over addition.
(a, b) + (0, 0) = (a + 0, b + 0) = (a, b). Thus (0, 0) is the additive identity.
(a, b)·(1, 0) = (a·1−b·0, a·0+b·1) = (a, b). Thus (1, 0) is the multiplicative
identity.
(a, b)+(−a,−b) = (a−a, b−b) = (0, 0). Thus any complex number (a, b)
has an additive inverse (−a,−b).
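\[
(a, b) \cdot \left(\frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2}\right)
= \left(\frac{a^2}{a^2 + b^2} - \frac{-b^2}{a^2 + b^2},\ \frac{-ab}{a^2 + b^2} + \frac{ab}{a^2 + b^2}\right)
= \left(\frac{a^2 + b^2}{a^2 + b^2},\ \frac{ab - ab}{a^2 + b^2}\right) = (1, 0).
\]
Thus for any complex number (a, b) ≠ (0, 0) the
complex number $\left(\frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2}\right)$ is the multiplicative inverse.
We have thus verified that the complex numbers form a field.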
We note that the product (0, 1) · (0, 1) = (−1, 0). Since (−1, 0) is the
embedding of the real number −1, we have (0, 1) as the square root of −1,
and we have a solution to the equation $x^2 + 1 = 0$.
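The pair arithmetic above is easy to check mechanically. The Python sketch below is a minimal illustration, assuming rational components (via `Fraction`) in place of arbitrary real numbers; the function names `c_add`, `c_mul` and `c_inv` are introduced here and are not from the text.

```python
from fractions import Fraction

def c_add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def c_mul(z, w):
    """(a, b) . (c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def c_inv(z):
    """Multiplicative inverse (a/(a^2+b^2), -b/(a^2+b^2)) of a nonzero pair."""
    a, b = z
    n = a * a + b * b          # nonzero whenever (a, b) != (0, 0)
    return (a / n, -b / n)

i = (Fraction(0), Fraction(1))
z = (Fraction(3), Fraction(4))

print(c_mul(i, i))             # the embedding of -1
print(c_mul(z, c_inv(z)))      # the multiplicative identity (1, 0)
```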
The simplest and standard way to represent Complex numbers is to represent
them as the formal expression a + bi, where $i^2 = (0, 1)(0, 1) = (-1, 0) = -1$.
Now standard arithmetic on binomial expressions yields the appropriate
sums and products.
\[ (a + bi) + (a' + b'i) = (a + a') + (b + b')i \]
\[ (a + bi)(a' + b'i) = aa' + ab'i + a'bi + bb'i^2 = (aa' - bb') + (ab' + a'b)i. \]
For the complex number (a, b) we say (a,−b) is its complex conjugate.
We denote the complex conjugate of a complex number c by c∗, thus if
c = (a, b), then c∗ = (a, b)∗ = (a,−b).
Definition For $x = (a_1, a_2, \cdots, a_n) \in \mathbb{R}^n = \Pi_{i \in n} \mathbb{R}$, the real value
\[ \sqrt{\sum_{i=1}^{n} a_i^2} \]
is the norm of x. We say that the norm of a complex number is the norm
of the pair of real numbers that represents it.
We see that for any complex number c, we have $c \cdot c^* = (a + bi)(a - bi) =
a^2 + b^2$ which is the norm squared of c.
Quaternions
Definition The Quaternions is the set C × C with the following arithmetic:
1. (a, b) + (c, d) = (a + c, b + d)
2. (a, b) · (c, d) = (ac− d∗b, ad + bc∗).
We designate the Quaternions with the blackboard bold face capital H, $\mathbb{H}$.
Let $h \in \mathbb{H}$, then h = (a + bi, c + di). But again it is common practice to
write h = a + bi + cj + dk.
We leave it to the reader to verify $i^2 = j^2 = k^2 = -1$.
We also leave it to the reader to verify ij = −ji = k, jk = −kj =
i, ki = −ik = j. Thus we see ij ≠ ji, hence quaternions fail to have the
commutative property.
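The identities left to the reader can be checked numerically from the product rule given in the definition. The sketch below is an illustration only: it represents a quaternion as a pair of Python `complex` values, and the names `q_mul`, `q_neg`, `one`, `i`, `j`, `k` are assumptions introduced here, with i = (i, 0), j = (0, 1) and k = (0, i) read off from h = (a + bi, c + di) = a + bi + cj + dk.

```python
def q_mul(p, q):
    """(a, b)(c, d) = (ac - d*b, ad + bc*), with * the complex conjugate."""
    (a, b), (c, d) = p, q
    return (a * c - d.conjugate() * b, a * d + b * c.conjugate())

def q_neg(p):
    """Additive inverse of a pair."""
    return (-p[0], -p[1])

one = (1 + 0j, 0j)
i = (1j, 0j)
j = (0j, 1 + 0j)
k = (0j, 1j)

print(q_mul(i, i) == q_neg(one), q_mul(j, j) == q_neg(one), q_mul(k, k) == q_neg(one))
print(q_mul(i, j) == k, q_mul(j, i) == q_neg(k))   # ij = k but ji = -k
```

The components used here are small integers stored as floats, so the equality tests are exact; for general floating-point components a tolerance would be needed.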
Octonions
The quaternion conjugate of a quaternion h = (a, b) is h∗ = (a∗,−b).
Definition The Octonions is the set H×H with the following arithmetic:
1. (a, b) + (c, d) = (a + c, b + d)
2. (a, b) · (c, d) = (ac− d∗b, ad + bc∗).
We designate the Octonions with a bold face capital K, $\mathbf{K}$.
Again it is common practice to designate Octonions by $k = a + be_1 +
ce_2 + de_3 + ee_4 + fe_5 + ge_6 + he_7$.
We leave it again to the reader to verify $e_n^2 = -1$ and that for n ≠ m we
have anticommutativity, $e_n e_m = -e_m e_n$.
We could continue constructing numbers in this fashion but more algebraic
properties fail. In particular, in the next construction not every nonzero
number has a multiplicative inverse.
X TRANSFINITE AND INFINITESIMAL NUMBERS
In his development of the Calculus, Isaac Newton introduced the concept
of a “fluxion”, a number that was smaller than any positive number and yet
was greater than zero. This concept seemed to be contrary to the Archimedean
principle, and led one of the critics of early analysis, Bishop Berkeley, to
comment “And what are these fluxions? The velocities of evanescent incre-
ments. And what are these same evanescent increments? They are neither
finite quantities, nor quantities infinitely small, nor yet nothing. May we
not call them ghosts of departed quantities?” At this time in history, when
Newton’s genius is greatly admired we may be inclined to dismiss Bishop
Berkeley’s comments as those of a pompous fool. However this is not true,
his criticisms were well founded as the rigorous structure of Newton’s Calculus
had not been well established. Criticism such as Bishop Berkeley’s is
good for the development of any scholarly field of study, as it forces those
involved in that study to construct a firm basis for their claims. Of course
the concept of fluxions has been replaced with the limit, and all seems well
with the world, and Calculus.
With Cantor’s development of transfinite ordinal and cardinal numbers
it has become possible to make the concept of fluxions rigorous, and without
contradicting Archimedes. We will not use the term fluxion, but will call such
numbers infinitesimal. In this chapter we will construct transfinite Integers,
Rational and Real Numbers. The construction will be a simple extension of
our development of finite numbers.
We begin our construction of transfinite and infinitesimal numbers by noting that
our construction of the integers, rational and real numbers used the ordinal
number ω as a basis on which to build the appropriate collection of sets that
would be our numbers. An attempt to use a larger ordinal than ω will fail, as
ordinal arithmetic is not commutative, and commutativity is crucial in the
construction of the equivalence classes that represent the different numbers.
Consider the equivalence relation defined on ω×ω that yielded the integers.
(a, b) ≡ (c, d) ⇔ a + d = b + c.
This relation is not an equivalence relation on ordinal numbers greater than
ω as (ω, 1) ≢ (ω, 1) since ω + 1 ≠ 1 + ω, hence the relation does not satisfy
the reflexive property.
Transfinite Arithmetic
Our choices of binary operations were made from our experiences with tangible
collections of objects (e.g. boxes of apples), and these tangible collections
must be finite by their very nature. The extension of these binary operations
to abstract and infinite sets, although well motivated is quite arbitrary. So
our strategy is to define a new arithmetic, that agrees with ordinal arithmetic
on finite sets, but satisfies the appropriate field axioms on infinite sets. As
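noted in Chapter 3, the ordinal number $\omega^\omega$ can be expressed as
\[ \left\{ x \;\middle|\; x = \sum_{i=0}^{n} \omega^i \alpha_i \text{ where } n, \alpha_i \in \omega \right\}. \]
Thus every element of $\omega^\omega$ is of the form of a polynomial of indeterminate ω
with coefficients and exponents in ω. We thus define an arithmetic on $\omega^\omega$ to
be polynomial arithmetic:
\[ \sum_{i=0}^{n} \omega^i \alpha_i + \sum_{i=0}^{m} \omega^i \beta_i = \sum_{i=0}^{\max\{n, m\}} \omega^i (\alpha_i + \beta_i) \]
and
\[ \sum_{i=0}^{n} \omega^i \alpha_i \cdot \sum_{j=0}^{m} \omega^j \beta_j = \sum_{i=0}^{n} \sum_{j=0}^{m} \omega^{i+j} (\alpha_i \cdot \beta_j), \]
where the arithmetic on ω is ordinal arithmetic.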
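A small sketch may make the polynomial arithmetic concrete. It assumes a representation chosen here for illustration: an element of $\omega^\omega$ is stored as a Python list of natural-number coefficients, with entry i holding the coefficient $\alpha_i$ of $\omega^i$. The names `poly_add`, `poly_mul`, `omega` and `one` are hypothetical.

```python
def poly_add(p, q):
    """Add coefficient lists termwise, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [x + y for x, y in zip(p, q)]

def poly_mul(p, q):
    """Multiply coefficient lists: the omega^(i+j) term collects alpha_i * beta_j."""
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

omega = [0, 1]   # omega^1 with coefficient 1
one = [1]        # omega^0 with coefficient 1

print(poly_add(omega, one) == poly_add(one, omega))   # addition commutes here
print(poly_mul(poly_add(omega, one), omega))          # (omega + 1) * omega
```

In contrast with ordinal arithmetic, where 1 + ω = ω ≠ ω + 1, both orders of addition give the same coefficient list, which is the commutativity the construction needs.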
We may now easily verify that these two operations are commutative
and associative, and that multiplication distributes over addition. Thus we
may repeat our construction of integers, rational and real numbers with this
arithmetic to form transfinite integers, transfinite rational, and transfinite
real numbers. Both the transfinite rational numbers and transfinite real
numbers contain elements that satisfy the conditions of Newton’s fluxions.
We will call those numbers infinitesimal.
If we let n ∈ ω be an arbitrary element, and we let ω represent the transfinite
integer (ω, 0), 1 represent the transfinite integer (1, 0) and n represent
the transfinite integer (n, 0), we observe that 1 · n < 1 · ω ⇒ [1, ω] < [1, n],
where [1, ω] and [1, n] are transfinite rational numbers. Since the restriction
of the construction of transfinite rational numbers is the construction of the
rational numbers, we may say that [1, n] is a rational number. Thus for any
positive real number ε, by the Archimedean property there exists an integer
n such that [1, n] < ε, and since [1, ω] < [1, n] we have [1, ω] < ε. And thus
[1, ω] satisfies the condition of Newton’s fluxions.
Transfinite Numbers
We define the ω-Transfinite Natural Numbers, $N_\omega$, to be $\omega^\omega$ with
polynomial arithmetic. We define the ω-Transfinite Integers, $Z_\omega$, to be
the collection of equivalence classes
\[ \{[x, y] \mid x, y \in N_\omega\} \text{ where } (x, y) \equiv (z, w) \Leftrightarrow x + w = y + z. \]
We define the ω-Transfinite Rational numbers, $Q_\omega$, to be the collection
of equivalence classes
\[ \{[x, y] \mid x, y \in Z_\omega\} \text{ where } (x, y) \equiv (z, w) \Leftrightarrow xw = yz. \]
And we define the ω-Transfinite Real Numbers, R, to be the collection
of all Dedekind cuts of transfinite rational numbers.
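We now observe that the ordinal number $\omega^{(\omega^\omega)}$ can be expressed as
\[ \left\{ x \;\middle|\; x = \sum_{i=0}^{n} (\omega^\omega)^i \alpha_i \text{ where } n, \alpha_i \in \omega^\omega \right\}. \]
Thus every element of $\omega^{(\omega^\omega)}$ is of the form of a polynomial of indeterminate $\omega^\omega$
with coefficients and exponents in $\omega^\omega$. We again define arithmetic as polynomial
arithmetic where the arithmetic on $\omega^\omega$ is the arithmetic previously
defined.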
We now define the $\omega^\omega$-Transfinite Natural Numbers, $\omega^\omega$-Transfinite Integers,
$\omega^\omega$-Transfinite Rational Numbers, and $\omega^\omega$-Transfinite Real Numbers
in the analogous fashion. We may continue indefinitely constructing Transfinite
numbers in this fashion. We thus define any number constructed in
this fashion to be a Transfinite Number. Thus a Transfinite Integer is an
α-Transfinite Integer for some appropriate ordinal number α, and similarly
for Transfinite Natural Numbers, Transfinite Rational Numbers, and
Transfinite Real numbers.
We will adopt the convention, that when referring to an element of the
transfinite real numbers that is the embedded image of either a transfinite
natural number, transfinite integer, or transfinite rational number, we will
simply refer to that number as being an element of that respective embedded
set, that is, a transfinite natural number, transfinite integer or transfinite
rational number.
Let ξ be an arbitrary transfinite real number. Consider the cut $\{\alpha \mid \alpha^2 < \xi\}$.
We may consider this cut to represent the transfinite real number $\sqrt{\xi}$.
Questions What is $\sqrt{\omega}$? What is $\sqrt{\omega + 1}$? What is $\sqrt[3]{\omega}$?
XI SURREAL NUMBERS
The origin of Surreal numbers is credited to John Conway, however the
name was coined by Don Knuth. For consistency we may use the symbol S
for surreal numbers, however the need to do so seldom arises.
The Constructive Definition of Surreal Numbers
John Conway defined Surreal numbers by formulating two simple rules
for the construction of Surreal numbers plus the definitions for addition and
multiplication. With these two rules and two definitions, a collection of num-
bers is constructed that includes all Real numbers, and all Ordinal numbers.
The collection although not a set, but rather a proper class, forms a field.
Rule 1: Every number is represented by a pair of sets of previously con-
structed numbers, a left set and a right set, where no number in the left set
is greater than or equal to any number of the right set.
Rule 2: A number, a, is less than or equal to a number, b, if and only if
no member of a’s left set is greater than or equal to b, and no member of b’s
right set is less than or equal to a.
The first rule tells us how to construct new Surreal numbers from previ-
ously constructed numbers. The second rule defines the order relation of the
collection of surreal numbers that is necessary for the construction.
We will develop the surreal numbers using an alternate definition, that
uses some of the principles already developed in Set Theory.
The Function Definition of Surreal Numbers
Definition A Surreal number is a function from an ordinal number to a
two point space. The two point space is designated by {+, −}. The domain
of a surreal number will be called its length.
The function from the ordinal number 0 = ∅ is of course the empty
function, ∅, and is considered to be a surreal number, and we call it 0.
Example 5 = {0, 1, 2, 3, 4}. Thus
0 1 2 3 4
↓ ↓ ↓ ↓ ↓
+ + − + −
is a surreal number.
Recall that we defined a sequence as a function whose domain is an ordinal
number. If the domain of a sequence is the ordinal number α then we say
that the sequence is an α-sequence.
Thus a surreal number, a, is a binary valued α-sequence for some ordinal
α, and its length is α, which we indicate by l(a) = α. The 0-sequence is of
course the empty set.
Definition If a is an α-sequence, and b is a β-sequence, such that a ∩ b = b,
then b is an initial segment of a. If b ≠ a, then b is a proper initial
segment. We see that if b is a proper initial segment of a, then l(b) < l(a).
Two surreal numbers are equal if they are equal as sets.
We define a linear order on the class of surreal numbers by the following:
Let a and b be surreal numbers, and let c be the maximal initial segment
that is in both a and b, where c is a γ-sequence. We say a > b if
a(γ) = + and b(γ) = −, or
a(γ) = + and b(γ) is undefined, or
a(γ) is undefined and b(γ) = −.
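For surreal numbers of finite length the order just defined can be computed directly. The sketch below assumes a representation chosen for illustration: a surreal number is a Python string over the characters "+" and "-", so that its length is the length of the string. The names `sign_at` and `greater` are hypothetical.

```python
def sign_at(a, gamma):
    """Return a(gamma), or None when gamma is outside the domain of a."""
    return a[gamma] if gamma < len(a) else None

def greater(a, b):
    """Return True when a > b in the order defined above."""
    gamma = 0
    # gamma ends up as the length of the maximal common initial segment c
    while gamma < len(a) and gamma < len(b) and a[gamma] == b[gamma]:
        gamma += 1
    x, y = sign_at(a, gamma), sign_at(b, gamma)
    return ((x == "+" and y == "-")
            or (x == "+" and y is None)
            or (x is None and y == "-"))

print(greater("+", ""), greater("", "-"))     # (+) > 0 and 0 > (-)
print(greater("+-+", "+-"), greater("+-", "+"))
```

When a and b are equal, both signs at γ are undefined and the function returns False in both directions, as expected of a strict order.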
The Canonical Representation of Surreal Numbers
Let a be a surreal number. If $A' = \{a' \mid a' \text{ is an initial segment of } a \text{ and } a' < a\}$
and $A'' = \{a'' \mid a'' \text{ is an initial segment of } a \text{ and } a < a''\}$, then we
say $\{A' \mid A''\}$ is the canonical representation of a.
Example $(+ + - - +) = 1\tfrac{3}{8} = \{0, 1, 1\tfrac{1}{4} \mid 2, 1\tfrac{1}{2}\}$
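The example can be reproduced for finite sign sequences. The sketch below relies on an assumption not derived in the text: the usual reading of a finite sign sequence as a dyadic rational, in which each sign in the initial run counts ±1 and, after the first change of sign, each further sign adds or subtracts half the previous step. The text's example (+ + − − +) = 1 3/8 agrees with this reading. The names `value` and `canonical` are hypothetical, and sequences are stored as strings over "+" and "-".

```python
from fractions import Fraction

def value(signs):
    """Dyadic rational named by a finite sign sequence (assumed reading, see above)."""
    total, step, changed = Fraction(0), Fraction(1), False
    for n, s in enumerate(signs):
        if n > 0 and s != signs[0]:
            changed = True             # the first sign change has occurred
        if changed:
            step /= 2                  # from here on, every step is halved
        total += step if s == "+" else -step
    return total

def canonical(signs):
    """Values of the proper initial segments below and above the number itself."""
    v = value(signs)
    segments = [signs[:n] for n in range(len(signs))]
    lower = [value(c) for c in segments if value(c) < v]
    upper = [value(c) for c in segments if value(c) > v]
    return lower, upper

print(value("++--+"))        # 11/8, i.e. 1 3/8
print(canonical("++--+"))    # lower set 0, 1, 5/4 and upper set 2, 3/2
```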
Addition of Surreal Numbers
We define addition in the following way.
If $a = \{A' \mid A''\}$ and $b = \{B' \mid B''\}$ are in canonical form, then
\[ a + b = \{a + b',\ b + a' \mid a + b'',\ b + a''\} \]
$\forall\, a' \in A',\ b' \in B',\ a'' \in A'',\ b'' \in B''$.
We now verify that the elements of the left set are truly less than the
elements of the right set. But first we must verify that addition is commu-
tative.
\[ a + b = \{a + b',\ b + a' \mid a + b'',\ b + a''\} = \{b + a',\ a + b' \mid b + a'',\ a + b''\} = b + a \]
To complete the verification we induct on the ordinal sum of the lengths
of a and b. We make the inductive hypothesis that if l(c) + l(d) < l(a) + l(b),
l(c′) + l(d′) < l(a) + l(b), c < c′, and d < d′, then c + d < c′ + d′.
We have a′ < a < a′′, b′ < b < b′′. Thus
a + b′ < a + b′′, a + b′ < a′′ + b = b + a′′
and
b + a′ < b + a′′, b + a′ < b′′ + a = a + b′′.
Thus a + b does represent a Surreal number. However we notice that a + b is
not in canonical form. Is it well defined?
If A and B are sets of surreal numbers such that a < b ∀a ∈ A and b ∈ B,
then {A|B} is the “first” surreal number c, such that a < c < b ∀a ∈ A and b ∈ B.
That is, c is the surreal number with the minimal domain such that
a < c < b ∀a and ∀b. Hence the sum is well defined.
The way to think of why this is true is this way. Start at 0 and try to
get into the gap between the two sets as quickly as possible. If there is an
element of the lower set greater or equal to zero step towards the upper set,
if the opposite is true step toward the lower set. If you arrive between the
two sets then you have defined that “youngest” surreal number. If you are
still “amid” one of the two sets, step towards the other set and continue until
you arrive between them. If at any stage you were to step away from the set
in which you are not amid, then you can never arrive between the sets, thus
there is only one way to arrive between the sets, and when you first get there
you stop.
Example Compute 1+1.
To add 1+1 we need to know 0+1, to add 0+1 we need to know 0+0.
We have 0 = {∅|∅}, thus 0 + 0 = {0 + b′, 0 + a′ | 0 + b′′, 0 + a′′}, where
a′, a′′, b′, b′′ ∈ ∅ →←. Since there are no elements in the empty set to add to
0, each of those sums does not exist and thus
0 + 0 = {∅|∅} = 0.
We now have 1 = {0|∅}, thus
0 + 1 = {0 + 0, 1 + a′ | 0 + b′′, 1 + a′′} where a′, a′′, b′′ ∈ ∅ →← .
Thus we have 0 + 1 = {0|∅} = 1. Finally
1 + 1 = {1 + 0, 1 + 0 | 1 + b′′, 1 + a′′} where a′′, b′′ ∈ ∅ →← .
Thus 1 + 1 = {1|∅} = 2.
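The recursion in this example, together with the "walk from 0 into the gap" description of {A|B}, can be sketched for surreal numbers of finite length. Everything here is an illustration under stated assumptions: numbers are strings over "+" and "-", `value` uses the usual dyadic reading of a finite sign sequence (not derived in the text), comparisons are made on those values, and the names `value`, `options`, `simplest` and `add` are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

def value(signs):
    """Dyadic rational named by a finite sign sequence (assumed reading)."""
    total, step, changed = Fraction(0), Fraction(1), False
    for n, s in enumerate(signs):
        if n > 0 and s != signs[0]:
            changed = True
        if changed:
            step /= 2
        total += step if s == "+" else -step
    return total

def options(signs):
    """Proper initial segments of `signs`, split into those below and those above it."""
    v = value(signs)
    segments = [signs[:n] for n in range(len(signs))]
    return ([c for c in segments if value(c) < v],
            [c for c in segments if value(c) > v])

def simplest(lower, upper):
    """Walk from 0 into the gap between two finite sets of values; stop on arrival."""
    c = ""
    while True:
        v = value(c)
        if any(l >= v for l in lower):
            c += "+"          # still at or below the lower set: step up
        elif any(u <= v for u in upper):
            c += "-"          # still at or above the upper set: step down
        else:
            return c

@lru_cache(maxsize=None)
def add(a, b):
    """a + b = {a + b', b + a' | a + b'', b + a''}, computed recursively."""
    a_lo, a_hi = options(a)
    b_lo, b_hi = options(b)
    lower = [value(add(a, y)) for y in b_lo] + [value(add(b, x)) for x in a_lo]
    upper = [value(add(a, y)) for y in b_hi] + [value(add(b, x)) for x in a_hi]
    return simplest(lower, upper)

print(add("+", "+"))              # 1 + 1 as a sign sequence
print(value(add("+-", "+-")))     # 1/2 + 1/2
```

The recursion terminates because every option is a proper initial segment, so the total length of the arguments strictly decreases.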
We can see that 0 = {∅|∅} is the additive identity, since if we let a = {A′|A′′},
then
a + 0 = {0 + a′ | 0 + a′′} = a.
Now we can define the additive inverses of surreal numbers.
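Let a be a surreal number. Define −a as:
\[
-a(\alpha) = \begin{cases}
+ & \text{if } a(\alpha) = - \\
- & \text{if } a(\alpha) = +.
\end{cases}
\]
Example Let $a = (+ - +) = \tfrac{3}{4} = \{0, \tfrac{1}{2} \mid 1\}$,
thus $-a = (- + -) = -\tfrac{3}{4} = \{-1 \mid 0, -\tfrac{1}{2}\}$, and
\[ a + (-a) = \{-\tfrac{1}{4}, -\tfrac{3}{4} \mid \tfrac{3}{4}, \tfrac{1}{4}\} = 0. \]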
The canonical representation of the opposite of a given surreal number
a = {A′|A′′} has the opposites of A′′ as its lower set and the opposites of A′
as its upper set. Thus if a = {A′|A′′}, then −a = {−A′′ | −A′} where −A′′ =
{−a′′ | a′′ ∈ A′′} and −A′ = {−a′ | a′ ∈ A′}.
Now a + (−a) = {a + (−a′′), (−a) + a′ | a + (−a′), −a + a′′}.
Since a′ < a < a′′ ⇒ −a′′ < −a < −a′, we have
a + (−a′′) < a′′ + (−a′′) = 0
−a + (a′) < a′ + (−a′) = 0
a + (−a′) > a′ + (−a′) = 0
−a + a′′ > −a′′ + a′′ = 0.
Thus a + (−a) = 0.
We can verify that addition of surreal numbers is associative and thus that the
surreal numbers satisfy the axioms for an abelian group under addition. Let
a = {A′|A′′}, b = {B′|B′′}, and c = {C′|C′′}.
\[
\begin{aligned}
(a + b) + c &= \{a + b',\ b + a' \mid a + b'',\ b + a''\} + c \\
&= \{(a + b) + c',\ c + (a + b'),\ c + (b + a') \mid (a + b) + c'',\ c + (a + b''),\ c + (b + a'')\} \\
&= \{a + (b + c'),\ a + (c + b'),\ (b + c) + a' \mid a + (b + c''),\ a + (c + b''),\ (b + c) + a''\} \\
&= a + \{b + c',\ c + b' \mid b + c'',\ c + b''\} \\
&= a + (b + c).
\end{aligned}
\]
The Multiplication of Surreal Numbers
We now define multiplication.
As a motivation for the following definition consider real numbers a, b, c, d
where a < b, and c < d. We then have b− a > 0 and d− c > 0. Thus
(b−a)(d−c) > 0 ⇒ bd+ac−ad−bc > 0 ⇒ bd > ad+bc−ac, and bc < bd+ac−ad.
Definition Let $a = \{A' \mid A''\}$, $b = \{B' \mid B''\}$, then
\[ ab = \{a'b + ab' - a'b',\ a''b + ab'' - a''b'' \mid a'b + ab'' - a'b'',\ a''b + ab' - a''b'\} \]
where $a' \in A'$, $b' \in B'$, $a'' \in A''$ and $b'' \in B''$.
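The definition can be exercised on surreal numbers of finite length. The sketch below is an illustration under several assumptions: numbers are strings over "+" and "-", `value` is the usual dyadic reading of a finite sign sequence (not derived in the text), the sums and differences inside each option are taken with ordinary rational arithmetic on those values rather than with surreal addition, and {L|R} is found by walking from 0 into the gap. The names `value`, `options`, `simplest` and `mul` are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

def value(signs):
    """Dyadic rational named by a finite sign sequence (assumed reading)."""
    total, step, changed = Fraction(0), Fraction(1), False
    for n, s in enumerate(signs):
        if n > 0 and s != signs[0]:
            changed = True
        if changed:
            step /= 2
        total += step if s == "+" else -step
    return total

def options(signs):
    """Proper initial segments of `signs`, split into those below and those above it."""
    v = value(signs)
    segments = [signs[:n] for n in range(len(signs))]
    return ([c for c in segments if value(c) < v],
            [c for c in segments if value(c) > v])

def simplest(lower, upper):
    """Shortest sign sequence whose value lies strictly between the two sets."""
    c = ""
    while True:
        v = value(c)
        if any(l >= v for l in lower):
            c += "+"
        elif any(u <= v for u in upper):
            c += "-"
        else:
            return c

@lru_cache(maxsize=None)
def mul(a, b):
    """ab = {a'b + ab' - a'b', a''b + ab'' - a''b'' | a'b + ab'' - a'b'', a''b + ab' - a''b'}."""
    a_lo, a_hi = options(a)
    b_lo, b_hi = options(b)

    def term(x, y):
        # x is an option of a and y an option of b: x*b + a*y - x*y
        return value(mul(x, b)) + value(mul(a, y)) - value(mul(x, y))

    lower = [term(x, y) for x in a_lo for y in b_lo] + [term(x, y) for x in a_hi for y in b_hi]
    upper = [term(x, y) for x in a_lo for y in b_hi] + [term(x, y) for x in a_hi for y in b_lo]
    return simplest(lower, upper)

print(value(mul("+-", "+-")))    # 1/2 * 1/2
print(mul("++", "++"))           # 2 * 2 as a sign sequence
```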
Exercise Consider $\varepsilon = (+ - - - \cdots) = \{0 \mid 1, \tfrac{1}{2}, \tfrac{1}{4}, \cdots\}$ and
$\omega = (+ \cdots) = \{0, 1, \cdots \mid \emptyset\}$. Find ε · ω.
The Field Properties of Surreal Numbers
We have confirmed that Surreal numbers with the binary operation of
addition form an abelian group. We now want to complete our verification
that Surreal numbers with the operations of addition and multiplication form
a field. Of course we must verify that the definition of multiplication does
indeed define a Surreal number, to do that we must first verify the properties
of commutativity and associativity of multiplication, and the distributive
property.
We again make the inductive hypothesis that the inequalities necessary
are valid for the product of surreal numbers whose sum of lengths is less than
the sum of the lengths of a and b, and thus the definition of multiplication
will be well defined.
Let a = {A′|A′′}, b = {B′|B′′} and c = {C′|C′′}. We now have
\[
\begin{aligned}
ab &= \{a'b + ab' - a'b',\ a''b + ab'' - a''b'' \mid a'b + ab'' - a'b'',\ a''b + ab' - a''b'\} \\
&= \{b'a + ba' - b'a',\ b''a + ba'' - b''a'' \mid b'a + ba'' - b'a'',\ b''a + ba' - b''a'\} = ba
\end{aligned}
\]
where a′ ∈ A′, a′′ ∈ A′′, b′ ∈ B′, and b′′ ∈ B′′. Thus multiplication is
commutative.
\[
\begin{aligned}
(ab)c = \{\, & (ab)'c + (ab)c' - (ab)'c',\ (ab)''c + (ab)c'' - (ab)''c'' \mid \\
& (ab)'c + (ab)c'' - (ab)'c'',\ (ab)''c + (ab)c' - (ab)''c' \,\} \\
= \{\, & (a'b + ab' - a'b')c + abc' - (a'b + ab' - a'b')c', \\
& (a''b + ab'' - a''b'')c + abc' - (a''b + ab'' - a''b'')c', \\
& (a'b + ab'' - a'b'')c + abc'' - (a'b + ab'' - a'b'')c'', \\
& (a''b + ab' - a''b')c + abc'' - (a''b + ab' - a''b')c'' \mid \\
& (a'b + ab' - a'b')c + abc'' - (a'b + ab' - a'b')c'', \\
& (a''b + ab'' - a''b'')c + abc'' - (a''b + ab'' - a''b'')c'', \\
& (a'b + ab'' - a'b'')c + abc' - (a'b + ab'' - a'b'')c', \\
& (a''b + ab' - a''b')c + abc' - (a''b + ab' - a''b')c' \,\} \\
= \{\, & a'bc + ab'c - a'b'c + abc' - a'bc' - ab'c' + a'b'c', \\
& a''bc + ab''c - a''b''c + abc' - a''bc' - ab''c' + a''b''c', \\
& a'bc + ab''c - a'b''c + abc'' - a'bc'' - ab''c'' + a'b''c'', \\
& a''bc + ab'c - a''b'c + abc'' - a''bc'' - ab'c'' + a''b'c'' \mid \\
& a'bc + ab'c - a'b'c + abc'' - a'bc'' - ab'c'' + a'b'c'', \\
& a''bc + ab''c - a''b''c + abc'' - a''bc'' - ab''c'' + a''b''c'', \\
& a'bc + ab''c - a'b''c + abc' - a'bc' - ab''c' + a'b''c', \\
& a''bc + ab'c - a''b'c + abc' - a''bc' - ab'c' + a''b'c' \,\} \\
= \{\, & a'(bc) + a(b'c + bc' - b'c') - a'(b'c + bc' - b'c'), \\
& a''(bc) + a(b''c + bc' - b''c') - a''(b''c + bc' - b''c'), \\
& a'(bc) + a(b''c + bc'' - b''c'') - a'(b''c + bc'' - b''c''), \\
& a''(bc) + a(b'c + bc'' - b'c'') - a''(b'c + bc'' - b'c'') \mid \\
& a'(bc) + a(b'c + bc'' - b'c'') - a'(b'c + bc'' - b'c''), \\
& a''(bc) + a(b''c + bc'' - b''c'') - a''(b''c + bc'' - b''c''), \\
& a'(bc) + a(b''c + bc' - b''c') - a'(b''c + bc' - b''c'), \\
& a''(bc) + a(b'c + bc' - b'c') - a''(b'c + bc' - b'c') \,\} \\
= \{\, & a'(bc) + a(bc)' - a'(bc)',\ a''(bc) + a(bc)'' - a''(bc)'' \mid \\
& a'(bc) + a(bc)'' - a'(bc)'',\ a''(bc) + a(bc)' - a''(bc)' \,\} = a(bc)
\end{aligned}
\]
where a′ ∈ A′, a′′ ∈ A′′, b′ ∈ B′, b′′ ∈ B′′, c′ ∈ C′, and c′′ ∈ C′′. Thus
multiplication is associative.
\[
\begin{aligned}
a(b + c) ={}& \{\, a'(b + c) + a(b + c)' - a'(b + c)',\ a''(b + c) + a(b + c)'' - a''(b + c)'' \\
& \mid a'(b + c) + a(b + c)'' - a'(b + c)'',\ a''(b + c) + a(b + c)' - a''(b + c)' \,\} \\
={}& \{\, a'(b + c) + a(b + c') - a'(b + c'),\ a'(b + c) + a(b' + c) - a'(b' + c), \\
& \quad a''(b + c) + a(b + c'') - a''(b + c''),\ a''(b + c) + a(b'' + c) - a''(b'' + c) \\
& \mid a'(b + c) + a(b + c'') - a'(b + c''),\ a'(b + c) + a(b'' + c) - a'(b'' + c), \\
& \quad a''(b + c) + a(b + c') - a''(b + c'),\ a''(b + c) + a(b' + c) - a''(b' + c) \,\} \\
={}& \{\, a'b + a'c + ab + ac' - a'b - a'c',\ a'b + a'c + ab' + ac - a'b' - a'c, \\
& \quad a''b + a''c + ab + ac'' - a''b - a''c'',\ a''b + a''c + ab'' + ac - a''b'' - a''c \\
& \mid a'b + a'c + ab + ac'' - a'b - a'c'',\ a'b + a'c + ab'' + ac - a'b'' - a'c, \\
& \quad a''b + a''c + ab + ac' - a''b - a''c',\ a''b + a''c + ab' + ac - a''b' - a''c \,\} \\
={}& \{\, ab + a'c + ac' - a'c',\ ac + a'b + ab' - a'b', \\
& \quad ab + a''c + ac'' - a''c'',\ ac + a''b + ab'' - a''b'' \\
& \mid ab + a'c + ac'' - a'c'',\ ac + a'b + ab'' - a'b'', \\
& \quad ab + a''c + ac' - a''c',\ ac + a''b + ab' - a''b' \,\} \\
={}& \{\, ab + (ac)',\ (ab)' + ac \mid ab + (ac)'',\ (ab)'' + ac \,\} = ab + ac,
\end{aligned}
\]
where $a' \in A'$, $a'' \in A''$, $b' \in B'$, $b'' \in B''$, $c' \in C'$, and $c'' \in C''$, and
where the expansion of each option applies the distributive law inductively.
Thus multiplication distributes over addition.
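We now verify that the surreal number $1 = \{0 \mid\ \}$ is the multiplicative
identity. First we must verify that any surreal number multiplied by 0 is
0. Let $a = \{A' \mid A''\}$ and recall that $0 = \{\ \mid\ \}$. Taking $b = 0$ in the
definition of the product, we have
\[
a \cdot 0 = \{\, a'b + ab' - a'b',\ a''b + ab'' - a''b'' \mid a'b + ab'' - a'b'',\ a''b + ab' - a''b' \,\},
\]
where $a' \in A'$, $b' \in \emptyset$, $a'' \in A''$ and $b'' \in \emptyset$.
Since there are no elements in $\emptyset$, and every option involves either $b'$
or $b''$, we conclude that the left and right sets of the product are empty
and thus the product is the surreal number 0.
Now consider $a \cdot 1$. The number 1 has the single left option $b' = 0$ and
no right options, so
\[
a \cdot 1 = \{\, a' \cdot 1 + a \cdot 0 - a' \cdot 0 \mid a'' \cdot 1 + a \cdot 0 - a'' \cdot 0 \,\} = \{\, a' \mid a'' \,\} = a,
\]
where the second equality uses $a \cdot 0 = 0$ together with the inductive
hypothesis $a' \cdot 1 = a'$ and $a'' \cdot 1 = a''$.
Hence 1 is the multiplicative identity.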
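The definitions above can be checked mechanically on a few small numbers. The following is a minimal Python sketch, not part of the original development: it assumes every left and right set is finite, and the names `Surreal`, `le`, `equal`, `neg`, `add`, `mul` and the constants are hypothetical names chosen for this illustration. The order, negation and addition used here are the standard recursive ones for surreal numbers and are assumed to agree with the definitions given earlier in the text.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Surreal:
    """A surreal number {left | right} with finite sets of options (an assumption)."""
    left: frozenset = frozenset()
    right: frozenset = frozenset()


@lru_cache(maxsize=None)
def le(x, y):
    """x <= y iff no left option of x is >= y and no right option of y is <= x."""
    return (not any(le(y, xl) for xl in x.left)
            and not any(le(yr, x) for yr in y.right))


def equal(x, y):
    """Equality of surreal numbers (not of their representations)."""
    return le(x, y) and le(y, x)


@lru_cache(maxsize=None)
def neg(x):
    """-x = { -x'' | -x' }."""
    return Surreal(frozenset(neg(r) for r in x.right),
                   frozenset(neg(l) for l in x.left))


@lru_cache(maxsize=None)
def add(x, y):
    """x + y = { x' + y, x + y' | x'' + y, x + y'' }."""
    left = {add(xl, y) for xl in x.left} | {add(x, yl) for yl in y.left}
    right = {add(xr, y) for xr in x.right} | {add(x, yr) for yr in y.right}
    return Surreal(frozenset(left), frozenset(right))


@lru_cache(maxsize=None)
def mul(x, y):
    """Product with the four families of options used in the text."""
    def option(xo, yo):
        # xo*y + x*yo - xo*yo for one option xo of x and one option yo of y
        return add(add(mul(xo, y), mul(x, yo)), neg(mul(xo, yo)))

    left = ({option(xl, yl) for xl in x.left for yl in y.left}
            | {option(xr, yr) for xr in x.right for yr in y.right})
    right = ({option(xl, yr) for xl in x.left for yr in y.right}
             | {option(xr, yl) for xr in x.right for yl in y.left})
    return Surreal(frozenset(left), frozenset(right))


ZERO = Surreal()                                          # { | }
ONE = Surreal(frozenset({ZERO}))                          # {0 | }
TWO = Surreal(frozenset({ONE}))                           # {1 | }
MINUS_ONE = Surreal(frozenset(), frozenset({ZERO}))       # { | 0}
HALF = Surreal(frozenset({ZERO}), frozenset({ONE}))       # {0 | 1}

for a in (ZERO, ONE, MINUS_ONE, TWO, HALF):
    # a * 0 has empty left and right sets, and a * 1 equals a
    print(mul(a, ZERO) == ZERO, equal(mul(a, ONE), a))
print(equal(mul(TWO, HALF), ONE))
```

The sketch compares numbers with `equal` rather than `==`, since two different pairs of sets can represent the same surreal number. The recursion is only practical for numbers with very few options.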
Demonstrating that every non-zero surreal number has a multiplicative
inverse is considerably more technical than the calculations used
for the associative and distributive properties. A complete development of
inverses is given in the book “An Introduction to the Theory of Surreal
Numbers” by Harry Gonshor. We give here an intuitive argument as to why
inverses should exist.
Given a non-zero surreal number x we naively pick a candidate $y_0 = \{B' \mid B''\}$
for its inverse, with $y_0 > 0$ if $x > 0$ and $y_0 < 0$ if $x < 0$. If $x \cdot y_0 > 1$,
then we construct our next candidate as $y_1 = \{B' \mid B'' \cup \{y_0\}\}$ if $x > 0$ or
$y_1 = \{B' \cup \{y_0\} \mid B''\}$ if $x < 0$. If $x \cdot y_0 < 1$, then we construct our next
candidate as $y_1 = \{B' \cup \{y_0\} \mid B''\}$ if $x > 0$ or $y_1 = \{B' \mid B'' \cup \{y_0\}\}$ if $x < 0$.
We proceed in this fashion until the product is in fact 1. We naively
believe that the procedure will eventually end, as we will exhaust all possible
numbers that could lie between the sets of $x \cdot y_\alpha$ except for 1. Then $y_\alpha$ will
be the multiplicative inverse of x.
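The candidate procedure can be tried out on ordinary fractions. The following Python sketch is an illustration under stated assumptions, not the construction in Gonshor's book: it handles only positive x, it starts from $B' = \{0\}$ and $B'' = \emptyset$ (so the first candidate is 1), and it assumes that the number defined by a pair of sets is the simplest number lying strictly between them. The names `simplest_between` and `inverse_candidates` are hypothetical.

```python
from fractions import Fraction
from math import floor


def simplest_between(lo, hi):
    """Simplest dyadic rational y with lo < y, and y < hi when hi is not None.

    Assumes lo >= 0. An integer is preferred when one fits; otherwise the
    smallest power-of-two denominator that fits is used.
    """
    n = floor(lo) + 1                       # smallest integer above lo
    if hi is None or n < hi:
        return Fraction(n)
    k = 1
    while True:
        step = Fraction(1, 2 ** k)
        y = (floor(lo / step) + 1) * step   # smallest multiple of step above lo
        if y < hi:
            return y
        k += 1


def inverse_candidates(x, max_steps=20):
    """Finite candidates y_0, y_1, ... for the inverse of a positive fraction x."""
    lo, hi = Fraction(0), None              # largest left bound, smallest right bound
    candidates = []
    for _ in range(max_steps):
        y = simplest_between(lo, hi)
        candidates.append(y)
        if x * y == 1:
            break                           # y is the inverse
        if x * y > 1:
            hi = y                          # y is too large: it joins the right set
        else:
            lo = y                          # y is too small: it joins the left set
    return candidates


print(inverse_candidates(Fraction(4)))
print(inverse_candidates(Fraction(3))[:4])
```

For x = 4 the candidates are 1, 1/2, 1/4 and the procedure stops. For x = 3 every finite candidate is a dyadic rational, so the product never equals 1 after finitely many steps; this is why the candidates in the text are indexed by an ordinal α rather than by a natural number.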
APPENDIX 1
The Continuum Hypothesis
A major question arising from the very beginning of Set Theory is one
regarding the relationship between ordinal numbers and cardinal numbers.
Namely, which ordinal is the first uncountable ordinal? We know that ordinal
numbers and cardinal numbers are identical until the ordinal $\omega + 1$, which has
the same cardinality as $\omega$, namely $\aleph_0$. Since we know that $2^{\aleph_0}$ has cardinality
greater than $\aleph_0$, could that be the next cardinal number? That is, is there no
ordinal number whose cardinality is strictly greater than $\aleph_0$ and
strictly less than $2^{\aleph_0}$? We call $\aleph_1$ the next cardinal greater than $\aleph_0$, and we
formally state the hypothesis $\aleph_1 = 2^{\aleph_0}$, which is known as the continuum
hypothesis.
Recall that cardinal numbers are ordinal numbers, so the existence of the
next larger cardinal number is guaranteed by the well ordering of ordinal
numbers. If we let $\aleph$ be an arbitrary cardinal number, we designate the
next larger cardinal number by $\aleph^+$. We can thus generalize the continuum
hypothesis to
\[
\aleph^+ = 2^{\aleph} \quad \forall\, \aleph \ge \aleph_0.
\]
Paul Cohen demonstrated in 1963, completing earlier work of Kurt Gödel,
that the answer to the above question cannot be decided with the
Zermelo-Fraenkel axioms. That is, the continuum hypothesis is independent
of the ZF axioms! A sketch of the argument
that establishes this fact is given in Keith Devlin’s book The Joy of Sets,
and a rigorous account is given in Bell’s book Boolean-Valued Models and
Independence Proofs in Set Theory.
APPENDIX 2
The number 1
The symbol that we use to represent a set is the pair of braces { and }. The
two braces should be considered a single symbol, since in representing sets the
left or right brace alone is meaningless. If we replace these symbols
with a simple closed curve, that is, a circle, then we can essentially illustrate
numbers. Here we illustrate the various numbers 1.
The ordinal number 1 is also the cardinal number 1 and the natural
number 1.
The integer 1 and rational 1 are represented by their canonical represen-
tative from their respective equivalence classes.
The real 1 is an infinite collection of rational numbers and does not lend
itself to a concise picture, and hence we do not represent it here.
[Figures: ORDINAL 1, INTEGER 1, RATIONAL 1, SURREAL 1]
APPENDIX 3
Quantifiers
∀     for all, or for any
∃     there exists
!     unique
Logical Connectives
−     not
∨     or
∧     and
⇒     implies; a ⇒ b: if a, then b
⇔     if and only if; a ⇔ b: if a, then b, and if b, then a
⇏     does not imply; a ⇏ b ≡ −(a ⇒ b)
Logical Contradiction
→←    the preceding statement is self-contradictory,
      e.g. (a ⇒ b) ∧ (a ⇏ b) →←
Truth Tables

a   −a
T   F
F   T

a   b   a ∨ b
T   T   T
T   F   T
F   T   T
F   F   F

a   b   a ∧ b
T   T   T
T   F   F
F   T   F
F   F   F

a   b   a ⇒ b
T   T   T
T   F   F
F   T   T
F   F   T

a   b   a ⇔ b
T   T   T
T   F   F
F   T   F
F   F   T
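The truth tables above can be regenerated mechanically. The following short Python sketch is offered only as a check; the names `CONNECTIVES`, `truth_table` and `show` are hypothetical, and implication is encoded as (not a) or b, which agrees with the table for ⇒.

```python
from itertools import product

# Each binary connective as a function on Python booleans.
CONNECTIVES = {
    "or": lambda a, b: a or b,
    "and": lambda a, b: a and b,
    "implies": lambda a, b: (not a) or b,
    "iff": lambda a, b: a == b,
}


def truth_table(op):
    """Rows (a, b, op(a, b)) in the order TT, TF, FT, FF used in the tables."""
    return [(a, b, op(a, b)) for a, b in product([True, False], repeat=2)]


def show(value):
    """Render a boolean as T or F."""
    return "T" if value else "F"


for name, op in CONNECTIVES.items():
    print(f"a b a {name} b")
    for row in truth_table(op):
        print(" ".join(show(v) for v in row))
```

Negating the last column of the "implies" table gives the table for "does not imply", in line with the definition a ⇏ b ≡ −(a ⇒ b).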
Bibliography
[1] Halmos, Paul R. Naive Set Theory. D. Van Nostrand Company, Inc.,
Princeton, NJ, 1960.
[2] Stoll, Robert R. Set Theory and Logic. W. H. Freeman and Company,
San Francisco, CA, 1963.
[3] Rudin, Walter. The Principles of Mathematical Analysis, third edition.
McGraw-Hill, Inc., New York, NY, 1964.
[4] Landau, Edmund. Foundations of Analysis. Chelsea Publishing Company,
New York, NY, 1951.
[5] Gonshor, Harry. An Introduction to the Theory of Surreal Numbers.
Cambridge University Press, Cambridge, U.K., 1986.
[6] Knuth, Donald. Surreal Numbers. Addison-Wesley Publishing Company,
Menlo Park, CA, 1974.
[7] Devlin, Keith. The Joy of Sets, second edition. Springer-Verlag, New York,
NY, 1993.
[8] Bell, J. L. Boolean-Valued Models and Independence Proofs in Set Theory.
Oxford University Press, London, 1977.
[9] Webster's New World Dictionary. Warner Books, Inc., New York, NY, 1990.
Index
Term Page
Absolute Value 96
addition 51
additive identity 67
additive inverse 67
Aleph 33
antisymmetric 20
Archimedes 102
Archimedean Property 73
Arithmetic 50
Axiom of Choice 16
Axiom of extension 3
Axiom of infinity 10
Axiom of pairing 6
Axiom of power sets 9
Axiom of regularity 17
Axiom of unions 7
Axiom schema of replacement 17
Axiom schema of restriction 17
Axiom schema of specification 3
Bijection 30
binary operation 50
Binary representation 92
Bishop Berkeley 102
bound 39
C 98
canonical representation 109
Cantor 102
Cantor’s Theorem 33
Cardinal Arithmetic 50
Cardinal Number 32
cardinality 30
Cartesian Product 13
Cauchy Sequence 96
chain 41
Choice Function 15
codomain 13
comparable 43
complete 40
Complex Numbers 98
Complement 8
Composition 30
continuation 45
continuum hypothesis 118
Conway 107
countable 32
countably infinite 32
counting numbers 59
Counting Theorem 47
Cut 78
De Morgan laws 8
Dedekind cut 78
disjoint 8
disjoint union 51
division algorithm 74
domain 13
element 1
embedding 65
Empty Set 4
equivalence relation 31
exponentiation 52, 54
Field 71
fluxion 102
Function 13, 30
greatest lower bound 40, 90
H 101
image 13
induction 72
infimum 40, 90
initial segment 108
injection 65
integer addition 62
integer multiplication 62
Integers 60
Integral Domain 67
Intersection 7
Isaac Newton 102
K 75
Knuth 101
least upper bound 40, 89
Limit Ordinal 27
linear order 21
lower bound 40
Mathematical Induction 72
map 30
multiplication 51
N 59
natural numbers 59
Negative Integers 66
Octonions 101
One to One 30
Onto 30
order 20
order isomorphic 41
order preserving 40
order type 48
Ordinal Arithmetic 52
Ordinal Number 22
partial order 20
Partition 31
Positive Integers 66
Power 52
Power set 9
preimage 13
product 51
Projection Map 15
Proper Class 19
proper initial segment 108
Proposition 3
Q 68
Quaternions 100
R 78
range 13
Rational numbers 68
Real Number 78
reflexive 20
relation 20
Russell’s Paradox 5
S 107
Schröder-Bernstein Theorem 34
section 21
sequence 91
Set 1
simple closed curve 119
spaces 1
subset 8
successor 10
successor set 11
sum 51
supremum 26, 40, 89
Supremum Property 90
Surreal numbers 107
total order 21
tower 43
Transfinite Induction 24
Transfinite numbers 102
Transfinite Recursion Theorem 27
transitive 20
Trichotomy 21
Trichotomy Property 80
uncountable 32
union 7
upper bound 26, 39
weak section 22
well ordered 21
Well ordering theorem 45
Z 59
Zermelo Fraenkel 1
Zorn’s Lemma 41