
Proof Theory: From Arithmetic to Set Theory

Michael Rathjen

Accompanying notes for a course given at the Nordic Spring School, Nordfjordeid, 27–30 May 2013

Contents

• A brief history of proof theory

• Sequent calculi for classical and intuitionistic logic, Gentzen’s Hauptsatz: Cut elimination

• Consequences of the Hauptsatz: Subformula property, Herbrand’s Theorem, existence and disjunction property, geometric theories

• Ordinal functions and representations up to Γ0

• Ordinal analysis of Peano arithmetic, PA, and some subsystems of second order arithmetic.

• Limits for the deducibility of transfinite induction

• Kripke-Platek set theory, KP.

• The Bachmann-Howard ordinal

• KP goes infinite, RS.

• Impredicative cut elimination theorem

• Interpreting KP in RS


1 A short and biased history of logic till 1938

• Logical principles - principles connecting the syntactic structure of sentences with their truth and falsity, their meaning, or the validity of arguments in which they figure - can be found in scattered locations in the work of Plato (428–348 B.C.).

• The Stoic school of logic was founded some 300 years B.C. by Zeno of Citium (not to be confused with Zeno of Elea). After Zeno’s death in 264 B.C., the school was led by Cleanthes, who was followed by Chrysippus. It was largely through the copious writings of Chrysippus that the Stoic school became established, though many of these writings have been lost.

• The patterns of reasoning described by Stoic logic are the patterns of interconnection between propositions that are completely independent of what those propositions say.

• The first known systematic study of logic which involved quantifiers, components such as “for all” and “some”, was carried out by Aristotle (384–322 B.C.) whose work was assembled by his students after his death as a treatise called the Organon, the first systematic treatise on logic.

• Aristotle tried to analyze logical thinking in terms of simple inference rules called syllogisms. These are rules for deducing one assertion from exactly two others.

• An example of a syllogism is:

P1. All men are mortal.

P2. Socrates is a man.

C. Socrates is mortal.

• In the case of the above syllogism, it is obvious that there is a general pattern, namely:

P1. All M are P.

P2. S is an M.

C. S is P.

• Some of the other syllogisms Aristotle formulated are less obvious. E.g.

P1. No M is P.

P2. Some S is M.

C. Some S is not P.


• Aristotle appears to have believed that any logical argument can, in principle, be broken down into a series of applications of a small number of syllogisms. He listed a total of 19.

• The syllogism was found to be too restrictive (much later).

• For almost 2000 years Aristotle was revered as the ultimate authority on logical matters.

Bachelors and Masters of arts who do not follow Aristotle’s philosophy are subject to a fine of five shillings for each point of divergence, as well as for infractions of the rules of the ORGANON.

– Statutes of the University of Oxford, fourteenth century.

When did Modern Logic start?

• Aristotle’s logic was very weak by modern standards.

• The ideas of creating an artificial formal language patterned on mathematical notation in order to clarify logical relationships - called characteristica universalis - and of reducing logical inference to a mechanical reasoning process in a purely formal language - called calculus ratiocinator - were due to Gottfried Wilhelm Leibniz (1646-1716).

• Leibniz’s contributions include arithmetization of syllogistic, a theory of relations, modal logic and logical grammar.

• Much of it was published posthumously in 1903 by Couturat in Opuscules et fragments inédits de Leibniz.

• Logic as we know it today has only emerged over the past 140 years.

• Chiefly associated with this emergence is Gottlob Frege (1848–1925). In his Begriffsschrift 1879 (Concept Script) he invented the first programming language.

• His Begriffsschrift marked a turning point in the history of logic. It broke new ground, including a rigorous treatment of quantifiers and the ideas of functions and variables.

• Frege wanted to show that mathematics grew out of logic.

• Charles Peirce (1839–1914) is another pioneer of modern logic.

• Another strand is Algebraic logic which stresses logic as a calculus: Augustus De Morgan (1806–1871), George Boole (1815–1864), Ernst Schröder (1841–1902).

• Modern logic was codified in Principia Mathematica (1910, 1912, 1913) by Bertrand Russell (1872–1970) and Alfred N. Whitehead (1861–1947).


The Origins of Proof Theory (Beweistheorie)

• David Hilbert (1862–1943)

• Hilbert’s second problem (1900): Consistency of Analysis

• Hilbert’s Programme (1922,1925)

The Grundlagenkrise: the usual suspects

• Inconsistency in Frege’s Grundgesetze.

• Cantor had already observed that in set theory the unrestricted Comprehension Principle (CP) leads to contradictions. CP allows one to build sets by collecting all the sets having in common a property P to form a new set

{x | P(x)}.

• Russell’s Paradox (1901)

• Hermann Weyl: “Über die neue Grundlagenkrise in der Mathematik” (1921)

19th century: Growth of the subject

• Beginning 19th century: mathematics was concrete, constructive, algorithmic

• End of 19th century: Much abstract, non-constructive, non-algorithmic mathematics was under development

growing preference for short conceptual non-computational proofs over long computational proofs.

• Non-Euclidean geometries: statements can be true in one geometry and false in another.

• But also consolidation: (More) rigorous foundations of analysis: Cauchy (1789-1857), Bolzano (1781-1848), Weierstrass (1815-1897)


New (non-constructive) proof methods

• Abstract notion of function (In Euler’s time functions were explicitly defined via an analytic expression)

• Indirect existence proofs (Hilbert’s Basis Theorem)

• Zermelo’s proof that R (the reals) can be well-ordered (1904)

Axiom of Choice

Let I be a set. Suppose that $A_i$ is a non-empty set for each i ∈ I. Then there exists a function

$$f : I \to \bigcup_{i \in I} A_i$$

such that $f(i) \in A_i$ holds for all i ∈ I.

Borel, Baire, Lebesgue against the Axiom of Choice 1905

Borel: It seems to me that the objection against it is also valid for every reasoning where one assumes an arbitrary choice made an uncountable number of times, for such reasoning does not belong in mathematics.

Acceptance of AC

• By the 1930s AC was widely accepted.

• With AC, every vector space has a basis.

• Let V, W be vector spaces over the same field, u ∈ V, w ∈ W and u, w ≠ 0. Then there is a linear mapping f : V → W such that f(u) = w.

Reactions and Cures

• Brouwer (1908) rejects the law of excluded middle (A ∨ ¬A for arbitrary statements A): Intuitionistic Mathematics

• Russell (1908) Vicious Circle Principle


• H. Weyl (1885-1955) criticizes impredicative set formation principles

Mathematics .. house built on sand (1918)

Hilbert’s way out

• Platonists, Logicists and Intuitionists seem to agree that a mathematical concept, or sentence, or a theory is acceptable (or properly understood) only if all terms which occur in it can be interpreted directly.

• By contrast, the formalist holds that direct interpretability is not a necessary condition for the acceptability of a mathematical theory.

To understand a theory means to be able to follow its logical development and not, necessarily, to interpret, or give a denotation for, its individual terms.

Hilbert’s two-tiered approach

1. Interpreted (material, “inhaltlich”) Mathematics: Basic rules of reasoning and arithmetic whose validity is self-evident.

2. Uninterpreted (or formal) mathematics obtained by the adjunction of “ideal” (uninterpreted) elements to material (“inhaltliche”) mathematics

In Hilbert’s case, interpreted mathematics was finitistic mathematics wherein reference to actual infinite sets was taboo.

Hilbert’s Program (1922,1925)

• I. Codify the whole of mathematical reasoning in a formal theory T.

• II. Prove the consistency of T by finitistic means.

• “No one shall drive us from the paradise which Cantor has created for us.”


Finitism

• The exact meaning of “finitistic means” was never precisely delineated by Hilbert.

• Finitistic means form the basis of any scientific reasoning.

• They do not refer to the actual infinite and do not include any objectionable proof methods.

Hilbert’s Ontology

Real Objects: the natural numbers,

finite strings of symbols

(something a computer can deal with)

Ideal objects: the other mathematical objects:

abstract functions, choice functions, Hilbert spaces, ultrafilters, etc.

• Real objects are the main concern of mathematicians. They exist.

• Ideal/abstract objects exist merely as a façon de parler. But they are important for the progress of mathematics.

The method of ideal elements

• Solve a mathematical problem regarding a specific mathematical structure by adding new ideal elements to the structure.

• Hilbert: The method of ideal elements is of great importance to the progress of mathematical research.

Examples

Elementary Geometry → Points and lines at ∞ → Projective Geometry

Elementary number theory → number fields, ideals → algebraic number theory

Analysis/number theory → Ultrafilter → Set theory


Indispensable condition

• Hilbert: Es gibt nämlich eine Bedingung, eine einzige, aber auch absolut notwendige, an die die Anwendung der Methode der idealen Elemente geknüpft ist, und diese ist der Nachweis der Widerspruchsfreiheit: die Erweiterung durch Zufügung von Idealen ist nämlich nur dann statthaft, wenn also die Beziehungen, die sich bei Elimination der idealen Gebilde für die alten Gebilde herausstellen, stets im alten Bereiche gültig sind.

• There is just one condition, albeit an absolutely necessary one, connected with the method of ideal elements. That condition is a proof of consistency, for the extension of a domain by the addition of ideal elements is legitimate only if the extension does not cause contradictions to appear in the old, narrower domain, or, in other words, only if the relations that obtain among the old structures when the ideal structures are deleted are always valid in the old domain.

• Another reading of Hilbert’s Programme:

Elimination of ideal elements.

Maybe we should refrain from ontological talk

• Abraham Robinson (1918-74):

Non-standard analysis (1966)

• this book ... appears to affirm the existence of all sorts of infinitaryentities.

However, from a formalist point of view we may look at our theory syntactically and may consider that what we have done is to introduce new deductive procedures rather than new mathematical entities.

Mathematical statements

Mathematical statements divide into two kinds: real statements and ideal statements.

Real statements are of the following forms:

∀x1 · · · ∀xr f(x1, .., xr) = g(x1, .., xr);

∀x1 · · · ∀xr f(x1, .., xr) ≠ g(x1, .., xr);

∀x1 · · · ∀xr f(x1, .., xr) ≤ g(x1, .., xr)

where f, g are basic functions (polynomials) on the naturals.

Examples of real statements

• Goldbach’s conjecture: Every even number n > 2 is the sum of two primes. (Confirmed up to at least $10^{18}$.)

• Vinogradov’s Three Primes Theorem 1937: Every odd integer > $10^{13000}$ is the sum of three primes.

• Fermat’s conjecture (Wiles’ Theorem 1995): For all naturals a, b, c, n, if a · b · c ≠ 0 and n > 2 then $a^n + b^n \neq c^n$.

• Riemann hypothesis: All non-trivial zeros s of ζ satisfy Re(s) = 1/2.

• Four colour theorem

Ideal statements

• The axiom of choice.

• Every vector space has a base.

• If R is a noetherian ring, then so is the polynomial ring R[X].

• (Schröder-Bernstein Theorem) If f : X → Y and g : Y → X are both injective functions, then there exists a 1-1 correspondence between X and Y.

Example of a real statement proved by using ideal elements

Theorem: 1.1 (Hadamard, de la Vallée Poussin 1896) Prime number theorem

$$\lim_{x \to \infty} \frac{\pi(x)}{x / \ln(x)} = 1$$

where π(x) = number of prime numbers ≤ x.
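As a purely numerical illustration of the statement (not part of any proof), the ratio π(x) / (x / ln x) can be tabulated for small x. The sketch below is a minimal one, assuming a plain sieve of Eratosthenes; the names `prime_count` and `pnt_ratio` are hypothetical helpers introduced here.

```python
import math


def prime_count(x):
    """Return pi(x), the number of primes <= x, using a sieve of Eratosthenes."""
    if x < 2:
        return 0
    sieve = bytearray([1]) * (x + 1)  # sieve[i] == 1 means "i is still a prime candidate"
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # Cross out the multiples p*p, p*p + p, ... that are <= x.
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)


def pnt_ratio(x):
    """Return pi(x) / (x / ln x), the quantity whose limit is 1 by the theorem."""
    return prime_count(x) / (x / math.log(x))


for x in (10**3, 10**4, 10**5, 10**6):
    print(x, prime_count(x), round(pnt_ratio(x), 4))
```

The convergence to 1 is slow, so the printed ratios stay visibly above 1 in this range; the table only illustrates the trend.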

The original proof used contour integration of curves over C.

Atle Selberg and Paul Erdős (1949) found proofs using only the means of elementary number theory.


Hilbert’s Conservation Programme

• A consequence of Hilbert’s Programme

• Hilbert’s hope: If a real statement Ψ is provable in non-finitistic mathematics, then Ψ can also be proved by purely finitistic means.

THEOREM Let Ψ be a real statement, T a theory, and

F := Finitistic mathematics.

T proves Ψ ⟹ F plus $\mathrm{Con}_T$ proves Ψ

$$T \vdash \Psi \implies F + \mathrm{Con}_T \vdash \Psi.$$

Hilbert’s Consistency Proofs

• Grundlagen der Geometrie (1899). Shows the consistency of theories of geometries (Euclidean and non-Euclidean) by reduction to the theory of arithmetic.

• Über die Grundlagen der Logik und Arithmetik (1904) contains a consistency proof of a weak theory of arithmetic (an almost equational theory).

• He shows that in this theory one can only deduce homogeneous equations, hence no contradiction.

• Hilbert in lectures 1920, 1921. New techniques for consistency proofs. The ε-substitution method. Eliminates quantifiers. Clear distinction between finitistic metatheory and object-theory.

Hilbert School I

• Wilhelm Ackermann (1896–1962): Begründung des tertium non datur mittels der Hilbertschen Theorie der Widerspruchsfreiheit (1925).

• Consistency proof for a theory of arithmetic with second order variables (ranging over functions). Function space closed under primitive recursion.

• The proof uses Hilbert’s ε-substitution. Very difficult to follow.

• Proof seems to require a transfinite induction up to $\omega^{\omega^{\omega}}$.

• John von Neumann (1903–1957) Zur Hilbertschen Beweistheorie (1927)


Hilbert School II

• Gerhard Gentzen (1909–1945)

• Untersuchungen über das logische Schliessen (1934) Dissertation:

• Introduces the natural deduction system and the sequent calculus. Proves cut elimination.

• Die Widerspruchsfreiheit der reinen Zahlentheorie (1936)

• Proves the consistency of Peano arithmetic.

Herbrand

• Jacques Herbrand (1908–1931)

• Sur la non-contradiction de l’Arithmétique (1931)

The most important structure

• The set of natural numbers N = {0, 1, 2, 3, 4, . . .}

with operations of Addition (+) and Multiplication (×) and the less-than relation (<):

N = (N; 0, 1, +, ×, <)

• Richard Dedekind (1831-1916), Giuseppe Peano (1858-1932): Axiomatization of N, called Peano Arithmetic (PA)

Usual laws for +,× and <.

• Axiom scheme of mathematical induction.

• Many of the famous theorems and problems of mathematics (including the above examples) can be formalized as a sentence ϕ of the language of N and thus are equivalent to the question whether N ⊨ ϕ.

Is Ψ true in N?

Axiomatizing the Structure N: Peano Arithmetic, PA.

Language of PA :=

Predicate symbols: =, <
Function symbols: +, ·, S (Successor)
Constant symbols: 0

(N1) ∀x(Sx ≠ 0)

(N2) ∀xy[Sx = Sy → x = y]


(N3) ∀x[x+ 0 = x]

(N4) ∀xy[x+ Sy = S(x+ y)]

(N5) ∀x[x · 0 = 0]

(N6) ∀xy[x · Sy = (x · y) + x]

(N7) ∀x¬(x < 0)

(N8) ∀xy[x < Sy ↔ x < y ∨ x = y]

(N9) ∀xy[x < y ∨ x = y ∨ y < x]

(IND) ϕ(0) ∧ ∀x[ϕ(x)→ ϕ(Sx)]→ ∀xϕ(x)
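The quantifier-free axioms (N1)–(N9) can be sanity-checked against the standard model on a finite initial segment. The following is a minimal sketch, assuming Python integers stand in for the naturals; the function name `check_axioms` and the bound are illustrative choices, and a finite check of this kind is of course no proof of anything (in particular it says nothing about (IND)).

```python
from itertools import product


def S(x):
    """Successor function of the standard model."""
    return x + 1


def check_axioms(bound):
    """Check all instances of (N1)-(N9) with x, y < bound.

    Returns a dict mapping each axiom name to True/False.
    """
    xs = range(bound)
    pairs = list(product(xs, xs))
    return {
        "N1": all(S(x) != 0 for x in xs),
        "N2": all(S(x) != S(y) or x == y for x, y in pairs),
        "N3": all(x + 0 == x for x in xs),
        "N4": all(x + S(y) == S(x + y) for x, y in pairs),
        "N5": all(x * 0 == 0 for x in xs),
        "N6": all(x * S(y) == (x * y) + x for x, y in pairs),
        "N7": all(not (x < 0) for x in xs),
        "N8": all((x < S(y)) == (x < y or x == y) for x, y in pairs),
        "N9": all(x < y or x == y or y < x for x, y in pairs),
    }


print(check_axioms(20))
```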


2 The sequent calculus

Remark: 2.1 The most common logical calculi are Hilbert-style systems. They are specified by delineating a collection of schematic logical axioms and some inference rules. The choice of axioms and rules is more or less arbitrary, only subject to the desire to obtain a complete system. In model theory it is usually enough to know that there is a complete calculus for first order logic as this already entails the compactness theorem.

There are, however, proof calculi without this arbitrariness of axioms and rules. The natural deduction calculus and the sequent calculus were both invented by Gentzen in 1934. Both calculi are pretty illustrations of the symmetries of logic. In this course I shall focus on the sequent calculus since it is a central tool in ordinal analysis and allows for generalizations to infinitary logics.

Gentzen’s main theorem about the sequent calculus is the Hauptsatz, i.e. cut elimination.

2.1 Languages

As we will also consider intuitionistic theories and the intuitionistic version of the sequent calculus it is in order to spell out what we consider to be the ingredients of a first order theory.

Definition: 2.2 All first order languages will share the same logical symbols:

∧, ∨, →, ¬, ∀, ∃,

bound variables x0, x1, x2, x3, . . .

and free variables a0, a1, a2, . . . .

A first order language L is specified by its non-logical symbols. These symbols are separated into three groups: LC, LF, and LR. LC is the set of constant symbols, LF is the set of function symbols, and LR is the set of relation symbols. Each function symbol f ∈ LF also comes equipped with an arity #f which is a number > 0. Likewise each relation symbol R ∈ LR comes equipped with an arity #R > 0.

The distinction between free and bound variables is not essential but it is extremely useful and simplifies arguments a great deal. Terms can be freely substituted for variables since variables occurring in them are always free and thus cannot be captured by quantifiers. Without this distinction the cut elimination theorem to be proved below would also have to be reformulated in a slightly awkward way. For example, P(x, y) → ∃y ∃x P(y, x) would not have a cut free proof.

Convention: 2.3 We will use metavariables x, y, z, u, v, . . . , y1, y2, . . . to range over bound variables and a, b, c, d, b1, b2, b3, . . . to range over free variables. We shall use c, d, e, . . . , c0, c1, c2, . . . to range over constants. Variables P, Q, R, S, R0, R1, R2, . . . will range over relation symbols while f, g, h, f0, f1, f2, f3, . . . , g0, g1, g2, . . . range over function symbols.

Definition: 2.4 The terms of L are inductively defined as follows:

1. Every free variable is a term.

2. Every constant symbol (of L) is a term.

3. If f is an n-ary function symbol and s1, . . . , sn are terms then f(s1, . . . , sn) is a term.

Terms are often denoted by t, s, t1, t2, . . ..

The formulas of L are inductively defined as follows:

1. If R is an n-ary relation symbol of L and t1, . . . , tn are terms then R(t1, . . . , tn) is a formula. R(t1, . . . , tn) is called an atomic formula.

2. If A and B are formulas, then so are (¬A), (A ∧ B), (A ∨ B) and (A → B).

3. If A is a formula, a is a free variable and x is a bound variable not occurring in A, then ∀xA′ and ∃xA′ are formulas, where A′ is the expression obtained from A by replacing a everywhere in A by x.

Henceforth A,B,C, . . . , F,G,H, . . . will be metavariables ranging over formulas.

Definition: 2.5 A formula without free variables will be called a closed formula or sentence.

In order to emphasize that they belong to a specific language L, a term or formula of L will sometimes be called an L-term or L-formula.

To increase readability we shall omit parentheses whenever possible. Outer parentheses will always be omitted. We shall observe the following priority rules: ¬ takes precedence over each of ∧ and ∨, and each of the latter two takes precedence over →. For example, ¬A ∧ B is short for (¬A) ∧ B, and A ∧ B → A ∨ B is short for (A ∧ B) → (A ∨ B). Parentheses will also be omitted in case of double negations: e.g. ¬¬A stands for ¬(¬A). A ↔ B is short for (A → B) ∧ (B → A).

Convention: 2.6 If t is a term, we denote the result of substituting t for a free variable a in A by A(t/a). To simplify notation, we adopt the convention that if A is a formula and s is a term we often write A(s) to refer to the formula A with some (or even no) occurrences of s in A indicated. If we then write A(t) afterwards in the same context we refer to the result of replacing these indicated occurrences of s in A by t.

We say that the variable a is fully indicated in A(a) if all occurrences of a in A are indicated.
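The inductive definitions of terms and formulas, together with the substitution A(t/a), translate directly into a recursive data structure. The following is a minimal sketch under an assumed encoding that is not from the notes: terms and formulas are nested tuples tagged `"free"`, `"bound"`, `"const"`, `"fn"`, `"rel"`, `"not"`, `"and"`, `"or"`, `"imp"`, `"all"`, `"ex"`, and the function names are hypothetical. Because free and bound variables are syntactically distinct, substitution needs no capture check.

```python
def subst_term(s, a, t):
    """Replace every occurrence of the free variable named a in the term s by t."""
    kind = s[0]
    if kind == "free":
        return t if s[1] == a else s
    if kind == "fn":  # ("fn", name, (arg1, ..., argn))
        return ("fn", s[1], tuple(subst_term(u, a, t) for u in s[2]))
    return s  # constants and bound variables are left untouched


def subst(A, a, t):
    """Compute A(t/a): replace the free variable a by the term t throughout A."""
    kind = A[0]
    if kind == "rel":  # ("rel", name, (t1, ..., tn))
        return ("rel", A[1], tuple(subst_term(u, a, t) for u in A[2]))
    if kind == "not":
        return ("not", subst(A[1], a, t))
    if kind in ("and", "or", "imp"):
        return (kind, subst(A[1], a, t), subst(A[2], a, t))
    if kind in ("all", "ex"):  # (quantifier, bound variable name, body)
        return (kind, A[1], subst(A[2], a, t))
    raise ValueError(f"unknown formula constructor: {kind}")


# Example: the formula  ∀x (x < a)  and the term  S(0).
zero = ("const", "0")
formula = ("all", "x", ("rel", "<", (("bound", "x"), ("free", "a"))))
print(subst(formula, "a", ("fn", "S", (zero,))))
```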

2.2 The rules

Definition: 2.7 A sequent (of L) is an expression Γ ⇒ ∆ where Γ and ∆ are finite sequences of L-formulas A1, . . . , An and B1, . . . , Bm, respectively.

Γ ⇒ ∆ is read, informally, as Γ yields ∆ or, rather, the conjunction of the Ai yields the disjunction of the Bj.

In particular,


• If Γ is empty, the sequent asserts the disjunction of the Bj.

• If ∆ is empty, it asserts the negation of the conjunction of the Ai.

• If Γ and ∆ are both empty, it asserts the impossible, i.e. a contradiction.
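For propositional formulas the informal reading above can be tested by brute force over truth assignments. The sketch below is only an illustration of that reading, under assumptions that are not from the notes: atoms are Python strings, compound formulas are tuples tagged `"not"`, `"and"`, `"or"`, `"imp"`, and `sequent_valid` is a hypothetical helper name.

```python
from itertools import product


def atoms(A):
    """Collect the propositional atoms occurring in the formula A."""
    if isinstance(A, str):
        return {A}
    return set().union(*(atoms(B) for B in A[1:]))


def holds(A, v):
    """Evaluate the formula A under the truth assignment v (a dict atom -> bool)."""
    if isinstance(A, str):
        return v[A]
    op = A[0]
    if op == "not":
        return not holds(A[1], v)
    if op == "and":
        return holds(A[1], v) and holds(A[2], v)
    if op == "or":
        return holds(A[1], v) or holds(A[2], v)
    if op == "imp":
        return (not holds(A[1], v)) or holds(A[2], v)
    raise ValueError(f"unknown connective: {op}")


def sequent_valid(gamma, delta):
    """True iff every assignment satisfying all of gamma satisfies some member of delta."""
    names = sorted(set().union(*(atoms(A) for A in gamma + delta)))
    for values in product([False, True], repeat=len(names)):
        v = dict(zip(names, values))
        if all(holds(A, v) for A in gamma) and not any(holds(B, v) for B in delta):
            return False
    return True


print(sequent_valid([], [("or", "p", ("not", "p"))]))  # ⇒ p ∨ ¬p
print(sequent_valid([], []))                            # the empty sequent
```

With both sides empty, the single (empty) assignment satisfies the empty conjunction but no member of the empty disjunction, which matches the reading of the empty sequent as a contradiction.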

We use upper case Greek letters Γ, ∆, Λ, Θ, Ξ, . . . to range over finite sequences of formulae.

Definition: 2.8 We spell out the axioms and the inference rules of the sequent calculus.

Identity Axiom

$$A \Rightarrow A$$

where A is any formula. In point of fact, we shall limit this axiom to the case of atomic formulae A.

CUT

$$\frac{\Gamma \Rightarrow \Delta, A \qquad A, \Lambda \Rightarrow \Theta}{\Gamma, \Lambda \Rightarrow \Delta, \Theta}\ \text{Cut}$$

A is called the cut formula of the inference.

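Structural Rules: Exchange, Weakening, Contraction

$$\frac{\Gamma, A, B, \Lambda \Rightarrow \Delta}{\Gamma, B, A, \Lambda \Rightarrow \Delta}\ Xl \qquad \frac{\Gamma \Rightarrow \Delta, A, B, \Lambda}{\Gamma \Rightarrow \Delta, B, A, \Lambda}\ Xr$$

$$\frac{\Gamma \Rightarrow \Delta}{\Gamma, A \Rightarrow \Delta}\ Wl \qquad \frac{\Gamma \Rightarrow \Delta}{\Gamma \Rightarrow \Delta, A}\ Wr$$

$$\frac{\Gamma, A, A \Rightarrow \Delta}{\Gamma, A \Rightarrow \Delta}\ Cl \qquad \frac{\Gamma \Rightarrow \Delta, A, A}{\Gamma \Rightarrow \Delta, A}\ Cr$$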

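LOGICAL INFERENCES

Negation

$$\frac{\Gamma \Rightarrow \Delta, A}{\neg A, \Gamma \Rightarrow \Delta}\ \neg L \qquad \frac{B, \Gamma \Rightarrow \Delta}{\Gamma \Rightarrow \Delta, \neg B}\ \neg R$$

Implication

$$\frac{\Gamma \Rightarrow \Delta, A \qquad B, \Gamma \Rightarrow \Theta}{A \to B, \Gamma \Rightarrow \Delta, \Theta}\ {\to}L \qquad \frac{A, \Gamma \Rightarrow \Delta, B}{\Gamma \Rightarrow \Delta, A \to B}\ {\to}R$$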

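Conjunction

$$\frac{A, \Gamma \Rightarrow \Delta}{A \wedge B, \Gamma \Rightarrow \Delta}\ \wedge L1 \qquad \frac{B, \Gamma \Rightarrow \Delta}{A \wedge B, \Gamma \Rightarrow \Delta}\ \wedge L2 \qquad \frac{\Gamma \Rightarrow \Delta, A \qquad \Gamma \Rightarrow \Delta, B}{\Gamma \Rightarrow \Delta, A \wedge B}\ \wedge R$$

Disjunction

$$\frac{A, \Gamma \Rightarrow \Delta \qquad B, \Gamma \Rightarrow \Delta}{A \vee B, \Gamma \Rightarrow \Delta}\ \vee L \qquad \frac{\Gamma \Rightarrow \Delta, A}{\Gamma \Rightarrow \Delta, A \vee B}\ \vee R1 \qquad \frac{\Gamma \Rightarrow \Delta, B}{\Gamma \Rightarrow \Delta, A \vee B}\ \vee R2$$

Quantifiers

$$\frac{F(t), \Gamma \Rightarrow \Delta}{\forall x F(x), \Gamma \Rightarrow \Delta}\ \forall L \qquad \frac{\Gamma \Rightarrow \Delta, F(a)}{\Gamma \Rightarrow \Delta, \forall x F(x)}\ \forall R$$

$$\frac{F(a), \Gamma \Rightarrow \Delta}{\exists x F(x), \Gamma \Rightarrow \Delta}\ \exists L \qquad \frac{\Gamma \Rightarrow \Delta, F(t)}{\Gamma \Rightarrow \Delta, \exists x F(x)}\ \exists R$$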

In ∀L and ∃R, t is an arbitrary term. The variable a in ∀R and ∃L is an eigenvariable of the respective inference, i.e. a is not to occur in the lower sequent.

Definition: 2.9 In a logical inference, the formulae displayed explicitly in the upper sequents (A, B, F(t) or F(a)) are called the minor formulae of that inference, while the formula displayed explicitly in the lower sequent is the principal formula of that inference. The other formulae of an inference (those in Γ, ∆, Θ) are called side formulae.

A proof (aka deduction or derivation) D is a tree of sequents satisfying the following conditions:

• The topmost sequents of D are identity axioms.

• Every sequent in D except the lowest one is an upper sequent of an inference whose lower sequent is also in D.

Definition: 2.10 (The INTUITIONISTIC case.) The intuitionistic sequent calculus is obtained by requiring that all sequents be intuitionistic. A sequent Γ ⇒ ∆ is said to be intuitionistic if ∆ consists of at most one formula.

Specifically, in the intuitionistic sequent calculus there are no inferences corresponding to contraction right or exchange right.

Our first example is a deduction of the law of excluded middle.

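$$\begin{array}{ll}
A \Rightarrow A & \\
\Rightarrow A, \neg A & (\neg R) \\
\Rightarrow A, A \vee \neg A & (\vee R) \\
\Rightarrow A \vee \neg A, A & (Xr) \\
\Rightarrow A \vee \neg A, A \vee \neg A & (\vee R) \\
\Rightarrow A \vee \neg A & (Cr)
\end{array}$$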

Notice that the above proof is not intuitionistic since it involves sequents that are not intuitionistic.

The second example is an intuitionistic deduction.

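$$\begin{array}{ll}
F(a) \Rightarrow F(a) & \\
F(a) \Rightarrow \exists x F(x) & (\exists R) \\
\neg \exists x F(x), F(a) \Rightarrow & (\neg L) \\
F(a), \neg \exists x F(x) \Rightarrow & (Xl) \\
\neg \exists x F(x) \Rightarrow \neg F(a) & (\neg R) \\
\neg \exists x F(x) \Rightarrow \forall x \neg F(x) & (\forall R) \\
\Rightarrow \neg \exists x F(x) \to \forall x \neg F(x) & ({\to}R)
\end{array}$$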

Convention: 2.11 Logics without (some of the) structural rules became important in the 1980s. In particular Linear Logic attracted a great deal of attention back then. For our purposes the structural rules just add an additional layer of bureaucracy. We would really like to sweep them under the carpet. We will achieve this by identifying a sequence of formulas A1, . . . , An with the set of formulas {A1, . . . , An}. Henceforth variables ∆, Γ, Λ, . . . will range over finite sets of formulas. We will interpret a comma between these sets as set-theoretic union. Thus Γ, ∆ stands for Γ ∪ ∆. We also adopt the convention that Γ, A stands for Γ ∪ {A}. Likewise A1, . . . , An stands for {A1, . . . , An} and Γ, ∆, A stands for Γ ∪ ∆ ∪ {A} etc.

Since in the curly bracket notation {A1, . . . , An} the ordering of the formulas does not matter and repeating a formula doesn’t make a difference, this will take care of the exchange and the contraction rules automatically.

This still leaves the weakening rules. However, we are going to ditch them completely in the classical case since it is always possible to add more side formulas already at the leaves of a proof tree. Thus we adopt as Axioms all sequents of the form

Γ, A ⇒ ∆, A

where A is an atomic formula. Thus, henceforth we no longer consider explicit structural rules in the classical case.
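The effect of this convention can be made concrete by representing a sequent as a pair of finite sets. The sketch below is a minimal one under assumed conventions that are not from the notes: atomic formulas are Python strings, compound formulas are tuples, and `sequent` and `is_axiom` are hypothetical helper names.

```python
def sequent(gamma, delta):
    """Build the sequent gamma ⇒ delta with both sides treated as finite sets."""
    return (frozenset(gamma), frozenset(delta))


def is_axiom(seq, is_atomic=lambda A: isinstance(A, str)):
    """True iff seq has the form Γ, A ⇒ ∆, A for some atomic formula A."""
    gamma, delta = seq
    return any(is_atomic(A) for A in gamma & delta)


# Exchange and contraction are invisible: order and repetition do not matter.
print(sequent(["A", "B", "A"], ["C"]) == sequent(["B", "A"], ["C"]))

# Weakening is built into the axioms: arbitrary side formulas are allowed.
print(is_axiom(sequent(["p", "q"], ["r", "p"])))
```

Using `frozenset` rather than `list` is the design choice that makes the exchange and contraction rules unnecessary; the axiom test then reduces to checking that the two sides share an atom.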

The left rule for → can be simplified a bit in the classical case. Henceforth we adopt this rule:

$$\frac{\Gamma \Rightarrow \Delta, A \qquad B, \Gamma \Rightarrow \Delta}{A \to B, \Gamma \Rightarrow \Delta}\ {\to}L$$

while the intuitionistic rule takes the form

$$\frac{\Gamma \Rightarrow A \qquad B, \Gamma \Rightarrow \Delta}{A \to B, \Gamma \Rightarrow \Delta}\ {\to}L$$

with ∆ containing at most one formula.

In the intuitionistic case, we shall also ditch the structural rules with one exception. Here the Axioms will be all the sequents of the form

∆, A ⇒ A

with A atomic. As a result we no longer need the left weakening rule. However we still need the right weakening rule, that is, from

Γ ⇒

we may infer

Γ ⇒ B

for any formula B. This rule could also be called ex falso quodlibet.

Definition: 2.12 A sequent deduction D is a proof tree and we can measure a tree by its height, i.e. its longest branch. We use |D| to denote the height of D.

We shall use the notation $\vdash \Gamma \Rightarrow \Delta$ to express that there is a deduction of Γ ⇒ ∆ while

$$\vdash^{n} \Gamma \Rightarrow \Delta$$

is used to convey that there is a deduction of Γ ⇒ ∆ with height ≤ n. We use

$$\mathrm{I} \vdash^{n} \Gamma \Rightarrow \Delta$$

to convey that there is a deduction of Γ ⇒ ∆ with height ≤ n in the intuitionistic sequent calculus, and $\mathrm{I} \vdash \Gamma \Rightarrow \Delta$ to say that there is an intuitionistic deduction.

The length |A| of a formula A is defined as follows: |A| = 0 if A is atomic, |¬A| = |A| + 1, |A♦B| = max(|A|, |B|) + 1 if ♦ is one of the connectives ∨, ∧, →, |∃xA| = |A| + 1, |∀xA| = |A| + 1.
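The definition of |A| is a straightforward structural recursion. A minimal sketch follows, under an assumed encoding that is not from the notes: atomic formulas are Python strings, compound formulas are tuples tagged `"not"`, `"and"`, `"or"`, `"imp"`, `"all"`, `"ex"`, and `length` is a hypothetical function name.

```python
def length(A):
    """Return |A| as defined above: 0 for atoms, +1 per connective or quantifier,
    taking the maximum over the two sides of a binary connective."""
    if isinstance(A, str):  # atomic formula
        return 0
    op = A[0]
    if op == "not":
        return length(A[1]) + 1
    if op in ("and", "or", "imp"):
        return max(length(A[1]), length(A[2])) + 1
    if op in ("all", "ex"):  # (quantifier, bound variable name, body)
        return length(A[2]) + 1
    raise ValueError(f"unknown formula constructor: {op}")


# |(P ∧ Q) → ¬P| = max(|P ∧ Q|, |¬P|) + 1
print(length(("imp", ("and", "P", "Q"), ("not", "P"))))
```

Note that with the maximum in the binary case, |A| measures the nesting depth of connectives rather than the number of symbols.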

We write

$$\vdash^{n}_{k} \Gamma \Rightarrow \Delta$$

if there is a deduction of Γ ⇒ ∆ of height ≤ n such that all cuts in this deduction have cut formulas with length < k.

$\mathrm{I} \vdash^{n}_{k} \Gamma \Rightarrow \Delta$ is defined similarly.

Lemma: 2.13 For every formula A there is an intuitionistic deduction of A ⇒ A.

Proof: Exercise. □

We list some technical lemmata that will be useful for proving cut elimination.

Lemma: 2.14 (Substitution) Let Γ(a) and ∆(a) be sets of formulas with all occurrences of a indicated. Let s be an arbitrary term.

(i) If $\vdash^{n}_{k} \Gamma(a) \Rightarrow \Delta(a)$, then $\vdash^{n}_{k} \Gamma(s) \Rightarrow \Delta(s)$.

(ii) If $\mathrm{I} \vdash^{n}_{k} \Gamma(a) \Rightarrow \Delta(a)$, then $\mathrm{I} \vdash^{n}_{k} \Gamma(s) \Rightarrow \Delta(s)$.

Lemma: 2.15 (Weakening)

(i) If $\vdash^{n}_{k} \Gamma \Rightarrow \Delta$, then $\vdash^{n}_{k} \Gamma, \Gamma' \Rightarrow \Delta, \Delta'$.

(ii) If $\mathrm{I} \vdash^{n}_{k} \Gamma \Rightarrow \Delta$, then $\mathrm{I} \vdash^{n}_{k} \Gamma, \Gamma' \Rightarrow \Delta$.

Proof: Just add Γ′ and ∆′ to all sequents in the deduction. Formally one proves this by induction on n. In the cases of quantifier rules with eigenvariable conditions one might have to replace these variables by ‘fresh’ ones, using Lemma 2.14. □

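Lemma: 2.16 (Inversion)

(i) If $\vdash^{n}_{k} \Gamma, A \wedge B \Rightarrow \Delta$ then $\vdash^{n}_{k} \Gamma, A, B \Rightarrow \Delta$.

(ii) If $\vdash^{n}_{k} \Gamma \Rightarrow \Delta, A \wedge B$ then $\vdash^{n}_{k} \Gamma \Rightarrow \Delta, A$ and $\vdash^{n}_{k} \Gamma \Rightarrow \Delta, B$.

(iii) If $\vdash^{n}_{k} \Gamma, A \vee B \Rightarrow \Delta$ then $\vdash^{n}_{k} \Gamma, A \Rightarrow \Delta$ and $\vdash^{n}_{k} \Gamma, B \Rightarrow \Delta$.

(iv) If $\vdash^{n}_{k} \Gamma \Rightarrow \Delta, A \vee B$ then $\vdash^{n}_{k} \Gamma \Rightarrow \Delta, A, B$.

(v) If $\vdash^{n}_{k} \Gamma \Rightarrow A \to B, \Delta$ then $\vdash^{n}_{k} A, \Gamma \Rightarrow \Delta, B$.

(vi) If $\vdash^{n}_{k} \Gamma, A \to B \Rightarrow \Delta$ then $\vdash^{n}_{k} \Gamma \Rightarrow \Delta, A$ and $\vdash^{n}_{k} \Gamma, B \Rightarrow \Delta$.

(vii) If $\vdash^{n}_{k} \Gamma \Rightarrow \neg A, \Delta$ then $\vdash^{n}_{k} \Gamma, A \Rightarrow \Delta$.

(viii) If $\vdash^{n}_{k} \Gamma, \neg A \Rightarrow \Delta$ then $\vdash^{n}_{k} \Gamma \Rightarrow \Delta, A$.

(ix) If $\vdash^{n}_{k} \Gamma \Rightarrow \Delta, \forall x B(x)$ then $\vdash^{n}_{k} \Gamma \Rightarrow \Delta, B(s)$ for any term s.

(x) If $\vdash^{n}_{k} \Gamma, \exists x B(x) \Rightarrow \Delta$ then $\vdash^{n}_{k} \Gamma, B(s) \Rightarrow \Delta$ for any term s.

(xi) With the exception of (iv), (vi) and (viii) the above inversion properties remain valid for the intuitionistic sequent calculus. One half of (vi) also remains valid intuitionistically:

If $\mathrm{I} \vdash^{n}_{k} \Gamma, A \to B \Rightarrow \Delta$ then $\mathrm{I} \vdash^{n}_{k} \Gamma, B \Rightarrow \Delta$.

Proof: All are provable by easy inductions on n. □

We have laid the groundwork for cut elimination. Here is an example of how to eliminate cuts of a special form: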

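$$\dfrac{\dfrac{A, \Gamma \Rightarrow \Delta, B}{\Gamma \Rightarrow \Delta, A \to B}\ {\to}R \qquad \dfrac{\Lambda \Rightarrow \Theta, A \qquad B, \Xi \Rightarrow \Phi}{A \to B, \Lambda, \Xi \Rightarrow \Theta, \Phi}\ {\to}L}{\Gamma, \Lambda, \Xi \Rightarrow \Delta, \Theta, \Phi}\ \text{Cut}$$

is replaced by

$$\dfrac{\dfrac{\Lambda \Rightarrow \Theta, A \qquad A, \Gamma \Rightarrow \Delta, B}{\Lambda, \Gamma \Rightarrow \Theta, \Delta, B}\ \text{Cut} \qquad B, \Xi \Rightarrow \Phi}{\Gamma, \Lambda, \Xi \Rightarrow \Delta, \Theta, \Phi}\ \text{Cut}$$

So we have replaced a cut with cut formula A → B by cuts with formulas of smaller length. By doing this systematically we arrive at the Reduction Lemma. Well, actually it is not that easy when contractions are involved, i.e. when the principal formula of an inference is also a side formula:

$$\dfrac{\dfrac{A, \Gamma \Rightarrow \Delta, B, A \to B}{\Gamma \Rightarrow \Delta, A \to B}\ {\to}R \qquad \dfrac{\Lambda, A \to B \Rightarrow \Theta, A \qquad B, \Xi, A \to B \Rightarrow \Phi}{A \to B, \Lambda, \Xi \Rightarrow \Theta, \Phi}\ {\to}L}{\Gamma, \Lambda, \Xi \Rightarrow \Delta, \Theta, \Phi}\ \text{Cut}$$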

Lemma: 2.17 (Reduction) Suppose k ≤ |C|. If $\vdash^{n}_{k} \Gamma, C \Rightarrow \Delta$ and $\vdash^{m}_{k} \Xi \Rightarrow \Theta, C$, then

$$\vdash^{2(n+m)}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta.$$

Proof: Of course we could derive Γ, Ξ ⇒ ∆, Θ by an application of the cut rule, but the resulting derivation would have cut rank |C| + 1.

The proof is by induction on n + m. Let D1 be a derivation of Γ, C ⇒ ∆ with cut rank ≤ k and length ≤ n. Likewise let D2 be a derivation of Ξ ⇒ C, Θ with cut rank ≤ k and length ≤ m.

Case 1: Γ, C ⇒ ∆ is an axiom whose principal formula is not C, i.e., Γ = Γ′, A and ∆ = ∆′, A for some atom A. Then Γ, Ξ ⇒ ∆, Θ is an axiom too and the desired assertion follows.

Similarly, if Ξ ⇒ Θ, C is an axiom whose principal formula is different from C then Ξ ⇒ Θ is an axiom and so is Γ, Ξ ⇒ ∆, Θ.

Case 2: Both Γ, C ⇒ ∆ and Ξ ⇒ Θ, C are axioms with principal formula C. Then ∆ = ∆′, C and Ξ = Ξ′, C for some ∆′ and Ξ′. Hence Γ, Ξ ⇒ ∆, Θ is an axiom as well.

Henceforth we may assume that Γ, C ⇒ ∆ or Ξ ⇒ Θ, C is not an axiom. Hence at least one of the derivations ends with an inference which will be called its last inference.

Case 3: D1 ends with an inference whose principal formula is different from C. Then the premisses of the last inference are of the form

$$\Gamma_i, C \Rightarrow \Delta_i$$

and we have $\vdash^{n_i}_{k} \Gamma_i, C \Rightarrow \Delta_i$ where $n_i < n$. Since $n_i + m < n + m$ we can apply the induction hypothesis to the premisses and obtain

$$\vdash^{2(n_i+m)}_{|C|} \Gamma_i, \Xi \Rightarrow \Delta_i, \Theta.$$

By applying the same inference we get $\vdash^{2(n+m)}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta$. If the last inference comes with an eigenvariable condition it might be necessary to substitute a new variable. But by Lemma 2.14 this can be done without increasing length and cut rank of derivations.

Case 4: D2 ends with an inference whose principal formula is different from C. This is analogous to the previous case.

We may from now on assume that C is the principal formula of the last inference of both D1 and D2. In particular C is not an atom.

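Case 5: C is of the form A ∧ B. Then we have

$$\vdash^{n_1}_{k} \Gamma, C, A \Rightarrow \Delta \tag{1}$$

or

$$\vdash^{n_1}_{k} \Gamma, C, B \Rightarrow \Delta \tag{2}$$

as well as

$$\vdash^{m_1}_{k} \Xi \Rightarrow \Theta, C, A \tag{3}$$

and

$$\vdash^{m_2}_{k} \Xi \Rightarrow \Theta, C, B \tag{4}$$

for some $n_1 < n$ and $m_1, m_2 < m$. Note that C could have been a side formula of any of the last inferences of D1 and D2, and, moreover, that by weakening (Lemma 2.15) we can always add C as a side formula without increasing the length or the cut rank of the derivation.

If (1) obtains we apply the induction hypothesis with (1) and $\vdash^{m}_{k} \Xi \Rightarrow \Theta, C$ to arrive at

$$\vdash^{2(n_1+m)}_{|C|} \Gamma, \Xi, A \Rightarrow \Delta, \Theta. \tag{5}$$

Applying the Inversion Lemma 2.16 (ii) to (3) we have

$$\vdash^{m_1}_{k} \Xi \Rightarrow \Theta, A. \tag{6}$$

Cutting A out of (5) and (6) gives the desired

$$\vdash^{2(n+m)}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta$$

since |A| < |C|.

If (2) obtains we apply the induction hypothesis with (2) and $\vdash^{m}_{k} \Xi \Rightarrow \Theta, C$ to arrive at

$$\vdash^{2(n_1+m)}_{|C|} \Gamma, \Xi, B \Rightarrow \Delta, \Theta. \tag{7}$$

Applying the Inversion Lemma 2.16 (ii) to (4) we have

$$\vdash^{m_2}_{k} \Xi \Rightarrow \Theta, B. \tag{8}$$

Cutting B out of (7) and (8) gives the desired $\vdash^{2(n+m)}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta$.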

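Case 6: C is of the form ∀xA(x). Then we have

$$\vdash^{n_1}_{k} \Gamma, C, A(s) \Rightarrow \Delta \tag{9}$$

and

$$\vdash^{m_1}_{k} \Xi \Rightarrow \Theta, C, A(a) \tag{10}$$

for some $n_1 < n$ and $m_1 < m$ with a being an eigenvariable. Applying the induction hypothesis to (9) and $\vdash^{m}_{k} \Xi \Rightarrow \Theta, C$ we get

$$\vdash^{2(n_1+m)}_{|C|} \Gamma, \Xi, A(s) \Rightarrow \Delta, \Theta. \tag{11}$$

By applying first inversion (Lemma 2.16) to (10) and subsequently substitution (Lemma 2.14) (or the other way round) we get

$$\vdash^{m_1}_{k} \Xi \Rightarrow \Theta, A(s). \tag{12}$$

A cut performed on (11) and (12) yields $\vdash^{2(n+m)}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta$.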

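Case 7: C is of the form A → B. Then we have

$$\vdash^{n_1}_{k} \Gamma, C \Rightarrow \Delta, A \tag{13}$$

and

$$\vdash^{n_2}_{k} \Gamma, C, B \Rightarrow \Delta \tag{14}$$

as well as

$$\vdash^{m_1}_{k} \Xi, A \Rightarrow \Theta, C, B \tag{15}$$

for some $n_1, n_2 < n$ and $m_1 < m$.

(13) can be linked up with $\vdash^{m}_{k} \Xi \Rightarrow \Theta, C$ to furnish a pair to which we can apply the induction hypothesis. Whence we get

$$\vdash^{2(n_1+m)}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta, A. \tag{16}$$

Another pair to which we can apply the induction hypothesis is given by (15) and $\vdash^{n}_{k} \Gamma, C \Rightarrow \Delta$. Thus

$$\vdash^{2(n+m_1)}_{|C|} \Gamma, \Xi, A \Rightarrow \Delta, \Theta, B. \tag{17}$$

Applying a cut to (17) and (16) yields

$$\vdash^{\max(2(n+m_1),\,2(n_1+m))+1}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta, B. \tag{18}$$

Applying the Inversion Lemma 2.16 (vi) to (14) yields

$$\vdash^{n_2}_{k} \Gamma, B \Rightarrow \Delta. \tag{19}$$

Cutting out B from (18) and (19) we arrive at

$$\vdash^{\max(2(n+m_1),\,2(n_1+m))+2}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta. \tag{20}$$

As $\max(2(n+m_1), 2(n_1+m)) + 2 \le 2(n+m)$ we get the desired result from (20).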
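The final inequality of Case 7 can be checked in one line. The derivation below is a short sketch that uses only the assumptions $n_1 < n$ and $m_1 < m$ on natural numbers, i.e. $n_1 \le n - 1$ and $m_1 \le m - 1$:

```latex
\begin{align*}
2(n + m_1) + 2 &\le 2\bigl(n + (m - 1)\bigr) + 2 = 2(n + m), \\
2(n_1 + m) + 2 &\le 2\bigl((n - 1) + m\bigr) + 2 = 2(n + m),
\end{align*}
hence $\max\bigl(2(n + m_1),\, 2(n_1 + m)\bigr) + 2 \le 2(n + m)$.
```

The same estimate, with $+1$ in place of $+2$, covers the cases where a single cut is applied after the induction hypothesis.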

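Case 8: C is of the form A ∨ B. Then we have

$$\vdash^{n_1}_{k} \Gamma, C, A \Rightarrow \Delta \tag{21}$$

and

$$\vdash^{n_2}_{k} \Gamma, C, B \Rightarrow \Delta \tag{22}$$

and also

$$\vdash^{m_1}_{k} \Xi \Rightarrow \Theta, C, A \tag{23}$$

or

$$\vdash^{m_1}_{k} \Xi \Rightarrow \Theta, C, B \tag{24}$$

for some $n_1, n_2 < n$ and $m_1 < m$. To (21) and $\vdash^{m}_{k} \Xi \Rightarrow \Theta, C$ we apply the induction hypothesis to arrive at

$$\vdash^{2(n_1+m)}_{|C|} \Gamma, \Xi, A \Rightarrow \Delta, \Theta. \tag{25}$$

To (22) and $\vdash^{m}_{k} \Xi \Rightarrow \Theta, C$ we apply the induction hypothesis to arrive at

$$\vdash^{2(n_2+m)}_{|C|} \Gamma, \Xi, B \Rightarrow \Delta, \Theta. \tag{26}$$

From (23) as well as (24) we get

$$\vdash^{m_1}_{k} \Xi \Rightarrow \Theta, A, B \tag{27}$$

by the Inversion Lemma 2.16 (iv). Cutting A out of (25) and (27) yields

$$\vdash^{2(n_1+m)+1}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta, B. \tag{28}$$

Performing a cut on (26) and (28) gives

$$\vdash^{2(n+m)}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta.$$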

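Case 9: C is of the form ∃xA(x). Then we have

$$\vdash^{n_1}_{k} \Gamma, C, A(a) \Rightarrow \Delta \tag{29}$$

and

$$\vdash^{m_1}_{k} \Xi \Rightarrow \Theta, C, A(s) \tag{30}$$

for some $n_1 < n$ and $m_1 < m$ with a being an eigenvariable. Applying the induction hypothesis with (30) and $\vdash^{n}_{k} \Gamma, C \Rightarrow \Delta$ we get

$$\vdash^{2(n+m_1)}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta, A(s) . \tag{31}$$

By applying first inversion (Lemma 2.16) to (29) and subsequently substitution (Lemma 2.14) (or the other way round) we get

$$\vdash^{n_1}_{k} \Gamma, A(s) \Rightarrow \Delta . \tag{32}$$

A cut performed on (31) and (32) yields

$$\vdash^{2(n+m)}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta .$$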

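Case 10: C is of the form ¬A. Then we have

$$\vdash^{n_1}_{k} \Gamma, C \Rightarrow \Delta, A \tag{33}$$

and

$$\vdash^{m_1}_{k} \Xi, A \Rightarrow \Theta, C \tag{34}$$

for some $n_1 < n$ and $m_1 < m$. The induction hypothesis applies to (33) and $\vdash^{m}_{k} \Xi \Rightarrow \Theta, C$, furnishing

$$\vdash^{2(n_1+m)}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta, A . \tag{35}$$

Now apply the Inversion Lemma 2.16 (vii) to (34) to get

$$\vdash^{m_1}_{k} \Xi, A \Rightarrow \Theta . \tag{36}$$

Cutting out A from (35) and (36) we arrive at

$$\vdash^{2(n+m)}_{|C|} \Gamma, \Xi \Rightarrow \Delta, \Theta .$$

□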

Theorem: 2.18 (Cut Reduction) If $\vdash^{n}_{k+1} \Gamma \Rightarrow \Delta$ then $\vdash^{4^n}_{k} \Gamma \Rightarrow \Delta$.

Proof: We use induction on n. Suppose D is a derivation of Γ ⇒ ∆ with length ≤ n and cut rank ≤ k + 1. If Γ ⇒ ∆ is an axiom then we clearly get the desired result. So let's assume that Γ ⇒ ∆ is not an axiom. Then D has a last inference (I) with premisses $\Gamma_i \Rightarrow \Delta_i$. Suppose the inference was not a cut, or was a cut of degree < k. We then have $\vdash^{n_i}_{k+1} \Gamma_i \Rightarrow \Delta_i$ for some $n_i < n$. By the induction hypothesis we have $\vdash^{4^{n_i}}_{k} \Gamma_i \Rightarrow \Delta_i$. Applying the same inference (I) yields $\vdash^{4^n}_{k} \Gamma \Rightarrow \Delta$ since $4^{n_i} < 4^n$.

Now suppose the last inference was a cut with a cut formula C satisfying |C| = k. By the induction hypothesis we have

$$\vdash^{4^{n_1}}_{k} \Gamma, C \Rightarrow \Delta \quad\text{and}\quad \vdash^{4^{n_2}}_{k} \Gamma \Rightarrow \Delta, C$$

for some $n_1, n_2 < n$. We can then apply the Reduction Lemma 2.17 to these derivations and arrive at $\vdash^{2(4^{n_1}+4^{n_2})}_{k} \Gamma \Rightarrow \Delta$. Since $2(4^{n_1}+4^{n_2}) \le 4^n$ the desired conclusion follows. □
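The inequality used in the last step, 2(4^{n1} + 4^{n2}) ≤ 4^n for n1, n2 < n, can be spot-checked mechanically. The following Python sketch brute-forces it over a small range; the function name `reduction_bound_ok` is ours and purely illustrative, and the check is a sanity test rather than a proof.

```python
def reduction_bound_ok(n: int) -> bool:
    """Check 2 * (4**n1 + 4**n2) <= 4**n for all n1, n2 < n."""
    return all(
        2 * (4 ** n1 + 4 ** n2) <= 4 ** n
        for n1 in range(n)
        for n2 in range(n)
    )


# The bound is tight when n1 = n2 = n - 1, since 2 * 2 * 4**(n - 1) == 4**n.
checked = all(reduction_bound_ok(n) for n in range(1, 12))
```

The worst case n1 = n2 = n − 1 gives equality, which is why base 4 (rather than a smaller base) appears in the theorem.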

Corollary: 2.19 (Gentzen's Hauptsatz) Let $4^{m}_{0} = m$ and $4^{m}_{r+1} = 4^{4^{m}_{r}}$.

If $\vdash^{n}_{k} \Gamma \Rightarrow \Delta$ then $\vdash^{4^{n}_{k}}_{0} \Gamma \Rightarrow \Delta$.

As a result, there is a cut free derivation of Γ ⇒ ∆.

Proof: Just apply the previous result k times. Formally that is an induction on k. □
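To get a feeling for how fast the bound in Corollary 2.19 grows, here is a minimal Python sketch of the iterated exponential $4^{m}_{r}$. The name `four_tower` is an illustrative choice, not notation from these notes.

```python
def four_tower(m: int, r: int) -> int:
    """Return 4^m_r, where 4^m_0 = m and 4^m_{r+1} = 4 ** (4^m_r)."""
    value = m
    for _ in range(r):
        value = 4 ** value
    return value


# A derivation of length 2 and cut rank 2 is bounded by a cut free
# derivation of length four_tower(2, 2) = 4 ** 16.
bound = four_tower(2, 2)
```

Each eliminated level of cut rank costs one further exponential, so only very small arguments are feasible to evaluate.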

Definition: 2.20 For a formula A we define its set of subformulae, Subf(A), as follows: If A is an atom then Subf(A) = {A}. Subf(¬A) = Subf(A) ∪ {¬A}. Subf(A♦B) = Subf(A) ∪ Subf(B) ∪ {A♦B} if ♦ is one of the connectives ∧, ∨, →.

$$\mathrm{Subf}(QxF(x)) = \{QxF(x)\} \cup \bigcup_{s \in \mathrm{Term}} \mathrm{Subf}(F(s))$$

where Q is ∀ or ∃ and Term is the set of terms.

B is said to be a subformula of A if B ∈ Subf(A).
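For the quantifier free clauses, Definition 2.20 translates directly into a recursive function. The sketch below is in Python and rests on our own assumptions: formulas are encoded as nested tuples such as `("imp", A, B)`, and the quantifier clause is left out because it takes a union over the (infinite) set of all terms.

```python
def subf(formula):
    """Return the set of subformulae of a quantifier free formula.

    Assumed encoding: ("atom", name), ("not", A),
    ("and", A, B), ("or", A, B), ("imp", A, B).
    """
    tag = formula[0]
    if tag == "atom":
        return {formula}
    if tag == "not":
        return subf(formula[1]) | {formula}
    if tag in ("and", "or", "imp"):
        return subf(formula[1]) | subf(formula[2]) | {formula}
    raise ValueError(f"unsupported connective: {tag!r}")


p = ("atom", "p")
q = ("atom", "q")
example = ("imp", ("not", p), q)   # the formula (not p) -> q
parts = subf(example)              # p, q, not p, and the formula itself
```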

Corollary: 2.21 (The subformula property) The Hauptsatz 2.19 has an important corollary.

If a sequent Γ ⇒ ∆ is deducible, then it has a deduction such that every formula occurring in it is a subformula of some formula in Γ ∪ ∆.

Proof: Take a cut free proof of Γ ⇒ ∆. Every inference rule other than cut only has premisses built from subformulas of formulas in its conclusion, so the entire deduction is made of subformulas of formulas in Γ and ∆. □

Corollary: 2.22 A contradiction, i.e. the empty sequent, cannot be deduced.

Proof: The empty sequent cannot have a cut free deduction. What could have been the last inference? □

2.3 Cut elimination for the intuitionistic sequent calculus

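Lemma: 2.23 (Reduction) Suppose k ≤ |C|. If $\mathrm{I} \vdash^{n}_{k} \Gamma, C \Rightarrow \Delta$ and $\mathrm{I} \vdash^{m}_{k} \Xi \Rightarrow C$, then

$$\mathrm{I} \vdash^{2(n+m)}_{|C|} \Gamma, \Xi \Rightarrow \Delta .$$

Proof: The proof is similar to the classical case (Lemma 2.17). □

Corollary: 2.24 (Gentzen's Hauptsatz) Let $4^{m}_{0} = m$ and $4^{m}_{r+1} = 4^{4^{m}_{r}}$.

If $\mathrm{I} \vdash^{n}_{k} \Gamma \Rightarrow \Delta$ then $\mathrm{I} \vdash^{4^{n}_{k}}_{0} \Gamma \Rightarrow \Delta$.

As a result, there is a cut free intuitionistic derivation of Γ ⇒ ∆.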

3 Consequences of the Hauptsatz

Definition: 3.1 A formula is said to be existential if it is quantifier free or of the form $\exists x_1 \ldots \exists x_r\, B(x_1, \ldots, x_r)$ with $B(a_1, \ldots, a_r)$ quantifier free.

Note that a subformula of an existential formula is existential too.

Lemma: 3.2 Suppose that Γ consists of quantifier free formulae and ∆ consists entirely of existential formulae. Let ∃xC(x) be an existential formula. If

$$\vdash \Gamma \Rightarrow \Delta, \exists x\, C(x)$$

then there exist terms $t_1, \ldots, t_k$ such that

$$\vdash \Gamma \Rightarrow \Delta, C(t_1), \ldots, C(t_k) .$$

Proof: By the Hauptsatz we have a cut free deduction D of Γ ⇒ ∆, ∃xC(x). We proceed by induction on n = |D|. If n = 0 then Γ ⇒ ∆ is already an axiom. Now let n > 0. Then D ended with an inference. First suppose the last inference of D does not have ∃xC(x) as principal formula. Then its premisses are of the form $\Gamma_i \Rightarrow \Delta_i, \exists x\, C(x)$. Note that the formulae of $\Gamma_i$ must also be quantifier free and those in $\Delta_i$ must be existential too. Let's assume we have two premisses. Inductively we then have $\vdash \Gamma_i \Rightarrow \Delta_i, C(t_{i1}), \ldots, C(t_{i r_i})$ for some terms, and by applying weakening and the same inference we get

$$\vdash \Gamma \Rightarrow \Delta, C(t_{11}), \ldots, C(t_{1 r_1}), C(t_{21}), \ldots, C(t_{2 r_2}) .$$

If ∃xC(x) is the principal formula of the last inference of D then this must have been ∃R and its premiss is of the form Γ ⇒ ∆, ∃xC(x), C(t) for some term t. Inductively we have terms $t'_1, \ldots, t'_l$ such that $\vdash \Gamma \Rightarrow \Delta, C(t'_1), \ldots, C(t'_l), C(t)$ and we are done. □

We shall sometimes write ⊢ ∆ and I ⊢ ∆ for ⊢ ⇒ ∆ and I ⊢ ⇒ ∆, respectively.

Theorem: 3.3 (Herbrand's Theorem) If $A(\vec a, \vec b)$ is quantifier free and

$$\vdash \forall \vec x\, \exists \vec y\, A(\vec x, \vec y)$$

then there are finitely many term tuples $\vec t_1, \ldots, \vec t_n$, each of the same length as $\vec b$, whose free variables are among $\vec a$, such that

$$\vdash A(\vec a, \vec t_1) \lor \ldots \lor A(\vec a, \vec t_n) .$$

Proof: Using inversion 2.16 (ix) several times we have $\vdash \exists \vec y\, A(\vec a, \vec y)$. Now use Lemma 3.2 several times followed by several ∨R inferences. □

The intuitionistic case is much easier to prove.

Lemma: 3.4 If I ⊢ ∃y F(y) then I ⊢ F(t) for some term t.

Proof: We have $\mathrm{I} \vdash^{n}_{0} \exists y\, F(y)$ for some n. The last inference of the pertaining deduction must have been ∃R. Hence $\mathrm{I} \vdash^{n-1}_{0} F(t)$ for some term t, since in the intuitionistic case we cannot have side formulas in the succedent. □

Corollary: 3.5 If

$$\mathrm{I} \vdash \forall \vec x\, \exists \vec y\, A(\vec x, \vec y)$$

then there exists a term tuple $\vec t$ of the same length as $\vec b$ whose free variables are among $\vec a$ such that

$$\mathrm{I} \vdash A(\vec a, \vec t) .$$

Proof: Use ∀-inversion and apply the previous Lemma several times. □

Examples: 3.6 In the classical case we cannot always find a single term, as the following example demonstrates. Let L be a language that has two constants 0, 1 and two unary predicate symbols P and R. Then in classical logic we have

$$\vdash \exists y\, [(P(0) \to R(0)) \land (\lnot P(0) \to R(1)) \to R(y)]$$

but we cannot prove

$$(P(0) \to R(0)) \land (\lnot P(0) \to R(1)) \to R(t)$$

for any term t. (Exercise)
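Part of the exercise can be checked by truth tables. The Python sketch below treats P(0), R(0) and R(1) as three independent propositional atoms (an assumption that is justified for quantifier free formulas without equality) and only considers the closed terms 0 and 1; the function names are ours. It confirms that neither instance alone is a tautology, while their disjunction, as predicted by Herbrand's Theorem, is one.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    return (not a) or b


def instance(p0: bool, r0: bool, r1: bool, t: int) -> bool:
    """Truth value of (P(0)->R(0)) & (~P(0)->R(1)) -> R(t) for t in {0, 1}."""
    premise = implies(p0, r0) and implies(not p0, r1)
    conclusion = r0 if t == 0 else r1
    return implies(premise, conclusion)


def is_tautology(formula) -> bool:
    """Evaluate a 3-argument Boolean function under all 8 valuations."""
    return all(formula(*v) for v in product([False, True], repeat=3))


single_0 = is_tautology(lambda p0, r0, r1: instance(p0, r0, r1, 0))
single_1 = is_tautology(lambda p0, r0, r1: instance(p0, r0, r1, 1))
both = is_tautology(
    lambda p0, r0, r1: instance(p0, r0, r1, 0) or instance(p0, r0, r1, 1)
)
```

The case where t is a variable is not covered by this check and remains part of the exercise.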

Definition: 3.7 A theory T is a set of sentences, called its axioms. T is said to be universal (or open) if all of its axioms are of the form $\forall \vec x\, A(\vec x)$ with $A(\vec a)$ quantifier free.

If $\vec s$ is a tuple of terms (of the same length as $\vec x$) then $A(\vec s)$ will be called a substitution instance of $\forall \vec x\, A(\vec x)$.

Theorem: 3.8 (Hilbert-Ackermann Consistency Theorem) A universal theory T is inconsistent iff there is a tautology which is a disjunction of negations of substitution instances of the axioms of T. In other words, T is inconsistent iff there are substitution instances $B_1, \ldots, B_n$ of axioms of T such that

$$\vdash \lnot B_1 \lor \ldots \lor \lnot B_n .$$

Proof: Clearly if $\vdash \lnot B_1 \lor \ldots \lor \lnot B_n$ holds then T must be inconsistent since T proves each $B_i$. Conversely, if T is inconsistent then there are finitely many axioms $A_1, \ldots, A_n$ of T such that

$$\vdash A_1, \ldots, A_n \Rightarrow . \tag{37}$$

Each $A_i$ is of the form $\forall \vec x\, C_i(\vec x)$ with $C_i(\vec a)$ quantifier free. By applying ¬R to (37) n times we obtain

$$\vdash\ \Rightarrow \lnot A_1, \ldots, \lnot A_n . \tag{38}$$

Since $\vdash \lnot A_i \Rightarrow \exists \vec x\, \lnot C_i(\vec x)$ holds for all i we can apply n cuts to (38) to arrive at

$$\vdash\ \Rightarrow \exists \vec x\, \lnot C_1(\vec x), \ldots, \exists \vec x\, \lnot C_n(\vec x) . \tag{39}$$

Now apply Lemma 3.2 to (39) several times to get rid of the existential quantifiers and subsequently apply ∨R several times to get the desired result. □

Remark: 3.9 There are many examples of universal theories: the theory of equality, the theory of groups with a constant symbol for the neutral element and a function symbol for the inverse operation, the theory of linear orderings, and many equational theories.

Next we will turn to a richer class of theories, the so-called geometric theories.

Definition: 3.10 The geometric formulae are inductively defined as follows: Every atom is a geometric formula. If A and B are geometric formulae then so are A ∨ B, A ∧ B and ∃xA.

Another way of saying this is that a formula is geometric iff it does not contain any of the particles →, ¬, ∀.

A formula is called a geometric implication if it is of one of the forms $\forall \vec x\, A$, $\forall \vec x\, \lnot A$ or $\forall \vec x\, (A \to B)$ with A and B being geometric formulae. Here $\forall \vec x$ may be empty. In particular, geometric formulae and their negations are geometric implications.

A theory is geometric if all its axioms are geometric implications.

Examples: 3.11 (i) Robinson arithmetic. The language has a constant 0, a unary successor function suc and binary functions + and ·. Axioms are the equality axioms and the universal closures of the following.

1. ¬suc(a) = 0.

2. suc(a) = suc(b)→ a = b.

3. a = 0 ∨ ∃y a = suc(y).

4. a+ 0 = a.

5. a+ suc(b) = suc(a+ b).

6. a · 0 = 0.

7. a · suc(b) = a · b+ a

A classically equivalent axiomatization is obtained if (3) is replaced by

¬a = 0→ ∃y a = suc(y)

but this is not a geometric implication.

(ii) The theories of groups, rings, and local rings have geometric axiomatizations.

(iii) The theories of fields, ordered fields, algebraically closed fields and real closed fields have geometric axiomatizations.

To express algebraic closure replace the axioms

$$s \neq 0 \to \exists x\; s x^{n} + t_1 x^{n-1} + \ldots + t_{n-1} x + t_n = 0$$

by

$$s = 0 \lor \exists x\; s x^{n} + t_1 x^{n-1} + \ldots + t_{n-1} x + t_n = 0$$

where $s x^{k}$ is short for s · x · . . . · x with k many x.

(iv) The theory of projective geometry has a geometric axiomatization.

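We want to show that a geometric implication which is classically deducible in a geometric theory T is also intuitionistically deducible in T. We need some simple observations.

Lemma: 3.12 Let π : {1, . . . , n} → {1, . . . , n} be a bijection.

1. If $\mathrm{I} \vdash \Gamma \Rightarrow A_1 \lor \ldots \lor A_n$ then $\mathrm{I} \vdash \Gamma \Rightarrow A_{\pi(1)} \lor \ldots \lor A_{\pi(n)}$.

2. If $\mathrm{I} \vdash \Gamma \Rightarrow D \lor F(s)$ then $\mathrm{I} \vdash \Gamma \Rightarrow D \lor \exists x\, F(x)$.

3. If $\mathrm{I} \vdash \Gamma \Rightarrow D \lor B$ and $\mathrm{I} \vdash \Gamma \Rightarrow D \lor C$ then $\mathrm{I} \vdash \Gamma \Rightarrow D \lor (B \land C)$.

4. If $\mathrm{I} \vdash \Gamma \Rightarrow A \lor B$ then $\mathrm{I} \vdash \Gamma, \lnot A \Rightarrow B$.

5. If $\mathrm{I} \vdash \Gamma, B \Rightarrow C$ and $\mathrm{I} \vdash \Gamma \Rightarrow C \lor A$ then $\mathrm{I} \vdash \Gamma, A \to B \Rightarrow C$.

Proof: Exercise. □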

Lemma: 3.13 For a finite set of formulas ∆ = {A1, . . . , An} let ⋁∆ be the formula A1 ∨ . . . ∨ An. If ∆ is empty then ⋁∆ is the empty set.

Let Γ be a finite set of geometric implications and ∆ be a finite set of geometric formulas.

If ⊢ Γ ⇒ ∆ then I ⊢ Γ ⇒ ⋁∆.

Proof: Let D be a cut free deduction of Γ ⇒ ∆. The proof proceeds by induction on n = |D|. If Γ ⇒ ∆ is an axiom then there exists an atom A such that A ∈ Γ ∩ ∆. If ∆ has no other formulae we are done. If there are other formulae in ∆, say D1, . . . , Dk, then apply ∨R k times to arrive at I ⊢ Γ ⇒ ⋁∆.

Let n > 0. We inspect the last inference of D. Note that ∀R, ¬R and →R are ruled out since they have non-geometric principal formulas. If the last inference was ∀L, ∃L, ∧L, or ∨L we can simply apply the induction hypothesis to the premisses and re-apply the same inference.

If the last inference was ∃R apply the induction hypothesis to its premiss and subsequently use Lemma 3.12 (2) to get the desired result.

If the last inference was ∧R apply the induction hypothesis to its premisses and subsequently use Lemma 3.12 (3).

If the last inference was ¬L then its minor formula must be geometric. Then apply the induction hypothesis to its premiss and subsequently use Lemma 3.12 (4).

If the last inference was →L then apply the induction hypothesis to its premisses and subsequently use Lemma 3.12 (5). □

Theorem: 3.14 Let T be a geometric theory and suppose that there is a classical proof of a geometric implication G in T. Then there is an intuitionistic proof of G from the axioms of T.

Proof: G is of the form $\forall \vec x\, F(\vec x)$ where $F(\vec a)$ is a geometric formula or the negation of a geometric formula or an implication of two geometric formulae.

We have

$$\vdash A_1, \ldots, A_k \Rightarrow G$$

for some axioms $A_1, \ldots, A_k$ of T. Using the Inversion Lemma 2.16 (ix) we get

$$\vdash A_1, \ldots, A_k \Rightarrow F(\vec a) .$$

If $F(\vec a)$ is geometric we obtain $\mathrm{I} \vdash A_1, \ldots, A_k \Rightarrow F(\vec a)$ by Lemma 3.13, so that via (several) ∀R inferences we arrive at the desired result.

If $F(\vec a)$ is of the form $\lnot F_0(\vec a)$ with $F_0(\vec a)$ geometric we apply the Inversion Lemma 2.16 (vii) to get

$$\vdash A_1, \ldots, A_k, F_0(\vec a) \Rightarrow .$$

By Lemma 3.13 we infer that $\mathrm{I} \vdash A_1, \ldots, A_k, F_0(\vec a) \Rightarrow$ and thus, by ¬R, we have

$$\mathrm{I} \vdash A_1, \ldots, A_k \Rightarrow \lnot F_0(\vec a)$$

so that via ∀R we arrive at $\mathrm{I} \vdash A_1, \ldots, A_k \Rightarrow \forall \vec x\, \lnot F_0(\vec x)$.

If $F(\vec a)$ is of the form $F_0(\vec a) \to F_1(\vec a)$ with $F_i(\vec a)$ geometric we apply the Inversion Lemma 2.16 (v) to get

$$\vdash A_1, \ldots, A_k, F_0(\vec a) \Rightarrow F_1(\vec a) .$$

By Lemma 3.13 we infer that $\mathrm{I} \vdash A_1, \ldots, A_k, F_0(\vec a) \Rightarrow F_1(\vec a)$. Via →R we get $\mathrm{I} \vdash A_1, \ldots, A_k \Rightarrow F_0(\vec a) \to F_1(\vec a)$ and via ∀R we arrive at

$$\mathrm{I} \vdash A_1, \ldots, A_k \Rightarrow \forall \vec x\, (F_0(\vec x) \to F_1(\vec x)) .$$

□

The previous result can be extended to infinitary languages which accommodate infinite disjunctions ⋁Φ and conjunctions ⋀Φ, where Φ is a set of (infinitary) formulas such that the total number of variables (free and bound) occurring in the formulas of Φ is finite. In this richer language a formula is said to be coherent if in addition to ∨, ∧, ∃ one also allows infinite disjunctions ⋁Φ, where Φ is already a set of coherent formulas satisfying the above proviso on the number of variables. Then a theorem similar to 3.14 can be shown for coherent theories, that is, theories axiomatized by coherent implications.

An example of an axiom expressible in this richer language via a coherent implication is the Archimedean axiom:

$$\forall x\, (x < 1 \lor x < 1 + 1 \lor \ldots \lor x < 1 + \ldots + 1 \lor \ldots)$$

or in a more compact way:

$$\forall x \bigvee_{n \in \mathbb{N}} x < n .$$

Geometric theories are quite ubiquitous. There exists a simple method, sometimes called Morleyisation (in honour of the logician Michael Morley), by which every theory can be given a geometric axiomatization in a richer language. The technique actually goes back to Skolem. Although "Skolemization" would be a more appropriate name, it is already used for something else. Wilfrid Hodges called the procedure of finding a ∀∃ axiomatization in a richer language atomization.

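Definition: 3.15 Below $\forall \vec x\, (A_1(\vec x) \leftrightarrow A_2(\vec x))$ will stand for two formulas, namely $\forall \vec x\, (A_1(\vec x) \to A_2(\vec x))$ and $\forall \vec x\, (A_2(\vec x) \to A_1(\vec x))$.

Let T be a theory in a first order language L. For each formula $A(a_1, \ldots, a_n)$ of L with all free variables indicated we add two new n-ary relation symbols $P_{A(\vec a)}$ and $N_{A(\vec a)}$ to the language, where $\vec a = a_1, \ldots, a_n$. Call the new language $L^a$. The theory $T^a$ in the language $L^a$ has the following axioms:

1. $\forall \vec x\, \lnot (P_{A(\vec a)}(\vec x) \land N_{A(\vec a)}(\vec x))$.

2. $\forall \vec x\, (P_{A(\vec a)}(\vec x) \lor N_{A(\vec a)}(\vec x))$.

3. If $A(\vec a)$ is atomic add the axioms $\forall \vec x\, (P_{A(\vec a)}(\vec x) \leftrightarrow A(\vec x))$.

4. If $A(\vec a)$ is $B(\vec a) \land C(\vec a)$ add $\forall \vec x\, (P_{A(\vec a)}(\vec x) \leftrightarrow P_{B(\vec a)}(\vec x) \land P_{C(\vec a)}(\vec x))$.

5. If $A(\vec a)$ is $B(\vec a) \lor C(\vec a)$ add $\forall \vec x\, (P_{A(\vec a)}(\vec x) \leftrightarrow P_{B(\vec a)}(\vec x) \lor P_{C(\vec a)}(\vec x))$.

6. If $A(\vec a)$ is $\lnot B(\vec a)$ add $\forall \vec x\, (P_{A(\vec a)}(\vec x) \leftrightarrow N_{B(\vec a)}(\vec x))$.

7. If $A(\vec a)$ is $B(\vec a) \to C(\vec a)$ add $\forall \vec x\, (P_{A(\vec a)}(\vec x) \leftrightarrow N_{B(\vec a)}(\vec x) \lor P_{C(\vec a)}(\vec x))$.

8. If $A(\vec a)$ is $\exists y\, B(\vec a, y)$ add $\forall \vec x\, (P_{A(\vec a)}(\vec x) \leftrightarrow \exists y\, P_{B(\vec a, b)}(\vec x, y))$.

9. If $A(\vec a)$ is $\forall y\, B(\vec a, y)$ add $\forall \vec x\, (N_{A(\vec a)}(\vec x) \leftrightarrow \exists y\, N_{B(\vec a, b)}(\vec x, y))$.

10. Finally, for each axiom $\forall \vec x\, A(\vec x)$ of T add $\forall \vec x\, P_{A(\vec a)}(\vec x)$ as an axiom to $T^a$.

Clearly $T^a$ is a geometric theory.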

Theorem: 3.16 Let T and $T^a$ be as above.

(i) For every formula $A(\vec a)$ of L with all free variables indicated,

$$T^a \vdash \forall \vec x\, [A(\vec x) \leftrightarrow P_{A(\vec a)}(\vec x)] .$$

(ii) Every model A of T can be expanded in just one way to an $L^a$-structure $A^a$ which is a model of $T^a$.

(iii) $T^a$ is conservative over T, that is, for every L-sentence B,

$$T \vdash B \ \text{ iff } \ T^a \vdash B .$$

Proof: Exercise. □

4 Ordinal functions and representation systems

The strength of appropriate theories can be aptly measured via transfinite ordinals. To be able to denote these ordinals and have a sufficient supply of them we shall go beyond the operations of addition, multiplication and exponentiation on ordinals and study a hierarchy of functions introduced by O. Veblen in 1908. In what follows we will work informally in a sufficiently strong classical set theory, e.g. ZF. Lower case Greek letters α, β, γ, δ, . . . will be assumed to range over the class of ordinals ON. 0 is the smallest ordinal. Every ordinal α has a successor which we denote by α + 1, i.e., α + 1 is the smallest ordinal that is bigger than α. An ordinal of the form α + 1 is a successor ordinal or just a successor. A limit ordinal or just a limit is an ordinal which is not a successor and > 0. We denote the ordering of ordinals by < and the less-than-or-equal relation by ≤.

As per usual we identify an ordinal α with the set {β | β < α}. In set theory an ordinal is defined to be a transitive set whose elements are transitive too. Moreover, < on ordinals coincides with ∈ and thus α = {β | β ∈ α}.

Some crucial properties about ordinals that we shall assume are the following.

Postulates: 4.1 (Ordinals)

(O1) < is a total linear ordering on ON, i.e. α ≮ α and α < β ∨ β < α ∨ α = β hold for all α and β.

(O2) Every non-empty class X of ordinals contains a least element (necessarily unique), i.e., there exists $\alpha_0 \in X$ such that for all α ∈ X, $\alpha_0 \le \alpha$. This ordinal will be denoted by min X.

(O3) Whenever X is a set and f : X → ON is a function then there exists ξ ∈ ON such that f(u) < ξ for all u ∈ X.

In future I shall not explicitly mention the above postulates, but note that (O2) is equivalent to the principle of transfinite induction on ON:

∀α (∀ξ < α ξ ∈ X → α ∈ X) → ON ⊆ X .

Definition: 4.2 Let N be the smallest set of ordinals which contains 0 and with α also contains α + 1. Then all the ordinals in N different from 0 are successor ordinals. The first ordinal that does not belong to N is the least limit ordinal, denoted by ω.

Definition: 4.3 A class U ⊆ ON is said to be an initial segment or just a segment if ∀α ∈ U ∀β < α β ∈ U.

Any segment is either an ordinal α (i.e. the set of ordinals < α) or the class of ordinals ON.

Let X, Y ⊆ ON and f : X → Y be a function. For V ⊆ X let f[V] = {f(α) | α ∈ V}.

f is strictly increasing or order preserving if

∀α, β ∈ X (α < β → f(α) < f(β)).

f is said to be an enumeration function of Y or listing function of Y or ordering function of Y if f is strictly increasing, f[X] = Y and X is a segment.

Given a set U ⊆ ON we denote by sup U the smallest ordinal ξ such that ∀α ∈ U α ≤ ξ.

Lemma: 4.4 Let X be a segment of ON and f : X → ON be order preserving. Then α ≤ f(α) holds for all α ∈ X.

Proof: Use transfinite induction on α. □

Lemma: 4.5 Every Y ⊆ ON has a unique enumeration function $\mathrm{Enum}_Y$.

Proof: Existence. Define the collapsing function $C_Y : Y \to ON$ by

$$C_Y(\alpha) = \{ C_Y(\xi) \mid \xi \in Y \land \xi < \alpha \} .$$

Then $C_Y$ is 1–1 and $X := C_Y[Y]$ is a segment. Now let

$$\mathrm{Enum}_Y := (C_Y)^{-1} .$$

Here is another way of defining $\mathrm{Enum}_Y$: Let c be a set which is not an ordinal, e.g. c = {1}, where 1 = {0}. Define F : ON → ON ∪ {c} by transfinite recursion via

$$F(\alpha) = \begin{cases} \min(Y \setminus \{F(\beta) \mid \beta < \alpha\}) & \text{if } Y \setminus \{F(\beta) \mid \beta < \alpha\} \neq \emptyset \\ c & \text{otherwise.} \end{cases}$$

Then let X := {α | F(α) ∈ Y} and $\mathrm{Enum}_Y(\alpha) = F(\alpha)$ for α ∈ X.

The proof that any of the above provides indeed an enumeration function for Y is left to the reader.

Uniqueness. Let f : X → Y and g : X′ → Y both be ordering functions of Y. Then X ⊆ X′ or X′ ⊆ X since both are segments. In the first case show by induction on α ∈ X that f(α) = g(α). But since f[X] = Y and g is 1–1 this implies X = X′.

The argument in the case X′ ⊆ X is of course analogous. □

Definition: 4.6 Let X ⊆ ON. X is unbounded if for all α there exists γ ∈ X such that γ > α.

X is closed if sup U ∈ X whenever U is a non-empty subset of X.

We use the phrase X is club or a club to convey that X is closed and unbounded.

A function f : ON → ON is continuous if f(sup U) = sup f[U] for all non-empty sets of ordinals U.

f : ON → ON is a normal function if f is order preserving and continuous.

Lemma: 4.7 Let Y ⊆ ON. $\mathrm{Enum}_Y$ is a normal function iff Y is closed and unbounded (Y is club).

Proof: Set $f := \mathrm{Enum}_Y$. First suppose that f is normal. As dom(f) = ON, Y must be unbounded. Let V ⊆ Y be a non-empty set. Let $U = f^{-1}[V] = \{\xi \mid f(\xi) \in V\}$. Since f is continuous we have sup V = sup f[U] = f(sup U) ∈ Y.

Conversely assume that Y is unbounded. Then the domain of $\mathrm{Enum}_Y$ must be ON. If Y is closed and U ≠ ∅ is a set of ordinals we have sup f[U] ∈ Y, hence sup f[U] = f(α) for some α. Clearly, ξ ≤ α holds for all ξ ∈ U, and hence sup U ≤ α. On the other hand, if δ < α then f(δ) < f(ξ) for some ξ ∈ U, and hence δ < sup U. As a result, sup U = α, thus sup f[U] = f(sup U). □

Definition: 4.8 Let $ON_{\ge\alpha} := \{\delta \mid \delta \ge \alpha\}$. Define the ordinal sum α + ξ by

$$\alpha + \xi := \mathrm{Enum}_{ON_{\ge\alpha}}(\xi) .$$

Since $ON_{\ge\alpha}$ is obviously a club, the function ξ ↦ α + ξ is a normal function by Lemma 4.7.

Lemma: 4.9 The following properties hold for ordinal addition:

1. α + 0 = α.

2. α + (ξ + 1) = (α + ξ) + 1.

3. $\alpha + \lambda = \sup_{\xi<\lambda}(\alpha + \xi)$ for limits λ.

4. ξ < η implies α + ξ < α + η.

5. α ≤ α + ξ and ξ ≤ α + ξ.

6. α + (β + γ) = (α + β) + γ.

Proof: (1) to (5) are straightforward consequences of ξ ↦ α + ξ being an enumeration function. (6) is proved by induction on γ. If γ = 0 this follows from (1). If $\gamma = \gamma_0 + 1$, then

$$\alpha + (\beta + \gamma) \overset{(2)}{=} \alpha + ((\beta + \gamma_0) + 1) \overset{(2)}{=} (\alpha + (\beta + \gamma_0)) + 1 \overset{\text{i.h.}}{=} ((\alpha + \beta) + \gamma_0) + 1 \overset{(2)}{=} (\alpha + \beta) + \gamma .$$

If γ is a limit then

$$(\alpha + \beta) + \gamma = \sup_{\xi<\gamma}((\alpha + \beta) + \xi) \overset{\text{i.h.}}{=} \sup_{\xi<\gamma}(\alpha + (\beta + \xi)) \le \alpha + (\beta + \gamma) .$$

Suppose ζ < α + (β + γ). Then ζ < α or $\zeta = \alpha + \zeta_0$ for some $\zeta_0 < \beta + \gamma$. In the latter case $\zeta_0 < \beta$ or $\zeta_0 = \beta + \xi$ for some ξ < γ. Thus in every case we have $\zeta < \sup_{\xi<\gamma}(\alpha + (\beta + \xi))$, showing that $\alpha + (\beta + \gamma) \le \sup_{\xi<\gamma}(\alpha + (\beta + \xi))$. □

Definition: 4.10 We say that an ordinal α > 0 is an additive principal number or additively indecomposable if ξ, η < α implies ξ + η < α.

The class of additive principal numbers we denote by AP.

Lemma: 4.11 1. Let α > 0. α ∉ AP iff there exist η, ξ < α such that η + ξ = α.

2. 1 is the smallest additive principal number and ω is the next one. Additive principal numbers > 1 are limit ordinals.

3. Every infinite cardinal is in AP.

4. AP is a club.

Proof: (1) Assume α ∉ AP. Then α ≤ ξ + δ for some ξ, δ < α. Since $\alpha \in ON_{\ge\xi}$ there exists η such that α = ξ + η. Hence η ≤ δ < α.

Conversely if α = ξ + η for some ξ, η < α then α ∉ AP.

(2) is obvious.

(3) Clearly ω ∈ AP. Let ρ be an infinite cardinal > ω. Note that if ξ, η < ρ then the cardinalities of ξ and η are smaller than ρ and the cardinality of ξ + η is not bigger than the maximum of the cardinalities of ξ, η, ω, and hence < ρ.

(4) To show unboundedness, take any α and define $\alpha_0 = \alpha + 1$ and $\alpha_{n+1} = \alpha_n + \alpha_n$. Let $\beta := \sup\{\alpha_n \mid n \in N\}$. Since $\alpha_n > 0$ we have $\alpha_n < \alpha_n + \alpha_n = \alpha_{n+1}$. Clearly, α < β.

If ξ, η < β then $\xi, \eta < \alpha_n$ for some n, and hence $\xi + \eta < \alpha_n + \alpha_n = \alpha_{n+1} < \beta$. Thus β ∈ AP.

As for closure, let U ⊆ AP be a non-empty set. Let α = sup U. If ξ, η < α then ξ < ξ′ and η < η′ for some ξ′, η′ ∈ U. Hence ξ + η < max(ξ′, η′) ≤ α. □

Definition: 4.12 Let $\omega^{\alpha} := \mathrm{Enum}_{AP}(\alpha)$.

Lemma: 4.13 1. $\omega^{0} = 1$ and $\omega^{1} = \omega$.

2. $\omega^{\lambda} = \sup_{\xi<\lambda} \omega^{\xi}$ for limits λ.

3. If α < β then $\omega^{\alpha} < \omega^{\beta}$.

Proof: Obvious. □

Lemma: 4.14 Let α > 0. Then α ∈ AP iff for all ξ < α, ξ + α = α.

Proof: This is true for α = 1.

Let α ∈ AP and α > 1. Then α is a limit and hence $\xi + \alpha = \sup_{\delta<\alpha}(\xi + \delta) \le \alpha$. On the other hand, ξ + α ≥ α.

Conversely assume ξ + α = α for all ξ < α. Then if ξ, η < α we have ξ + η < ξ + α = α, whence α ∈ AP. □

Definition: 4.15 We write $\alpha =_{NF} \alpha_1 + \ldots + \alpha_n$ if $\alpha = \alpha_1 + \ldots + \alpha_n$, $\alpha_1, \ldots, \alpha_n \in AP$ and $\alpha_1 \ge \ldots \ge \alpha_n$.

Theorem: 4.16 (Cantor's normal form, Cantor 1897) For every α > 0 there are uniquely determined $\alpha_1, \ldots, \alpha_n \in AP$ such that

$$\alpha =_{NF} \alpha_1 + \ldots + \alpha_n .$$

Proof: We prove the existence by induction on α. If α ∈ AP, then $\alpha =_{NF} \alpha$. If α ∉ AP then by Lemma 4.11 there exist 0 < η, ξ < α such that η + ξ = α. By the inductive assumption we have $\eta =_{NF} \eta_1 + \ldots + \eta_m$ and $\xi =_{NF} \xi_1 + \ldots + \xi_n$ for some $\eta_1, \ldots, \eta_m, \xi_1, \ldots, \xi_n \in AP$. As a result,

$$\alpha =_{NF} \eta_1 + \ldots + \eta_j + \xi_1 + \ldots + \xi_n$$

where j is the largest index such that $\eta_j \ge \xi_1$. Note that there exists such a j since $\eta_1 \ge \xi_1$, for otherwise we would have η + ξ = ξ = α, contradicting ξ < α.

To show uniqueness assume $\alpha =_{NF} \alpha_1 + \ldots + \alpha_m$ and $\alpha =_{NF} \alpha^*_1 + \ldots + \alpha^*_n$. We show m = n and $\alpha_i = \alpha^*_i$ by induction on m. As $\alpha_1 < \alpha^*_1$ would entail $\alpha_1 + \ldots + \alpha_m < \alpha^*_1$ we have $\alpha_1 \ge \alpha^*_1$. Thus, by symmetry, $\alpha_1 = \alpha^*_1$. Hence m = n = 1 or $\alpha_2 + \ldots + \alpha_m = \alpha^*_2 + \ldots + \alpha^*_n$. In the latter case the induction hypothesis tells us that m = n and $\alpha_i = \alpha^*_i$ for 2 ≤ i ≤ m. □

Corollary: 4.17 Let $\alpha =_{NF} \alpha_1 + \ldots + \alpha_m$ and $\beta =_{NF} \beta_1 + \ldots + \beta_n$. Then α < β iff one of the following holds:

(i) m < n and $\alpha_i = \beta_i$ for all i ≤ m;

(ii) there exists j ≤ min(m, n) such that $\alpha_j < \beta_j$ and $\alpha_i = \beta_i$ holds for all 1 ≤ i < j.

Proof: Obvious. □
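Corollary 4.17 and the existence part of Theorem 4.16 are effectively algorithms on normal forms. The Python sketch below makes this concrete for ordinals below ε0, under an encoding that is our own assumption: an ordinal $\omega^{\gamma_1} + \ldots + \omega^{\gamma_n}$ in normal form is stored as the tuple of its exponents $(\gamma_1, \ldots, \gamma_n)$, each exponent being encoded the same way, and 0 is the empty tuple. The names `cmp_ord` and `add_ord` are illustrative.

```python
ZERO = ()                 # the ordinal 0 (empty sum)
ONE = (ZERO,)             # omega^0 = 1
OMEGA = (ONE,)            # omega^1


def cmp_ord(a, b) -> int:
    """Return -1, 0 or 1 as a < b, a = b or a > b (Corollary 4.17).

    Summands omega^x and omega^y are compared via their exponents,
    since xi -> omega^xi is strictly increasing (Lemma 4.13).
    """
    for x, y in zip(a, b):
        c = cmp_ord(x, y)
        if c != 0:
            return c          # clause (ii): first differing summand decides
    return (len(a) > len(b)) - (len(a) < len(b))   # clause (i): proper prefix


def add_ord(a, b):
    """Return the normal form of a + b (existence proof of Theorem 4.16)."""
    if not b:
        return a
    # Summands of a smaller than the leading summand of b are absorbed.
    kept = tuple(x for x in a if cmp_ord(x, b[0]) >= 0)
    return kept + b


omega_plus_one = add_ord(OMEGA, ONE)    # omega + 1
one_plus_omega = add_ord(ONE, OMEGA)    # 1 + omega, which collapses to omega
```

The example at the end illustrates that ordinal addition is not commutative: ω + 1 is strictly larger than ω, while 1 + ω equals ω.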

Definition: 4.18 We define ordinal multiplication and exponentiation as follows:

α · 0 = 0

α · (β + 1) = α · β + α

α · λ = sup{α · ξ | ξ < λ} when λ is a limit.

$\alpha^{0} = 1$

$\alpha^{\beta+1} = \alpha^{\beta} \cdot \alpha$

$\alpha^{\lambda} = \sup\{\alpha^{\xi} \mid \xi < \lambda\}$ when λ is a limit.

Note that on account of Lemma 4.14, Definitions 4.12 and 4.18 give rise to the same function $\xi \mapsto \omega^{\xi}$.

Lemma: 4.19 1. α < β and γ > 0 iff γ · α < γ · β.

2. If α ≤ β then α · γ ≤ β · γ.

3. α · (β + γ) = α · β + α · γ.

4. (α · β) · γ = α · (β · γ).

5. $\omega^{\alpha+1} = \omega^{\alpha} \cdot \omega$.

6. $\omega^{\alpha+\beta} = \omega^{\alpha} \cdot \omega^{\beta}$.

Proof: Exercises. □

4.1 Veblen’s functions

Definition: 4.20 For f : ON → ON define

Fix(f) := {α | f(α) = α};

$f' := \mathrm{Enum}_{\mathrm{Fix}(f)}$.

Veblen called f′ the derivative of f.

Lemma: 4.21 (i) If f : ON → ON is normal then Fix(f) is a club and f′ is a normal function, too.

(ii) Let ρ > 0. If $X_\xi$ is a sequence of clubs for ξ < ρ then $\bigcap_{\xi<\rho} X_\xi$ is also a club.

Proof: (i) By Lemma 4.7 it suffices to show that Fix(f) is a club. For unboundedness let α be arbitrary and define $\alpha_0 = \alpha + 1$, $\alpha_{n+1} = f(\alpha_n)$ and $\alpha^* = \sup\{\alpha_n \mid n \in \omega\}$. Then $\alpha^* > \alpha$ and

$$f(\alpha^*) = \sup\{f(\alpha_n) \mid n \in \omega\} = \sup\{\alpha_{n+1} \mid n \in \omega\} = \alpha^*$$

whence $\alpha^* \in \mathrm{Fix}(f)$.

For closure assume U ⊆ Fix(f) is a non-empty set. Then f(sup U) = sup f[U] = sup U since f is continuous and f[U] = U since U consists of fixed points of f. Thus sup U ∈ Fix(f).

(ii) Closure is obvious as each class $X_\xi$ is closed.

For unboundedness let α be arbitrary. Recursively define $\alpha_n$ and $\alpha^{\xi}_n$ for ξ < ρ as follows. Set $\alpha_0 = \alpha + 1$. For ξ < ρ choose $\alpha^{\xi}_n$ in such a way that $\alpha^{\xi}_n \in X_\xi$ and $\alpha^{\xi}_n > \alpha_n$. Let $\alpha_{n+1} = \sup_{\xi<\rho} \alpha^{\xi}_n$. Put $\alpha^{+} = \sup_k \alpha_k$. Then we have

$$\alpha_n < \alpha^{\xi}_n \le \alpha_{n+1}$$

and hence $\alpha^{+} = \sup_k \alpha^{\xi}_k$ for all ξ < ρ. Whence $\alpha^{+} \in X_\xi$ for all ξ < ρ. □

Definition: 4.22 (Veblen 1908) Define

Cr(0) = AP;

$Cr(\alpha + 1) = \mathrm{Fix}(\varphi_\alpha)$;

$Cr(\lambda) = \bigcap_{\xi<\lambda} Cr(\xi)$ if λ is a limit;

$\varphi_\alpha = \mathrm{Enum}_{Cr(\alpha)}$.

Corollary: 4.23 For every α, Cr(α) is a club and $\varphi_\alpha$ is a normal function.

Proof: Lemma 4.11 and Lemma 4.21. □

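Lemma: 4.24 1. If α ≤ β then Cr(β) ⊆ Cr(α).

2. $\varphi_0(\alpha) = \omega^{\alpha}$.

3. $\varphi_\alpha$ is strictly increasing.

4. $\beta \le \varphi_\alpha(\beta)$.

5. If α < β then Cr(β) is a proper subclass of Cr(α), $\varphi_\alpha(\gamma) \le \varphi_\beta(\gamma)$, and $\varphi_\alpha(\varphi_\beta(\gamma)) = \varphi_\beta(\gamma)$.

Proof: (1) follows readily by induction on β. (2) and (3) are immediate and (4) follows from Lemma 4.4.

As to (5) note that $\varphi_\beta(\gamma) \in Cr(\alpha + 1)$ by (1) and hence $\varphi_\alpha(\varphi_\beta(\gamma)) = \varphi_\beta(\gamma)$. As $\varphi_\alpha(0) < \varphi_\alpha(\varphi_\beta(0)) = \varphi_\beta(0)$ it follows that $\varphi_\alpha(0) \notin Cr(\beta)$ and hence Cr(β) is a proper subclass of Cr(α). □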

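Theorem: 4.25 (ϕ-comparison) (i) $\varphi_{\alpha_1}(\beta_1) = \varphi_{\alpha_2}(\beta_2)$ holds iff one of the following conditions is satisfied:

1. $\alpha_1 < \alpha_2$ and $\beta_1 = \varphi_{\alpha_2}(\beta_2)$

2. $\alpha_1 = \alpha_2$ and $\beta_1 = \beta_2$

3. $\alpha_2 < \alpha_1$ and $\varphi_{\alpha_1}(\beta_1) = \beta_2$.

(ii) $\varphi_{\alpha_1}(\beta_1) < \varphi_{\alpha_2}(\beta_2)$ holds iff one of the following conditions is satisfied:

1. $\alpha_1 < \alpha_2$ and $\beta_1 < \varphi_{\alpha_2}(\beta_2)$

2. $\alpha_1 = \alpha_2$ and $\beta_1 < \beta_2$

3. $\alpha_2 < \alpha_1$ and $\varphi_{\alpha_1}(\beta_1) < \beta_2$.

Proof: We prove (i) and (ii) simultaneously.

Case 1: $\alpha_1 < \alpha_2$. Then $\varphi_{\alpha_1}(\varphi_{\alpha_2}(\beta_2)) = \varphi_{\alpha_2}(\beta_2)$ and hence

$\varphi_{\alpha_1}(\beta_1) = \varphi_{\alpha_2}(\beta_2)$ iff $\beta_1 = \varphi_{\alpha_2}(\beta_2)$;

$\varphi_{\alpha_1}(\beta_1) < \varphi_{\alpha_2}(\beta_2)$ iff $\beta_1 < \varphi_{\alpha_2}(\beta_2)$.

Case 2: $\alpha_1 = \alpha_2$. Then

$\varphi_{\alpha_1}(\beta_1) = \varphi_{\alpha_2}(\beta_2)$ iff $\beta_1 = \beta_2$;

$\varphi_{\alpha_1}(\beta_1) < \varphi_{\alpha_2}(\beta_2)$ iff $\beta_1 < \beta_2$.

Case 3: $\alpha_1 > \alpha_2$. Then $\varphi_{\alpha_2}(\varphi_{\alpha_1}(\beta_1)) = \varphi_{\alpha_1}(\beta_1)$ and hence

$\varphi_{\alpha_1}(\beta_1) = \varphi_{\alpha_2}(\beta_2)$ iff $\beta_2 = \varphi_{\alpha_1}(\beta_1)$;

$\varphi_{\alpha_1}(\beta_1) < \varphi_{\alpha_2}(\beta_2)$ iff $\varphi_{\alpha_1}(\beta_1) < \beta_2$.

□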
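Theorem 4.25 reduces every comparison of two ϕ-values to comparisons of smaller expressions, so together with Corollary 4.17 it yields a terminating comparison procedure on terms. The following Python sketch illustrates this under an encoding of our own choosing: 0 is the empty tuple, `("phi", a, b)` stands for ϕ_a(b), and `("+", t1, ..., tn)` stands for a sum of ϕ-terms that is assumed to be listed in non-increasing order. All names are hypothetical.

```python
ZERO = ()


def phi(a, b):
    """Build the term phi_a(b)."""
    return ("phi", a, b)


def summands(t):
    """List the additive principal summands of a term."""
    if t == ZERO:
        return []
    if t[0] == "+":
        return list(t[1:])
    return [t]


def cmp_phi(s, t) -> int:
    """Compare phi_{a1}(b1) with phi_{a2}(b2) following Theorem 4.25."""
    (_, a1, b1), (_, a2, b2) = s, t
    c = cmp_term(a1, a2)
    if c < 0:                       # case 1: compare b1 with phi_{a2}(b2)
        return cmp_term(b1, t)
    if c > 0:                       # case 3: compare phi_{a1}(b1) with b2
        return cmp_term(s, b2)
    return cmp_term(b1, b2)         # case 2: compare b1 with b2


def cmp_term(s, t) -> int:
    """Compare arbitrary terms summand by summand (Corollary 4.17)."""
    xs, ys = summands(s), summands(t)
    for x, y in zip(xs, ys):
        c = cmp_phi(x, y)
        if c != 0:
            return c
    return (len(xs) > len(ys)) - (len(xs) < len(ys))


ONE = phi(ZERO, ZERO)        # phi_0(0) = omega^0 = 1
OMEGA = phi(ZERO, ONE)       # phi_0(1) = omega
EPS0 = phi(ONE, ZERO)        # phi_1(0) = epsilon_0
```

Each recursive call is on a pair of terms of smaller total size, which is why the procedure terminates. Note that it also compares terms that are not in normal form, for instance it recognises ϕ_0(ε0) and ε0 as equal.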

Corollary: 4.26 If α < β then $\varphi_\alpha(0) < \varphi_\beta(0)$. Hence $\alpha \le \varphi_\alpha(0)$.

Proof: The first part follows from Theorem 4.25(ii)(1). Thus the function $\alpha \mapsto \varphi_\alpha(0)$ is order preserving, so $\alpha \le \varphi_\alpha(0)$ follows by Lemma 4.4. □

Theorem: 4.27 (ϕ normal form) For every α ∈ AP there exist uniquely determined ordinals ξ and η such that $\alpha = \varphi_\xi(\eta)$ and η < α.

Proof: For existence, let $\xi := \min\{\delta \mid \alpha < \varphi_\delta(\alpha)\}$. ξ exists by Corollary 4.26. If ξ = 0 we have $\alpha = \varphi_0(\eta)$ for some η < α since α ∈ AP.

If ξ > 0 then $\varphi_\zeta(\alpha) = \alpha$ for all ζ < ξ and hence α ∈ Cr(ξ), which implies $\alpha = \varphi_\xi(\eta)$ for some η < α.

It remains to show uniqueness. If $\alpha = \varphi_\xi(\eta) = \varphi_{\xi'}(\eta')$ where η, η′ < α, then the cases (1) and (3) from Theorem 4.25(i) cannot hold and hence η = η′ and ξ = ξ′. □

Definition: 4.28 Let $SC := \{\alpha \mid \varphi_\alpha(0) = \alpha\}$ and $\Gamma_\beta = \mathrm{Enum}_{SC}(\beta)$.

Theorem: 4.29 SC is a club and hence $\beta \mapsto \Gamma_\beta$ is a normal function.

Proof: By Corollary 4.26 we know that $\alpha \mapsto \varphi_\alpha(0)$ is strictly increasing. One can also show that this function is continuous. Hence its class of fixed points SC forms a club. □

Lemma: 4.30 $SC = \{\alpha \mid \alpha > 0 \land \forall \xi, \eta < \alpha\ \varphi_\xi(\eta) < \alpha\}$.

Proof: Exercise. □

4.2 Two ordinal representation systems

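Let $\varepsilon_0$ be the first ordinal α such that $\omega^{\alpha} = \alpha$. Then $\forall \beta < \varepsilon_0\ \beta < \omega^{\beta}$. Another notation for $\varepsilon_0$ is $\varphi_1(0)$.

Also note that if ρ ∈ AP and $\rho < \Gamma_0$ then there exist (unique) α, β < ρ such that $\rho = \varphi_\alpha(\beta)$.

Definition: 4.31 (i) The set $OT(\varepsilon_0)$ is inductively defined by the following clauses:

1. $0 \in OT(\varepsilon_0)$.

2. If $\alpha_1, \ldots, \alpha_n \in OT(\varepsilon_0) \cap AP$ and $\alpha_1 \ge \ldots \ge \alpha_n$ and n > 1 then $\alpha_1 + \ldots + \alpha_n \in OT(\varepsilon_0)$.

3. If $\alpha \in OT(\varepsilon_0)$ then $\omega^{\alpha} \in OT(\varepsilon_0)$.

(ii) $OT(\Gamma_0)$ is inductively defined by the following clauses:

1. $0 \in OT(\Gamma_0)$.

2. If $\alpha_1, \ldots, \alpha_n \in OT(\Gamma_0) \cap AP$ and $\alpha_1 \ge \ldots \ge \alpha_n$ and n > 1 then $\alpha_1 + \ldots + \alpha_n \in OT(\Gamma_0)$.

3. If $\alpha, \beta \in OT(\Gamma_0)$ and $\alpha, \beta < \varphi_\alpha(\beta)$ then $\varphi_\alpha(\beta) \in OT(\Gamma_0)$.

Corollary: 4.32 (i) $OT(\varepsilon_0) = \varepsilon_0$.

(ii) $OT(\Gamma_0) = \Gamma_0$.

Proof: Use induction on $\alpha < \varepsilon_0$ to show that $\alpha \in OT(\varepsilon_0)$. Similarly, use induction on $\alpha < \Gamma_0$ to show that $\alpha \in OT(\Gamma_0)$. □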

Ordinals $\beta < \Gamma_0$ have a unique normal form, namely either β = 0 or $\beta =_{NF} \beta_1 + \ldots + \beta_n$ with $\beta_1, \ldots, \beta_n \in AP$ and n > 1, or $\beta =_{NF} \varphi_\gamma(\delta)$ with γ, δ < β. Thus every $0 < \beta < \Gamma_0$ can be uniquely represented in terms of smaller ordinals, which again can be uniquely represented in terms of yet smaller ordinals and 0, etc. As this process terminates after finitely many steps, every $\beta < \Gamma_0$ has a unique term representation over the alphabet 0, +, ϕ.

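Corollary: 4.33 There is a primitive recursive set $A_0 \subseteq \mathbb{N}$, a primitive recursive relation ≺ on $A_0$ and binary primitive recursive functions $\hat{+}$ and $\hat{\varphi}$ such that

$$f : (OT(\Gamma_0), <, +, \varphi) \cong (A_0, \prec, \hat{+}, \hat{\varphi})$$

for some structural isomorphism f. Moreover

$$(OT(\varepsilon_0), <, +, \varphi_0) \cong (B_0, \prec_1, \hat{+}, \hat{\varphi}_0),$$

where $B_0 = \{x \in A_0 \mid x \prec f(\varepsilon_0)\}$, $\prec_1$ is the restriction of ≺ to $B_0$, and $\hat{+}$ and $\hat{\varphi}_0$ are the restrictions of these functions to $B_0$.

Proof: Ordinals $< \Gamma_0$ can be coded by natural numbers. For instance a coding function

$$\lceil \cdot \rceil : \Gamma_0 \longrightarrow \mathbb{N}$$

could be defined as follows:

$$\lceil \alpha \rceil = \begin{cases} 0 & \text{if } \alpha = 0 \\ \langle 1, \lceil \alpha_1 \rceil, \ldots, \lceil \alpha_n \rceil \rangle & \text{if } \alpha =_{NF} \alpha_1 + \cdots + \alpha_n \text{ where } n > 1 \\ \langle 2, \lceil \alpha_1 \rceil, \lceil \alpha_2 \rceil \rangle & \text{if } \alpha =_{NF} \varphi_{\alpha_1}(\alpha_2) \end{cases}$$

where $\langle k_1, \cdots, k_n \rangle := 2^{k_1+1} \cdot \ldots \cdot p_n^{k_n+1}$ with $p_i$ being the i-th prime number (or any other coding of tuples). Further define:

$A_0 := $ range of $\lceil \cdot \rceil$, $\qquad \lceil \alpha \rceil \prec \lceil \beta \rceil :\Leftrightarrow \alpha < \beta$,

$\lceil \alpha \rceil \mathbin{\hat{+}} \lceil \beta \rceil := \lceil \alpha + \beta \rceil$, $\qquad \hat{\varphi}(\lceil \alpha \rceil, \lceil \beta \rceil) := \lceil \varphi_\alpha(\beta) \rceil$.

Then

$$\langle \Gamma_0, +, \varphi, < \rangle \cong \langle A_0, \hat{+}, \hat{\varphi}, \prec \rangle .$$

It remains to show that $A_0, \prec, \hat{+}, \hat{\varphi}$ are primitive recursive. This can be seen by defining them via a simultaneous primitive recursive definition, viewing Corollary 4.17 and Theorem 4.25 as the recursive clauses for defining ≺. □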
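The coding function from the proof can be written out directly. The Python sketch below assumes a term encoding of our own (0 as the empty tuple, `("+", t1, ..., tn)` for sums in normal form and `("phi", a, b)` for ϕ_a(b)) and uses the prime power coding of tuples given above; the helper names are hypothetical. The codes grow very quickly, so this is meant as an illustration rather than a practical representation.

```python
ZERO = ()


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short tuples)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def encode_tuple(ks) -> int:
    """Return <k_1, ..., k_n> = 2^(k_1+1) * 3^(k_2+1) * ... * p_n^(k_n+1)."""
    code = 1
    for p, k in zip(primes(), ks):
        code *= p ** (k + 1)
    return code


def code(term) -> int:
    """Return the numeral code of a term, following the three clauses above."""
    if term == ZERO:
        return 0
    if term[0] == "+":
        return encode_tuple([1] + [code(t) for t in term[1:]])
    return encode_tuple([2, code(term[1]), code(term[2])])


ONE = ("phi", ZERO, ZERO)      # phi_0(0) = 1
OMEGA = ("phi", ZERO, ONE)     # phi_0(1) = omega
code_of_one = code(ONE)        # <2, 0, 0> = 2^3 * 3^1 * 5^1
```

The exponents are shifted by one so that trailing zeros in a tuple are not lost, which keeps the coding injective.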


5 Ordinal analysis of PA and some subsystems of

second order arithmetic

The most important structure in mathematics is arguably the structure of the natural numbers N = (N; 0^N, 1^N, +^N, ×^N, E^N, <^N), where 0^N denotes zero, 1^N denotes the number one, +^N, ×^N, E^N denote the addition, multiplication, and exponentiation functions, respectively, and <^N stands for the less-than relation on the natural numbers. In particular, E^N(n, m) = n^m.

Many of the famous theorems and problems of mathematics, such as Fermat's and Goldbach's conjectures, the Twin Prime conjecture, and Riemann's hypothesis, can be formalized as sentences of the language of N and thus concern questions about the structure N.

Definition: 5.1 A theory designed with the intent of axiomatizing the structure N is Peano arithmetic, PA. The language of PA has the predicate symbols =, <, the function symbols +, ×, E (for addition, multiplication, exponentiation) and the constant symbols 0 and 1. The axioms of PA comprise the usual equations and laws for addition, multiplication, exponentiation, and the less-than relation. In addition, PA has the Induction Scheme

(IND) A(0) ∧ ∀x[A(x)→ A(x+ 1)]→ ∀xA(x)

for all formulae A(a) of the language of PA.

Gentzen showed that transfinite induction up to the ordinal

ε0 = sup{ω, ω^ω, ω^(ω^ω), . . .} = least α. ω^α = α

suffices to prove the consistency of PA. To appreciate Gentzen's result it is pivotal to note that he applied transfinite induction up to ε0 solely to elementary computable predicates and that, apart from this, his proof used only finitistically justified means. Hence, a more precise rendering of Gentzen's result is

F + EC-TI(ε0) ⊢ Con(PA), (40)

where F signifies a theory that embodies only finitistically acceptable means, EC-TI(ε0) stands for transfinite induction up to ε0 for elementary computable predicates, and Con(PA) expresses the consistency of PA. Finally, we should spell out the scheme EC-TI(ε0) in the language of PA:

∀x [∀y (y ≺ x → P(y)) → P(x)] → ∀x P(x)

for all elementary computable predicates P.

Gentzen also showed that his result was the best possible in that PA proves transfinite induction up to α for arithmetic predicates for any α < ε0. The compelling picture conjured up by the above is that the non-finitist part of PA is encapsulated in EC-TI(ε0) and therefore “measured” by ε0, thereby tempting one to adopt the following definition of proof-theoretic ordinal of a theory T:

|T|_Con = least α. F + EC-TI(α) ⊢ Con(T). (41)


In the above, many notions were left unexplained. We will now consider them one by one. The elementary computable functions are exactly the Kalmár elementary functions, i.e. the class of functions which contains the successor, projection, zero, addition, multiplication, and modified subtraction functions and is closed under composition and bounded sums and products. A predicate is elementary computable if its characteristic function is elementary computable.
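The two closure operations that distinguish this class, bounded sums and bounded products, can be illustrated by a small Python sketch. The function names (`monus`, `bounded_sum`, `bounded_product`, `power`, `factorial`) are illustrative choices and do not occur in the notes; binary functions f(x, i) with a single parameter x are an assumed simplification.

```python
def monus(a, b):
    """Modified subtraction: a - b if a >= b, and 0 otherwise."""
    return max(a - b, 0)


def bounded_sum(f):
    """Return g with g(x, n) = f(x, 0) + f(x, 1) + ... + f(x, n-1)."""
    def g(x, n):
        return sum(f(x, i) for i in range(n))
    return g


def bounded_product(f):
    """Return g with g(x, n) = f(x, 0) * f(x, 1) * ... * f(x, n-1)."""
    def g(x, n):
        result = 1
        for i in range(n):
            result *= f(x, i)
        return result
    return g


# Exponentiation x^n as a bounded product over the projection (x, i) -> x.
power = bounded_product(lambda x, i: x)


def factorial(n):
    """n! as a bounded product over (x, i) -> i + 1; x is a dummy parameter."""
    return bounded_product(lambda x, i: i + 1)(0, n)
```

Exponentiation is thus obtained from a projection by one bounded product, which is why a single exponential is available in the elementary functions while unbounded iteration of exponentiation is not.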

According to an influential analysis of finitism due to W.W. Tait, finitistic reasoning coincides with a system known as primitive recursive arithmetic. For the purposes of ordinal analysis, however, it suffices to identify F with an even more restricted theory known as Elementary Recursive Arithmetic, EA. EA is a weak subsystem of PA having the same defining axioms for +, ×, E, < but with induction restricted to elementary computable predicates.

We shall add a new unary predicate symbol U to the language of PA which will serve the purpose of a free predicate variable.

Definition: 5.2 We shall formalize PAU in the sequent calculus. In addition to the rules of the (classical) sequent calculus we have to add the following axioms:

(=ref) Γ ⇒ ∆, t = t

(=sym) Γ, s = t ⇒ ∆, t = s

(=tran) Γ, s1 = s2, s2 = s3 ⇒ ∆, s1 = s3

(=sub) Γ, s1 = t1, . . . , sn = tn, A(s1, . . . , sn) ⇒ ∆, A(t1, . . . , tn)

only for atomic formulas A(~s ).

(suc1) Γ ⇒ ∆, suc(s) ≠ 0 and Γ, suc(s) = suc(t) ⇒ ∆, s = t.

(+) Γ ⇒ ∆, s + 0 = s and Γ ⇒ ∆, s + suc(t) = suc(s + t).

(·) Γ ⇒ ∆, s · 0 = 0 and Γ ⇒ ∆, s · suc(t) = s · t+ s.

(IND) Γ, A(0),∀x [A(x)→ A(x+ 1)] ⇒ ∆,∀xA(x)

for all formulas A(a).

As the ultimate goal of this course is to carry out an ordinal analysis of a system of set theory, we shall not particularly dwell on an ordinal analysis of PA. To soften the ascent to set theory, however, we will first give an ordinal analysis of two subsystems of second order arithmetic. The analysis of PAU will arise as a corollary.

Ordinal analysis is concerned with theories serving as frameworks for formalising significant parts of mathematics. It is known that virtually all of ordinary mathematics can be formalized in Zermelo-Fraenkel set theory with the axiom of choice, ZFC. Hilbert and Bernays [25] showed that large chunks of mathematics can already be formalized in second order arithmetic. Owing to these observations, proof theory has been focusing on set theories and subsystems of second order arithmetic. Further scrutiny revealed that a small fragment is sufficient. Under the rubric of Reverse Mathematics a research programme was initiated by Harvey Friedman some thirty years ago. The idea is to ask whether, given a theorem, one can prove its equivalence to some axiomatic system, with the aim of determining what proof-theoretical resources are necessary for the theorems of mathematics. More precisely, the objective of reverse mathematics is to investigate the role of set existence axioms in ordinary mathematics. The main question can be stated as follows:

Given a specific theorem τ of ordinary mathematics, which set existence axioms are needed in order to prove τ?

Central to the above is the reference to what is called ‘ordinary mathematics’. This concept, of course, doesn't have a precise definition. Roughly speaking, by ordinary mathematics we mean main-stream, non-set-theoretic mathematics, i.e. the core areas of mathematics which make no essential use of the concepts and methods of set theory and do not essentially depend on the theory of uncountable cardinal numbers.

Subsystems of second order arithmetic. The framework chosen for studying set existence in reverse mathematics, though, is second order arithmetic rather than set theory. Second order arithmetic, Z2, is a two-sorted formal system with free and bound first order variables (also called numerical variables; the same as for PA) and free set variables U0, U1, U2, . . . as well as bound set variables X0, X1, X2, . . . supposed to range over sets of natural numbers. The language L2 of second-order arithmetic also contains the symbols of PA, and in addition has a binary relation symbol ∈ for elementhood. Formulae are built from atomic formulae s = t, s < t, and s ∈ U (where s, t are numerical terms, i.e. terms of PA) by closing off under the connectives ∧, ∨, →, ¬, numerical quantifiers ∀x, ∃x, and set quantifiers ∀X, ∃X.

The basic arithmetical axioms in all theories of second-order arithmetic are the defining axioms for 0, 1, +, ×, E, < (as for PA) and the induction axiom

∀X(0 ∈ X ∧ ∀x(x ∈ X → x+ 1 ∈ X)→ ∀x(x ∈ X)).

We consider the axiom schema of C-comprehension for formula classes C, which is given by

(C-CA) ∃X ∀u (u ∈ X ↔ F(u))

for all formulae F ∈ C in which X does not occur. Natural formula classes are the arithmetical formulae, consisting of all formulae without second order quantifiers ∀X and ∃X, and the Π^1_n-formulae, where a Π^1_n-formula is a formula of the form ∀X1 . . . QXn A(X1, . . . , Xn) with ∀X1 . . . QXn being a string of n alternating set quantifiers, commencing with a universal one, followed by an arithmetical formula A(X1, . . . , Xn).

ACA0 denotes the theory consisting of the basic arithmetical axioms plus the scheme

∃X ∀u (u ∈ X ↔ F(u))

for all arithmetical formulae F(a) in which X does not occur. ACA denotes the theory ACA0 augmented by the scheme of induction for all L2-formulae.


5.1 The semi-formal system RA∗ of Ramified Analysis

Definition: 5.3 RA∗ has the following symbols:

• Bound number variables: x0, x1, x2, . . ..

• Free predicate variables of level α for each ordinal α < Γ0: U^α_0, U^α_1, U^α_2, . . ..

• Bound predicate variables of level β for each ordinal 0 < β < Γ0: X^β_0, X^β_1, X^β_2, . . ..

• The symbols 0, suc, +, ·.

• The logical symbols ∧, ∨, →, ¬, ∀, ∃ and λ.

• Symbols for primitive recursive functions and relations.

• Parentheses.

Inductive definition of formulas and predicators.

1. Every numerical atomic formula is a formula of level 0.

2. Every free predicate variable of level α is a predicator of level α.

3. If P^α is a predicator of level α and t is a term, then t ∈ P^α is a formula of level α.

4. If A and B are formulas of level α and β then A ∧ B, A ∨ B, A → B are formulas of level max(α, β) and ¬A is a formula of level α.

5. If F(0) is a formula of level α and x is a bound number variable which does not occur in F(0), then ∀xF(x) and ∃xF(x) are formulae of level α and λxF(x) is a predicator of level α.

6. If U^β is a free predicate variable of level β ≠ 0, F(U^β) is a formula of level α and X^β a bound predicate variable of level β which does not occur in F, then ∀X^β F(X^β) and ∃X^β F(X^β) are formulae of level max(α, β).

Inductive definition of the length |A| of a formula A.

1. Every atomic numerical formula A has length 0, |A| = 0.

2. |U^α(t)| = ω · α.

3. If A and B are formulas then |A ∧ B| = |A ∨ B| = |A → B| = max(|A|, |B|) + 1 and |¬A| = |A| + 1.

4. |∀xF(x)| = |∃xF(x)| = |λxF(x)| = |F(0)| + 1.

5. |∀X^β F(X^β)| = |∃X^β F(X^β)| = max(ω · β, |F(U0)| + 1).

Definition: 5.4 We define the infinitary proof system RA∗. A true (false) atomic formula is an atomic formula without free variables (and hence closed) which is true (false) on the standard interpretation. The axioms of RA∗ are the following:


(A1) Γ ⇒ ∆, A where A is a true atomic formula.

(A2) Γ, A ⇒ ∆ where A is a false atomic formula.

(A3) Γ, U^α(s) ⇒ ∆, U^α(t) where s and t have the same numerical value and U^α is a free predicate variable.

The inference rules of RA∗ comprise those of the sequent calculus with the exception of (∀R) and (∃L). The latter are replaced by two infinitary rules, i.e. rules with infinitely many premisses. They correspond to the so-called ω-rule:

(ωR)   Γ ⇒ ∆, F(0);  Γ ⇒ ∆, F(1);  . . . ;  Γ ⇒ ∆, F(n);  . . .
       ─────────────────────────────────────────────────────
       Γ ⇒ ∆, ∀xF(x)

(ωL)   F(0), Γ ⇒ ∆;  F(1), Γ ⇒ ∆;  . . . ;  F(n), Γ ⇒ ∆;  . . .
       ─────────────────────────────────────────────────────
       ∃xF(x), Γ ⇒ ∆

The price to pay will be that deductions become infinite objects, i.e. infinite well-founded trees.

We will also need rules for the higher order quantifiers and predicators. Variables P, P0, P1, . . . will range over predicators and variables P^α, P^α_0, P^α_1, . . . will range over predicators of level α. We write lev(P) for the level of P. Pβ stands for the collection of predicators with levels < β.

(P L)    F(t), Γ ⇒ ∆
         ─────────────────
         λxF(x)(t), Γ ⇒ ∆

(P R)    Γ ⇒ ∆, F(t)
         ─────────────────
         Γ ⇒ ∆, λxF(x)(t)

(∀β L)   F(P), Γ ⇒ ∆
         ─────────────────
         ∀X^β F(X^β), Γ ⇒ ∆

(∀β R)   Γ ⇒ ∆, F(P) for all P ∈ Pβ
         ─────────────────
         Γ ⇒ ∆, ∀X^β F(X^β)

(∃β L)   F(P), Γ ⇒ ∆ for all P ∈ Pβ
         ─────────────────
         ∃X^β F(X^β), Γ ⇒ ∆

(∃β R)   Γ ⇒ ∆, F(P)
         ─────────────────
         Γ ⇒ ∆, ∃X^β F(X^β)

where in (∀β L) and (∃β R), P is a predicator of level < β.

Definition: 5.5 RA∗ ⊢^α_ρ Γ ⇒ ∆ is defined inductively as follows:

(i) If Γ ⇒ ∆ is an axiom, then RA∗ ⊢^α_ρ Γ ⇒ ∆ for any α, ρ.

(ii) If RA∗ ⊢^{αi}_ρ Γi ⇒ ∆i holds for all premisses Γi ⇒ ∆i of an inference of RA∗ other than (Cut) with conclusion Γ ⇒ ∆ and αi < α holds for all i, then RA∗ ⊢^α_ρ Γ ⇒ ∆.

(iii) If RA∗ ⊢^{α1}_ρ Γ, C ⇒ ∆, RA∗ ⊢^{α2}_ρ Γ ⇒ ∆, C, |C| < ρ and α1, α2 < α, then RA∗ ⊢^α_ρ Γ ⇒ ∆.

Lemma: 5.6 (i) If A is a formula of level α then |A| = ω · α + n for some n < ω.

(ii) For every formula A(U) and comprehension term P^α with α < β,

|A(P^α)| < |∀X^β A(X^β)|, |∃X^β A(X^β)|.

Proof: Exercise. □
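Part (i) of the lemma can be checked mechanically on a fragment of the language. The following Python sketch is only an illustration under loud assumptions: levels are restricted to natural numbers, predicate quantifiers and λ-predicators are omitted, formulas are hypothetical nested tuples, and an ordinal ω · a + n is represented by the pair (a, n), compared lexicographically.

```python
def level(f):
    """Level of a formula in the assumed tuple format.

    ("atom", s)          numerical atomic formula
    ("U", a, t)          U^a(t) with a free predicate variable of level a
    ("not", A), ("and", A, B), ("or", A, B), ("imp", A, B)
    ("all", x, A), ("ex", x, A)   numerical quantifiers
    """
    tag = f[0]
    if tag == "atom":
        return 0
    if tag == "U":
        return f[1]
    if tag == "not":
        return level(f[1])
    if tag in ("and", "or", "imp"):
        return max(level(f[1]), level(f[2]))
    if tag in ("all", "ex"):
        return level(f[2])
    raise ValueError("unknown formula")


def length(f):
    """Length |f| as a pair (a, n) standing for the ordinal omega*a + n."""
    tag = f[0]
    if tag == "atom":
        return (0, 0)
    if tag == "U":
        return (f[1], 0)                      # |U^a(t)| = omega * a
    if tag == "not":
        a, n = length(f[1])
        return (a, n + 1)
    if tag in ("and", "or", "imp"):
        a, n = max(length(f[1]), length(f[2]))  # lexicographic = ordinal max
        return (a, n + 1)
    if tag in ("all", "ex"):
        a, n = length(f[2])
        return (a, n + 1)
    raise ValueError("unknown formula")


example = ("all", "x", ("and", ("U", 2, "x"), ("not", ("atom", "0 = 0"))))
```

On this fragment the first component of `length(f)` always equals `level(f)`, which is the content of Lemma 5.6 (i); for `example` the length is ω · 2 + 2.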

Lemma: 5.7 For every formula C of RA∗,

RA∗ ⊢^{2·|C|}_0 Γ, C ⇒ ∆, C.

Proof: Use induction on |C|. □

We list some technical lemmata that will be useful for proving cut elimination.

Lemma: 5.8 (Substitution) Let Γ(s) and ∆(s) be sets of formulas with some occurrences of s indicated and let t be a term with the same numerical value as s.

If RA∗ ⊢^α_ρ Γ(s) ⇒ ∆(s), then RA∗ ⊢^α_ρ Γ(t) ⇒ ∆(t).

Proof: Use induction on α. □

Lemma: 5.9 (Weakening) If RA∗ ⊢^α_ρ Γ ⇒ ∆, Γ ⊆ Γ′ and ∆ ⊆ ∆′, then RA∗ ⊢^α_ρ Γ′ ⇒ ∆′.

Proof: Use induction on α. □

Lemma: 5.10 (Inversion)

(i) If RA∗ ⊢^α_ρ Γ, A ∧ B ⇒ ∆ then RA∗ ⊢^α_ρ Γ, A, B ⇒ ∆.

(ii) If RA∗ ⊢^α_ρ Γ ⇒ ∆, A ∧ B then RA∗ ⊢^α_ρ Γ ⇒ ∆, A and RA∗ ⊢^α_ρ Γ ⇒ ∆, B.

(iii) If RA∗ ⊢^α_ρ Γ, A ∨ B ⇒ ∆ then RA∗ ⊢^α_ρ Γ, A ⇒ ∆ and RA∗ ⊢^α_ρ Γ, B ⇒ ∆.

(iv) If RA∗ ⊢^α_ρ Γ ⇒ ∆, A ∨ B then RA∗ ⊢^α_ρ Γ ⇒ ∆, A, B.

(v) If RA∗ ⊢^α_ρ Γ ⇒ A → B, ∆ then RA∗ ⊢^α_ρ A, Γ ⇒ ∆, B.

(vi) If RA∗ ⊢^α_ρ Γ, A → B ⇒ ∆ then RA∗ ⊢^α_ρ Γ ⇒ ∆, A and RA∗ ⊢^α_ρ Γ, B ⇒ ∆.

(vii) If RA∗ ⊢^α_ρ Γ ⇒ ¬A, ∆ then RA∗ ⊢^α_ρ Γ, A ⇒ ∆.

(viii) If RA∗ ⊢^α_ρ Γ, ¬A ⇒ ∆ then RA∗ ⊢^α_ρ Γ ⇒ ∆, A.

(ix) If RA∗ ⊢^α_ρ Γ ⇒ ∆, ∀xB(x) then RA∗ ⊢^α_ρ Γ ⇒ ∆, B(s) for any closed term s.

(x) If RA∗ ⊢^α_ρ Γ, ∃xB(x) ⇒ ∆ then RA∗ ⊢^α_ρ Γ, B(s) ⇒ ∆ for any closed term s.

(xi) If RA∗ ⊢^α_ρ Γ ⇒ ∆, ∀X^β B(X^β) then RA∗ ⊢^α_ρ Γ ⇒ ∆, B(P) for any predicator P ∈ Pβ.

(xii) If RA∗ ⊢^α_ρ Γ, ∃X^β B(X^β) ⇒ ∆ then RA∗ ⊢^α_ρ Γ, B(P) ⇒ ∆ for any predicator P ∈ Pβ.

(xiii) If RA∗ ⊢^α_ρ Γ ⇒ ∆, λxF(x)(t) then RA∗ ⊢^α_ρ Γ ⇒ ∆, F(t).

(xiv) If RA∗ ⊢^α_ρ Γ, λxF(x)(t) ⇒ ∆ then RA∗ ⊢^α_ρ Γ, F(t) ⇒ ∆.

Proof: All are provable by easy inductions on α. □

Lemma: 5.11 (Reduction)

Suppose ρ ≤ |C|. If RA∗ ⊢^α_ρ Γ, C ⇒ ∆ and RA∗ ⊢^β_ρ Ξ ⇒ Θ, C, then

RA∗ ⊢^{α#α#β#β}_{|C|} Γ, Ξ ⇒ ∆, Θ.

Proof: The proof is by induction on α#α#β#β and very similar to Lemma 2.17. We only look at two cases where C was the principal formula of the last inference in both derivations.

Case 1: The first is when C is of the form ∀X^β A(X^β). Then we have

RA∗ ⊢^{α1}_ρ Γ, C, A(P′) ⇒ ∆

and

RA∗ ⊢^{β_P}_ρ Ξ ⇒ Θ, C, A(P)

for some α1 < α and predicator P′ ∈ Pβ as well as β_P < β for all predicators P ∈ Pβ. By the induction hypothesis we obtain

RA∗ ⊢^{α1#α1#β#β}_{|C|} Γ, Ξ, A(P′) ⇒ ∆, Θ

and

RA∗ ⊢^{α#α#β_{P′}#β_{P′}}_{|C|} Γ, Ξ ⇒ ∆, Θ, A(P′).

Since |A(P′)| < |C| by Lemma 5.6 (ii), cutting out A(P′) gives RA∗ ⊢^{α#α#β#β}_{|C|} Γ, Ξ ⇒ ∆, Θ.

Case 2: The second case is when C is of the form ∀xA(x). Then we have

RA∗ ⊢^{α1}_ρ Γ, C, A(t) ⇒ ∆

and

RA∗ ⊢^{βn}_ρ Ξ ⇒ Θ, C, A(n)

for some α1 < α and closed term t as well as βn < β for all numbers n. Let m be the numerical value of t. By the Substitution Lemma 5.8 we have

RA∗ ⊢^{α1}_ρ Γ, C, A(m) ⇒ ∆.

By the induction hypothesis we thus get

RA∗ ⊢^{α1#α1#β#β}_{|C|} Γ, Ξ, A(m) ⇒ ∆, Θ

and

RA∗ ⊢^{α#α#βm#βm}_{|C|} Γ, Ξ ⇒ ∆, Θ, A(m).

Cutting out A(m) gives RA∗ ⊢^{α#α#β#β}_{|C|} Γ, Ξ ⇒ ∆, Θ. □


Theorem: 5.12 (First Cut Elimination Theorem)

If RA∗ ⊢^α_{δ+1} Γ ⇒ ∆ then RA∗ ⊢^{4^α}_δ Γ ⇒ ∆.

Proof: We use induction on α. If Γ ⇒ ∆ is an axiom then we clearly get the desired result. So let's assume that Γ ⇒ ∆ is not an axiom. Then we have a last inference (I) with premisses Γi ⇒ ∆i. Suppose the inference was not a cut, or was a cut of degree < δ. We then have RA∗ ⊢^{αi}_{δ+1} Γi ⇒ ∆i for some αi < α. By the induction hypothesis we have RA∗ ⊢^{4^{αi}}_δ Γi ⇒ ∆i. Applying the same inference (I) yields RA∗ ⊢^{4^α}_δ Γ ⇒ ∆ since 4^{αi} < 4^α.

Now suppose the last inference was a cut with a cut formula C satisfying |C| = δ. By the induction hypothesis we have

RA∗ ⊢^{4^{α1}}_δ Γ, C ⇒ ∆

and

RA∗ ⊢^{4^{α2}}_δ Γ ⇒ ∆, C

for some α1, α2 < α. We can then apply the Reduction Lemma 5.11 to these derivations and arrive at RA∗ ⊢^{4^{α1}#4^{α1}#4^{α2}#4^{α2}}_δ Γ ⇒ ∆. Since 4^{α1}#4^{α1}#4^{α2}#4^{α2} ≤ 4^α the desired conclusion follows. □
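The final inequality of the proof can at least be sanity-checked for finite ordinals, where the natural sum # is ordinary addition. The Python sketch below does only that; the function names are illustrative, and the check says nothing about infinite α, for which the inequality has to be verified by ordinal arithmetic.

```python
def reduction_bound(a1, a2):
    """4^a1 # 4^a1 # 4^a2 # 4^a2 for natural numbers, where # is plain +."""
    return 4 ** a1 + 4 ** a1 + 4 ** a2 + 4 ** a2


def bound_holds_up_to(limit):
    """Check 4^a1 # 4^a1 # 4^a2 # 4^a2 <= 4^a for all a1, a2 < a <= limit."""
    return all(
        reduction_bound(a1, a2) <= 4 ** a
        for a in range(1, limit + 1)
        for a1 in range(a)
        for a2 in range(a)
    )
```

The worst case is a1 = a2 = a − 1, where the bound is attained exactly: 4 · 4^(a−1) = 4^a. This explains the choice of base 4 for a reduction step that doubles both derivation lengths.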

Theorem: 5.13 (Second Cut Elimination Theorem)

If RA∗ ⊢^α_{ρ+ω^ν} Γ ⇒ ∆ then RA∗ ⊢^{ϕ_ν(α)}_ρ Γ ⇒ ∆.

Proof: We use induction on ν with a subsidiary induction on α. The assertion holds for ν = 0 by the First Cut Elimination Theorem 5.12. Now suppose ν > 0.

If Γ ⇒ ∆ is an axiom then we clearly get the desired result. So let's assume that Γ ⇒ ∆ is not an axiom. Then we have a last inference (I) with premisses Γi ⇒ ∆i. Suppose the inference was not a cut, or was a cut of rank < ρ. We then have RA∗ ⊢^{αi}_{ρ+ω^ν} Γi ⇒ ∆i for some αi < α. By the subsidiary induction hypothesis we have RA∗ ⊢^{ϕ_ν(αi)}_ρ Γi ⇒ ∆i. Applying the same inference (I) yields RA∗ ⊢^{ϕ_ν(α)}_ρ Γ ⇒ ∆.

Now suppose the last inference was a cut with cut formula C such that ρ ≤ |C| < ρ + ω^ν. Then there exist ν0 < ν and n < ω such that |C| < ρ + ω^{ν0} · n. After performing a cut with C we have

RA∗ ⊢^{ϕ_ν(α)}_{ρ+ω^{ν0}·n} Γ ⇒ ∆.

We also have ϕ_{ν0}(ϕ_ν(α)) = ϕ_ν(α). Therefore by n-fold application of the main induction hypothesis we obtain RA∗ ⊢^{ϕ_ν(α)}_ρ Γ ⇒ ∆. □

5.2 Interpretation of subsystems of Z2 in RA∗

To facilitate the interpretation of subsystems of Z2 in RA∗ we will assume that they are formalized via the sequent calculus.


Definition: 5.14 The sequent calculus version of ACA0 has all the axioms of PAU given in Definition 5.2 but with IND excluded. Further axioms are:

(IA) Γ ⇒ ∆, ∀X [0 ∈ X ∧ ∀u (u ∈ X → u + 1 ∈ X) → ∀u u ∈ X].

(A-CA) Γ ⇒ ∆, ∃Y ∀u [u ∈ Y ↔ A(u)]

where A(a) is an arithmetic formula in which Y does not occur.

In addition to the usual inference rules of the sequent calculus we also need inference rules for the second order quantifiers:

(∀2L)   F(V), Γ ⇒ ∆
        ─────────────────
        ∀X F(X), Γ ⇒ ∆

(∀2R)   Γ ⇒ ∆, F(U)
        ─────────────────
        Γ ⇒ ∆, ∀X F(X)

(∃2L)   F(U), Γ ⇒ ∆
        ─────────────────
        ∃X F(X), Γ ⇒ ∆

(∃2R)   Γ ⇒ ∆, F(V)
        ─────────────────
        Γ ⇒ ∆, ∃X F(X)

where the variable U in (∀2R) and (∃2L) is an eigenvariable of the respective inference, i.e. U is not to occur in the lower sequent.

The sequent calculus version of ACA also has the axiom scheme (IND) from Definition 5.2.

The theory of ∆^1_1-analysis (that's the name Schütte gave it in [52, VIII.20]) or (∆^1_1-CR) comprises ACA and in addition has the rule of ∆^1_1-comprehension:

(∆^1_1-CR)   ⇒ ∀x [∀X A(X, x) ↔ ∃Y B(Y, x)]
             ──────────────────────────────────
             Γ ⇒ ∆, ∃Z ∀x [x ∈ Z ↔ ∀X A(X, x)]

where A(U, a) and B(U, a) are arithmetic formulae. Note that the premiss of an instance of ∆^1_1-CR does not have any side formulas.

Definition: 5.15 Let 0 < σ < Γ0. Let Ξ ⇒ Θ be an L2-sequent. We call an L∗RS-sequent Ξ^σ ⇒ Θ^σ a σ-instance of Ξ ⇒ Θ if it is obtained by the following steps:

1. Write Ξ ⇒ Θ as

Ξ(a1, . . . , ak, U1, . . . , Ur) ⇒ Θ(a1, . . . , ak, U1, . . . , Ur)

fully indicating all free variables occurring in it.

2. Replace every free variable ai by a number mi and every variable Uj by a predicator Pj of level < σ.

3. Finally add to every bound predicate variable occurring in

Ξ(m1, . . . , mk, P1, . . . , Pr) ⇒ Θ(m1, . . . , mk, P1, . . . , Pr)

a superscript σ (i.e., X changes to X^σ) and the result is Ξ^σ ⇒ Θ^σ.

If Γ ⇒ ∆ is a sequent of PAU, we say that Γ′ ⇒ ∆′ is a numerical instance of Γ ⇒ ∆ if it is obtained by the following steps:


1. Write Γ ⇒ ∆ as Γ(a1, . . . , an) ⇒ ∆(a1, . . . , an), where all free number variables are fully indicated.

2. Replace every occurrence of ai by the same numeral mi.

3. In Γ(m1, . . . , mn) ⇒ ∆(m1, . . . , mn) replace every expression U(t) by t ∈ U^0_0, and the result is Γ′ ⇒ ∆′.

Lemma: 5.16 RA∗ ⊢^{2·|F(0)|+ω}_0 F(0), ∀x [F(x) → F(x + 1)] ⇒ ∀xF(x)

Proof: We show

RA∗ ⊢^{2·(|F(0)|+n)}_0 F(0), ∀x [F(x) → F(x + 1)] ⇒ F(n) (42)

by induction on n. Let η := |F(0)|. By Lemma 5.7 we have

RA∗ ⊢^{2·η}_0 F(0), ∀x [F(x) → F(x + 1)] ⇒ F(0).

Assume

RA∗ ⊢^{2·(η+n)}_0 F(0), ∀x [F(x) → F(x + 1)] ⇒ F(n). (43)

We have RA∗ ⊢^{2·η}_0 F(n + 1) ⇒ F(n + 1) by Lemma 5.7 and thus via an inference (→ L) we obtain

RA∗ ⊢^{2·(η+n)+1}_0 F(0), ∀x [F(x) → F(x + 1)], F(n) → F(n + 1) ⇒ F(n + 1).

Using (∀L) we arrive at

RA∗ ⊢^{2·(η+n)+2}_0 F(0), ∀x [F(x) → F(x + 1)] ⇒ F(n + 1) (44)

which is what we want as 2 · (η + n) + 2 = 2 · (η + n + 1).

As a consequence of (42) we get the desired assertion via an inference (ωR). □

Theorem: 5.17 (First Interpretation Theorem)

(i) If PAU ⊢ Γ ⇒ ∆ then there exist n, k < ω such that

RA∗ ⊢^{ω+n}_k Γ′ ⇒ ∆′

holds for every numerical instance Γ′ ⇒ ∆′ of Γ ⇒ ∆.

(ii) If ACA0 ⊢ ∀X A(X) where A(U) is arithmetic then there exist n, k < ω such that

RA∗ ⊢^{ω+n}_k ∀X^1 A(X^1).

(iii) If ACA ⊢ Γ ⇒ ∆ then there exist n, k < ω such that

RA∗ ⊢^{ω+ω+n}_{ω+k} Γ^1 ⇒ ∆^1

holds for every 1-instance Γ^1 ⇒ ∆^1 of Γ ⇒ ∆.


Proof: (i) Use induction on the length of the derivation in PAU. Numerical instances of the axioms of PAU other than (IND) are axioms of RA∗. (IND) is deducible cut free and with length ω + 1 by Lemma 5.16. For the induction step note that inferences of PAU other than (∀R) and (∃L) are inferences of RA∗ too. If the last inference was (∀R) use (ωR) instead and if it was (∃L) use (ωL). Also note that if A is a numerical instance of a formula of PAU then |A| < ω.

(ii) follows from (i) since if ACA0 ⊢ ∀X A(X) with A(U) arithmetic then PAU ⊢ A(U).

(iii) Again use induction on the length of the derivation. Note that a 1-instance of a formula of ACA has length < ω + ω. □

Theorem: 5.18 (Second Interpretation Theorem) If (∆^1_1-CR) ⊢^n Γ ⇒ ∆ then

RA∗ ⊢^{ω·σ+ω+6·n}_{ω·σ+ω} Γ^σ ⇒ ∆^σ

holds for any σ = ω^n · β with β > 0 and any σ-instance Γ^σ ⇒ ∆^σ of Γ ⇒ ∆.

Proof: Homework #5 Problem 5. □

Corollary: 5.19 (i) If PAU ⊢ Γ ⇒ ∆ then there exists α < ε0 such that

RA∗ ⊢^α_0 Γ′ ⇒ ∆′

holds for every numerical instance Γ′ ⇒ ∆′ of Γ ⇒ ∆.

(ii) If ACA0 ⊢ ∀X A(X) where A(U) is arithmetic and has no free number variables then there exists α < ε0 such that

RA∗ ⊢^α_0 ∀X^1 A(X^1).

(iii) If ACA ⊢ Γ ⇒ ∆ then there exists α < ε_{ε0} such that

RA∗ ⊢^α_0 Γ^1 ⇒ ∆^1

holds for every 1-instance Γ^1 ⇒ ∆^1 of Γ ⇒ ∆.

(iv) If (∆^1_1-CR) ⊢ ∀X A(X) where A(U) is arithmetic and has no free number variables then there exists α < ϕ_ω(0) such that

RA∗ ⊢^α_0 ∀X^1 A(X^1).


6 The limits of the deducibility of transfinite induction

Definition: 6.1 Let ≺ be a relation on N. For a formula F(a) define

Prog(≺, F) := ∀x (∀y ≺ x F(y) → F(x));

TI(≺, F) := Prog(≺, F) → ∀xF(x).

Also define

Prog(≺, U) := ∀x (∀y ≺ x y ∈ U → x ∈ U);

TI(≺, U) := Prog(≺, U) → ∀x x ∈ U.

If ≺ is well-founded we define

|n|≺ = sup{|k|≺ + 1 | k ≺ n}
‖≺‖ = sup{|n|≺ | n ∈ N}

For a theory T whose language comprises that of PAU define

‖T‖sup = sup{‖≺‖ | T ⊢ TI(≺, U) where ≺ is primitive recursive}.
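For a finite well-founded relation the rank function |n|≺ and the norm ‖≺‖ can be computed directly from the two defining equations. The Python sketch below is an illustration only: the finite domain, the function names `ranks` and `norm`, and the example relation (proper divisibility on 1, . . . , 12) are assumptions made for the example, whereas the definition itself concerns relations on all of N.

```python
def ranks(domain, less):
    """Return {n: |n|} with |n| = sup{|k| + 1 : k less-than n}.

    `less(k, n)` decides k ≺ n; the relation must be well-founded on `domain`.
    """
    domain = list(domain)
    memo = {}

    def rank(n, active):
        if n in memo:
            return memo[n]
        if n in active:
            raise ValueError("relation is not well-founded")
        below = [k for k in domain if less(k, n)]
        # sup of the empty set is 0
        memo[n] = max((rank(k, active + (n,)) + 1 for k in below), default=0)
        return memo[n]

    return {n: rank(n, ()) for n in domain}


def norm(domain, less):
    """The norm of the relation: sup{|n| : n in domain}."""
    return max(ranks(domain, less).values(), default=0)


def proper_divisor(k, n):
    """Example relation: k ≺ n iff k divides n and k != n."""
    return k != n and n % k == 0
```

With the example relation on 1, . . . , 12, the number 1 has rank 0, primes have rank 1, and 8 and 12 have rank 3. Note that, following the definition, ‖≺‖ is the supremum of the ranks themselves, so a finite chain with k elements has norm k − 1.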

Definition: 6.2 We define the notion of a U-positive (U-negative) formula of PAU. A formula in which U does not occur is both U-positive and U-negative. A formula t ∈ U is U-positive and ¬t ∈ U is U-negative. If A, B and F(a) are U-positive (U-negative) then so are A ∧ B, A ∨ B, ∀xF(x) and ∃xF(x). If A is U-positive (U-negative) then ¬A is U-negative (U-positive). If A is U-negative (U-positive) and B is U-positive (U-negative) then A → B is U-positive (U-negative).
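The clauses of this definition translate into a simple recursion that computes both polarities at once. The following Python sketch assumes a hypothetical nested-tuple representation of formulas; the tags and the function name `polarity` are illustrative and not part of the notes.

```python
def polarity(f):
    """Return (is_U_positive, is_U_negative) for a formula.

    Assumed format: ("atom", s) for atoms without U, ("U", t) for t ∈ U,
    ("not", A), ("and", A, B), ("or", A, B), ("imp", A, B),
    ("all", x, A), ("ex", x, A).
    """
    tag = f[0]
    if tag == "atom":
        return (True, True)          # U does not occur: both polarities
    if tag == "U":
        return (True, False)         # t ∈ U is U-positive only
    if tag == "not":
        pos, neg = polarity(f[1])
        return (neg, pos)            # negation swaps the polarities
    if tag in ("and", "or"):
        p1, n1 = polarity(f[1])
        p2, n2 = polarity(f[2])
        return (p1 and p2, n1 and n2)
    if tag == "imp":
        p1, n1 = polarity(f[1])
        p2, n2 = polarity(f[2])
        return (n1 and p2, p1 and n2)
    if tag in ("all", "ex"):
        return polarity(f[2])
    raise ValueError("unknown formula")


# ∀y (y ≺ x → y ∈ U) and Prog(≺, U) = ∀x (∀y (y ≺ x → y ∈ U) → x ∈ U)
below_in_U = ("all", "y", ("imp", ("atom", "y ≺ x"), ("U", "y")))
prog = ("all", "x", ("imp", below_in_U, ("U", "x")))
```

In this representation `below_in_U` comes out U-positive, whereas `prog` is neither U-positive nor U-negative, since U occurs on both sides of its implication.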

If A(U) is a formula of PAU without free number variables and X ⊆ N we write

(N, X) |= A(U)

if A(U) becomes true on interpreting U by X.

Note that if A(U) is U-positive, X ⊆ Y ⊆ N and (N, X) |= A(U), then (N, Y) |= A(U). We shall refer to this fact as monotonicity of U-positive formulae. Similarly, U-negative formulae behave in an anti-monotonic way.

If Γ is a non-empty finite set of formulae we denote by ⋁Γ and ⋀Γ the disjunction and conjunction of all formulae in Γ, respectively. Also define ⋀∅ to be the formula 0 = 0 and ⋁∅ to be the formula 0 = 1.

Proposition: 6.3 Assume that ≺ is a well-founded relation on N which is defined by an arithmetic formula, i.e. there is an arithmetic formula B(a, b) with exactly the exhibited free variables such that n ≺ m iff B(n, m) holds in the standard model.

Let ∆ be a finite set of U-positive arithmetic formulae and Γ be a finite set of U-negative arithmetic formulae with no other free variables than U. We identify U with U^0_0.

If δ = max(|t1|≺, . . . , |tr|≺) and

RA∗ ⊢^β_0 t1 ∈ U, . . . , tr ∈ U, Prog(≺, U), Γ ⇒ ∆

then

(N, {m | |m|≺ < δ + 2^β}) |= ⋀Γ → ⋁∆.


Proof: We employ induction on β. If the entire sequent is an axiom one readily checks that the claim is true. If the last inference introduced a principal formula belonging to Γ or ∆ the claim follows readily from the induction hypothesis applied to the premisses. Now assume that the last inference had Prog(≺, U) as its principal formula. Then we have

RA∗ ⊢^{β0}_0 t1 ∈ U, . . . , tr ∈ U, Prog(≺, U), ∀y ≺ t y ∈ U → t ∈ U, Γ ⇒ ∆

for some closed term t and β0 < β. Using (→ L)-inversion we get

RA∗ ⊢^{β0}_0 t1 ∈ U, . . . , tr ∈ U, Prog(≺, U), Γ ⇒ ∆, ∀y ≺ t y ∈ U; (45)

RA∗ ⊢^{β0}_0 t1 ∈ U, . . . , tr ∈ U, t ∈ U, Prog(≺, U), Γ ⇒ ∆. (46)

Note that ∀y ≺ t y ∈ U is a U-positive formula, and hence we may apply the induction hypothesis to (45) to arrive at

(N, {m | |m|≺ < δ + 2^{β0}}) |= ⋀Γ → (⋁∆ ∨ ∀y ≺ t y ∈ U).

If (N, {m | |m|≺ < δ + 2^{β0}}) |= ⋀Γ → ⋁∆ we are done owing to monotonicity. If the latter is not the case, then we have

(N, {m | |m|≺ < δ + 2^{β0}}) |= ∀y ≺ t y ∈ U

which entails that |t|≺ ≤ δ + 2^{β0}. As a result, the induction hypothesis applied to (46) with δ′ = δ + 2^{β0} yields

(N, {m | |m|≺ < δ′ + 2^{β0}}) |= ⋀Γ → ⋁∆.

As δ′ + 2^{β0} = δ + 2^{β0} + 2^{β0} = δ + 2^{β0+1} ≤ δ + 2^β we are done again by monotonicity. □

Corollary: 6.4 If

RA∗ ⊢^β_0 Prog(≺, U) → ∀x x ∈ U

then ‖≺‖ ≤ 2^β.

Proof: The assumption entails that

RA∗ ⊢^β_0 Prog(≺, U) ⇒ ∀x x ∈ U,

and hence by the previous Proposition, |n|≺ < 2^β holds for all n, whence ‖≺‖ ≤ 2^β. □

Corollary: 6.5 (i) ‖PAU‖sup = ε0.

(ii) ‖ACA0‖sup = ε0.

(iii) ‖ACA‖sup = ε_{ε0}.

(iv) ‖(∆^1_1-CR)‖sup = ϕ_ω(0).

Proof: The “≤” estimates follow from Corollary 5.19 in combination with Corollary 6.4. The “≥” estimates in (i), (ii), (iii) follow from homework assignment #6, problems 3 and 4. The “≥” part in (iv) will be another exercise. □


6.1 Proof-theoretical reductions

Ordinal analyses of theories allow one to compare the strength of theories. This subsection defines the notions of proof-theoretic reducibility and proof-theoretic strength that will be used henceforth.

All theories T considered in the following are assumed to contain a modicum of arithmetic. For definiteness let this mean that the system PRA of Primitive Recursive Arithmetic is contained in T, either directly or by translation.

Definition: 6.6 Let T1, T2 be a pair of theories with languages L1 and L2, respectively, and let Φ be a (primitive recursive) collection of formulae common to both languages. Furthermore, Φ should contain the closed equations of the language of PRA.

We then say that T1 is proof-theoretically Φ-reducible to T2, written T1 ≤Φ T2, if there exists a primitive recursive function f such that

PRA ⊢ ∀φ ∈ Φ ∀x [Proof_T1(x, φ) → Proof_T2(f(x), φ)]. (47)

T1 and T2 are said to be proof-theoretically Φ-equivalent, written T1 ≡Φ T2, if T1 ≤Φ T2 and T2 ≤Φ T1.

The appropriate class Φ is revealed in the process of reduction itself, so that in the statement of theorems we simply say that T1 is proof-theoretically reducible to T2 (written T1 ≤ T2) and T1 and T2 are proof-theoretically equivalent (written T1 ≡ T2), respectively. Alternatively, we shall say that T1 and T2 have the same proof-theoretic strength when T1 ≡ T2.

Feferman's notion of proof-theoretic reducibility (in S. Feferman: Hilbert's program relativized: Proof-theoretical and foundational reductions, J. Symbolic Logic 53 (1988) 364–384) is more relaxed in that he allows the reduction to be given by a T2-recursive function f, i.e.

T2 ⊢ ∀φ ∈ Φ ∀x [Proof_T1(x, φ) → Proof_T2(f(x), φ)]. (48)

The disadvantage of (48) is that one forfeits the transitivity of the relation ≤Φ. Furthermore, in practice, proof-theoretic reductions always come with a primitive recursive reduction, so nothing seems to be lost by using the stronger notion of reducibility.

6.2 The general form of ordinal analysis

In this subsection I attempt to say something general about all ordinal analyses that have been carried out thus far. One has to bear in mind that these concern “natural” theories. Also, to circumvent countless and rather boring counterexamples, I will only address theories that have at least the strength of PA and always assume that the pertinent ordinal representation systems are closed under α ↦ ω^α.

Before delineating the general form of an ordinal analysis, we need several definitions. We first garner some features that ordinal representation systems used in proof theory always have, and collectively call them “elementary ordinal representation system”. One reason for singling out this notion is that it leads to an elegant characterization of the provably recursive functions of theories equipped with transfinite induction principles for such ordinal representation systems.


Definition: 6.7 Elementary recursive arithmetic, EA, is a weak system of number theory, in a language with 0, 1, +, ×, E (exponentiation), <, whose axioms are:

1. the usual recursion axioms for +, ×, E, <.

2. induction on ∆0-formulae with free variables.

EA is referred to as elementary recursive arithmetic since its provably recursive functions are exactly the Kalmár elementary functions, i.e. the class of functions which contains the successor, projection, zero, addition, multiplication, and modified subtraction functions and is closed under composition and bounded sums and products.

Definition: 6.8 For a set X and a binary relation ≺ on X, let LO(X, ≺) abbreviate that ≺ linearly orders the elements of X and that for all u, v, whenever u ≺ v, then u, v ∈ X.

A linear ordering is a pair 〈X, ≺〉 satisfying LO(X, ≺).

Definition: 6.9 An elementary ordinal representation system (EORS) for a limit ordinal λ is a structure 〈A, ◁, n ↦ λn, +, ×, x ↦ ω^x〉 such that:

(i) A is an elementary subset of N.

(ii) ◁ is an elementary well-ordering of A.

(iii) |◁| = λ.

(iv) Provably in EA, ◁↾λn is a proper initial segment of ◁ for each n, and ⋃n ◁↾λn = ◁. In particular, EA ⊢ ∀y λy ∈ A ∧ ∀x ∈ A ∃y [x ◁ λy].

(v) EA ⊢ LO(A, ◁)

(vi) +, × are binary and x ↦ ω^x is unary. They are elementary functions on elementary initial segments of A. They correspond to ordinal addition, multiplication and exponentiation to base ω, respectively. The initial segments of A on which they are defined are maximal. n ↦ λn is an elementary function.

(vii) 〈A, ◁, +, ×, ω^x〉 satisfies “all the usual algebraic properties” of an initial segment of ordinals. In addition, these properties of 〈A, ◁, +, ×, ω^x〉 can be proved in EA.

(viii) Let ñ denote the nth element in the ordering of A. Then the correspondence n ↔ ñ is elementary.

(ix) Let α = ω^{β1} + · · · + ω^{βk}, β1 ≥ · · · ≥ βk (Cantor normal form). Then the correspondence α ↔ 〈β1, . . . , βk〉 is elementary.

Elements of A will often be referred to as ordinals, and denoted α, β, . . ..
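To make clause (ix) concrete, here is a minimal Python sketch of a notation system for the ordinals below ε0 based on Cantor normal forms. It is an illustration under stated assumptions, not the EORS of the definition: ordinals are hypothetical nested tuples of exponents rather than natural numbers, multiplication is omitted, and all function names are chosen for the example. The natural sum # is the one used in the cut elimination bounds earlier in these notes.

```python
from functools import cmp_to_key

# An ordinal below epsilon_0 is the tuple (b1, ..., bk) of its Cantor normal
# form exponents, b1 >= ... >= bk, each exponent again such a tuple.
ZERO = ()
ONE = (ZERO,)      # omega^0
OMEGA = (ONE,)     # omega^1


def less(a, b):
    """Decide a < b by comparing exponent sequences lexicographically."""
    for x, y in zip(a, b):
        if x != y:
            return less(x, y)
    return len(a) < len(b)


def add(a, b):
    """Ordinal sum a + b: summands of a below the leading term of b vanish."""
    if not b:
        return a
    kept = tuple(x for x in a if not less(x, b[0]))
    return kept + b


def compare(x, y):
    """Three-way comparison derived from `less`."""
    return -1 if less(x, y) else (1 if less(y, x) else 0)


def natural_sum(a, b):
    """Natural (Hessenberg) sum a # b: merge all summands in descending order."""
    return tuple(sorted(a + b, key=cmp_to_key(compare), reverse=True))


def omega_pow(a):
    """The ordinal omega^a."""
    return (a,)
```

For example, `add(ONE, OMEGA)` returns `OMEGA`, reflecting 1 + ω = ω, whereas `natural_sum(ONE, OMEGA)` and `add(OMEGA, ONE)` both give ω + 1.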


Definition: 6.10 Suppose LO(A, ◁) and F(u) is a formula. Then TI〈A,◁〉(F) is the formula

∀n ∈ A [∀x ◁ n F(x) → F(n)] → ∀n ∈ A F(n). (49)

TI(A, ◁) is the schema consisting of TI〈A,◁〉(F) for all F.

Given a linear ordering 〈A, ◁〉 and α ∈ A let Aα = {β ∈ A : β ◁ α} and let ◁α be the restriction of ◁ to Aα.

In what follows, quantifiers and variables are supposed to range over the natural numbers. When n denotes a natural number, n̄ is the canonical name in the language under consideration which denotes that number.

Observation: 6.11 Every ordinal analysis of a classical or intuitionistic theory Tthat has ever appeared in the literature provides an EORS 〈A,, . . .〉 such that T isproof-theoretically reducible to PA +

⋃α∈A TI(Aα,α).

Moreover, if T is a classical theory, then T and PA +⋃α∈A TI(Aα,α) prove

the same arithmetic sentences, whereas if T is based on intuitionististic, then T andHA +

⋃α∈A TI(Aα,α) prove the same arithmetic sentences.

Furthermore, ‖ T ‖sup=‖ ‖.

Remark: 6.12 There is a lot of leeway in stating the latter observation. For in-stance, instead of PA one could take PRA or EA as the base theory, and the schemeof transfinite induction could be restricted to Σ0

1 formulae as PA +⋃α∈A TI(Aα,α)

and EA +⋃α∈A Σ0

1-TI(Aα,α) have the same proof-theoretic strength, providingthat A is closed under exponentiation α 7→ ωα.

Observation 6.11 lends itself to a formal definition of the notion of proof-theoreticordinal of a theory T . Of course, before one can go about determining the proof-theoretic ordinal of T , one needs to be furnished with representations of ordinals.Not surprisingly, a great deal of ordinally informative proof theory has been con-cerned with developing and comparing particular ordinal representation systems.Assuming that a sufficiently strong EORS 〈A,, . . .〉 has been provided, we define

|T |〈A,,...〉 := least ρ ∈ A. T ≡ PA +⋃αρ

TI(Aα,α) (50)

and call |T |〈A,,...〉, providing this ordinal exists, the proof-theoretic ordinal of T withrespect to 〈A,, . . .〉.

Since, in practice, the ordinal representation systems used in proof theory arecomparable, we shall frequently drop mentioning of 〈A,, . . .〉 and just write |T | for|T |〈A,,...〉.

Note, however, that |T |〈A,,...〉 might not exist even if the order-type of isbigger than ‖ T ‖sup. A simple example is provided by the theory PA + Con(PA)(where Con(PA) expresses the consistency of PA) when we take 〈A,, . . .〉 to be astandard EORS for ordinals > ε0; the reason being that PA + Con(PA) is proof-theoretically strictly stronger than PA +

⋃αε0

TI(Aα,α) but also strictly weakerthan PA +

⋃αε0+1 TI(Aα,α). Therefore, as opposed to ‖ · ‖sup, the norm |·|〈A,,...〉

is only partially defined and does not induce a prewellordering on theories T with‖ T ‖sup<‖ ‖.

The remainder of this subsection expounds on important consequences of ordinal analyses that follow from Observation 6.11.


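Proposition: 6.13 PA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ and HA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ prove the same sentences in the negative fragment, where a sentence is in the negative fragment if it is built from atomic formulae via $\wedge, \rightarrow, \neg, \forall x$.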

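Proof: PA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ can be interpreted in HA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ via the Gödel–Gentzen ¬¬-translation. Observe that for an instance of the schema of transfinite induction we have

\[ \bigl(\forall u\,[\forall x\,(\forall y\,[y\prec x\rightarrow\varphi(y)]\rightarrow\varphi(x))\rightarrow\varphi(u)]\bigr)^{\neg\neg} \;\equiv\; \forall u\,[\forall x\,(\forall y\,[\neg\neg\, y\prec x\rightarrow\neg\neg\varphi(y)]\rightarrow\neg\neg\varphi(x))\rightarrow\neg\neg\varphi(u)]. \]

Thus for primitive recursive ≺ the ¬¬-translation is HA-equivalent to an instance of the same schema. □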
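The translation used in this proof can be made concrete on formula trees. The following is a minimal sketch, assuming the usual clauses of the Gödel–Gentzen translation (atoms are doubly negated, ∧, →, ∀ are left in place, ∨ and ∃ are rewritten via ¬, ∧ and ∀); the tuple encoding of formulae and the helper names `gg` and `in_negative_fragment` are hypothetical choices made only for this illustration.

```python
# Formulas as nested tuples (an encoding assumed for this sketch only):
#   ("atom", name), ("not", A), ("and", A, B), ("or", A, B),
#   ("imp", A, B), ("all", x, A), ("ex", x, A)

def neg(a):
    return ("not", a)

def gg(f):
    """Goedel-Gentzen negative translation of the formula f."""
    tag = f[0]
    if tag == "atom":
        return neg(neg(f))                       # P becomes not-not-P
    if tag == "not":
        return neg(gg(f[1]))
    if tag in ("and", "imp"):
        return (tag, gg(f[1]), gg(f[2]))         # commutes with 'and' and '->'
    if tag == "or":
        return neg(("and", neg(gg(f[1])), neg(gg(f[2]))))
    if tag == "all":
        return ("all", f[1], gg(f[2]))
    if tag == "ex":
        return neg(("all", f[1], neg(gg(f[2]))))
    raise ValueError(f"unknown connective: {tag}")

def in_negative_fragment(f):
    """True if f is built from atoms using only and, ->, not, forall."""
    tag = f[0]
    if tag == "atom":
        return True
    if tag == "not":
        return in_negative_fragment(f[1])
    if tag in ("and", "imp"):
        return in_negative_fragment(f[1]) and in_negative_fragment(f[2])
    if tag == "all":
        return in_negative_fragment(f[2])
    return False                                 # 'or' and 'ex' are excluded

P = ("atom", "P(n,m)")
pi2 = ("all", "n", ("ex", "m", P))               # forall n exists m P(n,m)
translated = gg(pi2)
```

The image of any formula under `gg` lies in the negative fragment, which is why the translation only yields conservativity for that fragment directly.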

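Corollary: 6.14 PA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ and HA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ prove the same $\Pi^0_1$ sentences.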

Since many well-known and important theorems as well as conjectures from number theory are expressible in $\Pi^0_1$ form (examples: the quadratic reciprocity law, Wiles' theorem, also known as Fermat's conjecture, Goldbach's conjecture, the Riemann hypothesis), $\Pi^0_1$ conservativity ensures that many mathematically important theorems which turn out to be provable in S will be provable in T, too.

However, $\Pi^0_1$ conservativity is not always a satisfactory conservation result. Some important number-theoretic statements are $\Pi^0_2$ (examples are: the twin prime conjecture, miniaturized versions of Kruskal's theorem, totality of the van der Waerden function), and in particular, formulas that express the convergence of a recursive function for all arguments. Consider a formula $\forall n\,\exists m\,P(n,m)$, where $P(n,m)$ is a primitive recursive formula expressing that "m codes a complete computation of algorithm A on input n." The ¬¬-translation of this formula is $\forall n\,\neg\forall m\,\neg P(n,m)$, conveying the convergence of the algorithm A for all inputs only in a weak sense. Fortunately, Corollary 6.14 can be improved to hold for sentences of $\Pi^0_2$ form.

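Proposition: 6.15 PA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ and HA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ prove the same $\Pi^0_2$ sentences.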

The missing link to get from Proposition 6.13 to Proposition 6.15 is usually provided by Markov's Rule for primitive recursive predicates, MRPR: if $\neg\forall n\,\neg Q(n)$ (or, equivalently, $\neg\neg\exists n\,Q(n)$) is a theorem, where Q is a primitive recursive relation, then $\exists n\,Q(n)$ is a theorem. Kreisel [30] showed that MRPR holds for HA. A variety of intuitionistic systems have since been shown to be closed under MRPR, using a variety of complicated methods, notably Gödel's Dialectica interpretation and normalizability. A particularly elegant and short proof for closure under MRPR is due to Friedman [18] and, independently, to Dragalin [12]. However, though the Friedman–Dragalin argument works for a host of systems, it doesn't seem to work in the case of HA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$.

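Proof of Proposition 6.15: We will give a direct proof, i.e. without using Proposition 6.13. So suppose

\[ \mathrm{PA} + \bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha) \vdash \forall x\,\exists y\,\varphi(x,y), \]

where φ is $\Delta_0$. Then there already exists a δ ∈ A such that

\[ \mathrm{PA} + \mathrm{TI}(A_\delta,\triangleleft_\delta) \vdash \forall x\,\exists y\,\varphi(x,y). \tag{51} \]

We now use the coding of infinitary PA∞ derivations presented in [53], section 4.2.2. Let $d \vdash^{\beta}_{\rho} \ulcorner\psi\urcorner$ signify that d is the code of a PA∞ derivation with length ≤ β, cut-rank ρ and end formula ψ. (51) implies that there is a $d_0$ and n < ω such that

\[ \mathrm{HA} + \bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha) \vdash d_0 \vdash^{\delta\cdot\omega}_{n} \ulcorner\forall x\,\exists y\,\varphi(x,y)\urcorner. \tag{52} \]

To obtain a cut-free proof of $\forall x\,\exists y\,\varphi(x,y)$ in PA∞ one needs transfinite induction up to the ordinal $\omega_n^{\delta\cdot\omega}$, where $\omega_0^{\gamma} := \gamma$ and $\omega_{m+1}^{\gamma} := \omega^{\omega_m^{\gamma}}$. This amount of transfinite induction is available in our background theory HA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ as A is closed under $\xi\mapsto\omega^\xi$. Also note that the cut-elimination procedure is completely effective. Thus from (52) we obtain, for some $d^*$,

\[ \mathrm{HA} + \bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha) \vdash d^* \vdash^{\omega_n^{\delta\cdot\omega}}_{0} \ulcorner\forall x\,\exists y\,\varphi(x,y)\urcorner, \tag{53} \]

and further

\[ \mathrm{HA} + \bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha) \vdash \forall x\,\exists d\ \, d \vdash^{\omega_n^{\delta\cdot\omega}}_{0} \ulcorner\exists y\,\varphi(\dot{x},y)\urcorner \tag{54} \]

(where Feferman's dot convention has been used here). Let $\mathrm{Tr}_{\Sigma_1}$ be a truth predicate for Gödel numbers of disjunctions of $\Sigma_1$ formulae (cf. [59], section 1.5, in particular 1.5.7). We claim that

\[ \mathrm{HA} + \bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha) \vdash \forall d\ \forall\beta\le\omega_n^{\delta\cdot\omega}\ \forall\Gamma\subseteq\Sigma_1\ \bigl[\, d \vdash^{\beta}_{0} \Gamma \rightarrow \mathrm{Tr}_{\Sigma_1}(\textstyle\bigvee\Gamma)\,\bigr], \tag{55} \]

where $\forall\Gamma\subseteq\Sigma_1$ is a quantifier ranging over Gödel numbers of finite sets of $\Sigma_1$ formulae and $\bigvee\Gamma$ stands for the Gödel number corresponding to the disjunction of all formulae of Γ. (55) is proved by induction on β by observing that all formulae occurring in a cut-free PA∞ proof of a set of $\Sigma_1$ formulae are $\Sigma_1$ themselves and the only inferences therein are either axioms or instances of the (∃) rule or improper instances of the ω-rule. Combining (54) and (55) we obtain

\[ \mathrm{HA} + \bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha) \vdash \forall x\,\mathrm{Tr}_{\Sigma_1}(\ulcorner\exists y\,\varphi(\dot{x},y)\urcorner). \tag{56} \]

As

\[ \mathrm{HA} \vdash \forall x\,\bigl[\mathrm{Tr}_{\Sigma_1}(\ulcorner\exists y\,\varphi(\dot{x},y)\urcorner) \leftrightarrow \exists y\,\varphi(x,y)\bigr] \]

(cf. [59], Theorem 1.5.6), we finally obtain

\[ \mathrm{HA} + \bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha) \vdash \forall x\,\exists y\,\varphi(x,y). \]

□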


In section 2 we considered the ordinal $|T|_{Con}$. What is the relation between $|T|_{Con}$ and $|T|_{\langle A,\triangleleft,\ldots\rangle}$? First we have to delineate the meaning of $|T|_{Con}$, though. The latter is only determined with respect to a given ordinal representation system $\langle B,\prec,\ldots\rangle$. Thus let

\[ |T|_{Con} = \text{least } \alpha\in B.\ \ \mathrm{PRA} + \mathrm{PR\text{-}TI}(\alpha) \vdash \mathrm{Con}(T). \]

It turns out that the two ordinals are the same when T is proof-theoretically reducible to PA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$, A is closed under $\alpha\mapsto\omega^\alpha$ and $\langle B,\prec,\ldots\rangle$ is a proper end extension of $\langle A,\triangleleft,\ldots\rangle$. The reasons are as follows:

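Proposition: 6.16 The consistency of PA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ can be proved in the theory PRA + PR-TI$(A,\triangleleft)$, where PR-TI$(A,\triangleleft)$ stands for transfinite induction along $\triangleleft$ for primitive recursive predicates.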

Hint of proof. First note that PRA + PR-TI$(A,\triangleleft)$ ⊢ $\Pi^0_1$-TI$(A,\triangleleft)$. The key to showing this is that for each α ∈ A and each x ∈ ω we can code α and x by the ordinal ω · α + x, which is less than ω · (α + 1) and therefore in A.

Secondly, one has to show that an ordinal analysis of PA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ can be carried out in PRA + $\Pi^0_1$-TI$(A,\triangleleft)$. The main tool to achieve this is to embed PA + $\bigcup_{\alpha\in A}\mathrm{TI}(A_\alpha,\triangleleft_\alpha)$ into a system of Peano arithmetic with an infinitary rule, the so-called ω-rule, and a repetition rule, Rep, which simply repeats the premise as the conclusion. The ω-rule allows one to infer $\forall x\,\varphi(x)$ from the infinitely many premises $\varphi(\bar 0), \varphi(\bar 1), \varphi(\bar 2), \ldots$ (where $\bar n$ denotes the nth numeral); its addition accounts for the fact that the infinitary system enjoys cut-elimination. The addition of the Rep rule enables one to carry out a continuous cut elimination, due to Mints [35], which is a continuous operation in the usual tree topology on proof-trees. A further pivotal step consists in making the ω-rule more constructive by assigning codes to proofs, where codes for applications of finitary rules contain codes for the proofs of the premises, and codes for applications of the ω-rule contain Gödel numbers for primitive recursive functions enumerating codes of the premises. Details can be found in [53].

The main idea here is that we can do everything with primitive recursive proof-trees instead of arbitrary derivations. A proof-tree is a tree, with each node labelled by: a sequent, a rule of inference or the designation "Axiom", two sets of formulas specifying the set of principal and minor formulas, respectively, of that inference, and two ordinals (length and cut-rank) such that the sequent is obtained from those immediately above it through application of the specified rule of inference. The well-foundedness of a proof-tree is then witnessed by the (first) ordinal "tags", which are in reverse order of the tree order. As a result, the notion of being a (code of a) proof-tree is $\Pi^0_1$. The cut elimination for infinitary proofs with finite cut rank (as presented in [53]) can be formalized in PRA + $\Pi^0_1$-TI$(A,\triangleleft)$. The last step consists in recognizing that every end formula of $\Pi^0_1$ form of a cut-free infinitary proof is true. The latter employs $\Pi^0_1$-TI$(A,\triangleleft)$. For details see [53]. □
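The labelling of proof-tree nodes described above can be pictured with a small data structure. This is only a rough sketch under strong simplifying assumptions: the trees here are finite (the actual proof-trees are infinite and given by primitive recursive functions), natural numbers stand in for ordinal tags, and the names `Node` and `tags_decrease` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    sequent: str                      # the sequent derived at this node
    rule: str                         # name of the inference, or "Axiom"
    length: int                       # first ordinal tag (a natural number stands in)
    cut_rank: int                     # second ordinal tag
    premises: List["Node"] = field(default_factory=list)

def tags_decrease(node: Node) -> bool:
    """Check that length tags strictly decrease from conclusion to premises.

    This is the local condition that witnesses well-foundedness: along any
    branch the tags form a strictly descending sequence.
    """
    return all(
        p.length < node.length and tags_decrease(p)
        for p in node.premises
    )

leaf = Node("=> A", "Axiom", 0, 0)
root = Node("=> A or B", "or-R", 1, 0, [leaf])
```

Because the condition is checked node by node, it is a universal statement about the tree, which matches the remark that being a proof-tree is a $\Pi^0_1$ notion.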


7 Kripke-Platek Set Theory

One of the fragments of ZF which has been studied intensively is Kripke-Platek set theory, KP. Its standard models are called admissible sets. One of the reasons that this is a truly remarkable theory is that a great deal of set theory requires only the axioms of KP. An even more important reason is that admissible sets have been a major source of interaction between model theory, recursion theory and set theory (cf. [4]¹). KP arises from ZF by completely omitting the Powerset axiom and restricting Separation and Collection to absolute predicates (cf. [4]), i.e. $\Delta_0$ formulas. These alterations are suggested by the informal notion of 'predicative'.

The axiom systems for set theories considered in this paper are formulated in the usual language of set theory (called $L_\in$ hereafter) containing ∈ as the only non-logical symbol besides =. Formulae are built from prime formulae a ∈ b and a = b by use of propositional connectives and quantifiers ∀x, ∃x. Quantifiers of the forms ∀x ∈ a, ∃x ∈ a are called bounded. Bounded or $\Delta_0$-formulae are the formulae wherein all quantifiers are bounded; $\Sigma_1$-formulae are those of the form $\exists x\,\varphi(x)$ where $\varphi(a)$ is a $\Delta_0$-formula. For n > 0, $\Pi_n$-formulae ($\Sigma_n$-formulae) are the formulae with a prefix of n alternating unbounded quantifiers starting with a universal (existential) one followed by a $\Delta_0$-formula. The class of Σ-formulae is the smallest class of formulae containing the $\Delta_0$-formulae which is closed under ∧, ∨, bounded quantification and unbounded existential quantification.

One of the set theories which is amenable to ordinal analysis is Kripke-Platek set theory, KP. Its standard models are called admissible sets. One of the reasons that this is an important theory is that a great deal of set theory requires only the axioms of KP. An even more important reason is that admissible sets have been a major source of interaction between model theory, recursion theory and set theory (cf. [4]). KP arises from ZF by completely omitting the power set axiom and restricting separation and collection to bounded formulae. These alterations are suggested by the informal notion of 'predicative'.

Definition: 7.1 By a $\Delta_0$ formula or bounded formula we mean a formula of set theory in which all the quantifiers appear restricted, that is, have one of the forms (∀x∈b) or (∃x∈b).

The axioms of KP are:

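Extensionality: $\forall x\,(x\in a \leftrightarrow x\in b) \rightarrow a = b$.

Set Induction: $\forall x\,[\forall y\in x\,G(y) \rightarrow G(x)] \rightarrow \forall x\,G(x)$.

Pair: $\exists x\,(x = \{a,b\})$.

Union: $\exists x\,(x = \bigcup a)$.

Infinity: $\exists x\,[x\ne\emptyset \wedge (\forall y\in x)(\exists z\in x)(y\in z)]$.

$\Delta_0$ Separation: $\exists x\,\forall u\,[u\in x \leftrightarrow (u\in a \wedge F(u))]$ for all $\Delta_0$-formulas F.

$\Delta_0$ Collection: $(\forall x\in a)\exists y\,G(x,y) \rightarrow \exists z\,(\forall x\in a)(\exists y\in z)\,G(x,y)$ for all $\Delta_0$-formulas G.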

¹ J. Barwise: Admissible sets and structures. (Springer, Berlin, 1975)

To be more precise, the axioms of KP consist of Extensionality, Pair, Union, Infinity, Bounded Separation

\[ \exists x\,\forall u\,[u\in x \leftrightarrow (u\in a \wedge F(u))] \]

for all bounded formulae F(u), Bounded Collection

\[ \forall x\in a\,\exists y\,G(x,y) \rightarrow \exists z\,\forall x\in a\,\exists y\in z\,G(x,y) \]

for all bounded formulae G(x, y), and Set Induction

\[ \forall x\,[(\forall y\in x\,H(y)) \rightarrow H(x)] \rightarrow \forall x\,H(x) \]

for all formulae H(x).

A transitive set A such that (A, ∈) is a model of KP is called an admissible set. Of particular interest are the models of KP formed by segments of Gödel's constructible hierarchy L. The constructible hierarchy is obtained by iterating the definable powerset operation through the ordinals:

\[ L_0 = \emptyset, \qquad L_\lambda = \bigcup\{L_\beta : \beta<\lambda\}\ \ (\lambda \text{ limit}), \qquad L_{\beta+1} = \{X : X\subseteq L_\beta;\ X \text{ definable over } \langle L_\beta,\in\rangle\}. \]

So any element of L of level α is definable from elements of L with levels < α and the parameter $L_\alpha$. An ordinal α is admissible if the structure $(L_\alpha,\in)$ is a model of KP.
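The first few levels of the hierarchy can be computed explicitly. The sketch below relies on the fact that for finite n every subset of $L_n$ is definable over $\langle L_n,\in\rangle$ with parameters, so that the definable powerset coincides with the full powerset at finite stages; the function names `powerset` and `level` are hypothetical, and nothing here extends to infinite levels.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of the finite set s, as a frozenset of frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)

def level(n):
    """Return L_n for a natural number n (finite stages only)."""
    current = frozenset()             # L_0 is the empty set
    for _ in range(n):
        current = powerset(current)   # successor step; at finite stages all subsets are definable
    return current

sizes = [len(level(n)) for n in range(5)]
```

The sizes grow as 0, 1, 2, 4, 16, and each computed level is a transitive set.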

Formulae of $L_2$ can be easily translated into the language of set theory. Some of the subtheories of $Z_2$ considered above have set-theoretic counterparts, characterized by extensions of KP. KPi is an extension of KP via the axiom

(Lim) ∀x∃y [x∈y ∧ y is an admissible set].

KPl denotes the system KPi without Bounded Collection. It turns out that ($\Pi^1_1$−AC) + BI proves the same $L_2$-formulae as KPi, while ($\Pi^1_1$−CA) proves the same $L_2$-formulae as KPl.

The intuitionistic version of KP will be denoted by IKP. By $\mathrm{IKP}_0$ we denote the system IKP bereft of Set Induction.

7.1 Basic principles

The intent of this section is to explore which of the well-known provable consequences of KP carry over to IKP.

7.1.1 Ordered Pairs

By the Pairing axiom, for sets a, b we get a set y such that

∀x(x∈ y ↔ x = a ∨ x = b).


This set is unique by Extensionality; we call this set $\{a,b\}$. $\{a\} = \{a,a\}$ is the set whose unique element is a. $\langle a,b\rangle = \{\{a\},\{a,b\}\}$ is the ordered pair of a and b. We claim that if $\langle a,b\rangle = \langle c,d\rangle$ then a = c and b = d.

The usual classical proof argues by cases depending, for example, on whether or not a = b. This method is not available here as we cannot assume that instance of the classical law of excluded middle. Instead we can argue as follows. Assume that $\langle a,b\rangle = \langle c,d\rangle$.

As $\{a\}$ is an element of the left hand side it is also an element of the right hand side and so either $\{a\} = \{c\}$ or $\{a\} = \{c,d\}$. In either case a = c.

As $\{a,b\}$ is an element of the left hand side it is also an element of the right hand side and so either $\{a,b\} = \{c\}$ or $\{a,b\} = \{c,d\}$. In either case b = c or b = d. If b = c then a = c = b so that the two sets in $\langle a,b\rangle$ are equal and hence $\{c\} = \{c,d\}$, giving c = d and hence b = d. So in either case b = d. □
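The pairing property just proved can be checked mechanically on a small universe of hereditarily finite sets. This is an illustration only, not a substitute for the proof: the three-element test universe and the name `pair` are assumptions of the sketch, and Python's equality test on frozensets is of course decidable, so the constructive subtlety of the argument does not show up here.

```python
from itertools import product

def pair(a, b):
    """Kuratowski pair <a, b> = {{a}, {a, b}} built from frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})

empty = frozenset()
one = frozenset({empty})
two = frozenset({empty, one})
universe = [empty, one, two]

# <a, b> = <c, d> holds exactly when a = c and b = d.
injective = all(
    (pair(a, b) == pair(c, d)) == ((a, b) == (c, d))
    for a, b, c, d in product(universe, repeat=4)
)
```

Note that `pair(a, a)` collapses to the singleton of the singleton of `a`, the degenerate case that the classical proof singles out by a case distinction.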

We will also have use for ordered triples $\langle a,b,c\rangle$, ordered quadruples $\langle a,b,c,d\rangle$, etc. They are defined by iterating the ordered pair formation as follows: $\langle a\rangle = a$ and $\langle a_1,\ldots,a_r,a_{r+1}\rangle = \langle\langle a_1,\ldots,a_r\rangle,a_{r+1}\rangle$.

Proposition: 7.2 ($\mathrm{IKP}_0$) If c, d are sets then so is the class c × d.

Proof: Let c, d be sets. Then, as

\[ \{a\}\times d = \{\langle a,b\rangle \mid b\in d\} \]

is a set, by Replacement, so is

\[ c\times d = \bigcup_{a\in c}(\{a\}\times d) \]

by Replacement and Union. □
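The two steps of this proof translate directly into a computation on finite sets. In the following sketch the comprehension plays the role of Replacement and `frozenset().union(...)` plays the role of Union; the helper names `slice_` and `cartesian` are hypothetical.

```python
def pair(a, b):
    """Kuratowski pair <a, b> = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def slice_(a, d):
    """The set {a} x d = {<a, b> | b in d} (one use of Replacement)."""
    return frozenset(pair(a, b) for b in d)

def cartesian(c, d):
    """c x d as the union of the slices {a} x d for a in c."""
    return frozenset().union(*(slice_(a, d) for a in c))

c = frozenset({0, 1})
d = frozenset({2, 3, 4})
prod = cartesian(c, d)
```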

Definition: 7.3 The collection of Σ formulae is the smallest collection containing the $\Delta_0$ formulae closed under conjunction, disjunction, bounded quantification and unbounded existential quantification. The collection of Π formulae is the smallest collection containing the $\Delta_0$ formulae closed under conjunction, disjunction, bounded quantification and unbounded universal quantification.

Given a formula A and a variable w not appearing in A, we write $A^w$ for the result of replacing each unbounded quantifier ∃x and ∀x in A by ∃x∈w and ∀x∈w, respectively.
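Both the inductive definition of Σ formulae and the relativization $A^w$ are structural recursions on formulae, which the following sketch makes explicit. The tuple encoding of formulae and the function names `relativize`, `is_delta0` and `is_sigma` are assumptions of this illustration; no check is made that w is fresh for A.

```python
# Formulas as nested tuples (encoding assumed for this sketch):
#   ("in", x, y), ("eq", x, y), ("not", A), ("and", A, B), ("or", A, B),
#   ("imp", A, B), ("ball", x, t, A) for (forall x in t) A,
#   ("bex", x, t, A) for (exists x in t) A, ("all", x, A), ("ex", x, A)

def relativize(f, w):
    """Return A^w: bound every unbounded quantifier of f by w."""
    tag = f[0]
    if tag in ("in", "eq"):
        return f
    if tag == "not":
        return ("not", relativize(f[1], w))
    if tag in ("and", "or", "imp"):
        return (tag, relativize(f[1], w), relativize(f[2], w))
    if tag in ("ball", "bex"):
        return (tag, f[1], f[2], relativize(f[3], w))
    if tag == "all":
        return ("ball", f[1], w, relativize(f[2], w))
    if tag == "ex":
        return ("bex", f[1], w, relativize(f[2], w))
    raise ValueError(f"unknown connective: {tag}")

def is_delta0(f):
    """True if all quantifiers in f are bounded."""
    tag = f[0]
    if tag in ("in", "eq"):
        return True
    if tag == "not":
        return is_delta0(f[1])
    if tag in ("and", "or", "imp"):
        return is_delta0(f[1]) and is_delta0(f[2])
    if tag in ("ball", "bex"):
        return is_delta0(f[3])
    return False

def is_sigma(f):
    """True if f lies in the smallest class described in Definition 7.3."""
    if is_delta0(f):
        return True
    tag = f[0]
    if tag in ("and", "or"):
        return is_sigma(f[1]) and is_sigma(f[2])
    if tag in ("ball", "bex"):
        return is_sigma(f[3])
    if tag == "ex":
        return is_sigma(f[2])
    return False

# (forall x in a) exists y (x in y): a Sigma formula that is not Delta_0.
A = ("ball", "x", "a", ("ex", "y", ("in", "x", "y")))
A_w = relativize(A, "w")
```

Relativizing any formula yields a $\Delta_0$ formula, which is what makes $\exists a\,A^a$ a $\Sigma_1$ formula whenever A is Σ.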

Lemma: 7.4 For each Σ formula A the following are intuitionistically valid:

(i) $A^u \wedge u\subseteq v \rightarrow A^v$,

(ii) $A^u \rightarrow A$.

Proof: Both facts are proved by induction following the inductive definition of Σ formula. □


Theorem: 7.5 (Σ Reflection Principle). For all Σ formulae A we have the following:

\[ \mathrm{IKP}_0 \vdash A \leftrightarrow \exists a\,A^a. \]

(Here a is any set variable not occurring in A; we will not continue to make these annoying conditions on variables explicit.) In particular, every Σ formula is equivalent to a $\Sigma_1$ formula in $\mathrm{IKP}_0$.

Proof: We know from the previous lemma that $\exists a\,A^a \rightarrow A$, so the axioms of $\mathrm{IKP}_0$ come in only in showing $A \rightarrow \exists a\,A^a$. The proof is by induction on A, the case for $\Delta_0$ formulae being trivial. We treat the most interesting cases, leaving the remaining case of bounded existential quantification to the reader.

Case 0. If A is $\Delta_0$ then $A \leftrightarrow A^a$ holds for every set a.

Case 1. A is B ∧ C. By induction hypothesis, $\mathrm{IKP}_0 \vdash B \leftrightarrow \exists a\,B^a$ and $\mathrm{IKP}_0 \vdash C \leftrightarrow \exists a\,C^a$. Let us work in $\mathrm{IKP}_0$, assuming B ∧ C. Now there are $a_1, a_2$ such that $B^{a_1}$, $C^{a_2}$, so let $a = a_1\cup a_2$. Then $B^a$ and $C^a$ hold by the previous lemma, and hence $A^a$.

Case 2. A is B ∨ C. By induction hypothesis, $\mathrm{IKP}_0 \vdash B \leftrightarrow \exists a\,B^a$ and $\mathrm{IKP}_0 \vdash C \leftrightarrow \exists a\,C^a$. Let us work in $\mathrm{IKP}_0$, assuming B ∨ C. Then $B^{a_1}$ for some set $a_1$ or there is a set $a_2$ such that $C^{a_2}$. In the first case we have $B^a \vee C^a$ with $a := a_1$, while in the second case we have $B^a \vee C^a$ with $a := a_2$.

Case 3. A is $\forall u\in v\,B(u)$. The inductive assumption yields $\mathrm{IKP}_0 \vdash B(u) \leftrightarrow \exists a\,B(u)^a$. Again, working in $\mathrm{IKP}_0$, assume $\forall u\in v\,B(u)$ and show $\exists a\,\forall u\in v\,B(u)^a$. For each u ∈ v there is a b such that $B(u)^b$, so by $\Delta_0$ Collection there is an $a_0$ such that $\forall u\in v\,\exists b\in a_0\,B(u)^b$. Let $a = \bigcup a_0$. Now, for every u ∈ v, we have $\exists b\subseteq a\,B(u)^b$; so $\forall u\in v\,B(u)^a$, by the previous lemma.

Case 4. A is $\exists u\,B(u)$. Inductively we have $\mathrm{IKP}_0 \vdash B(u) \leftrightarrow \exists b\,B(u)^b$. Working in $\mathrm{IKP}_0$, assume $\exists u\,B(u)$. Pick $u_0$ such that $B(u_0)$ and b such that $B(u_0)^b$. Letting $a = b\cup\{u_0\}$ we get $u_0\in a$ and $B(u_0)^a$ by the previous lemma. Thence $\exists a\,\exists u\in a\,B(u)^a$. □

In Platek's original definition of admissible set he took the Σ Reflection Principle as basic. It is very powerful, as we'll see below. $\Delta_0$ Collection is easier to verify, however.

Theorem: 7.6 (The Strong Σ Collection Principle). For every Σ formula A the following is a theorem of $\mathrm{IKP}_0$: If $\forall x\in a\,\exists y\,A(x,y)$ then there is a set b such that $\forall x\in a\,\exists y\in b\,A(x,y)$ and $\forall y\in b\,\exists x\in a\,A(x,y)$.

Proof: Assume that

\[ \forall x\in a\,\exists y\,A(x,y). \]

By Σ Reflection there is a set c such that

\[ \forall x\in a\,\exists y\in c\,A(x,y)^c. \tag{57} \]

Let

\[ b = \{y\in c \mid \exists x\in a\,A(x,y)^c\}, \tag{58} \]

by $\Delta_0$ Separation. Now, since $A(x,y)^c \rightarrow A(x,y)$ by 7.4, (57) gives us $\forall x\in a\,\exists y\in b\,A(x,y)$, whereas (58) gives us $\forall y\in b\,\exists x\in a\,A(x,y)$. □


Theorem: 7.7 (Σ Replacement). For each Σ formula A(x, y) the following is a theorem of $\mathrm{IKP}_0$: If $\forall x\in a\,\exists!y\,A(x,y)$ then there is a function f, with dom(f) = a, such that $\forall x\in a\,A(x,f(x))$.

Proof: By Σ Reflection there is a set d such that

\[ \forall x\in a\,\exists y\in d\,A(x,y)^d. \]

Since $A(x,y)^d$ implies $A(x,y)$ we get $\forall x\in a\,\exists!y\in d\,A(x,y)^d$. Thus, defining $f = \{\langle x,y\rangle\in a\times d \mid A(x,y)^d\}$ by $\Delta_0$ Separation, f is a function satisfying dom(f) = a and $\forall x\in a\,A(x,f(x))$. □

The above is sometimes infeasible because of the uniqueness requirement ∃! in the hypothesis. In these situations it is usually the next result which comes to the rescue.

Theorem: 7.8 (Strong Σ Replacement). For each Σ formula A(x, y) the following is a theorem of $\mathrm{IKP}_0$: If $\forall x\in a\,\exists y\,A(x,y)$ then there is a function f with dom(f) = a such that for all x ∈ a, f(x) is inhabited and $\forall x\in a\,\forall y\in f(x)\,A(x,y)$.

Proof: Exercise. □

One principle of KP that is not provable in IKP is $\Delta_1$ Separation.

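Proposition: 7.9 ($\mathrm{KP}_0$) ($\Delta_1$ Separation). If A(x) is a Σ formula and B(x) is a Π formula, then

\[ \mathrm{KP}_0 \vdash \forall x\in a\,[A(x)\leftrightarrow B(x)] \rightarrow \exists z\,\forall u\,[u\in z \leftrightarrow (u\in a \wedge A(u))]. \]

Proof: The reason is that classically $\forall x\in a\,[A(x)\leftrightarrow B(x)]$ entails $\forall x\in a\,[A(x)\vee\neg B(x)]$, which is classically equivalent to a Σ formula. □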

7.2 Σ Recursion in IKP

The mathematical power of KP resides in the possibility of defining Σ functions by ∈-recursion and the fact that many interesting functions in set theory are definable by Σ Recursion. Moreover the scheme of $\Delta_0$ Separation allows for an extension with provable Σ functions occurring in otherwise bounded formulae.

Proposition: 7.10 (Definition by Σ Recursion in IKP.) If G is a total (n+2)-ary Σ definable class function of IKP, i.e.

\[ \mathrm{IKP} \vdash \forall\vec{x}\,y\,z\ \exists!u\ G(\vec{x},y,z) = u, \]

then there is a total (n+1)-ary Σ class function F of IKP such that²

\[ \mathrm{IKP} \vdash \forall\vec{x}\,y\ [F(\vec{x},y) = G(\vec{x},y,(F(\vec{x},z) \mid z\in y))]. \]

² $(F(\vec{x},z) \mid z\in y) := \{\langle z,F(\vec{x},z)\rangle : z\in y\}$


Proof: Let $A(f,\vec{x})$ be the formula

\[ [f \text{ is a function}] \wedge [\mathrm{dom}(f) \text{ is transitive}] \wedge [\forall y\in\mathrm{dom}(f)\,(f(y) = G(\vec{x},y,f\restriction y))]. \]

Set

\[ B(\vec{x},y,f) = [A(f,\vec{x}) \wedge y\in\mathrm{dom}(f)]. \]

Claim: $\mathrm{IKP} \vdash \forall\vec{x},y\ \exists!f\ B(\vec{x},y,f)$.

Proof of Claim: By ∈-induction on y. Suppose $\forall u\in y\,\exists g\,B(\vec{x},u,g)$. By Strong Σ Collection we find a set A such that $\forall u\in y\,\exists g\in A\,B(\vec{x},u,g)$ and $\forall g\in A\,\exists u\in y\,B(\vec{x},u,g)$. Let $f_0 = \bigcup\{g : g\in A\}$. By our general assumption there exists a $u_0$ such that $G(\vec{x},y,(f_0(u) \mid u\in y)) = u_0$. Set $f = f_0\cup\{\langle y,u_0\rangle\}$. Since for all g ∈ A, dom(g) is transitive, we have that dom($f_0$) is transitive. If u ∈ y, then u ∈ dom($f_0$). Thus dom(f) is transitive and y ∈ dom(f). We have to show that f is a function. But it is readily shown that if $g_0, g_1\in A$, then $\forall x\in\mathrm{dom}(g_0)\cap\mathrm{dom}(g_1)\,[g_0(x) = g_1(x)]$. Therefore f is a function. This also shows that $\forall w\in\mathrm{dom}(f)\,[f(w) = G(\vec{x},w,f\restriction w)]$, confirming the claim (using Set Induction).

Now define F by

\[ F(\vec{x},y) = w \ := \ \exists f\,[B(\vec{x},y,f) \wedge f(y) = w]. \]

□

Corollary: 7.11 There is a Σ function TC of IKP such that

\[ \mathrm{IKP} \vdash \forall a\,[\mathrm{TC}(a) = a\cup\bigcup\{\mathrm{TC}(x) : x\in a\}]. \]
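On hereditarily finite sets, definition by ∈-recursion is ordinary structural recursion, and the transitive closure is a direct instance. The sketch below omits the parameters $\vec{x}$ and represents the graph $(F(z) \mid z\in y)$ as a Python dictionary; the names `recursion`, `TC` and `rank` are hypothetical, and `rank` is included only as a second illustrative instance not taken from the text.

```python
def recursion(G):
    """Return the function F with F(y) = G(y, (F(z) | z in y))."""
    def F(y):
        # The dictionary plays the role of the set {<z, F(z)> : z in y}.
        return G(y, {z: F(z) for z in y})
    return F

# TC(a) = a ∪ ⋃{TC(x) : x ∈ a}
TC = recursion(lambda y, prev: y.union(*prev.values()))

# rank(a) = sup{rank(x) + 1 : x ∈ a}
rank = recursion(lambda y, prev: max((r + 1 for r in prev.values()), default=0))

empty = frozenset()
one = frozenset({empty})
a = frozenset({one})          # the set {{∅}}
```

Termination of the recursion corresponds to the use of Set Induction in the proof of the claim above.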

Proposition: 7.12 (Definition by TC-Recursion) Under the assumptions of Proposition 7.10 there is an (n+1)-ary Σ class function F of IKP such that

\[ \mathrm{IKP} \vdash \forall\vec{x}\,y\ [F(\vec{x},y) = G(\vec{x},y,(F(\vec{x},z) \mid z\in\mathrm{TC}(y)))]. \]

Proof: Hint: Let $C(f,\vec{x},y)$ be the Σ formula

\[ [f \text{ is a function}] \wedge [\mathrm{dom}(f) = \mathrm{TC}(y)] \wedge [\forall u\in\mathrm{dom}(f)\,[f(u) = G(\vec{x},u,f\restriction\mathrm{TC}(u))]]. \]

Prove by ∈-induction that $\forall y\,\exists!f\ C(f,\vec{x},y)$. □


8 An Ordinal representation system for the Bachmann-Howard ordinal

Serving as a miniature example of an ordinal analysis of an impredicative system, we carry out an ordinal analysis of KP. The first step is to find a sufficiently strong ordinal representation system.

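Definition: 8.1 Let Ω be a "big" ordinal, e.g. Ω = ℵ₁. By recursion on α we define sets B(α) and the ordinal $\psi_\Omega(\alpha)$ as follows:

\[ B(\alpha) = \text{closure of } \{0,\Omega\} \text{ under: } +,\ (\xi\mapsto\omega^\xi),\ (\xi,\eta\mapsto\varphi_\xi(\eta)),\ (\xi\mapsto\psi_\Omega(\xi))_{\xi<\alpha} \tag{59} \]

\[ \psi_\Omega(\alpha) = \min\{\rho<\Omega \mid \rho\notin B(\alpha)\} \tag{60} \]

if the set $\{\rho<\Omega \mid \rho\notin B(\alpha)\}$ is non-empty.

As per definition $\psi_\Omega(\alpha)$ might not be defined, but the next Lemma shows that $\psi_\Omega$ is a total function.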

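Lemma: 8.2 (i) B(α) is a countable set.

(ii) $\psi_\Omega(\alpha)$ is always defined and $\psi_\Omega(\alpha) < \Omega$.

Proof: (i) $B(\alpha) = \bigcup_{n<\omega} B_n(\alpha)$ where $B_0(\alpha) = \{0,\Omega\}$ and

\[ B_{n+1}(\alpha) = B_n(\alpha) \cup \{\eta+\delta \mid \eta,\delta\in B_n(\alpha)\} \cup \{\varphi_\eta(\delta) \mid \eta,\delta\in B_n(\alpha)\} \cup \{\psi_\Omega(\xi) \mid \xi\in B_n(\alpha) \wedge \xi<\alpha\}. \]

Inductively, each of the sets $B_n(\alpha)$ is countable (actually finite) and therefore B(α) is countable.

(ii) Ω is assumed to be a regular uncountable cardinal, thus B(α) ∩ Ω cannot be unbounded in Ω. □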

Lemma: 8.3 (i) If α ≤ δ then B(α) ⊆ B(δ) and $\psi_\Omega(\alpha) \le \psi_\Omega(\delta)$.

(ii) If α ∈ B(δ) ∩ δ then $\psi_\Omega(\alpha) < \psi_\Omega(\delta)$.

(iii) If α ≤ δ and [α, δ) ∩ B(α) = ∅ then B(α) = B(δ).

(iv) If λ is a limit then $B(\lambda) = \bigcup_{\xi<\lambda} B(\xi)$.

Proof: (i): B(α) ⊆ B(δ) is clearly true if α ≤ δ. And thus $\psi_\Omega(\alpha) \le \psi_\Omega(\delta)$ follows by definition and Lemma 8.2.

(ii): From α ∈ B(δ) ∩ δ we get $\psi_\Omega(\alpha)\in B(\delta)$ and also, by (i), $\psi_\Omega(\alpha) \le \psi_\Omega(\delta)$. Since $\psi_\Omega(\delta)\notin B(\delta)$ this entails $\psi_\Omega(\alpha) < \psi_\Omega(\delta)$.

(iii): By induction on n one easily shows that $B_n(\delta)\subseteq B(\alpha)$. This is obvious for n = 0. Assume it is true for n. If β < δ and $\beta\in B_n(\delta)$ then inductively we have β ∈ B(α) and hence β < α, yielding $\psi_\Omega(\beta)\in B(\alpha)$. Thus we get $B_{n+1}(\delta)\subseteq B(\alpha)$.

(iv): By (i) we have $\bigcup_{\xi<\lambda} B(\xi) \subseteq B(\lambda)$. To show the reverse inclusion we only need to show that $\bigcup_{\xi<\lambda} B(\xi)$ is closed under the operations that define B(λ). This is obvious for + and φ. So assume that $\delta\in\bigcup_{\xi<\lambda} B(\xi)\cap\lambda$. Then $\delta<\xi_0$ and $\delta\in B(\xi_1)$ for some $\xi_0,\xi_1<\lambda$. Thus, letting $\xi^* = \max(\xi_0,\xi_1)$, we have $\psi_\Omega(\delta)\in B(\xi^*)\subseteq\bigcup_{\xi<\lambda} B(\xi)$. □

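Lemma: 8.4 $\psi_\Omega(\alpha)\in SC$, i.e., $\varphi_{\psi_\Omega(\alpha)}(0) = \psi_\Omega(\alpha)$.

Proof: If $\psi_\Omega(\alpha) = \xi+\delta$ for some $\xi,\delta<\psi_\Omega(\alpha)$, then ξ, δ ∈ B(α) and therefore $\psi_\Omega(\alpha) = \xi+\delta\in B(\alpha)$, contradicting the definition of $\psi_\Omega(\alpha)$.

Likewise, if $\psi_\Omega(\alpha) = \varphi_\rho(\eta)$ for some $\rho,\eta<\psi_\Omega(\alpha)$ then ρ, η ∈ B(α) and therefore $\psi_\Omega(\alpha) = \varphi_\rho(\eta)\in B(\alpha)$, contradicting the definition of $\psi_\Omega(\alpha)$. Thus $\psi_\Omega(\alpha)\in SC$ follows by Lemma 4.30. □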

Theorem: 8.5 $B(\alpha)\cap\Omega = \psi_\Omega(\alpha)$.

Proof: Clearly, $\psi_\Omega(\alpha)\subseteq B(\alpha)\cap\Omega$.

To conclude equality, it suffices to show that $X := \psi_\Omega(\alpha)\cup\{\delta\in B(\alpha) \mid \delta\ge\Omega\}$ is closed under the operations that define B(α). Closure of X under + and φ follows from Lemma 8.4. To show closure under $\psi_\Omega$ for arguments < α, assume β ∈ X and β < α. Then $\psi_\Omega(\beta) < \psi_\Omega(\alpha)$ by Lemma 8.3(ii), and hence $\psi_\Omega(\beta)\in X$. □

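Corollary: 8.6 If λ is a limit then $\psi_\Omega(\lambda) = \sup_{\xi<\lambda}\psi_\Omega(\xi)$.

Proof:

\[ \psi_\Omega(\lambda) = B(\lambda)\cap\Omega = \Bigl(\bigcup_{\xi<\lambda} B(\xi)\Bigr)\cap\Omega = \bigcup_{\xi<\lambda}(B(\xi)\cap\Omega) = \bigcup_{\xi<\lambda}\psi_\Omega(\xi) = \sup_{\xi<\lambda}\psi_\Omega(\xi). \]

Here the first and fourth equality follow from Theorem 8.5 while the second equality is a consequence of Lemma 8.3(iv). □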

Definition: 8.7 Let $\beta^\Gamma$ denote the least ordinal ρ > β such that ρ ∈ SC.

Lemma: 8.8 (i) $\psi_\Omega(\alpha+1) \le (\psi_\Omega(\alpha))^\Gamma$.

(ii) α ∈ B(α + 1) implies $\psi_\Omega(\alpha+1) = (\psi_\Omega(\alpha))^\Gamma$.

(iii) α ∉ B(α) implies B(α) = B(α + 1) and $\psi_\Omega(\alpha+1) = \psi_\Omega(\alpha)$.

Proof: (i): It suffices to show that

\[ Y := \bigl(B(\alpha+1)\cap(\psi_\Omega(\alpha))^\Gamma\bigr)\cup\{\delta\in B(\alpha+1) \mid \delta\ge\Omega\} \]

is closed under the operations that define B(α + 1). Clearly, Y is closed under + and φ. If β ∈ Y and β < α + 1, then $\psi_\Omega(\beta)\le\psi_\Omega(\alpha)$ and hence $\psi_\Omega(\beta)\in Y$.

(ii): α ∈ B(α + 1) yields $\psi_\Omega(\alpha) < \psi_\Omega(\alpha+1)$. Let $\psi_\Omega(\alpha) < \eta < (\psi_\Omega(\alpha))^\Gamma$. Then η ∉ SC. By induction on η one therefore easily shows that $\eta < \psi_\Omega(\alpha+1)$. Together with (i) this implies $\psi_\Omega(\alpha+1) = (\psi_\Omega(\alpha))^\Gamma$.

(iii) follows from Lemma 8.3(iii) since α ∉ B(α) yields B(α) ∩ [α, α+1) = ∅. □


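Theorem: 8.9 (i) If $\xi < \psi_\Omega(\Omega)$ then $\xi < \Gamma_\xi = \psi_\Omega(\xi) < \psi_\Omega(\Omega)$.

(ii) $\Gamma_{\psi_\Omega(\Omega)} = \psi_\Omega(\Omega)$.

(iii) If $\psi_\Omega(\Omega)\le\xi\le\Omega$ then $\psi_\Omega(\xi) = \psi_\Omega(\Omega)$.

Proof: Exercise.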
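As a small worked instance of part (i), offered as a sketch rather than as part of the exercise's intended solution: the ordinals below Ω that can be built from 0 and Ω by + and φ alone are exactly those below $\Gamma_0$, since any term involving Ω denotes an ordinal ≥ Ω. Combining this with Lemma 8.8(ii) for α = 0 gives the first two values of $\psi_\Omega$.

```latex
\begin{align*}
  B(0)\cap\Omega &= \Gamma_0
    &&\Longrightarrow\quad \psi_\Omega(0) = \Gamma_0
    &&\text{(by Theorem 8.5)},\\
  0 \in B(1)
    &&&\Longrightarrow\quad \psi_\Omega(1) = (\psi_\Omega(0))^{\Gamma} = \Gamma_1
    &&\text{(by Lemma 8.8(ii))}.
\end{align*}
```

Here $\Gamma_1$ is the least strongly critical ordinal above $\Gamma_0$, in accordance with Definition 8.7.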

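Definition: 8.10 We write $\delta =_{NF} \varphi_\xi(\eta)$ if $\delta = \varphi_\xi(\eta)$ and ξ, η < δ.

We write $\delta =_{NF} \psi_\Omega(\alpha)$ if $\delta = \psi_\Omega(\alpha)$ and α ∈ B(α).

Note that by Lemma 8.3, $\delta =_{NF} \psi_\Omega(\alpha)$ and $\delta =_{NF} \psi_\Omega(\beta)$ implies α = β.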

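Lemma: 8.11 (i) If $\beta =_{NF} \beta_1+\ldots+\beta_n$ and β ∈ B(α) then $\beta_1,\ldots,\beta_n\in B(\alpha)$.

(ii) If $\delta =_{NF} \varphi_\xi(\eta)\in B(\alpha)$ then ξ, η ∈ B(α).

(iii) If $\delta =_{NF} \psi_\Omega(\beta)\in B(\rho)$ then β ∈ B(ρ) and β < ρ.

Proof: (i): Define

\[ X := \{\beta\in B(\alpha) \mid \text{if } \beta =_{NF} \beta_1+\ldots+\beta_n \text{ for some } \beta_1,\ldots,\beta_n \text{ then } \beta_1,\ldots,\beta_n\in B(\alpha)\}. \]

Show that X is closed under the operations that define B(α).

(ii): Define

\[ Y := \{\beta\in B(\alpha) \mid \text{if } \beta =_{NF} \varphi_\xi(\eta) \text{ for some } \xi,\eta \text{ then } \xi,\eta\in B(\alpha)\}. \]

Show that Y is closed under the operations that define B(α).

(iii): $\psi_\Omega(\beta)\in B(\rho)$ implies $\psi_\Omega(\beta) < \psi_\Omega(\rho)$ and hence β < ρ. As β ∈ B(β) we also get β ∈ B(ρ). □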

Remark: 8.12 It is essential to require that Ω ∈ B(α). If, instead of 0, Ω ∈ B(α), one would require only 0 ∈ B(α), then $\bigcup_{\alpha\in ON} B(\alpha) = \sigma$, where σ is the least ordinal such that $\Gamma_\sigma = \sigma$.

8.1 The ordinal representation system OT(Ω)

We will single out a set of ordinals that can be viewed as an ordinal representation system in that all ordinals in it have a unique representation over the alphabet $0, \Omega, +, \varphi, \psi_\Omega$.

Definition: 8.13 The set OT(Ω) and Gα for α ∈ OT(Ω) are inductively defined by the following clauses:

(R1) 0, Ω ∈ OT(Ω) and G0 = GΩ := 0.

(R2) If $\alpha =_{NF} \alpha_1+\ldots+\alpha_n$, n > 1 and $\alpha_1,\ldots,\alpha_n\in OT(\Omega)$ then α ∈ OT(Ω) and $G\alpha = \max(G\alpha_1,\ldots,G\alpha_n)+1$.

(R3) If $\alpha =_{NF} \varphi_\beta(\delta)$, β, δ < Ω and β, δ ∈ OT(Ω) then α ∈ OT(Ω) and $G\alpha = \max(G\beta,G\delta)+1$.

(R4) If $\alpha =_{NF} \omega^\beta$, β > Ω and β ∈ OT(Ω) then α ∈ OT(Ω) and $G\alpha = (G\beta)+1$.

(R5) If $\alpha =_{NF} \psi_\Omega(\beta)$, β ∈ OT(Ω) and β ∈ B(β) then α ∈ OT(Ω) and $G\alpha = (G\beta)+1$.

It follows from earlier results that any α ∈ OT(Ω) enters this set according to exactly one of the rules (R1)-(R5) in exactly one way, and thus Gα is defined unambiguously. In particular, OT(Ω) can be viewed as a set of terms which are composed of the symbols $0, \Omega, +, \varphi, \psi_\Omega$ in a unique way. What we are driving at next is a procedure that enables us to decide for α, β ∈ OT(Ω) with α ≠ β whether α < β or α > β solely by inspection of their term representation. We also need a recipe to decide whether an expression made up of the symbols $0, \Omega, +, \varphi, \psi_\Omega$ represents an ordinal of OT(Ω). The main obstacle is raised by (R5) since we do not know how to deal with the condition β ∈ B(β). This problem gives rise to the following definition.

Definition: 8.14 Inductive definition of Kα for α ∈ OT(Ω).

(K1) K0 = KΩ = ∅.

(K2) $K\alpha = K\alpha_1\cup\ldots\cup K\alpha_n$ if $\alpha =_{NF} \alpha_1+\ldots+\alpha_n$ where n > 1.

(K3) $K\alpha = K\beta\cup K\delta$ if $\alpha =_{NF} \varphi_\beta(\delta)$.

(K4) $K\alpha = K\beta\cup\{\beta\}$ if $\alpha =_{NF} \psi_\Omega(\beta)$.

If X is a set of ordinals we write X < η to convey that ξ < η holds for all ξ ∈ X.

Note that Kα is always a finite set.
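Since G and K are defined by recursion on the build-up of terms, both can be computed by a straightforward traversal. The following sketch treats elements of OT(Ω) as term trees; the tuple encoding and the constructor names `plus`, `phi`, `psi` are assumptions of the illustration, the clause (R4) for $\omega^\beta$ is left out, and no normal-form side conditions are checked.

```python
ZERO = ("0",)
OMEGA = ("Omega",)

def plus(*summands):
    return ("+",) + summands

def phi(beta, delta):
    return ("phi", beta, delta)

def psi(beta):
    return ("psi", beta)

def G(t):
    """The measure G: 0 for the constants, else one more than the max over arguments."""
    if t[0] in ("0", "Omega"):
        return 0
    return max(G(s) for s in t[1:]) + 1

def K(t):
    """The finite set K t: all arguments of psi occurring in the term t."""
    if t[0] in ("0", "Omega"):
        return frozenset()                       # (K1)
    below = frozenset().union(*(K(s) for s in t[1:]))
    if t[0] == "psi":
        return below | {t[1]}                    # (K4) adds the argument itself
    return below                                 # (K2), (K3)

beta = plus(OMEGA, psi(ZERO))                    # the term  Omega + psi(0)
alpha = psi(beta)                                # the term  psi(Omega + psi(0))
```

For this `alpha`, K collects the two ψ-arguments `ZERO` and `beta`, each of which has a strictly smaller G-value than `alpha`.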

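Lemma: 8.15 Let α ∈ OT(Ω). Then α ∈ B(ρ) if and only if Kα < ρ.

Proof: We proceed by induction on Gα.

If $\alpha =_{NF} \alpha_1+\ldots+\alpha_n$ with n > 1 then:

\[ \alpha\in B(\rho) \ \text{ iff } \ \alpha_1,\ldots,\alpha_n\in B(\rho) \ \text{ iff } \ K\alpha_1\cup\ldots\cup K\alpha_n<\rho \ \text{ iff } \ K\alpha<\rho, \]

using Lemma 8.11(i) and the induction hypothesis.

Likewise, if $\alpha =_{NF} \varphi_\eta(\beta)$ then:

\[ \alpha\in B(\rho) \ \text{ iff } \ \eta,\beta\in B(\rho) \ \text{ iff } \ K\eta\cup K\beta<\rho \ \text{ iff } \ K\alpha<\rho, \]

using Lemma 8.11(ii) and the induction hypothesis.

Now let $\alpha =_{NF} \psi_\Omega(\beta)$. Then:

\[ \alpha\in B(\rho) \ \text{ iff } \ \beta\in B(\rho)\wedge\beta<\rho \ \text{ iff } \ K\beta<\rho\wedge\beta<\rho \ \text{ iff } \ K\alpha<\rho, \]

using Lemma 8.11(iii) and the induction hypothesis. □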

Lemma: 8.16 If α ∈ OT(Ω) then ∀β ∈ Kα Gβ < Gα.

Proof: Use induction on Gα. □

Summarizing results from section 4 and this section we arrive at a primitive recursive characterization of < on OT(Ω). Below we write α ∈ SC if α = Ω or $\alpha =_{NF} \psi_\Omega(\delta)$ for some δ.


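Lemma: 8.17 Let α, β ∈ OT(Ω). Then α < β holds if and only if one of the following conditions is satisfied:

1. α = 0 and β ≠ 0.

2. $\alpha =_{NF} \alpha_0+\ldots+\alpha_n$, $\beta =_{NF} \beta_0+\ldots+\beta_m$, 0 < n < m and $\forall i\le n\ \alpha_i = \beta_i$.

3. $\alpha =_{NF} \alpha_0+\ldots+\alpha_n$, $\beta =_{NF} \beta_0+\ldots+\beta_m$, 0 < n, m and $\exists i\le\min(n,m)\,[\forall j<i\ \alpha_j = \beta_j \wedge \alpha_i<\beta_i]$.

4. $\alpha =_{NF} \alpha_0+\ldots+\alpha_n$, n > 0, β ∈ AP and $\alpha_0<\beta$.

5. α ∈ AP, $\beta =_{NF} \beta_0+\ldots+\beta_n$, n > 0 and $\alpha\le\beta_0$.

6. $\alpha =_{NF} \varphi_{\alpha_1}(\alpha_2)$, $\beta =_{NF} \varphi_{\beta_1}(\beta_2)$, $\alpha_1<\beta_1$ and $\alpha_2<\beta$.

7. $\alpha =_{NF} \varphi_{\alpha_1}(\alpha_2)$, $\beta =_{NF} \varphi_{\beta_1}(\beta_2)$, $\alpha_1 = \beta_1$ and $\alpha_2<\beta_2$.

8. $\alpha =_{NF} \varphi_{\alpha_1}(\alpha_2)$, $\beta =_{NF} \varphi_{\beta_1}(\beta_2)$, $\beta_1<\alpha_1$ and $\alpha<\beta_2$.

9. $\alpha =_{NF} \varphi_{\alpha_1}(\alpha_2)$, $\alpha_1,\alpha_2<\beta$ and β ∈ SC.

10. α ∈ SC, $\beta =_{NF} \varphi_{\beta_1}(\beta_2)$ and $\alpha\le\max(\beta_1,\beta_2)$.

11. $\alpha =_{NF} \psi_\Omega(\alpha_0)$, $\beta =_{NF} \psi_\Omega(\beta_0)$ and $\alpha_0<\beta_0$.

12. $\alpha =_{NF} \psi_\Omega(\alpha_0)$ and β = Ω.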
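The flavour of deciding < purely from term representations can be seen in a much simpler setting. The sketch below is not an implementation of the lemma: it assumes the restricted fragment of Cantor normal forms below $\varepsilon_0$ (terms built from 0, + and $\xi\mapsto\omega^\xi$ only), where the sum clauses reduce to a lexicographic comparison of summands. The representation and the name `less` are hypothetical.

```python
def less(a, b):
    """Decide a < b for Cantor normal forms given as tuples of exponents.

    A tuple (e1, ..., en) stands for w^e1 + ... + w^en with e1 >= ... >= en,
    where each exponent is again such a tuple; the empty tuple stands for 0.
    """
    for x, y in zip(a, b):
        if less(x, y):
            return True       # first differing summand is smaller (cf. condition 3)
        if less(y, x):
            return False
    # All compared summands agree: a < b iff a is a proper initial segment of b
    # (this covers the analogues of conditions 1 and 2).
    return len(a) < len(b)

zero = ()
one = (zero,)                 # w^0
omega = (one,)                # w^1
omega_plus_one = (one, zero)  # w^1 + w^0
omega_times_two = (one, one)  # w^1 + w^1
omega_to_omega = (omega,)     # w^w
```

A full decision procedure for OT(Ω) would additionally have to handle the clauses for φ, $\psi_\Omega$ and Ω, with the recursion measured by G as in the lemma.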

Proposition: 8.18 $OT(\Omega)\subseteq B(\varepsilon_{\Omega+1})\cap\varepsilon_{\Omega+1}$.

Proof: Use induction on Gα for α ∈ OT(Ω). □


9 KP goes infinite: LRS

A peculiarity of PA is that every object n of the intended model has a canonical name in the language, namely, the nth numeral. It is not clear, though, how to bestow a canonical name to each element of the set-theoretic universe. This is where Gödel's constructible universe L comes in handy. As L is "made" from the ordinals it is pretty obvious how to "name" sets in L once one has names for ordinals. These will be taken from OT(Ω). Henceforth, we shall restrict ourselves to ordinals from OT(Ω).

Definition: 9.1 Up to now the basic symbols of our set-theoretic language have been = and ∈. For technical reasons we would like to get rid of =. We simply define a = b to be an abbreviation for

(∀x ∈ a) x ∈ b ∧ (∀x ∈ b) x ∈ a.

The axiom of extensionality then becomes a triviality. However, its role is taken over by the equality axioms, which we have not explicitly considered hitherto. The role of extensionality is then played by the axiom

c = d ∧ c ∈ a→ d ∈ a,

the unabbreviated version of which is

(∀x ∈ c)x ∈ d ∧ (∀x ∈ d)x ∈ c ∧ c ∈ a→ d ∈ a.

Exercise: 9.2 Show that from the previous axiom one can deduce

c = d ∧ F (c)→ F (d)

for any formula F (c).

Definition: 9.3 The set terms and their ordinal levels are defined inductively.

(i) For each α ∈ OT(Ω) ∩ Ω, there will be a set term $L_\alpha$. Its ordinal level is declared to be α.

(ii) If $F(a,\vec{b})$ is a set-theoretic formula, i.e. a formula of KP (whose free variables are among the indicated) and $\vec{s}\equiv s_1,\ldots,s_n$ are set terms with levels < α, then the formal expression

\[ \{x\in L_\alpha \mid F(x,\vec{s})^{L_\alpha}\} \]

is a set term of level α. Here $F(x,\vec{s})^{L_\alpha}$ results from $F(x,\vec{s})$ by restricting all unbounded quantifiers to $L_\alpha$.

A formula of RS is any expression of the form $F(s_1,\ldots,s_n)$, where $F(a_1,\ldots,a_n)$ is a formula of KP with all free variables indicated and $s_1,\ldots,s_n$ are set terms.

In the sequel, RS-formulae will be referred to just as formulae. If A is a formula, then

\[ k(A) := \{\alpha : L_\alpha \text{ occurs in } A\}. \]

Here any occurrence of $L_\alpha$, i.e. also those inside of terms, has to be considered. For a term s we set k(s) := k(s = s).

In what follows $s, t, p, q, r, s_1, s_2, \ldots$ will range over set terms. For a set term s we shall notate the level of s by | s |. We also write s < t instead of | s | < | t |.

For terms s, t with | s | < | t | we set

\[ s\mathrel{\dot\in}t \ \equiv\ \begin{cases} B(s) & \text{if } t\equiv\{x\in L_\beta \mid B(x)\},\\ s\notin L_0 & \text{if } t\equiv L_\beta. \end{cases} \]

The collection of set terms will serve as a formal universe for a theory LRS with infinitary rules. The infinitary rule for the universal quantifier on the right takes the form: From Γ ⇒ ∆, F(t) for all RS-terms t conclude Γ ⇒ ∆, ∀x F(x). There are also rules for bounded universal quantifiers: From Γ ⇒ ∆, F(t) for all RS-terms t with levels < α conclude Γ ⇒ ∆, $(\forall x\in L_\alpha)\,F(x)$. The corresponding rule for introducing a universal quantifier bounded by a term of the form $\{x\in L_\alpha : F(x,\vec{s})^{L_\alpha}\}$ is slightly more complicated. With the help of these infinitary rules it is now possible to give logical deductions of all axioms of KP with the exception of Bounded Collection. The latter can be deduced from the rule of Σ-Reflection: From Γ ⇒ ∆, C conclude Γ ⇒ ∆, $\exists z\,C^z$ for every Σ-formula C. The class of Σ-formulae is the smallest class of formulae containing the bounded formulae which is closed under ∧, ∨, bounded quantification and unbounded existential quantification. $C^z$ is obtained from C by replacing all unbounded quantifiers ∃x in C by ∃x ∈ z.

The length and cut ranks of KP∞-deductions will be measured by ordinals from OT(Ω). If

\[ \mathrm{KP} \vdash B(u_1,\ldots,u_r) \]

then

\[ \mathrm{LRS} \vdash^{\Omega\cdot m}_{\Omega+n} B(s_1,\ldots,s_r) \]

holds for some m, n and all set terms $s_1,\ldots,s_r$; m and n depend only on the KP-derivation of $B(\vec{u})$.

Definition: 9.4 The inference rules of KP∞ include all the propositional inferences of the sequent calculus (i.e., those pertaining to ∧, ∨, →, ¬) as well as the cut rule (Cut). In addition, KP∞ has the following rules, where in (∈R), (b∀L) and (b∃R) it is also assumed that s < t:

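Elementhood

\[
(\in\infty)\ \frac{p \mathring{\in} t \wedge r = p,\ \Gamma \Rightarrow \Delta \quad \text{all } p < t}{r \in t,\ \Gamma \Rightarrow \Delta}
\qquad
(\in R)\ \frac{\Gamma \Rightarrow \Delta,\ s \mathring{\in} t \wedge r = s}{\Gamma \Rightarrow \Delta,\ r \in t}
\]

Bounded Quantifiers

\[
(b\forall L)\ \frac{s \mathring{\in} t \to F(s),\ \Gamma \Rightarrow \Delta}{(\forall x \in t)F(x),\ \Gamma \Rightarrow \Delta}
\qquad
(b\forall\infty)\ \frac{\Gamma \Rightarrow \Delta,\ p \mathring{\in} t \to F(p) \quad \text{all } p < t}{\Gamma \Rightarrow \Delta,\ (\forall x \in t)F(x)}
\]

\[
(b\exists\infty)\ \frac{p \mathring{\in} t \wedge F(p),\ \Gamma \Rightarrow \Delta \quad \text{all } p < t}{(\exists x \in t)F(x),\ \Gamma \Rightarrow \Delta}
\qquad
(b\exists R)\ \frac{\Gamma \Rightarrow \Delta,\ s \mathring{\in} t \wedge F(s)}{\Gamma \Rightarrow \Delta,\ (\exists x \in t)F(x)}
\]

Unbounded Quantifiers

\[
(\forall L)\ \frac{F(t),\ \Gamma \Rightarrow \Delta}{\forall x F(x),\ \Gamma \Rightarrow \Delta}
\qquad
(\forall\infty)\ \frac{\Gamma \Rightarrow \Delta,\ F(p) \quad \text{for all } p}{\Gamma \Rightarrow \Delta,\ \forall x F(x)}
\]

\[
(\exists\infty)\ \frac{F(p),\ \Gamma \Rightarrow \Delta \quad \text{for all } p}{\exists x F(x),\ \Gamma \Rightarrow \Delta}
\qquad
(\exists R)\ \frac{\Gamma \Rightarrow \Delta,\ F(t)}{\Gamma \Rightarrow \Delta,\ \exists x F(x)}
\]

Σ-Reflection

\[
(\Sigma\text{-Ref})\ \frac{\Gamma \Rightarrow \Delta,\ A}{\Gamma \Rightarrow \Delta,\ \exists x\, A^x}
\qquad \text{where } A \text{ is a } \Sigma\text{-formula}
\]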

Definition: 9.5 The rank of formulae and terms is determined as follows.

1. $rk(L_\alpha) = \omega\cdot\alpha$.

2. $rk(\{x \in L_\alpha \mid F(x)\}) = \max\{\omega\cdot\alpha + 1,\ rk(F(L_0)) + 2\}$.

3. $rk(s \in t) := \max\{rk(s) + 6,\ rk(t) + 1\}$.

4. $rk(\neg A) := rk(A) + 1$.

5. $rk(A \wedge B) = rk(A \vee B) = rk(A \to B) = \max(rk(A), rk(B)) + 1$.

6. $rk((\exists x \in t)F(x)) := rk((\forall x \in t)F(x)) := \max\{rk(t),\ rk(F(L_0)) + 2\}$.

7. $rk(\exists x F(x)) := rk(\forall x F(x)) := \max\{\Omega,\ rk(F(L_0)) + 1\}$.

There is plenty of leeway in designing the actual rank of a formula.
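As a worked instance of Definition 9.5 (a sketch; the formula x ∈ L1 is chosen here purely for illustration), clauses 1, 3 and 7 give rank exactly Ω to an unbounded existential statement whose matrix only mentions small levels:

```latex
\begin{align*}
rk(L_0) &= \omega\cdot 0 = 0, \qquad rk(L_1) = \omega\cdot 1 = \omega,\\
rk(L_0 \in L_1) &= \max\{rk(L_0)+6,\ rk(L_1)+1\} = \max\{6,\ \omega+1\} = \omega+1,\\
rk(\exists x\,(x \in L_1)) &= \max\{\Omega,\ rk(L_0 \in L_1)+1\} = \max\{\Omega,\ \omega+2\} = \Omega.
\end{align*}
```

The same pattern applies whenever F(L0) is bounded and only mentions levels below Ω: its rank stays below Ω, so the unbounded quantifier lifts the rank to exactly Ω.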

Definition: 9.6 Let Pow(ON) = {X | X is a set of ordinals}. A class function

H : Pow(ON) → Pow(ON)

will be called an operator if the following conditions are met for all X, X′ ∈ Pow(ON):

(H0) 0 ∈ H(X).

(H1) For $\alpha =_{NF} \omega^{\alpha_1} + \cdots + \omega^{\alpha_n}$,

α ∈ H(X) ⇐⇒ α1, ..., αn ∈ H(X).

(In particular, (H1) implies that H(X) will be closed under + and $\sigma \mapsto \omega^{\sigma}$, i.e., if α, β ∈ H(X), then α + β, $\omega^{\alpha}$ ∈ H(X).)

(H2) X ⊆ H(X).

(H3) X′ ⊆ H(X) =⇒ H(X′) ⊆ H(X).

Note that an operator is monotone, i.e., if X′ ⊆ X then X′ ⊆ H(X) by (H2), and hence H(X′) ⊆ H(X) using (H3).

Definition: 9.7 (i) When f is a mapping $f : ON^k \to ON$, then H is said to be closed under f if, for all X ∈ Pow(ON) and α1, . . . , αk ∈ H(X),

f(α1, . . . , αk)∈H(X).

(ii) α∈H := α∈H(∅); s∈H := k(s) ⊆ H.

(iii) X ⊆ H := X ⊆ H(∅).

(iv) If Y is a set of ordinals we denote by H[Y ] the operator with

(H[Y ])(X) := H(Y ∪X).

(v) For a set term s let H[s] denote the operator H[k(s)].

The next Lemma garners some simple properties of operators.

Lemma: 9.8 Let H be an operator, s be a set term and Y be a set of ordinals.

(i) H[Y ] and H[s] are operators.

(ii) Y ⊆ H =⇒ H[Y ] = H.

(iii) ∀X,X ′∈Pow(ON)[X ′ ⊆ X =⇒ H(X ′) ⊆ H(X)].

For a set of formulae Γ = {A1, . . . , An} let k(Γ) = k(A1) ∪ . . . ∪ k(An).

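Definition: 9.9 We define the relation

\[ \mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta \]

by recursion on α by requiring that

\[ k(\Gamma) \cup k(\Delta) \cup \{\alpha\} \subseteq \mathcal{H}(\emptyset) \]

holds and one of the following conditions is satisfied:

1. Γ ⇒ ∆ is the result of a propositional inference (pertaining to one of the connectives ∧, ∨, →, ¬) with premisses Γi ⇒ ∆i and $\mathcal{H} \vdash^{\alpha_i}_{\rho} \Gamma_i \Rightarrow \Delta_i$ for some αi < α.

2. $\mathcal{H} \vdash^{\alpha_1}_{\rho} \Gamma, A \Rightarrow \Delta$ and $\mathcal{H} \vdash^{\alpha_2}_{\rho} \Gamma \Rightarrow \Delta, A$ for some α1, α2 < α and a formula A with rk(A) < ρ.

3. Γ is of the form r ∈ t, Γ′ and

\[ \mathcal{H}[p] \vdash^{\alpha_p}_{\rho} p \mathring{\in} t \wedge r = p,\ \Gamma' \Rightarrow \Delta \]

holds for all p < t for some αp < α.

4. ∆ is of the form ∆′, r ∈ t and

\[ \mathcal{H} \vdash^{\alpha_0}_{\rho} \Gamma \Rightarrow \Delta',\ s \mathring{\in} t \wedge r = s \]

holds for some s < t with | s | < α and some α0 < α.

5. Γ is of the form (∀x ∈ t)F(x), Γ′ and

\[ \mathcal{H} \vdash^{\alpha_0}_{\rho} s \mathring{\in} t \to F(s),\ \Gamma' \Rightarrow \Delta \]

holds for some s < t with | s | < α and some α0 < α.

6. ∆ is of the form ∆′, (∀x ∈ t)F(x) and

\[ \mathcal{H}[p] \vdash^{\alpha_p}_{\rho} \Gamma \Rightarrow \Delta,\ p \mathring{\in} t \to F(p) \]

holds for all p < t for some αp < α.

7. Γ is of the form (∃x ∈ t)F(x), Γ′ and

\[ \mathcal{H}[p] \vdash^{\alpha_p}_{\rho} p \mathring{\in} t \wedge F(p),\ \Gamma' \Rightarrow \Delta \]

holds for all p < t for some αp < α.

8. ∆ is of the form ∆′, (∃x ∈ t)F(x) and

\[ \mathcal{H} \vdash^{\alpha_0}_{\rho} \Gamma \Rightarrow \Delta',\ s \mathring{\in} t \wedge F(s) \]

holds for some s < t with | s | < α and some α0 < α.

9. Γ is of the form ∀xF(x), Γ′ and

\[ \mathcal{H} \vdash^{\alpha_0}_{\rho} F(s),\ \Gamma' \Rightarrow \Delta \]

holds for some s with | s | < α and some α0 + 2 < α.

10. ∆ is of the form ∆′, ∀xF(x) and

\[ \mathcal{H}[p] \vdash^{\alpha_p}_{\rho} \Gamma \Rightarrow \Delta,\ F(p) \]

holds for all p for some αp + 2 < α.

11. Γ is of the form ∃xF(x), Γ′ and

\[ \mathcal{H}[p] \vdash^{\alpha_p}_{\rho} F(p),\ \Gamma' \Rightarrow \Delta \]

holds for all p for some αp + 2 < α.

12. ∆ is of the form ∆′, ∃xF(x) and

\[ \mathcal{H} \vdash^{\alpha_0}_{\rho} \Gamma \Rightarrow \Delta',\ F(s) \]

holds for some s with | s | < α and some α0 + 2 < α.

13. α ≥ Ω and ∆ is of the form ∆′, $\exists z\, A^z$, where A is a Σ-formula, and

\[ \mathcal{H} \vdash^{\alpha_0}_{\rho} \Gamma \Rightarrow \Delta',\ A \]

holds for some α0 + 1 < α.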

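Lemma: 9.10 (i) If Γ0 ⊆ Γ, ∆0 ⊆ ∆, k(Γ), k(∆) ⊆ H, α ∈ H, α0 ≤ α, ρ0 ≤ ρ and

\[ \mathcal{H} \vdash^{\alpha_0}_{\rho_0} \Gamma_0 \Rightarrow \Delta_0 \]

then

\[ \mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta . \]

(ii) If $\mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, (\forall x \in L_\beta)F(x)$, γ ∈ H and γ ≤ β, then $\mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, (\forall x \in L_\gamma)F(x)$.

Proof: (i) is proved by a straightforward induction on α0.

For (ii) we use induction on α. The only interesting case is when (∀x ∈ Lβ)F(x) was the principal formula of the last inference, which would have been (b∀∞). So we have

\[ \mathcal{H}[p] \vdash^{\alpha_p}_{\rho} \Gamma \Rightarrow \Delta,\ (\forall x \in L_\beta)F(x),\ p \notin L_0 \to F(p) \]

for all p with | p | < β, where αp < α. By the induction hypothesis we get

\[ \mathcal{H}[p] \vdash^{\alpha_p}_{\rho} \Gamma \Rightarrow \Delta,\ (\forall x \in L_\gamma)F(x),\ p \notin L_0 \to F(p) \]

for all p with | p | < γ and thus, via another (b∀∞) inference, we get the desired result. □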

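Lemma: 9.11 If k(s) ⊆ H, α ∈ H and α > 0 then

\[ \mathcal{H} \vdash^{\alpha}_{0}\ \Rightarrow s \notin L_0 . \]

Proof: We have $\mathcal{H}[p] \vdash^{\alpha_p}_{0} p \mathring{\in} L_0 \wedge p = s \Rightarrow$ for all p with | p | < 0 for some αp < 0, vacuously, since there are no such p. Hence, via an inference (∈∞) we get $\mathcal{H} \vdash^{0}_{0} s \in L_0 \Rightarrow$, from which we get $\mathcal{H} \vdash^{\alpha}_{0}\ \Rightarrow s \notin L_0$ via (¬R). □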

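Lemma: 9.12 The inversions (i)-(viii) of RA∗ of Lemma 5.10 concerning propositional logic also hold for RS. In addition the following inversions hold for RS.

(i) If $\mathcal{H} \vdash^{\alpha}_{\rho} r \in t, \Gamma \Rightarrow \Delta$ then $\mathcal{H}[p] \vdash^{\alpha}_{\rho} p \mathring{\in} t \wedge r = p, \Gamma \Rightarrow \Delta$ holds for all p < t.

(ii) If $\mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, (\forall x \in t)F(x)$ then $\mathcal{H}[p] \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, p \mathring{\in} t \to F(p)$ holds for all p < t.

(iii) If $\mathcal{H} \vdash^{\alpha}_{\rho} (\exists x \in t)F(x), \Gamma \Rightarrow \Delta$ then $\mathcal{H}[p] \vdash^{\alpha}_{\rho} p \mathring{\in} t \wedge F(p), \Gamma \Rightarrow \Delta$ holds for all p < t.

(iv) If $\mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, \forall x F(x)$ then $\mathcal{H}[s] \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, F(s)$ holds for all s.

(v) If $\mathcal{H} \vdash^{\alpha}_{\rho} \exists x F(x), \Gamma \Rightarrow \Delta$ then $\mathcal{H}[s] \vdash^{\alpha}_{\rho} F(s), \Gamma \Rightarrow \Delta$ holds for all s.

Proof: All are straightforward by induction on α. □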

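Lemma: 9.13 (Reduction) Let ρ = |C| ≠ Ω. If $\mathcal{H} \vdash^{\alpha}_{\rho} \Gamma, C \Rightarrow \Delta$ and $\mathcal{H} \vdash^{\beta}_{\rho} \Xi \Rightarrow \Theta, C$, then

\[ \mathcal{H} \vdash^{\alpha\#\alpha\#\beta\#\beta}_{\rho} \Gamma, \Xi \Rightarrow \Delta, \Theta . \]

Proof: The proof is by induction on α#α#β#β and very similar to Lemma 5.11. We only look at two cases where C was the principal formula of the last inference in both derivations. It is essential to notice that C is not the principal formula of an inference (Σ-Ref) since |C| ≠ Ω.

Case 1: The first is when C is of the form r ∈ t. Then we have

\[ \mathcal{H}[p] \vdash^{\alpha_p}_{\rho} \Gamma, C, p \mathring{\in} t \wedge r = p \Rightarrow \Delta \]

for all p < t with αp < α and

\[ \mathcal{H} \vdash^{\beta_0}_{\rho} \Xi \Rightarrow \Theta, C, s \mathring{\in} t \wedge r = s \]

for some β0 < β and term s < t with | s | < β. Since k(s) ⊆ H we also have that H = H[s]. By the induction hypothesis we obtain

\[ \mathcal{H} \vdash^{\alpha_s\#\alpha_s\#\beta\#\beta}_{\rho} \Gamma, \Xi, s \mathring{\in} t \wedge r = s \Rightarrow \Delta, \Theta \]

and

\[ \mathcal{H} \vdash^{\alpha\#\alpha\#\beta_0\#\beta_0}_{\rho} \Gamma, \Xi \Rightarrow \Delta, \Theta, s \mathring{\in} t \wedge r = s . \]

Cutting out $s \mathring{\in} t \wedge r = s$ gives $\mathcal{H} \vdash^{\alpha\#\alpha\#\beta\#\beta}_{\rho} \Gamma, \Xi \Rightarrow \Delta, \Theta$.

Case 2: The second case is when C is of the form (∀x ∈ t)A(x). Then we have

\[ \mathcal{H} \vdash^{\alpha_1}_{\rho} \Gamma, C, s \mathring{\in} t \to A(s) \Rightarrow \Delta \]

for some α1 < α and s < t with | s | < α. And, taking p = s among the premisses of the (b∀∞) inference in the second derivation, we also have

\[ \mathcal{H}[s] \vdash^{\beta_s}_{\rho} \Xi \Rightarrow \Theta, C, s \mathring{\in} t \to A(s) \]

for some βs < β. Since k(s) ⊆ H we have H[s] = H. By the induction hypothesis we thus get

\[ \mathcal{H} \vdash^{\alpha_1\#\alpha_1\#\beta\#\beta}_{\rho} \Gamma, \Xi, s \mathring{\in} t \to A(s) \Rightarrow \Delta, \Theta \]

and

\[ \mathcal{H} \vdash^{\alpha\#\alpha\#\beta_s\#\beta_s}_{\rho} \Gamma, \Xi \Rightarrow \Delta, \Theta, s \mathring{\in} t \to A(s) . \]

Cutting out $s \mathring{\in} t \to A(s)$ gives $\mathcal{H} \vdash^{\alpha\#\alpha\#\beta\#\beta}_{\rho} \Gamma, \Xi \Rightarrow \Delta, \Theta$. □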
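The ordinal bookkeeping in the Reduction Lemma rests on the natural (Hessenberg) sum # being strictly monotone in each argument, so that both premisses of the final cut carry labels below α#α#β#β. The following sketch checks this on concrete values. It is only an illustration under a loud assumption: ordinals are restricted to those below ω^ω, written in Cantor normal form as tuples of (exponent, coefficient) pairs with strictly decreasing natural-number exponents and positive coefficients. The function name `natural_sum` and the sample ordinals are hypothetical.

```python
from collections import Counter


def natural_sum(*ordinals):
    """Hessenberg sum of ordinals below omega^omega in Cantor normal form.

    Each ordinal is a tuple of (exponent, coefficient) pairs, exponents strictly
    decreasing, coefficients positive; the empty tuple is 0. The natural sum
    adds coefficients of equal exponents.
    """
    total = Counter()
    for ordinal in ordinals:
        for exponent, coefficient in ordinal:
            total[exponent] += coefficient
    return tuple(sorted(total.items(), reverse=True))


# For this representation, Python's lexicographic tuple comparison coincides
# with the ordering of ordinals: the first differing (exponent, coefficient)
# pair decides, and a proper prefix is smaller.

alpha = ((1, 2),)            # omega * 2
alpha_s = ((1, 1), (0, 3))   # omega + 3, a premiss label with alpha_s < alpha
beta = ((2, 1),)             # omega^2
beta_0 = ((1, 5),)           # omega * 5, a premiss label with beta_0 < beta

target = natural_sum(alpha, alpha, beta, beta)      # omega^2 * 2 + omega * 4
left = natural_sum(alpha_s, alpha_s, beta, beta)    # omega^2 * 2 + omega * 2 + 6
right = natural_sum(alpha, alpha, beta_0, beta_0)   # omega * 14
```

Both `left` and `right` are below `target`, which is what licenses the concluding cut in Case 1 of the proof.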

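Theorem: 9.14 (First Cut Elimination Theorem)

If $\mathcal{H} \vdash^{\alpha}_{\delta+1} \Gamma \Rightarrow \Delta$ and δ ≠ Ω then $\mathcal{H} \vdash^{4^{\alpha}}_{\delta} \Gamma \Rightarrow \Delta$.

Proof: Use induction on α and the previous Lemma. □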
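The only case of this induction that uses the Reduction Lemma is a last inference that is a cut with a cut formula C of rank exactly δ. A sketch of the ordinal computation in that case, writing α0 < α for a common bound on the two premisses (an assumption made here for readability; such a bound is available by weakening, Lemma 9.10(i)):

```latex
\begin{align*}
&\mathcal{H} \vdash^{\alpha_0}_{\delta+1} \Gamma, C \Rightarrow \Delta
 \quad\text{and}\quad
 \mathcal{H} \vdash^{\alpha_0}_{\delta+1} \Gamma \Rightarrow \Delta, C
 && (rk(C) = \delta \neq \Omega) \\
&\Longrightarrow\quad
 \mathcal{H} \vdash^{4^{\alpha_0}}_{\delta} \Gamma, C \Rightarrow \Delta
 \quad\text{and}\quad
 \mathcal{H} \vdash^{4^{\alpha_0}}_{\delta} \Gamma \Rightarrow \Delta, C
 && \text{(induction hypothesis)} \\
&\Longrightarrow\quad
 \mathcal{H} \vdash^{4^{\alpha_0}\#4^{\alpha_0}\#4^{\alpha_0}\#4^{\alpha_0}}_{\delta} \Gamma \Rightarrow \Delta
 && \text{(Reduction Lemma 9.13)} \\
&\text{where}\quad 4^{\alpha_0}\#4^{\alpha_0}\#4^{\alpha_0}\#4^{\alpha_0}
 = 4^{\alpha_0}\cdot 4 = 4^{\alpha_0+1} \le 4^{\alpha}.
\end{align*}
```

This explains the base 4 in the bound: the Reduction Lemma uses each of the two premiss labels twice.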

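Theorem: 9.15 (Predicative cut elimination) Let H be closed under ϕ. If $\mathcal{H} \vdash^{\alpha}_{\rho+\omega^{\nu}} \Gamma \Rightarrow \Delta$, $\Omega \notin [\rho, \rho+\omega^{\nu})$ and ν ∈ H, then

\[ \mathcal{H} \vdash^{\varphi_\nu(\alpha)}_{\rho} \Gamma \Rightarrow \Delta . \]

Proof: By main induction on ν and subsidiary induction on α. The assertion holds for ν = 0 by the First Cut Elimination Theorem 9.14 since ρ ≠ Ω. Now suppose ν > 0. There will be a last inference (I) with premisses Γi ⇒ ∆i. Suppose the inference was not a cut or a cut of rank < ρ. We then have $\mathcal{H}[i] \vdash^{\alpha_i}_{\rho+\omega^{\nu}} \Gamma_i \Rightarrow \Delta_i$ for some αi < α. By the subsidiary induction hypothesis we have $\mathcal{H}[i] \vdash^{\varphi_\nu(\alpha_i)}_{\rho} \Gamma_i \Rightarrow \Delta_i$. Applying the same inference (I) yields $\mathcal{H} \vdash^{\varphi_\nu(\alpha)}_{\rho} \Gamma \Rightarrow \Delta$.

Now suppose the last inference was a cut with cut formula C such that $\rho \le |C| < \rho + \omega^{\nu}$. Then there exist ν0 < ν and n < ω such that $|C| < \rho + \omega^{\nu_0}\cdot n$. After performing a cut with C we have

\[ \mathcal{H} \vdash^{\varphi_\nu(\alpha)}_{\rho+\omega^{\nu_0}\cdot n} \Gamma \Rightarrow \Delta . \]

We also have $\varphi_{\nu_0}(\varphi_\nu(\alpha)) = \varphi_\nu(\alpha)$. Therefore by n-fold application of the main induction hypothesis we obtain $\mathcal{H} \vdash^{\varphi_\nu(\alpha)}_{\rho} \Gamma \Rightarrow \Delta$. □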
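Two small instances may help in reading the statement. This is a sketch under the hypotheses of the theorem, using only the standard facts that ϕ0(α) = ω^α, ϕ1(α) = ε_α, and that for ν0 < ν every value of ϕν is a fixed point of ϕν0:

```latex
\begin{align*}
\nu = 0:&\quad \mathcal{H} \vdash^{\alpha}_{\rho+1} \Gamma \Rightarrow \Delta
 \ \Longrightarrow\ \mathcal{H} \vdash^{\omega^{\alpha}}_{\rho} \Gamma \Rightarrow \Delta
 && (\varphi_0(\alpha) = \omega^{\alpha} \ge 4^{\alpha}),\\
\nu = 1:&\quad \mathcal{H} \vdash^{\alpha}_{\rho+\omega} \Gamma \Rightarrow \Delta
 \ \Longrightarrow\ \mathcal{H} \vdash^{\varepsilon_{\alpha}}_{\rho} \Gamma \Rightarrow \Delta
 && (\varphi_1(\alpha) = \varepsilon_{\alpha}),\\
\nu_0 < \nu:&\quad \underbrace{\varphi_{\nu_0}(\cdots\varphi_{\nu_0}}_{n\ \text{times}}(\varphi_{\nu}(\alpha))\cdots) = \varphi_{\nu}(\alpha).
\end{align*}
```

The last line is why the n-fold application of the main induction hypothesis in the cut case does not increase the ordinal label.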

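Lemma: 9.16 (Bounding Lemma) Let B be a Σ-formula and A be a Π-formula. Suppose α ≤ β < Ω and β ∈ H.

(i) If $\mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, B$ then

\[ \mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, B^{L_\beta} . \]

(ii) If $\mathcal{H} \vdash^{\alpha}_{\rho} \Gamma, A \Rightarrow \Delta$ then

\[ \mathcal{H} \vdash^{\alpha}_{\rho} \Gamma, A^{L_\beta} \Rightarrow \Delta . \]

Proof: (i) Use induction on α. Note that the deductions cannot contain any inference (Σ-Ref) since α < Ω.

Note that if B is not the principal formula of the last inference then the assertion follows readily from the induction hypothesis. So let us assume that B was the principal formula of the last inference. If B is a ∆0 formula or of one of the forms B0 ∨ B1, B0 ∧ B1, (∀x ∈ t)F(x), or (∃x ∈ t)F(x) then the assertion follows readily from the induction hypothesis. So suppose B is of the form ∃xF(x). Then we have

\[ \mathcal{H} \vdash^{\alpha_0}_{\rho} \Gamma \Rightarrow \Delta, B, F(s) \]

for some α0 + 2 < α and a term s with | s | < α. Inductively we have

\[ (*)\quad \mathcal{H} \vdash^{\alpha_0}_{\rho} \Gamma \Rightarrow \Delta, B^{L_\beta}, F(s)^{L_\beta} . \]

We also have

\[ (**)\quad \mathcal{H} \vdash^{\alpha_0+1}_{\rho} \Gamma \Rightarrow \Delta, B^{L_\beta}, s \notin L_0 \]

by Lemma 9.11. Thus from (∗) and (∗∗) we get

\[ \mathcal{H} \vdash^{\alpha_0+2}_{\rho} \Gamma \Rightarrow \Delta, B^{L_\beta}, s \notin L_0 \wedge F(s)^{L_\beta} \]

via (∧R). The latter is the same as $\mathcal{H} \vdash^{\alpha_0+2}_{\rho} \Gamma \Rightarrow \Delta, B^{L_\beta}, s \mathring{\in} L_\beta \wedge F(s)^{L_\beta}$ since | s | < β, and hence, using (b∃R), we get $\mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, B^{L_\beta}$. □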

10 Impredicative Cut Elimination

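The usual cut elimination procedure works unless the cut formulae have been introduced by Σ-reflection rules. The obstacle to pushing cut elimination further is exemplified by the following scenario:

\[
\dfrac{
 \dfrac{\vdash^{\delta}_{\Omega} \Gamma \Rightarrow \Delta, C}
       {\vdash^{\xi}_{\Omega} \Gamma \Rightarrow \Delta, \exists z\, C^z}\ (\Sigma\text{-Ref})
 \qquad
 \dfrac{\cdots\ \vdash^{\xi_s}_{\Omega} \Xi, C^s \Rightarrow \Lambda\ \cdots\ (| s | < \Omega)}
       {\vdash^{\xi}_{\Omega} \Xi, \exists z\, C^z \Rightarrow \Lambda}\ (\exists L)
}{\vdash^{\alpha}_{\Omega+1} \Gamma, \Xi \Rightarrow \Delta, \Lambda}\ (\mathrm{Cut})
\]

In order to be able to remove these critical cuts, i.e. cuts whose cut formulae were introduced by (Σ-Ref), we have to forgo arbitrary operators. We shall need operators H such that an H-controlled derivation that satisfies certain extra conditions can be “collapsed” into a derivation with much smaller ordinal labels.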

From now on we will identify ON with B(ΩΓ). All operators are thereforesupposed to just act on subsets of B(ΩΓ).

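Definition: 10.1 The operator $\mathcal{H}_\eta$ for $\eta < \varepsilon_{\Omega+1}$ is defined by

\[ \mathcal{H}_\eta(X) = \bigcap \{B(\beta) \mid X \subseteq B(\beta) \wedge \eta < \beta\}. \]

Lemma: 10.2 (i) $\mathcal{H}_\eta$ is an operator.

(ii) $\eta < \eta' \Longrightarrow \mathcal{H}_\eta(X) \subseteq \mathcal{H}_{\eta'}(X)$.

(iii) $\mathcal{H}_\eta$ is closed under ϕ and $\psi_\Omega \restriction \eta + 1$.

Proof: (i): $X \subseteq \mathcal{H}_\eta(X)$ follows by definition. If $X' \subseteq \mathcal{H}_\eta(X)$, then, for any β > η such that X ⊆ B(β), we have X′ ⊆ B(β), and therefore $\mathcal{H}_\eta(X') \subseteq B(\beta)$, hence $\mathcal{H}_\eta(X') \subseteq \mathcal{H}_\eta(X)$.

So far we have verified (H0), (H2) and (H3). As to (H1), suppose $\alpha =_{NF} \omega^{\alpha_1} + \ldots + \omega^{\alpha_n}$. We have to show

\[ \alpha \in \mathcal{H}_\eta(X) \ \text{ iff } \ \alpha_1, \ldots, \alpha_n \in \mathcal{H}_\eta(X). \]

But this is a consequence of

\[ \alpha \in B(\beta) \ \text{ iff } \ \alpha_1, \ldots, \alpha_n \in B(\beta) \]

which holds by Lemma 8.11(i).

(ii) is obvious. (iii) follows from the fact that the sets B(β) with β > η are closed under ϕ and $\psi_\Omega \restriction \eta + 1$. □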

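Lemma: 10.3 Suppose $\eta \in \mathcal{H}_\eta$. Define $\hat{\beta} := \eta + \omega^{\Omega+\beta}$.

(i) If $\alpha \in \mathcal{H}_\eta$ then $\hat{\alpha}, \psi_\Omega(\hat{\alpha}) \in \mathcal{H}_{\hat{\alpha}}$.

(ii) If $\alpha_0 \in \mathcal{H}_\eta$ and α0 < α then $\psi_\Omega(\hat{\alpha}_0) < \psi_\Omega(\hat{\alpha})$.

Proof: Obviously, $\mathcal{H}_\eta(\emptyset) = B(\eta+1)$. From α, η ∈ B(η + 1) we obtain $\hat{\alpha} \in B(\hat{\alpha})$, and hence $\psi_\Omega(\hat{\alpha}) \in B(\hat{\alpha}+1) = \mathcal{H}_{\hat{\alpha}}(\emptyset)$. This shows (i). Now suppose $\alpha_0 \in \mathcal{H}_\eta$ and α0 < α. By the preceding argument we then have $\psi_\Omega(\hat{\alpha}_0) \in B(\hat{\alpha})$, thus $\psi_\Omega(\hat{\alpha}_0) < \psi_\Omega(\hat{\alpha})$. □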

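Lemma: 10.4 (Persistence) Let δ ∈ H.

(i) If $\mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, \forall x F(x)$ then $\mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, (\forall x \in L_\delta)F(x)$.

(ii) If $\mathcal{H} \vdash^{\alpha}_{\rho} \exists x F(x), \Gamma \Rightarrow \Delta$ then $\mathcal{H} \vdash^{\alpha}_{\rho} (\exists x \in L_\delta)F(x), \Gamma \Rightarrow \Delta$.

Proof: (i): We proceed by induction on α. The only interesting case is when the last inference was (∀∞). Thus

\[ \mathcal{H}[s] \vdash^{\alpha_s}_{\rho} \Gamma \Rightarrow \Delta, \forall x F(x), F(s) \]

holds for all s for some αs + 2 < α. Inductively we have

\[ \mathcal{H}[s] \vdash^{\alpha_s}_{\rho} \Gamma, s \mathring{\in} L_\delta \Rightarrow \Delta, (\forall x \in L_\delta)F(x), F(s) \]

and hence

\[ \mathcal{H}[s] \vdash^{\alpha_s+1}_{\rho} \Gamma \Rightarrow \Delta, (\forall x \in L_\delta)F(x), s \mathring{\in} L_\delta \to F(s) \]

for all s with | s | < δ. Thus, via (b∀∞) we conclude that $\mathcal{H} \vdash^{\alpha}_{\rho} \Gamma \Rightarrow \Delta, (\forall x \in L_\delta)F(x)$.

(ii) is similar. □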

Theorem: 10.5 (Collapsing and Impredicative Cut Elimination) Let Γ be a set of Π-formulae and ∆ be a set of Σ-formulae. Suppose that $\eta \in \mathcal{H}_\eta$. Then

\[ \mathcal{H}_\eta \vdash^{\alpha}_{\Omega+1} \Gamma \Rightarrow \Delta \quad\text{implies}\quad \mathcal{H}_{\hat{\alpha}} \vdash^{\psi_\Omega(\hat{\alpha})}_{\psi_\Omega(\hat{\alpha})} \Gamma \Rightarrow \Delta \]

where $\hat{\alpha} = \eta + \omega^{\Omega+\alpha}$.

This result can also be established for the intuitionistic version of RS provided one adds the extra assumption that all formulae in Γ have rank at most Ω.

Proof: We proceed by induction on α.

Case 0: If the last inference was propositional then the assertion follows easily from the induction hypothesis.

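Case 1: Suppose the last inference was (b∀∞). Then a formula (∀x ∈ t)F(x) appears in ∆ and

\[ \mathcal{H}_\eta[p] \vdash^{\alpha_p}_{\Omega+1} \Gamma \Rightarrow \Delta, p \mathring{\in} t \to F(p) \]

holds for all p < t for some αp < α. Since $k(t) \subseteq \mathcal{H}_\eta$ we have k(t) ⊆ B(η + 1) and thus $| t | < \psi_\Omega(\eta + 1)$. As a result, $| p | < \psi_\Omega(\eta + 1)$ and therefore $k(p) \subseteq \mathcal{H}_\eta$ holds for all p < t, and hence $\mathcal{H}_\eta[p] = \mathcal{H}_\eta$ for all p < t. The formula $p \mathring{\in} t \to F(p)$ might not be a Σ-formula but F(p) is a Σ-formula since (∀x ∈ t)F(x) is. Using inversion (Lemma 9.12) we have

\[ \mathcal{H}_\eta \vdash^{\alpha_p}_{\Omega+1} \Gamma, p \mathring{\in} t \Rightarrow \Delta, F(p) \tag{61} \]

for all p < t. Thus we can apply the induction hypothesis to (61), yielding

\[ \mathcal{H}_{\hat{\alpha}_p} \vdash^{\psi_\Omega(\hat{\alpha}_p)}_{\psi_\Omega(\hat{\alpha}_p)} \Gamma, p \mathring{\in} t \Rightarrow \Delta, F(p) \]

and hence

\[ \mathcal{H}_{\hat{\alpha}_p} \vdash^{\psi_\Omega(\hat{\alpha}_p)+1}_{\psi_\Omega(\hat{\alpha}_p)} \Gamma \Rightarrow \Delta, p \mathring{\in} t \to F(p) \tag{62} \]

for all p < t. As $\psi_\Omega(\hat{\alpha}_p) + 1 < \psi_\Omega(\hat{\alpha})$ holds by Lemma 10.3(ii), we can apply an inference (b∀∞) to get $\mathcal{H}_{\hat{\alpha}} \vdash^{\psi_\Omega(\hat{\alpha})}_{\psi_\Omega(\hat{\alpha})} \Gamma \Rightarrow \Delta$.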

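Case 3: Suppose the last inference was (Σ-Ref). Then ∆ contains a formula $\exists z\, A^z$, where A is a Σ-formula and

\[ \mathcal{H}_\eta \vdash^{\alpha_0}_{\Omega+1} \Gamma \Rightarrow \Delta, A \]

for some α0 < α. By the induction hypothesis we have

\[ \mathcal{H}_{\hat{\alpha}_0} \vdash^{\psi_\Omega(\hat{\alpha}_0)}_{\psi_\Omega(\hat{\alpha}_0)} \Gamma \Rightarrow \Delta, A . \]

Using the Bounding Lemma 9.16 we get

\[ \mathcal{H}_{\hat{\alpha}_0} \vdash^{\psi_\Omega(\hat{\alpha}_0)}_{\psi_\Omega(\hat{\alpha}_0)} \Gamma \Rightarrow \Delta, A^{L_{\psi_\Omega(\hat{\alpha}_0)}} . \]

Via an inference (∃R) we get

\[ \mathcal{H}_{\hat{\alpha}_0} \vdash^{\psi_\Omega(\hat{\alpha}_0)+2}_{\psi_\Omega(\hat{\alpha}_0)} \Gamma \Rightarrow \Delta, \exists z\, A^z . \]

Since $\psi_\Omega(\hat{\alpha}_0) + 2 < \psi_\Omega(\hat{\alpha})$, by Lemma 10.3, and $\exists z\, A^z$ is in ∆, we also have

\[ \mathcal{H}_{\hat{\alpha}} \vdash^{\psi_\Omega(\hat{\alpha})}_{\psi_\Omega(\hat{\alpha})} \Gamma \Rightarrow \Delta . \]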

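Case 4: Suppose the last inference was a cut. Then there exists a formula C with rk(C) ≤ Ω and α0 < α such that

\[ \mathcal{H}_\eta \vdash^{\alpha_0}_{\Omega+1} \Gamma, C \Rightarrow \Delta ; \tag{63} \]

\[ \mathcal{H}_\eta \vdash^{\alpha_0}_{\Omega+1} \Gamma \Rightarrow \Delta, C . \tag{64} \]

Case 4.1: rk(C) < Ω. Then we can apply the induction hypothesis to both (63) and (64) so that

\[ \mathcal{H}_{\hat{\alpha}_0} \vdash^{\psi_\Omega(\hat{\alpha}_0)}_{\psi_\Omega(\hat{\alpha}_0)} \Gamma, C \Rightarrow \Delta ; \tag{65} \]

\[ \mathcal{H}_{\hat{\alpha}_0} \vdash^{\psi_\Omega(\hat{\alpha}_0)}_{\psi_\Omega(\hat{\alpha}_0)} \Gamma \Rightarrow \Delta, C . \tag{66} \]

Since $k(C) \subseteq \mathcal{H}_\eta$ this implies $rk(C) < \psi_\Omega(\eta + 1)$. Thus applying a cut to (65) and (66) yields $\mathcal{H}_{\hat{\alpha}} \vdash^{\psi_\Omega(\hat{\alpha})}_{\psi_\Omega(\hat{\alpha})} \Gamma \Rightarrow \Delta$.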

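Case 4.2: rk(C) = Ω. Then C is of the form QxF(x) with Q ∈ {∃, ∀} and F(L0) being ∆0. Let us first suppose that C is ∃xF(x). Then we can apply the induction hypothesis to (64) and we get

\[ \mathcal{H}_{\hat{\alpha}_0} \vdash^{\psi_\Omega(\hat{\alpha}_0)}_{\psi_\Omega(\hat{\alpha}_0)} \Gamma \Rightarrow \Delta, C . \tag{67} \]

Using the Persistence Lemma 10.4 and the fact that $\psi_\Omega(\hat{\alpha}_0) \in \mathcal{H}_{\hat{\alpha}_0}$ (invoking Lemma 10.3(i)) we infer from (63) that

\[ \mathcal{H}_{\hat{\alpha}_0} \vdash^{\alpha_0}_{\Omega+1} \Gamma, (\exists x \in L_{\psi_\Omega(\hat{\alpha}_0)})F(x) \Rightarrow \Delta . \tag{68} \]

Since $(\exists x \in L_{\psi_\Omega(\hat{\alpha}_0)})F(x)$ is ∆0 the induction hypothesis can be applied to (68), yielding

\[ \mathcal{H}_{\alpha_1} \vdash^{\psi_\Omega(\alpha_1)}_{\psi_\Omega(\alpha_1)} \Gamma, (\exists x \in L_{\psi_\Omega(\hat{\alpha}_0)})F(x) \Rightarrow \Delta , \tag{69} \]

where $\alpha_1 = \hat{\alpha}_0 + \omega^{\Omega+\alpha_0}$. Since $\alpha_1 < \eta + \omega^{\Omega+\alpha} = \hat{\alpha}$ and $rk((\exists x \in L_{\psi_\Omega(\hat{\alpha}_0)})F(x)) < \psi_\Omega(\hat{\alpha})$ hold, cutting (69) against (67), after bounding C in (67) by $L_{\psi_\Omega(\hat{\alpha}_0)}$ via the Bounding Lemma 9.16, furnishes $\mathcal{H}_{\hat{\alpha}} \vdash^{\psi_\Omega(\hat{\alpha})}_{\psi_\Omega(\hat{\alpha})} \Gamma \Rightarrow \Delta$.

If C is ∀xF(x) the argument is similar. □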

11 Interpreting KP in RS

Theorem: 11.1 (Interpretation Theorem) If KP ⊢ A, where A is a sentence, then there exists m < ω such that

\[ \mathcal{H}_0 \vdash^{\Omega\cdot\omega^m}_{\Omega+m} A . \]

Proof: The proof is too long to be incorporated here. □

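Corollary: 11.2 (i) If A is a Σ sentence of KP and KP ⊢ A then

\[ L_{\psi_\Omega(\varepsilon_{\Omega+1})} \models A. \]

(ii) If KP ⊢ C where C is a sentence of the form ∀x∃y F(x, y) with F(a, b) being a Σ formula, then

\[ L_{\psi_\Omega(\varepsilon_{\Omega+1})} \models C. \]

(iii) There is no ordinal $< \psi_\Omega(\varepsilon_{\Omega+1})$ that satisfies (i).

(iv) $\| KP \| = \psi_\Omega(\varepsilon_{\Omega+1})$.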

Proof: (i): Suppose KP ⊢ A. By Theorem 11.1 we find m < ω such that

\[ \mathcal{H}_0 \vdash^{\Omega\cdot\omega^m}_{\Omega+m} A . \]

We can assume that m > 1. Using the First Cut Elimination Theorem 9.14 m − 1 times we get

\[ \mathcal{H}_0 \vdash^{\sigma_0}_{\Omega+1} A \tag{70} \]

where $\sigma_0 := \omega_{m-1}(\Omega\cdot\omega^m)$. Note that to (70) we can apply Impredicative Cut Elimination 10.5, and hence, since $0 + \omega^{\Omega+\sigma_0} = \omega^{\sigma_0}$,

\[ \mathcal{H}_{\sigma_1} \vdash^{\psi_\Omega(\sigma_1)}_{\psi_\Omega(\sigma_1)} A \tag{71} \]

where $\sigma_1 = \omega^{\sigma_0}$. By the Bounding Lemma 9.16 it follows that

\[ \mathcal{H}_{\sigma_1} \vdash^{\sigma_2}_{\sigma_2} A^{L_{\sigma_2}} \tag{72} \]

where $\sigma_2 = \psi_\Omega(\sigma_1)$. By Predicative Cut Elimination 9.15 we conclude from (72) that

\[ \mathcal{H}_{\sigma_1} \vdash^{\varphi_{\sigma_2}(\sigma_2)}_{0} A^{L_{\sigma_2}} . \tag{73} \]

As the derivation of (73) contains no inference (Σ-Ref) one then shows by induction on $\varphi_{\sigma_2}(\sigma_2)$ that all sequents appearing in the derivation are true in $L_{\sigma_2}$ on the standard interpretation.

Obviously, $\varphi_{\sigma_2}(\sigma_2) < \psi_\Omega(\varepsilon_{\Omega+1})$. As A is a Σ-formula it follows that $L_{\psi_\Omega(\varepsilon_{\Omega+1})} \models A$.

(ii) follows from (i) and (iii) using Theorem 2.1 from M. Rathjen: Fragments of Kripke-Platek set theory with infinity, in: P. Aczel, H. Simmons, S. Wainer (eds.): Proof Theory (Cambridge University Press, Cambridge, 1992) 251-273.

(iii) requires a well-ordering proof in KP.

(iv) follows from the fact that $PA + TI(\psi_\Omega(\varepsilon_{\Omega+1}))$ proves the consistency of KP and a cunning argument involving Löb's Theorem. □

$\psi_\Omega(\varepsilon_{\Omega+1})$ is also known as the Bachmann-Howard ordinal.
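The chain of transformations in the proof of Corollary 11.2(i) can be summarized in one display. This is only a recap of steps (70) to (73) under the notation used there; the tower notation $\omega_0(\xi) = \xi$, $\omega_{k+1}(\xi) = \omega^{\omega_k(\xi)}$ is assumed here, as the proof does not spell it out.

```latex
\begin{align*}
\mathcal{H}_0 \vdash^{\Omega\cdot\omega^m}_{\Omega+m} A
 &\ \Longrightarrow\ \mathcal{H}_0 \vdash^{\sigma_0}_{\Omega+1} A,
 & \sigma_0 &= \omega_{m-1}(\Omega\cdot\omega^m)
 && \text{(Theorem 9.14, $m-1$ times)}\\
 &\ \Longrightarrow\ \mathcal{H}_{\sigma_1} \vdash^{\psi_\Omega(\sigma_1)}_{\psi_\Omega(\sigma_1)} A,
 & \sigma_1 &= \omega^{\sigma_0}
 && \text{(Theorem 10.5)}\\
 &\ \Longrightarrow\ \mathcal{H}_{\sigma_1} \vdash^{\sigma_2}_{\sigma_2} A^{L_{\sigma_2}},
 & \sigma_2 &= \psi_\Omega(\sigma_1)
 && \text{(Lemma 9.16)}\\
 &\ \Longrightarrow\ \mathcal{H}_{\sigma_1} \vdash^{\varphi_{\sigma_2}(\sigma_2)}_{0} A^{L_{\sigma_2}}
 & &
 && \text{(Theorem 9.15)}
\end{align*}
```

Each step trades cut rank for derivation length; only the second step, the collapsing, brings the ordinal labels below Ω.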

References

[1] T. Arai: Proof theory for theories of ordinals I: recursively Mahlo ordinals, An-nals of Pure and applied Logic 122 (2003) 1–85.

[2] T. Arai: Proof theory for theories of ordinals II: Π3-Reflection, Annals of Pureand Applied Logic.

[3] H. Bachmann: Die Normalfunktionen und das Problem der ausgezeichneten Fol-gen von Ordinalzahlen. Vierteljahresschrift Naturforsch. Ges. Zurich 95 (1950)115–147.

[4] J Barwise: Admissible Sets and Structures (Springer, Berlin 1975).

[5] W. Buchholz: Eine Erweiterung der Schnitteliminationsmethode, Habilitations-schrift (Munchen 1977).

[6] A simplified version of local predicativity, in: Aczel, Simmons, Wainer (eds.),Leeds Proof Theory 1991 (Cambridge University Press, Cambridge, 1993) 115–147.

84

[7] W. Buchholz, S. Feferman, W. Pohlers, W. Sieg: Iterated inductive definitionsand subsystems of analysis (Springer, Berlin, 1981).

[8] W. Buchholz and K. Schutte: Proof theory of impredicative subsystems of anal-ysis. (Bibliopolis, Naples, 1988).

[9] W. Buchholz: Explaining Gentzen’s consistency proof within infinitary proof the-ory. in: G. Gottlob et al. (eds.), Computational Logic and Proof Theory, KGC’97, Lecture Notes in Computer Science 1289 (1997).

[10] G. Cantor: Beitrage zur Begrundung der transfiniten Mengenlehre II. Mathe-matische Annalen 49 (1897) 207–246.

[11] T. Carlson: Elementary patterns of resemblance, Annals of Pure and AppliedLogic 108 (2001) 19-77.

[12] A.G. Dragalin: New forms of realizability and Markov’s rule (Russian), Dokl.Acad. Nauk. SSSR 2551 (1980) 543–537; translated in: Sov. Math. Dokl. 10,1417–1420.

[13] F. Drake: Set Theory: An introduction to large cardinals. Amsterdam: NorthHolland 1974

[14] S. Feferman: Systems of predicative analysis, Journal of Symbolic Logic 29(1964) 1–30.

[15] S. Feferman: Proof theory: a personal report, in: G. Takeuti, Proof Theory, 2nd

edition (North-Holland, Amsterdam, 1987) 445–485.

[16] S. Feferman: Hilbert’s program relativized: Proof-theoretical and foundationalreductions, J. Symbolic Logic 53 (1988) 364–384.

[17] S. Feferman: Remarks for “The Trends in Logic”, in: Logic Colloquium ‘88(North-Holland, Amsterdam, 1989) 361–363.

[18] H. Friedman: Classically and intuitionistically provably recursive functions. In:G.H. Muller, D.S. Scott: Higher set theory (Springer, Berlin, 1978) 21–27.

[19] H. Friedman, K. McAloon, and S. Simpson: A finite combinatorial principlewhich is equivalent to the 1-consistency of predicative analysis, in: G. Metakides(ed.): Patras Logic Symposium (North-Holland, Amsterdam, 1982) 197–220.

[20] H. Friedman, N. Robertson, P. Seymour: The metamathematics of the graphminor theorem, Contemporary Mathematics 65 (1987) 229–261.

[21] H. Friedman and S. Scedrov: Large sets in intuitionistic set theory, Annals ofPure and Applied Logic 27 (1984) 1–24.

[22] H. Friedman and S. Sheard: Elementary descent recursion and proof theory,Annals of Pure and Applied Logic 71 (1995) 1–45.

[23] G.H. Hardy: A theorem concerning the infinite cardinal numbers. QuarterlyJournal of Mathematics 35 (1904) 87–94.

85

[24] D. Hilbert: Die Grundlegung der elementaren Zahlentheorie, MathematischeAnnalen 104 (1931).

[25] D. Hilbert and P. Bernays: Grundlagen der Mathematik II (Springer, Berlin,1938).

[26] G. Jager: Zur Beweistheorie der Kripke–Platek Mengenlehre uber den naturli-chen Zahlen, Archiv f. Math. Logik 22 (1982) 121–139.

[27] G. Jager and W. Pohlers: Eine beweistheoretische Untersuchung von∆1

2–CA + BI und verwandter Systeme, Sitzungsberichte der BayerischenAkademie der Wissenschaften, Mathematisch–Naturwissenschaftliche Klasse(1982).

[28] A. Kanamori, M. Magidor: The evolution of large cardinal axioms in set the-ory. In: G. H. Muller, D.S. Scott (eds.) Higher Set Theory. Lecture Notes inMathematics 669 (Springer, Berlin, 1978) 99-275.

[29] G. Kreisel: On the interpretation of non-finitist proofs II, Journal of SymbolicLogic 17 (1952) 43–58.

[30] G. Kreisel: Mathematical significance of consistency proofs. Journal of SymbolicLogic 23 (1958) 155–182.

[31] G. Kreisel: Generalized inductive definitions, in: Stanford Report on the Foun-dations of Analysis (Mimeographed, Stanford, 1963) Section III.

[32] G. Kreisel: A survey of proof theory, Journal of Symbolic Logic 33 (1968) 321–388.

[33] G. Kreisel: Notes concerning the elements of proof theory. Course notes of acourse on proof theory at U.C.L.A. 1967 - 1968.

[34] G. Kreisel, G. Mints, S. Simpson: The use of abstract language in elementarymetamathematics: Some pedagogic examples, in: Lecture Notes in Mathematics,vol. 453 (Springer, Berlin, 1975) 38–131.

[35] G.E. Mints: Finite investigations of infinite derivations, Journal of SovietMathematics 15 (1981) 45–62.

[36] W. Pohlers: Cut elimination for impredicative infinitary systems, part II: Or-dinal analysis for iterated inductive definitions, Arch. f. Math. Logik 22 (1982)113–129.

[37] W. Pohlers: Proof theory and ordinal analysis, Arch. Math. Logic 30 (1991)311–376.

[38] M. Rathjen: Ordinal notations based on a weakly Mahlo cardinal, Archive forMathematical Logic 29 (1990) 249–263.

[39] M. Rathjen: Proof-Theoretic Analysis of KPM, Arch. Math. Logic 30 (1991)377–403.

86

[40] M. Rathjen: How to develop proof–theoretic ordinal functions on the basis ofadmissible sets. Mathematical Quarterly 39 (1993) 47–54.

[41] M. Rathjen: Collapsing functions based on recursively large ordinals: A well–ordering proof for KPM. Archive for Mathematical Logic 33 (1994) 35–55.

[42] M. Rathjen: Proof theory of reflection. Annals of Pure and Applied Logic 68(1994) 181–224.

[43] M. Rathjen: Recent advances in ordinal analysis: Π12-CA and related systems.

Bulletin of Symbolic Logic 1, 468–485 (1995).

[44] M. Rathjen: The realm of ordinal analysis. S.B. Cooper and J.K. Truss (eds.):Sets and Proofs. (Cambridge University Press, 1999) 219–279.

[45] M. Rathjen: An ordinal analysis of stability, Archive for Mathematical Logic44 (2005) 1 - 62.

[46] M. Rathjen: An ordinal analysis of parameter-free Π12 comprehension Archive

for Mathematical Logic 44 (2005) 263 - 362.

[47] Richter, W. and Aczel, P.: Inductive definitions and reflecting properties of ad-missible ordinals. In: J.E. Fenstad, Hinman (eds.) Generalized Recursion Theory(North Holland, Amsterdam, 1973) 301-381.

[48] K. Schutte: Beweistheoretische Erfassung der unendlichen Induktion in derZahlentheorie, Mathematische Annalen 122 (1951) 369–389.

[49] K. Schutte: Beweistheorie (Springer, Berlin, 1960).

[50] K. Schutte: Eine Grenze fur die Beweisbarkeit der transfiniten Induktion inder verzweigten Typenlogik, Archiv fur Mathematische Logik und Grundlagen-forschung 67 (1964) 45–60.

[51] K. Schutte: Predicative well-orderings, in: Crossley, Dummet (eds.), Formalsystems and recursive functions (North Holland, 1965) 176–184.

[52] K. Schutte: Proof Theory (Springer, Berlin, 1977).

[53] H. Schwichtenberg: Proof theory: Some applications of cut-elimination. In: J.Barwise (ed.): Handbook of Mathematical Logic (North Holland, Amsterdam,1977) 867–895.

[54] S. Simpson: Nichtbeweisbarkeit von gewissen kombinatorischen Eigenschaftenendlicher Baume, Archiv f. Math. Logik 25 (1985) 45–65.

[55] S. Simpson: Subsystems of second order arithmetic (Springer, Berlin, 1999).

[56] G. Takeuti: Consistency proofs of subsystems of classical analysis, Ann. Math.86, 299–348.

[57] G. Takeuti: Proof theory and set theory, Synthese 62 (1985) 255–263.

87

[58] G. Takeuti, M. Yasugi: The ordinals of the systems of second order arithmeticwith the provably ∆1

2–comprehension and the ∆12–comprehension axiom respec-

tively, Japan J. Math. 41 (1973) 1–67.

[59] A. S. Troelstra: Metamathematical investigations of intuitionistic arithmeticand analysis, (Springer, Berlin, 1973).

[60] A. S. Troelstra and D. van Dalen: Constructivism in Mathematics: An Intro-duction, volume I, II, North–Holland, Amsterdam 1988.

[61] O. Veblen: Continuous increasing functions of finite and transfinite ordinals,Trans. Amer. Math. Soc. 9 (1908) 280–292.

88
