formalizing computability theory via partial recursive ...€¦ · i computability theory has...
TRANSCRIPT
1/23
Formalizing computability theory via partialrecursive functions
Mario Carneiro
Carnegie Mellon University
September 9, 2019
2/23
Lean
I An open source interactive theorem prover developedprimarily by Leonardo de Moura (Microsoft Research)
I Focus on software verification and formalized mathematicsI Based on Dependent Type Theory
I Classical, non-HoTTI Similar to CIC, the axiom system used by Coq
I Lean 3 includes a powerful metaprogramminginfrastructure for Lean in Lean
I The mathlib library for Lean 3 provides a broad range ofpure mathematics and tools for (meta)programming
I Includes abstract algebra, category theory, computabilitytheory, and much much more. . .
I This formalization is available online1
1https://github.com/leanprover-community/mathlib/tree/master/
src/computability
3/23
How to start?
I Computability theory has multiple “competing”formalizations
I Turing machinesI Lambda calculusI Partial recursive functionsI Minsky register machinesI Wang B-machinesI . . .
I . . . and they are all equivalent (Church-Turing thesis)
In theory it shouldn’t matter, but in practice different ways ofsaying the same thing make a huge difference in the difficulty ofa formalization project. The only way to get good data is to tryall the ways and see which is easiest.
3/23
How to start?
I Computability theory has multiple “competing”formalizations
I Turing machinesI Lambda calculusI Partial recursive functionsI Minsky register machinesI Wang B-machinesI . . .
I . . . and they are all equivalent (Church-Turing thesis)
In theory it shouldn’t matter, but in practice different ways ofsaying the same thing make a huge difference in the difficulty ofa formalization project. The only way to get good data is to tryall the ways and see which is easiest.
3/23
How to start?
I Computability theory has multiple “competing”formalizations
I Turing machinesI Lambda calculusI Partial recursive functionsI Minsky register machinesI Wang B-machinesI . . .
I . . . and they are all equivalent (Church-Turing thesis)
In theory it shouldn’t matter, but in practice different ways ofsaying the same thing make a huge difference in the difficulty ofa formalization project. The only way to get good data is to tryall the ways and see which is easiest.
4/23
Turing machines
I Very simple to describe, veryhard to use
I Mathematical description bearsobvious resemblance to vonNeumann architecture andphysical machines of Turing’sday
I Non-compositional semanticsand graph based transitionsmake them unnatural in a formalproof
5/23
Post-Turing machines / Wang B-machinesI Various variations and
limitations of Turing machinesmake them more like standardprocessors
I not every instruction is a jump
I By limiting control flow patterns,this can be made compositional
I Memory access is stillnon-compositional
write 0
write 1
move left
move right
if read 0 then goto i
if read 1 then goto j
Verdict: This is a good way to wrangle Turing machines, but itis still too concrete; proofs involve many details of the modeland the ideas are lost in the noise. (Concrete is useful, butconcrete + unrealistic is not.)
5/23
Post-Turing machines / Wang B-machinesI Various variations and
limitations of Turing machinesmake them more like standardprocessors
I not every instruction is a jump
I By limiting control flow patterns,this can be made compositional
I Memory access is stillnon-compositional
write 0
write 1
move left
move right
if read 0 then goto i
if read 1 then goto j
Verdict: This is a good way to wrangle Turing machines, but itis still too concrete; proofs involve many details of the modeland the ideas are lost in the noise. (Concrete is useful, butconcrete + unrealistic is not.)
6/23
Lambda calculus
e ::= x | e e′ | λx. e
(λx. e) a −→ e[a/x]
I Simple, abstract, and compositionalI Partiality and confluence are new problems in this setting
I Turing machines aren’t compositional so there is only oneplace that partiality can arise, but any subterm of a lambdaterm can fail to terminate.
I Confluence can be addressed by picking an evaluationorder, or by proving Church-Rosser.
I Lean is a dependent type theory so it is itself a variant onlambda calculus, but we can’t reuse this lambda as alambda term (a.k.a. higher order abstract syntax (HOAS)).
I Lean needs all terms to terminate, so a direct embedding isdifficult
I Verdict: weak reject
6/23
Lambda calculus
e ::= x | e e′ | λx. e
(λx. e) a −→ e[a/x]
I Simple, abstract, and compositionalI Partiality and confluence are new problems in this setting
I Turing machines aren’t compositional so there is only oneplace that partiality can arise, but any subterm of a lambdaterm can fail to terminate.
I Confluence can be addressed by picking an evaluationorder, or by proving Church-Rosser.
I Lean is a dependent type theory so it is itself a variant onlambda calculus, but we can’t reuse this lambda as alambda term (a.k.a. higher order abstract syntax (HOAS)).
I Lean needs all terms to terminate, so a direct embedding isdifficult
I Verdict: weak reject
7/23
Primitive recursive functions
inductive primrec : (N → N) → Prop
| zero : primrec (λ n, 0)| succ : primrec succ
| left : primrec (λ n, fst (unpair n))| right : primrec (λ n, snd (unpair n))| pair {f g} : primrec f → primrec g →
primrec (λ n, mkpair (f n) (g n))| comp {f g} : primrec f → primrec g →
primrec (f ◦ g)
| prec {f g} : primrec f → primrec g →
primrec (unpaired (λ z n, nat.rec_on n (f z)(λ y IH, g (mkpair z (mkpair y IH)))))
Primitive recursion asserts that if f and g are primrec then so isthe function h such that
h(z, 0) = f (z) and h(z,n + 1) = g(z, (n, h(z,n)))
8/23
Primitive recursive functions
I Primitive recursive functions are functions in Lean’s logicI All primrec functions terminate, so Lean can evaluate them
without issueI No n-ary functions needed because we embed projection
and pairing relative to Cantor’s pairing functionI composition is simply f , g primrec→ f ◦ g primrec
I Although slightly more complicated to define, primrecfunctions come with enough “built in” for us to getaddition and multiplication very easily, as well as manyrecursive constructions.
9/23
Primitive recursive functions on α→ β
As we are going for a general purpose library, we want thetheory to play well with Lean. So we should be able to talkabout a function f : α→ β being primitive recursive.
A bad definition, primrec functions on α→ β
f : α→ β is primrec iff there exist bijections e1 :N→ α ande2 : β→N such that e2 ◦ f ◦ e1 :N→N is primrec.
A good definition of primrec on other types should coincidewith the original when α = β =N, but this one implies that thehalting oracle
f (n) =
1 if the nth program halts0 o.w.
is primitive recursive with e1 = id, and e2(in) = 2n, e2(jn) = 2n + 1where in, jn enumerate halting and non-halting programs resp.
9/23
Primitive recursive functions on α→ β
As we are going for a general purpose library, we want thetheory to play well with Lean. So we should be able to talkabout a function f : α→ β being primitive recursive.
A bad definition, primrec functions on α→ β
f : α→ β is primrec iff there exist bijections e1 :N→ α ande2 : β→N such that e2 ◦ f ◦ e1 :N→N is primrec.
A good definition of primrec on other types should coincidewith the original when α = β =N, but this one implies that thehalting oracle
f (n) =
1 if the nth program halts0 o.w.
is primitive recursive with e1 = id, and e2(in) = 2n, e2(jn) = 2n + 1where in, jn enumerate halting and non-halting programs resp.
10/23
Primitive recursive functions on α→ β
I Intuitively, e2 should not have been admissible because it’snot computable, but we can’t say what that means ingeneral since e2 : β→N.
I What we need is a “standard bijection” on all countabletypes so that we can coherently talk about other bijectionsbeing computable or not by comparing to the standard one
I Typeclasses to the rescue!
11/23
Primcodable types
I A type α is encodable if it comes equipped with mapsencode : α→N and decode :N→ option α such thatdecode (encode a) = some a.
I Classically, this means the same as α is countable (finite orcountably infinite).
I Encodable types are closed under most type formersI Encodable instances can be inferred by typeclass inferenceI An encodable instance yields a nontrivial function
f :N→N defined by
f (n) =
encode a + 1 if decode n = some a0 if decode n = none
I A type α is primcodable if f is primrec.
12/23
Primitive recursive functions on α→ β
A better definition, primrec functions on α→ β
For primcodable types α, β, f : α→ β is primrec iffencodeoptN ◦map (encodeβ ◦ f ) ◦ decodeα :N→N is primrec.
I This definition coincides with the original for functionsN→N, using the standard primcodable instance forN.
I The assumption that α and β are primcodable rather thanjust encodable is required to prove that the identityfunction α→ α is primrec.
I This definition does not directly apply to binary functionsα→ β→ γ because β→ γ is not usually primcodable (it isnot countable), but we can define binary primrec functionsby currying
13/23
Everything interesting is primitive recursive
I There are many things in the library, but almost everythingcomputable in the Lean sense is primitive recursive, e.g. allstandard functions on option, sum, prod, list, vector, etc.
I It is not hard to get course of values recursion from this,once we know that list α is primcodable.
I Many theorems later...
14/23
PartialityI If the goal is to get to a universal function, primitive
recursion isn’t enough. How to deal with partiality?I Lean has a type of partial functions already:
α 7→ β := Σ(S : set α),S→ β
14/23
PartialityI If the goal is to get to a universal function, primitive
recursion isn’t enough. How to deal with partiality?I Lean has a type of partial functions already:
α 7→ β := Σ(S : set α),S→ β
≡ Σ(S : set α), {x : α | x ∈ S} → β
14/23
PartialityI If the goal is to get to a universal function, primitive
recursion isn’t enough. How to deal with partiality?I Lean has a type of partial functions already:
α 7→ β := Σ(S : set α),S→ β
≡ Σ(S : set α), {x : α | x ∈ S} → β
≡ Σ(p : α→ Prop), {x : α | p x} → β
14/23
PartialityI If the goal is to get to a universal function, primitive
recursion isn’t enough. How to deal with partiality?I Lean has a type of partial functions already:
α 7→ β := Σ(S : set α),S→ β
≡ Σ(S : set α), {x : α | x ∈ S} → β
≡ Σ(p : α→ Prop), {x : α | p x} → β
' Σ(p : α→ Prop),Π(x : α), p x→ β
14/23
PartialityI If the goal is to get to a universal function, primitive
recursion isn’t enough. How to deal with partiality?I Lean has a type of partial functions already:
α 7→ β := Σ(S : set α),S→ β
≡ Σ(S : set α), {x : α | x ∈ S} → β
≡ Σ(p : α→ Prop), {x : α | p x} → β
' Σ(p : α→ Prop),Π(x : α), p x→ β
' Π(x : α),Σ(p : Prop), p→ β
14/23
PartialityI If the goal is to get to a universal function, primitive
recursion isn’t enough. How to deal with partiality?I Lean has a type of partial functions already:
α 7→ β := Σ(S : set α),S→ β
≡ Σ(S : set α), {x : α | x ∈ S} → β
≡ Σ(p : α→ Prop), {x : α | p x} → β
' Σ(p : α→ Prop),Π(x : α), p x→ β
' Π(x : α),Σ(p : Prop), p→ β
≡ α→ Σ(p : Prop), p→ β
14/23
PartialityI If the goal is to get to a universal function, primitive
recursion isn’t enough. How to deal with partiality?I Lean has a type of partial functions already:
α 7→ β := Σ(S : set α),S→ β
≡ Σ(S : set α), {x : α | x ∈ S} → β
≡ Σ(p : α→ Prop), {x : α | p x} → β
' Σ(p : α→ Prop),Π(x : α), p x→ β
' Π(x : α),Σ(p : Prop), p→ β
≡ α→ Σ(p : Prop), p→ β︸ ︷︷ ︸part β
14/23
PartialityI If the goal is to get to a universal function, primitive
recursion isn’t enough. How to deal with partiality?I Lean has a type of partial functions already:
Σ(S : set α),S→ β
≡ Σ(S : set α), {x : α | x ∈ S} → β
≡ Σ(p : α→ Prop), {x : α | p x} → β
' Σ(p : α→ Prop),Π(x : α), p x→ β
' Π(x : α),Σ(p : Prop), p→ β
α 7→ β := α→ Σ(p : Prop), p→ β︸ ︷︷ ︸part β
15/23
The partiality monad
part α := Σ(dom : Prop), dom→ α
return a := 〈>, λ . a〉assert : Π(p : Prop), (p→ part α)→ part α
assert p f := 〈(∃h : p, (f h)1), λ〈h, h′〉. (f h)2 h′〉bind 〈p, f 〉 g := assert p (g ◦ f )
I The type operator part is a monad in which we can assertarbitrary facts using the “assert” operation above
I Classically, this type looks the same as option α, but it doesnot assume decidability of the domain predicate
I This differs from a relational partiality type (e.g.{S : set α | |S| ≤ 1}) in that we can evaluate the function ifwe can prove it is in domain
16/23
Partial recursive functions
I This turns out to be the “right” notion of partiality we needfor partial recursive functions, because we can define
fix : (α 7→ α + β)→ α 7→ β
I This allows us to constructively define the µ-operator ofpartial recursive functions: µn. p(n) returns the smallestn ∈N such that p(n) is true if there is one, otherwise itdiverges. We can represent this as
pfind : (N 7→ bool) 7→N
17/23
Partial recursive functions
inductive partrec : (N 7→ N) → Prop
| zero : partrec (pure 0)
| succ : partrec succ
| left : partrec (λ n, fst (unpair n))| right : partrec (λ n, snd (unpair n))| pair {f g} : partrec f → partrec g →
partrec (λ n,f n >>= λ a, g n >>= λ b, pure (mkpair a b))
| comp {f g} : partrec f → partrec g →
partrec (λ n, g n >>= f)| prec {f g} : partrec f → partrec g →
partrec (unpaired (λ a n, nat.rec_on n (f a)(λ y IH, IH >>= λ i,g (mkpair a (mkpair y i)))))
| find {f} : partrec f → partrec (λ a,find (λ n, (λ m, m = 0) <$> f (mkpair a n)))
18/23
Primitive recursive functions
inductive primrec : (N → N) → Prop
| zero : primrec (λ n, 0)| succ : primrec succ
| left : primrec (λ n, fst (unpair n))| right : primrec (λ n, snd (unpair n))| pair {f g} : primrec f → primrec g →
primrec (λ n,mkpair (f n) (g n))
| comp {f g} : primrec f → primrec g →
primrec (f ◦ g)
| prec {f g} : primrec f → primrec g →
primrec (unpaired (λ a n, nat.rec_on n (f a)(λ y IH,g (mkpair a (mkpair y IH)))))
19/23
Partial recursive functions
inductive partrec : (N 7→ N) → Prop
| zero : partrec (pure 0)
| succ : partrec succ
| left : partrec (λ n, fst (unpair n))| right : partrec (λ n, snd (unpair n))| pair {f g} : partrec f → partrec g →
partrec (λ n,f n >>= λ a, g n >>= λ b, pure (mkpair a b))
| comp {f g} : partrec f → partrec g →
partrec (λ n, g n >>= f)| prec {f g} : partrec f → partrec g →
partrec (unpaired (λ a n, nat.rec_on n (f a)(λ y IH, IH >>= λ i,g (mkpair a (mkpair y i)))))
| find {f} : partrec f → partrec (λ a,find (λ n, (λ m, m = 0) <$> f (mkpair a n)))
20/23
Codes for partial recursive functions
inductive code : Type
| zero : code
| succ : code
| left : code
| right : code
| pair : code → code → code
| comp : code → code → code
| prec : code → code → code
| find’ : code → code
I This is a primcodable typeI The constructors are primrec, and the recursor is also
primrec (partrec) if the functions giving the cases areprimrec (partrec)
20/23
Codes for partial recursive functions
inductive code : Type
| zero : code
| succ : code
| left : code
| right : code
| pair : code → code → code
| comp : code → code → code
| prec : code → code → code
| find’ : code → code
I This is a primcodable typeI The constructors are primrec, and the recursor is also
primrec (partrec) if the functions giving the cases areprimrec (partrec)
21/23
Evaluation
def evaln : ∀ k : N, code → N → option N := ...def eval : code → N 7→ N := ...
I The function evaln k c n evaluates the function describedby c at n for at most k “steps”
I Only the prec and find’ cases actually need to take a step,the others are well founded on the size of the program
I If the function does not terminate in k steps, or if it usesvalues larger than k in a subcomputation, then it returnsnone
I eval c n evaluates c “all the way”, with the partial resultindicating divergence
I eval c is the semantics corresponding to the syntax cI evaln is primitive recursive, and eval is partial recursive.
Proof: by constructionI ⇒ eval is a universal partial recursive function
22/23
Computability theory
The s-m-n theoremA function curry : code→N→ code satisfyingeval (curry c m) n = eval c (m,n) is primrec.
The fixed point theorems
1. If f : code→ code is computable, then there exists somecode c such that eval (f c) = eval c.
2. If f : code→N 7→N is partial recursive, then there existssome code c such that eval c = f c.
Rice’s theoremLet C ⊆ (N 7→N) such that {c | eval c ∈ C} is computable. ThenC is trivial, that is, C = ∅ or C =N 7→N.
23/23
Future directions
I All library objectives were achievedI i.e. we can say “computable” now and have the basic facts
I Turing jump (a.k.a. what problems become computablewith an oracle for the halting problem?)
I ...even less realistic than the computability theory done hereI Complexity theory
I this requires less abstract models of computationI Work in this direction is much more advanced in Coq, see
Forster & Smolka (2017)I Proving equivalence of different models of computation to
this oneI Some work has been done on TM→Wang-B→ partrecI Work in Isabelle is more advanced, see Xu, Zhang, & Urban
(2013)
I All of these models are too abstract; come back on Fridayto see something more practical (formalizing x86)
23/23
Future directions
I All library objectives were achievedI i.e. we can say “computable” now and have the basic facts
I Turing jump (a.k.a. what problems become computablewith an oracle for the halting problem?)
I ...even less realistic than the computability theory done hereI Complexity theory
I this requires less abstract models of computationI Work in this direction is much more advanced in Coq, see
Forster & Smolka (2017)I Proving equivalence of different models of computation to
this oneI Some work has been done on TM→Wang-B→ partrecI Work in Isabelle is more advanced, see Xu, Zhang, & Urban
(2013)
I All of these models are too abstract; come back on Fridayto see something more practical (formalizing x86)