type inference

Type Inference

Edited version of notes by Dave Walker@ Princeton

who in turn edited notes byBob Harper@CMU

Both Dave and Bob were grad students at Cornell, so I decided it was OK to borrow their work…

Criticisms of Typed Languages

Types overly constrain functions & datapolymorphism makes typed constructs useful in

more contextsuniversal polymorphism => code reuseexistential polymorphism => rep. independencesubtyping => coming soon

Types clutter programs and slow down programmer productivity type inference

uninformative annotations may be omitted

Type Inference

Todayoverviewgeneration of type constraints from

unannotated simply-typed programssolving type constraints

Type Schemes

A type scheme contains type variables that may be filled in during type inferences ::= a | int | bool | s1 -> s2

A term scheme is a term that contains type schemes rather than proper typese ::= ... | fun f (x:s1) : s2 is e end

Example

fun map (f, l) is if null (l) then nil else cons (f (hd l), map (f, tl l)))end

Step 1: Add Type Schemes

fun map (f : a, l : b) : c is if null (l) then nil else cons (f (hd l), map (f, tl l)))end

type schemeson functions

Step 2: Generate Constraints


b = b’ list



: d list

constraintsb = b’ list



: d list

constraintsb = b’ list

b = b’’ list b = b’’’ list


fun map (f : a, l : b) : c is if null (l) then nil else cons (f (hd l : b’’), map (f, tl l : b’’’ list)))end

: d list

constraintsb = b’ listb = b’’ listb = b’’’ list

b = b’’’ lista = a


fun map (f : a, l : b) : c is if null (l) then nil else cons (f (hd l : b’’) : a’, map (f, tl l) : c))end

: d list

constraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ list

a = b’’ -> a’


fun map (f : a, l : b) : c is if null (l) then nil else cons (f (hd l : b’’) : a’, map (f, tl l) : c)) : c’ list

end

: d list

constraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ lista = b’’ -> a’

c = c’ lista’ = c’


fun map (f : a, l : b) : c is if null (l) then nil else cons (f (hd l : b’’) : a’, map (f, tl l) : c)) : c’ list

end

: d list


c = c’ lista’ = c’d list = c’ list


fun map (f : a, l : b) : c is if null (l) then nil else cons (f (hd l : b’’) : a’, map (f, tl l) : c)) : c’ list: d list

end

: d list


c = c’ lista’ = c’d list = c’ listd list = c



finalconstraintsb = b’ listb = b’’ listb = b’’’ lista = ab = b’’’ lista = b’’ -> a’c = c’ lista’ = c’d list = c’ listd list = c

Step 3: Solve Constraints

finalconstraintsb = b’ listb = b’’ listb = b’’’ lista = a...

Constraint solution provides all possible solutions to type scheme annotations on terms

solutiona = b’ -> c’b = b’ listc = c’ list

map (f : a -> b x : a list): b listis ...end

Step 4: Generate types

Generate types from type schemesOption 1: pick an instance of the most

general type when we have completed type inference on the entire program

map : (int -> int) -> int list -> int listOption 2: generate polymorphic types for

program parts and continue (polymorphic) type inference

map : All (a,b,c) (a -> b) -> a list -> b list

Type Inference Details

Type constraints are sets of equations between type schemesq ::= {s11 = s12, ..., sn1 = sn2}

eg: {b = b’ list, a = b -> c}

Constraint Generation

Syntax-directed constraint generationour algorithm crawls over abstract syntax

of untyped expressions and generatesa term schemea set of constraints

Algorithm defined as set of inference rules (as always)G |-- u => e : t, q

Constraint Generation

Simple rules:G |-- x ==> x : s, { } (if G(x) = s)

G |-- 3 ==> 3 : int, { } (same for other ints)

G |-- true ==> true : bool, { }

G |-- false ==> false : bool, { }

Operators

G |-- u1 ==> e1 : t1, q1 G |-- u2 ==> e2 : t2, q2------------------------------------------------------------------------G |-- u1 + u2 ==> e1 + e2 : int, q1 U q2 U {t1 = int, t2 = int}

G |-- u1 ==> e1 : t1, q1 G |-- u2 ==> e2 : t2, q2------------------------------------------------------------------------G |-- u1 < u2 ==> e1 + e2 : bool, q1 U q2 U {t1 = int, t2 = int}

If statements

G |-- u1 ==> e1 : t1, q1G |-- u2 ==> e2 : t2, q2G |-- u3 ==> e3 : t3, q3----------------------------------------------------------------G |-- if u1 then u2 else u3 ==> if e1 then e2 else e3

: a, q1 U q2 U q3 U {t1 = bool, a = t2, a = t3}

Function Application

G |-- u1 ==> e1 : t1, q1G |-- u2 ==> e2 : t2, q2----------------------------------------------------------------G |-- u1 u2==> e1 e2

: a, q1 U q2 U {t1 = t2 -> a}

Function Declaration

G, f : a -> b, x : a |-- u ==> e : t, q----------------------------------------------------------------G |-- fun f(x) is u end ==> fun f (x : a) : b is e end

: a -> b, q U {t = b}

(a,b are fresh type variables; not in G, t, q)

Solving Constraints

A solution to a system of type constraints is a substitution Sa function from type variables to typesbecause some things get simpler, we assume

substitutions are defined on all type variables:S(a) = a (for almost all variables a)S(a) = s (for some type scheme s)

dom(S) = set of variables s.t. S(a) a

Substitutions

Given a substitution S, we can define a function S* from type schemes (as opposed to type variables) to type schemes:S*(int) = intS*(s1 -> s2) = S*(s1) -> S*(s2)S*(a) = S(a)

Due to laziness, I will write S(s) instead of S*(s)

Composition of Substitutions

Composition (U o S) applies the substitution S and then applies the substitution U: (U o S)(a) = U(S(a))

We will need to compare substitutionsT <= S if T is “more specific” than ST <= S if T is “less general” than SFormally: T <= S if and only if T = U o S for

some U

Composition of Substitutions

Examples:example 1: any substitution is less general than

the identity substitution I:S <= I because S = S o I

example 2:S(a) = int, S(b) = c -> cT(a) = int, T(b) = int -> int, T(c) = intwe conclude: T <= S if T(a) = int, T(b) = int -> bool then T is unrelated to S

(neither more nor less general)

Solving a Constraint

S |= q if S is a solution to the constraints q

S(s1) = S(s2) S |= q-----------------------------------S |= {s1 = s2} U q

----------S |= { }

any substitution isa solution for the emptyset of constraints

a solution to an equationis a substitution that makesleft and right sides equal

Most General Solutions

S is the principal (most general) solution of a constraint q ifS |= q (it is a solution) if T |= q then T <= S (it is the most general one)

Lemma: If q has a solution, then it has a most general oneWe care about principal solutions since they will give us the most general types for terms

Examples

Example 1q = {a=int, b=a}principal solution S:

Examples

Example 1q = {a=int, b=a}principal solution S:

S(a) = S(b) = intS(c) = c (for all c other than a,b)

Examples

Example 2q = {a=int, b=a, b=bool}principal solution S:

Examples

Example 2q = {a=int, b=a, b=bool}principal solution S:

does not exist (there is no solution to q)

principal solutions

principal solutions give rise to most general reconstruction of typing information for a term: fun f(x:a):a = x

is a most general reconstruction

fun f(x:int):int = x is not

Unification

Unification: An algorithm that provides the principal solution to a set of constraints (if one exists) If one exists, it will be principal

Unification

Unification: Unification systematically simplifies a set of constraints, yielding a substitutionduring simplification, we maintain (S,q)

S is the solution so farq are the constraints left to simplifyStarting state of unification process: (I,q)Final state of unification process: (S, { })

Unification Machine

We can specify unification as a transition system: (S,q) -> (S’,q’)

Base types & simple variables:

--------------------------------(S,{int=int} U q) -> (S, q)

------------------------------------(S,{bool=bool} U q) -> (S, q)

-----------------------------(S,{a=a} U q) -> (S, q)

Unification Machine

Functions:

Variable definitions

----------------------------------------------(S,{s11 -> s12= s21 -> s22} U q) -> (S, {s11 = s21, s12 = s22} U q)

--------------------------------------------- (a not in FV(s))(S,{a=s} U q) -> ([a=s] o S, [s/a]q)

-------------------------------------------- (a not in FV(s))(S,{s=a} U q) -> ([a=s] o S, [s/a]q)

Occurs Check

What is the solution to {a = a -> a}?

Occurs Check

What is the solution to {a = a -> a}?There is none! (Remember your

homework)The occurs check detects this situation

-------------------------------------------- (a not in FV(s))(S,{s=a} U q) -> ([a=s] o S, [s/a]q)

occurs check

Irreducible States

Recall: final states have the form (S, { })Stuck states (S,q) are such that every equation in q has the form: int = bools1 -> s2 = s (s not function type)a = s (s contains a)or is symmetric to one of the above

Stuck states arise when constraints are unsolvable

Termination

We want unification to terminate (to give us a type reconstruction algorithm)In other words, we want to show that there is no infinite sequence of states (S1,q1) -> (S2,q2) -> ...

TerminationWe associate an ordering with constraintsq < q’ if and only if

q contains fewer variables than q’q contains the same number of variables as q’ but

fewer type constructors (ie: fewer occurrences of int, bool, or “->”)

This is a lexicographic orderingThere is no infinite decreasing sequence of

constraintsTo prove termination, we must demonstrate that

every step of the algorithm reduces the size of q according to this ordering

Termination

Lemma: Every step reduces the size of qProof: By cases (ie: induction) on the definition

of the reduction relation.--------------------------------(S,{int=int} U q) -> (S, q)

------------------------------------(S,{bool=bool} U q) -> (S, q)

-----------------------------(S,{a=a} U q) -> (S, q)

----------------------------------------------(S,{s11 -> s12= s21 -> s22} U q) -> (S, {s11 = s21, s12 = s22} U q)

------------------------ (a not in FV(s))(S,{a=s} U q) -> ([a=s] o S, [s/a]q)

Complete Solutions

A complete solution for (S,q) is a substitution T such that

1. T <= S2. T |= qintuition: T extends S and solves qA principal solution T for (S,q) is complete for (S,q) and

3. for all T’ such that 1. and 2. hold, T’ <= T

Properties of Solutions

Lemma 1:Every final state (S, { }) has a complete

solution. It is S:

S <= S S |= { }


Lemma 2No stuck state has a complete solution (or

any solution at all) it is impossible for a substitution to make the

necessary equations equal int bool int t1 -> t2 ...


Lemma 3 If (S,q) -> (S’,q’) then

T is complete for (S,q) iff T is complete for (S’,q’)

T is principal for (S,q) iff T is principal for (S’,q’) in the forward direction, this is the preservation

theorem for the unification machine!

Summary: Unification

By termination, (I,q) ->* (S,q’) where (S,q’) is irreducible. Moreover: If q’ = { } then

(S,q’) is final (by definition)S is a principal solution for q

Consider any T such that T is a solution to q. Now notice, S is principal for (S,q’) (by lemma 1) S is principal for (I,q) (by lemma 3) Since S is principal for (I,q), we know T <= S and

therefore S is a principal solution for q.

Summary: Unification (cont.)

... Moreover: If q’ is not { } (and (I,q) ->* (S,q’) where

(S,q’) is irreducible) then (S,q) is stuck. Consequently, (S,q) has no

complete solution. By lemma 3, even (I,q) has no complete solution and therefore q has no solution at all.

Summary: Type Inference

Type inference algorithm.Given a context G, and untyped term u:

Find e, t, q such that G |- u ==> e : t, qFind principal solution S of q via unification

if no solution exists, there is no reconstructionApply S to e, ie our solution is S(e)

S(e) contains schematic type variables a,b,c, etc that may be instantiated with any type

Since S is principal, S(e) characterizes all reconstructions.

type inference

Documents

b listb

b lista

b listend

b liststep

b ac

b astep

b cb

b listc