Towards a language design for modular software verification

Aleks Nanevski, Microsoft Research, Cambridge

Joint with Greg Morrisett (Harvard), Lars Birkedal (ITU Copenhagen), Amal Ahmed (TTI-Chicago)

Workshop on Effects and Type Theory, Tallinn, December 13, 2007

How to design a programming language from scratch with verification in mind?

• Simple types have been very successful in preventing a class of programming errors.

• But many errors are outside their reach: index-out-of-bounds, division-by-zero, invariants on mutable state, or almost anything involving effects.

• Can a language enforce these deeper properties while supporting the usual features from programming practice and remaining conservative over simply-typed languages?

Two foundational approaches to program specification and verification

• Hoare Logic starts with an existing language: usually imperative, untyped, first-order; recent extensions to simply-typed functional languages [Honda’05],[Krishnaswami’06],[Birkedal’05].

• Dependent type theory targets the pure higher-order lambda calculus; types may capture deep semantic properties of data (an integer is even, a list has 5 elements, etc.).

• I want to argue that we essentially want a combination of both.

What limitations of simple types to address?

• Simple types cannot specify effects.

• These operations are naturally partial, but here they must be “completed”: perform a run-time check and possibly raise an exception.

• Simple types do not capture this partiality.
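A minimal sketch of such “completion”, assuming division and list indexing as the partial operations (the slide's own examples did not survive extraction). The exception name DivByZero echoes the label used later in the talk; nth is a hypothetical helper. Note that neither signature mentions the possible failure.

```python
from typing import List


class DivByZero(Exception):
    """Hypothetical exception name, echoing the DivByZero label in the talk."""


def div(x: int, y: int) -> int:
    # Division is partial (undefined for y == 0), but the simple type
    # int -> int -> int cannot say so: the function is "completed"
    # with a run-time check that may raise.
    if y == 0:
        raise DivByZero("division by zero")
    return x // y


def nth(xs: List[int], i: int) -> int:
    # Indexing is partial as well; the completion raises IndexError.
    if not 0 <= i < len(xs):
        raise IndexError("index out of bounds")
    return xs[i]
```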

How to specify effect behavior?

• Type-and-effect systems: refine the type with the effect annotation.

Semantic disconnect in type-and-effect systems

• The following term would be labeled as throwing DivByZero in most type-and-effect systems.

• Also, execution of div x n will repeat the check for n>0, even if it doesn’t need to.

• Also, how to specify dynamically generated exceptions? This immediately requires dependent types.
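The disconnect can be sketched with a toy syntactic effect analysis. Everything here is an assumption for illustration: the term language (Const, Var, Div, IfPositive), the effects function, and the guarded term, since the slide's own term was lost in extraction. The analysis unions the effects of all subterms, so a division guarded by n > 0 is still labeled DivByZero although it can never raise.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Union


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Div:
    num: "Term"
    den: "Term"


@dataclass(frozen=True)
class IfPositive:
    # if cond > 0 then then_ else else_
    cond: "Term"
    then_: "Term"
    else_: "Term"


Term = Union[Const, Var, Div, IfPositive]


def effects(t: Term) -> FrozenSet[str]:
    """Syntactic effect labeling: ignores what the guard actually establishes."""
    if isinstance(t, (Const, Var)):
        return frozenset()
    if isinstance(t, Div):
        return effects(t.num) | effects(t.den) | {"DivByZero"}
    return effects(t.cond) | effects(t.then_) | effects(t.else_)


def evaluate(t: Term, env: Dict[str, int]) -> int:
    if isinstance(t, Const):
        return t.value
    if isinstance(t, Var):
        return env[t.name]
    if isinstance(t, Div):
        return evaluate(t.num, env) // evaluate(t.den, env)
    branch = t.then_ if evaluate(t.cond, env) > 0 else t.else_
    return evaluate(branch, env)


# if n > 0 then div x n else 0
guarded = IfPositive(Var("n"), Div(Var("x"), Var("n")), Const(0))
```

The label is sound but imprecise: effects(guarded) reports DivByZero, while evaluate never divides by zero on this term.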

How to reconnect type-and-effects with semantics?

• Idea: draw effect annotations from logic.

• y > 0 is a precondition that must be proved before running div x y. We will also require postconditions, as in Hoare logic, and proofs.

• Important: Pre/post-conditions become embedded in types.

Why embed specifications into types?

• Captures partiality: e.g., no need to define div x y in case y ≤ 0. Hence, strictly more expressive than Hoare Logic.

• Enables trade-offs between proving and efficiency, i.e., we can immediately define:

• Uniform abstraction over terms, types, specs: essential for information hiding and scalability; essential for higher-order and local state.
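A rough sketch of the proving/efficiency trade-off. Python has no dependent types, so the precondition y > 0 is approximated by a witness object; Positive, prove_positive, checked_div and share_all are hypothetical names, and the “proof” here is only a run-time test performed once. In HTT the obligation would be discharged statically.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Positive:
    # Invariant by convention: value > 0. Stands in for a proof of y > 0.
    value: int


def prove_positive(n: int) -> Optional[Positive]:
    """The only intended way to obtain evidence: check once, up front."""
    return Positive(n) if n > 0 else None


def div(x: int, y: Positive) -> int:
    # No run-time check: the argument type carries the precondition.
    return x // y.value


def checked_div(x: int, y: int) -> Optional[int]:
    # The dynamically checked variant is definable from the unchecked one.
    evidence = prove_positive(y)
    return None if evidence is None else div(x, evidence)


def share_all(totals: List[int], count: Positive) -> List[int]:
    # One piece of evidence justifies many divisions without re-checking.
    return [div(t, count) for t in totals]
```

Callers that can establish the precondition use div directly; callers that cannot fall back to checked_div and pay for the test.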

Which logic to use for specifications?

• It should be able to support all kinds of programming features: practical data structures (e.g., hash-tables); higher-order functions, polymorphism; pointers, aliasing, state ownership; recursion, callcc, IO, concurrency.

• Thus, the logic better be very expressive.

• Type theory (like Coq) seems perfect.

• But need to reconcile it with effects.

Hoare Type Theory (HTT)

• Introduce a type corresponding to specs in Hoare Logic (for partial correctness).

• The Hoare type ST{P}x:A{Q} stands for stateful programs with precondition P, postcondition Q, and result type A.

• The simply-typed fragment is (almost) core Haskell.

Hoare Type Theory (cont’d)

• Fruitful combination of some fundamental PL ideas: Dijkstra’s predicate transformers, the Curry-Howard isomorphism, monads (as in Haskell), and the Separation Logic of Reynolds, O’Hearn, et al.

• Provably compositional: components can be specified and checked in isolation.

• Prototype under construction as extension of Coq. Execution by code extraction.

Dependent types and effects

Type theories are unsound if effects are added naively

• Propositions like (10 < 0) are types.

• Effectful programs can often be given any type: divergence via infinite recursion, exceptions, mutable state, IO, concurrency (the “awkward squad” from Haskell).

• An effectful program can prove that (10 < 0)! Hence, the system is inconsistent.

A solution: Monads

• As in Haskell, distinguish purity with types: the pure fragment is the underlying type theory.

• e : nat means e is a natural-number value.

• e : ST nat means e is a delayed effectful computation. When executed, it may change the state and diverge. But since it is delayed, it is actually considered pure; hence, it can safely appear in types, predicates, proofs.

• e : ST (10 < 0) is a computation which must diverge when executed.
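The “delayed computation” reading can be sketched as a state-passing function. This is an assumed encoding for illustration, not HTT's definition: a computation is a function from a heap to a (result, new heap) pair, heaps are plain dicts, and unit, bind, read, write are hypothetical helper names. Building a computation touches no heap; only applying it does.

```python
from typing import Any, Callable, Dict, Tuple

Heap = Dict[int, Any]
# A computation is a delayed function: nothing happens until it is applied.
ST = Callable[[Heap], Tuple[Any, Heap]]


def unit(v: Any) -> ST:
    return lambda h: (v, h)


def bind(m: ST, k: Callable[[Any], ST]) -> ST:
    def run(h: Heap) -> Tuple[Any, Heap]:
        v, h1 = m(h)       # run the first computation
        return k(v)(h1)    # feed its result and heap to the continuation
    return run


def read(x: int) -> ST:
    return lambda h: (h[x], h)


def write(x: int, v: Any) -> ST:
    return lambda h: (None, {**h, x: v})


# Constructing the program is pure: the heap h0 is not involved at all.
prog = bind(read(0), lambda v: bind(write(0, v + 1), lambda _: unit(v)))
h0 = {0: 41}
result, h1 = prog(h0)   # effects happen only here
```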

Refine the monad with pre/post-condition to capture effectful behavior and partiality

• Hoare type is a dependent (or indexed) monad.

• Formation rule: ST{P}x:A{Q} : Type if P : heap → Prop, A : Type, and x:A ⊢ Q : heap → heap → Prop, where heap = loc → option(Σa:Type. a), and loc = nat.

• Note: the postcondition is a binary relation on heaps. Variant of VDM notation.

• where the points-to predicate (written ptsto x v h in the Coq code) is true if x points to v:A in h.

• Note: before running inc x, one must prove that x stores a nat, because x may store a value of some other type, or x may be a dangling pointer.
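The heap model can be mirrored by a small sketch. Assumptions: a Python dict stands for loc → option(...), a missing key plays the role of “none”, and a (type, value) pair stands for the dependent pair Σa:Type. a. The name ptsto follows the Coq code; stores_nat is a hypothetical helper for the precondition of inc.

```python
from typing import Any, Dict, Tuple

Loc = int
Dyn = Tuple[type, Any]   # a value packaged together with its type
Heap = Dict[Loc, Dyn]    # absent key = unallocated location


def ptsto(x: Loc, v: Any, a: type, h: Heap) -> bool:
    """True if location x holds the value v at type a in heap h."""
    return h.get(x) == (a, v)


def stores_nat(x: Loc, h: Heap) -> bool:
    """exists v : nat, ptsto x v h (nat approximated by non-negative int)."""
    cell = h.get(x)
    return cell is not None and cell[0] is int and cell[1] >= 0


h = {1: (int, 5), 2: (str, "a")}
```

Both failure modes named on the slide show up: location 2 stores a value of another type, and location 3 is dangling, so stores_nat is false for each.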

Example: specify function that increments location contents and returns old value

Implementation of inc in Haskell-style do-notation.

• HTT implementation typechecks inc as follows: compute P, Q = weakest pre/strongest post for the do-block, then emit an obligation to prove the consequence:
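A sketch of inc with its specification written as executable predicates. Assumptions: heaps are dicts from locations to values, the state is threaded explicitly instead of using do-notation, and pre/post are hypothetical names for the intended spec (x stores a number; the result is the old contents and x is incremented). Here the spec is only tested on sample heaps, whereas HTT proves it during typechecking.

```python
from typing import Any, Dict, Tuple

Heap = Dict[int, Any]


def inc(x: int, h: Heap) -> Tuple[int, Heap]:
    xv = h[x]                   # xv <- read x
    m = {**h, x: xv + 1}        # write x (xv + 1)
    return xv, m                # return xv


def pre(x: int, i: Heap) -> bool:
    # x is allocated and stores a number
    return isinstance(i.get(x), int)


def post(x: int, y: int, i: Heap, m: Heap) -> bool:
    # binary postcondition: relates result y, initial heap i, final heap m
    return y == i[x] and m == {**i, x: y + 1}


i = {0: 7, 1: "other"}
y, m = inc(0, i)
```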

Typing of primitive commands designed to compute weakest pre and strongest post

• Memory read

• (Strong) Memory update

Typing of primitive commands designed to compute weakest pre and strongest post

• Memory allocation

• Memory deallocation
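The typing rules for these four commands were shown as figures. As a rough stand-in, the sketch below enforces plausible preconditions at run time; the exact HTT rules are not reproduced. PreconditionError, require, new and free are hypothetical names. The update is “strong” in that it may change the type of the stored value, so its precondition only asks that the location be allocated.

```python
from typing import Any, Dict, Tuple

Heap = Dict[int, Any]


class PreconditionError(Exception):
    """Raised where HTT would instead reject the program statically."""


def require(cond: bool, msg: str) -> None:
    if not cond:
        raise PreconditionError(msg)


def read(x: int, h: Heap) -> Tuple[Any, Heap]:
    require(x in h, "read: location not allocated")
    return h[x], h                        # heap unchanged


def write(x: int, v: Any, h: Heap) -> Tuple[None, Heap]:
    require(x in h, "write: location not allocated")
    return None, {**h, x: v}              # strong update: type may change


def new(v: Any, h: Heap) -> Tuple[int, Heap]:
    r = max(h, default=-1) + 1            # a location not in use
    return r, {**h, r: v}


def free(x: int, h: Heap) -> Tuple[None, Heap]:
    require(x in h, "free: location not allocated")
    return None, {k: w for k, w in h.items() if k != x}
```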

Fixpoints are a little bit different…

• Pre/posts must be given explicitly (for now)

• Corresponds to giving loop invariants in Hoare Logic

• But should be possible to write a rule that infers the strongest invariant! Future work.

Monadic primitives (unit)

• Roughly, corresponds to Hoare Logic rule of variable assignment.

Monadic primitives (bind)

• Rule of sequential composition (but higher-order)

• Note: quantification over pre/posts and heaps is essential for obtaining tightest specs.
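How the pre/post of a bind is assembled can be sketched with predicate combinators. The shapes below are read off the computed spec for inc shown later in the talk: the composite precondition is p1 plus “every outcome of the first command satisfies p2”, and the composite postcondition existentially hides the intermediate result and heap. The argument lists are simplified relative to the Coq names bind_pre and bind_post, and the quantifiers range over a tiny finite universe (VALUES, HEAPS) purely so the sketch can run; both are assumptions.

```python
VALUES = [None, 0, 1, 2, 3]                 # candidate results (None plays unit)
HEAPS = [{}] + [{0: v} for v in range(4)]   # candidate heaps over one location


def bind_pre(p1, q1, p2):
    # p1 i  /\  forall x m, q1 x i m -> p2 x m
    return lambda i: p1(i) and all(
        p2(x, m) for x in VALUES for m in HEAPS if q1(x, i, m))


def bind_post(q1, q2):
    # exists x h, q1 x i h  /\  q2 x y h m
    return lambda y, i, m: any(
        q1(x, i, h) and q2(x, y, h, m) for x in VALUES for h in HEAPS)


def read_pre(i):
    return 0 in i


def read_post(x, i, m):
    return 0 in i and m == i and i[0] == x


def write_pre(xv, i):
    return 0 in i


def write_post(xv, y, i, m):
    return y is None and m == {**i, 0: xv + 1}


# spec of:  xv <- read 0; write 0 (xv + 1)
pre = bind_pre(read_pre, read_post, write_pre)
post = bind_post(read_post, write_post)
```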

Monadic primitives (Haskell-style do)

• Rule of consequence

• Interesting fact: “do” is not an ordinary coercion: it is an introduction form for the Hoare type, and bind is the corresponding elimination form.

Example: counter

• Allocate a private location x

• Export function that increments x

• Executing f ← counter; x0 ← f; x1 ← f; x2 ← f will bind 0, 1, 2 to x0, x1, x2, respectively.

• What is the spec for counter?

• Problem: x is out of scope in return type.
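The counter can be sketched with explicit heap passing. Assumptions: heaps are dicts, a fresh location is chosen as one past the largest key, and the state is threaded by hand. The location x is visible only inside the closure f, which is exactly why the return type cannot mention it.

```python
from typing import Any, Callable, Dict, Tuple

Heap = Dict[int, Any]


def counter(h: Heap) -> Tuple[Callable[[Heap], Tuple[int, Heap]], Heap]:
    x = max(h, default=-1) + 1          # allocate a private location x
    h1 = {**h, x: 0}

    def f(heap: Heap) -> Tuple[int, Heap]:
        v = heap[x]                     # x is captured, not exported
        return v, {**heap, x: v + 1}

    return f, h1                        # only f escapes


f, h = counter({})
x0, h = f(h)
x1, h = f(h)
x2, h = f(h)
```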

A specification with nested Hoare types

• Introduce invariant into code to hide how count is kept.

• Another problem: fst(f) 0 h states (x ↦ 0) h, but we lost the connection with i.

• We will need Separation Logic to handle this.

Hide private state by existential abstraction

Proving program correctness in HTT

Weakest pre and strongest post precisely capture the semantics of a program.

• Problem: these may not be easy to read!

• Remember the example 3-line program:

Here is the computed tightest spec for inc, in Coq syntax.

inc : forall x : loc,
  ST (fun i : heap =>
        (fun i0 : heap => exists v : nat, ptsto x v i0) i /\
        (forall (x0 : nat) (m : heap),
           (fun (y : nat) (i0 m0 : heap) => m0 = i0 /\ ptsto x y i0) x0 i m ->
           (fun (xv : nat) (i0 : heap) =>
              (fun i1 : heap => exists B : Type, exists w : B, ptsto x w i1) i0 /\
              (forall (x1 : unit) (m0 : heap),
                 (fun (_ : unit) (i1 m1 : heap) => m1 = update x (xv + 1) i1) x1 i0 m0 ->
                 (fun (_ : unit) (_ : heap) => True) x1 m0)) x0 m))
     (fun (y : nat) (i m : heap) =>
        exists x0 : nat, exists h : heap,
          (fun (y0 : nat) (i0 m0 : heap) => m0 = i0 /\ ptsto x y0 i0) x0 i h /\
          (fun (xv y0 : nat) (i0 m0 : heap) =>
             exists x1 : unit, exists h0 : heap,
               (fun (_ : unit) (i1 m1 : heap) => m1 = update x (xv + 1) i1) x1 i0 h0 /\
               (fun (_ : unit) (r : nat) (i1 f : heap) => r = xv /\ f = i1) x1 y0 h0 m0) x0 y h m)

Luckily, the spec has a lot of structure!

• It literally represents the program as a predicate.

• We apply the proving strategy from Hoare Logic: symbolically evaluate the program, one step at a time. At each step, discharge the verification condition that enables the next evaluation step.

• With a twist: evaluation/VC-generation can be implemented as a set of lemmas. Proving the lemmas verifies the VC-gen implementation.

Example lemma for symbolic evaluation (in Coq syntax)

• If the program starts with a read from location x: first prove that x is initialized (ptsto x v i), then proceed to prove the spec of the continuation.

• Other lemmas similar (evals_bind_write, evals_bind_new…)

• Applicable lemma can be determined by a tactic.

Lemma evals_bind_read :
  forall (A B : Type) (x : loc) (v : A) (p2 : A -> heap -> Prop)
         (q2 : A -> B -> heap -> heap -> Prop) (i : heap) (q : B -> heap -> Prop),
    ptsto x v i -> (p2 v i /\ forall y m, q2 v y i m -> q y m) ->
    bind_pre (read_pre A x) (read_post A x) p2 i /\
    forall y m, bind_post (read_pre A x) (read_post A x) p2 q2 y i m -> q y m.

Separation Logic

Large footprints in Hoare Logic

• Let inc:

• Q: What is known after inc runs in a heap with locations x and y?

• A: Only that x ↦ v+1; all information about y is lost.

• The spec should explicitly say that y is not changed. This is possible to write in ST, but quite inconvenient.

Small footprints and Separation Logic

• Specs should only describe what the program changes [O’Hearn,Reynolds,Pym,…]

• If e : STsep{P}x:A{Q}, then e can run in any heap containing a subheap i such that P i; it either diverges, or returns a subheap m such that Q i m; the part of the initial heap outside i is not accessible.

• Easier to use than large footprints, but more difficult meta theory.

Separation logic adds two new things:

• Separating conjunction (easily definable in HTT):

(P * Q) holds of heap h iff P and Q hold of disjoint parts of h

• Frame rule of inference: if {P} e {Q}, then {P * R} e {Q * R}.

• Can we add Frame rule to HTT? How to prove that Frame is sound?
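The separating conjunction can be checked by brute force on small heaps. A minimal sketch, assuming heaps are dicts and predicates are Python functions on heaps; splits, star, points_to and emp are hypothetical names. star enumerates every way of cutting h into two disjoint parts.

```python
from itertools import combinations
from typing import Any, Callable, Dict, Iterator, Tuple

Heap = Dict[int, Any]
Pred = Callable[[Heap], bool]


def splits(h: Heap) -> Iterator[Tuple[Heap, Heap]]:
    """All ways to cut h into two disjoint subheaps."""
    locs = list(h)
    for r in range(len(locs) + 1):
        for left in combinations(locs, r):
            h1 = {k: h[k] for k in left}
            h2 = {k: h[k] for k in h if k not in left}
            yield h1, h2


def star(p: Pred, q: Pred) -> Pred:
    # (P * Q) h  iff  P and Q hold of disjoint parts of h
    return lambda h: any(p(h1) and q(h2) for h1, h2 in splits(h))


def points_to(x: int, v: Any) -> Pred:
    return lambda h: h == {x: v}        # exactly one cell, nothing else


def emp(h: Heap) -> bool:
    return h == {}
```

Unlike ordinary conjunction, star(points_to(1, 10), points_to(1, 10)) fails on the heap {1: 10}, because a single cell cannot be handed to both sides.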

Employ a type-theoretic idea to expedite…

• Impose that well-typed programs must satisfy Frame!

• Define new monad STsep, over ST:

• Then re-type the stateful commands, using rule of consequence.
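The small-footprint discipline can be imitated at run time. The definition of STsep itself was shown as a figure and is not reproduced; run_framed below is a hypothetical helper, assuming heaps are dicts and commands are functions from a heap to a (result, heap) pair. The command sees only its footprint, and the rest of the heap (the frame) is carried over unchanged.

```python
from typing import Any, Callable, Dict, Set, Tuple

Heap = Dict[int, Any]
Cmd = Callable[[Heap], Tuple[Any, Heap]]


def run_framed(cmd: Cmd, footprint: Set[int], h: Heap) -> Tuple[Any, Heap]:
    sub = {k: h[k] for k in footprint}                      # what cmd may touch
    frame = {k: v for k, v in h.items() if k not in footprint}
    result, sub2 = cmd(sub)                                 # frame is invisible here
    if set(sub2) & set(frame):
        raise ValueError("command produced a location that clashes with the frame")
    return result, {**frame, **sub2}                        # frame restored untouched


# inc on location 0, specified with the small footprint {0}
inc0: Cmd = lambda s: (s[0], {**s, 0: s[0] + 1})
result, h = run_framed(inc0, {0}, {0: 7, 1: "y"})
```

A command that reaches outside its footprint, such as one reading location 1 here, fails with a KeyError, since that part of the heap is simply not there for it.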

Programs remain the same, but specs become much simpler

• Example: allocation

the empty subheap is consumed and replaced by r ↦ v; r must be fresh (as new can’t access existing state)

• Example: deallocation

subheap x ↦ - is consumed and replaced by empty.

• Analogy with linear logic.

• Now (fst f) 0 replaces empty from the precondition.

• Meaning: initial heap is extended with x ↦ 0

STsep monad correctly handles private state

Meta-theoretic properties: soundness, compositionality, equations

Verification in HTT reduces to typechecking

• Theorem: If e : ST{P}r:A{Q}, then e evaluates as expected.

• Proved via Preservation and Progress lemmas, but they are much more demanding!

• Preservation: evaluation preserves types, normal forms, and postconditions; e.g., if e : ST{T}r:int{r = 55}, then e does produce 55.

• Progress demands soundness of the assertion logic. This requires a denotational model for HTT.

Type checking is syntax directed

• Program properties are independent of context. No need for whole-program reasoning. Proofs by induction on program structure.

• Program is a proof of its spec: in the pure case, by Curry-Howard; in the impure case, by weakest pre/strongest post.

• Formal statements of compositionality: in the pure case, substitution principles; in the impure case, Hoare’s rule of composition.

Denotational models

• The denotation of e : ST{P}x:A{Q x} is a predicate transformer that: takes p : heap → Prop such that ∀h. p h → P h; returns q : A → heap → Prop such that ∀x h. q x h → ∃i. p i ∧ Q x i h; and is monotone.

• The model suffices for soundness, but is too large: e.g., it does not support storing monads into heaps; also, it requires showing monotonicity before taking fix.

• Better: a realizability model [Petersen,Birkedal’08]. But it is not implemented in Coq, and doing so seems very hard!

Implementation, related work, future work, summary

Summary

• HTT reflects effect information into types via Hoare-style pre/post conditions. Generalization of monadic type-and-effect systems, but effect annotations are logical predicates over heaps.

• Types determine in which context a program may be used (in a context satisfying the precondition). This is a uniquely type-theoretic property, generalizing ordinary Hoare Logics.

• Combines usefully with higher-order features of a type theory like Coq, to represent modes of use of state, like: freshness, aliasing, ownership (via Separation Logic); higher-order and shared local state (via existential abstraction).

Related work

• Extended static checking: ESC/Java, JML, Spec#, SPlint, Cyclone, Sage. Hoare-like annotations verified during typechecking. Restrictive strategies for dealing with undecidability.

• Dependent types and effects: [Augustsson’98],[Mandelbaum’03],[Zhu,Xi’05],[Shao’05],[Sheard’05],[Westbrook’06],[Taha’07],[Condit’07]. Programs and specs cannot share pure code (phase separation).

• Hoare Logics for higher-order functions: [Schoeder’02],[Honda’05],[Krishnaswami’06],[Birkedal’04]. Simply-typed underlying languages (with effects). Hoare triples do not integrate into types.

HTT in comparison to related work.

[Figure: chart with axes “Spec expressiveness” and “Programming features”, placing: typed lambda calculus; Java, C#, Haskell, O’Caml; dependent type theory (Coq, Epigram, NuPRL…); Hoare specs (ESC, JML, Spec#, Cyclone); light dependent types (Cayenne, DML, ATS, Omega); HTT; and “Fully verified software”.]

Future work: gain more experience with implementation in Coq

• A lot of scaffolding for verification is in place: symbolic evaluation lemmas; tactics for Separation Logic reasoning (were tricky to nail down at first; several wrong starts).

• Getting ready to attack larger programs. Probably start with libraries for imperative data structures.

• Largest so far: Hash-table module, Stack module, Parsing combinators.

• Experience encouraging: the proofs/code ratio is quite large, but the proofs were not difficult.

Future work: other effects

• First attempts at formulating a Haskell-style monad for transactional concurrency. Separate state into private and shared. Reasoning like O’Hearn’s concurrent separation logic. The Hoare type is a 4-tuple STM{I}{P}x:A{Q}, where I is the invariant of the shared state.

• Other notions of concurrency? Auxiliary variables, history/prophecy variables? Predicate transformers for concurrency?

• IO monad? Specifications must be limited to statements that are invariant against outside changes to the world.

• Continuation monad? (first attempts made)

Future work: better models and axiomatizations

• Can we encode equality over effectful code as some reasonable judgment?

• Without having to implement involved categorical models.

Hopefully in future not too far, far away…

[Figure: the same chart of “Spec expressiveness” against “Programming features” as in the comparison with related work, with HTT and “Fully verified software”.]
