Towards a language design for modular software verification
Aleks Nanevski, Microsoft Research, Cambridge
Joint with Greg Morrisett (Harvard), Lars Birkedal (ITU Copenhagen), Amal Ahmed (TTI-Chicago)
Workshop on Effects and Type Theory, Tallinn, December 13, 2007
How to design a programming language from scratch with verification in mind?
• Simple types have been very successful in preventing a class of programming errors.
• But many errors are outside their reach: index out of bounds, division by zero, invariants on mutable state, or almost anything involving effects.
• Can a language enforce these deeper properties while supporting the usual features of programming practice and remaining conservative over simply-typed languages?
Two foundational approaches to program specification and verification
• Hoare Logic starts with an existing language: usually imperative, untyped, and first-order, with recent extensions to simply-typed functional languages [Honda’05], [Krishnaswami’06], [Birkedal’05].
• Dependent type theory targets the pure higher-order lambda calculus; types may capture deep semantic properties of data (an integer is even, a list has 5 elements, etc.).
• I want to argue that we essentially want a combination of both.
What limitations of simple types to address?
• Simple types cannot specify effects.
• These operations are naturally partial, but here they must be “completed”: perform a run-time check and possibly raise an exception.
• Simple types do not capture this partiality.
How to specify effect behavior?
• Type-and-effect systems: refine the type with the effect annotation.
Semantic disconnect in type-and-effect systems
• The following term would be labeled as throwing DivByZero in most type-and-effect systems.
• Also, execution of div x n will repeat the check for n>0, even if it doesn’t need to.
• Also, how to specify dynamically generated exceptions? This immediately requires dependent types.
How to reconnect type-and-effects with semantics?
• Idea: draw effect annotations from logic.
• y > 0 is a precondition that must be proved before running div x y. We will also require postconditions, as in Hoare Logic, and proofs.
• Important: Pre/post-conditions become embedded in types.
Why embed specifications into types?
• Captures partiality: e.g., no need to define div x y in case y ≤ 0. Hence, strictly more expressive than Hoare Logic.
• Enables trade-offs between proving and efficiency, i.e., we can immediately define:
• Uniform abstraction over terms, types, and specs: essential for information hiding and scalability, and for higher-order and local state.
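The trade-off between proving and efficiency can be illustrated with a small Python sketch. It is not the definition from the slide (which did not survive in this transcript), and Python cannot enforce a precondition statically, so the precondition appears only as a comment. The names div_unchecked and div_checked are hypothetical; DivByZero is borrowed from the earlier slide.

```python
class DivByZero(Exception):
    """Exception name borrowed from the slide on type-and-effect systems."""


def div_unchecked(x: int, y: int) -> int:
    # Precondition (the caller's proof obligation): y > 0.
    # No run-time check is performed, so nothing is paid at execution time.
    return x // y


def div_checked(x: int, y: int) -> int:
    # No precondition: the obligation y > 0 is discharged by a run-time
    # test, after which the unchecked version can be called safely.
    if y > 0:
        return div_unchecked(x, y)
    raise DivByZero(f"divisor must be positive, got {y}")
```

A caller that can prove y > 0 would use div_unchecked and skip the test; a caller that cannot falls back on div_checked and pays for the check.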
Which logic to use for specifications?
• It should be able to support all kinds of programming features: practical data structures (e.g., hash tables); higher-order functions and polymorphism; pointers, aliasing, and state ownership; recursion, callcc, IO, and concurrency.
• Thus, the logic had better be very expressive.
• Type theory (like Coq) seems perfect.
• But need to reconcile it with effects.
Hoare Type Theory (HTT)
• Introduce a type corresponding to specs in Hoare Logic (for partial correctness).
• The Hoare type stands for stateful programs with precondition P, postcondition Q, and result type A.
• The simply-typed fragment is (almost) core Haskell.
Hoare Type Theory (cont’d)
• Fruitful combination of some fundamental PL ideas: Dijkstra’s predicate transformers, the Curry-Howard isomorphism, monads (as in Haskell), and the Separation Logic of Reynolds, O’Hearn, et al.
• Provably compositional: components can be specified and checked in isolation.
• Prototype under construction as an extension of Coq. Execution by code extraction.
Dependent types and effects
Type theories are unsound if effects are added naively
• Propositions like (10 < 0) are types.
• Effectful programs can often be given any type: divergence via infinite recursion, exceptions, mutable state, IO, concurrency (the “awkward squad” from Haskell).
• An effectful program can prove that (10 < 0)! Hence, the system is inconsistent.
A solution: Monads
• As in Haskell, distinguish purity with types. The pure fragment is the underlying type theory.
• e : nat means that e is an integer value.
• e : ST nat means that e is a delayed effectful computation. When executed, it may change the state and diverge. But since it is delayed, it is actually considered pure. Hence, it can safely appear in types, predicates, and proofs.
• e : ST (10 < 0) is a computation which must diverge when executed.
Refine the monad with pre/post-condition to capture effectful behavior and partiality
• Hoare type is a dependent (or indexed) monad.
• Formation rule: ST{P}x:A{Q} : Type if P : heap → Prop, A : Type, and x:A ⊢ Q : heap → heap → Prop, where heap = loc → option (Σα:Type. α), and loc = nat.
• Note: postcondition is binary relation on heaps. Variant of VDM notation.
• where the points-to assertion (written ptsto x v h in the Coq syntax used later) is true if x points to v:A in h.
• Note: before running inc x, one must prove that x stores a nat, because x may store a value of some other type, or x may be a dangling pointer.
Example: specify function that increments location contents and returns old value
Implementation of inc in Haskell-style do-notation.
• The HTT implementation typechecks inc as follows: compute P, Q = weakest pre/strongest post for the do-block, then emit an obligation to prove the consequence:
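As a rough executable model of the inc example, the Python sketch below represents a heap as a dictionary from locations to (type tag, value) pairs, mirroring heap = loc → option (Σα:Type. α). The spec of inc is not preserved in this transcript, so inc_pre and inc_post are assumptions reconstructed from the slide text: x must store a nat, the old value is returned, and the contents are incremented. All function names are hypothetical, and Python's int stands in for nat.

```python
from typing import Any, Dict, Tuple

Loc = int
# A heap maps a location to a (type tag, value) pair; absent keys model None.
Heap = Dict[Loc, Tuple[type, Any]]


def ptsto(x: Loc, v: Any, ty: type, h: Heap) -> bool:
    """True if x points to the value v of type ty in heap h."""
    return x in h and h[x] == (ty, v)


def inc_pre(x: Loc):
    # Assumed precondition: x is allocated and stores a nat (int here).
    return lambda i: x in i and i[x][0] is int


def inc_post(x: Loc):
    # Assumed postcondition, a binary relation on the initial heap i and
    # the final heap m: the result y is the old contents of x, and m is i
    # with x updated to y + 1.
    return lambda y, i, m: ptsto(x, y, int, i) and m == {**i, x: (int, y + 1)}


def inc(x: Loc, h: Heap):
    # Mirrors the do-block: xv <- read x; write x (xv + 1); return xv
    _, xv = h[x]
    m = {**h, x: (int, xv + 1)}
    return xv, m
```

Here the precondition is only evaluated at run time. In HTT it would be a proof obligation discharged before inc x is allowed to run.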
Typing of primitive commands designed to compute weakest pre and strongest post
• Memory read
• (Strong) Memory update
Typing of primitive commands designed to compute weakest pre and strongest post
• Memory allocation
• Memory deallocation
Fixpoints are a little bit different…
• Pre/posts must be given explicitly (for now)
• Corresponds to giving loop invariants in Hoare Logic
• But should be possible to write a rule that infers the strongest invariant! Future work.
Monadic primitives (unit)
• Roughly, corresponds to Hoare Logic rule of variable assignment.
Monadic primitives (bind)
• Rule of sequential composition (but higher-order)
• Note: quantification over pre/posts and heaps is essential for obtaining the tightest specs.
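The shape of the composite spec computed for bind can be sketched in Python, following the structure visible in the Coq spec for inc in these slides: the composite precondition requires the first precondition plus the second precondition in every intermediate state, and the composite postcondition existentially hides the intermediate value and heap. To stay executable, the quantifiers range over small finite sets (VALUES, HEAPS), which is an assumption of this sketch and not part of HTT. The example specs for read and write are simplified, hypothetical versions for a single location 0.

```python
def bind_pre(p1, q1, p2, values, heaps):
    # pre(i) := p1 i /\ forall x m, q1 x i m -> p2 x m
    def pre(i):
        return p1(i) and all(
            p2(x, m) for x in values for m in heaps if q1(x, i, m)
        )
    return pre


def bind_post(q1, q2, values, heaps):
    # post(y, i, m) := exists x h, q1 x i h /\ q2 x y h m
    def post(y, i, m):
        return any(
            q1(x, i, h) and q2(x, y, h, m) for x in values for h in heaps
        )
    return post


# Finite universes standing in for "all values" and "all heaps".
VALUES = range(4)
HEAPS = [{}] + [{0: n} for n in range(5)]

# First command: read location 0 (heap unchanged, result is the contents).
read_pre = lambda i: 0 in i
read_post = lambda x, i, m: m == i and i.get(0) == x

# Continuation, given the value x just read: write x + 1 and return x.
write_pre = lambda x, m: 0 in m
write_post = lambda x, y, h, m: y == x and m == {**h, 0: x + 1}

pre = bind_pre(read_pre, read_post, write_pre, VALUES, HEAPS)
post = bind_post(read_post, write_post, VALUES, HEAPS)
```

With these definitions, pre({0: 1}) holds, and post relates the initial heap {0: 1} to result 1 and final heap {0: 2}, but to no other result or final heap.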
Monadic primitives (Haskell-style do)
• Rule of consequence
• Interesting fact: “do” is not an ordinary coercion. It is an introduction form for the Hoare type; bind is the corresponding elimination form.
Example: counter
• Allocate a private location x
• Export function that increments x
• Executing f ← counter; x0 ← f; x1 ← f; x2 ← f will bind 0, 1, 2 to x0, x1, x2, respectively.
• What is the spec for counter?
• Problem: x is out of scope in return type.
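The behavior of the counter example (not its HTT typing) can be sketched as a Python closure. The private location x is modeled by a one-element list captured by the returned function; this encoding is an assumption made for illustration.

```python
def counter():
    # Allocate a private location x, initialized to 0.
    x = [0]

    def f():
        # Return the old contents of x and increment them.
        old = x[0]
        x[0] = old + 1
        return old

    # Export only the incrementing function; x itself is not reachable
    # from outside.
    return f


f = counter()
x0, x1, x2 = f(), f(), f()
```

After this runs, x0, x1, x2 are 0, 1, 2. The caller never sees x, which is exactly why a spec for counter cannot mention x directly in its return type.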
A specification with nested Hoare types
• Introduce invariant into code to hide how count is kept.
• Another problem: fst(f) 0 h states (x ↦ 0) h, but we lost the connection with i.
• We will need Separation Logic to handle this.
Hide private state by existential abstraction
Proving program correctness in HTT
Weakest pre and strongest post precisely capture the semantics of a program.
• Problem: these may not be easy to read!
• Remember the example 3-line program:
Here is the computed tightest spec for inc, in Coq syntax.
inc : forall x : loc, ST
  (fun i : heap =>
     (fun i0 : heap => exists v : nat, ptsto x v i0) i /\
     (forall (x0 : nat) (m : heap),
        (fun (y : nat) (i0 m0 : heap) => m0 = i0 /\ ptsto x y i0) x0 i m ->
        (fun (xv : nat) (i0 : heap) =>
           (fun i1 : heap => exists B : Type, exists w : B, ptsto x w i1) i0 /\
           (forall (x1 : unit) (m0 : heap),
              (fun (_ : unit) (i1 m1 : heap) => m1 = update x (xv + 1) i1) x1 i0 m0 ->
              (fun (_ : unit) (_ : heap) => True) x1 m0)) x0 m))
  (fun (y : nat) (i m : heap) =>
     exists x0 : nat, exists h : heap,
       (fun (y0 : nat) (i0 m0 : heap) => m0 = i0 /\ ptsto x y0 i0) x0 i h /\
       (fun (xv y0 : nat) (i0 m0 : heap) =>
          exists x1 : unit, exists h0 : heap,
            (fun (_ : unit) (i1 m1 : heap) => m1 = update x (xv + 1) i1) x1 i0 h0 /\
            (fun (_ : unit) (r : nat) (i1 f : heap) => r = xv /\ f = i1) x1 y0 h0 m0) x0 y h m)
Luckily, the spec has a lot of structure!
• It literally represents the program as a predicate.
• We apply the proving strategy from Hoare Logic: symbolically evaluate the program, one step at a time. At each step, discharge the verification condition that enables the next evaluation step.
• With a twist: evaluation/VC-generation can be implemented as a set of lemmas. Proving the lemmas verifies the VC-gen implementation.
Example lemma for symbolic evaluation (in Coq syntax)
• If the program starts with a read from location x: first prove that x is initialized (ptsto x v i), then proceed to prove the spec of the continuation.
• Other lemmas are similar (evals_bind_write, evals_bind_new…)
• The applicable lemma can be determined by a tactic.
Lemma evals_bind_read :
  forall (A B : Type) (x : loc) (v : A) (p2 : A -> heap -> Prop)
         (q2 : A -> B -> heap -> heap -> Prop) (i : heap) (q : B -> heap -> Prop),
    ptsto x v i ->
    (p2 v i /\ forall y m, q2 v y i m -> q y m) ->
    (bind_pre (read_pre A x) (read_post A x) p2 i /\
     forall y m, bind_post (read_pre A x) (read_post A x) p2 q2 y i m -> q y m).
Separation Logic
Large footprints in Hoare Logic
• Let inc:
• Q: What is known after inc runs in a heap with locations x and y?
• A: Only that x ↦ v+1; all information about y is lost.
• The spec should explicitly say that y is not changed. This is possible to write in ST, but quite inconvenient.
Small footprints and Separation Logic
• Specs should only describe what the program changes [O’Hearn,Reynolds,Pym,…]
• If e : STsep{P}x:A{Q}, then e can run in any heap containing a subheap i such that P i. It either diverges, or returns a subheap m such that Q i m. The part of the initial heap outside i is not accessible.
• Easier to use than large footprints, but the meta-theory is more difficult.
Separation logic adds two new things:
• Separating conjunction (easily definable in HTT):
(P * Q) holds of heap h iff P and Q hold of disjoint parts of h
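• Frame rule of inference (in its standard Separation Logic form): if {P} e {Q}, then {P * R} e {Q * R}, for any R whose free variables e does not modify.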
• Can we add Frame rule to HTT? How to prove that Frame is sound?
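The definition of separating conjunction can be made concrete with a small Python sketch over finite heaps represented as dictionaries. It enumerates every way of splitting a heap into two disjoint parts, which is exponential and only meant for illustration. The helper names (splits, star, points_to, emp) are hypothetical, and points_to is taken in the exact sense: it holds only of the singleton heap.

```python
from itertools import combinations


def splits(h):
    """Yield every pair (h1, h2) of disjoint heaps whose union is h."""
    locs = list(h)
    for k in range(len(locs) + 1):
        for left in combinations(locs, k):
            h1 = {l: h[l] for l in left}
            h2 = {l: h[l] for l in locs if l not in left}
            yield h1, h2


def star(P, Q):
    # (P * Q) holds of h iff P and Q hold of disjoint parts of h.
    return lambda h: any(P(h1) and Q(h2) for h1, h2 in splits(h))


def points_to(x, v):
    # Holds exactly of the one-cell heap in which x stores v.
    return lambda h: h == {x: v}


def emp(h):
    # Holds exactly of the empty heap.
    return h == {}
```

For instance, star(points_to(1, 5), points_to(2, 7)) holds of {1: 5, 2: 7}, while star(points_to(1, 5), points_to(1, 5)) does not hold of {1: 5}, because the single cell cannot be split into two disjoint parts that both contain it.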
Employ a type-theoretic idea to expedite…
• Impose that well-typed programs must satisfy Frame!
• Define new monad STsep, over ST:
• Then re-type the stateful commands, using rule of consequence.
Programs remain the same, but specs become much simpler
• Example: allocation
The empty subheap is consumed and replaced by r ↦ v; r must be fresh (as new can’t access existing state).
• Example: deallocation
The subheap x ↦ - is consumed and replaced by empty.
• Analogy with linear logic.
• Now (fst f) 0 replaces empty from the precondition.
• Meaning: the initial heap is extended with x ↦ 0.
STsep monad correctly handles private state
Meta-theoretic properties: soundness, compositionality, equations
Verification in HTT reduces to typechecking
• Theorem: If e:ST{P}r:A{Q}, then e evaluates as expected.
• Proved via Preservation and Progress lemmas, but these are much more demanding!
• Preservation: evaluation preserves types, normal forms, and postconditions. E.g., if e:ST{T}r:int{r = 55}, then e does produce 55.
• Progress demands soundness of the assertion logic. This requires a denotational model for HTT.
Type checking is syntax directed
• Program properties are independent of context. No need for whole-program reasoning. Proofs are by induction on program structure.
• A program is a proof of its spec: in the pure case, by Curry-Howard; in the impure case, by weakest pre/strongest post.
• Formal statements of compositionality: in the pure case, substitution principles; in the impure case, Hoare’s rule of composition.
Denotational models
• The denotation for e : ST{P}x:A{Q x} is a predicate transformer that: takes p : heap → Prop such that ∀h. p h → P h; returns q : A → heap → Prop such that ∀x h. q x h → ∃i. p i ∧ Q x i h; is monotone.
• The model suffices for soundness, but is too large: e.g., it does not support storing monads into heaps, and it requires showing monotonicity before taking fix.
• Better: a realizability model [Petersen,Birkedal’08]. But it is not implemented in Coq, and doing so seems very hard!
Implementation, related work, future work, summary
Summary
• HTT reflects effect information into types via Hoare-style pre/post conditions. It is a generalization of monadic type-and-effect systems, but the effect annotations are logical predicates over heaps.
• Types determine in which context a program may be used (in a context satisfying the precondition). This is a uniquely type-theoretic property, generalizing ordinary Hoare Logics.
• Combines usefully with the higher-order features of a type theory like Coq to represent modes of use of state, like: freshness, aliasing, ownership (via Separation Logic); higher-order and shared local state (via existential abstraction).
Related work
• Extended static checking: ESC/Java, JML, Spec#, SPlint, Cyclone, Sage. Hoare-like annotations verified during typechecking. Restrictive strategies for dealing with undecidability.
• Dependent types and effects: [Augustson’98], [Mandelbaum’03], [Zhu,Xi’05], [Shao’05], [Sheard’05], [Westbrook’06], [Taha’07], [Condit’07]. Programs and specs cannot share pure code (phase separation).
• Hoare Logics for higher-order functions: [Schoeder’02], [Honda’05], [Krishnaswami’06], [Birkedal’04]. Simply-typed underlying languages (with effects). Hoare triples do not integrate into types.
HTT in comparison to related work.
[Figure: chart with axes “Spec expressiveness” and “Programming features”. Entries: typed lambda calculus; Java, C#, Haskell, O’Caml; dependent type theory (Coq, Epigram, NuPRL…); Hoare specs (ESC, JML, Spec#, Cyclone); light dependent types (Cayenne, DML, ATS, Omega); HTT; “Fully verified software”.]
Future work: gain more experience with implementation in Coq
• A lot of scaffolding for verification is in place: symbolic evaluation lemmas, and tactics for Separation Logic reasoning (these were tricky to nail down at first; several wrong starts).
• Getting ready to attack larger programs. Probably start with libraries for imperative data structures.
• Largest so far: Hash-table module, Stack module, Parsing combinators.
• Experience is encouraging: the proofs/code ratio is quite large, but the proofs were not difficult.
Future work: other effects
• First attempts at formulating a Haskell-style monad for transactional concurrency. Separate state into private and shared. Reasoning as in O’Hearn’s concurrent separation logic. The Hoare type is a 4-tuple STM{I}{P}x:A{Q}, where I is the invariant of the shared state.
• Other notions of concurrency? Auxiliary variables, history/prophecy variables? Predicate transformers for concurrency?
• IO monad? Specifications must be limited to statements that are invariant against outside changes to the world.
• Continuation monad? (first attempts made)
Future work: better models and axiomatizations
• Can we encode equality over effectful code as some reasonable judgment?
• Without having to implement involved categorical models.
Hopefully in future not too far, far away…
[Figure: chart of “Spec expressiveness” against “Programming features” with the same entries as in the comparison to related work, including HTT and “Fully verified software”.]