using smt solvers for program analysis · •convert each occurrence of async-call to call •make...
TRANSCRIPT
![Page 1: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/1.jpg)
Using SMT solvers for program analysis Shaz Qadeer Research in Software Engineering Microsoft Research
![Page 2: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/2.jpg)
Satisfiability modulo theories
(a c) (b c) (a b c)
c = true b = true a = true
(a c) (b c) (a b c) a f(x-y) = 1 b f(y-x) = 2 c x = y
c = false, b = true, a = true, x = 0, y = 1, f = [-1 1, 1 2, else 0]
![Page 3: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/3.jpg)
Communicating theories
f(x – y) = 1, f(y-x) = 2, x = y
f(p) = q, f(r) = s, x = y p = x – y, q = 1, r = y – x, s = 2
x = y
p = r
q = s
UNSAT
![Page 4: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/4.jpg)
Applications
• Symbolic execution – SAGE – PEX
• Static checking of code contracts
– Spec# – Dafny – VCC
• Security analysis – HAVOC
• Searching program behaviors – Poirot
![Page 5: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/5.jpg)
Anatomy of an application
• The profile of each application determined by
– Boolean structure
– theories used
– theory vs. propositional
– deep vs. shallow
– presence/absence of quantifiers
– …
![Page 6: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/6.jpg)
Applications
• Symbolic execution – SAGE – PEX
• Static checking of code contracts
– Spec# – Dafny – VCC
• Security analysis – HAVOC
• Searching program behaviors – Poirot
![Page 7: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/7.jpg)
* BoogiePL
Z3
Boogie VCGen
C/.NET/Dafny Program
BoogiePL program
Verification condition
Verified Warning
Annotations
SMT in program analysis
![Page 8: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/8.jpg)
class C { int size; int[] data; void write(int i, int v) {
if (i >= data.Length) { var t = new int[2*i]; copy(data, t); data = t; } data[i] = v; } static copy(int[] from, int[] to) { for (int i = 0; i < from.Length; i++) { to[i] = from[i]; } } }
var size: [Ref]int; var data: [Ref]Ref; var Contents: [Ref][int]int function Length(Ref): int; proc write(this: Ref, i: int, v: int) { var t: Ref; if (i >= Length(data)) { call t := alloc(); assume Length(t) == 2*i; call copy(data[this], t); data[this] := t; } assert 0 <= i && i < Length(data[this]); Contents[data[this]][i] := v; } proc copy(from: Ref, to: Ref) { var i: int; i := 0; while (i < Length(from)) { assert 0 <= i && i < Length(from); assert 0 <= i && i < Length(to); Contents[to][i] := Contents[from][i]; i := i + 1; } }
![Page 9: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/9.jpg)
Modeling the heap
Contents[data[this]][i] := v
Contents[Select(data, this)][i] := v
Contents[Select(data, this)] := Update(Contents[Select(data, this)], i, v)
Contents := Update(Contents, Select(data, this), Update(Contents[Select(data, this)], i, v))
Theory of arrays: Select, Store for all f, i, v :: Select(Update(f, i, v), i) = v for all f, i, v, j :: i = j Select(Update(f, i, v), j) = Select(f, j)
for all f, g :: f = g (exists i :: Select(f, i) Select(g, i))
var Alloc: [Ref]bool; proc alloc() returns (x: int) { assume !Alloc[x]; Alloc[x] := true; }
![Page 10: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/10.jpg)
Program correctness
• Floyd-Hoare triple {P} S {Q}
P, Q : predicates/property
S : a program
• From a state satisfying P, if S executes, – No assertion in S fails, and
– Terminating executions end up in a state satisfying Q
![Page 11: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/11.jpg)
Annotations
• Assertions over program state • Can appear in
– Assert – Assume – Requires – Ensures – Loop invariants
• Program state can be extended with ghost variables – State of a lock – Size of C buffers
![Page 12: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/12.jpg)
Weakest liberal precondition
wlp( assert E, Q ) = E Q wlp( assume E, Q ) = E Q wlp( S;T, Q ) = wlp(S, wlp(T, Q)) wlp( if E then S else T, Q ) = if E then wlp(S, Q) else wlp(T, Q) wlp( x := E, Q ) = Q[E/x] wlp( havoc x, Q ) = x. Q
![Page 13: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/13.jpg)
Desugaring loops
– inv J while B do S end
• Replace loop with loop-free code: assert J; havoc modified(S); assume J; if (B) { S;
assert J; assume false; }
Check J at entry
Check J is inductive
![Page 14: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/14.jpg)
Desugaring procedure calls
• Each procedure verified separately • Procedure calls replaced with their specifications
procedure Foo(); requires pre; ensures post; modifies V;
call Foo(); assert pre; havoc V; assume post;
precondition
postcondition
set of variables possibly modified in Foo
![Page 15: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/15.jpg)
Inferring annotations
• Problem statement – Given a set of procedures P1, …, Pn – A set of C of candidate annotations for each procedure – Returns a subset of the candidate annotations such that
each procedure satisfies its annotations
• Houdini algorithm
– Performs a greatest-fixed point starting from all annotations • Remove annotations that are violated
– Requires a quadratic (n * |C|) number of queries to a modular verifier
![Page 16: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/16.jpg)
Limits of modular analysis
• Supplying invariants and contracts may be difficult for developers
• Other applications may be enabled by whole program analysis
– Answering developer questions: how did my program get to this line of code?
– Crash-dump analysis: reconstruct executions that lead to a particular failure
![Page 17: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/17.jpg)
Reachability modulo theories Variables: X
T1(X,X’) T2(X,X’)
T3(X,X’) T4(X,X’)
T5(X,X’) T6(X,X’)
Ti(X, X’) are transition predicates for transforming input state X to output state X’ • assume satisfiability for Ti(X, X’) is “efficiently”
decidable Is there a feasible path from blue to orange node?
Parameterized in two dimensions • theories: Boolean, arithmetic, arrays, … • control flow: loops, procedure calls, threads, …
T8(X,X’) T7(X,X’)
![Page 18: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/18.jpg)
Complexity of (sequential) reachability-modulo-theories
• Undecidable in general
– as soon as unbounded executions are possible
• Decidable for hierarchical programs
– PSPACE-hard (with only Boolean variables)
– NEXPTIME-hard (with uninterpreted functions)
– in NEXPTIME (if satisfiability-modulo-theories in NP)
![Page 19: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/19.jpg)
Corral: A solver for reachability-modulo-theories
• Solves queries up to a finite recursion depth
– reduces to hierarchical programs
• Builds on top of Z3 solver for satisfiability-modulo-theories
• Design goals
– exploit efficient goal-directed search in Z3
– use abstractions to speed-up search
– avoid the exponential cost of static inlining
![Page 20: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/20.jpg)
Corral architecture for sequential programs
Input Program
Abstract Program
Variable Abstraction
Stratified Inlining Z3
True counter-example
?
Unreachable
Reachable Hierarchical Refinement
Yes No
Z3
![Page 21: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/21.jpg)
Corral architecture for sequential programs
Input Program
Abstract Program
Variable Abstraction
Stratified Inlining Z3
True counter-example
?
Unreachable
Reachable Hierarchical Refinement
Yes No
Z3
![Page 22: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/22.jpg)
Corral architecture for sequential programs
Input Program
Abstract Program
Variable Abstraction
Stratified Inlining Z3
True counter-example
?
Unreachable
Reachable Hierarchical Refinement
Yes No
Z3
![Page 23: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/23.jpg)
Handling concurrency
Input Program
Abstract Program
Variable Abstraction
Stratified Inlining Z3
True counter-example
?
Unreachable
Reachable Hierarchical Refinement
Yes No
Z3
Sequentialization
![Page 24: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/24.jpg)
What is sequentialization?
• Given a concurrent program P, construct a sequential program Q such that Q P
• Drop each occurrence of async-call
• Convert each occurrence of async-call to call
• Make Q as large as possible
![Page 25: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/25.jpg)
Parameterized sequentialization
• Given a concurrent program P, construct a family of programs Qi such that
– Q0 Q1 Q2 … P
– iQi = P
• Even better if interesting behaviors of P manifest in Qi for low values of i
![Page 26: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/26.jpg)
Context-bounding
• Captures a notion of interesting executions in concurrent programs
• Under-approximation parameterized by K ≥ 0
– executions in which each thread gets at most K contexts to execute
– as K , we get all behaviors
![Page 27: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/27.jpg)
Context-bounding is sequentializable
• For any concurrent program P and K ≥ 0, there is a sequential program QK that captures all executions of P up to context bound K
• Simple source-to-source transformation
– linear in |P| and K
– each global variable is copied K times
![Page 28: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/28.jpg)
Challenges
![Page 29: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/29.jpg)
Programming SMT solvers
• Little support for decomposition
– Floyd-Hoare is the only decomposition rule
• Little support for abstraction
– SMT solvers are a black box
– difficult to influence search
• How do we calculate program abstractions using an SMT solver?
![Page 30: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/30.jpg)
Mutable dynamically-allocated memory
• Select-Update theory is expensive
• Select-Update theory is not expressive enough
– to represent heap shapes
– to encode frame conditions
![Page 31: Using SMT solvers for program analysis · •Convert each occurrence of async-call to call •Make Q as large as possible . Parameterized sequentialization •Given a concurrent program](https://reader034.vdocuments.us/reader034/viewer/2022050515/5f9ef43c48f5001cd46f1d11/html5/thumbnails/31.jpg)
Quantifiers
• Appear due to
– partial axiomatizations
– frame conditions
– assertions
• Undecidable in general
• A few decidability results
– based on finite instantiations
– brittle