STP: A Decision Procedure for Bit-vectors and Arrays
David L. Dill, Stanford University
Software analysis tools present unique challenges for decision procedures
● Theories must match programming language semantics
  – Operations are on bit-vectors, not integers
  – Arrays (for modelling memories)
● Must handle very large inputs with
  – Many array reads
  – Deeply nested array writes
  – Many linear equations
  – Many variables
● Decision procedure is called many times.
What went before
● Series of decision procedures: SVC, CVC, CVCL
● All of these had combinations of first-order theories
  – Equality
  – Uninterpreted functions and predicates
  – Boolean connectives
  – Linear arithmetic over real numbers (and integers, in the case of CVCL)
  – But not quantifiers.
● CVCL was in use in EXE (or its predecessor; Dawson Engler's research group)
Combination of theories
● The core strategy of SVC, CVC, CVCL was based on dynamically breaking down formulas into conjunctions of “atomic formulas”
  – Atomic formulas have no Boolean connectives (correspond to propositional variables).
  – Recursively assert/deny alpha (deny = assert negation of)
  – Simplify after assertion
  – When simplified formula is conjunction of literals, use special decision procedures.
SAT vs. CVCL
● CVC/CVCL used a Chaff-like SAT solver to choose splitting variables
● … but put lots of slow stuff in the inner loop!
STP
● CVCL was already used by Engler’s group, but was unfixably slow.
  – There existed many examples generated by Engler
  – Made correctness/performance testing easy.
● Vijay Ganesh and I decided to try a different approach, inspired by UCLID (Seshia & Bryant).
  – Put SAT at the bottom, with unmodified inner loop.
  – Preprocess formula for higher-level reasoning (bit-vectors and arrays).
STP
● Decides satisfiability of formulas over
  – Bit-vector Terms
    ● Constants
    ● +, -, *, (signed) div, (signed) mod
    ● Concatenation, Extraction
    ● Left/Right Shift, Sign-extend, bitwise-Booleans
  – Array Terms
    ● Read(Array, index)
    ● Write(Array, index, val)
    ● But no array equality
  – Predicates: =, signed & unsigned comparisons
● If satisfiable, produces a model.
Comparison with Saturn
● STP is a separate “component” (can be stand-alone, or used through an API).
  – Programming language or other tool is separate
● General input language (can define 23-bit bit-vector types if you want).
● Signed/unsigned encoded in operators, not in data types.
● No “points to”, heap, etc.
● Implements signed/unsigned multiply, divide, remainder (but no floating point).
Some projects using STP
● STP has evolved (maintained by someone I don’t know in Australia)
● Several projects have used it.
  – EXE: Bug Finder by Dawson Engler, Cristian Cadar and others at Stanford
  – Klee: Cadar, Dunbar, Engler
  – MINESWEEPER: Bug Finder by Dawn Song and her group at CMU
  – …
Main Ideas of STP
● Eager translation to CNF with word-level pre-processing
  – Theories not in “inner loop” of SAT solver, unlike Nelson-Oppen approaches (e.g. CVCL).
● Abstraction-Refinement for arrays.
  – Laziness to counterbalance eagerness
● Solve linear formulas mod $2^n$ in P-time
Bitvector theory
● Data type: BV(n), where n is constant.
● Almost all machine bitvector operations
  – Change length (sign extended and not).
  – Concatenate bitvectors, extract bits from BVs.
  – Signed and unsigned arithmetic: +, -, *, /, %, <, >, etc.
  – AND, OR, NOT, XOR, etc.
Array theory
● Array type: BV(n) -> BV(m)
● read(A, i) – value of A[i]
● write(A, i, v) – copy of A updated at index i with value v.
● No destructive modification – write returns a new array, which is the old array with one update.
● Sometimes used to represent heap storage.
Array theory
Identities: read(write(A, i, v), j) = ite(i = j, v, read(A, j))
STP has a limited theory: no comparison of whole arrays, e.g. write(A, i, v) = write(A, j, w)
This makes things easier (see http://sprout.stanford.edu/PAPERS/LICS-SBDL-2001.pdf if you don’t like “easier”).
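The read-over-write identity can be checked on a toy model. The sketch below is an illustration only, not STP code: it assumes arrays are Python dictionaries, that unwritten cells read as 0, and the helper names `read`, `write`, and `ite` are chosen to mirror the slide.

```python
def write(A, i, v):
    """Return a new array equal to A except at index i; A itself is untouched."""
    B = dict(A)
    B[i] = v
    return B


def read(A, i):
    """Value stored at index i (unwritten cells default to 0 in this sketch)."""
    return A.get(i, 0)


def ite(cond, then_val, else_val):
    """If-then-else on already evaluated values."""
    return then_val if cond else else_val


A = {0: 7, 1: 9}
for i in range(3):
    for j in range(3):
        # read(write(A, i, v), j) = ite(i = j, v, read(A, j))
        assert read(write(A, i, 5), j) == ite(i == j, 5, read(A, j))
assert A == {0: 7, 1: 9}  # write did not modify the original array
```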
Implementation
● DAG representation of expressions
  – Same subexpression structure = same pointer.
  – Maintained by hashing.
  – No destructive operations on DAGs (modification requires new nodes).
● Makes substitution, equality check very efficient.
● Often log size of tree expression representation.
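"Same structure = same pointer, maintained by hashing" is commonly called hash-consing. A minimal sketch, assuming a global table and a hypothetical constructor `mk` (neither name comes from STP):

```python
class Node:
    """Immutable expression node: an operator name plus child nodes."""
    __slots__ = ("op", "kids")

    def __init__(self, op, kids):
        self.op, self.kids = op, kids


_table = {}  # (op, child identities) -> the unique node with that structure


def mk(op, *kids):
    """Return the unique node for (op, kids), creating it only if needed."""
    # Children are already unique, so their identities describe the structure.
    # Nodes stay alive in _table, so id() values are never reused.
    key = (op,) + tuple(id(k) for k in kids)
    node = _table.get(key)
    if node is None:
        node = _table[key] = Node(op, kids)
    return node


a = mk("+", mk("x"), mk("y"))
b = mk("+", mk("x"), mk("y"))
assert a is b  # structural equality becomes a pointer comparison
```

Because construction always goes through the table, an equality check is a single identity test, which is what makes substitution and comparison cheap.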
DAGs
● All recursive traversals must be “memoized”
  – Want traversal to be linear in size of DAG, not tree.
  – First thing to think about when functions don’t finish: “Maybe I messed up memoization.”
● Updating nodes near the root less expensive than updating nodes near leaves.
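To see why memoization matters, the sketch below builds a heavily shared expression and measures it two ways. It assumes nodes are plain tuples `(op, child, ...)` and memoizes on object identity; this is an illustration, not STP's data structure.

```python
def dag_size(root):
    """Number of distinct nodes, visiting each shared node once."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        stack.extend(node[1:])  # children follow the operator name
    return len(seen)


def tree_size(root, memo=None):
    """Size of the fully unfolded tree, computed in time linear in the DAG."""
    memo = {} if memo is None else memo
    if id(root) not in memo:
        memo[id(root)] = 1 + sum(tree_size(kid, memo) for kid in root[1:])
    return memo[id(root)]


e = ("x",)
for _ in range(40):
    e = ("+", e, e)  # both children are the same shared node

assert dag_size(e) == 41
assert tree_size(e) == 2**41 - 1
```

Without the `memo` dictionary, `tree_size` would make about 2^41 recursive calls on this input, which is the "function doesn't finish" symptom the slide warns about.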
STP Architecture
[Figure: block diagram with boxes Input Formula, Substitutions, Simplifications, Linear Solving, Array Abstraction, BitBlast, CNF Conversion, SAT Solver, and Result, plus a Refinement Loop.]
Substitution is Important
● Inputs often have many simple equations (in EXE, this is how constant arrays are defined):
  – x = 4
  – A[4] = e
● Early pass to substitute these
  – Allows constant evaluation
  – Enables other optimizations
  – Reduces non-constant indices in array reads.
Word-level simplifications
● Many simple local rewrites:
  – Bitwise Boolean identities (e.g. a AND !a = 00000, a XOR !a = 11111, (a + b)[0:3] = a[0:3] + b[0:3], etc.)
  – Generally, avoid distributive laws because they cause blow-up.
  – Be careful about “destroying sharing” in DAG.
● Flatten trees of associative operators
● Sort operands of commutative operations
  – Tweak ordering for easy simplification
  – Expressions numbered in order of creation (children < parents). Use this order to sort, but:
  – Put constants first (F AND a AND …), (1 + 3 + x + …)
  – Arrange for x, !x to be adjacent (also x, -x)
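Flattening, sorting, and constant folding fit together as sketched below for bit-vector addition. The representation is an assumption made for illustration: constants are Python ints, variables are strings, and sums are tuples `("+", ...)`. Variables are sorted by name here as a stand-in for the creation-order numbering the slide describes.

```python
def simplify_sum(term, bits):
    """Flatten nested '+' terms, fold constants mod 2**bits, sort the rest."""
    mod = 1 << bits
    const, rest = 0, []
    stack = [term]
    while stack:
        t = stack.pop()
        if isinstance(t, tuple) and t[0] == "+":
            stack.extend(t[1:])        # flatten the associative operator
        elif isinstance(t, int):
            const = (const + t) % mod  # constant evaluation
        else:
            rest.append(t)             # a variable name
    rest.sort()
    operands = ([const] if const else []) + rest  # constant goes first
    if not operands:
        return 0
    if len(operands) == 1:
        return operands[0]
    return ("+",) + tuple(operands)


assert simplify_sum(("+", "x", ("+", 3, ("+", "a", 1))), 8) == ("+", 4, "a", "x")
```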
Array Reads
Replace array reads with fresh variables and axioms:
Read(A,i0) = t0
Read(A,i1) = t1
...
Read(A,in) = tn
becomes
v0 = t0
v1 = t1
...
vn = tn
(i1=i0) => v1=v0
(i2=i0) => v2=v0
(i2=i1) => v2=v1
…
• Problem: $O(n^2)$ axioms added, n is number of read indices
• Lethal, if n is large: n = 10000, # of axioms: ~ 100 million
• Blowup seems hard to avoid (e.g. UCLID).
• This is “aliasing” from another perspective.
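The quadratic growth is easy to reproduce. This sketch generates the fresh variables and pairwise axioms as strings for the reads of one array; the function name `abstract_reads` and the string encoding are illustrative assumptions.

```python
from itertools import combinations


def abstract_reads(indices):
    """Return (fresh variable per index, pairwise consistency axioms)."""
    fresh = {ix: f"v{k}" for k, ix in enumerate(indices)}
    axioms = [
        f"({later}={earlier}) => {fresh[later]}={fresh[earlier]}"
        for earlier, later in combinations(indices, 2)
    ]
    return fresh, axioms


fresh, axioms = abstract_reads(["i0", "i1", "i2"])
assert axioms == [
    "(i1=i0) => v1=v0",
    "(i2=i0) => v2=v0",
    "(i2=i1) => v2=v1",
]
```

One axiom per unordered pair of indices gives n(n-1)/2 axioms, so n = 10000 yields roughly 50 million, the same order of magnitude as the figure quoted on the slide.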
Abstraction/Refinement in STP
● Pose problem as a conjunction of formulas
  – E.g., instantiated array read axioms
● Abstraction: solve for a proper subset of the formulas
  – E.g., omit array read axioms
● Early exit if:
  – Unsatisfiable
  – Satisfiable and model actually satisfies unabstracted formula.
● Otherwise, add some omitted formulas to the abstracted formula and solve again.
  – If at least one of these formulas is false in the model, that model will not be regenerated.
Abstraction-Refinement Algorithm for Array Reads
Input: Read(A,0)=0, Read(A,i)=1
After abstraction: v0 = 0, vi = 1
SAT solver returns: i = 0, v0 = 0, vi = 1
Check input on assignment: False (counterexample): Read(A,0)=0, Read(A,0)=1
Refinement step: add axiom (i=0) => vi = 0
Rerun SAT solver: i=1, vi=1
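The loop can be sketched end to end on this example. Everything here is a toy under stated assumptions: the "SAT solver" is a brute-force search over a tiny domain, formulas are Python predicates over a model dictionary, and the names `solve` and `refine_loop` are hypothetical.

```python
from itertools import product


def solve(constraints, variables, domain):
    """Brute-force stand-in for a SAT solver: first satisfying model or None."""
    for values in product(domain, repeat=len(variables)):
        model = dict(zip(variables, values))
        if all(c(model) for c in constraints):
            return model
    return None


def refine_loop(core, axioms, variables, domain):
    """Solve without the axioms, then add only those the model violates."""
    active, pending, rounds = list(core), list(axioms), 0
    while True:
        rounds += 1
        model = solve(active, variables, domain)
        if model is None:
            return None, rounds            # unsatisfiable even when abstracted
        violated = [a for a in pending if not a(model)]
        if not violated:
            return model, rounds           # model satisfies the full formula
        active += violated                 # refinement step
        pending = [a for a in pending if a not in violated]


# Read(A,0)=0 and Read(A,i)=1, abstracted to v0=0 and vi=1.
core = [lambda m: m["v0"] == 0, lambda m: m["vi"] == 1]
# Omitted read axiom: (i=0) => vi=v0.
axioms = [lambda m: m["i"] != 0 or m["vi"] == m["v0"]]

model, rounds = refine_loop(core, axioms, ["i", "v0", "vi"], (0, 1))
assert model == {"i": 1, "v0": 0, "vi": 1}
assert rounds == 2
```

The first round returns the aliasing model i = 0, the violated axiom is added, and the second round finds i = 1, matching the trace above.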
Experience with Read Abstraction-Refinement
Works well, even in satisfiable cases
  – Satisfier often finds a model that minimizes aliasing
  – Few axioms need to be added during refinement
  – Typical number of refinement loops: < 3
The Problem with Array Writes
● Standard transformation read(write(A, i, v), j) = ite(i=j, v, read(A, j)): the “if-then-else” causes term blow-up
● Many different read expressions share write sub-terms.
● O(n*m) blow-up in expr DAG
  – n is write term nesting levels
  – m is number of read indices
The Problem with Array Writes
R(W(W(A,i0,v0),i1,v1),j) = R(W(W(A,i0,v0),i1,v1),k)
becomes
(if (i1=j) v1 elsif (i0=j) v0 else R(A,j)) = (if (i1=k) v1 elsif (i0=k) v0 else R(A,k))
[Figure: expression DAGs before and after the transformation. Before, both reads share the nested write nodes; after, each read index has its own chain of ite nodes.]
Write transformation
Replace read(write(A, i, v), j) with a fresh variable (e.g., v0) and “axiom” v0 = ite(i=j, v, read(A, j))
Abstraction omits axiom.
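As a rough sketch of this rewrite on a term representation: terms are assumed to be tuples such as `("read", array, index)` and `("write", array, index, value)`, fresh variables are named `w0`, `w1`, ..., and nested writes are handled by recursion. The nesting treatment and all names are assumptions for illustration, not a description of STP's implementation.

```python
import itertools

_fresh_ids = itertools.count()


def abstract_read_over_write(term, axioms):
    """Replace read(write(A,i,v), j) by a fresh variable plus a recorded axiom."""
    op, arr, j = term
    assert op == "read"
    if isinstance(arr, tuple) and arr[0] == "write":
        _, inner, i, v = arr
        # Abstract the read of the inner array first (handles nested writes).
        rest = abstract_read_over_write(("read", inner, j), axioms)
        fresh = f"w{next(_fresh_ids)}"
        axioms.append(("=", fresh, ("ite", ("=", i, j), v, rest)))
        return fresh
    return term  # plain read of an array variable: nothing to do


axioms = []
nested = ("write", ("write", "A", "i0", "v0"), "i1", "v1")
top = abstract_read_over_write(("read", nested, "j"), axioms)
assert len(axioms) == 2             # one axiom per write level
assert axioms[1][1] == top          # the outer axiom defines the result
assert axioms[1][2][3] == axioms[0][1]  # and refers to the inner fresh variable
```

The abstraction step would then solve with `axioms` left out and add them back only when a model violates them.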
Abstraction-Refinement Algorithm for Array Writes
Input: R(W(A,i,v),j) = 0, R(W(A,i,v),k) = 1, i = j /= k, v /= 0
After abstraction: v1 = 0, v2 = 1, i = j /= k, v /= 0
SAT solver returns: v1 = 0, v2 = 1, i = j = 0, k = 1, v = 1
Check model on original formula: False: R(W(A,0,1),0) = 0, R(W(A,0,1),1) = 1, 0 = 0 /= 1, 1 /= 0
Refinement step: add axiom to SAT: v1 = ite(i=j, v, R(A,j))
Result: UNSAT
Experimental Results: Array Writes

| Example (size in unique nodes) | Result | Write Abstraction (sec) | No Write Abstraction (sec) |
|---|---|---|---|
| Grep0084 (69K) | Sat | 109 | 506 |
| Grep0106 (69K) | Sat | 270 | TimeOut |
| Grep0117 (70K) | Sat | 218 | TimeOut |
| 610dd9dc (15k) | Sat | 188 | 101 |
| Testcase20 (1.2M) | Sat | 67 | Memory Out |

Examples courtesy Dawn Song (CMU) and David Molnar (Berkeley)
Algorithm for Solving Linear Bit-vector Equations
● Inspired by Barrett et al., DAC 1998
● Basic Idea in STP
  – Solve for a variable and substitute it away
  – If cannot eliminate a whole variable, eliminate as many bits as possible.
● Previous Work
  – Mostly variants of Gaussian Elimination
  – Solve-and-substitute is more convenient in a general decision procedure.
Algorithm for Solving Linear Bit-vector Equations
(3 bits)
3x + 4y + 2z = 0
2x + 2y + 2 = 0
4y + 2x + 2z = 0
Solve for x in first eqn: $3^{-1} \bmod 8 = 3$, so x = 4y + 2z
Substitute x:
(3 bits)
2y + 4z + 2 = 0
4y + 6z = 0
Algorithm for Solving Linear Bit-vector Equations
(3 bits)
2y + 4z + 2 = 0
4y + 6z = 0
All coefficients even, so no inverse. Divide by 2 and ignore high-order bits:
(2 bits)
y[1:0] + 2z[1:0] + 1 = 0
2y[1:0] + 3z[1:0] = 0
Algorithm for Solving Linear Bit-vector Equations
(2 bits)
y[1:0] + 2z[1:0] + 1 = 0
2y[1:0] + 3z[1:0] = 0
Solve for y[1:0]:
(2 bits)
y[1:0] = 2z[1:0] + 3
Substitute y[1:0]:
(2 bits)
3z[1:0] + 2 = 0
Algorithm for Solving Linear Bit-vector Equations
(2 bits)
3z[1:0] + 2 = 0
Solve for z[1:0]:
(2 bits)
z[1:0] = 2
Solution (3 bits):
z[1:0] = 2
y[1:0] = 2z[1:0] + 3 = 3
y = y’ @ 3
z = z’ @ 2
x = 4y + 2z
(y’ and z’ are the remaining free high-order bits; @ is concatenation.)
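The three moves used in this worked example (invert an odd coefficient, substitute, halve when everything is even) can be sketched as follows. An equation is assumed to be a coefficient dictionary plus a constant, meaning sum(coeff * var) + const = 0 mod 2^bits; the function names are hypothetical, and `pow(c, -1, m)` needs Python 3.8 or later.

```python
def solve_for(eq, const, var, bits):
    """Rewrite the equation as var = sum(rhs[v] * v) + rhs_const (mod 2**bits)."""
    mod = 1 << bits
    inv = pow(eq[var], -1, mod)  # raises ValueError if the coefficient is even
    rhs = {v: (-inv * c) % mod for v, c in eq.items() if v != var}
    return rhs, (-inv * const) % mod


def substitute(eq, const, var, rhs, rhs_const, bits):
    """Eliminate var from an equation using var = rhs + rhs_const."""
    mod = 1 << bits
    c = eq.get(var, 0)
    out = {v: k for v, k in eq.items() if v != var}
    for v, k in rhs.items():
        out[v] = (out.get(v, 0) + c * k) % mod
    return out, (const + c * rhs_const) % mod


def halve(eq, const, bits):
    """All coefficients even: divide by 2 and drop one high-order bit."""
    assert const % 2 == 0 and all(k % 2 == 0 for k in eq.values())
    return {v: k // 2 for v, k in eq.items()}, const // 2, bits - 1


# 3x + 4y + 2z = 0 (3 bits): solve for x.
x_rhs, x_const = solve_for({"x": 3, "y": 4, "z": 2}, 0, "x", 3)
assert (x_rhs, x_const) == ({"y": 4, "z": 2}, 0)  # x = 4y + 2z

# Substitute x into 2x + 2y + 2 = 0, then halve.
e2 = substitute({"x": 2, "y": 2}, 2, "x", x_rhs, x_const, 3)
assert e2 == ({"y": 2, "z": 4}, 2)
assert halve(*e2, 3) == ({"y": 1, "z": 2}, 1, 2)  # y + 2z + 1 = 0 (2 bits)
```

Solving mod 2^n this way only ever inverts odd coefficients, since those are exactly the units of the ring; the halving step is what lets the procedure still eliminate low-order bits when no odd coefficient is available.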
Experimental Results: Solver for Linear Equations

| Name (node size) | Result | All On (sec) | Arr On, Linear Off (sec) | Arr Off, Linear On (sec) | All Off (sec) |
|---|---|---|---|---|---|
| Test15 (0.9M) | Sat | 66 | 192 | 64 | Memory Out |
| Test16 (0.9M) | Sat | 67 | 233 | 66 | Memory Out |
| Thumb1 (3.2M) | Sat | 115 | 111 | 113 | Memory Out |
| Thumb2 (4.3M) | Sat | 1920 | MO | 1920 | Memory Out |
| Thumb3 (2.7M) | Sat | 840 | MO | 840 | Memory Out |
Equivalence checking of block cipher implementations
● Problem: Prove correctness of block ciphers (e.g., AES).
  – Constant number of loop iterations
  – No interesting heap usage
● Approach:
  – Given two implementations, AES1 and AES2
  – Turn them into big expressions by unrolling loops, etc.
  – Prove that AES1(x) ≠ AES2(x) is unsatisfiable.
How can this possibly work?
[Figure: two implementations side by side, each a sequence of Round 1, Round 2, Round 3, Round 4.]
Many block ciphers consist of a fixed sequence of “rounds”.
Implementations of rounds in two algorithms may vary, but bits “between” rounds are equal.
So, we only have to prove individual rounds are equivalent.
Equivalence checking
[Figure: expression DAG over shared inputs, with two nodes marked.]
Equal for many test inputs.
Only try to prove equivalence when nodes pass this test.
Equivalence checking
Prove equal using STP
Equivalence checking
Replace one node by the other everywhere in the DAG.
This makes higher-level expressions more similar.
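The filter-then-prove idea can be sketched on a toy 8-bit "round". Both implementations below are made up for illustration, and the exhaustive search in `counterexample` merely stands in for asking a solver whether f(x) ≠ g(x) is satisfiable; it is not how STP decides the query.

```python
import random


def impl1(x):
    """Rotate an 8-bit value left by one, using shifts."""
    return ((x << 1) | (x >> 7)) & 0xFF


def impl2(x):
    """The same rotation written with arithmetic."""
    return (x * 2) % 256 + x // 128


def passes_random_tests(f, g, trials=64, seed=0):
    """Cheap filter: do f and g agree on a batch of random inputs?"""
    rng = random.Random(seed)
    return all(f(v) == g(v) for v in (rng.randrange(256) for _ in range(trials)))


def counterexample(f, g):
    """Stand-in for the query f(x) != g(x); None means 'unsatisfiable'."""
    return next((x for x in range(256) if f(x) != g(x)), None)


# Only attempt the expensive proof when the cheap test passes.
if passes_random_tests(impl1, impl2):
    assert counterexample(impl1, impl2) is None  # proven equivalent
```

Once two internal nodes are proven equal, one can be substituted for the other, which is what makes the enclosing expressions structurally closer for the next proof.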
Project ideas
● Use abstraction/refinement to improve efficiency of verification with multiplication/division and other ugly operators.
● Hack Klee (open source, based on LLVM, uses STP).
● Extend to floating point.
● Formal verification of crypto functions in “C” (get Eric Smith thesis from me).