introduction to satisfiability modulo theories yakir vizel smt seminar spring 2014

55
Introduction to Satisfiability Modulo Theories Yakir Vizel SMT Seminar Spring 2014

Upload: merryl-hart

Post on 02-Jan-2016

221 views

Category:

Documents


0 download

TRANSCRIPT

Introduction to Satisfiability Modulo Theories

Yakir Vizel

SMT Seminar Spring 2014

Outline

• Background– Propositional logic– DPLL-style SAT solvers

• The SMT World– First order logic– Theories

• SMT Solving– Eager approach– Lazy approach– DPLL(T)

Motivation

int power3(int in) {int i, out_a; out_a = in; for (i=0; i<2; i++) out_a = out_a * in; return out_a; }

int power3_new(int in) {int out_b; out_b = (in * in) * in; return out_b; }

• Are these two programs equal?

• In general, this is an undecidable problem– No sound and complete method

Motivation (2)

out0_a = in ⋀out1_a = out0_a * in ⋀out2_a = out1_a * in

out0_b = (in * in) * in

• Only bounded loops in our example

Need to prove that a⋀ b out2_a = out0_b is a valid formula

Basic Definitions• Assignment to a formula is a mapping

from the variables of to a domain D– (a ⋁ b) c: a -> 1, b -> 0, c -> 1

• Given a formula – it is satisfiable if there exists an assignments

s.t. is evaluated to TRUE– it is contradiction if such an assignment does

not exist– it is a tautology (or valid) if under every

assignment it evaluates to TRUE

Decision Procedure• The Decision Problem for a given formula

is to determine whether is valid

• So clearly, a Decision Procedure… –We want it to be Sound and Complete

• Sound = returns “Valid” when is valid• Complete = terminates, and when is valid, it

returns “Valid”

• A Theory is decidable if there is a decision procedure for it

Formal Reasoning• We want to reason about the validity (or

satisfiability) of a formula

• Model Theoretic approach– Enumerate possible solutions– Need a finite number of candidates

• Proof Theoretic approach– Deductive mechanism– Based on axioms and inference rules

• Forms an inference system

Formal Reasoning• Three statements

– If x is a prime number greater than 2, then x is odd– It it not the case that x is not a prime number

greater than 2– x is not odd

• Prime number greater than 2 – A• Odd number - B

A BAB

Inference Rules

(M.P)

¬(CONTR)FALSE

(DOUBLE NEG)¬¬⇔

Proof Theoretic

1. A B (premise)

A BAB

2.A (premise)

4. B (premise)3. A (2, DOUBLE NEG)

5. B (1+3, M.P)6. FALSE (4+5, CONTR)

(DOUBLE NEG)¬¬⇔

(M.P)

Model Theoretic

A BAB

A B AB (AB) ⋀ A (AB) ⋀ A ⋀ ¬B

0 0 1 0 0

0 1 1 0 0

1 0 0 0 0

1 1 1 1 0

Propositional Logic

• Logic:– A (formal) language for making statements

about objects and reasoning about properties of these objects

• The alphabet:– Countable set of proposition symbols: a, b, c, …– Logical connectives: ⋁ (or), ⋀ (and), ¬

(not/negation), ≣ (equivalence), (implication)• Connectives can be expressed using other

connectives• (a b) is the same as (¬a ⋁ b)

Propositional Logic (2)

• Formula is defined using the following grammar:– formula: formula ⋀ formula | ¬formula | (formula) |

atom– atom: Boolean variable | TRUE | FALSE

• An example to a data structure that uses this grammar: And-Inverter Graph (AIG)

a b c

Normal Forms• Negation Normal Form

– Use only or, and and not– ¬ appears only on proposition symbols

(variables)

• Conjunctive Normal Form– literal: a or a– clause: disjunction of literals– CNF: conjunction of clauses

DPLL-style SAT solvers

• Objective:– Check satisfiability of a CNF formula

• Given a CNF formula, is there a satisfying assignment?

• Approach:– Branch: make arbitrary decisions– Propagate implication graph– Use conflicts to guide inference steps

GRASP, CHAFF, MiniSAT, Glucose

SAT solvers can also generate refutation proofs!

Pseudo CodeDPLL() AddClauses(cnf()); if (BCP() == “confl”) return UnSAT; while (true) do if (!Decide()) return SAT; while(BCP() == “confl”) do bl = Analyze(); if (bl < 0) return UnSAT; else BackTrack(bl);

The Implication Graph (BCP)

(a b) (b c d)

a

c

Decisions

b

Assignment: a b c d

d

Propositional Resolution

a b c a c d

b c d

When a conflict occurs, the implication graph isused to guide the resolution of clauses, so that thesame conflict will not occur again.

Conflict Clauses

(a c) (a b) (b c d) (b d)

b

Assignment: a b c d

d

Conflict!

(b c )

resolve

Conflict!(a c)

resolve

Conflict!

a

c

Decision

Conflict Clauses

c (a c) (a b) (b c d) (b d)

(b c )

(a c)

(c)

( )

b

d

a

c

Generating refutations

• Refutation = a proof of the null clause– Record a DAG containing all resolution steps

performed during conflict clause generation.– When null clause is generated, we can extract a

proof of the null clause as a resolution DAG.

Original clauses

Derived clauses

Null clause

Formal Reasoning – Which?

• Model Theoretic– The SAT solver tries to find an assignment

• In a way enumerating assignments

– Makes arbitrary decisions for variables

• Proof Theoretic– The learning scheme when hitting a conflict– Uses Resolution as an inference rule

• The only inference rule

• SO… Combination of the two… v ⋁ A ¬v ⋁ B(RES)A ⋁ B

Circuit to SAT

ab

c p

g

Can the circuit output be 1?

inputvariables output

variable

(a g) (b g)(a b g)

(g p) (c p)(g c p)

CNF(p)

p is satisfiable when theformula CNF(p) pis satisfiable

Questions?

• Is a SAT Solver a Decision Procedure?

• Yes!– But it looks for an assignment…?– Yes, but can prove the lack of one

• So, in order to prove the validity of – Prove ¬ is a contradiction

First Order Logic

• The alphabet:– Countable set of symbols: a, b, c, …– Logical connectives: ⋁ (or), ⋀ (and), ¬

(not/negation), ≣ (equivalence), (implication), quantifiers (∀,∃),

– Non-logical symbols • functions (F(x1,…,xn))

• Constants (1,2,3,…)• Predicate symbols (x > 0)

First Order Logic

• The Decision Problem for FOL is undecidable– There is no sound and complete decision

procedure for it

• Several approaches– Limit the syntax

• Only FOL formulas without quantifiers or function symbols

– Limit the possible models (assignments)• Not always easier

– Show validity for a given Theory• Propositional logic is a “subset” – or formally – a

Theory

Equality and Uninterpreted Functions

• Or in short EUF• Based on

– (x1 = y1 … xn = yn) f(x1,…,xn) = f(y1,….,yn)

• NP-Complete problem– Meaning Decidable– Polynomial reduction to propositional problem

Equality and Uninterpreted Functions

• Recall the motivating example

out0_a = in ⋀out1_a = F(out0_a, in)

⋀out2_a = F(out1_a, in)

out0_b = (in * in) * inout0_a = in ⋀out1_a = out0_a * in ⋀out2_a = out1_a * in

out0_b = F(F(in, in), in)

• Suitable for any function F

Linear Arithemetic

• Or LIA• Formulas of the form

– 3x + 2y ≤ 5z ⋀ 2x -2y = 0– Equality is a fragment of LIA

• Usage– Code optimizations performed by compilers

Arrays

• Operations on array data-structure– A map from an index to an element

• Basic operations– Read an element from an index i– Writing an element to index i

Satisfiability Modulo Theories

• Objective:– Given a theory T check satisfiability of a T-

formula

• Approach:– Eager encoding

• Reduce to a SAT problem

– Lazy encoding• Intro to DPLL(T)

– DPLL(T)

Reduction to SAT

• SMT Solving is based on the observation that a T-formula can be reduced to SAT

• (x-y≤0) ⋀ (y-z≤0) ⋀ ((z-x≤-1) ⋁ (z-x≤-2))– Use Boolean variables for each conjunct

• a – (x-y≤0)• b – (y-z≤0)• c – (z-x≤-1)• d – (z-x≤-2)

– a ⋀ b ⋀ (c ⋁ d)

Eager Encoding

• Reduction to SAT clearly does not capture the semantics

• Eagerly encode the semantics into the formula• In our example, we have four T-atoms: a,b,c,d

– Need to capture the relation between the atoms:• Are a and b consistent?• Are a and ¬b consistent?• …• Are a and b and c consistent?• …

– Can lead to 2n relations – Exponential!

Eager Encoding - Example

• Back to our examples– (x-y≤0) ⋀ (y-z≤0) ⋀ ((z-x≤-1) ⋁ (z-x≤-2))– a ⋀ b ⋀ (c ⋁ d)

• Clearly (x-y≤0), (y-z≤0) and (z-x≤-1) cannot be all TRUE– Thus we can add ¬(a ⋀ b ⋀ c)

• In a similar manner we add– ¬(¬a ⋀ ¬b ⋀ ¬c)– ¬(a ⋀ b ⋀ d)– ¬(¬a ⋀ ¬b ⋀ ¬d)

Eager Encoding – Example (2)

• The resulting formula– (a ⋀ b ⋀ (c ⋁ d)) ⋀ ¬(a ⋀ b ⋀ c) ⋀ ¬(¬a ⋀ ¬b

⋀ ¬c) ⋀ ¬(a ⋀ b ⋀ d) ⋀ ¬(¬a ⋀ ¬b ⋀ ¬d)

• The formula is then passed to a SAT solver– If it is satisfiable, then so is the original formula– Same goes for unsatisfiability

• Redundancy?– ¬(¬a ⋀ ¬b ⋀ ¬d)– ¬(¬a ⋀ ¬b ⋀ ¬c)

Schematics

Encoder

SAT-Solver

SMT formula

SAT/UnSAT

CNF formula

T-SolverCNF + Relations

Pseudo Code

Eager() R = T-Solver(lit()); B = e() ⋀ e(R); while (true) do <a, res> = SAT(B); if (res == “UnSAT”) return UnSAT else return SAT;

Eager Encoding

• Pros– Easy to implement

• SAT solver is used as a black-box

– Works well for bit-vectors theory

• Cons– Encoding may get too big to handle– Search starts only after entire problem is

translated

Lazy Encoding

• Recall: reduction to SAT does not capture the semantics

• Lazily encode the semantics into the formula– How?

• Less redundancies

Lazy Encoding – How it works

• Our example– (x-y≤0) ⋀ (y-z≤0) ⋀ ((z-x≤-1) ⋁ (z-x≤-2))– a ⋀ b ⋀ (c ⋁ d)

• Start by trying to solve the propositional formula– If UnSAT – done– Otherwise, get the assignment

• a -> TRUE, b -> TRUE, c -> TRUE, d -> FALSE

– Check if the assignment hold in the theory• If yes, a true assignment• Otherwise, block the assignment: ¬(a ⋀ b ⋀ c ⋀ ¬d)

– Solve again

Schematics

Encoder

SAT-Solver

SMT formula

UnSAT

CNF formula

T-Solver

SAT

assignment

blocking clause

Pseudo Code

Lazy() B = e(); while (true) do <a, res> = SAT(B); if (res == “UnSAT”) return UnSAT else <t,res> = T-Solver(T(a)) if (res == “SAT”) return SAT; B = B ⋀ e(t)

Lazy Encoding

• Pros– Easy to implement

• SAT solver is used as a black-box• Need to implement T-Solver

• Cons– SAT Solver must finish entirely before T-

inconsistency is checked– SAT Solver restarts (at the beginning) if last

assignment failed

How Can We Improve?

• The SAT Solver and T-Solver are separate– Each works as a black-box

• Try to bring them closer together– One should help the other

Incrementality

• Problem– SAT Solver restarts (at the beginning) if last

assignment failed

• Keep the state of the SAT solver, and simply resume– After a full assignment is found, check with T-

Solver– If not a real assignment, add a clause and BCP

Pseudo CodeLazy-DPLL() AddClauses(cnf(e())); if (BCP() == “confl”) return UnSAT; while (true) do if (!Decide()) <t,res> = T-Solver(T(a)) if (res == “SAT”) return SAT; AddClause(e(t)); while(BCP() == “confl”) do bl = Analyze(); if (bl < 0) return UnSAT; else BackTrack(bl); else while(BCP() == “confl”) do bl = Analyze(); if (bl < 0) return UnSAT; else BackTrack(bl);

Use Partial Assignment

• Problem– SAT Solver must finish entirely before T-

inconsistency is checked

• Try to find a T-inconsistency using only a partial assignment– Send partial assignment to T-Solver after k

decisions• May be hard to find the optimal k

Use Partial Assignment

• Consider a formula that has (10 ≤ x) and (x < 0) (represented by v and u respectively)– v and u are both assigned to TRUE– Every call to the T-Solver results in UnSAT

• May also be due to other reasons

• Thus, when these two are assigned TRUE we want to block all possible assignments– By adding ¬(v ⋀ u)

Theory Propagation

• How do we improve more?• Deeper integration

• T-Solver uses current partial assignment to deduce values for other literals– Similar to BCP, but using the Theory

• Improves performance dramatically• But makes conflict analysis hard

Theory Propagation

• Consider a formula that has (10 ≤ x), (x < 0), (x ≤ y) and (5 ≤ y)

• By knowing that (10 ≤ x) and (x ≤ y) are assigned to TRUE, the T-Solver can deduce facts

(5 ≤ y)

¬(x < 0) (10 ≤ x)

(x ≤ y)

Decisions

Pseudo CodeDPLL_T() AddClauses(cnf(e())); if (BCP() == “confl”) return UnSAT; while (true) do if (!Decide()) return SAT; repeat while(BCP() == “confl”) do bl = Analyze(); if (bl < 0) return UnSAT; else BackTrack(bl); <t,res> = T-Solver(T(a)) AddClause(e(t)); until (t == true)

Schematics

Encoder

SAT-Solver

SMT formula

UnSAT/SAT

CNF formula

T-Solver

Combining Theories

• We have seen how to solve a specific theory

• But what if we have– 3x + 2y ≤ f(z) ⋀ f(x) -2y = 0– LIA_UF

• Assuming we have a decision procedure for each theory– There is a procedure, called Nelson-Oppen– Other conditions as well

Advanced Topics

• Proof generation– We want something similar to a Resolution

Proof in the propositional case

• Interpolants generation– Specific methods for different theories– Based on deduction rules

• Specific usage in model checking

Questions?