ENS Paris-Saclay
2019 - 2020
Logical Aspects of AI
Lecture 3 - Combining decision procedures
Sylvain Conchon
LRI (UMR 8623), Université Paris-Sud
Équipe Toccata, INRIA Saclay – Île-de-France
Road map
Lecture 3
1. Theory Combination
2. Quantifiers
3. Extra material
THEORY COMBINATION
Combination of Theories
In CDCL(T), the theory T is usually a combination of theories
For instance,
x + 2 = y ⇒ f(read(write(a, x, 3), y − 2)) = f(y − x + 1)
Union of theories
Given two signatures Σ1 and Σ2, and two consistent theories T1 and T2 over Σ1 and Σ2, respectively
- Is the union T1 ∪ T2 consistent?
Undecidable in the general case
- Can we build a decision procedure for T1 ∪ T2 from decision procedures for T1 and T2?
Methods exist only for restricted classes of theories
Robinson Joint Consistency Theorem
Given two consistent theories T1 and T2 over Σ1 and Σ2, respectively
Theorem:
T1 ∪ T2 is not consistent if and only if there exists a formula ϕ over Σ1 ∩ Σ2
such that T1 ⊨ ϕ and T2 ⊨ ¬ϕ
Union of Disjoint Theories
When Σ1 and Σ2 are disjoint signatures
Theorem [Tinelli]:
T1 ∪ T2 is consistent if T1 and T2 each have an infinite model
Löwenheim-Skolem Upward Theorem
Given a signature Σ and a theory T over Σ.
Theorem:
If T has an infinite model of cardinality κ, then T has a model of cardinality κ′, for any κ′ ≥ κ
- used to align cardinalities of models
- useful to prove completeness of combination methods
Union of Disjoint Theories
Proof. Let A1 and A2 be models of T1 and T2, respectively
By the Löwenheim-Skolem Upward theorem, if T1 and T2 have an infinite model then they also have models of any infinite cardinality. We can thus assume that A1 and A2 have the same cardinality.
By the Joint Consistency theorem, if T1 ∪ T2 is not consistent then there exists a formula ψ over Σ1 ∩ Σ2 such that T1 ⊨ ψ and T2 ⊨ ¬ψ, hence A1 ⊨ ψ and A2 ⊨ ¬ψ (1).
Now, as Σ1 and Σ2 are disjoint, (Σ1 ∩ Σ2)-formulas can only be equational formulas, that is, ψ only contains literals of the form x = y or x ≠ y.
It is a well-known result in model theory that the reducts of any two models to the empty signature are isomorphic when they have the same cardinality (any one-to-one correspondence works)
Consequently, either both A1 and A2 are models of ψ or neither of them is, which contradicts (1).
Naive Combination of Decision Procedures
Assume T1 is the theory of (integer) arithmetic and T2 the theory of arrays, defined by the following axioms
v[i ← e][i] = e
i ≠ j ⇒ v[i ← e][j] = v[j]
Is the following formula ψ (T1 ∪ T2)-satisfiable?
v[i ← v[j]][i] ≠ v[i] ∧ i + j ≤ 2j ∧ j + 4i ≤ 5i
Naive Combination of Decision Procedures
First step: decompose ψ into two pure formulas, ψ1 over the theory of arrays and ψ2 over the theory of arithmetic
ψ1 = v[i ← v[j]][i] ≠ v[i]
ψ2 = i + j ≤ 2j ∧ j + 4i ≤ 5i
Naive Combination of Decision Procedures
ψ1 = v[i ← v[j]][i] ≠ v[i]
ψ2 = i + j ≤ 2j ∧ j + 4i ≤ 5i
Second step: use the decision procedure of each theory to determine the satisfiability of ψ1 and ψ2 separately
- ψ1 is satisfiable
- ψ2 is satisfiable
But is ψ satisfiable?
Naive Combination of Decision Procedures
ψ = v[i ← v[j]][i] ≠ v[i] ∧ i + j ≤ 2j ∧ j + 4i ≤ 5i
ψ is unsatisfiable
Proof.
i + j ≤ 2j ∧ j + 4i ≤ 5i implies i = j
v[i ← v[j]][i] ≠ v[i] ∧ i = j implies v[i] ≠ v[i]
The problem is that ψ1 and ψ2 are not independent: they share variables and the equality predicate
Solution: compute the implied formula i = j
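The argument above can be checked on small instances. The following Python sketch is only a bounded search, not a decision procedure: arrays are modelled as dicts, and the window of integers, the array size and the helper names are assumptions made for illustration.

```python
from itertools import product

def arithmetic_part(i, j):
    """psi2: i + j <= 2j and j + 4i <= 5i."""
    return i + j <= 2 * j and j + 4 * i <= 5 * i

def array_part(v, i, j):
    """psi1: v[i <- v[j]][i] != v[i], with the array v modelled as a dict."""
    updated = dict(v)
    updated[i] = v[j]          # the write v[i <- v[j]]
    return updated[i] != v[i]

def models_of_psi(size=3, values=(0, 1)):
    """All models of psi1 and psi2 with indices in range(size) (bounded search)."""
    found = []
    for i, j in product(range(size), repeat=2):
        for cells in product(values, repeat=size):
            v = dict(enumerate(cells))
            if arithmetic_part(i, j) and array_part(v, i, j):
                found.append((i, j, v))
    return found

# Each part is satisfiable on its own ...
psi1_alone_sat = array_part({0: 0, 1: 1}, 0, 1)
psi2_alone_sat = arithmetic_part(2, 2)
# ... but psi2 forces i = j on the sampled window, which kills psi1.
window = product(range(-5, 6), repeat=2)
psi2_forces_equality = all(i == j for i, j in window if arithmetic_part(i, j))
```

On this bounded search space `models_of_psi()` finds no model, which matches the proof: once i = j is known, the array literal reduces to v[i] ≠ v[i].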
Craig Interpolation Theorem
Given two pure formulas ϕ1 and ϕ2 over Σ1 and Σ2, respectively
Theorem:
If ϕ1 ∧ ϕ2 is (T1 ∪ T2)-unsatisfiable then there exists a sentence ψ over Σ1 ∩ Σ2 such that
1) ⊨_T1 ϕ1 ⇒ ψ
2) ϕ2 ∧ ψ is T2-unsatisfiable
- ψ is an interpolant
Computing interpolants is the basis of combination methods like Nelson-Oppen
Nelson-Oppen (NO) Combination Methods
Let Σ1 and Σ2 be two disjoint signatures
Input. ψ, a conjunction of literals over Σ1 ∪ Σ2
Step 1. Purify ψ into an equisatisfiable formula ψ1 ∧ ψ2 such that each ψi is a Σi-formula
Step 2. Guess a partition of the variables of ψ1 and ψ2. Express it as a conjunction of literals ϕ.
Example. The partition {x1}, {x2, x3}, {x4} is represented as x1 ≠ x2, x1 ≠ x4, x2 ≠ x4, x2 = x3
Step 3. Decide whether each ψi ∧ ϕ is Ti-satisfiable by using the individual decision procedures
Output. yes if, for some guessed partition, all the decision procedures return yes; no otherwise
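Steps 2 and 3 can be sketched in Python by replacing the guess with an exhaustive enumeration of partitions. The names `partitions`, `arrangement` and `nelson_oppen` are illustrative, and the decision procedures are assumed to be given as callables that take the equalities and disequalities of ϕ and answer whether ψi ∧ ϕ is satisfiable; none of this is taken from an actual solver.

```python
def partitions(variables):
    """Yield every partition of the list `variables` as a list of blocks."""
    if not variables:
        yield []
        return
    first, rest = variables[0], variables[1:]
    for smaller in partitions(rest):
        # Either `first` forms a block of its own ...
        yield [[first]] + smaller
        # ... or it joins one of the existing blocks.
        for k, block in enumerate(smaller):
            yield smaller[:k] + [[first] + block] + smaller[k + 1:]

def arrangement(partition):
    """Literals representing a partition: equalities inside blocks,
    disequalities between block representatives."""
    equalities = [(block[0], other) for block in partition for other in block[1:]]
    reps = [block[0] for block in partition]
    disequalities = [(reps[a], reps[b])
                     for a in range(len(reps)) for b in range(a + 1, len(reps))]
    return equalities, disequalities

def nelson_oppen(variables, procedures):
    """True if some arrangement is accepted by every decision procedure."""
    return any(all(decide(*arrangement(p)) for decide in procedures)
               for p in partitions(variables))

example = arrangement([["x1"], ["x2", "x3"], ["x4"]])
```

`example` reproduces the literals of the slide's partition. The enumeration visits one arrangement per partition, so its cost grows with the Bell numbers (15 partitions for 4 variables), which is why practical implementations avoid guessing.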
NO in Practice
A simple and elegant correctness proof of NO was given by Tinelli and Harandi in 1996
Correctness becomes an issue for deterministic and efficient implementations
- purification with term sharing
- deducing the equalities to be shared
- theory state normalization
- deduction by lookup
- relevant equation selection
- etc.
Deterministic Combination Algorithm
We present a deterministic version of NO at a description level high enough to enjoy a simple correctness proof, and low enough to describe crucial implementation details
- The algorithm is described as a set of inference rules
- Specific rules for optimizations
- Strategies as regular expressions
- Shostak pattern for efficient deduction
NO: State of the Algorithm
The internal state of the algorithm is represented by configurations of the form
⟨V | ∆ | Γ | Φ1, …, Φn⟩
- Γ is a set of literals of the form a = b or a ≠ b, where a and b are terms in the union of theories T1 ∪ · · · ∪ Tn
- ∆ is a set of literals of the form x = y or x ≠ y, where x and y are variables
- Each Φi is a set of equations of the form x = a where x is a variable and a is a term in Ti
- V is the set of variables that appear in Γ and ∆
We also use the symbol ⊥ as a configuration
Purification
(a|π denotes the subterm of a at position π, and a[π ↦ z] the term a with that subterm replaced by z)
Abstract
⟨V | ∆ | Γ ⊎ {a = b} | …, Φi, …⟩
────────────────────────────────────────
⟨V ∪ {z} | ∆ | Γ ∪ {a[π ↦ z] = b} | …, Φi ∪ {z = a|π}, …⟩
where a|π ∈ Ti and z is a fresh variable
Share
⟨V | ∆ | Γ ⊎ {a = b} | Φ1, …, Φn⟩
────────────────────────────────────────
⟨V | ∆ | Γ ∪ {a[π ↦ z] = b} | Φ1, …, Φn⟩
where a|π ∈ Ti, z ∈ V, and Φi, ∆ ⊨_Ti z = a|π
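Purification with term sharing can be sketched as follows. This is a simplified illustration, not the rules themselves: terms are nested tuples, the split of symbols between two theories (`THEORY_OF`) is an assumption, and sharing is done by syntactic lookup of an already named pure subterm rather than by the entailment test of rule Share.

```python
from itertools import count

EUF, ARITH = 1, 2
# Hypothetical signature split: f is uninterpreted, the rest is arithmetic.
THEORY_OF = {"f": EUF, "+": ARITH, "-": ARITH, "*": ARITH}

def abstract(term, owner, defs, names):
    """Rewrite `term` into a pure term of theory `owner`.

    Alien subterms are replaced by variables; their definitions z = a are
    recorded in defs[theory] as a map from pure term to variable.
    owner=None names every non-variable term (used for top-level literals).
    """
    if isinstance(term, str):                 # variables belong to every theory
        return term
    if isinstance(term, tuple):
        home = THEORY_OF[term[0]]
        term = (term[0],) + tuple(abstract(arg, home, defs, names) for arg in term[1:])
    else:
        home = ARITH                          # integer constants are arithmetic terms
    if home == owner:
        return term
    if term not in defs[home]:                # Abstract: introduce a fresh variable
        defs[home][term] = next(names)
    return defs[home][term]                   # Share: reuse the existing variable

defs = {EUF: {}, ARITH: {}}
names = (f"z{k}" for k in count())
# f(x) = x  and  f(2x - f(x)) != x
literals = [
    (abstract(("f", "x"), None, defs, names), "=", "x"),
    (abstract(("f", ("-", ("*", 2, "x"), ("f", "x"))), None, defs, names), "!=", "x"),
]
```

Here the second occurrence of f(x) is mapped to the variable `z0` introduced for the first one, so only three definitions are created: two for the theory of f and one for arithmetic.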
Equality Propagation
Arrange
⟨V | ∆ | Γ ⊎ {x ⋈ y} | Φ1, …, Φn⟩
────────────────────────────────────────
⟨V | ∆ ∪ {x ⋈ y} | Γ | Φ1, …, Φn⟩
where ⋈ stands for = or ≠
Deduct
⟨V | ∆ | Γ | Φ1, …, Φn⟩
────────────────────────────────────────
⟨V | ∆ ∪ {x = y} | Γ | Φ1, …, Φn⟩
if Φi, ∆ ⊨_Ti x = y and ∆ ⊭ x = y
Contradict
⟨V | ∆ | Γ | Φ1, …, Φn⟩
────────────────────────────────────────
⊥
if Φi ∧ ∆ is not Ti-satisfiable
Example
f(x) = x ∧ f(2x − f(x)) ≠ x
V          | ∆                   | Γ                          | Φ1                 | Φ2         | Rule
x          | ∅                   | f(x) = x, f(2x − f(x)) ≠ x | ∅                  | ∅          |
x, y       | ∅                   | y = x, f(2x − f(x)) ≠ x    | y = f(x)           | ∅          | Ab1
x, y       | y = x               | f(2x − f(x)) ≠ x           | y = f(x)           | ∅          | Ar
x, y       | y = x               | f(2x − y) ≠ x              | y = f(x)           | ∅          | Sh1
x, y, z    | y = x               | f(z) ≠ x                   | y = f(x)           | z = 2x − y | Ab2
x, y, z, u | y = x               | u ≠ x                      | y = f(x), u = f(z) | z = 2x − y | Ab1
x, y, z, u | y = x, u ≠ x        | ∅                          | y = f(x), u = f(z) | z = 2x − y | Ar
x, y, z, u | y = x, u ≠ x, z = x | ∅                          | y = f(x), u = f(z) | z = 2x − y | De2
⊥          |                     |                            |                    |            | Co1
Convex Theories
Rule Deduct is sufficient only if, whenever a theory entails a disjunction of equalities from Φi ∧ ∆, it also entails one of the disjuncts. This property is called convexity
Convex Theories.
A theory T is convex if and only if, for every finite set Γ of literals and every non-empty disjunction ⋁_{i∈I} xi = yi of equalities between variables,
Γ ⊨_T ⋁_{i∈I} xi = yi iff Γ ⊨_T xi = yi for some i ∈ I
Convexity
Some theories are convex
- Linear rational arithmetic
- Equational theories
Many theories are not convex
- Linear integer arithmetic
y = 1, z = 2, 1 ≤ x ≤ 2 ⊨ x = y ∨ x = z
- Non-linear arithmetic
x² = 1, y = 1, z = −1 ⊨ x = y ∨ x = z
- Theory of Bit-vectors
- Theory of Arrays
v1 = a[i ← v2][j], v3 = a[j] ⊨ v1 = v2 ∨ v1 = v3
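The linear integer arithmetic example can be verified by enumerating its models. In this sketch the helper `entails` and the integer window are illustrative; the enumeration is exhaustive here because the constraint 1 ≤ x ≤ 2 bounds x.

```python
from fractions import Fraction

def entails(models, conclusion):
    """Gamma |= conclusion, given the list of all models of Gamma."""
    return all(conclusion(m) for m in models)

# All integer models of  y = 1, z = 2, 1 <= x <= 2.
int_models = [{"x": x, "y": 1, "z": 2} for x in range(-10, 11) if 1 <= x <= 2]

x_is_y = lambda m: m["x"] == m["y"]
x_is_z = lambda m: m["x"] == m["z"]
x_is_y_or_z = lambda m: x_is_y(m) or x_is_z(m)

disjunction_entailed = entails(int_models, x_is_y_or_z)
some_disjunct_entailed = entails(int_models, x_is_y) or entails(int_models, x_is_z)

# Over the rationals, x = 3/2 is one more model and the disjunction is no longer entailed.
rational_models = int_models + [{"x": Fraction(3, 2), "y": 1, "z": 2}]
disjunction_entailed_over_q = entails(rational_models, x_is_y_or_z)
```

The disjunction is entailed over the integers while neither disjunct is, which is exactly a failure of convexity. The extra rational model shows why the same constraints cause no such problem in linear rational arithmetic.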
Correctness (I)
A configuration ⟨V | ∆ | Γ | Φ1, …, Φn⟩ is satisfiable if the formula Γ ∧ Φ1 ∧ · · · ∧ Φn ∧ ∆ is satisfiable. ⊥ is unsatisfiable
Satisfiability of a conjunction of literals Γ is thus equivalent to the satisfiability of the initial configuration ⟨V | ∅ | Γ | ∅⟩
We write C ⇒ C′ if the configuration C can be reduced to C′ by one of the inference rules. A configuration that cannot be reduced is called irreducible. It is proper if it is not ⊥
Theorem [Correctness]
A set of literals Γ is satisfiable iff there exists an irreducible and proper configuration C such that ⟨V | ∅ | Γ | ∅⟩ ⇒∗ C.
Correctness (II): Termination
Theorem
The reduction relation ⇒ is terminating
Proof.
We consider the measure (|Γ|, |∆|) on configurations, ordered lexicographically, where
- |Γ| is the sum of the sizes of its terms
- |∆| is the number of equivalence classes represented by ∆
Notice that, given a fixed number of variables, the ordering on |∆| is well-founded (there is only a finite number of classes)
It is immediate to see that
- |Γ| decreases by Abstract and Arrange
- |Γ| is constant by Deduct and |∆| decreases
Correctness (III): Finite Models
Assume T1 is a Σ1-theory whose models have at most 2 elements, and T2 is a Σ2-theory admitting models of any cardinality
- Let f ∈ Σ1 and g ∈ Σ2 such that
⊭_T1 ∀x, y. f(x) = f(y) and ⊭_T2 ∀x, y. g(x) = g(y)
(f and g are not constant functions)
- Let Γ = {f(x) ≠ f(y), g(x) ≠ g(z), g(y) ≠ g(z)}
Running NO from ⟨V | ∅ | Γ | ∅⟩, we reach the final configuration
⟨V′ | ∆ | ∅ | Φ1, Φ2⟩
with
∆ = {x1 ≠ y1, x2 ≠ z2, y2 ≠ z2}
Φ1 = {x1 = f(x), y1 = f(y)}
Φ2 = {x2 = g(x), y2 = g(y), z2 = g(z)}
Correctness (III): Finite Models (cont)
x and y are the only shared variables, so only x = y or x ≠ y can be shared
- x = y is impossible since the procedure would have reached ⊥
- with x ≠ y, ∆ ∪ Φ1 and ∆ ∪ Φ2 are satisfiable, but it is straightforward to check that
Γ ⊨_(T1 ∪ T2) x ≠ y ∧ x ≠ z ∧ y ≠ z
Γ is thus unsatisfiable since T1 ∪ T2 (like T1) only has models with at most 2 elements.
NO is unsound for theories with finite models only
Correctness (IV): Stably Infinite Theories
NO can only combine decision procedures for theories that have infinite models
Stably Infinite Theories
A theory T is stably infinite if every T-satisfiable formula is satisfiable in an infinite model
Example. Theories with only finite models are not stably infinite, for instance the theory axiomatized by
∀x, y, z. x = y ∨ x = z ∨ y = z
This condition also ensures that the union of two consistent, disjoint, stably infinite theories is consistent
Correctness (V): Tinelli-Harandi's theorem
An arrangement ∆(V) of a set of variables V is a set of formulas of the form x = y or x ≠ y such that for all pairs of variables x, y ∈ V we have ∆(V) ⊨ x = y or ∆(V) ⊨ x ≠ y
Given two stably infinite theories T1 and T2 over two disjoint signatures Σ1 and Σ2
Theorem [Tinelli-Harandi (1996)]
Let Φ1 and Φ2 be two sets of Σ1- and Σ2-literals. Let V be the set of the variables shared by Φ1 and Φ2, and ∆(V) an arrangement on V. If Φ1 ∧ ∆(V) is T1-satisfiable and Φ2 ∧ ∆(V) is T2-satisfiable then Φ1 ∧ Φ2 is (T1 ∪ T2)-satisfiable
Correctness (VI): Irreducible
Lemma Every proper irreducible configuration is satisfiable
Proof. Let ⟨V | ∆ | Γ | Φ1, …, Φn⟩ be such a configuration.
Since Abstract and Arrange cannot be applied, Γ is empty. Since Contradict does not apply, ∆ ∧ Φi is Ti-satisfiable. If ∆(V) is an arrangement then Tinelli-Harandi's Theorem finishes the proof. Otherwise, let ∆′ = ∆ ∪ {x1 ≠ y1, …, xk ≠ yk} be the maximal satisfiable extension of ∆ such that ∆ ⊭ xi ≠ yi. ∆′(V) is an arrangement and ∆′ ⊨ ∆.
If Φi ∧ ∆′ is not Ti-satisfiable then Φi ⊨_Ti ∆+ → ¬∆− ∨ δ, where δ is the clause x1 = y1 ∨ · · · ∨ xk = yk and ∆+ (resp. ∆−) denotes the equations (resp. disequations) of ∆. Since Ti is convex, Φi ⊨_Ti ∆+ → x = y for some x = y ∈ ¬∆− ∨ δ. Since Deduct does not apply, we have ∆ ⊨ x = y and (since ∆ ⊨ ∆−) ∆ ⊨ δ, which contradicts the satisfiability of ∆′
Correctness (VII): Final Proof
Lemma [Equisatisfiability]
If C ⇒ C′ then C and C′ are equisatisfiable
Theorem [Correctness]
A set of literals Γ is satisfiable iff there exists an irreducible and proper configuration C such that ⟨V | ∅ | Γ | ∅⟩ ⇒∗ C.
Proof.
It suffices to prove that a configuration C is satisfiable if and only if there exists a proper irreducible configuration C′ such that C ⇒∗ C′.
By induction over the terminating relation ⇒.
If C is irreducible, we conclude by the Lemma on irreducibility. If C reduces to C′ then C′ is equisatisfiable and we conclude by the induction hypothesis on C′.
Non-Convex Theories (I)
We first adapt Deduct for dealing with disjunctions of equalities
Deduct
⟨V | ∆ | Γ | Φ0, …, Φn⟩
────────────────────────────────────────
⟨V | ∆ ∪ {δ} | Γ | Φ0, …, Φn⟩
if Φi, ∆ ⊨_Ti δ and ∆ ⊭ δ
where δ is a disjunction of equalities between variables
Non-Convex Theories (II)
We also add a branching rule
Branch
⟨V | ∆ ⊎ {x1 = y1 ∨ · · · ∨ xk = yk} | Γ | Φ0, …, Φn⟩
────────────────────────────────────────
⟨V | ∆ ∪ {xi = yi} | Γ | Φ0, …, Φn⟩
if ∆ ⊭ xi = yi (1 ≤ i ≤ k)
The correctness proof requires only a small number of modifications
Splitting on Demand
Case splits for non-convex theories can be lifted to the SAT solver
When Mode = search
T-Learn
M ⊨_T ⋁ xi = yi    (xi and yi are shared variables)
────────────────────────────────────────
F := F ∪ {⋁ xi = yi}
Delayed Theory Propagation
- Nondeterministic NO
- Create a set of interface equalities x = y between shared variables
- Use the SAT solver to guess the partition
Main disadvantages
- The number of additional equality literals is quadratic in the number of shared variables
- Extension to quantified formulas
Model-Based Theory Combination
Instead of propagating disjunctions when
Γ ⊨_T u1 = v1 ∨ … ∨ un = vn
use a candidate model M of T that satisfies Γ and propagate all equalities implied by M
if M ⊨ T ∧ Γ ∧ ui = vi then propagate ui = vi
If other theories do not agree with that choice, then backtrack to fix the model
- In practice, the number of inter-theory equalities that matter is small, but intra-theory equalities do matter
- Backtracking is usually cheap
- Limit the number of equalities implied by M
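Reading the equalities off a candidate model amounts to grouping shared variables by their value in M. A minimal Python sketch, assuming M is available as a mapping from variables to values (the function name and the sample model are hypothetical):

```python
from collections import defaultdict

def equalities_from_model(model, shared):
    """Equalities between shared variables that hold in the candidate model."""
    by_value = defaultdict(list)
    for var in shared:
        by_value[model[var]].append(var)
    # One equality per extra member of each class, against the class's first variable.
    return [(cls[0], other) for cls in by_value.values() for other in cls[1:]]

candidate = {"u1": 0, "v1": 0, "u2": 3, "v2": 5}
to_propagate = equalities_from_model(candidate, ["u1", "v1", "u2", "v2"])
```

With this candidate only u1 = v1 is propagated. These equalities are guesses supported by one model, not consequences of Γ, which is why the slide pairs the technique with backtracking.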
Efficient Deduction of Implied Equalities
In rule Deduct, when the theory is convex, we have to find a pair of variables (x, y), not already equal in ∆, such that
∆, Φi ⊨_Ti x = y
Generic Solution:
For every pair of variables (x, y), use the decision procedure of Ti to check whether ∆ ∪ Φi ∪ {x ≠ y} is Ti-satisfiable (x = y is implied exactly when it is not)
This solution may be very expensive; for some convex theories there exist efficient algorithms for computing new implied equalities.
For that, the decision procedures maintain a union-find data structure on terms such that a new equality x = y can be efficiently deduced by checking that find(x) = find(y) is true.
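A union-find structure of the kind mentioned above can be sketched in a few lines of Python. This is a generic textbook version with path compression, given for illustration only; it is not the data structure of any particular solver.

```python
class UnionFind:
    """Equivalence classes of terms, each with a representative."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        """Representative of the class of x (x is registered on first use)."""
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:          # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        """Merge the classes of x and y."""
        root_x, root_y = self.find(x), self.find(y)
        if root_x != root_y:
            self.parent[root_x] = root_y

uf = UnionFind()
uf.union("x", "y")
uf.union("y", "z")
x_equals_z = uf.find("x") == uf.find("z")
x_equals_w = uf.find("x") == uf.find("w")
```

After the two unions, x = z is detected by comparing representatives, while nothing relates x and w.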
State Normalization
To maintain its union-find data structure, the decision procedure uses a normalization function (which is theory dependent). A normalization step is represented by the relation
(∆, Φi) ⇝ (∆, Φ′i)
Intuitively, Φi can be simplified, with the use of ∆, into a "more" normalized equivalent set Φ′i.
Three conditions are necessary
1. The relation ⇝ must be terminating
2. Φi ∧ ∆ and Φ′i ∧ ∆ must be equisatisfiable
3. Completeness of ⇝:
if Φi, ∆ ⊨_Ti x = y and ∆ ⊭ x = y then there exists Φ′i such that (∆, Φi) ⇝∗ (∆, Φ′i) and {x = t, y = t} ⊆ Φ′i
Norm
⟨V | ∆ | Γ | …, Φi, …⟩
────────────────────────────────────────
⟨V | ∆ | Γ | …, Φ′i, …⟩
if (∆, Φi) ⇝ (∆, Φ′i)
Efficient Deduction Rule
We implement ∆ as a union-find data structure and we write ∆(x) for the representative of the variable x
Using state normalization, new equalities between variables can be directly extracted from the union-find data structure
TDeduct
⟨V | ∆ | Γ | …, Φi ∪ {x = a, y = a}, …⟩
────────────────────────────────────────
⟨V | ∆ ∪ {x = y} | Γ | …, Φi ∪ {x = a, y = a}, …⟩
if ∆(x) ≠ ∆(y)
NO is correct if Deduct is replaced by Norm and TDeduct
Normalization for Shostak Theories (I)
A Shostak theory T is a convex theory for which there exists a canonizer and a solver
A canonizer σ is a function that for every term u returns a unique representative σ(u) in the equivalence class of u. A canonizer must satisfy the following conditions:
1. ⊨_T u = v iff σ(u) = σ(v)
2. σ(σ(u)) = σ(u)
3. if x occurs in σ(u) then x occurs in u
4. if σ(u) = u then σ(v) = v for every subterm v of u
General Solutions of Equations
A general solution of a satisfiable equation u = v is a set of equations in triangular form
x1 = t1, …, xk = tk
where the xi are variables occurring in u and v but not in the ti, such that
⊨_T u = v ⟷ (∃y1 … ym) (x1 = t1 ∧ · · · ∧ xk = tk)
where y1, …, ym are the variables of the ti
Solvers
A solver for a theory T is an algorithm that takes a T-equation u = v as input, and returns unsat if this equation is not T-satisfiable, and its general solution if it is T-satisfiable
Examples of theories equipped with solvers found in practice:
linear arithmetic over the rationals, theory of records
Normalization for Shostak Theories (II)
We use the canonizer and the solver of a Shostak theory to bring Φ into a triangular form {x1 = t1, …, xk = tk}
Example:
1. Assume Φ and ∆ are of the form
Φ = {x1 = u − v, x2 = 2v − u, x3 = 2u − v, x4 = 2v}
∆ = {x1 = x2}
2. Solve x1 = x2, that is u − v = 2v − u
u − v = 2v − u ⇒ {u = 3t, v = 2t}
3. Substitute u and v in Φ, and canonize the right parts
Φ′ = {x1 = t, x2 = t, x3 = 4t, x4 = 4t}
4. Apply TDeduct to infer x3 = x4
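Steps 3 and 4 of this example can be replayed in Python. The sketch below assumes linear terms are stored as maps from variables to coefficients, and it takes the solved form {u = 3t, v = 2t} from the slide as given instead of implementing a solver; `canon`, `substitute` and `implied_equalities` are illustrative names.

```python
from collections import defaultdict
from fractions import Fraction

def canon(term):
    """Canonical form of a linear term {var: coeff}: sorted monomials, zeros dropped."""
    return tuple(sorted((v, Fraction(c)) for v, c in term.items() if c != 0))

def substitute(term, solution):
    """Replace each solved variable of `term` by its right-hand side."""
    result = defaultdict(Fraction)
    for var, coeff in term.items():
        for v, c in solution.get(var, {var: 1}).items():
            result[v] += coeff * c
    return dict(result)

def implied_equalities(phi):
    """Group the variables whose canonized right-hand sides coincide (TDeduct by lookup)."""
    by_rhs = defaultdict(list)
    for x, rhs in phi.items():
        by_rhs[canon(rhs)].append(x)
    return [tuple(xs) for xs in by_rhs.values() if len(xs) > 1]

phi = {
    "x1": {"u": 1, "v": -1},    # x1 = u - v
    "x2": {"v": 2, "u": -1},    # x2 = 2v - u
    "x3": {"u": 2, "v": -1},    # x3 = 2u - v
    "x4": {"v": 2},             # x4 = 2v
}
solution = {"u": {"t": 3}, "v": {"t": 2}}   # solved form of u - v = 2v - u, from the slide
phi_normal = {x: substitute(rhs, solution) for x, rhs in phi.items()}
equalities = implied_equalities(phi_normal)
```

After substitution the right-hand sides canonize to t, t, 4t and 4t, so the lookup groups x1 with x2 (already known from ∆) and x3 with x4, the new equality inferred in step 4.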
Normalization for Shostak Theories (III)
Rule Norm for a Shostak theory Ti is implemented by a combination of the following rules
Canon
⟨V | ∆ | Γ | …, Φi ⊎ {x = a}, …⟩
────────────────────────────────────────
⟨V | ∆ | Γ | …, Φi ∪ {x = canon_i(a)}, …⟩
if a ≠ canon_i(a)
Solve_i
⟨V | ∆ | Γ | …, Φi ∪ {x = a, y = b}, …⟩
────────────────────────────────────────
⟨V | ∆ | Γ | …, (Φi ∪ {x = a, y = b} ∪ solve(a = b))2, …⟩
if ∆(x) = ∆(y) and a ≠ b and a = b is Ti-satisfiable
QUANTIFIERS
Quantified Formulas
Consider the following axiomatization (in Alt-Ergo's syntax) for an ordering relation le
logic le: int,int -> prop
axiom refl: forall x:int. le(x,x)
axiom trans:
forall x,y,z:int. le(x,y) and le(y,z) -> le(x,z)
axiom antisym:
forall x,y:int. le(x,y) and le(y,x) -> x = y
and some goals we want to prove:
goal g1: le(2,5) and le(5,10) -> le(2,10)
goal g2:
forall a:int.
le(a,5) and le(5,8) and le(8,a) -> a=5
Guiding Quantifier Instantiation
Many SMT solvers handle universal formulas through an instantiation mechanism
Questions:
- How to find good instances to prove a goal?
- How to limit the (prohibitive) number of instances?
A possible answer: find good heuristics!
- In practice, heuristics for choosing new instances are based on triggers: lists of patterns (terms with variables) that guide (or restrict) instantiations to known ground terms that have a given form
Triggers: Example
If P(x) is used as trigger in the following axiom ax1
logic P,Q,R: int -> prop
axiom ax1: forall x:int. (P(x) or Q(x)) -> R(x)
goal g3: P(1) -> R(1)
goal g4: Q(2) -> R(2)
then, among the set of known terms {P(1), R(1), Q(2), R(2)}, only P(1) matches the trigger and can be used to create the following instance of ax1
( P(1) or Q(1) ) -> R(1)
which implies that only goal g3 is proved
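The effect of the trigger can be sketched in a few lines of Python. This is only an illustration, not how Alt-Ergo is implemented: the encoding of terms as (symbol, argument) pairs and the function name instances_of_ax1 are assumptions made for the example.

```python
# Known ground terms taken from goals g3 and g4, as (symbol, argument) pairs.
known_terms = [("P", 1), ("R", 1), ("Q", 2), ("R", 2)]

def instances_of_ax1(trigger_symbol, terms):
    """Create one instance of ax1 per known term headed by the trigger symbol."""
    return [
        f"(P({arg}) or Q({arg})) -> R({arg})"
        for symbol, arg in terms
        if symbol == trigger_symbol
    ]

# With trigger P(x), only P(1) is usable, so only the instance for x = 1 appears.
print(instances_of_ax1("P", known_terms))
# With trigger Q(x) instead, the instance for x = 2 would be created.
print(instances_of_ax1("Q", known_terms))
```

With P(x) as trigger no instance mentions 2, which is why g4 stays unproved.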
Explicit Triggers
The input syntax of SMT solvers lets users specify their own triggers
For instance, in Alt-Ergo, the list of terms [f(x), Q(y)] is an explicit trigger for the following axiom ax2
logic P,Q,R: int -> prop
logic f: int -> int
axiom ax2:
forall x,y:int [f(x), Q(y)].
P(f(x)) and Q(y) -> R(x)
Matching
We use a matching algorithm to create new instances of universal formulas
Given a ground term t and a pattern p, the matching algorithm returns a set S of substitutions over the variables of p such that
t = σ(p) for all σ ∈ S
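A minimal sketch of purely syntactic matching, under the following assumptions (none of them come from the slides): terms are nested Python tuples (symbol, arg1, ..., argn), constants are 1-tuples such as ("2",), pattern variables are plain strings, and the names match, matching and apply are hypothetical. For syntactic matching the set S holds at most one substitution.

```python
def is_var(t):
    """Pattern variables are plain strings; applications are tuples."""
    return isinstance(t, str)

def match(pattern, term, subst):
    """Extend subst so that subst(pattern) == term, or return None."""
    if is_var(pattern):
        bound = subst.get(pattern)
        if bound is None:
            return {**subst, pattern: term}
        return subst if bound == term else None
    # Both are applications: head symbol and arity must agree.
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p_arg, t_arg in zip(pattern[1:], term[1:]):
        subst = match(p_arg, t_arg, subst)
        if subst is None:
            return None
    return subst

def matching(term, pattern):
    """Return the set S of substitutions as a list (empty or a singleton)."""
    subst = match(pattern, term, {})
    return [] if subst is None else [subst]

def apply(subst, pattern):
    """Compute sigma(p) for a substitution binding every variable of p."""
    if is_var(pattern):
        return subst[pattern]
    return (pattern[0],) + tuple(apply(subst, a) for a in pattern[1:])

# Pattern P(f(x)) against the ground term P(f(2)).
p = ("P", ("f", "x"))
t = ("P", ("f", ("2",)))
for sigma in matching(t, p):
    assert apply(sigma, p) == t   # t = sigma(p)
```

Matching the same pattern against P(a) returns the empty list, since a is not syntactically of the form f(_).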
Limitation of Matching
Purely syntactic matching is very limited!
Consider for instance the following formulas:
logic P,R : int -> prop
logic f : int -> int
axiom ax : forall x:int [P(f(x))]. P(f(x)) -> R(x)
goal g1 : forall a:int. P(a) -> a = f(2) -> R(2)
The trigger P(f(x)) prevents the creation of instances of axiom ax
since there is no ground term of the form P(f(_)) in the problem
To prove such goals, we need to extend the matching algorithm to find substitutions modulo (ground) equalities
E-Matching
Given a set of ground equations E, a ground term t and a pattern p, the e-matching algorithm returns a set S of substitutions over the variables of p such that
E |= t = σ(p) for all σ ∈ S
In the previous example
logic P,R : int -> prop
logic f : int -> int
axiom ax : forall x:int [P(f(x))]. P(f(x)) -> R(x)
goal g1 : forall a:int. P(a) -> a = f(2) -> R(2)
e-matching takes advantage of the ground equality a = f(2) and returns the substitution σ = {x ↦ 2}, which is used to create the instance P(f(2)) -> R(2) of axiom ax
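The example can be replayed with a small e-matching sketch. It is a simplification under stated assumptions: terms are nested tuples (symbol, arg1, ..., argn) with string pattern variables, equivalence classes come from a plain union-find over the known terms, and no congruence closure is performed, so a real solver would find matches this sketch misses. The names Classes, ematch and ematch_args are hypothetical.

```python
def subterms(t):
    """Yield a ground term and all of its sub-terms."""
    yield t
    for arg in t[1:]:
        yield from subterms(arg)

class Classes:
    """Equivalence classes of known ground terms induced by the equations E."""
    def __init__(self, terms, equations):
        self.parent = {}
        for t in list(terms) + [side for eq in equations for side in eq]:
            for s in subterms(t):
                self.parent.setdefault(s, s)
        for left, right in equations:
            self.parent[self.find(left)] = self.find(right)

    def find(self, t):
        while self.parent[t] != t:
            t = self.parent[t]
        return t

    def members(self, t):
        root = self.find(t)
        return [s for s in self.parent if self.find(s) == root]

def ematch(classes, pattern, term, subst):
    """Yield substitutions sigma such that sigma(pattern) equals term modulo E."""
    if isinstance(pattern, str):                      # pattern variable
        bound = subst.get(pattern)
        if bound is None:
            yield {**subst, pattern: term}
        elif classes.find(bound) == classes.find(term):
            yield subst
        return
    # Try every known term that E makes equal to `term`.
    for candidate in classes.members(term):
        if candidate[0] == pattern[0] and len(candidate) == len(pattern):
            yield from ematch_args(classes, pattern[1:], candidate[1:], subst)

def ematch_args(classes, patterns, terms, subst):
    if not patterns:
        yield subst
        return
    for s in ematch(classes, patterns[0], terms[0], subst):
        yield from ematch_args(classes, patterns[1:], terms[1:], s)

a, two = ("a",), ("2",)
known = [("P", a), ("R", two)]          # ground terms of the negated goal
E = [(a, ("f", two))]                   # the ground equality a = f(2)
classes = Classes(known, E)
print(list(ematch(classes, ("P", ("f", "x")), ("P", a), {})))
```

Dropping the equation from E leaves the class of a with a single member, and the same call then yields no substitution, which is the purely syntactic behaviour.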
Ground Terms
Known ground terms are extracted from literals assumed or implied by the SAT solver
Instantiation-based mechanisms are strongly impacted by the number and the relevance of known ground terms:
- more ground terms, more instances of lemmas
- irrelevant ground terms, irrelevant instances
Ground Terms and Linear CNF
The shape of formulas to be proved, and in particular the conversion process used to produce a CNF, has a strong impact on the number of known ground terms
Consider for instance the following formula
A ∨ (B ∧ C)
When A is assumed to be true, terms of A become known and the rest of the (terms of the) formula (B ∧ C) can be ignored
However, because of the shape of the CNF conversion
(A ∨ X) ∧ (X ⇔ (B ∧ C))
the SMT solver will assign a value to X (even when A is true) and terms from B and C will be considered as known terms :-(
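A brute-force check, treating A, B, C and X as plain Booleans, illustrates why the definitional variable X drags B and C in. The expansion of X ⇔ (B ∧ C) into three clauses is the standard one; the function names are assumptions made for the example.

```python
from itertools import product

def original(a, b, c):
    return a or (b and c)

def cnf(a, b, c, x):
    # (A ∨ X) ∧ (X ⇔ (B ∧ C)), with the equivalence expanded into three clauses
    return (a or x) and (not x or b) and (not x or c) and (x or not b or not c)

BOOLS = [False, True]
models_original = {m for m in product(BOOLS, repeat=3) if original(*m)}
models_cnf = {m for m in product(BOOLS, repeat=4) if cnf(*m)}

# Forgetting X gives back exactly the models of A ∨ (B ∧ C).
assert {m[:3] for m in models_cnf} == models_original
# In every model, even those where A is true, the value of X is tied to B and C.
assert all(x == (b and c) for (a, b, c, x) in models_cnf)
```

Since X is never free, a solver working on the CNF cannot settle X without also committing to literals of B or C, even in the branch where A alone would have sufficed.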
EXTRA MATERIAL
First-Order Logic : Signature and Terms
- A signature Σ is a finite set of function and predicate symbols with an arity
- Constants are just function symbols of arity 0
- We assume that Σ contains the binary predicate =
- We assume a set V of variables, distinct from Σ
- T(Σ,V) is the set of terms, i.e. the smallest set which contains V and such that f(t1, ..., tn) ∈ T(Σ,V) whenever t1, ..., tn ∈ T(Σ,V) and f ∈ Σ is a function symbol of arity n
- T(Σ, ∅) is the set of ground terms
- Terms are just trees. Given a term t and a position π in the tree, we write t_π for the sub-term of t at position π. We also write t[π ↦ t′] for the replacement of the sub-term of t at position π by the term t′
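Sub-term access and replacement can be sketched as follows. The encoding is an assumption: terms are nested tuples (symbol, arg1, ..., argn), variables are strings, and a position is a tuple of 1-based argument indices, the empty tuple denoting the root. The names subterm and replace are hypothetical.

```python
def subterm(t, position):
    """Return the sub-term of t at the given position."""
    for i in position:
        t = t[i]          # t[0] is the symbol, so argument i sits at index i
    return t

def replace(t, position, new):
    """Return t with the sub-term at the given position replaced by new."""
    if not position:
        return new
    i = position[0]
    return t[:i] + (replace(t[i], position[1:], new),) + t[i + 1:]

# t = f(g(x), a)
t = ("f", ("g", "x"), ("a",))
assert subterm(t, (1, 1)) == "x"
assert replace(t, (1, 1), ("b",)) == ("f", ("g", ("b",)), ("a",))
```

Because tuples are immutable, replace builds a new term and leaves t unchanged.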
First-Order Logic : Formulas
- An atomic formula is P(t1, ..., tn), where t1, ..., tn are terms in T(Σ,V) and P is a predicate symbol of Σ
- Literals are atomic formulas or their negations
- Formulas are inductively constructed from atomic formulas with the help of Boolean connectives and quantifiers ∀ and ∃
- Ground formulas contain only ground terms
- A variable is free if it is not bound by a quantifier
- A sentence is a formula with no free variables
First-Order Logic : Models
A model M for a signature Σ is defined by
- a domain D_M
- an interpretation f_M for each function symbol f ∈ Σ
- a subset P_M of D_M^n for each predicate P ∈ Σ of arity n
- an assignment M(x) for each variable x ∈ V
The cardinality of model M is the cardinality of D_M
First-Order Logic : Semantics
Interpretation of terms:
M[x] = M(x)
M[f(t1, ..., tn)] = f_M(M[t1], ..., M[tn])
Interpretation of formulas:
M |= t1 = t2 iff M[t1] = M[t2]
M |= P(t1, ..., tn) iff (M[t1], ..., M[tn]) ∈ P_M
M |= ¬F iff M ⊭ F
M |= F1 ∧ F2 iff M |= F1 and M |= F2
M |= F1 ∨ F2 iff M |= F1 or M |= F2
M |= ∀x.F iff M{x ↦ v} |= F for all v ∈ D_M
M |= ∃x.F iff M{x ↦ v} |= F for some v ∈ D_M
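For a finite domain, these clauses translate almost line by line into an evaluator. The sketch below rests on assumptions made for the example: a model is a Python dict with keys "domain", "funcs", "preds" and "assign", terms are nested tuples with string variables, and formulas are tagged tuples such as ("forall", x, F). The names eval_term, holds and update are hypothetical.

```python
def eval_term(M, t):
    """M[x] = M(x) and M[f(t1..tn)] = f_M(M[t1], ..., M[tn])."""
    if isinstance(t, str):
        return M["assign"][t]
    return M["funcs"][t[0]](*(eval_term(M, arg) for arg in t[1:]))

def update(M, x, v):
    """The model M{x -> v}: same as M except that x is assigned v."""
    return {**M, "assign": {**M["assign"], x: v}}

def holds(M, F):
    """Decide M |= F by structural recursion on F (finite domains only)."""
    tag = F[0]
    if tag == "eq":
        return eval_term(M, F[1]) == eval_term(M, F[2])
    if tag == "pred":
        return tuple(eval_term(M, t) for t in F[2:]) in M["preds"][F[1]]
    if tag == "not":
        return not holds(M, F[1])
    if tag == "and":
        return holds(M, F[1]) and holds(M, F[2])
    if tag == "or":
        return holds(M, F[1]) or holds(M, F[2])
    if tag in ("forall", "exists"):
        quantifier = all if tag == "forall" else any
        return quantifier(holds(update(M, F[1], v), F[2]) for v in M["domain"])
    raise ValueError(f"unknown formula tag: {tag}")

# A three-element model: successor modulo 3, the constant zero, and the usual order.
M = {
    "domain": {0, 1, 2},
    "funcs": {"zero": lambda: 0, "succ": lambda n: (n + 1) % 3},
    "preds": {"le": {(a, b) for a in range(3) for b in range(3) if a <= b}},
    "assign": {},
}
assert holds(M, ("forall", "x", ("exists", "y", ("eq", "y", ("succ", "x")))))
assert not holds(M, ("forall", "x", ("pred", "le", "x", ("zero",))))
```

Enumerating the domain only works because D_M is finite here; the definition itself places no such restriction on models.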
First-Order Logic : Validity
- A formula F is satisfiable if there is a model M such that M |= F, otherwise F is unsatisfiable
- A formula F is valid if ¬F is unsatisfiable
First-Order Logic : Theories
A first-order theory T over a signature Σ is a set of sentences
A theory is consistent if it has at least one model
A formula F is satisfiable in T (or T-satisfiable) if there exists a model M for T ∧ F, written M |=_T F
A formula F is T-valid, denoted |=_T F, if ¬F is T-unsatisfiable
Decision Procedures
A decision procedure is an algorithm used to determine whether a formula F in a theory T is satisfiable
Many decision procedures work on conjunctions of (ground) literals