Deciding Equality with Uninterpreted Functions Using Congruence Closure, Constantinos Bartzis
TRANSCRIPT
The problem
Determine the validity of formulas like
f(a,b)=a ⇒ f(f(a,b),b)=a
or, equivalently, the unsatisfiability of
f(a,b)=a ∧ f(f(a,b),b)≠a
In 1954 Ackermann showed that the problem is decidable.
In 1976 Nelson and Oppen implemented an O(m^2) algorithm based on congruence closure computation.
The algorithm of Nelson and Oppen
G. Nelson and D. Oppen, “Fast Decision Procedures Based on Congruence Closure”, JACM 1980
Quantifier Free Theory of Equality with Uninterpreted Function Symbols
Also known as QFEUF. The language consists of:
  Variables a, b, c, … (also called constants)
  Uninterpreted function symbols f, g, …
  The predicate "="
  Boolean connectives
Example theorem
a=b ⇒ f(a)=f(b)
Congruence Closure
Let G=(V, E) be a directed graph with n labeled vertices and m edges.
For a vertex v, let λ(v) denote its label and δ(v) denote its outdegree.
Let v[i] denote the ith successor of v.
Let R be a relation on V.
Two vertices u and v are congruent under R if λ(u)=λ(v), δ(u)=δ(v), and for all 1≤i≤δ(u), (u[i], v[i])∈R.
Congruence Closure
R is closed under congruences if, for all vertices u and v that are congruent under R, (u, v)∈R.
The congruence closure of R is the unique minimal extension R’ of R that is an equivalence relation and is closed under congruences.
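Example
[Figure: graph of the term f(f(a, b), b). v1 is labeled f with successors v2 and v4; v2 is labeled f with successors v3 and v4; v3 is labeled a; v4 is labeled b.]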
Let R={(v4 , v4), (v2, v3)}.
Then v1 and v2 are congruent under R.
The equivalence relation associated with the partition {{v1, v2, v3}, {v4}} is closed under congruences.
Note that the nodes represent the (sub)terms of the formula: f(a,b)=a ⇒ f(f(a, b), b)=a
Deducing that v1 is equivalent to v3 is analogous to deducing f(f(a, b), b)=a from f(a,b)=a
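Example 2
[Figure: a chain of six vertices v1 → v2 → v3 → v4 → v5 → v6. v1 through v5 are labeled f, each with a single successor; v6 is labeled a.]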
Let R={(v1 , v6), (v3, v6)}.
Then v2 and v5 are congruent under R, so the congruence closure contains (v2, v5).
It must then also include (v1, v4), because the successors of v1 and v4 are v2 and v5.
From transitivity, v4 is equivalent to v6
Now v3 is congruent to v5
All six vertices are equivalent in the congruence closure.
We proved that f(f(f(a)))=a ∧ f(f(f(f(f(a)))))=a ⇒ f(a)=a
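The definition can be checked mechanically on Example 2. The sketch below is a naive fixpoint computation, not the MERGE procedure of Nelson and Oppen: it keeps merging classes until no congruent pair of vertices lies in different classes. The names `label`, `succ`, and `congruence_closure` are illustrative assumptions, and vertices are written as the integers 1 to 6.

```python
from itertools import combinations

# Example 2 graph: v1..v5 are labeled f with one successor each, v6 is labeled a.
label = {1: "f", 2: "f", 3: "f", 4: "f", 5: "f", 6: "a"}
succ = {1: [2], 2: [3], 3: [4], 4: [5], 5: [6], 6: []}

def congruence_closure(label, succ, pairs):
    """Naive fixpoint: merge classes until no congruent pair is left unmerged."""
    cls = {v: v for v in label}          # cls[v] = name of the class of v

    def union(u, v):
        old, new = cls[u], cls[v]
        for w in cls:                    # relabel every member of the old class
            if cls[w] == old:
                cls[w] = new

    def congruent(u, v):
        # Same label, same outdegree, and pairwise equivalent successors.
        return (label[u] == label[v]
                and len(succ[u]) == len(succ[v])
                and all(cls[x] == cls[y] for x, y in zip(succ[u], succ[v])))

    for u, v in pairs:                   # start from the given relation R
        union(u, v)
    changed = True
    while changed:
        changed = False
        for u, v in combinations(label, 2):
            if cls[u] != cls[v] and congruent(u, v):
                union(u, v)
                changed = True
    return cls

cls = congruence_closure(label, succ, [(1, 6), (3, 6)])
print(len(set(cls.values())))            # prints 1: all six vertices are equivalent
```

This quadratic scan is only meant to mirror the definition; it makes no attempt at the efficiency of the algorithm discussed in the talk.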
Computing the Congruence Closure
An equivalence relation is represented by its corresponding partition.
UNION(u, v) combines the equivalence classes of vertices u and v.
FIND(u) returns the unique name of the class of vertex u.
Given a relation R that is closed under congruences, the following procedure MERGE(u, v) constructs the congruence closure of R ∪ {(u,v)}.
Merge Procedure
MERGE(u, v)
1. If FIND(u) = FIND(v), then return
2. Let Pu and Pv be the sets of predecessors of all nodes in the classes of u and v, respectively
3. Call UNION(u, v)
4. For each (x, y) s.t. x∈Pu and y∈Pv, if FIND(x)≠FIND(y) but CONGRUENT(x, y)=TRUE, then MERGE(x, y)
CONGRUENT(u, v)
1. If λ(u)≠λ(v) or δ(u)≠δ(v), then return FALSE
2. For 1≤i≤δ(u), if FIND(u[i])≠FIND(v[i]), then return FALSE
3. Return TRUE
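A minimal Python sketch of MERGE and CONGRUENT follows, assuming a simple union-find without union by rank. The class name `CongruenceGraph` and its methods are hypothetical names chosen for illustration; the predecessor sets are recomputed by a linear scan, so this sketch does not reach the complexity bound quoted earlier.

```python
class CongruenceGraph:
    """Labeled directed graph plus a union-find partition of its vertices."""

    def __init__(self):
        self.label, self.succ, self.pred, self.parent = {}, {}, {}, {}

    def add_vertex(self, v, label, successors=()):
        self.label[v] = label
        self.succ[v] = list(successors)
        self.pred.setdefault(v, set())
        self.parent[v] = v
        for s in successors:
            self.pred.setdefault(s, set()).add(v)

    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]   # path halving
            v = self.parent[v]
        return v

    def union(self, u, v):
        self.parent[self.find(u)] = self.find(v)

    def class_predecessors(self, v):
        # Predecessors of every vertex in the class of v (step 2 of MERGE).
        root = self.find(v)
        return {p for w in self.parent if self.find(w) == root
                for p in self.pred[w]}

    def congruent(self, u, v):
        if (self.label[u] != self.label[v]
                or len(self.succ[u]) != len(self.succ[v])):
            return False
        return all(self.find(x) == self.find(y)
                   for x, y in zip(self.succ[u], self.succ[v]))

    def merge(self, u, v):
        if self.find(u) == self.find(v):
            return
        pu, pv = self.class_predecessors(u), self.class_predecessors(v)
        self.union(u, v)
        for x in pu:
            for y in pv:
                if self.find(x) != self.find(y) and self.congruent(x, y):
                    self.merge(x, y)


# The first example: v1 = f(v2, v4), v2 = f(v3, v4), v3 = a, v4 = b.
g = CongruenceGraph()
g.add_vertex("v3", "a")
g.add_vertex("v4", "b")
g.add_vertex("v2", "f", ["v3", "v4"])
g.add_vertex("v1", "f", ["v2", "v4"])
g.merge("v2", "v3")
print(g.find("v1") == g.find("v3"))   # True: v1 joins the class of v2 and v3
```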
Deciding QFEUF
It suffices to check conjunctive formulas
t1=u1 ∧ … ∧ tp=up ∧ r1≠s1 ∧ … ∧ rq≠sq
Algorithm:
1. Construct the corresponding graph G with a node n(x) for each (sub)term x in the formula
2. Let R be the identity relation on the nodes of G
3. For 1≤i≤p, call MERGE(n(ti), n(ui))
4. For 1≤i≤q, if n(ri) is equivalent to n(si), then return UNSATISFIABLE
5. Return SATISFIABLE
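The five steps can be sketched end to end in Python. This is an illustration under assumptions, not the original implementation: a term is either a string (a constant) or a tuple `(function_symbol, arg, ...)`, so identical subterms share one node automatically, and the function names `subterms` and `decide` are hypothetical.

```python
def subterms(t, acc):
    """Collect t and all of its subterms into the set acc."""
    acc.add(t)
    if isinstance(t, tuple):
        for arg in t[1:]:
            subterms(arg, acc)
    return acc

def decide(equalities, disequalities):
    """Return True iff the conjunction of the (dis)equalities is satisfiable."""
    nodes = set()                                  # step 1: one node per subterm
    for s, t in list(equalities) + list(disequalities):
        subterms(s, nodes)
        subterms(t, nodes)
    parent = {n: n for n in nodes}                 # step 2: identity relation

    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n

    def congruent(x, y):
        return (isinstance(x, tuple) and isinstance(y, tuple)
                and x[0] == y[0] and len(x) == len(y)
                and all(find(a) == find(b) for a, b in zip(x[1:], y[1:])))

    def preds(n):
        # Terms having some argument in the class of n.
        root = find(n)
        return [p for p in nodes if isinstance(p, tuple)
                and any(find(a) == root for a in p[1:])]

    def merge(u, v):
        if find(u) == find(v):
            return
        pu, pv = preds(u), preds(v)
        parent[find(u)] = find(v)
        for x in pu:
            for y in pv:
                if find(x) != find(y) and congruent(x, y):
                    merge(x, y)

    for s, t in equalities:                        # step 3
        merge(s, t)
    # Steps 4 and 5: unsatisfiable iff some disequality joins equivalent terms.
    return all(find(r) != find(s) for r, s in disequalities)


fab = ("f", "a", "b")
ffab = ("f", fab, "b")
# f(a,b)=a together with f(f(a,b),b)≠a is unsatisfiable.
print(decide([(fab, "a")], [(ffab, "a")]))         # False
```

Returning `False` here corresponds to UNSATISFIABLE, which in turn means the implication f(a,b)=a ⇒ f(f(a,b),b)=a is valid.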
Theory solvers for DPLL(T)
Need to be incremental, i.e., at each step receive one new constraint and determine satisfiability of the partial conjunction, doing only the necessary extra work.
  Nelson and Oppen’s algorithm is incremental.
Need to provide explanations for unsatisfiability.
  Nieuwenhuis and Oliveras address this issue.
Initial transformations
Curryfy
  Use only one “apply” function symbol f.
  E.g. g(a) becomes f(g,a), and g(a, h(b), b) becomes f(f(f(g, a), f(h, b)), b).
  Only linear growth in size.
Flatten
  Replace subterms with new variables.
  E.g. f(f(f(g, a), f(h, b)), b) = b becomes {f(g,a)=c, f(h,b)=d, f(c,d)=e, f(e,b)=b}.
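Both transformations are short enough to sketch. The code below assumes terms are strings (constants) or tuples `(function_symbol, arg, ...)`, that the apply symbol `"f"` does not clash with an existing symbol, and that the caller supplies an iterator of fresh constant names; `curry` and `flatten` are illustrative names.

```python
def curry(term):
    """Rewrite g(t1, ..., tn) as nested binary applications ("f", left, right)."""
    if isinstance(term, str):
        return term
    head, *args = term
    result = head                       # the function symbol becomes a constant
    for arg in args:
        result = ("f", result, curry(arg))
    return result

def flatten(lhs, rhs, fresh):
    """Flatten the curried equation lhs = rhs, where rhs is a constant."""
    equations = []

    def name(term):
        # Return a constant standing for term, emitting its defining equation.
        if isinstance(term, str):
            return term
        _, left, right = term
        l, r = name(left), name(right)
        c = next(fresh)
        equations.append((("f", l, r), c))
        return c

    _, left, right = lhs
    equations.append((("f", name(left), name(right)), rhs))
    return equations


curried = curry(("g", "a", ("h", "b"), "b"))
print(curried)
# ('f', ('f', ('f', 'g', 'a'), ('f', 'h', 'b')), 'b')
for equation in flatten(curried, "b", iter(["c", "d", "e"])):
    print(equation)
```

Each pair `(("f", x, y), z)` in the result stands for the flat equation f(x,y)=z, so the four pairs printed correspond to the set given in the example.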
Data structures
1. Pending: a list of input equations a=b or pairs of input equations (f(a1,a2)=a, f(b1,b2)=b) where ai and bi are already congruent
2. Representative table: array that stores the class representative of each variable
3. Class lists: store all members of each class
4. Use lists: Uselist(a) contains all f(b1,b2)=b s.t. a is the representative of b1 or b2
5. Lookup table: Lookup(b,c) is some f(a1,a2)=a s.t. b and c are the representatives of a1 and a2, respectively
Producing explanations
Explain(e,e’): If a sequence U of unions of pairs (e1,e1’)…(ep,ep’) has taken place, it returns a minimal subset E of U such that (e,e’) belongs to the equivalence relation generated by E.
Example: After this sequence of unions
(1,8),(7,2),(3,13),(7,1),(6,7),(9,5),(9,3),
(14,11),(10,4),(12,9),(4,11),(10,7)
a call to Explain(1,4) returns the explanation {(7,1),(10,7),(10,4)}
The returned set is unique
A new data structure
We want Explain() to run in O(k) time, where k is the size of the proof.
The graph whose edges are the pairs in the sequence of unions is a forest.
Explain(e,e’) consists of the edges on the path between e and e’.
This path is easy to find if edges are directed towards the root.
At each Union(e, e’), where |tree(e)|<|tree(e’)|:
  Reverse all edges on the path between e and its root.
  Add an edge e→e’.
Example
After this sequence of unions
(1,8),(7,2),(3,13),(7,1),(6,7),(9,5),(9,3),
(14,11),(10,4),(12,9),(4,11),(10,7)
The proof forest could be
[Figure: proof forest with two trees, one over the nodes 1, 2, 4, 6, 7, 8, 10, 11, 14 and one over the nodes 3, 5, 9, 12, 13.]
The explanation of (1,4) consists of this path
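The proof forest and its edge reversal can be sketched as follows. This is an illustration under assumptions: the class name `ProofForest` and its fields are hypothetical, ties in tree size are broken arbitrarily, and `explain` walks both nodes all the way to the root, so it does not meet the O(k) bound stated above.

```python
class ProofForest:
    def __init__(self):
        self.parent = {}   # proof-forest edge: node -> parent node
        self.edge = {}     # node -> the union pair recorded on that edge
        self.size = {}     # root -> number of nodes in its tree (default 1)

    def _root_path(self, e):
        path = [e]
        while path[-1] in self.parent:
            path.append(self.parent[path[-1]])
        return path

    def union(self, a, b):
        pair = (a, b)
        path_a, path_b = self._root_path(a), self._root_path(b)
        if path_a[-1] == path_b[-1]:
            return                               # already in the same tree
        size_a = self.size.get(path_a[-1], 1)
        size_b = self.size.get(path_b[-1], 1)
        if size_a > size_b:                      # let a be in the smaller tree
            a, b, path_a, path_b = b, a, path_b, path_a
        # Reverse every edge on the path from a to its old root.
        old_edges = [self.edge[n] for n in path_a[:-1]]
        for child, par, label in zip(path_a, path_a[1:], old_edges):
            self.parent[par] = child
            self.edge[par] = label
        self.parent[a] = b                       # add the new edge a -> b
        self.edge[a] = pair
        self.size.pop(path_a[-1], None)
        self.size[path_b[-1]] = size_a + size_b

    def explain(self, a, b):
        path_a, path_b = self._root_path(a), self._root_path(b)
        if path_a[-1] != path_b[-1]:
            return None                          # a and b are not equivalent
        on_a = set(path_a)
        common = next(n for n in path_b if n in on_a)   # nearest common ancestor
        result = []
        for path in (path_a, path_b):
            for n in path:
                if n == common:
                    break
                result.append(self.edge[n])
        return result


forest = ProofForest()
for pair in [(1, 8), (7, 2), (3, 13), (7, 1), (6, 7), (9, 5), (9, 3),
             (14, 11), (10, 4), (12, 9), (4, 11), (10, 7)]:
    forest.union(*pair)
print(sorted(forest.explain(1, 4)))   # [(7, 1), (10, 4), (10, 7)]
```

Because the undirected forest is fixed by the union sequence, the set of edges returned for (1,4) does not depend on how size ties are broken; only the edge directions do.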
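Implementation of Explain
Input equations, numbered 1 to 6:
1. f(g,h)=d
2. c=d
3. f(g,d)=a
4. e=c
5. e=b
6. b=h
[Figure: proof forest on the chain a, d, c, e, b, h. The edge between a and d is labeled 1,3; the edges between d and c, c and e, e and b, and b and h are labeled 2, 4, 5 and 6.]
On an Explain(a,b) operation, the equations on the paths from a to d and from b to d are output.
From 1 and 3, we need to recursively call Explain(h, d).
To maintain complexity, make sure no node is visited twice.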
Good explanations
Shorter and older explanations are good because they lead to better backtracking.
Finding the shortest explanation is NP-hard; however, we can still reduce the size of an explanation by removing redundancies.
Example:
a1=b1, a1=c1, f(a1,a1)=a, f(b1,b1)=b, f(c1,c1)=c
Explain(a=c) will return all five equations, but the first and fourth are redundant.
[Figure: proof forest over the nodes a, b, c and b1, a1, c1]
Minimizing explanations
Post-process explanations to remove redundant steps.
Theorem: A proof is redundant only if it contains three equations of the form f(a1,a2)=a, f(b1,b2)=b, f(c1,c2)=c, where ai, bi and ci are equivalent.
Presence of such equations is checked in O(k).
Algorithm: While not all equations are marked as necessary, pick an unmarked one. Remove it if the remaining explanation is still correct. Otherwise, mark it as necessary.
Complexity: O(k^2 log k)
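The greedy loop can be sketched in Python on the five-equation example from the previous slide. The correctness check used here is a naive congruence closure over flat equations, an assumption made for brevity, so this sketch does not achieve the complexity stated above. A flat equation f(x,y)=z is written as the pair `(("f", x, y), z)`, and `entails` and `minimize` are hypothetical names.

```python
def entails(equations, goal):
    """Check whether the flat equations imply goal (naive congruence closure)."""
    terms = {t for eq in equations for t in eq} | set(goal)
    terms |= {a for t in terms if isinstance(t, tuple) for a in t[1:]}
    rep = {t: t for t in terms}

    def find(t):
        while rep[t] != t:
            t = rep[t]
        return t

    for s, t in equations:
        rep[find(s)] = find(t)
    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:
        changed = False
        for x in apps:
            for y in apps:
                if (find(x) != find(y) and x[0] == y[0] and len(x) == len(y)
                        and all(find(a) == find(b)
                                for a, b in zip(x[1:], y[1:]))):
                    rep[find(x)] = find(y)       # congruent applications
                    changed = True
    return find(goal[0]) == find(goal[1])

def minimize(explanation, goal):
    """Drop each equation whose removal keeps the explanation correct."""
    necessary = []
    remaining = list(explanation)
    while remaining:
        eq = remaining.pop(0)                    # pick an unmarked equation
        if not entails(necessary + remaining, goal):
            necessary.append(eq)                 # removal breaks the proof
    return necessary


explanation = [
    ("a1", "b1"),
    ("a1", "c1"),
    (("f", "a1", "a1"), "a"),
    (("f", "b1", "b1"), "b"),
    (("f", "c1", "c1"), "c"),
]
for eq in minimize(explanation, ("a", "c")):
    print(eq)
```

On this input the first and fourth equations are dropped, leaving a1=c1, f(a1,a1)=a and f(c1,c1)=c.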