1 a tutorial on parallel and distributed model checking orna grumberg computer science department...
TRANSCRIPT
![Page 1: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/1.jpg)
1
A Tutorial on
Parallel and Distributed Model Checking
Orna Grumberg
Computer Science Department
Technion, Israel
![Page 2: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/2.jpg)
2
Aspects of parallelism• Why to parallelize – gain memory or
time– For model checking: usually memory
• Special purpose hardware or network of workstations – Networks of workstations
• Distributed or shared memory– Distributed memory with message passing
![Page 3: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/3.jpg)
3
Parallel and distributed algorithms were developed for
• Explicit state methods- reachability and model construction- LTL model checking- model checking for alternation-free -calculus
• BDD-based methods- reachability and generation of counter example- model checking for full -calculus
• Operations on BDDs• SAT solvers• Timed and probabilistic model checking
![Page 4: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/4.jpg)
4
Elements of distributed algorithms
• Partitioning the work among the processes• Dynamic or static load balance to maintain
balanced use of memory• Maintaining a good proportion between
computation at each process and communication
• Distributed or centralized termination detection
![Page 5: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/5.jpg)
5
Reachability analysis(BDD- based)
![Page 6: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/6.jpg)
6
References
“A scalable parallel algorithm for reachability analysis of very large circuits”,
Heyman, Geist, Grumberg, Schuster, (CAV’00)
Also:• Cabodi, Camurati, and Quer, 1999• Narayan, Jain, Isles, Brayton, and Sangiovanni-
Vincentalli, 1997
![Page 7: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/7.jpg)
7
Reachability analysis
Goal:
Given a system (program or circuit) to compute the set of reachable states from the set of initial states
Commonly done by Depth First Search (DFS) or Breadth First search (BFS)
![Page 8: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/8.jpg)
8
Symbolic (BDD-based) model checking
• BDD – a data structure for representing Boolean functions that is often concise in space
• BDDs are particularly suitable for representing and manipulating sets
• Symbolic model checking algorithms – hold the transition relation and the computed sets of
states as BDDs
– Apply set operations in order to compute the model checking steps
![Page 9: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/9.jpg)
9
Sequential reachability by BFS
Reachable = new = InitialStates
While (new ) {
next = successors(new)
new = next \ reachable
reachable = reachable new
}
![Page 10: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/10.jpg)
10
The distributed algorithm
• The state space is partitioned into slices• Each slice is owned by one process• Each process runs BFS on its slice• When non-owned states are discovered
they are sent to the process that owns them
Goal: reducing space requirements (and possibly time)
![Page 11: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/11.jpg)
11
The distributed algorithm (cont.)The initial sequential stage• BFS is performed by a single process until some
threshold is reached• The state space is sliced into k slices.
Each slice is represented by a window function.• Each process is informed of:
- The set of window functions W1,…,Wk
- Its own slice of reachable - Its own slice of new
![Page 12: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/12.jpg)
12
Elements of distributed symbolic algorithm
• Slicing algorithm that partitions the set of states among the processes
• Load balance algorithm that keeps these sets similar in size during the execution
• Distributed termination detection
• Compact BDD representation that can be transferred between processes and allows different variable orders
![Page 13: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/13.jpg)
13
v1
v3
v2
0
0 1
1
v3
v2
v1
0
0
v1
v3
v2
1
1
v2
f fv1 fv1
that causes BDD to be a compact representationWhen slicing a BDD we loose the sharing
Slicing:
![Page 14: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/14.jpg)
14
Slicing (cont.)• We choose a variable v and partition a Boolean
function f to fv=1 and fv=0.
• The chosen v has minimal cost that guarantees:– The size of each slice is below a threshold.
I.e., the partition is not trivial(no | f1| | f2 | | f | or | f2 | 0 )
– The duplication is kept as small as possible
• An adaptive cost function is used to keep the duplication as small as possible
![Page 15: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/15.jpg)
15
Load Balance
• The initial slicing distributes the memory requirements equally among the processes.
• As more states are discovered, the memory requirements might become unbalanced.
• Therefore, at the end of each step in the computation of the reachable states a load balance procedure is applied.
![Page 16: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/16.jpg)
16
Load Balance (Cont.)
• Process i with a small slice sends its slice to process j with a large slice.
• Process j applies the slicing procedure for k=2 and obtains two new balanced slices.
• Process j sends process i its new slice and informs all other processes of the change in windows.
![Page 17: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/17.jpg)
17
The parallel stage requires a coordinator for:
Pairing processes for exchanging non-owned states
• Processes notify the coordinator of the processes they want to communicate with.
• The coordinator pairs processes that need to communicate.
• Several pairs are allowed to communicate in parallel.
![Page 18: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/18.jpg)
18
Important: Data is transferred directly between the processes and not via the coordinator
![Page 19: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/19.jpg)
19
Coordinator can also be used for:
• Pairing processes for load balancing
• Distributed termination detection
![Page 20: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/20.jpg)
20
Experimental results
On 32 non-dedicated machines, running IBM RuleBase model checker:
• On examples for which reachability terminates with one process, adding more processes reduces memory (number of BDD nodes)
• On examples for which reachability explodes, more processes manage to compute more steps of the reachability algorithm
![Page 21: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/21.jpg)
21
• Communication was not a bottleneck
• Time requirements were better in some examples and worse in others– better – because BDDs are smaller– worse – overhead and lack of optimizations
for improving time
![Page 22: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/22.jpg)
22
Future work
• Improve slicing
• Exploit different orderings in different processes
• Adapt the algorithm to dynamic networks
• Adapt the algorithm for hundreds and thousands of parallel machines
![Page 23: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/23.jpg)
23
Model checking safety properties
• Checking AGp can be performed by – Computing the set of reachable states– For each state checking whether it satisfies p
• If a state which satisfies p is found, a counter example – a path leading to the error state - is produced
• Checking other safety properties can also be reduced to reachability
![Page 24: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/24.jpg)
24
Back to reachability(explicit state)
![Page 25: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/25.jpg)
25
ReferencesExplicit state reachability:• “Distributed-Memory Model Checking with SPIN”, Lerda
and Sisto, 1999Also:• Caselli, Conte, Marenzomi, 1995• Stern, Dill 1997• Garavel, Mateescu, Smarandache, 2001
LTL model checking:• “Distributed LTL Model-Checking in SPIN”, Barnat,
Brim, and Stribrna, 2001
![Page 26: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/26.jpg)
26
Sequential reachability
• States are kept in a hash table
• Reachability is done using a DFS algorithm
![Page 27: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/27.jpg)
27
Distributed Reachability
• The state space is statically partitioned
• When a process encounters a state that does not belong to it - the state is sent to the owner
• Received states are kept in a FIFO queue
• Verification ends when all processes are idle and all queues are empty
![Page 28: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/28.jpg)
28
Choosing the partition function
• Must depend only on the state• Should divide the state space evenly• Should minimize cross-transitions
• First solution - partition the space of the hash function cannot be implemented on a heterogeneous network even distribution, but not necessarily of the reachable
states does not minimize cross transitions
![Page 29: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/29.jpg)
29
A better partition function for asynchronous programs (like in SPIN)
A global state s consists of the local states si of each concurrent sub-program
• Choose a specific sub-program progi
• Define the partition function according to the value of the local state si of sub-program progi
Since a transition generally involves one or two sub-programs, this partitionminimizes cross-transitionsdistributes the state-space evenly
![Page 30: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/30.jpg)
30
LTL model checking with
Büchi automata(explicit state)
![Page 31: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/31.jpg)
31
LTL model checking withBüchi automata
• A Büchi automaton is a finite automaton on infinite words.
• An infinite word is accepted if the automaton, when running on this word, visits an accepting state infinitely often.
• Every LTL formula can be translated into a Büchi automaton that accepts exactly all infinite paths that satisfy the formula.
![Page 32: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/32.jpg)
32
Checking M |= for an LTL formula
In order to verify a property , an automaton A is built.
• A contains all behaviors that satisfy .
• M x A contains all the behaviors of M that do not satisfy .
• M |= iff M x A is empty.
![Page 33: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/33.jpg)
33
Checking for (non)emptiness
• Looking for a reachable loop that contains an accepting state
• Tarjan’s algorithm, O(|Q| + |T|)
![Page 34: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/34.jpg)
34
Nested DFS Algorithm• Two DFS searches are interleaved
– The first looks for an accepting state– The second looks for a cycle back to this state
• When the first DFS backtracks from an accepting state it starts the second (nested) DFS
![Page 35: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/35.jpg)
35
• The second DFS looks for a loop back to the accepting state
• When the second DFS is done (without success) the first DFS resumes
• Each DFS goes through every reachable state only once!
![Page 36: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/36.jpg)
36
Why nested-DFS won’t work in parallel
Relative speed determines if a cycle is found
S3
S4
S1 S2
![Page 37: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/37.jpg)
37
• The order matters
A nested DFS should start from s iff all accepting states below s have finished their nested DFS
S3
S1
S5S4
S2
Process A
Process B
![Page 38: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/38.jpg)
38
Inefficient solution
• Holding for each state the list of NDFSs it participated in– Requires too much space– Allows each state to be traversed more than
once for each of the two DFSs
![Page 39: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/39.jpg)
39
Main characteristics ofthe distributed algorithm
• Dependency graph, containing only accepting states and border states, is used to preserve limited amount of information
• Each process holds its own dependency graph• NDFS starts from a state only after all its
successors are search by DFS and NDFS• NDFS is not performed in parallel with another
NDFS
![Page 40: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/40.jpg)
40
Experimental results
• Preliminary
• 9 workstations interconnected by Ethernet
• Implemented within SPIN and compared to standard, sequential SPIN
• Could apply LTL model checking to larger problems
![Page 41: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/41.jpg)
41
Future work
• Improve the partition function
• Increase the level of parallelism by allowing NDFSs to work in parallel under certain conditions
![Page 42: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/42.jpg)
42
SAT-based model checking
![Page 43: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/43.jpg)
43
State explosion problem in model checking
• Systems are extremely large• State of the art symbolic model checking
can handle medium to small size systems effectively:a few hundreds Boolean variables
Other solutions for the state explosion problem are needed.
![Page 44: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/44.jpg)
44
SAT-based model checking
• Translates the model and the specification to a propositional formula
• Uses efficient tools for solving the satisfiability problem
Since the satisfiability problem is NP-complete, SAT solvers are based on heuristics.
![Page 45: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/45.jpg)
45
SAT tools
• Using heuristics, SAT tools can solve very large problems fast.
• They can handle systems with 1000 variables that create formulas with a few thousands of variables.
GRASP (Silva, Sakallah)
Prover (Stalmark)
Chaff (Malik)
![Page 46: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/46.jpg)
46
Bounded model checkingfor checking AGp
• Unwind the model for k levels, i.e., construct all computations of length k
• If a state satisfying p is encountered, then produce a counter example
The method is suitable for falsification, not verification
![Page 47: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/47.jpg)
47
Bounded model checking with SAT
• Construct a formula fM describing all possible computations of M of length k
• Construct a formula f expressing =EFp
• Check if f = fM f is satisfiable
If f is satisfiable then M | AGp
The satisfying assignment is a counterexample
![Page 48: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/48.jpg)
48
Bounded model checking
• Can handle LTL formulas, when interpreted over finite paths.
• Can be used for verification by choosing k which is large enough so that every path of length k contains a cycle.– We then need to identify cycles using
propositional formulas.– Using such k is often not practical due to the
size of the model.
![Page 49: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/49.jpg)
49
SAT Solvers
Main problem: time
Secondary problem: space
![Page 50: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/50.jpg)
50
References• “PSATO: a Distributed Propositional Prover and
Its Application to Quasigroup Problems”, Zhang, Bonancina and Hsiang, 1996
• “PaSAT – parallel SAT-checking with lemma exchange: implementation and applications”, Sinz, Blochinger, Kuchlin, 2001
• Also:• Bohm, Speckenmeyer 1994• Zhao, Malik, Moskewicz, Madigan, 2001
![Page 51: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/51.jpg)
51
Propositional formula inConjunctive Normal Form (CNF)
CNF consists of a conjunction of clauses.
A clause is a disjunction of literals.
A literal is either a proposition or a negation of a proposition.
(a e) (c b) (c d) (c)
![Page 52: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/52.jpg)
52
A Simple Davis-Putnam Algorithm
Function Satisfiable (set S) return boolean repeat /* unit propagation */ for each unit clause L in S do delete (L Q) from S /* unit subsumption */ delete L from ( L Q) in S /* unit resolution */ od If S is empty return TRUE else if null clause in S return FALSE until no further changes result
![Page 53: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/53.jpg)
53
Davis-Putnam Algorithm (Cont.)
/* case splitting */ choose a literal L occurring in S if Satisfiable ( S {L} ) return TRUE else if Satisfiable (S {L}) return TRUE else return FALSEend function
![Page 54: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/54.jpg)
54
(a e) (a e) (c b) (c d) (c)• Unit clause: c=0
(a e) (a e) ( b)• Unit clause: b=1
(a e) (a e) • Selecting splitting literal: a=0
(e) ( e) – conflict!• Create conflict clause: (c b a )• Backtracking and choosing a=1• Satisfying assignment: c=0, b=1, a=1
![Page 55: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/55.jpg)
55
Points of wisdom
• Clever choice of the splitting literal.
• Clever back-jumping on unsuccessful assignments.
• Remembering unsuccessful assignments as conflict clauses or lemmas.
![Page 56: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/56.jpg)
56
PSATOA distributed implementation of SAT on
network of workstations.• The goal is to exploit their under-used
computation power, especially after hours:parallelize and cumulate the work
• Dynamic load balance is needed since the computing power of each workstation is not known in advance (it may be shared with other programs).
![Page 57: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/57.jpg)
57
a
b c
1,B
0,B 1,N
d0,B 1,N
0,N
e
a <0,N> c <1,N>
a <0,N> c <0,N> d <0.N> e
a <0,N> c <0,N> d <1.N>
Partitioning the work
![Page 58: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/58.jpg)
58
The Master-Slave Modelof PSATO
• One master, many slaves• Communication only between master and slaves • Master sends jobs (S,P) to slaves
S – set of clauses, P – guiding path• Each slave runs Davis-Putnam according to P
• When a slave stops, it sends master– TRUE or FALSE, if job is finished– guiding path, if job is interrupted
![Page 59: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/59.jpg)
59
Balancing the workload
• If a slave returns TRUE, all slaves are stopped
• If it returns FALSE, the slave is assigned a new path.
• If time expires, the master sends halt signal to stop the current run and collects new paths
The new paths will be used in the next run
![Page 60: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/60.jpg)
60
Achievements • Accumulation of work: cumulates the
results of separate runs on the same problem
• Scalability: more workstations result in a faster solution
• Fault tolerance: minimal damage by failure of one workstation or network interruption
• No redundant work: processes explore disjoint portions of the search space
![Page 61: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/61.jpg)
61
Experimental results
• For random hard 3-SAT problems, the speedup on 20 machines was from 6 to 18.– Speedup is the ration between CPU time of the
sequential machine and the average time over the parallel machines.
• For open quasigroup problems they managed to solve a problem on 20 machine in 35 “working days” that would otherwise require 240 days of continuous run on a single machine.
![Page 62: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/62.jpg)
62
PaSAT
Can run on multi-processor computer and on a networked standard PCs
Implemented on shared memory with dynamic creation of threads
![Page 63: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/63.jpg)
63
PaSAT (Cont.)
• Uses guiding paths as in PSATO for partitioning the work and for balancing it
• Holds conflict clauses learned by all tasks in a shared memory– implemented so that it allows concurrent access without
synchronization
• Each task filters its conflict clauses and put only the “best” in the global store
• Periodically, each task integrates new clauses from the global store into its current set
![Page 64: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/64.jpg)
64
Experimental results
• On a machine with 4 processors, on satisfiable SAT problems: obtained a speedup (time on sequential/time of parallel) of up to 3.99 without exchange of conflict clauses and even higher with the exchange
![Page 65: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/65.jpg)
65
Future work
• Implement the ideas of PaSAT on distributed memory
• Extend the ideas for many machines working in parallel
![Page 66: 1 A Tutorial on Parallel and Distributed Model Checking Orna Grumberg Computer Science Department Technion, Israel](https://reader035.vdocuments.us/reader035/viewer/2022062515/56649cc55503460f9498e1af/html5/thumbnails/66.jpg)
66
THE END