Protocol Verification with Merci
Mark R. Tuttle and Amit Goel
DTS SCL
Introduction
• I love proof
  – Proof is the path to understanding why things work
  – But theorem provers are too hard for the masses (even me)
• I advocate model checking at Intel
  – It is the path to automated formal verification for the masses
  – But model checkers verify without explaining, and don’t scale
• But the world has changed
  – Decision procedures and SMT now automate some forms of proof
  – Is theorem proving now viable for nonspecialists in product groups?
Slide 2
Our result
• Amit wrote Merci: SMT-based proof checker from SCL
  – Systems modeled with guarded commands (like Murphi, TLA+)
  – Clean mapping to decision procedures of an SMT solver
• Mark validated a classical distributed algorithm
  – A novice: no prior exposure to Merci, little exposure to SMT
  – Model done in 3 days, proof done in 3 days, just 9 pages long
  – Model looks like ordinary code, invariants explain the algorithm
• Found little need to coach the prover about “obvious” things
Slide 3
Consensus
• Validity:
  – Each output was an input
• Agreement:
  – All outputs are equal
• Termination:
  – All nodes choose an output
[Figure: nodes n1, n2, n3 connected by message passing, with inputs 0, 1, 0 and outputs 1, 1, 1]
[Pease, Shostak, Lamport]
Slide 4
A shocking result!
• Consensus is impossible in an asynchronous system if even one node can fail.
  – Asynchronous: no bound on node step time, msg delivery time
  – Failure: node just stops (crashes)
• A decade of papers
  – Different system models, different failure models
  – How fast? How few messages? How many failures?
• Consensus is the “hardest problem” in concurrency!
  – but sometimes it can be solved…
[Fischer, Lynch, Paterson]
[Herlihy]
Slide 5
Synchronous model
Computation is a sequence of rounds of message passing.
[Figure: timeline for a node. In each round (round r, round r+1), nodes send messages, nodes receive messages, then nodes change state.]
Slide 6
Crash failures
At most t nodes can fail.
[Figure: three cases for a node n]
• n is correct: sends all messages
• n is silent: sends no messages
• n crashes: sends some messages
Slide 7
Algorithm
procedure consensus (node n)
  state ← { input }
  for each round r = 1, 2, …, t+1 do
    broadcast state to all nodes
    receive state1, state2, …, statek from other nodes
    state ← state1 ∪ state2 ∪ … ∪ statek
  output ← min(state)

Validity: each output was an input
Termination: all nodes choose an output at end of round t+1
Agreement: ???
[Dolev, Strong]
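The procedure above can be tried out on small examples. The following is a minimal Python sketch, not the Merci model from the talk. The function name `flood_consensus` and the `crash_plan` encoding (the round in which a node crashes, plus the recipients that still get its last message) are assumptions made for illustration, as is the choice to let a node keep its own state when taking the union.

```python
def flood_consensus(inputs, t, crash_plan=None):
    """Simulate t+1 rounds of flooding and return the outputs of nodes that never crash.

    inputs:     dict mapping node -> input value
    crash_plan: dict mapping node -> (round, recipients); hypothetical encoding in
                which the node crashes in that round after sending only to
                `recipients`, and is silent in every later round.
    """
    crash_plan = crash_plan or {}
    assert len(crash_plan) <= t, "at most t nodes may fail"
    state = {n: {v} for n, v in inputs.items()}
    for r in range(1, t + 2):                            # rounds 1 .. t+1
        # Assumption: each node also keeps what it already knows.
        received = {n: set(state[n]) for n in inputs}
        for sender in inputs:
            crash = crash_plan.get(sender)
            if crash and crash[0] < r:
                continue                                 # crashed earlier: sends nothing
            if crash and crash[0] == r:
                targets = crash[1]                       # crashing now: partial broadcast
            else:
                targets = inputs.keys()                  # correct: full broadcast
            for q in targets:
                received[q] |= state[sender]
        state = received
    # Only nodes that never crash are required to produce an output.
    return {n: min(state[n]) for n in inputs if n not in crash_plan}
```

For example, with inputs `{1: 0, 2: 1, 3: 1}`, `t=1`, and node 1 crashing in round 1 after reaching only node 2, nodes 2 and 3 disagree after round 1 but agree after round 2, which is the clean round.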
Slide 8
Clean round: no nodes fail
• There is a clean round in t+1 rounds (at most t failures).
• Nodes have same state after a clean round.
• Nodes choose same output value min(state). Agreement!
[Dwork, Moses]
[Figure: timeline of rounds with one round marked “Clean round!”]
Slide 9
Merci
• A typed procedural language
• Guarded commands used to describe systems

type node
var array(node, bool) y = mk_array[node](false)
var array(node, bool) critical = mk_array[node](false)
var node turn

transition unit req_critical (node n)
  require (!y[n])
  { y[n] := true; }

transition unit enter_critical (node n)
  require (y[n] && !critical[n] && turn=n)
  { critical[n] := true; }

transition unit exit_critical (node n)
  require (critical[n])
  { critical[n] := false; y[n] := false; nondet turn; }
[Amit Goel]
Merci
• A typed procedural language
• Guarded commands used to describe systems
• A goal description language for compositional reasoning

def bool mutex = (node n1, node n2) (critical[n1] && critical[n2] => n1=n2)
def bool aux = (node n) (critical[n] => turn=n)

goal g0 = invariant mutex assuming aux
goal g1 = invariant aux
[Amit Goel]
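To see why mutex is stated “assuming aux”, the mutual-exclusion example can be checked by brute force on a small instance. The sketch below is Python, not Merci, and it rests on an assumed reading of the goals: “invariant P assuming Q” is taken to mean that P holds in every initial state and is preserved by every transition taken from a state satisfying both P and Q. The bound `N = 3` and all function names are illustrative.

```python
from itertools import product

N = 3                      # assumption: a small fixed number of nodes
NODES = range(N)

def set_at(tup, i, value):
    """Return a copy of tuple `tup` with position i replaced by `value`."""
    return tup[:i] + (value,) + tup[i + 1:]

def successors(state):
    """Yield every state reachable in one guarded-command step."""
    y, critical, turn = state
    for n in NODES:
        if not y[n]:                                     # req_critical(n)
            yield (set_at(y, n, True), critical, turn)
        if y[n] and not critical[n] and turn == n:       # enter_critical(n)
            yield (y, set_at(critical, n, True), turn)
        if critical[n]:                                  # exit_critical(n)
            for new_turn in NODES:                       # nondet turn
                yield (set_at(y, n, False), set_at(critical, n, False), new_turn)

def mutex(state):
    return sum(state[1]) <= 1            # at most one node is critical

def aux(state):
    _, critical, turn = state
    return all(turn == n for n in NODES if critical[n])

def all_states():
    bools = list(product([False, True], repeat=N))
    return [(y, c, turn) for y in bools for c in bools for turn in NODES]

def initial_states():
    falses = (False,) * N
    return [(falses, falses, turn) for turn in NODES]    # turn is unconstrained

def is_inductive(prop, assuming=lambda s: True):
    """prop holds initially and is preserved from states satisfying prop and the assumption."""
    if not all(prop(s) for s in initial_states()):
        return False
    return all(prop(s2)
               for s in all_states() if prop(s) and assuming(s)
               for s2 in successors(s))
```

On this instance `is_inductive(aux)` and `is_inductive(mutex, assuming=aux)` both hold, while `is_inductive(mutex)` alone does not: a state where one node is critical but turn points at another requesting node satisfies mutex and still allows a second entry.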
Merci
• A typed procedural language
• Guarded commands used to describe systems
• A goal description language for compositional reasoning
• A template system for extending the language

template <type elem> Set {
  type t // set type
  const bool mem (elem x, t s)
  const t add (elem x, t s)
  const t remove (elem x, t s)

  axiom mem_add = (elem x, elem y, t s)
    (mem (x, add (y, s)) = (x = y || mem (x, s)))

  axiom mem_remove = (elem x, elem y, t s)
    (mem (x, remove(y, s)) = (x != y && mem(x, s)))
}

type node
module Node = Set<type node>
[Amit Goel]
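The two axioms of the Set template can be sanity-checked against an ordinary set implementation. The sketch below is Python and uses `frozenset` as a stand-in model for the abstract type `t`; the three-element universe and the helper names `mem_add_holds` and `mem_remove_holds` are assumptions for illustration.

```python
from itertools import product

# A concrete model of the template's operations, using frozenset for type t.
def mem(x, s):
    return x in s

def add(x, s):
    return s | {x}

def remove(x, s):
    return s - {x}

ELEMS = range(3)           # assumption: a tiny universe of elements
SETS = [frozenset(e for e, keep in zip(ELEMS, bits) if keep)
        for bits in product([False, True], repeat=len(ELEMS))]

def mem_add_holds():
    """axiom mem_add: mem(x, add(y, s)) = (x = y || mem(x, s))"""
    return all(mem(x, add(y, s)) == (x == y or mem(x, s))
               for x in ELEMS for y in ELEMS for s in SETS)

def mem_remove_holds():
    """axiom mem_remove: mem(x, remove(y, s)) = (x != y && mem(x, s))"""
    return all(mem(x, remove(y, s)) == (x != y and mem(x, s))
               for x in ELEMS for y in ELEMS for s in SETS)
```

Both checks pass over all 8 subsets, which shows that the axioms are satisfiable by a familiar model; it says nothing about how Merci or the SMT solver use them.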
Crash failure model
def bool is_crash_behavior (Nodes crashed, Nodes crashing, message_pattern deliver) =
  (node p) (p ∈ crashed => is_silent(p,deliver)) &&
  (node p) (is_faulty(p,deliver) => p ∈ crashed || p ∈ crashing) &&
  Nodes.disjoint(crashed,crashing) &&
  Nodes.cardinality(crashed) + Nodes.cardinality(crashing) ≤ t
[Slide callouts: faulty, silent]
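The crash-failure predicate can be read as an executable check on one round's delivery pattern. The sketch below is Python, not Merci. The slide does not define `is_silent` or `is_faulty`, so the definitions here are assumptions: `deliver[p][q]` is taken to say whether p's message to q is delivered, a silent node delivers to nobody, and a faulty node misses at least one recipient. The extra `nodes` and `t` parameters are also illustrative.

```python
def is_silent(p, deliver, nodes):
    # Assumed definition: p delivers no message at all this round.
    return not any(deliver[p][q] for q in nodes)

def is_faulty(p, deliver, nodes):
    # Assumed definition: p fails to deliver to at least one node this round.
    return not all(deliver[p][q] for q in nodes)

def is_crash_behavior(crashed, crashing, deliver, nodes, t):
    """Mirror of the four conjuncts on the slide, over Python sets."""
    return (all(is_silent(p, deliver, nodes) for p in crashed)
            and all(p in crashed or p in crashing
                    for p in nodes if is_faulty(p, deliver, nodes))
            and crashed.isdisjoint(crashing)
            and len(crashed) + len(crashing) <= t)
```

For instance, with three nodes where node 0 delivers nothing and node 1 misses one recipient, the pattern is a crash behavior for `crashed={0}`, `crashing={1}`, `t=2`, but not for `t=1`.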
Slide 13
Synchronous model
for each node p
  initialize state of p
for each round r
  for each p and q
    send msg from p to q
  for each p and q
    receive msg from p to q
  for each p
    update state of p

phase   program counter   algorithm
init    init[p]           how?
send    send[p][q]        what?
recv    recv[p][q]        how?
comp    comp[p]           how? decide? decide!
Slide 14
Synchronous model
• Transitions
  – initialize(p): init[p] ← true
  – start_send: phase ← send; increment round; send[q][p] ← false; recv[p][q] ← false; comp[p] ← false
  – send(p,q): send[p][q] ← true
  – start_recv: phase ← recv
  – recv(p,q): recv[p][q] ← true
  – start_comp: phase ← comp
  – comp(p): comp[p] ← true

is_init_phase = (phase = init)
init_phase_done = forall (node p) (init[p])
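The phase and program-counter bookkeeping can be sketched as a small guarded state machine. The Python class below is an illustration, not the Merci model: it tracks only the phase, the round number, and the per-node and per-pair “done” flags. The guard on start_send follows the talk; the guards on the other transitions (for example, that start_recv waits until every send is done) and the class name `SyncModel` are assumptions.

```python
class SyncModel:
    """Phase and program-counter bookkeeping only; no messages or algorithm state."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.phase = "init"
        self.round = 0
        self.init_done = {p: False for p in self.nodes}
        self._reset_round_flags()

    def _reset_round_flags(self):
        pairs = [(p, q) for p in self.nodes for q in self.nodes]
        self.send_done = {pq: False for pq in pairs}
        self.recv_done = {pq: False for pq in pairs}
        self.comp_done = {p: False for p in self.nodes}

    @staticmethod
    def _require(guard):
        if not guard:
            raise RuntimeError("transition not enabled")

    def initialize(self, p):
        self._require(self.phase == "init" and not self.init_done[p])
        self.init_done[p] = True

    def start_send(self):
        init_over = self.phase == "init" and all(self.init_done.values())
        comp_over = self.phase == "comp" and all(self.comp_done.values())
        self._require(init_over or comp_over)
        self._reset_round_flags()          # clear send, recv, comp flags
        self.round += 1
        self.phase = "send"

    def send(self, p, q):
        self._require(self.phase == "send" and not self.send_done[p, q])
        self.send_done[p, q] = True

    def start_recv(self):
        self._require(self.phase == "send" and all(self.send_done.values()))
        self.phase = "recv"

    def recv(self, p, q):
        self._require(self.phase == "recv" and not self.recv_done[p, q])
        self.recv_done[p, q] = True

    def start_comp(self):
        self._require(self.phase == "recv" and all(self.recv_done.values()))
        self.phase = "comp"

    def comp(self, p):
        self._require(self.phase == "comp" and not self.comp_done[p])
        self.comp_done[p] = True
```

Within a phase the individual transitions may fire in any order, which is the point of using per-node and per-pair flags instead of nested loops.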
Slide 15
transition start_sending ()
  require ( is_init_phase && init_phase_done ||
            is_comp_phase && comp_phase_done)
{
  "send[p][q], recv[p][q], comp[p] <= false"
  "message[p][q] <= null_message"

  round := round + 1;
  phase := send;

  crashed := Nodes.union(crashed,crashing);
  nondet crashing;
  nondet deliver;
  assume is_crash_behavior(crashed,crashing,deliver);
}
Slide 16
transition send (node n, node m)
  require (is_send_phase)
  require (!send[n][m])
{
  messages[n][m] := (deliver[n][m] ? global_state[n] : null_message);
  send[n][m] := true;
}

Transition size
initialize(p)   8 lines
start_send()   16 lines    send(p,q)   9 lines
start_recv()    5 lines    recv(p,q)   7 lines
start_comp()    5 lines    comp(p)    13 lines
Slide 17
Agreement proof
• Recall the agreement proof
  – A1: There is a clean round
  – A2: All states are equal at the end of a clean round
  – A3: All states remain equal after a clean round
  – A4: All nodes choose from their states the same output value
• Merci proof is short
  – A1: 7 lines
  – A2: 127 lines
  – A3: 12 lines
  – A4: 25 lines
• Merci proof is almost entirely at the algorithmic level
Slide 18
A1: There is a clean round
def bool clean_round_by_round_t_plus_1 =
  round >= t+1 => !before_clean

def bool faulty_grows_until_clean_round =
  before_clean => Nodes.cardinality(faulty) >= round

goal clean1 = invariant faulty_grows_until_clean_round
goal clean2 = invariant clean_round_by_round_t_plus_1
  assuming faulty_grows_until_clean_round
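The two invariants encode a counting argument: while no clean round has occurred, every round adds at least one faulty node, so at least `round` nodes are faulty; with at most t failures this cannot last through round t+1. The Python sketch below checks the underlying pigeonhole fact by brute force. It is not the Merci proof, and the encoding of a failure schedule as a tuple of crash rounds is an assumption for illustration.

```python
from itertools import product

def has_clean_round(crash_rounds, t):
    """crash_rounds: for each failing node, the round (1..t+1) in which it fails.

    A round is clean if no node fails in it.
    """
    return any(r not in crash_rounds for r in range(1, t + 2))

def clean_round_always_exists(t):
    """Check every schedule with at most t failures over rounds 1..t+1."""
    rounds = range(1, t + 2)
    return all(has_clean_round(schedule, t)
               for k in range(t + 1)                     # k failures, 0 <= k <= t
               for schedule in product(rounds, repeat=k))
```

The bound is tight in this encoding: with t+1 failures, one per round, no round is clean, which is why the algorithm needs t+1 rounds to tolerate t failures.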
Slide 19
A2: All states equal …
def bool state_equality =
  (node n, node m) (noncrashed(n) && noncrashed(m) => state[n] = state[m])

def bool state_equality_in_clean =
  in_clean && send_phase_done && recv_phase_done => state_equality

• Proof
  – A2.1: If nonfaulty n has v, then n received v in a message
  – A2.2: That message was sent to everyone since round is clean
  – A2.3: If m received v in a message, then m has v
  – A2.4: So nonfaulty n and m have the same values
• Proof algorithmic and short: 48, 34, 15, and 30 lines long
Slide 20
Conclusion
• Classical fault-tolerant distributed algorithm proved with Merci
  – Model looks like ordinary code, invariants explain the algorithm
  – Merci proof is 170 lines, classical proof is 1+ page
  – Model and proof done in 6 days with no prior experience
• Yices made quantification hard
  – exists: usually have to produce the example by hand
  – forall: template instantiation wouldn’t find the right instantiation
• Yices counterexamples mostly useless
  – Get a context from first few lines, ignore the rest
  – “Is property false or is Yices failing to instantiate a forall template?”
  – BKM (best known method): Think about the algorithm itself, and ignore Yices output
Slide 21