A Compositional Approach to Verifying Hierarchical Cache Coherence Protocols
Xiaofang Chen (1)
Yu Yang (1)
Ganesh Gopalakrishnan (1)
Ching-Tsun Chou (2)
(1) University of Utah, (2) Intel Corporation
* Supported in part by Intel SRC Customization Award 2005-TJ-1318
FMCAD 2006
Hierarchical Cache Coherence Protocols
Chip-level protocols
Inter-cluster protocols
Intra-cluster protocols
[Figure labels: dir, mem, dir, mem, …]
Verification Challenges
No public domain benchmarks
More complicated, with more corner cases
State space
Outline
Two hierarchical protocols
  Inclusive
  Non-inclusive
A compositional approach
  Abstraction
  Counter-example guided refinement
  Soundness
A Multicore Coherence Protocol
[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2, each with two L1 caches, an L2 Cache + Local Dir, and a RAC; also shown are the Global Dir and Main Memory]
Protocol Features
Both levels use MESI protocols
  Level-1: FLASH
  Level-2: DASH
Silent drop on non-Modified cache lines
Network channels are non-FIFO
Livelock Problem
[Message sequence among Dir, Agent1, and Agent2; line states shown: Invld, Invld, Excl]
1. Req_E
2. Grant_E
3. Silent-drop
4. Req_S
5. Fwd_Req
6. NACK
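The six steps can be replayed in a small Python sketch (the talk's models are written in Murphi; Python is used here only for illustration). The slide lists message names only, so the routing is an assumption: Agent1 is taken to be the requester of the exclusive copy and Agent2 the later sharer. The `Shrd` state and the `Data` reply are hypothetical names.

```python
def run_share_request(silent_drop=True, max_retries=5):
    """Replay the numbered steps and count how often Agent2 is NACKed.

    Hypothetical model: the directory keeps forwarding to the agent it
    believes is the exclusive owner, and is never told about a silent drop.
    """
    state = {"Agent1": "Invld", "Agent2": "Invld"}

    # Steps 1-2: Agent1 requests and receives the line in Excl.
    trace = ["Req_E", "Grant_E"]
    dir_owner = "Agent1"
    state["Agent1"] = "Excl"

    # Step 3: Agent1 drops the clean line without notifying the directory.
    if silent_drop:
        state["Agent1"] = "Invld"
        trace.append("Silent-drop")

    # Steps 4-6: Agent2 asks for a shared copy; the directory forwards the
    # request to the recorded owner, which NACKs if it no longer holds the line.
    retries = 0
    while retries < max_retries:
        trace += ["Req_S", "Fwd_Req"]
        if state[dir_owner] == "Excl":
            state[dir_owner] = "Shrd"
            state["Agent2"] = "Shrd"
            dir_owner = None
            trace.append("Data")  # hypothetical reply carrying the line
            break
        trace.append("NACK")
        retries += 1  # directory state is unchanged, so the retry fails the same way
    return state, dir_owner, retries, trace
```

With `silent_drop=True` every retry ends in a NACK no matter how large `max_retries` is, which is the livelock; with `silent_drop=False` the first request succeeds.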
Blocking WB + NACK_SD
[Figure: message sequence among Dir, A1, and A2. Message labels: Req_E, Gnt_E, Req_S, Modify, WB, Fwd_S, WB_Ack, NAck_SD, NAck. Line states shown: (I), (I), (E), (M), (I).]
Complexity of the Protocol
Multiplicative effect of four protocols running concurrently
Model checking failed after exploring 161,876,000 states
A Compositional Approach
[Figure: flow from the original protocol to the abstracted protocols (…), with steps labeled Abstraction and Constraining]
Non-Circular Assume/Guarantee
We cannot verify directly: h ‖ r1 ‖ r2 ⊨ Coh
Instead:
  Check-1: h ‖ R1 ‖ R2 ⊨ Coh1 ∧ Constraints1
  Check-2: H ‖ r1 ‖ R2 ⊨ Coh2 ∧ Constraints2
Verification Methodology
Abstraction
  Two abstracted protocols
Fixing real bugs in M
Refinement
Abstracted Protocol #1
[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2, plus Global Dir and Main Memory. One cluster keeps its RAC, L2 Cache + Local Dir, and two L1 caches; the other two are reduced to a RAC and an abstracted L2 Cache + Local Dir’]
Abstracted Protocol #2
[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2, plus Global Dir and Main Memory. One cluster keeps its RAC, L2 Cache + Local Dir, and two L1 caches; the other two are reduced to a RAC and an abstracted L2 Cache + Local Dir’]
Abstraction
States: projection
Transitions: overapproximation
Abstraction on States
Intra-cluster details
Inter-cluster details
Abstracting Transitions
Rule-based system: guard → action
  Relaxing guards
  Relaxing expression values
  Removing statements
Original rule:
Procs[p].WbMsg.Cmd = WB_Wb
→
Procs[p].L2.Data := Procs[p].WbMsg.Data;
Procs[p].L2.HeadPtr := L2; …
Abstracted rule:
true → Procs[p].L2.Data := d; …
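The relaxation of the writeback rule can be illustrated with a small Python sketch (a stand-in for the Murphi rules, not the talk's code). The two-value data domain, the command names other than WB_Wb, the tuple layout of the state, and all function names are assumptions; HeadPtr and the elided statements are left out.

```python
from itertools import product

DATA = (0, 1)                  # hypothetical data domain
CMDS = ("WB_None", "WB_Wb")    # hypothetical writeback commands

def concrete_step(s):
    """Original rule: WbMsg.Cmd = WB_Wb -> L2.Data := WbMsg.Data.

    A state is (l2_data, wb_cmd, wb_data); returns the set of successors.
    Message cleanup is elided, as in the slide.
    """
    l2_data, wb_cmd, wb_data = s
    if wb_cmd == "WB_Wb":
        return {(wb_data, wb_cmd, wb_data)}
    return set()

def project(s):
    """State abstraction: drop the intra-cluster message, keep only L2.Data."""
    return s[0]

def abstract_step(a):
    """Abstracted rule: true -> L2.Data := d, for an arbitrary d."""
    return set(DATA)

def is_overapproximation():
    """Every concrete transition must have a matching abstract transition."""
    for s in product(DATA, CMDS, DATA):
        for t in concrete_step(s):
            if project(t) not in abstract_step(project(s)):
                return False
    return True
```

Because the guard becomes `true` and the written value becomes arbitrary, the abstracted rule admits every behavior of the original one and more; the extra behaviors are what later show up as false alarms.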
Detecting Bugs in M
When a real error is found in Mi
Fix bug in M
Regenerate Mi’s
Iterate the process
Refinement
When a bogus error is found in Mi
Analyze and find the problematic rule
g → a
Locate the original rule in M
G → A
Add a new lemma in one abstracted protocol
G => P
Strengthen the rule into
g ∧ P → a
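One refinement step can be sketched in Python as follows. This is only an illustration under assumptions: states are plain dictionaries, the field names `wb_cmd` and `l2_state` are hypothetical, and the list of states stands in for whatever the model checker would enumerate in the abstracted protocol that still contains the original rule.

```python
def strengthen(g, p):
    """Build the strengthened guard g AND p."""
    return lambda s: g(s) and p(s)

def lemma_holds(states, big_g, p):
    """Check the lemma G => P on every supplied state."""
    return all(p(s) for s in states if big_g(s))

def relaxed_guard(s):
    # g: the guard after abstraction (relaxed to true)
    return True

def original_guard(s):
    # G: the guard of the original rule in M (hypothetical field name)
    return s["wb_cmd"] == "WB"

def lemma_p(s):
    # P: candidate lemma over variables visible in both protocols
    return s["l2_state"] == "Excl"

# Hypothetical states of the abstracted protocol that keeps the original rule.
states_of_other_protocol = [
    {"wb_cmd": "WB", "l2_state": "Excl"},
    {"wb_cmd": "None", "l2_state": "Invld"},
]

refined_guard = strengthen(relaxed_guard, lemma_p)
```

The strengthened guard is only justified if `lemma_holds` succeeds where the original rule is still present; if the lemma fails there, the candidate P has to be revised.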
Details of Refinement (I)
1. False alarm found in M1: remote cluster 1 can modify its L2 line arbitrarily
true → …
Details of Refinement (II)
2. Locate the original rule in M before abstraction
Guard: when the local dir receives a WB from an L1 cache
Procs[p].WbMsg.Cmd = WB
→ …
Details of Refinement (III)
3. Strengthen the problematic rule in M1: only when the local dir is exclusive can L2 modify its line
true &
Procs[p].L2.State = Excl
→ …
Details of Refinement (IV)
4. Why is the strengthening sound?
Details of Refinement (V)
4. We can add a new lemma in M2:
Procs[p].WbMsg.Cmd = WB
=>
Procs[p].L2.State = Excl
One Detail
[Figure: Home Cluster (labeled Excl: 1), Remote Cluster 1, and Remote Cluster 2; line states shown: Excl, Excl, Invld, Invld; numbered messages:]
1. Req_E
2. Req_E
3. Fwd_ReqE
4. Fwd_ReqE
5. Gnt_E
Original Transitions (I)
GUniMsg[src].Cmd = RDX_RAC &
GUniMsg[src].Cluster = r &
Procs[r].L2.Gblock_WB = false &
Procs[r].L2.State = Excl &
Procs[r].L2.HeadPtr != L2
…
undefine GUniMsg[src];
GUniMsg[src].Cmd := GUNI_None;
Original Transitions (II)
Procs[r].ShWbMsg.Cmd = SHWB_FAck &
src_node = L2
…
true &
ABSProcs[r].L2.State = Excl &
ABSProcs[r].RAC.State = Inval &
ABSProcs[r].L2.Gblock_WB = false &
GUniMsg[src].Cmd = RDX_RAC &
GUniMsg[src].Cluster = p
…
Adding A Variable
[Figure: Home Cluster (labeled Excl: 1), Remote Cluster 1, and Remote Cluster 2; line states shown: Excl, Excl, Invld, Invld; messages numbered 1 to 5]
ifKeepMsg: boolean
Soundness of the Approach
Goal: if M1 and M2 can be model checked correct w.r.t. the coherence property Φ of M, then M must also be correct w.r.t. Φ
Soundness Proof
Temporal induction
  Initial states
    Each var has the same value in M, M1 and M2
    Each newly added lemma is checked in M1 and M2
    Each property is checked
  Suppose soundness holds in state s
Soundness Proof (II)
M:  (h1, h2, r11, r12, r21, r22)  --[ g → a ]-->  (h1’, h2’, r11’, r12’, r21’, r22’)
M1: (h1, h2, r12, r22)  --[ g1 & p1 → a1 ]-->  (h1’, h2’, r12’, r22’)
M2: (h1, r11, r12, r22)  --[ g2 & p2 → a2 ]-->  (h2’, r11’, r12’, r22’)
Experiment Results
A real bug found
10 iterations of refinement
  The size of each error trace is < 12
One person-day of work
Reduction
Protocol   Number of states
M          > 161,876,000
M1         31,919,219
M2         78,689,678
64-bit Murphi
IA-64 with 20GB of memory
Caching Hierarchy
Inclusive
Exclusive
Non-inclusive
A Non-Inclusive Hierarchical Protocol
[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2, each with two L1 caches, an L2 Cache + Local Dir, and a RAC; also shown are the Global Dir and Main Memory]
Protocol Differences
Broadcasting channels
[Figure: a cluster with a RAC, an L2 Cache + Local Dir, two L1 caches, and the broadcast channels SnoopMsg[]]
Imprecise Local Directory
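[Figure: message sequence among L1-1, L1-2, LDir, and GDir. Message labels: Req_S, Swap, Broadcast, Fwd_Req, NAck, Gnt_S, Gnt_S. Line states shown: (S), (I). Local directory entries shown: "S: L1-1" and "S: L1-2", the latter marked "Imprecision!"]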
Verification Difficulty
Coherence properties
  Can involve multiple L1 caches
Refinement
  Noninterference lemmas cannot infer L2 cache line states from local behaviors
An Example
[Figure: two cases side by side, each with L1 caches in Excl and Invld and a WB sent to L2. First case: L2 is shown as (Excl, data1) and (Excl, data2). Second case: L2 is shown as (Invld, *) and (Excl, data2).]
Two Approaches of Refinement
Inferring “exclusive” from
  Outside the cluster
  Inside the cluster
Infer exclusive From Outside
[Figure: Cluster p, with cache states Invld, Excl, Invld and a WB; L2 is shown as (Invld, *) and (Excl, data2)]
IsExcl(p) ≡
Dir.State = Excl &
GUniMsg[p].Cmd != (ACK || IACK || ImACK) &
GUniMsg[h].Cmd != (ACK || IACK || ImACK) &
GWbMsg.Cmd = GWB_None &
( (GShWbMsg.Cmd = GSHWB_None &
Dir.Headptr = p) ||
(GShWbMsg.Cmd = DXFER &
GShWbMsg.Cluster = p))
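The IsExcl predicate can be written out as an executable Python check, shown here only as a sketch. Several points are assumptions: `h` is taken to be the home cluster, `!= (ACK || IACK || ImACK)` is read as "is none of these three commands", and the `GlobalState` record with its field names is a hypothetical stand-in for the Murphi variables.

```python
from dataclasses import dataclass, field

ACKS = {"ACK", "IACK", "ImACK"}

@dataclass
class GlobalState:
    """Hypothetical snapshot of the inter-cluster variables used by IsExcl."""
    dir_state: str = "Invld"
    dir_headptr: str = ""
    guni_cmd: dict = field(default_factory=dict)  # cluster -> GUniMsg[cluster].Cmd
    gwb_cmd: str = "GWB_None"
    gshwb_cmd: str = "GSHWB_None"
    gshwb_cluster: str = ""

def is_excl(g, p, h):
    """Return True when cluster p can be inferred exclusive from outside."""
    # Neither p nor h (assumed: the home cluster) has an acknowledgement in flight.
    no_acks = (g.guni_cmd.get(p, "GUNI_None") not in ACKS
               and g.guni_cmd.get(h, "GUNI_None") not in ACKS)
    # Either the directory already points at p, or ownership is being transferred to p.
    owner_is_p = ((g.gshwb_cmd == "GSHWB_None" and g.dir_headptr == p)
                  or (g.gshwb_cmd == "DXFER" and g.gshwb_cluster == p))
    return (g.dir_state == "Excl" and no_acks
            and g.gwb_cmd == "GWB_None" and owner_is_p)
```

The predicate uses only global (inter-cluster) variables, so it can be evaluated in a model where cluster p's internals have been abstracted away.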
Refinement Example
[Figure: Cluster p, with cache states Invld, Excl, Invld and a WB; L2 is shown as (Invld & IsExcl(p), *) and (Excl, data2)]
p.WbMsg.Cmd = WB
=>
IsExcl(p)
Infer exclusive From Inside
M1 M2
Definition of IE
IE(p):
exists i: L1_caches
(p.L1(i).state = Excl or
p.SnoopMsg(i).Cmd = (Put or PutX) or
p.UniMsg(i).Cmd = PutX) or
p.WbMsg.Cmd = WB or
p.ShWbMsg.Cmd = ShWb or
p.ShWbMsg.Cmd = FAck
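As a sketch, the IE definition can be evaluated in Python over a dictionary-based cluster. The representation, the default command value "None", and the helper `make_cluster` are assumptions; the existential quantifier is read as covering only the parenthesized per-L1 conditions, with the WbMsg and ShWbMsg checks applying to the cluster as a whole.

```python
def make_cluster(n_l1=2):
    """Hypothetical cluster with n_l1 L1 caches, all idle and invalid."""
    return {
        "L1": ["Invld"] * n_l1,
        "SnoopMsg": ["None"] * n_l1,
        "UniMsg": ["None"] * n_l1,
        "WbMsg": "None",
        "ShWbMsg": "None",
    }

def ie(cluster):
    """Return True when exclusiveness can be inferred from inside the cluster."""
    # Some L1 cache holds, or is about to receive, the line.
    some_l1 = any(
        cluster["L1"][i] == "Excl"
        or cluster["SnoopMsg"][i] in ("Put", "PutX")
        or cluster["UniMsg"][i] == "PutX"
        for i in range(len(cluster["L1"]))
    )
    # Or a writeback / sharing writeback is in flight to the local directory.
    return (some_l1
            or cluster["WbMsg"] == "WB"
            or cluster["ShWbMsg"] in ("ShWb", "FAck"))
```

Unlike the check from outside the cluster, this one reads only intra-cluster variables.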
Refinement
[Figure: Cluster p, with cache states Invld, Excl, Invld and a WB; L2 is shown as (Invld & IE(p), *) and (Excl, data2)]
Procs[p].WbMsg.Cmd = WB &
Procs[p].L2.State = Invld
=>
IE(p)
Soundness
Still holds by adding the extra bits “IE”
Experiment Results
17 iterations of refinement
Size of each error trace is < 8
Protocol   Number of states
M          > 1,521,900,000
M1         234,478,105
M2         283,124,383
Conclusion
Developed 2-level hierarchical protocols
Proposed a compositional approach
  Abstraction
  Bug fixing
  Refinement
Proved the soundness
Related Work
FMCAD’04: Chou et al., A simple method for parameterized verification of cache coherence protocols
CHARME’99: McMillan, Verification of infinite state systems by compositional model checking
For Details
http://www.cs.utah.edu/formal_verification/
About the Bug
IACK
Another Decomposing Approach
Split protocols hierarchically
  Intra-cluster protocol
  Inter-cluster protocol
Intra-cluster Protocol
[Figure: a single cluster (RAC, L2 Cache + Local Dir, two L1 caches) interacting with its Environment]
Inter-cluster Protocol
[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2, each reduced to a RAC and an abstracted L2 Cache + Local Dir’, plus Global Dir and Main Memory]
Verification Difficulty
[Figure: the full three-cluster system (Home Cluster, Remote Cluster 1, Remote Cluster 2, each with a RAC, an L2 Cache + Local Dir, and two L1 caches, plus Global Dir and Main Memory), with a part labeled Environment]
An Example Scenario
[Figure: Home Cluster (labeled Excl: 1), Remote Cluster 1, and Remote Cluster 2; line states shown: Excl, Excl, Invld, Invld; numbered messages:]
1. Req_E
2. Req_E
3. Fwd_ReqE
4. Swap
5. Req_E
6. Fwd_ReqE
7. NACK