Verification of Hierarchical Cache Coherence Protocols for Future Processors
Student: Xiaofang Chen
Advisor: Ganesh Gopalakrishnan
2
Outline
Background
Proposed solutions
– High level hierarchical coherence protocol verification
– Refinement check: specifications vs. RTL implementations
Conclusion
3
Hierarchical Cache Coherence Protocols
Chip-level protocols
Inter-cluster protocols
Intra-cluster protocols
[Figure: repeated dir/mem units arranged in a hierarchy]
4
Modeling and Verification of Coherence Protocols
High-level modeling approaches
– Model checking
Low-level modeling: RTL or VHDL
– Simulation
5
Problems with Hierarchical Coherence Protocols
For high-level modeling
– Handle the complexity of hierarchical protocols
For RTL implementations
– Verify that an RTL implementation correctly implements the specification
6
Example: Verification Complexity (I)
[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2. Each cluster has two L1 Caches, an L2 Cache + Local Dir, and a RAC. A Global Dir and Main Mem are also shown.]
7
Example: Verification Complexity (II)
Tool: Murphi Verification
– IA-64 machine
– 18GB memory
– 40-bit hash compaction
– Non-conclusive after >30 hours of state enumeration
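To make the cost of state enumeration concrete, here is a minimal Python sketch of explicit-state breadth-first search with hash compaction. It is not Murphi's implementation: the function names, the use of SHA-256, and the toy two-counter system are all assumptions made for illustration. Only the idea of storing a 40-bit signature per visited state comes from the slide.

```python
import hashlib
from collections import deque

def compact(state, bits=40):
    """Return a `bits`-bit signature of a state (hash compaction)."""
    digest = hashlib.sha256(repr(state).encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def enumerate_states(initial, successors, invariant, bits=40):
    """Breadth-first enumeration that remembers signatures, not full states."""
    seen = {compact(initial, bits)}
    queue = deque([initial])
    explored = 0
    while queue:
        state = queue.popleft()
        explored += 1
        if not invariant(state):
            return explored, state           # state that violates the invariant
        for nxt in successors(state):
            sig = compact(nxt, bits)
            if sig not in seen:               # a hash collision would wrongly prune nxt
                seen.add(sig)
                queue.append(nxt)
    return explored, None

# Hypothetical toy system: two counters modulo 4 that step independently.
def toy_successors(state):
    a, b = state
    return [((a + 1) % 4, b), (a, (b + 1) % 4)]

explored, bad = enumerate_states((0, 0), toy_successors, lambda s: True)
```

The toy system has 4 x 4 reachable states. A hierarchical protocol multiplies the state of every cache, directory, and network channel in the same way, which is why the full model does not finish.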
8
Differences in Modeling: Specs vs. Impls
[Figure: message flow among home, client, buf, and local cache, with steps labeled 1 and 1.1 through 1.5]
One step in high-level
Multiple steps in low-level
9
Differences in Execution: Specs vs. Impls
[Figure: steps 1, 2, and 3 with their sub-steps 1.1 to 1.3, 2.1 to 2.2, and 3.1 to 3.3]
Interleaving in HL
Concurrency in LL
10
Proposed Mechanisms
For high-level modeling, develop
– A few M-CMP coherence protocols
– A compositional approach
For specifications vs. implementations, develop
– A formal theory
– A compositional approach
– A practical tool
11
Thesis Timeline
Timeline marks: 2005, 2006, 2007, 2008, Present
Starting practices
– Predicate abstraction for Murphi
– Bounded transaction based testing
– Chen et al. UUCS-06-002, UUCS-06-003
Hierarchical protocols verification
– Abstraction + assume guarantee; inclusive M-CMP protocols (Chen et al. FMCAD 2006)
– Improved approach: one level at a time; automated abstraction; non-inclusive M-CMP protocols (Chen et al. HLDVT 2007)
Transaction based refinement check
– Complete case study for a benchmark (Chen et al. TECHCON 2007, best session paper in verification)
– Refinement theory; modular refinement check (Chen et al. FMCAD 2007)
Extensions: refinement check
– Make muv a practical tool
12
Outline
Background
Proposed solutions
– High level hierarchical coherence protocol verification
– Refinement check: specifications vs. RTL implementations
Conclusion
13
An M-CMP Benchmark Protocol
[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2. Each cluster has two L1 Caches, an L2 Cache + Local Dir, and a RAC. A Global Dir and Main Mem are also shown. The protocol among clusters is labeled Inter-cluster; the protocol within a cluster is labeled Intra-cluster.]
14
Protocol Features
Both levels use MESI protocols
– Intra-cluster: FLASH
– Inter-cluster: DASH
Silent drop on non-Modified cache lines
Network channels are non-FIFO
Inclusive caches
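The MESI and silent-drop features can be illustrated with a small Python sketch. This is not the benchmark's Murphi model: the names `Line`, `coherent`, and `evict`, and the per-line sharer set, are hypothetical. It only shows the usual MESI safety condition and why a silent drop leaves the directory with stale sharer information.

```python
from enum import Enum

class Line(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"

def coherent(lines):
    """MESI safety: an M or E copy excludes every other valid copy."""
    owners = [l for l in lines if l in (Line.M, Line.E)]
    valid = [l for l in lines if l is not Line.I]
    return len(owners) == 0 or len(valid) == 1

def evict(caches, sharers, idx):
    """Evict the line held by cache idx; return True if a writeback is sent."""
    needs_writeback = caches[idx] is Line.M
    caches[idx] = Line.I
    if needs_writeback:
        sharers.discard(idx)    # the directory learns of the eviction via the writeback
    # Otherwise the drop is silent: the directory still lists idx as a sharer.
    return needs_writeback

caches = [Line.S, Line.S]
sharers = {0, 1}
sent = evict(caches, sharers, 0)
```

After the silent drop, `sharers` still contains cache 0, so the directory may later send it an invalidation for a line it no longer holds. Combined with non-FIFO channels, such cases add many corner states to the model.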
15
Another Benchmark: Non-inclusive Caches
[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2. Each cluster has two L1 Caches, an L2 Cache + Local Dir, and a RAC. A Global Dir and Main Mem are also shown.]
16
Our Compositional Approach
Original protocol
17
Our Compositional Approach
18
One Way to Decompose Protocols
Create three abstract protocols
Each with 1 detailed cluster + 2 abstracted clusters
19
Abstract Protocol #1
[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2 with Global Dir and Main Mem. One cluster is kept in detail (RAC, L2 Cache + Local Dir, two L1 Caches); the other two are abstracted to RAC and L2 Cache + Local Dir’.]
20
Abstract Protocol #2
[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2 with Global Dir and Main Mem. One cluster is kept in detail (RAC, L2 Cache + Local Dir, two L1 Caches); the other two are abstracted to RAC and L2 Cache + Local Dir’.]
21
Problems with This Approach
Every abstract protocol contains 2 protocols
Duplicated behaviors in abstract protocols
State space still large

Model   # of states    Time (hour)   Mem (GB)
M1      284,088,425    12            18
M2      636,613,051    18            18
22
Second Way to Decompose Protocols
[Figure: three abstract protocols, ABS #1, ABS #2, and ABS #3. One contains all three clusters in abstracted form (RAC and L2 Cache + Local Dir’ each) together with Global Dir and Main Mem. The other two each contain a single cluster in detail (L2 Cache + Local Dir with two L1 Caches), labeled Home Cluster and Remote Cluster 1.]
23
Model Checking Results
Approach             Model          # of states      Model check time (sec)   Mem used (GB)   Model check passed
Classical approach   Full model     > 438,120,000    > 125,410                18              Nonconclusive
First approach       Abs. model 1   284,088,425      44,978                   18              Yes
First approach       Abs. model 2   636,613,051      66,249                   18              Yes
Second approach      Abs. model 1   1,500,621        270                      1.8             Yes
Second approach      Abs. model 2   574,198          50                       1.8             Yes
Second approach      Abs. model 3   198,162          21                       1.8             Yes
24
Details of Our Approach
Abstraction
– States
– Transitions, properties
Constraining
– Assume guarantee reasoning
25
Abstraction on States
Intra-cluster
Inter-cluster
26
State Representation
[Figure: a detailed cluster (RAC, L2 Cache + Local Dir, two L1 Caches) and its abstracted forms]
Original cluster: L1s, Network, L2, Local Dir, RAC
Abstract clusters:
– L1s, Network, L2, Local Dir
– L2, Local Dir’, RAC
27
Abstracting Transitions and Properties
Rule: guard ==> action
guard
– Become more permissive
action
– Allow more behaviors
28
An Example of Abstraction
[Figure: a detailed cluster (RAC, L2 Cache + Local Dir, two L1 Caches) receiving a WB message, and its abstracted form (RAC, L2 Cache + Local Dir’); labels: Abstract inter-cluster protocol, Abstract intra-cluster protocol]
Original rule:
Clusters[c].WbMsg.Cmd = WB ==>
Clusters[c].L2.Data := Clusters[c].WbMsg.Data;
Clusters[c].L2.HeadPtr := L2; …
Abstracted rule:
True ==>
Clusters[c].L2.Data := nondet; …
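The effect of this abstraction can be checked on a toy model. The Python sketch below is an illustration only: it uses a 1-bit data domain, hypothetical dictionary keys (`wb_cmd`, `wb_data`, `l2_data`), and omits the HeadPtr update. It tests that every step of the original writeback rule is also a step of the abstracted rule once the dropped variables are projected away, which is what "more permissive guard, more behaviors" means.

```python
from itertools import product

DATA = (0, 1)   # assumed 1-bit data domain

def concrete_rule(state):
    """Guard: a WB message is pending. Action: copy its data into L2."""
    if state["wb_cmd"] == "WB":
        yield {**state, "l2_data": state["wb_data"], "wb_cmd": "None"}

def abstract_rule(state):
    """Guard weakened to True; L2 data is chosen nondeterministically."""
    for d in DATA:
        yield {"l2_data": d}

def project(state):
    """Keep only the variables that the abstract model still tracks."""
    return {"l2_data": state["l2_data"]}

def over_approximates():
    """Every concrete step must be matched by some abstract step."""
    for cmd, wb, l2 in product(("WB", "None"), DATA, DATA):
        s = {"wb_cmd": cmd, "wb_data": wb, "l2_data": l2}
        abstract_next = list(abstract_rule(project(s)))
        for t in concrete_rule(s):
            if project(t) not in abstract_next:
                return False
    return True

result = over_approximates()
```

Because the abstract rule admits every concrete behavior, a safety property proved on the abstract model carries over to the original. The price is that the abstract model may also exhibit behaviors the original never has.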
29
Abstraction, Now Constraining
30
An Example of Constraining
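[Figure: a detailed cluster (RAC, L2 Cache + Local Dir, two L1 Caches) receiving a WB message, and its abstracted form (RAC, L2 Cache + Local Dir’)]
Original guard:
Clusters[c].WbMsg.Cmd = WB
Constraint:
Clusters[c].L2.State = Excl
Constrained abstract rule:
True & Clusters[c].L2.State = Excl ==>
Clusters[c].L2.Data := nondet; …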
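A minimal Python sketch of the constraining step, under one reading of the slide: the abstract rule's guard is strengthened with `L2.State = Excl`, and the detailed model is in turn obliged to show that a pending WB implies that state. The dictionary keys, the 1-bit data domain, and the function names are hypothetical; the slide does not spell out how the guarantee is discharged.

```python
def constrained_rule(state, data_values=(0, 1)):
    """Abstract WB rule whose guard is strengthened by the assumed constraint."""
    if state["l2_state"] == "Excl":           # True & L2.State = Excl
        for d in data_values:                  # L2.Data := nondet
            yield {**state, "l2_data": d}

def guarantee_holds(detailed_states):
    """Obligation on the detailed model: a pending WB implies L2.State = Excl."""
    return all(s["l2_state"] == "Excl"
               for s in detailed_states if s["wb_cmd"] == "WB")

# Hypothetical reachable states of a detailed cluster.
reachable = [
    {"wb_cmd": "WB", "l2_state": "Excl"},
    {"wb_cmd": "None", "l2_state": "Shrd"},
]
assumption_justified = guarantee_holds(reachable)
```

The constraint removes abstract behaviors that the detailed cluster could never produce, such as a writeback arriving while L2 is in a shared state, so fewer false alarms survive the abstraction.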
31
Non-inclusive Protocols: History Variables
[Figure: the second decomposition applied to the non-inclusive protocol. One abstract protocol contains all three clusters in abstracted form (RAC and L2 Cache + Local Dir’ each) with Global Dir and Main Mem; the other two each contain a single cluster in detail (L2 Cache + Local Dir with two L1 Caches), labeled Home Cluster and Remote Cluster 1.]
32
Experimental Results
Approach             Model          # of states      Model check time (sec)   Mem used (GB)   Model check passed
Classical approach   Full model     > 473,260,000    > 161,398                18              Nonconclusive
Second approach      Abs. model 1   4,070,484        770                      1.8             Yes
Second approach      Abs. model 2   2,424,719        250                      1.8             Yes
Second approach      Abs. model 3   2,424,719        248                      1.8             Yes
33
Outline
Background
Proposed solutions
– High level hierarchical coherence protocol verification
– Refinement check: specifications vs. RTL implementations
Conclusion
34
Our Approach
Use a hardware language
– Hardware Murphi
Develop a formal theory of refinement check
Develop a compositional approach
– Abstraction
– Assume guarantee
Develop a practical tool
35
Hardware Murphi
Murphi extension by S. German and G. Janssen
A concurrent shared variable language
– On each cycle
• Multiple transitions execute concurrently
• Exclusive write to a variable
• Shared reads to variables
• Write immediately visible within the same transition
• Write visible to other transitions on the next cycle
Support transactions, signals, etc.
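The per-cycle semantics listed above can be mimicked in a few lines of Python. This is a sketch, not Hardware Murphi itself: `step_cycle`, the `(name, guard, action)` tuples, and the convention that an action returns the set of variables it wrote are all assumptions made for the example.

```python
def step_cycle(state, transitions):
    """Run every enabled transition in one cycle against the same old state."""
    pending = {}
    for name, guard, action in transitions:
        if not guard(state):
            continue
        local = dict(state)          # writes are visible inside this transition only
        written = action(local)      # mutates local, returns the names it wrote
        for var in written:
            if var in pending:       # exclusive write: one writer per variable per cycle
                raise ValueError(f"write-write conflict on {var!r} in {name}")
            pending[var] = local[var]
    return {**state, **pending}      # other transitions see the writes next cycle

def copy_b_to_a(local):
    local["a"] = local["b"]
    return {"a"}

def copy_a_to_b(local):
    local["b"] = local["a"]
    return {"b"}

def always(state):
    return True

result = step_cycle({"a": 1, "b": 2},
                    [("t1", always, copy_b_to_a), ("t2", always, copy_a_to_b)])
```

Both transitions read the values from the start of the cycle, so the two variables swap. Under an interleaving semantics, as in a high-level Murphi model, the same two rules would leave both variables equal.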
36
Transaction
Group multiple steps in impl
Transaction Rule-1 … Rule-6 … End;
[Figure: six steps, numbered 1 to 6, grouped into one transaction]
37
Workflow of Our Refinement Check
[Figure: workflow boxes: Murphi spec model, Hardware Murphi impl model, product model in Hardware Murphi, Muv, product model in VHDL, property check]
Check that the low-level model correctly implements the high-level model
38
Full List of Assertions for Refinement Check
1. Serializability for specifications
2. No write-write conflicts
3. Initial states containment
4. Write set variables containment
5. Enableness for specifications
6. Joint variables match at the end of transactions
39
An Example
Impl transaction:
Transaction
Rule-1
guard1 ==> action1;
Rule-2
guard2 ==> action2;
Rule-3
guard3 ==> action3;
End;
Spec rule:
Rule
spec_guard ==> spec_action;
40
An Example (Cont’d)
Transaction
Rule-1
guard1 ==> action1; assert spec_guard; spec_action;
Rule-2
guard2 ==> action2;
Rule-3
guard3 ==> action3;
End;
assert impl_var1 = spec_var1;
assert impl_var2 = spec_var2; …
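The instrumented transaction above can be sketched as a small Python checker. It is a simplification, not the tool's algorithm: the function `check_transaction`, the toy request/reply/fill rules, and the variable names are hypothetical, and the sketch assumes each impl rule is enabled in turn. It covers only two of the listed assertions: the spec rule is enabled when it is fired, and the joint variables match at the end of the transaction.

```python
def check_transaction(impl, spec, impl_rules, spec_rule, joint_vars, commit_at=0):
    """Run the impl rules in order, fire the spec rule once, compare joint variables."""
    spec_guard, spec_action = spec_rule
    for i, (guard, action) in enumerate(impl_rules):
        assert guard(impl), f"impl rule {i + 1} is not enabled"
        action(impl)
        if i == commit_at:
            assert spec_guard(spec), "spec rule is not enabled"
            spec_action(spec)
    for impl_var, spec_var in joint_vars:
        assert impl[impl_var] == spec[spec_var], f"{impl_var} != {spec_var}"
    return True

# Spec: one atomic step fills the cache from memory.
def spec_fill(s):
    s["cache"] = s["mem"]
    s["req"] = False

# Impl: the same effect takes three steps through a buffer.
def send(s):
    s["req"] = False
    s["buf"] = "REQ"

def reply(s):
    s["buf"] = s["mem"]

def fill(s):
    s["cache"] = s["buf"]
    s["buf"] = None

spec_rule = (lambda s: s["req"], spec_fill)
impl_rules = [
    (lambda s: s["req"], send),
    (lambda s: s["buf"] == "REQ", reply),
    (lambda s: s["buf"] is not None, fill),
]
impl = {"req": True, "mem": 7, "cache": None, "buf": None}
spec = {"req": True, "mem": 7, "cache": None}
ok = check_transaction(impl, spec, impl_rules, spec_rule,
                       [("cache", "cache"), ("mem", "mem")])
```

If one impl step wrote the wrong value into the cache, the final comparison would fail even though no coherence property mentions that step.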
41
Driving Benchmark
[Figure: Local, Home, and Remote units with Dir, Cache, and Mem, connected through Buf units and a Router]
S. German and G. Janssen, IBM Research Tech Report 2006
42
Bugs Found with Refinement Check
Benchmark satisfies cache coherence already
Bugs still found
– Bug 1: router unit loses messages
– Bug 2: home unit replies twice for one request
– Bug 3: cache unit gets updated twice from one reply
Refinement check is an automatic way of constructing checks
43
Model Checking Approaches
Monolithic
– Straightforward property check
Compositional
– Divide and conquer
[Figure: the product model in VHDL, checked by either the monolithic or the compositional approach]
44
Compositional Refinement Check
Reduce the verification complexity
Basic techniques
– Abstraction
• Removing details to make verification easier
– Assume guarantee
• A simple form of induction which introduces assumptions and justifies them
45
In More Detail
Abstraction
– Change variables to free input variables
– E.g. change a latch to a free input signal
Assume guarantee
– Assume (spec.Var = impl.Var) holds for reads of a transaction
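The interplay of the two techniques can be shown with a tiny exhaustive check in Python. The sketch is illustrative only: `holds_for_all`, the 1-bit domain, and the toy property (impl and spec each write the complement of the value they read) are assumptions. Freeing a variable makes the checker consider all of its values; the assumption on reads then filters out the combinations that the rest of the design is expected to rule out.

```python
from itertools import product

def holds_for_all(prop, free_vars, assume=lambda env: True, domain=(0, 1)):
    """Check prop for every valuation of the free inputs that meets the assumption."""
    for values in product(domain, repeat=len(free_vars)):
        env = dict(zip(free_vars, values))
        if assume(env) and not prop(env):
            return False
    return True

def outputs_match(env):
    """Toy transaction: impl and spec each write the complement of what they read."""
    return (1 - env["impl_var"]) == (1 - env["spec_var"])

def reads_agree(env):
    """Assumption on reads: spec.Var = impl.Var."""
    return env["impl_var"] == env["spec_var"]

without_assumption = holds_for_all(outputs_match, ["impl_var", "spec_var"])
with_assumption = holds_for_all(outputs_match, ["impl_var", "spec_var"], reads_agree)
```

Without the assumption the freed variables can disagree and the check fails spuriously. With it the check passes, and the assumption itself has to be justified elsewhere, for example by the end-of-transaction assertions of the transactions that write those variables.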
46
Experimental Results
[Figure: verification time versus datapath width (1-bit to 10-bit) for the monolithic and compositional approaches; time axis marks at 30 min and 1 day]
Configurations
– 2 nodes, 2 addresses, SixthSense
47
Outline
Background
Proposed solutions
– High level hierarchical coherence protocol verification
– Refinement check: specifications vs. RTL implementations
Conclusion
48
49
Thank you.
50
Related Work
Parameterized verification
– Chou et al.
Bluespec
– Arvind et al.
Aggregation of distributed actions
– Park and Dill
Compositional verification
– Many previous works including McMillan, Jones, etc.