Page 1
On Formal Equivalence Verification of Hardware
Zurab Khasidashvili
Formal Technology and Logic Group, Intel Corporation, Haifa
Page 2
Overview
• How we represent hardware
• Why hardware equivalence verification is needed
• How we define hardware equivalence
• How we perform hardware equivalence verification
– Combinational verification
– Sequential verification
– Compositional methods
• Seqver tool and underlying engines
• Experimental results
Page 3
• Microprocessor logic complexity and performance double every 18 months
[Figure: Moore's Law (1965). Transistor count (log scale, 10,000 to 10,000,000,000) versus year of introduction (1970 to 2010) for Intel processors: 4004, 8008, 8080, 8086, 8088, 286, i386, i486, Pentium, Pentium Pro, Pentium II, Celeron, Pentium III, Pentium 4, Itanium, Xeon, Itanium 2, Dual Core Itanium.]
Page 4
Challenges of Hardware Verification
• The more complex a hardware design is, the more complex its verification becomes
• A chip design project needs both designers and validators
• Validation headcount is increasing from project to project
– Both in absolute terms and as a ratio of validators to designers
– At least 2 validators per designer
• Validation effort and risk considerations are causing us to:
– Limit the introduction of new design features
– Slow our response to market needs
• Validation is on the critical path to tapeout
– Completeness of validation determines the readiness of a hardware design for tapeout
– Bugs in the design are found both before and after the first silicon production (before mass production)
– Bugs after mass production can have disastrous consequences, both for users and for the production company
Page 5
Classification of Pre-Silicon Bugs
[Figure: breakdown of pre-silicon bugs by category: Goof (Blunder), Complexity, Logic/Microcode change, Microarchitecture, Corner cases, Documentation, Design mistake, Incorrect assertion, Random initialization, Late definition, Miscommunication, Power related, Implementation bugs.]
Page 6
The Purpose of Equivalence Verification
• The same functional behavior (input-output behavior) can be implemented in many different ways
• We need to optimize the implementation through:
– Timing: to achieve better performance of computer chips
– Power: to reduce power consumption and ensure longer battery life
– Area: to produce smaller computer chips
• We need to prove that hardware design optimization does not change the functional behavior
Page 7
RTL Model
Register-Transfer Level description written in a hardware description language (Verilog, SystemVerilog, etc.) looks like:

  // While a port's clock ckwrcbout is low, every second latch stage i
  // captures the value of stage i-1.
  always_latch begin
    for (int portnum = 0; portnum <= (WR_PORTS-1); portnum++)
      if (!ckwrcbout[portnum])
        for (int i = WR_LATENCY-1; i > 0; i = i-2)
          LAT_Wr[portnum][i] <= LAT_Wr[portnum][i-1];
  end
Page 8
Schematic model
[Figure: a two-input gate with inputs A and B and output OUT.]

A  B  ¬(A&B)
0  0  1
0  1  1
1  0  1
1  1  0
Page 9
Basic FEV flow
[Figure: basic FEV flow. The RTL is compiled into FSM1 and the schematics are extracted into FSM2; FEV compares the two. Outcomes: "Verification passed", or "Verification failed" followed by Debug and then "Modify the RTL model" or "Modify the Schematic model".]
Page 10
FEV - Formal Equivalence Verification
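[Figure: two implementations, labeled RTL and SCH, of a datapath over four |a-b| units with reset and valid controls and a 00/01/1- selector. One accumulates the |a-b| results through a chain of adders (+); the other combines them through carry-save adders (CSA) and a single adder.]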
Page 11
[Figure: the SCH and RTL datapaths over inputs a0..a3 and b0..b3, each with reset and valid controls and a 00/01/1- selector. SCH: four |a-b| units feeding a chain of adders (+). RTL: four |a-b| units feeding carry-save adders (CSA) and a single adder.]
Page 12
[Figure: the SCH and Spec datapaths over inputs a0..a3 and b0..b3, each with reset and valid controls and a 00/01/1- selector. SCH: four |a-b| units feeding a chain of adders (+). Spec: four |a-b| units feeding carry-save adders (CSA) and a single adder.]
Large and complex data paths
Page 13
[Figure: the SCH and RTL datapaths over inputs a0..a3 and b0..b3, each with reset and valid controls and a 00/01/1- selector. SCH: four |a-b| units feeding a chain of adders (+). RTL: four |a-b| units feeding carry-save adders (CSA) and a single adder.]
Different encoding
Page 14
[Figure: the SCH and RTL datapaths over inputs a0..a3 and b0..b3, each with reset and valid controls and a 00/01/1- selector. SCH: four |a-b| units feeding a chain of adders (+). RTL: four |a-b| units feeding carry-save adders (CSA) and a single adder.]
Extensive undocumented re-timing
Page 15
[Figure: the SCH and RTL datapaths over inputs a0..a3 and b0..b3, each with reset and valid controls and a 00/01/1- selector. SCH: four |a-b| units feeding a chain of adders (+). RTL: four |a-b| units feeding carry-save adders (CSA) and a single adder.]
Tricky clocking scheme
Page 16
A circuit and its FSM
A Finite State Machine (FSM) is a tuple $M = (S, I, O, \delta, \lambda)$, where $S$ is a finite set of state variables, $2^S$ is the set of states, $I$ is a finite set of input variables, $2^I$ is the set of inputs, $O$ is a finite set of output variables, $2^O$ is the set of outputs, $\delta: 2^S \times 2^I \to 2^S$ is a transition function, and $\lambda: 2^S \to 2^O$ is an output function.
[Figure: an example circuit and the state transition graph of its FSM.]
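The FSM definition can be made concrete with a small sketch. The code below is an illustration only, not the representation used by any tool mentioned in this talk: states, inputs and outputs are tuples of 0/1 values (one per variable), and the class name `FSM`, the method names, and the `toggle` example circuit are all hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, List, Tuple

Bits = Tuple[int, ...]  # one 0/1 value per variable, in declaration order


@dataclass(frozen=True)
class FSM:
    state_vars: Tuple[str, ...]          # S: state variables (flip-flops)
    input_vars: Tuple[str, ...]          # I: input variables
    output_vars: Tuple[str, ...]         # O: output variables
    delta: Callable[[Bits, Bits], Bits]  # transition function on (state, input)
    lam: Callable[[Bits], Bits]          # output function on states

    def states(self) -> List[Bits]:
        """Enumerate all assignments to the state variables."""
        return list(product((0, 1), repeat=len(self.state_vars)))

    def run(self, state: Bits, inputs: List[Bits]) -> Tuple[Bits, List[Bits]]:
        """Apply an input sequence; return the final state and the outputs
        observed in the start state and after every step."""
        outputs = [self.lam(state)]
        for inp in inputs:
            state = self.delta(state, inp)
            outputs.append(self.lam(state))
        return state, outputs


# Hypothetical example: one flip-flop q that toggles when input t is 1;
# the output o simply shows q.
toggle = FSM(
    state_vars=("q",),
    input_vars=("t",),
    output_vars=("o",),
    delta=lambda s, i: (s[0] ^ i[0],),
    lam=lambda s: (s[0],),
)

final_state, seen_outputs = toggle.run((0,), [(1,), (1,), (0,)])
```

Enumerating all 2^|S| states this way is only feasible for toy circuits; it is meant to show what the components of the tuple are, not how an industrial tool would store them.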
Page 17
Next state computation
• A state is an assignment to state elements (flip-flops)
• FF1=1, FF2=1, FF3=0 is a state; the output is o=¬FF3=1
• Suppose i1=1, i2=0
• In the next state: values at FF inputs propagate to the outputs
• The values of logic gates and outputs are adjusted
[Figure: circuit with inputs i1 and i2, flip-flops FF1, FF2 and FF3, OR and AND gates, and output o.]
Page 18
Next state computation
[Figure: the circuit annotated with signal values, and a run of its FSM starting from state 101:]

state    output  input
1 0 1    0       1 1
1 1 0    1       0 1
0 1 1    0       0 1
…

• In all the states reachable from state 101, the property relating l1 and l2 is valid. This means that the linear temporal property obtained by placing it under always (...) is valid
Page 19
State Equivalence
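[Figure: two circuits, M1 (RTL) and M2 (SCH), each with inputs i1, i2 and output o. One uses flip-flops FF1, FF2 and FF3; the other uses flip-flops FF' and FF3.]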
Page 20
State Equivalence
[Figure: fragments of the state transition graphs of M1 (RTL) and M2 (SCH), with input and output values on the transitions; some state pairs are marked "equivalent states" and others "Not equivalent".]
Page 21
State Equivalence
States s1 and s2 in hardware models M1 and M2 are equivalent states (s1 ≃ s2) iff for any input sequence $\pi$ the corresponding outputs of M1 and M2 in the states t1 and t2 obtained from s1 and s2 by applying $\pi$ are equal.
[Figure: $\pi$ takes s1 to t1 and s2 to t2, with Out(t1) = Out(t2).]
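As a rough illustration of this definition, state equivalence on small explicit machines can be decided by exploring the product of the two FSMs and comparing outputs in every reachable state pair. This is a minimal sketch under assumptions that are not in the talk: each machine is given as a dictionary `delta[(state, input)]` and a dictionary `out[state]`, and the two example machines are invented.

```python
from collections import deque


def equivalent(delta1, out1, s1, delta2, out2, s2, inputs):
    """True iff no input sequence makes the outputs of M1 (started in s1)
    and M2 (started in s2) differ. Breadth-first search over state pairs."""
    seen = {(s1, s2)}
    queue = deque([(s1, s2)])
    while queue:
        t1, t2 = queue.popleft()
        if out1[t1] != out2[t2]:
            return False  # some input sequence leads to different outputs
        for a in inputs:
            nxt = (delta1[t1, a], delta2[t2, a])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


# Hypothetical M1: one bit that flips on input 1; the output is the bit.
d1 = {(s, a): s ^ a for s in (0, 1) for a in (0, 1)}
o1 = {0: 0, 1: 1}

# Hypothetical M2: same behavior with a redundant state C that acts like A.
d2 = {("A", 0): "A", ("A", 1): "B",
      ("B", 0): "B", ("B", 1): "C",
      ("C", 0): "C", ("C", 1): "B"}
o2 = {"A": 0, "B": 1, "C": 0}

same = equivalent(d1, o1, 0, d2, o2, "A", inputs=(0, 1))
different = equivalent(d1, o1, 1, d2, o2, "A", inputs=(0, 1))
```

Here state 0 of M1 is equivalent to both A and C of M2, while state 1 of M1 and state A of M2 already differ in their current outputs.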
Page 22
Hardware Equivalence
• When comparing two hardware models M1 and M2, we assume that they are compatible, that is, there is a one-to-one correspondence between their inputs and outputs
• When the initial states s1 and s2 are given for M1 and M2, then M1 and M2 are called equivalent iff s1 and s2 are equivalent
• For hardware, the power-up state cannot be determined uniquely. Thus, the initial state (or a set of initial states) is computed by applying a reboot sequence to the design; the reboot sequence must bring the design into a set of equivalent states
Page 23
Weak synchronization
An input sequence $\pi$ is a weakly synchronizing sequence for M if it brings M from any state to a subset of equivalent states {s1,…,sm}, which are called weak synchronization states of M.
[Figure: the sequence takes the states s1, …, s5 into states t1, t2, t3, with t1 ≃ t2 ≃ t3.]
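A direct, brute-force reading of this definition is sketched below: apply the candidate sequence from every state and check that all resulting states are pairwise equivalent. The dictionary encoding (`delta[(state, input)]`, `out[state]`) and both example machines are assumptions made for illustration; the talk itself obtains such sequences with a model checker rather than by enumeration.

```python
from collections import deque
from itertools import combinations


def run(delta, state, seq):
    """Return the state reached from `state` after applying the sequence."""
    for a in seq:
        state = delta[state, a]
    return state


def equivalent(delta, out, s, t, inputs):
    """True iff no input sequence makes the outputs from s and t differ."""
    seen = {(s, t)}
    queue = deque([(s, t)])
    while queue:
        u, v = queue.popleft()
        if out[u] != out[v]:
            return False
        for a in inputs:
            nxt = (delta[u, a], delta[v, a])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


def is_weakly_synchronizing(delta, out, states, inputs, seq):
    """Check that `seq` brings every state into pairwise equivalent states."""
    targets = {run(delta, s, seq) for s in states}
    return all(equivalent(delta, out, s, t, inputs)
               for s, t in combinations(targets, 2))


# Hypothetical toggle circuit: the two states always stay apart.
toggle_delta = {(s, a): s ^ a for s in (0, 1) for a in (0, 1)}
toggle_out = {0: 0, 1: 1}

# Hypothetical modulo-3 counter with a reset input "r".
ctr_delta = {(s, "r"): 0 for s in range(3)}
ctr_delta.update({(s, "n"): (s + 1) % 3 for s in range(3)})
ctr_out = {0: 0, 1: 0, 2: 1}

reset_works = is_weakly_synchronizing(ctr_delta, ctr_out, range(3), ("r", "n"), ["r"])
count_fails = is_weakly_synchronizing(ctr_delta, ctr_out, range(3), ("r", "n"), ["n"])
toggle_fails = is_weakly_synchronizing(toggle_delta, toggle_out, (0, 1), (0, 1), [1, 0, 1])
```

The counter is synchronized by its reset input but not by counting alone; the toggle circuit is an example of a machine for which no sequence can work, because its two states have different outputs and every input keeps them distinct.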
Page 24
A circuit without a weak synchronizing sequence
• This circuit does not have a weakly synchronizing sequence
• The two states are not equivalent (their outputs are different)
• No input sequence can bring them into a set of equivalent states
• The input-output behavior of this circuit is not deterministic: it depends on the power-up state
• Therefore, we only consider weakly synchronizable circuits
[Figure: the example circuit and its two states.]
Page 25
Hardware Machines
• In practice, the operation states of a circuit form a proper subset of the weak synchronization (WS) states. Thus, choosing an arbitrary weakly synchronizing sequence as the reboot sequence is not enough: there are “good” and “bad” reboot sequences.
• Thus, by a hardware machine we mean a pair $(M, \pi)$, in which $M$ is a hardware model and $\pi$ is a weakly synchronizing input sequence for $M$.
• The pair $(M, \pi)$ defines a set of operation states of $(M, \pi)$: $OP(M, \pi) = \{\, s \mid \exists t.\ \exists o:\ t \xrightarrow{\pi} o \ \&\ o \xrightarrow{*} s \,\}$.
• We are interested in the behavior of $(M, \pi)$ in its operation states.
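The set of operation states can be sketched as a forward reachability computation: first collect the states reached from every power-up state by the reboot sequence, then close that set under all transitions. The encoding (`delta[(state, input)]`) and the four-state example are hypothetical, and the sketch does not check that the reboot sequence is actually weakly synchronizing.

```python
def run(delta, state, seq):
    """Return the state reached from `state` after applying the sequence."""
    for a in seq:
        state = delta[state, a]
    return state


def operation_states(delta, states, inputs, reboot):
    """States reachable from any state reached by `reboot` from any
    power-up state."""
    reached = {run(delta, s, reboot) for s in states}  # post-reboot states
    stack = list(reached)
    while stack:  # forward closure under all inputs
        s = stack.pop()
        for a in inputs:
            t = delta[s, a]
            if t not in reached:
                reached.add(t)
                stack.append(t)
    return reached


# Hypothetical machine: "r" forces state 1; "n" steps 3 -> 0 -> 1 -> 2 -> 1.
delta = {(s, "r"): 1 for s in range(4)}
delta.update({(3, "n"): 0, (0, "n"): 1, (1, "n"): 2, (2, "n"): 1})

op_short = operation_states(delta, range(4), ("r", "n"), ["n"])
op_long = operation_states(delta, range(4), ("r", "n"), ["n", "n"])
```

In this toy machine the longer reboot sequence yields the smaller set of operation states ({1, 2} instead of {0, 1, 2}), which is one way to see why the choice of reboot sequence matters.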
Page 26
Hardware equivalence concepts
• Many concepts of equivalence have been studied in the literature
• The most basic equivalence concepts do not require a reboot sequence:
– Combinational equivalence
– Replaceability
• Other equivalence concepts require some form of reboot sequence before the comparison of the output behavior of the RTL and SCH models starts:
– Delayed safe replaceability
– Sequential hardware equivalence or alignability
– Exact 3-valued equivalence
– 3-valued safe replaceability
– Steady state equivalence
• We will focus on the alignability equivalence concept, as we believe it fits hardware verification well
Page 27
Alignability Equivalence [Pixley 1989]
An input sequence $\pi$ is an aligning sequence for states s1, s2 in FSMs M1 and M2 if it brings M1 and M2 from states s1 and s2 into equivalent states.
FSMs M1 and M2 are alignable (M1 ≃aln M2) iff every state pair of M1 and M2 has an aligning sequence.
Equivalently, M1 ≃aln M2 iff there is a universal aligning sequence that aligns every state pair of M1 and M2.
[Figure: $\pi$ takes s1 to t1 and s2 to t2, with t1 ≃ t2.]
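For small explicit machines, alignability in the "every state pair has an aligning sequence" form can be checked with two fixpoint computations over state pairs: first the set of equivalent pairs, then the set of pairs from which an equivalent pair is reachable. This is a brute-force sketch for illustration, not the algorithm used in Seqver; the dictionary encoding and both example machines are assumptions.

```python
from itertools import product


def equivalent_pairs(d1, o1, S1, d2, o2, S2, inputs):
    """Greatest fixpoint: pairs with equal outputs whose successors under
    every input are again in the set."""
    eq = {(s, t) for s, t in product(S1, S2) if o1[s] == o2[t]}
    changed = True
    while changed:
        keep = {(s, t) for (s, t) in eq
                if all((d1[s, a], d2[t, a]) in eq for a in inputs)}
        changed = keep != eq
        eq = keep
    return eq


def alignable(d1, o1, S1, d2, o2, S2, inputs):
    """True iff every state pair can be driven into an equivalent pair."""
    aligned = equivalent_pairs(d1, o1, S1, d2, o2, S2, inputs)
    changed = True
    while changed:  # least fixpoint: backward reachability of equivalent pairs
        grow = {(s, t) for s, t in product(S1, S2)
                if (s, t) in aligned
                or any((d1[s, a], d2[t, a]) in aligned for a in inputs)}
        changed = grow != aligned
        aligned = grow
    return len(aligned) == len(S1) * len(S2)


# Hypothetical M1: a bit with reset "r" and flip "n"; the output is the bit.
S1 = [0, 1]
d1 = {(0, "r"): 0, (1, "r"): 0, (0, "n"): 1, (1, "n"): 0}
o1 = {0: 0, 1: 1}

# Hypothetical M2: same behavior with a redundant state C.
S2 = ["A", "B", "C"]
d2 = {(s, "r"): "A" for s in S2}
d2.update({("A", "n"): "B", ("B", "n"): "C", ("C", "n"): "B"})
o2 = {"A": 0, "B": 1, "C": 0}

# Hypothetical toggle without reset, compared against itself.
St = [0, 1]
dt = {(0, "n"): 1, (1, "n"): 0}
ot = {0: 0, 1: 1}

with_reset = alignable(d1, o1, S1, d2, o2, S2, ("r", "n"))
no_reset = alignable(dt, ot, St, dt, ot, St, ("n",))
```

The toggle without reset is not alignable even with itself, since the pair (0, 1) can never be brought into equivalent states; the two resettable machines are alignable because the reset input aligns every pair.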
Page 28
Alignability Theorem
Theorem: FSMs M1 and M2 are alignable iff both of them are weakly synchronizable and have an equivalent state pair.
“Big” questions:
– How can we prove existence of equivalent states in M1 and M2?
– Given a reboot sequence for M1 (or M2), how can we prove that it is weakly synchronizing for M1 (or M2)?
– Besides, if we prove that M1 and M2 are alignable, can we be sure that all temporal properties valid on M1 will be valid on M2 as well?
Page 29
State-matching design
• Contemporary Intel circuits (computer chips) have millions of logic gates and hundreds of thousands of state elements (flip-flops)
• So how was equivalence of the RTL and SCH models M1 and M2 proved in the past?
• In the past, equivalence could only be proved for state-matching circuits M1 and M2
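When the flip-flops of the two models match one to one, the comparison reduces to checking that the combinational logic feeding each matched flip-flop and each output computes the same Boolean function. The sketch below does this by exhaustive enumeration on an invented three-input example; real tools would use BDDs or SAT instead, and the function names here are hypothetical.

```python
from itertools import product


def comb_equivalent(f, g, n_vars):
    """Exhaustively compare two Boolean functions of n_vars 0/1 arguments."""
    return all(f(*bits) == g(*bits)
               for bits in product((0, 1), repeat=n_vars))


def nand(x, y):
    return 1 - (x & y)


# Hypothetical next-state logic of one matched flip-flop.
def rtl_next(a, b, q):
    return (a & b) | q              # RTL view: AND-OR


def sch_next(a, b, q):
    return nand(nand(a, b), 1 - q)  # schematic view: NAND gates and an inverter


def buggy_next(a, b, q):
    return (a | b) | q              # a wrong implementation for comparison


ok = comb_equivalent(rtl_next, sch_next, 3)
bad = comb_equivalent(rtl_next, buggy_next, 3)
```

Exhaustive enumeration grows as 2^n in the number of variables, so it only serves to show what is being compared for each matched state element.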
Page 30
Combinational vs sequential verification
[Figure: RTL and Schematic models side by side, shown once for combinational equivalence verification and once for sequential equivalence verification.]
Page 31
Scalable solution: compositionality of alignability
Theorem [KSKH06]: Let M1 and M2 be compatible FSMs and suppose they are decomposed into smaller FSMs $M_1^i$ and $M_2^i$, respectively. Further, let both M1 and M2 be weakly synchronizable, and for each i, let $M_1^i$ and $M_2^i$ be alignable. Then M1 and M2 are alignable.
Page 32
Usage of properties
[Figure: M1 decomposed into component A1 and component B1, and M2 decomposed into component A2 and component B2. Each circuit has inputs i1 and i2, flip-flops FF1, FF2 and FF3, and output o.]
• It is safe to use l1=l2 since it is valid in all operation states
Page 33
Combinational FEV is impossible
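[Figure: two circuits, M1 (RTL) and M2 (SCH), each with inputs i1, i2 and output o. One uses flip-flops FF1, FF2 and FF3; the other uses flip-flops FF' and FF3.]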
Page 34
How to build equivalent states
• We build weak synchronization states for slices in M1 and M2 using a model checker (a counterexample to a suitable property serves as a weak synchronization sequence)
• We use the boundary properties as constraints in the process of building weakly synchronizing sequences
• It is guaranteed that if for every slice pair $M_1^i$ and $M_2^i$ we build equivalent states $s_1^i$ and $s_2^i$, then the induced states on M1 and M2 are equivalent
Page 35
Weak synchronization is not compositional
• One cannot build a weakly synchronizing sequence for an FSM from weakly synchronizing sequences of its constituent sub-FSMs
• That is why, in the compositionality theorem, we require both FSMs to be weakly synchronizable
[Figure: example circuit with flip-flops FF1 and FF2, output o, and cut points, annotated with signal values.]
Page 36
Partial order on ws-sequences
[Figure: an FSM with states s1 to s6 and transitions labeled input/output (0/0, 1/1, 1/0, -/1). Groups of equivalent states are marked, together with the weak-synchronization states reached along the input sequence 0 1 1 1.]
• The stronger the sequence, the smaller the set of operation states: 0 = 01 < 011 < 0111 = 0111…
Page 37
Partial order on ws-sequences
[Figure: the same FSM with states s1 to s6 and transitions labeled input/output, with the equivalent states and the weak-synchronization states marked along the input sequence 0 1 1 1.]
Page 38
Alignability does not preserve the validity of temporal properties
• The two FSMs are alignable
• Let P be true in {s4, s5, s6}
• Let 0 be the reboot sequence used for both FSMs
• Then P is valid in the operation states of FSM 2
• But P is not valid in some operation states of FSM 1
[Figure: the two FSMs, each with states s1 to s6 and transitions labeled input/output (0/0, 1/1, 1/0, -/1).]
• Thus, alignability equivalence does not preserve the validity of temporal properties
• That is, if the RTL model is designed correctly, its “equivalent” Schematic model may not behave correctly!
Page 39
How to ensure preservation of temporal properties?
• Suppose that M1 and M2 have observable logic gates or state elements (flip-flops) such that there is a one-to-one correspondence between the observables of M1 and M2
• Let us call states s1 and s2 of M1 and M2 observably equivalent if they are equivalent when all observable variables are considered as outputs
• And let us call an input sequence observably synchronizing for M1 (or M2) if it is a weakly synchronizing sequence when the observables are considered as outputs
Page 40
Preservation of the validity of temporal properties
Theorem: Let $(M_1, \pi_1)$ and $(M_2, \pi_2)$ be compatible hardware machines with sets of observables O1 and O2, respectively, such that there is a one-to-one correspondence between the observable variables in O1 and O2. Further,
– Let decompositions of M1 and M2 be given such that the inputs and outputs of the sub-FSMs are observable variables
– Assume that the corresponding sub-FSMs in M1 and M2 have states that are equivalent under the corresponding input constraints
– Finally, let $\pi_1$ be observably synchronizing for M1 and $\pi_2$ be observably synchronizing for M2
Then,
– M1 and M2 have an observably equivalent state pair (which is also an equivalent state pair of M1 and M2)
– An observable temporal property in M1 is valid in the operation states of $(M_1, \pi_1)$ iff it is valid in the operation states of $(M_2, \pi_2)$
Page 41
Experimental data on proving the state equivalence of sub-FSMs

Circuit  Logic gates  Flip-flops  Inputs / Outputs  Runtime (sec)
C1       362604       12693       397/389           4700
C2       423070       23913       302/349           5722
C3       157110       22080       578/572           2478

Circuit  Sub-FSMs  Average inputs  Average gates  Average flip-flops
C1       524       650             11171          804
C2       402       102             824            33
C3       134       233             2874           187
Page 42
Experimental data on latch mapping / design abstraction

                  RTL                      Schematic
Circuit  Outputs  Latches  Mapped  %       Latches  Mapped  %      Abstraction  CPU
C1       29       125      59      47%     195      59      30%    64%          85
C2       58       635      634     100%    1212     642     53%    52%          62
C3       65       271      129     48%     438      133     30%    62%          642
C4       84       2635     2635    100%    4185     3507    84%    63%          817
C5       114      2902     2787    96%     5454     2813    52%    53%          1273
C6       151      232      232     100%    323      240     74%    72%          21
C7       196      235      235     100%    464      244     53%    51%          274
C8       205      1129     1105    98%     1354     1338    99%    83%          35
C9       208      820      698     85%     1020     429     42%    80%          739
C10      221      32       0       0%      64       0       0%     50%          31
C11      232      441      95      22%     554      95      17%    80%          1418
C12      259      675      669     99%     1167     689     59%    58%          263
C13      399      67       57      85%     126      57      45%    53%          54
C14      848      1370     1085    79%     1808     1085    60%    76%          4471
C15      1040     1627     964     59%     2055     963     47%    79%          600
Page 43
Underlying engines
• Propositional validity checkers
– BDDs (Binary Decision Diagrams) and SAT (satisfiability) checkers
– Many customized strategies for propositional problems
• Model checking engines
– BDD-based fixpoint reachability analysis
– SAT-based BMC (Bounded Model Checking) algorithm, induction algorithm, and many customized strategies for very complex model-checking problems
• Symbolic simulation engine (not discussed in this talk)
Page 44
Impact of sequential verification on chip design
Sequential verification
• Enables a higher design abstraction level and compact RTL design
– with no “ASCII schematics” in RTL and
– better readability, stability and verifiability of design
• Accelerates design implementation and reduces “over design”
– much of the design cycle deals with “tweaking” the circuit to meet timing, area, and other constraints
– most of these changes should not alter the visible behavior of a module
Page 45
Seqver Status
Seqver is a signoff tool used at Intel for CPU design
Seqver is extensively used by hundreds of design engineers developing new generation processors
Seqver is the first comprehensive sequential verification solution in the industry
Page 46
Summary
• We have shown how to prove existence of equivalent states compositionally
• We have shown how to ensure preservation of the validity of temporal properties between equivalent models
• We could not discuss compositional methods for proving that an input sequence is a valid reboot sequence
• And we did not discuss methods for automatic synchronization of sub-FSMs in decomposition
Page 47
References
Kohavi. Switching and Finite Automata Theory, McGraw-Hill, 1978.
Pixley. A theory and implementation of sequential hardware equivalence, IEEE Transactions on CAD, 1992.
Pomeranz, Reddy. On removing redundancies from synchronous sequential circuits with synchronizing sequences, IEEE Trans. Comput., 1996.
Khasidashvili, Skaba, Kaiss, Hanna. Theoretical Framework for Compositional Sequential Hardware Equivalence Verification in Presence of Design Constraints, ICCAD 2004.
Khasidashvili, Skaba, Kaiss, Hanna. Post-reboot equivalence and compositional verification of hardware, FMCAD 2006.
Kaiss, Goldenberg, Hanna, Khasidashvili. Seqver: A Sequential Equivalence Verifier for Hardware Designs, ICCD 2006.
Khasidashvili, Bustan, Kaiss. Local and Global Semantics of Assertions for Post-reboot Equivalence Verification of Hardware, submitted.
Page 48
THANK YOU!