ABC: A System for Sequential Synthesis and Verification
Berkeley Logic Synthesis and Verification Group
Robert Brayton, Alan Mishchenko

Overview
• Introduction
  – What and why ABC?
• ABC fundamentals
  – Areas addressed by ABC
    • Synthesis
    • Technology mapping
    • Verification
  – Contrast with classical methods
    • How is ABC different from SIS?
• Recent work
  – Speedup
  – Factoring
  – Don’t-care based optimization
  – Scalable sequential synthesis
  – WireMap
  – White boxes

A Plethora of ABCs
http://en.wikipedia.org/wiki/Abc
• ABC (American Broadcasting Company)
  – A television network…
• ABC (Active Body Control)
  – ABC is designed to minimize body roll in cornering, accelerating, and braking. The system uses 13 sensors which monitor body movement to supply the computer with information every 10 ms…
• ABC (Abstract Base Class)
  – In C++, these are generic classes at the base of the inheritance tree; objects of such abstract classes cannot be created…
• ABC (supposed to mean “as simple as ABC”)
  – A system for sequential synthesis and verification at Berkeley

Why We Decided to Build ABC
• SIS
  – Outdated, but many research papers on how a new algorithm beats SIS results
  – Not supported
• MVSIS
  – Gave us a reason to work on logic synthesis
  – Learned a lot about new methods and better data structures
  – Could see how specializing to binary could provide substantial improvements
• ABC
  – Initial intention was to re-implement all algorithms using new data structures (daunting task)
  – Discovered rewriting AIGs
    • P. Bjesse and A. Boralv, "DAG-aware circuit compression for formal verification", Proc. ICCAD ’04, pp. 42-49.
  – Decided to try to keep all transformations fast and scalable
    • No BDDs
    • No SOPs
    • No Espresso

What Is Berkeley ABC?
• A system for logic synthesis and verification
  – Fast
  – Scalable
  – High quality results (industrial strength)
  – Exploits synergy between synthesis and verification
• A programming environment
  – Open-source
  – Evolving and improving over time

Design Flow
[Figure: design flow from system specification through RTL, logic synthesis, technology mapping, and physical synthesis to manufacturing, with ABC and verification marked alongside the flow.]

Screenshot
[Figure: screenshot.]

Areas Addressed by ABC
• Combinational synthesis
  – AIG rewriting
  – technology mapping
  – resynthesis after mapping
• Sequential synthesis
  – retiming
  – structural register sweep
  – merging seq. equiv. nodes
• Formal verification
  – combinational equivalence checking
  – bounded sequential verification
  – unbounded sequential verification
  – equivalence checking using synthesis history

Combinational Synthesis
• Pre-computing AIG subgraphs
  – Consider function f = abc
[Figure: three AIG subgraphs for f = abc. Subgraph 1 combines inputs a, b and a, c; Subgraph 2 combines b, c and then a; Subgraph 3 combines a, c and then b.]

Rewriting AIG subgraphs
[Figure: rewriting node A replaces Subgraph 1 with Subgraph 2; rewriting node B replaces Subgraph 2 with Subgraph 1.]
In both cases 1 node is saved
• AIG rewriting minimizes the number of AIG nodes without increasing the number of AIG levels
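
The claim that one node is saved in both directions depends on logic sharing, which a structurally hashed AIG makes visible. The following is a minimal sketch, not ABC's data structure: the class `Aig` and the helper `new_nodes` are hypothetical names, complemented edges and constant propagation are omitted, and the assumption is that Subgraph 1 is (ab)(ac) and Subgraph 2 is a(bc).

```python
class Aig:
    """Toy AIG with structural hashing; literals are 2 * node_id."""

    def __init__(self):
        self.next_id = 1   # id 0 is reserved for the constant node
        self.table = {}    # (lit0, lit1) -> literal of an existing AND node

    def add_input(self):
        lit = 2 * self.next_id
        self.next_id += 1
        return lit

    def and_(self, x, y):
        # Normalize fanin order so AND(x, y) and AND(y, x) hash to one node.
        key = (min(x, y), max(x, y))
        if key not in self.table:
            self.table[key] = 2 * self.next_id
            self.next_id += 1
        return self.table[key]

    def num_ands(self):
        return len(self.table)


def new_nodes(use_subgraph1, shared):
    """Count AND nodes added when implementing f = abc.

    shared=True models a network in which ab and ac already exist
    because other logic uses them.
    """
    g = Aig()
    a, b, c = g.add_input(), g.add_input(), g.add_input()
    if shared:
        g.and_(a, b)
        g.and_(a, c)
    before = g.num_ands()
    if use_subgraph1:
        g.and_(g.and_(a, b), g.and_(a, c))   # Subgraph 1: (ab)(ac)
    else:
        g.and_(a, g.and_(b, c))              # Subgraph 2: a(bc)
    return g.num_ands() - before
```

With no sharing, Subgraph 2 costs two nodes against three for Subgraph 1; when ab and ac are already present, Subgraph 1 costs one new node against two for Subgraph 2. This is why the better replacement depends on the surrounding DAG.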

Technology Mapping
Input: A Boolean network (And-Inverter Graph)
Output: A netlist of K-LUTs implementing the AIG and optimizing some cost function
[Figure: technology mapping turns the subject graph (inputs a, b, c, d, e; output f) into the mapped netlist over the same inputs and output.]

Sequential Synthesis
• Structural register sweep (scleanup)
  – Merge registers with identical drivers
  – Replace stuck-at registers by constants
• Retiming (dretime)
  – Minimize the number of registers under delay constraints
  – Preserves equivalent initial state
• Sequential SAT sweeping (scorr)
  – Detecting and merging sequentially equivalent nodes

Formal Verification
• Equivalence checking
  – Takes two designs and makes a miter (AIG)
• Model checking safety properties
  – Takes design and property and makes a miter (AIG)
The goals are the same: to transform the AIG until the output is proved constant 0
Breaking News: ABC won a model checking competition at CAV in August 2008
[Figure: equivalence checking builds a miter over designs D1 and D2; property checking builds a miter over design D1 and property p. In both cases the output is to be proved constant 0.]
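
The miter idea can be sketched for small combinational functions. This is an illustration only: exhaustive simulation stands in for the AIG transformations and SAT-based engines that the slides describe, and the names `miter`, `is_constant_zero`, `d1`, `d2`, `d3` are hypothetical.

```python
from itertools import product


def miter(d1, d2):
    """Return a function that is 1 exactly on inputs where d1 and d2 differ."""
    return lambda *x: d1(*x) ^ d2(*x)


def is_constant_zero(f, num_inputs):
    """Check by exhaustive simulation that f is 0 on every input pattern."""
    return all(f(*x) == 0 for x in product((0, 1), repeat=num_inputs))


# Two structurally different implementations of f = abc, plus a different function.
d1 = lambda a, b, c: (a & b) & (a & c)
d2 = lambda a, b, c: a & (b & c)
d3 = lambda a, b, c: a & (b | c)

equivalent = is_constant_zero(miter(d1, d2), 3)   # the miter output is constant 0
different = is_constant_zero(miter(d1, d3), 3)    # differs at a=1, b=1, c=0
```

Exhaustive simulation is exponential in the number of inputs, so it only works for toy examples such as this one.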

Model Checking Competition
[Figure: competition results plotting the number of problems solved against time (sec); ABC is marked, with the label “5. ABC 238”.]

Command “dprove” in ABC
• transforming initial state (“undc”, “zero”)
• converting into an AIG (“strash”)
• creating sequential miter (“miter -c”)
• combinational equivalence checking (“iprove”)
• bounded model checking (“bmc”)
• sequential sweep (“scl”)
• phase-abstraction (“phase”)
• most forward retiming (“dret -f”)
• partitioned register correspondence (“lcorr”)
• min-register retiming (“dretime”)
• combinational SAT sweeping (“fraig”)
• for ( K = 1; K ≤ 16; K = K * 2 )
  – signal correspondence (“scorr”)
  – stronger AIG rewriting (“dc2”)
  – min-register retiming (“dretime”)
  – sequential AIG simulation
• interpolation (“int”)
• BDD-based reachability (“reach”)
• saving reduced hard miter (“write_aiger”)
[Slide annotations group these steps as: preprocessors, combinational solver, fast engines, medium engines, slower, main induction loop, last-gasp engines.]

ABC vs. Other Tools
Industrial
  + well documented, fewer bugs
  - black-box, push-button, no source code, often expensive
SIS
  + traditionally very popular
  - data structures / algorithms outdated, weak sequential synthesis
VIS
  + very good implementation of BDD-based verification algorithms
  - not meant for logic synthesis, does not feature the latest SAT-based implementations
MVSIS
  + allows for multi-valued and finite-automata manipulation
  - not meant for binary synthesis, lacking recent implementations

How Is ABC Different From SIS?
[Figure: a Boolean network in SIS and the equivalent AIG in ABC, both over inputs a, b, c, d, e with internal signals x, y, z and output f. The node labels in the SIS network appear as ze; xd, yd, xy; ab, cd, cd.]
An AIG is a Boolean network of 2-input AND nodes and inverters (dotted lines)

One AIG Node – Many Cuts
• Manipulating AIGs in ABC
  – Each node in an AIG has many cuts
  – Each cut is a different SIS node
  – No a priori fixed boundaries
• Implies that AIG manipulation with cuts is equivalent to working on many Boolean networks at the same time
[Figure: combinational AIG over inputs a, b, c, d, e with output f, showing different cuts for the same node.]
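
How one node comes to have many cuts can be sketched with the standard bottom-up enumeration of K-feasible cuts: the cuts of an AND node are the node itself plus every union of a cut of one fanin with a cut of the other that has at most K leaves. The function name `enumerate_cuts` and the dictionary encoding are assumptions for illustration; complemented edges and dominated-cut filtering are left out.

```python
def enumerate_cuts(fanins, k):
    """Enumerate K-feasible cuts.

    fanins maps each AND node to its pair of fanins; any name that is
    not a key is treated as a primary input.
    Returns a dict: node -> set of cuts, each cut a frozenset of leaves.
    """
    cuts = {}

    def cuts_of(n):
        if n not in cuts:
            result = {frozenset([n])}          # the trivial cut
            if n in fanins:
                f0, f1 = fanins[n]
                for u in cuts_of(f0):
                    for v in cuts_of(f1):
                        merged = u | v
                        if len(merged) <= k:   # keep only K-feasible cuts
                            result.add(merged)
            cuts[n] = result
        return cuts[n]

    for n in fanins:
        cuts_of(n)
    return cuts


# f = (a AND b) AND (c AND d)
example = {"x": ("a", "b"), "y": ("c", "d"), "f": ("x", "y")}
cuts_k4 = enumerate_cuts(example, 4)
cuts_k3 = enumerate_cuts(example, 3)
```

For K = 4, node f has five cuts: {f}, {x, y}, {x, c, d}, {a, b, y}, and {a, b, c, d}. Each one corresponds to a different choice of node boundary, which is the sense in which each cut is a different SIS node.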

Comparison of Two Syntheses
“Classical” synthesis
• Boolean network
• Network manipulation (algebraic)
  – Elimination
  – Factoring/Decomposition
  – Speedup
• Node minimization
  – Espresso
  – Don’t cares computed using BDDs
  – Resubstitution
• Technology mapping
  – Tree based
ABC “contemporary” synthesis
• AIG network
• DAG-aware AIG rewriting (Boolean)
  – Several related algorithms
    • Rewriting
    • Refactoring
    • Balancing
    • Speedup
• Node minimization
  – Boolean decomposition
  – Don’t cares computed using simulation and SAT
  – Resubstitution with don’t cares
• Technology mapping
  – Cut based with choice nodes

Existing Capabilities (2005-2008)
ABC
• Combinational logic synthesis: fast, scalable, good quality
• Technology mapping with structural choices: cut-based, heuristic, good area/delay, flexible
• Sequential synthesis: innovative, scalable, verifiable
• Sequential verification: integrated, interacts with synthesis

Command “speedup”

Timing Criticality
• Critical nodes
  – Used by many traditional algorithms
• Critical edges
  – Used by our algorithm
• We pre-compute critical edges of critical nodes
  – Reduces computation
• An edge between critical nodes may not be critical
  – See illustration: edge 1→3
[Figure: network between primary inputs and primary outputs with labels numbered 1 to 4, illustrating critical nodes and critical edges.]
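
The distinction between critical nodes and critical edges can be sketched under a unit-delay model. The function names `timing` and `critical_edges`, the node names, and the example network are assumptions chosen to reproduce the situation on the slide, where an edge between two critical nodes is itself not critical.

```python
def timing(fanins, outputs):
    """Unit-delay timing analysis.

    fanins maps each node to its list of fanins and must be given in
    topological order; primary inputs have an empty list.
    """
    arrival = {}
    for n, fi in fanins.items():
        arrival[n] = 1 + max(arrival[f] for f in fi) if fi else 0
    max_delay = max(arrival[o] for o in outputs)
    required = {n: float("inf") for n in fanins}
    for o in outputs:
        required[o] = max_delay
    for n in reversed(list(fanins)):
        for f in fanins[n]:
            required[f] = min(required[f], required[n] - 1)
    slack = {n: required[n] - arrival[n] for n in fanins}
    return arrival, required, slack


def critical_edges(fanins, outputs, w=0):
    """Edges u -> v whose end nodes and the edge itself have slack <= w."""
    arrival, required, slack = timing(fanins, outputs)
    edges = []
    for v, fi in fanins.items():
        for u in fi:
            edge_slack = required[v] - 1 - arrival[u]
            if slack[u] <= w and slack[v] <= w and edge_slack <= w:
                edges.append((u, v))
    return edges


# n1 -> n2 -> n3 is the longest path; n1 also feeds n3 directly.
network = {"pi": [], "n1": ["pi"], "n2": ["n1"], "n3": ["n1", "n2"]}
arrival, required, slack = timing(network, ["n3"])
edges = critical_edges(network, ["n3"])
```

All four nodes have zero slack, yet the direct edge from n1 to n3 has one unit of slack, so restructuring with respect to that edge would not help delay.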

Delay-Oriented Restructuring
• Using traditional MUX-restructuring
  – AKA generalized select transform
[Figure: function F with critical inputs x and y is rebuilt as a multiplexer tree that selects among the cofactors F00, F01, F10, F11 using x and y.]
x and y are the critical edge inputs

Overall Algorithm
mapped netlist performSpeedup (
    subject graph S,   // S is an And-Inverter Graph
    mapped netlist M,  // M was previously derived by tech-mapping of S
    timing window w,   // w is used to detect the critical paths
    logic depth l,     // l is used to detect a logic cone rooted at a node
    edge count p )     // p limits the number of critical edges of the cone
{
    perform timing analysis of M with unit-delay or LUT-library model;
    pre-compute critical section of M as nodes n such that 0 ≤ slack(n) ≤ w;
    pre-compute timing-critical edges connecting these nodes;
    for each timing critical node n
    {
        find cone C of M that extends l levels down from n;
        pick the set of timing-critical edges V feeding into C;
        if the number of edges in V exceeds p, continue;
        find logic cone C’ in S corresponding to C in M;
        find variables V’ in S corresponding to V in M;
        derive cofactors of the function of C’ w.r.t. variables in V’;
        build multiplexer tree C’’ of the cofactors using variables in V’;
        add structural choice C’ = C’’ to the subject graph S;
    }
    return mapped netlist M’ derived by mapping subject graph S with added choices;
}
Slide annotation: Done only once
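
The cofactoring and multiplexer-tree steps of the algorithm can be sketched on plain Boolean functions. This is not ABC's implementation: `mux` and `restructure` are hypothetical names, functions are Python callables over 0/1 values, and the check is by exhaustive simulation.

```python
from itertools import product


def mux(s, d1, d0):
    """2-to-1 multiplexer on 0/1 values: returns d1 if s is 1, else d0."""
    return (s & d1) | ((1 ^ s) & d0)


def restructure(f, crit):
    """Rebuild f as a MUX tree over its cofactors w.r.t. the inputs at indices crit.

    Each cofactor is evaluated with the critical inputs replaced by
    constants, so in hardware it would not depend on the late-arriving
    signals; those signals only drive the MUX select inputs.
    """
    def build(x, fixed):
        if len(fixed) == len(crit):
            y = list(x)
            for i, v in zip(crit, fixed):
                y[i] = v                      # cofactor: critical inputs fixed
            return f(*y)
        s = x[crit[len(fixed)]]               # next critical input selects
        return mux(s, build(x, fixed + (1,)), build(x, fixed + (0,)))

    return lambda *x: build(x, ())


# Inputs 0 and 1 play the role of the critical signals x and y.
f = lambda x, y, a, b: ((x & a) | y) ^ b
g = restructure(f, [0, 1])
same = all(f(*p) == g(*p) for p in product((0, 1), repeat=4))
```

With two critical inputs the tree selects among four cofactors, matching F00, F01, F10, and F11 on the restructuring slide. The number of cofactors doubles with each critical input, which is consistent with the algorithm skipping cones whose critical-edge count exceeds p.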

Experimental Results for “speedup”
         Design Profile          Baseline                       Speedup
Design   PI      PO      Reg     LUT     Lev    Delay  Total    LUT     Lev   Delay  Time1, s  Time2, s
11       2,061   1,897   13,950  16,531  7      3.15   77.70    16,652  7     2.95   9.33      87.95
12       50      68      1,358   3,284   19     8.40   23.88    3,371   16    7.00   3.46      28.68
13       1,044   1,098   2,074   7,147   23     9.35   74.39    7,789   16    6.65   7.37      86.71
14       391     129     1,049   7,526   14     6.05   251.11   7,573   14    6.05   27.29     280.41
15       749     777     7,348   16,086  10     4.35   169.25   16,097  9     4.00   18.48     188.00
16       1,041   736     1,063   3,611   11     4.70   19.63    3,621   11    4.65   2.77      22.71
17       3,512   2,992   3,425   12,533  20     8.45   178.58   12,830  17    7.40   13.19     199.36
18       11,456  10,791  10,114  27,622  15     6.25   160.22   28,857  10    4.35   22.29     184.63
19       11,292  11,454  20,184  49,871  12     5.00   317.79   50,283  9     3.75   37.83     355.19
20       131     129     26,258  13,811  8      3.65   72.17    14,186  5     2.45   8.23      81.60
Geomean                          10,804  11.49  4.99   72.13    11,023  9.80  4.29   8.77      82.29
Ratio 1                          1       1      1      1        1.020   0.854 0.860
Ratio 2                                                                              0.107     1

LUT – number of LUTs
Lev – number of LUT levels
Delay – delay using LUT library
Total – total runtime of Baseline
Time1 – the runtime of AIG restructuring only
Time2 – the total runtime of Speedup
Geomean – geometric averages of columns
Ratios – ratios of geometric averages

Basic Inner Core Algorithm (DSD)
We use a fast disjoint support decomposition (DSD) algorithm as our underlying subroutine
  – follows Bertacco and Damiani, "The disjunctive decomposition of logic functions", ICCAD '97
  – but
    • uses heuristics to speed it up
    • no BDDs
    • uses truth tables
  – limit inputs to up to 16

Disjoint Support Decomposition (DSD) (Simple Disjunctive Decomposition)
Theorem 1 [Ashenhurst 1959]. For a completely specified Boolean function, there is a unique maximal DSD (up to the complementation of inputs and outputs and factoring of ANDs/ORs and XORs).
F(a, c) = H(D(a), c)
[Figure: a DSD tree with blocks A, B, C, D, E, G over inputs x1 to x5, next to a block diagram in which D takes the variables a and feeds H together with c to produce F.]

Non-Disjoint Decomposition
Definition: A function F has an (a, b)-decomposition if it can be written as
F(x) = H(D(a, b), b, c)
where (a, b, c) is a partition of the variables x and D is a single-output function.
The variables in the set b are called the shared variables. The variables a are called the bound set and c the free set.
[Figure: block D takes a and b; block H takes the output of D together with b and c.]

Non-Disjoint Decomposition
Theorem 2: A function F(a, b, c) has an (a, b)-decomposition if and only if each of the cofactors of F with respect to b has a DSD structure in which the variables a are in a separate sub-tree.
[Figure: DSD trees of two cofactors with respect to b, one with blocks A, B, C, D, E, G and one with blocks W, X, Y, Z, both over inputs x1 to x5; the slide labels include the sets {x4, x5} and {x3}.]
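
For small functions, the existence of an (a, b)-decomposition can also be checked without DSD trees. The sketch below swaps in the classical column-multiplicity test: a single-output D(a) exists for a function exactly when the decomposition chart with bound set a has at most two distinct columns, and this is applied to every cofactor with respect to b. This is a stand-in for illustration, not the DSD-based procedure of Theorem 2; the names `column_multiplicity` and `has_ab_decomposition` and the example function are assumptions.

```python
from itertools import product


def column_multiplicity(f, n, bound, free, fixed=None):
    """Number of distinct columns of the decomposition chart of f.

    f takes n 0/1 arguments; bound and free are lists of argument
    indices; fixed maps further indices to constants (a cofactor).
    """
    fixed = fixed or {}
    columns = set()
    for a_vals in product((0, 1), repeat=len(bound)):
        column = []
        for c_vals in product((0, 1), repeat=len(free)):
            x = [0] * n
            for i, v in fixed.items():
                x[i] = v
            for i, v in zip(bound, a_vals):
                x[i] = v
            for i, v in zip(free, c_vals):
                x[i] = v
            column.append(f(*x))
        columns.add(tuple(column))
    return len(columns)


def has_ab_decomposition(f, n, a, b, c):
    """True if F = H(D(a, b), b, c) for some single-output D."""
    return all(
        column_multiplicity(f, n, a, c, dict(zip(b, b_vals))) <= 2
        for b_vals in product((0, 1), repeat=len(b))
    )


# F = (x3 ? x1 AND x2 : x1 OR x2) AND x4, with arguments ordered x1, x2, x3, x4.
F = lambda x1, x2, x3, x4: ((x1 & x2) if x3 else (x1 | x2)) & x4
shared_ok = has_ab_decomposition(F, 4, a=[0, 1], b=[2], c=[3])
disjoint_ok = has_ab_decomposition(F, 4, a=[0, 1], b=[], c=[2, 3])
```

In this example the bound set {x1, x2} admits no disjoint decomposition (three distinct columns), but sharing x3 makes each cofactor decomposable, which is the situation the (a, b)-decomposition is meant to capture.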

Application of Factoring (uses Theorem 2)
Rewriting a k-LUT mapped circuit.
• For each LUT, and each cut of no more than 16 inputs,
  – express the output of the LUT as a truth table in terms of the cut variables, F(x)
  – find variables b such that the cofactors of F with respect to b are support reducing
    • we exhaustively look for up to two variables in the b set
  – take the best (a, b) set and decompose F = H(D(a, b), b, c)
  – recursively decompose H and D if they do not fit into a k-LUT
  – if there is an improvement, replace the LUTs in the cut with the new decomposition
Experimental results later

Windowing a Node in the Network for Don’t-Care Computation
• Definition
  – A window for a node in the network is the context in which the don’t-cares are computed
• A window includes
  – n levels of the TFI
  – m levels of the TFO
  – all re-convergent paths captured in this scope
• Window with its PIs and POs can be considered as a separate network
[Figure: a window with n = 3 and m = 3 inside a Boolean network (k-LUT mapped circuit), with the window PIs and window POs marked.]

Care Set Representation
“Miter” constructed for the window POs
[Figure: the window and a copy of the same window with an inverter, compared by a miter over the window POs; signals are labeled x, f, and s.]
If output is 1 then we care
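
The care-set miter can be sketched by brute force on a toy window: a window-PI pattern is a care minterm when complementing the node changes at least one window PO. The function `care_set` and the example window are hypothetical, and enumeration stands in for the simulation and SAT machinery of the actual flow.

```python
from itertools import product


def care_set(node_fn, outputs_fn, num_pis):
    """Return the set of window-PI patterns on which the node value matters.

    node_fn(*pis) gives the node value; outputs_fn(v, *pis) gives the
    tuple of window POs when the node is forced to value v.
    """
    care = set()
    for x in product((0, 1), repeat=num_pis):
        v = node_fn(*x)
        # Miter: original window versus the same window with an inverter.
        if outputs_fn(v, *x) != outputs_fn(1 - v, *x):
            care.add(x)
    return care


# Toy window: node = a XOR b, single PO = node AND c.
node = lambda a, b, c: a ^ b
pos = lambda v, a, b, c: (v & c,)
care = care_set(node, pos, 3)
```

Here the node is observable only when c = 1, so the four patterns with c = 0 are don't-cares for the node.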

Resubstitution
Resubstitution considers a node in a Boolean network and expresses it using a different set of fanins
[Figure: node X before and after resubstitution.]
Computation can be enhanced by use of don’t cares

Resubstitution with Don’t-Cares
Consider all or some nodes in Boolean network. For each node
• Create window
• Select possible fanin nodes (divisors)
• For each candidate subset of divisors
  – Rule out some subsets using simulation
  – Check resubstitution feasibility using SAT
  – Compute resubstitution function using interpolation
    • A low-cost by-product of completed SAT proofs
• Update the network if there is an improvement

Resubstitution with Don’t Cares
• Given:
  – node function F(x) to be replaced
  – care set C(x) for the node
  – candidate set of divisors {g_i(x)} for re-expressing F(x)
• Find:
  – A resubstitution function h(y) such that F(x) = h(g(x)) on the care set
• SPFD Theorem: Function h exists if and only if every pair of care minterms, x1 and x2, distinguished by F(x), is also distinguished by g_i(x) for some i
[Figure: node F(x) with care set C(x) and divisors g1, g2, g3 is replaced by F’(x) = h(g) built on g1, g2, g3.]
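
The SPFD condition can be checked directly on small examples: h exists exactly when no two care minterms with different F values produce the same divisor pattern. The sketch below does this by enumeration and returns h as a partial truth table; it is a stand-in for the SAT check and interpolation used in practice, and `resubstitute` and the example functions are assumptions.

```python
from itertools import product


def resubstitute(F, C, divisors, n):
    """Return h as a dict {divisor pattern: value}, or None if no h exists.

    F and C take n 0/1 arguments; divisors is a list of such functions.
    Patterns missing from the result are don't-cares of h.
    """
    h = {}
    for x in product((0, 1), repeat=n):
        if not C(*x):
            continue                       # outside the care set
        y = tuple(g(*x) for g in divisors)
        if h.setdefault(y, F(*x)) != F(*x):
            return None                    # a pair F distinguishes but no g_i does
    return h


F = lambda a, b, c: (a & b) | c
g1 = lambda a, b, c: a
g2 = lambda a, b, c: c
care_all = lambda a, b, c: 1
care_some = lambda a, b, c: 1 - (a & (1 - b))   # a = 1, b = 0 never occurs

h_without_dc = resubstitute(F, care_all, [g1, g2], 3)
h_with_dc = resubstitute(F, care_some, [g1, g2], 3)
```

Without don't-cares the divisors a and c cannot express F, because the minterms (1, 0, 0) and (1, 1, 0) share a divisor pattern but differ in F. Once those minterms with a = 1, b = 0 are excluded from the care set, h is the OR of the two divisors.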

Checking Resubstitution using SAT
1. Note use of care set, C.
2. Resubstitution function exists if and only if SAT problem is unsatisfiable.
3. An h(g) is obtained by interpolation
[Figure: miter for resubstitution check. Two copies of the logic, over inputs x1 and x2, each compute C, f, and g1, g2, g3; the slide marks parts A and B and notes “SPFD theorem in practice”.]

Experimental Results
                             Baseline       Choices        Imfs           Imfs + Lutpack
Designs   PI    PO    Reg    LUT    Level   LUT    Level   LUT    Level   LUT    Level
alu4      14    8     0      821    6       785    5       558    5       453    5
apex2     39    3     0      992    6       866    6       806    6       787    6
apex4     9     19    0      838    5       853    5       800    5       732    5
bigkey    263   197   224    575    3       575    3       575    3       575    3
clma      383   82    33     3323   10      2715   9       1277   8       1222   8
des       256   245   0      794    5       512    5       483    4       480    4
diffeq    64    39    377    659    7       632    7       636    7       634    7
dsip      229   197   224    687    3       685    2       685    2       685    2
ex1010    10    10    0      2847   6       2967   6       1282   5       1059   5
ex5p      8     63    0      599    5       669    4       118    3       108    3
elliptic  131   114   1122   1773   10      1824   9       1820   9       1819   9
frisc     20    116   886    1748   13      1671   12      1692   12      1683   12
i10       257   224   0      589    9       560    8       548    7       547    7
pdc       16    40    0      2327   7       2500   6       194    5       171    5
misex3    14    14    0      785    5       664    5       517    5       446    5
s38417    28    106   1636   2684   6       2674   6       2621   6       2592   6
s38584    12    278   1452   2697   7       2647   6       2620   6       2601   6
seq       41    35    0      931    5       756    5       682    5       645    5
spla      16    46    0      1913   6       1828   6       289    4       263    4
tseng     52    122   385    647    7       649    6       645    6       645    6
geomean                      1168   6.16    1103   5.66    716    5.24    677    5.24
Ratio                        1.000  1.000   0.945  0.919   0.613  0.852   0.580  0.852
Ratio                                                      1.000  1.000   0.946  1.000

The Main Idea
• Consider registers and nodes of a design
  – Detect candidate equivalences in this set using random/guided simulation
  – Prove candidates by K-step induction
  – Merge the resulting equivalences
• This is a subset of sequential synthesis with
  – Practical advantages (does not move registers, etc.)
  – Scales to large designs
  – Offers substantial improvements
  – Comes with a verification guarantee

Base Case and Inductive Case
Candidate equivalences: {A,B}, {C,D}
• Base case: proving internal equivalences in initialized frames 0 through K-1
• Inductive case: assuming internal equivalences in uninitialized frames 0 through K-1, and proving internal equivalences in a topological order in frame K
[Figure: time-frame expansions for both cases. The base case starts from the initial state and the inductive case from a symbolic state; frames are driven by inputs PI0, PI1, …, PIk, and the equivalence checks are labeled SAT-1 through SAT-4.]
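
The base case and inductive case can be sketched for K = 1 on a toy circuit. This is a simplification: it proves one candidate register pair in isolation by enumerating states and inputs, whereas the method on the slides assumes all candidate equivalences together and uses SAT. The names `prove_equiv_1step` and `next_state` and the example circuit are hypothetical.

```python
from itertools import product


def prove_equiv_1step(next_state, init, num_regs, num_pis, pair):
    """Try to prove registers pair[0] and pair[1] equivalent by 1-step induction."""
    i, j = pair
    # Base case: the equivalence holds in the initialized frame 0.
    if init[i] != init[j]:
        return False
    # Inductive case: assume the equivalence in an arbitrary (uninitialized)
    # state and prove it in the next frame for every input.
    for s in product((0, 1), repeat=num_regs):
        if s[i] != s[j]:
            continue
        for x in product((0, 1), repeat=num_pis):
            t = next_state(s, x)
            if t[i] != t[j]:
                return False
    return True


def next_state(s, x):
    """Toy circuit: r0 and r1 toggle on the input; r2 latches r0 AND input."""
    r0, r1, r2 = s
    (inp,) = x
    return (r0 ^ inp, r1 ^ inp, r0 & inp)


init = (0, 0, 0)
r0_r1 = prove_equiv_1step(next_state, init, 3, 1, (0, 1))
r0_r2 = prove_equiv_1step(next_state, init, 3, 1, (0, 2))
```

Registers r0 and r1 are proved equivalent and could be merged; r0 and r2 agree in the initial state but fail the inductive step, so that candidate is dropped.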

Dynamic Partitioning (register correspondence)
Illustration for two candidate equiv. classes: {A,B}, {C,D}
[Figure: one time-frame of the design. Partition 1 assumes A = B and C = D and checks A’ =? B’; Partition 2 assumes A = B and C = D and checks C’ =? D’.]

Academic Benchmarks
Registers / Area / Delay
            Baseline  Reg Corr  Ratio  Sig Corr  Ratio
Registers   809.9     610.9     0.75   544.3     0.67
6-LUTs      2141      1725      0.80   1405      0.65
Delay       6.8       6.33      0.93   5.83      0.86
Runtime
            Reg Corr  Sig Corr  SEC     Synt & Map  Total
Geomean     7.186     29.846    81.583  16.760      135.376
Percentage  0.05      0.22      0.60    0.12        1.00
Columns “Baseline”, “Reg Corr” and “Sig Corr” show geometric means.

Industrial Benchmarks
Registers / Area / Delay
            Baseline  St Seq Sw  Ratio  Reg Corr  Ratio  Sig Corr  Ratio
Registers   5500      5248       0.954  4826      0.877  4788      0.871
6-LUTs      11497     11100      0.965  10421     0.906  9989      0.869
Depth       7.47      7.39       0.989  0.999     0.999  7.35      0.985
Runtime
            St Seq Sw  Reg Corr  Sig Corr  SEC     Synt & Map
Geomean     0.84       11.81     143.51    223.10  62.72
Ratio       0.01       0.19      2.29      3.58    1.00
In case of multiple clock domains, optimization was applied only to the domain with the largest number of registers.

Reasons for Large Improvements
• Redundancy introduced by HDL compilers
• Early logic duplication by the designer
• Accidental sequential redundancies
• Sequential redundancies present due to reuse of design components that had more functionality than needed

Motivation
• Fewer pin-to-pin connections should make the design easier to place and route
• Newer FPGAs allow two outputs per LUT
  – Thus fewer pin-to-pin connections should produce a mapping that “packs” better into dual-output LUTs

Area Recovery Overview
1. Perform delay-optimal mapping
2. Recover area off critical paths
  – Area-flow (global view)
    • Chooses cuts with better logic sharing
  – Exact local area (local view)
  Both are important
3. New idea: Cut-based area recovery algorithms can be extended to minimize edges (pin-to-pin connections)

WireMap Algorithm
1. Perform delay-optimal mapping
2. Recover area off critical paths
  – Area-flow (global view)
    • Break ties with minimum edge flow
  – Exact local area (local view)
    • Break ties with exact local edge count
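
The tie-breaking rule can be sketched as a lexicographic comparison: cuts are ranked by area first, and edges decide only among cuts of equal area. The `Cut` record, the `best_cut` helper, and the cost numbers below are made up for illustration and are not taken from the WireMap implementation.

```python
from collections import namedtuple

# Hypothetical per-cut costs; real values would come from the mapper.
Cut = namedtuple("Cut", "leaves area_flow edge_flow")


def best_cut(cuts):
    """Pick the cut with minimum area flow, breaking ties by minimum edge flow."""
    return min(cuts, key=lambda c: (c.area_flow, c.edge_flow))


candidates = [
    Cut(leaves=("a", "b", "c", "d"), area_flow=2.0, edge_flow=6.0),
    Cut(leaves=("x", "y"), area_flow=2.0, edge_flow=4.0),
    Cut(leaves=("x", "c", "d"), area_flow=2.5, edge_flow=3.0),
]
chosen = best_cut(candidates)
```

The third cut has the fewest edges but is not chosen, because area is compared before edges. With floating-point costs a practical mapper would probably compare areas within a small tolerance rather than for exact equality.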

Experimental Setup
• WireMap implemented in ABC
• Compared WireMap against two algorithms in ABC
  – Baseline: basic mapping with area recovery
  – Mapping with Structural Choices: mapping with area recovery for several netlists produced by synthesis
• WireMap was implemented on top of mapping with choices
• Used VPR to place/route design for wirelength and critical path delays
• Used maximum cardinality matching to pack single-output LUTs into dual-output LUTs

Results Summary
• Comparing WireMap against the best mapping with structural choices in ABC
• WireMap results:
  – Reduction in edges by 9.3%
  – Reduction in dual-output LUT count by 9.4%, compared to mapping with choices
    • Single-output LUT count only reduced by 1.3%
  – Reduction in wire length by 8.5%
  – Reduction in power by 20%

Comb and Seq Boxes
[Figure: a netlist with inputs a, b, c, outputs o1 to o4, internal nodes n1 to n8, flip-flops (FF), two sequential boxes, and one combinational box.]

Treating Boxes as Black
[Figure: the same netlist with inputs a, b, c, outputs o1 to o4, internal nodes n1 to n8, flip-flops (FF), two sequential boxes, and one combinational box.]
For simplicity, boxes can be treated as “black”. Thus box outputs become inputs to the rest of the logic and box inputs become outputs. Delay and logic information is lost.

Treating Boxes as White
[Figure: the same netlist with inputs a, b, c, outputs o1 to o4, internal nodes n1 to n8, flip-flops (FF), two sequential boxes, and one combinational box.]
Example: Nodes o1 and o3 may be equivalent in the design, but this equivalence cannot be detected if the boxes are treated as black.
Solution: Consider logic inside white boxes for synthesis, but keep it unchanged during synthesis and mapping.

Future Work
ABC
• Integrating synthesis/mapping/retiming
• Co-developing synthesis and verification
• Integrating synthesis with place and route
• Improving AIG-based synthesis and mapping
• Creating special configurable design flows
• Supporting emerging technologies

To Learn More
• Visit ABC webpage http://www.eecs.berkeley.edu/~alanmi/abc
• Read recent papers http://www.eecs.berkeley.edu/~alanmi/publications
• Send email
  – [email protected] (@eecs.berkeley.edu)
  – [email protected] (@eecs.berkeley.edu)