Discovering Invariants in the Analysis and Verification of Finite State Transition Systems
by
Jie-Hong Roland Jiang
B.S. (National Chiao Tung University, Taiwan) 1996
M.S. (National Chiao Tung University, Taiwan) 1998
A dissertation submitted in partial satisfaction of the requirements for the degree of
Doctor of Philosophy
in
Engineering - Electrical Engineering and Computer Sciences
in the
GRADUATE DIVISION
of the
UNIVERSITY OF CALIFORNIA, BERKELEY
Committee in charge:
Professor Robert K. Brayton, Chair
Professor Alberto Sangiovanni-Vincentelli
Professor Kam-Biu Luk
Fall 2004
The dissertation of Jie-Hong Roland Jiang is approved.
University of California, Berkeley
Fall 2004
Discovering Invariants in the Analysis and Verification
of Finite State Transition Systems
Copyright © 2004
by
Jie-Hong Roland Jiang
Abstract
Discovering Invariants in the Analysis and Verification
of Finite State Transition Systems
by
Jie-Hong Roland Jiang
Doctor of Philosophy in Engineering - Electrical Engineering and Computer Sciences
University of California, Berkeley
Professor Robert K. Brayton, Chair
Hardware and software systems are evolving at a fascinating speed, thanks to the refinements of semiconductor technologies. However, verifying their correctness has become a daunting task because of the state explosion problem. Because simulation can validate only a small portion of a modern design's functional coverage, formal methods have become indispensable tools in certifying design correctness. Although significant progress has been achieved in this area, the state of the art still falls far short of what is required, and there is plenty of room for improvement. This thesis addresses several issues in the formal analysis and verification of finite state transition systems.
By identifying certain invariants, we study four subjects in the analysis and verification of finite state transition systems. First, we establish the most general definition of combinationality in designs with cyclic definitions, which occur naturally in systems specified in high-level description languages due to resource sharing, module composition, and the like. This result is further extended to determine the sequential determinism of systems with state-holding
elements. Second, we study the transformation invariants under retiming and resynthesis operations, which are among the most practical techniques for optimizing synchronous hardware systems. We characterize the optimization power of these operations and demonstrate the verification complexity of checking retiming and resynthesis equivalence. We also show how to rectify initialization sequences invalidated by these transformations. Third, we revisit the equivalence checking of two finite state transition systems, one of the most important problems in design verification. We demonstrate how the verification task can be accomplished with symbolic computations in the disjoint union state space, rather than in the traditional product state space, of the two systems. Finally, because abstraction is one of the most promising techniques for mitigating the state explosion problem, we investigate a reachability-preserving abstraction technique based on functional dependency. By extending combinational dependency to its sequential counterpart, the detection of functional dependency can be decoupled from reachability analysis. Moreover, our computation can be integrated into reachability analysis as an on-the-fly reduction.
Professor Robert K. Brayton
Dissertation Committee Chair
To My Parents
Contents
Contents ii
List of Figures vi
List of Tables viii
Acknowledgements ix
1 Introduction 1
1.1 Invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Main Results and Connections to Invariants . . . . . . . . . . . . . . . . . . 2
1.3 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Preliminaries 6
2.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Binary Decision Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.4 Finite State Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3 Combinationality and Sequential Determinism 9
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3 Combinationality at the Functional Level . . . . . . . . . . . . . . . . . . . 17
3.3.1 Formulation of Combinationality . . . . . . . . . . . . . . . . . . . . 17
3.3.2 Computation Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3.3 Generality Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.3.4 Conditions of Legitimacy . . . . . . . . . . . . . . . . . . . . . . . . 23
3.3.5 Stable Cyclic Dependencies . . . . . . . . . . . . . . . . . . . . . . . 23
3.3.6 Input-Output Determinism of State Transition Systems . . . . . . . 25
3.4 Combinationality at the Circuit Level . . . . . . . . . . . . . . . . . . . . . 27
3.4.1 Synthesis of Cyclic Circuits . . . . . . . . . . . . . . . . . . . . . . . 27
3.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.5.1 SEG vs. GMW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.5.2 Combinationalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.5.3 Sequential Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4 Retiming and Resynthesis 33
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.3 Optimization Capability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.3.1 Optimization Power of Retiming . . . . . . . . . . . . . . . . . . . . 40
4.3.2 Optimization Power of Retiming and Resynthesis . . . . . . . . . . . 42
4.3.3 Retiming-Resynthesis Equivalence and Canonical Representation . . 45
4.4 Verification Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.4.1 Verification with Unknown Transformation History . . . . . . . . . . 48
4.4.2 Verification with Known Transformation History . . . . . . . . . . . 50
4.5 Initialization Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.5.1 Initialization Affected by Retiming . . . . . . . . . . . . . . . . . . . 52
4.5.2 Initialization Affected by Retiming and Resynthesis . . . . . . . . . 53
4.6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5 Equivalence Verification 59
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.2 Definitions, Notation and Preliminaries . . . . . . . . . . . . . . . . . . . . 62
5.2.1 Equivalence Relations and Partitions . . . . . . . . . . . . . . . . . . 62
5.2.2 Functional Decomposition . . . . . . . . . . . . . . . . . . . . . . . . 64
5.3 Identification of State Equivalence . . . . . . . . . . . . . . . . . . . . . . . 65
5.3.1 State Equivalence vs. Functional Decomposition . . . . . . . . . . . 66
5.3.2 Algorithm for Equivalent State Identification . . . . . . . . . . . . . 67
5.3.3 Robust Equivalent State Identification . . . . . . . . . . . . . . . . . 70
5.4 Verification of Sequential Equivalence . . . . . . . . . . . . . . . . . . . . . 74
5.4.1 Multiplexed Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.4.2 Algorithm for Sequential Equivalence Checking . . . . . . . . . . . . 76
5.4.3 Robust Sequential Equivalence Checking . . . . . . . . . . . . . . . . 78
5.4.4 Error Tracing and Shortest Distinguishing Sequence . . . . . . . . . 81
5.4.5 State-Space Partitioning on Separate Machines . . . . . . . . . . . . 81
5.4.6 State-Space Partitioning on Product Machine . . . . . . . . . . . . . 82
5.5 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.5.1 Implementation-Independent Aspects . . . . . . . . . . . . . . . . . 83
5.5.2 Implementation-Dependent Aspects . . . . . . . . . . . . . . . . . . 86
5.6 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.7 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.7.1 Computation of State Equivalence . . . . . . . . . . . . . . . . . . . 92
5.7.2 Verification of FSM Equivalence . . . . . . . . . . . . . . . . . . . . 93
5.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6 Verification Reduction 100
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.2 Preliminaries and Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.3 Functional Dependency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.3.1 Combinational Dependency . . . . . . . . . . . . . . . . . . . . . . . 103
6.3.2 Sequential Dependency . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.4 Verification Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
6.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
7 Conclusions and Future Work 124
Bibliography 130
List of Figures
3.1 (i) SEG for x = 0. (ii) SEG for x = 1. . . . . . . . . . . . . . . . . . . . . . 16
3.2 (i) The original circuit. (ii) The induced circuit under input assignment a = 0 and b = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1 Algorithm: Construct quotient graph. . . . . . . . . . . . . . . . . . . . . . 46
4.2 Algorithm: Verify equivalence under retiming and resynthesis. . . . . . . . . 47
4.3 The STG in (i) is transformable to the STG in (ii) by a 2-way switch operation while the reverse direction is not transformable. Since the operation is not reversible, it falls beyond the transformation power of retiming and resynthesis. 56
5.1 Algorithm CompNewPartition: Compute New Partition. . . . . . . . . . . . 67
5.2 Algorithm IDES5.1: Identify Equivalent States, Equation (5.1). . . . . . . . 70
5.3 Algorithm IDES5.2: Identify Equivalent States, Equation (5.2). . . . . . . . 71
5.4 Algorithm IDES5.3: Identify Equivalent States, Equation (5.3). . . . . . . . 72
5.5 Multiplexed Machine. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.6 Algorithm: Verify Sequential Equivalence. . . . . . . . . . . . . . . . . . . . 78
6.1 Algorithm: CombinationalDependency. . . . . . . . . . . . . . . . . . . . . . 106
6.2 The greatest fixed-point calculation of sequential dependency. The first three iterations are illustrated in (i), (ii), and (iii). In each iteration, transition functions (and thus next-state variables) are partitioned into dependent and independent parts by the computation of combinational dependency. The derived dependency is used to reduce the state space in the subsequent iteration. 109
6.3 Algorithm: SequentialDependencyGfp. . . . . . . . . . . . . . . . . . . . . . 110
6.4 The least fixed-point calculation of sequential dependency. The first three iterations are illustrated in (i), (ii), and (iii). In each iteration, transition functions (and thus next-state variables) are partitioned into dependent and independent parts by the computation of combinational dependency. The derived dependency is used to reduce the state space in the subsequent iteration. 111
6.5 Algorithm: SequentialDependencyLfp. . . . . . . . . . . . . . . . . . . . . . 113
6.6 Algorithm: ComputeReachWithDependencyReduction. . . . . . . . . . . . . 115
List of Tables
5.1 Profiles of Benchmark Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.2 Characteristics of Equivalence Classes of Benchmark Circuits . . . . . . . . 96
5.3 Sequential Equivalence Checking between Identical Circuits . . . . . . . . . 97
5.4 Sequential Equivalence Checking between Different Implementations of Same Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.5 Overall Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.1 Comparisons of Capabilities of Discovering Dependency . . . . . . . . . . . 117
6.2 Comparisons of Capabilities of Checking Equivalence . . . . . . . . . . . . . 119
6.3 Comparisons of Capabilities of Analyzing Reachability . . . . . . . . . . . . 120
Acknowledgements
I am deeply grateful to my advisor Bob Brayton for his guidance and support throughout my graduate study at Berkeley. His broad knowledge of, and enthusiastic interest in, the field of electronic design automation have always been my aim. Many thanks to him for giving me flexibility in research and for tolerating my diverse interests in unrelated fields, which was of great benefit to me in getting non-zero knowledge about other subjects such as quantum physics, theoretical computer science, and mathematical logic.
My thanks go to Alberto Sangiovanni-Vincentelli and Andreas Kuehlmann for their valuable feedback on my thesis work, and for their respective courses on embedded system design and logic synthesis. During summer trips to Dagstuhl, Germany, in 2002, Lisbon, Portugal, in 2003, and Pelion, Greece, in 2004, my interest in hardware verification grew greatly, thanks to Andreas for organizing these workshops on formal equivalence verification and to Bob for financial support. My understanding of verification in a more general setting is due to the class taught by Tom Henzinger.
I am grateful to Kam-Biu Luk for teaching me quantum mechanics and serving on my
thesis committee. I hope to produce some scientific work on quantum computation in the
near future, in addition to other “classical” work.
I would like to express my thanks to Tiziano Villa and Nina Yevtushenko for sharing their work on the language equation formulation of finite state machine synthesis. Because of them, I developed an interest in the theory of automata and languages. A meeting with them in Rome in 2003 was a joyful trip, full of research ideas as well as historical sightseeing led by Tiziano.
Alan Mishchenko is the one who made me realize how far I am from being a good programmer. His generosity in sharing his expertise and ideas is greatly appreciated. My thesis work has benefited from many discussions with him.
I cherish the interactions with my colleagues in our group, Yunjian Jiang, Philip Chong,
Subarna Sinha, Fan Mo, Minxi Gao, Yinghua Li, Satrajit Chatterjee, Donald Chai, and my
officemates: Farinaz Koushanfar, Guang Yang, Haibo Zeng. Also I thank Rupak Majumdar
for his patience in answering my many random questions.
My stay at Berkeley could not have been more joyful, thanks to my fellow Taiwanese students: Dah-Wei Chiou, En-Yi Lin, Te-Sheng Hsiao, Yu-Chuan Tai, Stanley Bo-Ting Wang, Cheng-En Wu, Mandy Yang, Cheng-Han Yu, and many others. Dah-Wei is the one who introduced me to quantum physics and many other subjects.
My study at Berkeley would not have been possible without the help of Jin-Yang Jou, Yao-Wen Chang, and the late Wen-Zen Shen, who were my mentors during my early development at National Chiao Tung University. I deeply appreciate Iris Hui-Ru Jiang for her help during my graduate study at Hsinchu.
The love and encouragement of my parents and brother have been my strongest support.
To my parents, I dedicate my thesis.
Chapter 1
Introduction
Give me a place to stand, and with a lever I will move the world.
— Archimedes
In this thesis, the place to stand on is finite state transition systems; the lever to use is
invariants. However, we are not sure if the world can be moved at all.
1 Engraving from Mechanics Magazine, London, 1824. http://www.mcs.drexel.edu/~crorres/Archimedes/Lever/LeverIntro.html
1.1 Invariants
Invariants are levers in mathematics! They play a fundamental role in many branches of
mathematics. Below we list two types of invariants, each with a well-known example among many others.
I. Invariants classify mathematical objects: The Euler number v−e+f , a topological
invariant, can be used as a demonstration of the topological non-equivalence of poly-
hedra, where v, e and f denote the numbers of vertices, edges and faces, respectively,
of a polyhedron.
II. Invariants simplify computation: At the age of seven, Gauss summed the integers from 1 to 100 instantly by observing that the sum consists of 50 pairs of numbers, each pair summing to an invariant of 101 [Mac].
The invariants of Type I are sometimes apparent from the context and often essential in the
studied subject-matter. In comparison, the invariants of Type II are somewhat opaque and
behave more like auxiliary catalysts. The invariants that we will encounter fall into both
categories. They form powerful tools in the analysis and verification of finite state transition systems.
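Both types of invariant can be made concrete in a few lines. The polyhedra below are illustrative choices, not examples from the text.

```python
# Type I: the Euler number v - e + f classifies surfaces. Every polyhedron
# homeomorphic to a sphere yields 2; a torus-shaped mesh yields 0, so the
# two can never be topologically equivalent.
def euler_number(v, e, f):
    return v - e + f

assert euler_number(8, 12, 6) == 2     # cube
assert euler_number(4, 6, 4) == 2      # tetrahedron
assert euler_number(16, 32, 16) == 0   # a 4x4 quadrilateral torus mesh

# Type II: Gauss's pairing. Each of the 50 pairs (1, 100), (2, 99), ...
# sums to the invariant 101, so the total is simply 50 * 101.
assert 50 * 101 == sum(range(1, 101))
```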
1.2 Main Results and Connections to Invariants
This thesis mainly covers four (somewhat independent) subjects regarding the analysis
and verification of finite state transition systems.
1. Combinationality and sequential determinism. We consider finite state transition
systems described by a set of definitions which specify the valuations of variables in
terms of functions of other variables. In the special case where a system has no
state-holding elements, definitions with an acyclic relation of information processing
induce a definite stateless, or combinational, behavior. However, the converse is not
necessarily true; definitions with a cyclic relation may induce a definite combinational
behavior. We show the most general condition under which a set of cyclic definitions
induces a combinational system at the functional level. Our analysis is legitimate
when the cyclic definitions are to be broken2 or the synthesis target is software. Our
condition admits a higher level combinationality analysis and yields more flexible
descriptions of combinational systems. Furthermore, we extend the results to finite
state transition systems to verify the determinism of their input-output behavior.
Our results are achieved through showing an invariant which exactly characterizes
combinationality.
2. Retiming and resynthesis. Transformations using retiming and resynthesis opera-
tions are among the most important and practical techniques in optimizing syn-
chronous hardware systems. We study their corresponding transformation power and
verification complexity by identifying some transformation invariants under retiming
and resynthesis. We present a constructive algorithm to determine if two given fi-
nite state machines are transformable to each other using retiming and resynthesis
operations. We show the above problem is PSPACE-complete. In addition, we study
the effect of retiming and resynthesis on initialization sequences of synchronous hard-
ware systems with implicit reset. It is known that the original initialization sequences
should be prefixed with an arbitrary input sequence of a certain length. An algorithm
is proposed to determine the length increase.
3. Equivalence checking. The above analysis was limited to verifying the restricted
2 We use the term “broken” in the sense that the definitions are rewritten such that there is no cycle of dependency in the definitions, i.e., all cyclic dependencies are broken.
equivalence of FSMs transformed under retiming and resynthesis. Here we consider
the equivalence checking of FSMs under arbitrary transformations. This is one of the
most challenging problems in VLSI design verification. Prior symbolic approaches to
the problem are based on reachability analysis over a product machine of the two finite
state machines. Two finite state machines are equivalent if the output of the product
machine is an invariant (a constant), demonstrating that no observable difference is produced throughout reachability analysis.
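As a concrete sketch of this traditional formulation (explicit-state, with hypothetical two-state machines; symbolic versions replace the set of reached pairs with a BDD), the product-machine check explores reachable state pairs and verifies that the "outputs agree" signal is an invariant:

```python
def equivalent_by_product(delta1, out1, q1_0, delta2, out2, q2_0, inputs):
    """Explicit-state product-machine equivalence check: explore reachable
    pairs (q1, q2); the machines are equivalent iff out1 == out2 holds on
    every reachable pair for every input."""
    reached, frontier = {(q1_0, q2_0)}, [(q1_0, q2_0)]
    while frontier:
        q1, q2 = frontier.pop()
        if any(out1(x, q1) != out2(x, q2) for x in inputs):
            return False                       # observable difference found
        for x in inputs:
            nxt = (delta1(x, q1), delta2(x, q2))
            if nxt not in reached:
                reached.add(nxt)
                frontier.append(nxt)
    return True

# Two encodings of the same toggle machine (hypothetical example).
enc = {'a': 0, 'b': 1}
dec = {0: 'a', 1: 'b'}
delta1 = lambda x, q: q ^ x
out1 = lambda x, q: q
delta2 = lambda x, q: dec[enc[q] ^ x]
out2 = lambda x, q: enc[q]
assert equivalent_by_product(delta1, out1, 0, delta2, out2, 'a', [0, 1])
```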
We present another possibility of verifying sequential equivalence. Rather than veri-
fying in the product state space of two state machines, we verify equivalence in their
disjoint union state space. In particular, the partition of equivalence classes over the
state space is iteratively refined. The corresponding invariant to be certified is that
initial states of the two machines remain in the same equivalence class throughout
the refinement process. The proposed approach differs from prior work in that the
verification efficiency is governed by the encountered number of equivalence classes
rather than the number of state variables. It is often more robust than reachability
analysis on the product machine.
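The refinement scheme can be sketched in explicit-state form (the thesis performs it symbolically; the machines below are hypothetical): treat both machines as one transition system over their disjoint union state space, refine the output-based partition to a fixed point, and check that the two initial states end up in the same block.

```python
from collections import defaultdict

def equivalent_by_refinement(states, delta, out, init1, init2, inputs):
    """Explicit-state sketch of equivalence checking in the disjoint union
    state space: refine the output-based partition until a fixed point;
    the machines are equivalent iff the two initial states remain in the
    same equivalence class."""
    def partition(label):
        groups = defaultdict(set)
        for q in states:
            groups[label[q]].add(q)
        return frozenset(frozenset(g) for g in groups.values())

    # Initial partition: states agreeing on outputs for every input.
    label = {q: tuple(out(x, q) for x in inputs) for q in states}
    while True:
        # Refine: a block splits if its states' successors land in
        # different blocks for some input.
        refined = {q: (label[q], tuple(label[delta(x, q)] for x in inputs))
                   for q in states}
        if partition(refined) == partition(label):
            return label[init1] == label[init2]
        label = refined

# Disjoint union of two encodings of a toggle machine (hypothetical).
states = [('A', 0), ('A', 1), ('B', 'a'), ('B', 'b')]
enc = {'a': 0, 'b': 1}
dec = {0: 'a', 1: 'b'}

def delta(x, s):
    m, q = s
    return (m, q ^ x) if m == 'A' else (m, dec[enc[q] ^ x])

def out(x, s):
    m, q = s
    return q if m == 'A' else enc[q]

assert equivalent_by_refinement(states, delta, out, ('A', 0), ('B', 'a'), [0, 1])
```

Note that the work per iteration is governed by the number of equivalence classes, mirroring the robustness claim above.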
4. Verification reduction. Abstraction is an important technique to cope with the state
explosion problem in formal verification of finite state transition systems. We focus
on a reachability-preserving reduction through functional dependency for safety prop-
erty verification. Essentially, functional dependency is an invariant characterizing the
representation redundancy of a given state transition system. Extracting functional
dependency allows us to reexpress the transition system using more compact transition
functions.
Prior derivations of functional dependency relied on reachability analysis, and thus
were computationally expensive and not scalable to large transition systems. We
propose another construction by detecting functional dependency directly from the
set of transition functions. Thus, reachability analysis is not a necessity for exploiting
dependency. In addition, the detection of functional dependency can be integrated
into reachability analysis as an on-the-fly reduction.
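A minimal explicit sketch of the core test, assuming small explicit truth tables (the thesis works symbolically): a transition function is functionally dependent on a set of other transition functions iff two state/input valuations that agree on all of the others never disagree on it.

```python
from itertools import product

def is_dependent(target, base, n_vars):
    """Combinational dependency check: `target` is a function of the
    functions in `base` iff equal base valuations never map to different
    target values. Each function takes a tuple of n_vars Boolean values."""
    seen = {}
    for v in product([0, 1], repeat=n_vars):
        key = tuple(f(v) for f in base)     # image under the base functions
        val = target(v)
        if seen.setdefault(key, val) != val:
            return False                     # same base image, different target
    return True

# Hypothetical transition functions over two state variables:
d1 = lambda v: v[0] ^ v[1]
d2 = lambda v: v[0] & v[1]
d3 = lambda v: v[0] | v[1]
assert is_dependent(d3, [d1, d2], 2)   # OR = XOR ∨ AND, so d3 is redundant
assert not is_dependent(d2, [d1], 2)   # AND is not a function of XOR alone
```

When `is_dependent` succeeds, the mapping recorded in `seen` tabulates the dependency function, so the dependent state variable can be dropped from the encoding.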
The invariants of the first three subjects are of Type I, while that of the last subject is of Type II.
1.3 Thesis Organization
The thesis is organized as follows. Common preliminaries and definitions are introduced
in Chapter 2. Chapter 3 studies the fundamental formulation of combinationality and
sequential determinism. The other three subjects to be discussed are ordered from specific
to general, with the most specific analysis of the transformation of retiming and resynthesis
in Chapter 4. The more general sequential equivalence checking is studied in Chapter
5. Verification reduction using functional dependency is discussed in Chapter 6. Finally,
concluding remarks are given in Chapter 7.
Chapter 2
Preliminaries
2.1 Notation
We use |S| to denote the cardinality (or size) of a set S. Also, suppose V1 is a set of
variables. Notation [[V1]] represents the set of all possible valuations over V1. Let V2 ⊆ V1.
For x ∈ [[V1]], we use x[V2] ∈ [[V2]] to denote the valuation over variables V2 which agrees
with x on V2.
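In explicit form (a sketch for small variable sets; this only illustrates the notation, not the symbolic machinery used later):

```python
from itertools import product

def valuations(variables):
    """[[V]]: the set of all assignments over a set of Boolean variables."""
    vs = sorted(variables)
    return [dict(zip(vs, bits)) for bits in product([0, 1], repeat=len(vs))]

def project(x, v2):
    """x[V2]: the restriction of valuation x to the variables in V2,
    which agrees with x on V2."""
    return {v: x[v] for v in v2}

V1 = {'a', 'b', 'c'}
V2 = {'a', 'c'}
assert len(valuations(V1)) == 2 ** len(V1)     # |[[V1]]| = 8
x = {'a': 1, 'b': 0, 'c': 1}
assert project(x, V2) == {'a': 1, 'c': 1}
```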
2.2 Graphs
A graph G consists of a vertex set V and an edge set E. Any edge e ∈ E connects
two vertices, say u, v ∈ V. For an undirected edge e = {u, v}, there is no ordering on u and v. For a directed edge e = (u, v), the connection goes from u to v. In this case we say that u is
the predecessor of v (or v is the successor of u) with respect to e. A graph is said to be
undirected (directed) if all of its edges are undirected (directed). A vertex is of degree n
(a non-negative integer) if it is contained in n edges. For directed graphs, the degree of a
vertex can be further distinguished: a vertex is of indegree j and outdegree k if it is the
successor vertex of j edges and the predecessor vertex of k edges, respectively.
In this thesis, the graphs we encounter are directed and contain no multi-edges between
any pair of vertices. We will be using graphs extensively to represent circuits, Boolean
functions, finite state machines, etc.
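A minimal directed-graph sketch consistent with the conventions above (no multi-edges; indegree and outdegree as defined):

```python
from collections import defaultdict

class Digraph:
    """Directed graph without multi-edges between any pair of vertices."""
    def __init__(self):
        self.succ = defaultdict(set)   # u -> set of successors of u
        self.pred = defaultdict(set)   # v -> set of predecessors of v

    def add_edge(self, u, v):
        self.succ[u].add(v)            # set semantics rule out multi-edges
        self.pred[v].add(u)

    def indegree(self, v):
        return len(self.pred[v])       # edges of which v is the successor

    def outdegree(self, v):
        return len(self.succ[v])       # edges of which v is the predecessor

g = Digraph()
for u, v in [('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'b')]:  # dup ignored
    g.add_edge(u, v)
assert g.indegree('c') == 2 and g.outdegree('a') == 2
```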
2.3 Binary Decision Diagrams
A binary decision diagram (BDD) is a directed graphical data structure similar to
a binary tree. A vertex is either a terminal node (or leaf) representing logical value true
or false, or a nonterminal node representing a decision point. Any nonterminal node
is associated with a binary decision variable and has two outgoing edges, the then- and
else-edge, representing the two possible branches of the valuation of the decision variable.
Consequently, a BDD is capable of representing any Boolean function. A special type of
BDD gains the most attention: the reduced ordered BDD (ROBDD). A BDD is ordered
if the visited variable sequence (without repetitions) along any path from the root to a
leaf obeys some total order. An ordered BDD is reduced if no two BDD nodes represent the
same function. ROBDDs are a useful data structure, which supports efficient representation
and manipulation of Boolean functions. Two important properties of ROBDDs should be
mentioned. First, the efficiency of representing a Boolean function using an ROBDD is
strongly affected by the variable ordering. Second, given a function and a variable ordering,
the corresponding ROBDD is canonical. Due to its unique properties, the ROBDD has
pervasive applications in formal verification and logic synthesis. The reader is referred to
[Bry92] for a more detailed exposition. In the sequel, when a BDD is mentioned, we shall
mean an ROBDD.
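As an illustrative sketch (not the BDD packages used elsewhere in the thesis), a minimal ROBDD manager: a unique table enforces both reduction rules (no redundant tests, no duplicate nodes), and `apply` combines functions by Shannon expansion on the top variable in the order.

```python
class BDD:
    """A minimal ROBDD manager. Terminals: FALSE = 0, TRUE = 1."""
    def __init__(self, var_order):
        self.order = {v: i for i, v in enumerate(var_order)}
        self.unique = {}                  # (var, then, else) -> node id
        self.nodes = {0: None, 1: None}   # node id -> (var, then, else)
        self._next = 2

    def mk(self, var, t, e):
        if t == e:                        # reduction: redundant test
            return t
        key = (var, t, e)
        if key not in self.unique:        # hash-consing: canonicity
            self.unique[key] = self._next
            self.nodes[self._next] = key
            self._next += 1
        return self.unique[key]

    def var(self, v):
        return self.mk(v, 1, 0)

    def apply(self, op, f, g, memo=None):
        """Combine nodes f and g with Boolean operator op."""
        if memo is None:
            memo = {}
        if f in (0, 1) and g in (0, 1):
            return int(op(bool(f), bool(g)))
        key = (f, g)
        if key in memo:
            return memo[key]
        vf = self.nodes[f][0] if f > 1 else None
        vg = self.nodes[g][0] if g > 1 else None
        if vg is None or (vf is not None and self.order[vf] <= self.order[vg]):
            v = vf
        else:
            v = vg
        def cof(n, val):                  # cofactor of n with respect to v
            if n <= 1 or self.nodes[n][0] != v:
                return n
            _, t, e = self.nodes[n]
            return t if val else e
        r = self.mk(v,
                    self.apply(op, cof(f, True), cof(g, True), memo),
                    self.apply(op, cof(f, False), cof(g, False), memo))
        memo[key] = r
        return r

mgr = BDD(['x', 'y'])
x, y = mgr.var('x'), mgr.var('y')
AND = lambda a, b: a and b
# Canonicity: building the same function twice yields the same node id.
assert mgr.apply(AND, x, y) == mgr.apply(AND, x, y)
```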
2.4 Finite State Machines
A finite state transition system can be modelled as a finite state machine. A finite
state machine1 (FSM) M is a tuple (Q, I,Σ, Ω, ~δ, ~ω), where Q is a finite set of states,
I ⊆ Q is the set of initial states, Σ and Ω are the sets of input and output alphabets,
respectively, and ~δ : Σ × Q → Q is the transition function. For a Moore machine, the
output function ~ω : Q → Ω depends on the current state; for a Mealy machine, the output
function ~ω : Σ × Q → Ω depends on both the input and current state. (In most of our
discussions, we simply consider the Mealy machine since the extension to the Moore machine
is straightforward.) Let VS, VI, and VO be the sets of variables that encode the states, input alphabets, and output alphabets, respectively. Then Q = [[VS]], Σ = [[VI]], and Ω = [[VO]].
Given an FSM, we can construct a state transition graph, where a vertex represents a
state and a labelled edge represents a possible transition with the corresponding predicate.
In addition, vertices (for a Moore machine) or edges (for a Mealy machine) are further
labelled with the observations induced by output functions of the FSM.
1 In the thesis, we assume finite state machines are deterministic and completely specified.
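A minimal explicit-state Mealy machine consistent with the tuple definition above; the two-state parity machine is a hypothetical example:

```python
class MealyFSM:
    """A deterministic, completely specified Mealy machine
    M = (Q, I, Sigma, Omega, delta, omega)."""
    def __init__(self, states, initial, delta, omega):
        self.states, self.initial = states, initial
        self.delta = delta     # delta : Sigma x Q -> Q
        self.omega = omega     # omega : Sigma x Q -> Omega

    def run(self, q, word):
        """Feed an input word from state q; return the output word."""
        outs = []
        for x in word:
            outs.append(self.omega(x, q))  # output depends on input and state
            q = self.delta(x, q)
        return outs

# Parity machine: outputs the running XOR of the inputs seen so far.
parity = MealyFSM(states={0, 1}, initial={0},
                  delta=lambda x, q: q ^ x,
                  omega=lambda x, q: q ^ x)
assert parity.run(0, [1, 0, 1, 1]) == [1, 1, 0, 1]
```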
Chapter 3
Combinationality and Sequential
Determinism
In the course of hardware system design or real-time process control, high-level spec-
ifications may contain simultaneous definitions of concurrent modules whose information
flow forms cyclic dependencies without the separation of state-holding elements. The tem-
poral behavior of these cyclic definitions may be meant to be combinational rather than
sequential. Most prior approaches to analyzing cyclic combinational circuits were built
upon the formulation of ternary-valued simulation at the circuit level. This chapter shows
the limitation of this formulation and investigates, at the functional level, the most general
condition where cyclic definitions are semantically combinational. It turns out that the
prior formulation is a special case of our treatment. Our result admits strictly more flexible
high-level specifications. Furthermore, it allows a higher-level analysis of combinationality, and thus avoids the costly synthesis of a high-level description into a circuit netlist before combinationality analysis can be performed. With our formulation, when the target is software
implementations, combinational cycles need not be broken as long as the execution of the
underlying system obeys a sequencing execution rule. For hardware implementations, com-
binational cycles are broken and replaced with acyclic equivalents at the functional level to
avoid malfunctioning in the final physical realization.
3.1 Introduction
Cyclic definitions occur commonly in high-level system descriptions (e.g., due to resource
sharing, module composition, etc.) as was observed in [Sto92]. Checking if cyclic definitions
are semantically combinational is crucial in both hardware and software synthesis for two
reasons. First, the analysis certifies the legitimacy of cyclic definitions. Second, if the cyclic
definitions of a system are inappropriate but breakable, the analysis provides a means of
rewriting the system which breaks the cycles.
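To make the phenomenon concrete, here is a small illustrative system (not an example from the chapter): two definitions that depend on each other cyclically yet resolve to a unique value for every input, which a three-valued (0, 1, X) fixed-point simulation in the style of the circuit-level analyses discussed below can detect:

```python
X = 'X'   # the unknown value of ternary-valued simulation

def tnot(v):
    return X if v == X else 1 - v

def mux(c, t, e):
    if c == 1:
        return t
    if c == 0:
        return e
    return t if t == e and t != X else X   # pessimistic for unknown select

def combinational(defs, inputs):
    """Iterate the definitions from all-X to a least fixed point; the
    system behaves combinationally on this input iff no defined variable
    remains X."""
    vals = {v: X for v in defs}
    vals.update(inputs)
    changed = True
    while changed:
        changed = False
        for v, f in defs.items():
            nv = f(vals)
            if nv != vals[v]:
                vals[v], changed = nv, True
    return all(vals[v] != X for v in defs)

# x and y define each other cyclically, but the select signal c schedules
# the cycle so that the behavior is combinational for every input.
cyclic = {'x': lambda s: mux(s['c'], tnot(s['y']), s['a']),
          'y': lambda s: mux(s['c'], s['b'], tnot(s['x']))}
assert all(combinational(cyclic, {'a': a, 'b': b, 'c': c})
           for a in (0, 1) for b in (0, 1) for c in (0, 1))

# By contrast, x = not y, y = not x has two stable solutions and hence no
# unique combinational behavior: both variables stay X.
assert not combinational({'x': lambda s: tnot(s['y']),
                          'y': lambda s: tnot(s['x'])}, {})
```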
The analysis of cyclic combinational circuits was first formulated by Malik [Mal94],
based on ternary-valued simulation [Bry87, BS95], at the circuit (or gate) level. Subsequent
efforts [HM95, SBT96, Shi96, NK99] were built upon this formulation and bore much the
same foundation. However, the quest for solutions to analyzing combinationality remained
because the analysis at the functional level was left open.
Combinationality analysis for cyclic circuits is an essential step in the compilation of
synchronous languages [Hal93], such as Esterel [Ber00, Ber99], which allow simultaneous
cyclic definitions. Before applying Malik’s approach for static analysis1, a synchronous
program typically needs to be translated into a circuit netlist. Depending on how a program
is written, the same specification can be translated into different netlists. Because the
analysis based on ternary-valued simulation heavily depends on circuit structures (more
precisely, on the arrangement of delay elements over circuit netlists) as observed in [SBT96,
Shi96], a netlist may be declared not combinational even though there exists a functionally
1For dynamic or runtime analysis, translating a program to a netlist may be avoided, e.g., see [EL03].
equivalent one that behaves combinationally. This phenomenon corresponds to the so-called
schizophrenia problem in the compilation of Esterel programs [Ber99].
We propose a functional-level analysis that avoids the complication of translating pro-
grams into circuit netlists and eliminates the discrepancy problem of analyzing equivalent
different netlists. Essentially, our formulation of combinationality is extended to an extreme
at the functional level. That is, there exists a combinational implementation (where cyclic
definitions might need to be broken) if and only if the system in consideration passes our
combinationality test. As will be clear, ternary-valued simulation, when extended to the
functional level, is a limited special case of our formulation.
Although combinational circuits with feedback offer potential savings in area [Kau70], they are hard to manipulate and analyze, e.g., in timing analysis and logic minimization. Breaking cyclic dependencies is sometimes necessary to avoid later complications
binationality. Earlier efforts [HM95, Edw03] on breaking combinational cycles were done
for circuit netlists. In fact, there is no need to wait until circuit structures are derived. In
addition, analyzing combinationality at the functional level broadens the generality.
Our results are based upon the following principle. When cyclic definitions are to be
broken or the synthesis target is software, the combinationality analysis should be general-
ized to the extreme and performed at the highest level possible (i.e., the functional level).
On the other hand, when the target is hardware synthesis and combinational cycles are
allowed to exist, then the analysis should be conservative enough to tolerate undesirable
physical effects but general enough to abstract away unnecessary details at the appropriate
level (i.e., the circuit, or gate level). We emphasize this asymmetry in analyzing combina-
tionality, which was overlooked in prior work.
For combinationality at the circuit level, we comment on a recent development in the
synthesis of cyclic combinational circuits. Targeting area minimization, an attempt was made in [RB03b, RB03a] to synthesize cyclic combinational circuits by extending the formulation of ternary-valued simulation to the functional level. Unfortunately, it was overlooked that functional-level analysis is not sufficient to guarantee well-behaved circuitry. We show the pitfall and suggest two cures. Essentially, additional conditions beyond purely functional ones need to be applied in order to guarantee well-behaved circuitry.
The chapter is organized as follows. After preliminaries and notations are given in
Section 3.2, our formulation of combinationality at the functional level is introduced in
Section 3.3. Section 3.4 discusses some issues about combinationality at the circuit level.
Section 3.5 compares our formalism with other work. Finally, Section 3.6 concludes this
chapter and lists some future research directions.
3.2 Preliminaries
Unless otherwise noted, this chapter assumes that the variables and functions under
consideration are of type Boolean, B. Moreover, we concentrate on output-deterministic
systems, whose output valuation is uniquely determined under any input assignment and
under any current state designated by the systems’ state-holding elements.
A functional-level description of a system M consists of a set DM of atomic definitions. Each atomic definition is of the form a_j := φ_j, where a_j and φ_j are a Boolean variable and formula, respectively. In particular, for a circuit-level description of a system, φ_j could be a formula of an identity function representing a wire (with delay), or a formula of an elementary Boolean function representing a primitive logic gate (with delay) in the gate library for technology mapping. We distinguish between functional- and circuit-level descriptions. At first glance, this may seem obscure, and simply a matter of granularity. However, we distinguish these two levels by saying that the valuations of atomic definitions
take no time at the functional level, but take time at the circuit level. (We assume that every definition a_j := φ_j in DM is deterministic; that is, variable a_j valuates to a definite value under any assignment to the variables in formula φ_j. Thus, φ_j is a total Boolean function.)
Associated to DM is a definition graph G^d_M = (V^d, E^d) characterizing the information flow among the atomic definitions. A vertex v_j ∈ V^d represents the left-hand variable a_j of an atomic definition a_j := φ_j in DM. A directed edge (v_j, v_k) ∈ E^d indicates that variable a_j appears in the right-hand formula φ_k of atomic definition a_k := φ_k. The set DM of atomic definitions is cyclic if G^d_M is a cyclic graph.
Given a system M (without state-holding elements, i.e., registers or latches), three
sets of variables are distinguished: the set VI of primary-input variables, VO of primary-
output variables, and VX of all the other (internal) variables. Notice that the primary-input
variables are the definition-free variables, and vice versa. A subset VC ⊆ VX ∪VO is selected
as a cutset such that the information flow among DM becomes acyclic if VC were exposed
as primary-input variables in addition to the original ones. That is, the corresponding
vertices of the cutset variables form a feedback vertex set in GdM. It turns out that any
such VC out of VX ∪VO provides enough information in analyzing the combinationality (its
precise definition will be given later) of M at the functional level. Selecting a minimal2
cutset helps simplify the analysis. (Previous studies, e.g. [ENSS98], on computing minimum
feedback vertex sets [Kar72] can be applied.) Furthermore, as will be proved, the analysis of
combinationality is independent of the choice of VC as long as VC is minimal. With a chosen
cutset VC , the behavior of M can be captured by two sets of definitions: the definitions of
cj ∈ VC , i.e., cj := ξj , and the definitions of ok ∈ VO, i.e., ok := ωk. Here, VI and VC are
the only variable occurrences in formulae ξj ’s and ωk’s. These formulae are obtained by a
sequence of recursive substitutions of the definitions in DM until the formulae for variables
in VC ∪ VO have VI ∪ VC as the only variable occurrences. Thus, the original intermediate
2A cutset VC is minimal if removing any element from VC makes the resultant information flow cyclic.
definitions of M are collapsed away. We call ξj the excitation function of cj ∈ VC , and
ωk the observation function of ok ∈ VO. (Cutset variables here are analogous to state
variables of a state transition system, while excitation functions are analogous to transition
functions.)
Example 1 Let DM = {a := ¬xa ∨ c, b := ¬x(a ∨ ¬b) ∨ c, c := xb, y := ¬x(¬a ∨ ab) ∨ x(a¬c ∨ ¬ac)}, VI = {x}, and VO = {y}. Suppose we choose VC to be {a, b}. Then, rewriting DM with respect to VC, we have

a := ¬xa ∨ xb
b := ¬x(a ∨ ¬b) ∨ xb
y := ¬x(¬a ∨ ab) ∨ x(a¬b ∨ ¬ab)

The above right-hand formulae for a and b are the excitation functions; the formula for y is the observation function.
Combinationality analysis for cyclic definitions of systems with state-holding elements
can be approximated as follows. Expose the outputs of state-holding elements as primary
inputs; expose the inputs of state-holding elements as primary outputs. If the unreach-
able state set of the system is available, it can be used as a don’t care condition in the
combinationality analysis. Also, if the state equivalence relation is known, it can be used
as a nondeterministic flexibility in the valuation of state-holding elements. Therefore, we
mainly focus on systems without state-holding elements. The exact analysis for systems
with state-holding elements is postponed to Section 3.3.6. Unless otherwise noted, we shall
assume the systems under consideration are without state-holding elements.
If a system consists of acyclic definitions, then it is combinational. However, the converse is not true: a combinational system may have breakable cyclic definitions. Hence, only systems with cyclic definitions are of interest here. Let M be such a system with cutset
VC. Given an input assignment for M, the valuation of the cutset variables evolves with time. The evolution can be captured by state evolution graphs (SEGs), analogous to state transition graphs for state transition systems. However, unlike a state transition graph, an SEG G^e_{M,VC,ı} = (V^e_ı, E^e_ı) exists with respect to a particular fixed input assignment ı ∈ [[VI]]. An SEG is used to study how a system with cyclic definitions evolves when the primary inputs are held constant. There is no concept of an initial state. A vertex v_s ∈ V^e_ı corresponds to an intermediate valuation (or, a state) s ∈ [[VC]] of the cutset variables (in the sequel, we shall not distinguish between a vertex and the state it represents); each directed edge (v_{s1}, v_{s2}) ∈ E^e_ı corresponds to an evolution of intermediate valuations from s1 to s2. That is, s2 = ~ξ(ı, s1), where ~ξ : [[VI]] × [[VC]] → [[VC]] is the vector of excitation functions of VC. Therefore, the evolution is deterministic at the functional level. (By contrast, the evolution may be nondeterministic due to races, hazards, glitches, etc., at the circuit level.) On the other hand, the vector ~ω : [[VI]] × [[VC]] → [[VO]] of observation functions imposes a labelling over [[VC]] with respect to some ı ∈ [[VI]].
Below we define and explore some basics about SEGs.
Definition 1 A walk W of length k, denoted as len(W) = k, on an SEG G^e_{M,VC,ı} = (V^e_ı, E^e_ı) is a sequence v_{s_0}, v_{s_1}, . . . , v_{s_k} of vertices with (v_{s_{j−1}}, v_{s_j}) ∈ E^e_ı. A path is a walk without repeated vertices. A loop of length k is a walk of length k without repeated edges and with v_{s_0} = v_{s_k}.
In this chapter, we use the term “loops” for SEGs and reserve “cycles” for definition graphs.
Proposition 1 Any vertex of an SEG is in a loop and/or on a path leading to a loop.
Proof. Since the evolution is total, every state of an SEG has at least one outgoing edge. Following outgoing edges from any vertex in the finite graph must therefore eventually revisit some vertex; hence any vertex is in a loop and/or on a path leading to a loop.
Figure 3.1. (i) SEG for x = 0. (ii) SEG for x = 1.
Proposition 2 Any two loops of an SEG with deterministic evolution are disjoint.
Proof. Since every vertex of an SEG with deterministic evolution has exactly one outgoing
edge, any two loops of the SEG must be disjoint.
In the sequel, we shall assume SEGs are deterministic unless otherwise stated.
Definition 2 A loop L is stable if len(L) = 1; L is unstable if len(L) > 1.
It will be clear later why a loop’s stability is determined by its length.
Definition 3 An equilibrium loop L of an SEG G^e_{M,VC,ı} is a loop whose vertices all have the same observation label, i.e., ∀v_{s_j}, v_{s_k} ∈ L. ~ω(ı, s_j) = ~ω(ı, s_k).
Let S^ℓ_{M,VC,ı} denote the set {s ∈ [[VC]] | v_s is a vertex in a loop of G^e_{M,VC,ı}}. For S ⊆ [[VC]], let Lo(S) denote the set {~ω(ı, s) ∈ [[VO]] | s ∈ S} of observation labels.
Example 2 Continue the set DM of definitions and cutset VC of Example 1. Figure 3.1
shows the two state evolution graphs. Vertices are indexed with states, i.e., valuations of
(a, b). States are distinguished by solid and dotted circles to reflect different observation
labels induced by the observation function. The SEG of x = 0 has two loops, one stable and
the other unstable; the SEG of x = 1 has two stable loops. All of the loops are equilibrium
loops.
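The two SEGs of this example can be rebuilt explicitly. A minimal sketch follows, with `step` encoding the excitation functions and loop states recovered by iterating the forward image, which suffices because every vertex has out-degree one.

```python
from itertools import product

def step(x, s):
    """Excitation functions of Example 1 applied to state s = (a, b)."""
    a, b = s
    return (((not x) and a) or (x and b),
            ((not x) and (a or not b)) or (x and b))

def seg(x):
    """Successor map of the SEG under the fixed input assignment x."""
    return {s: step(x, s) for s in product([False, True], repeat=2)}

def loop_states(edges):
    """States lying on loops: |V| - 1 forward images of the full state set."""
    cur = set(edges)
    for _ in range(len(edges) - 1):
        cur = {edges[s] for s in cur}
    return cur

# x = 0: an unstable 2-loop {00, 01} plus the stable loop {11}.
assert loop_states(seg(False)) == {(False, False), (False, True), (True, True)}
# x = 1: two stable loops {00} and {11}.
assert loop_states(seg(True)) == {(False, False), (True, True)}
```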
3.3 Combinationality at the Functional Level
Given the functional-level description of a system M with cyclic definitions, we study the conditions under which M is combinational.
3.3.1 Formulation of Combinationality
In the functional-level formulation of combinationality, physical timing effects are ab-
stracted away by assuming that all valuations of functions are instantaneous. However, the
order of valuations matters.
A system M is said to be combinational at the functional level (or function-
ally combinational) if, under any input assignment, M would eventually (i.e., within a
bounded number of steps) evolve into a status in which the observation labelling settles to
a definite value3 independent of the initial internal state in [[VC ]]. Here, the dynamics of
M’s evolution is with respect to a cutset VC .
Theorem 3 A system M with cutset VC is combinational at the functional level if and only if, for any input assignment ı ∈ [[VI]], all states s ∈ S^ℓ_{M,VC,ı} have the same observation label ~ω(ı, s), i.e., |Lo(S^ℓ_{M,VC,ı})| = 1.
Proof. (−→) Suppose not. Then, for some input assignment, M may produce different outputs depending on the initial state in [[VC]]. In these cases, M is not combinational.

3 For simplicity, here we focus on the case where there is a unique output valuation in [[VO]] for any input assignment. Our results can be straightforwardly generalized to a set of possible output valuations.
(←−) Since every vertex of G^e_{M,VC,ı} = (V^e_ı, E^e_ı) is either in a loop or on a path leading to a loop, any possible initial state s_j evolves into a state s_k ∈ S^ℓ_{M,VC,ı} after at most |V^e_ı| − 1 steps of evolution. Since all s ∈ S^ℓ_{M,VC,ı} have the same observation label, M eventually produces a unique output under input ı. Because this is true for any input assignment, the proof follows.
Example 3 Continue Example 2. The system described by DM is functionally combinational because, in each of its two SEGs, all states in loops have the same observation label. Under input assignment x = 0, output y valuates to 1; under x = 1, y valuates to 0.
3.3.2 Computation Algorithms
Combinationality test.
From Theorem 3, we conduct a combinationality test on M using a symbolic computation (e.g., BDD-based computation) as follows. First, derive the set S^ℓ_{M,VC,ı} for all ı ∈ [[VI]] by a greatest fixed-point computation. Let Σ : [[VI]] × [[VC]] → B be the characteristic function of S^ℓ_{M,VC,ı} for all ı ∈ [[VI]]. The fixed-point computation proceeds as follows. In the initial step, let Σ(0) be the characteristic function of [[VC]] for any ı ∈ [[VI]]. In the iterative steps, to derive Σ(k+1) from Σ(k), states without predecessors are successively removed from Σ(k) by a forward image computation. That is, Σ(k+1) is computed by ∃c ∈ VC. Σ(k) ∧ Ξ and a subsequent replacement of the variables in V_C′ with their counterparts in VC, where Ξ = ⋀_j (c′_j ≡ ξ_j) is the characteristic function of the evolution relation of M under cutset VC with the newly introduced “next-state” cutset variables V_C′ = {c′_j | c_j ∈ VC}. The process terminates when Σ(m) equals Σ(m−1) for some m ≥ 1, i.e., no more states can be removed from Σ. Upon termination, Σ is the characteristic function of S^ℓ_{M,VC,ı}. Notice that, with symbolic computation, S^ℓ_{M,VC,ı} can be derived simultaneously for all ı ∈ [[VI]] since the variables in VI are not quantified out in the fixed-point computation. Second, we derive the characteristic
function Λ of Lo(S^ℓ_{M,VC,ı}) by setting Λ = ∃c ∈ VC. Ω ∧ Σ, where Ω : [[VI]] × [[VC]] × [[VO]] → B is the characteristic function Ω = ⋀_j (o_j ≡ ω_j) of the output relation of M under cutset VC. Again, this can be computed simultaneously for all ı ∈ [[VI]] since primary-input variables are not quantified out in the computation. Finally, we check if there exists an ı such that |Lo(S^ℓ_{M,VC,ı})| > 1. If the answer is positive, then M is not combinational. Otherwise, it is. The computation can be done with a SAT-solving formulation, or with a BDD-based formulation. For the latter, the computation can be performed effectively using the compatible projection operator [LN91b], cprojection. That is, M is combinational if and only if Λ equals cprojection(Λ, o), where o is an arbitrary minterm in [[VO]].
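The two phases of the test can be mimicked with explicit sets standing in for BDD characteristic functions. In this sketch, Σ lives on the joint space [[VI]] × [[VC]] so the input variables are never quantified out, and the cprojection check is replaced by a direct cardinality test per input.

```python
from itertools import product

def combinational(step, obs, n_in, n_cut):
    """Greatest fixed point: shrink Sigma by forward images until only
    loop states remain, then check one observation label per input."""
    inputs = list(product([False, True], repeat=n_in))
    states = list(product([False, True], repeat=n_cut))
    sigma = {(i, s) for i in inputs for s in states}
    while True:
        # a state survives iff it has a predecessor in sigma
        image = {(i, step(i, s)) for (i, s) in sigma}
        if image == sigma:
            break
        sigma = image
    labels = {i: set() for i in inputs}     # Lambda = exists c. Omega AND Sigma
    for (i, s) in sigma:
        labels[i].add(obs(i, s))
    return all(len(v) == 1 for v in labels.values())

# Example 1 with cutset {a, b}: functionally combinational.
step = lambda i, s: (((not i[0]) and s[0]) or (i[0] and s[1]),
                     ((not i[0]) and (s[0] or not s[1])) or (i[0] and s[1]))
obs = lambda i, s: (((not i[0]) and (not s[0] or (s[0] and s[1])))
                    or (i[0] and ((s[0] and not s[1]) or (not s[0] and s[1]))),)
assert combinational(step, obs, n_in=1, n_cut=2)
```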
The computational complexity of the combinationality test is the same as that of state traversal on the space spanned by the cutset variables. That is, the problem is PSPACE-complete in the size of the selected cutset.

Theorem 4 The problem of analyzing combinationality at the functional level is PSPACE-complete with respect to the selected cutset size.
Proof. The combinationality analysis can be done in nondeterministic polynomial space. To determine if a state s is in a loop under some input assignment, one can record two consecutive states of the state evolution trace starting from s. As this “window” slides along the trace, the recurrence of s can be checked in at most |[[VC]]| steps. In addition, one can test whether different output observation labels ever appear in the sliding windows. Hence the combinationality analysis can be achieved within space bounded by a polynomial in the cutset size.
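For deterministic evolution, the loop-membership check from this argument can be illustrated with constant extra storage, since the walk itself serves as the witness. A small sketch:

```python
def in_loop(step, s0, n_states):
    """s0 lies on a loop iff it recurs within n_states steps of the
    deterministic evolution; only the current state is stored."""
    s = step(s0)
    for _ in range(n_states):
        if s == s0:
            return True
        s = step(s)
    return False

# Example 1 under x = 0: {00, 01} form an unstable loop, 10 leads into {11}.
step0 = lambda s: (s[0], s[0] or not s[1])
assert in_loop(step0, (False, False), 4)      # on the 2-loop
assert not in_loop(step0, (True, False), 4)   # transient state
```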
On the other hand, we need to reduce a PSPACE-complete problem to the problem of combinationality analysis. The following problem can be used:

Given a total function f : {1, . . . , n} → {1, . . . , n}, is there a k such that f^k(1) = n?
It was shown [Jon75] to be deterministic4 LOGSPACE-complete in n and, thus, PSPACE-
complete in log n [Pap94]. We establish that the answer to the PSPACE-complete problem is
positive if and only if the answer to the corresponding problem of combinationality analysis
(to be constructed) is negative. Since the complexity class of nondeterministic space is
closed under complementation [Imm88], the theorem follows.
To complete the proof, given f : {1, . . . , n} → {1, . . . , n}, an excitation function ξ : {1, . . . , n} → {1, . . . , n} and an observation function ω : {1, . . . , n} → B are constructed as follows. Let ξ have the same mapping as f but with ξ(n) = 1. Also, let ω(j) = false for 1 ≤ j ≤ n − 1, and ω(n) = true. With the above construction, n is reachable from 1 under f if and only if the system defined by ξ and ω is not combinational. (Note that, since an n-valued variable can be encoded with O(log n) binary variables, multiple-valued representations fit our framework.)
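The reduction is easy to sanity-check by machine. The sketch below builds ξ and ω from a random total function f, runs the explicit-state combinationality test on the resulting input-free system, and confirms that the two answers coincide:

```python
import random

def reachable(f, n):
    """Core question: is n reachable from 1 by iterating f?"""
    seen, v = set(), 1
    while v not in seen:
        if v == n:
            return True
        seen.add(v)
        v = f[v]
    return False

def reduction_fails_test(f, n):
    """Build xi, omega as in the proof and run the combinationality test
    on the input-free system: collect loop states by n - 1 forward
    images, then compare their observation labels."""
    xi = dict(f)
    xi[n] = 1                                  # redirect n back to 1
    omega = {j: (j == n) for j in range(1, n + 1)}
    loops = set(range(1, n + 1))
    for _ in range(n - 1):
        loops = {xi[s] for s in loops}
    return len({omega[s] for s in loops}) > 1  # True iff NOT combinational

# Randomized check that the two answers coincide, as the proof claims:
random.seed(0)
for _ in range(500):
    n = random.randint(2, 8)
    f = {j: random.randint(1, n) for j in range(1, n + 1)}
    assert reduction_fails_test(f, n) == reachable(f, n)
```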
Cycle breaking.
Suppose M is combinational. From the above combinationality test, we can derive a set of equivalent acyclic definitions for M. In fact, there are at least two ways of doing so: one is to rewrite the definitions of primary-output variables as functions (as determined from the combinationality test) of primary-input variables; the other is to rewrite the definitions of cutset variables as functions of primary-input variables. An advantage of the latter is that the original definitions of M can be reused, except for the definitions c_j := φ_j, for c_j ∈ VC, and any definitions left unused as a result. The new definitions are derived as follows.
For the rewriting of the primary-output variables, the new definitions can be inferred from the input-output relation ∃c ∈ VC. Ω ∧ Σ, which has been computed in the combinationality test. For the rewriting of the cutset variables, for every ı ∈ [[VI]], some state s_ı ∈ S^ℓ_{M,VC,ı} is selected as the representative of S^ℓ_{M,VC,ı}. Then the new definitions for the cutset variables can be inferred from the relation ⋁_ı ((⋀_j (i_j ≡ ı[j])) ∧ (⋀_k (c_k ≡ s_ı[k]))), where i_j ∈ VI, c_k ∈ VC, and ı[j] (resp. s_ı[k]), which denotes the jth (resp. kth) bit of ı (resp. s_ı), is a Boolean constant of value either true or false.

4 It is a well-known fact, proved by Savitch in [Sav70], that deterministic and nondeterministic space complexities coincide.
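The second rewriting can be sketched explicitly for Example 1: pick any representative loop state per input assignment and tabulate the cutset variables as functions of the inputs alone. This is an explicit-state stand-in for reading the new definitions off the relation above.

```python
from itertools import product

def step(x, s):
    a, b = s
    return (((not x) and a) or (x and b),
            ((not x) and (a or not b)) or (x and b))

def obs(x, s):
    a, b = s
    return ((not x) and (not a or (a and b))) or \
           (x and ((a and not b) or (not a and b)))

rep = {}
for x in (False, True):
    loops = set(product([False, True], repeat=2))
    for _ in range(len(loops) - 1):              # keep only loop states
        loops = {step(x, s) for s in loops}
    rep[x] = min(loops)                          # any representative works

# New acyclic definitions: a := rep[x][0], b := rep[x][1]. Because the
# system is combinational, the output is unchanged for every input:
for x in (False, True):
    assert obs(x, rep[x]) == (x is False)        # y = 1 under x = 0, else 0
```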
3.3.3 Generality Analysis
Theorem 5 Let VC1 and VC2 be two choices of minimal cutsets for a system M with cyclic definitions. Then, under any input assignment ı ∈ [[VI]], there exists a bijection between the loops of G^e_{M,VC1,ı} and those of G^e_{M,VC2,ı}.
Proof. First observe that, since both VC1 and VC2 are cutsets, under a specific input assignment ı ∈ [[VI]], the variables in VC1 can be expressed as functions of the variables in VC2, and vice versa. Thus, there exist mappings f21 : [[VC1]] → [[VC2]] and f12 : [[VC2]] → [[VC1]] such that, for a valuation s of VC1 (resp. t of VC2), f21(s) (resp. f12(t)) is the corresponding valuation of the VC2 (resp. VC1) variables. In addition, since VC1 and VC2 are minimal, we have f12(f21(s)) = ~ξ1(ı, s) and f21(f12(t)) = ~ξ2(ı, t), where ~ξ1 and ~ξ2 are the vectors of excitation functions of M with cutsets VC1 and VC2, respectively.

To see the relation between the loops of G^e_{M,VC1,ı} and those of G^e_{M,VC2,ı}, consider a state evolution sequence σ1 = s1, . . . , sj, . . . , sk over [[VC1]] such that sk is the first recurrent state in σ1, with sk = sj. Clearly, σ2 = f21(s1), . . . , f21(sj), . . . , f21(sk) is a state evolution sequence over [[VC2]] because f21(f12(f21(s))) = ~ξ2(ı, f21(s)). Now, we need to show that f21(sk) is the only recurrent state in σ2, with f21(sk) = f21(sj). For contradiction, suppose there exists another recurrent state in σ2 such that f21(sm) = f21(sl), l < m < k. However, this implies sm+1 = sl+1 in σ1 because f12(f21(sm)) = f12(f21(sl)). This contradicts the assumption that sk is the first recurrent state in σ1 unless m = k − 1 and l = j − 1. Similarly, one can show
that, given a state evolution sequence of [[VC2]] with a loop, there exists a corresponding sequence of [[VC1]] with a loop. Also, by Propositions 1 and 2, there exists a bijection between the loops of G^e_{M,VC1,ı} and those of G^e_{M,VC2,ı}.
Corollary 6 A system’s combinationality at the functional level is independent of the
choice of minimal cutset in the analysis.
Proof. Let VC1 and VC2 be two choices of minimal cutsets, and ~ω1 : [[VI]] × [[VC1]] → [[VO]] and ~ω2 : [[VI]] × [[VC2]] → [[VO]] be the resultant vectors of observation functions. Let f21 and f12 be the mappings defined in the proof of Theorem 5. Then, ~ω1(ı, s1) = ~ω2(ı, f21(s1)) and, similarly, ~ω1(ı, f12(s2)) = ~ω2(ı, s2), for any ı ∈ [[VI]], s1 ∈ [[VC1]], and s2 ∈ [[VC2]]. In addition to the result of Theorem 5, we need to show that all corresponding loops of G^e_{M,VC1,ı} and G^e_{M,VC2,ı} have the same output observation for all ı ∈ [[VI]].

Suppose M is combinational under an analysis with cutset VC1. Then, all the states in S^ℓ_{M,VC1,ı} must have the same output observation label, say o1 ∈ [[VO]]. For the sake of contradiction, assume there exists a state s2 ∈ S^ℓ_{M,VC2,ı} with ~ω2(ı, s2) ≠ o1. This implies ~ω1(ı, f12(s2)) ≠ o1. Since f12(s2) is in S^ℓ_{M,VC1,ı}, this contradicts the assumption that all the states in S^ℓ_{M,VC1,ı} have observation label o1. Hence, all the states in S^ℓ_{M,VC2,ı} must have the same observation label o1 as well. The corollary follows.
Notice that the result holds even when the cutset changes dynamically.
Assuming a system M operates without a special pre-initialization, our combinationality analysis at the functional level is the most general formulation one can hope for, in the sense that

Theorem 7 There exists a feasible combinational implementation of M if and only if M satisfies our combinationality test.
Proof. (−→) Suppose that M fails our combinationality test. It implies that there exists
some input assignment such that the corresponding output valuation cannot settle to a
unique value. This violates the definition of combinationality.
(←−) Trivial.
3.3.4 Conditions of Legitimacy
The legitimacy of our combinationality formulation is confirmed if a system’s cyclic
definitions are to be broken in the final realization. However, if some cyclic definitions are
to be maintained in the final realization, then the certification of combinationality at the
functional level of abstraction is not sufficient to guarantee correctness. Essentially, two
restrictions need to be imposed to ensure the correctness. First, all excitation functions
should be valuated synchronously such that state evolutions follow the combinationality
analysis. Second, the time interval between two consecutive input assignments should be
much larger than the time spent on internal valuations such that the state of the system has
enough time to evolve to an equilibrium loop. Certainly, the first restriction is hard to satisfy in hardware realizations of cyclic definitions, due to undesirable effects such as races, hazards, and glitches. In contrast, software realization is more suitable, since the above effects can be eliminated. A possible application domain is software synthesis for reactive systems, where the common assumption is that internal computations are much faster than environmental responses. Hence, the second restriction is satisfied under this assumption.
3.3.5 Stable Cyclic Dependencies
Although maintaining cyclic definitions is legitimate for software synthesis, it may be
undesirable if SEGs contain unstable loops. Unstable loops result in persistent updates of
state information (even though observation functions have settled to definite values), and
hence consume power.5 To avoid this persistent power consumption, we require that all the loops in SEGs be stable. To make a system with unstable loops stable, the definitions of the system M under consideration should be rewritten. Such rewrites can be done in various ways. For instance, an unstable loop L can be broken by redirecting the evolution of a state in L to itself or to another state not leading to an unstable loop. Note that replacing cyclic definitions with acyclic equivalents is just a special case of such rewrites.
On the other hand, a rewrite is not necessary if all loops are stable. One can devise an algorithm to test if a system M with cutset VC is stably combinational at the functional level. Essentially, M is stably combinational if and only if, for any input assignment, any state s ∈ [[VC]] can reach (i.e., evolve to) a self-looped state. Let Σ : [[VI]] × [[VC]] → B be the characteristic function denoting the set of states that can reach self-looped states with respect to some input assignment. Then the algorithm can be outlined as follows. First, compute the set of self-looped states of G^e_{M,VC,ı} for all ı ∈ [[VI]] using the characteristic function ⋀_j (c_j ≡ ξ_j), where c_j ∈ VC is a cutset variable and ξ_j : [[VI]] × [[VC]] → B is an excitation function in ~ξ. Second, let Σ(0) = ⋀_j (c_j ≡ ξ_j) initially, and perform the standard backward reachability analysis (except that the variables in VI are not quantified out). That is, in the iterative steps, we derive Σ(k+1) from Σ(k). Let Σ′(k) be Σ(k) with the variables in VC replaced by their counterparts in V_C′, the “next-state” cutset variables. Then, Σ(k+1) = Σ(k) ∨ ∃c′ ∈ V_C′. Σ′(k) ∧ Ξ, where Ξ = ⋀_j (c′_j ≡ ξ_j) is the characteristic function of the evolution relation of M under cutset VC. The iteration terminates when Σ(m) equals Σ(m−1) for some m ≥ 1, i.e., no more states can be added to Σ. Using a symbolic approach, the computation is done for all input assignments simultaneously since the variables in VI are not quantified out during the fixed-point computation. The system is stably combinational if and only if the final Σ is a tautology.
5 Although it might be possible to implement some way of detecting when the output has settled, so as to stop this evaluation, it would seem to be expensive.
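The backward fixed point above can be sketched with explicit sets; inputs are enumerated here rather than kept symbolic, and `step` is an assumed name for the deterministic evolution.

```python
from itertools import product

def stably_combinational(step, n_in, n_cut):
    """Backward reachability from self-looped states, per input assignment:
    stably combinational iff every state can reach a self-looped state."""
    for i in product([False, True], repeat=n_in):
        states = set(product([False, True], repeat=n_cut))
        sigma = {s for s in states if step(i, s) == s}      # Sigma(0)
        while True:
            grown = sigma | {s for s in states if step(i, s) in sigma}
            if grown == sigma:
                break
            sigma = grown
        if sigma != states:          # final Sigma must be a tautology
            return False
    return True

# Example 1 is combinational but NOT stably combinational: under x = 0
# the states 00 and 01 ride an unstable 2-loop forever.
step = lambda i, s: (((not i[0]) and s[0]) or (i[0] and s[1]),
                     ((not i[0]) and (s[0] or not s[1])) or (i[0] and s[1]))
assert not stably_combinational(step, 1, 2)
assert stably_combinational(lambda i, s: s, 1, 2)   # all self-loops: stable
```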
In addition to the stability requirement, one may want to bound the maximum length of
evolution paths to equilibrium loops. The number of iterations spent in a combinationality
test corresponds to the length of the longest evolution path(s). If the length is greater than
the upper bound, say n, state evolutions need to be redirected to shorten long paths. One
approach would be to memorize the newly removed state sets for every n− 1 iterations in
the combinationality test. After the test, redirect the evolutions of the memorized states to
proper equilibrium loops.
3.3.6 Input-Output Determinism of State Transition Systems
We extend combinationality analysis to the set DM of cyclic definitions of a system M with state-holding elements. Note that DM contains only simultaneous definitions and thus excludes the delayed definitions of the state-holding elements. Let VI and VO be the sets of primary-input and primary-output variables, respectively. Also, let S (resp. S′) be the set of output (resp. input) variables of the state-holding elements, and VC be a cutset of DM. We distinguish two types of states: external states [[S]] are those designated by the state-holding elements; internal states [[VC]] are those emerging from the cyclic definitions. Also, the terms “transition” and “evolution” are used to differentiate the dynamics of external and internal states, respectively.
Our objective here is to analyze whether the cyclic definitions of M can be replaced
with acyclic ones such that the sequential behavior of M remains unchanged. Essentially,
such a substitution is possible if and only if M has deterministic input-output behavior6.
As mentioned in Section 3.2, the inputs and outputs of the state-holding elements can
be treated as primary outputs and primary inputs, respectively, of the set of the cyclic
6 Nevertheless, state transitions may be nondeterministic due to the cyclic definitions. Since internal states in loops of an SEG may have different observation labels induced by S′, these observation labels constitute the possible next external states. Hence, state transitions are nondeterministic in general.
definitions. However, a direct combinationality test on the cyclic definitions only yields an approximate analysis because it requires the valuations of S′ to be deterministic.
To achieve an exact analysis, with the above input and output transformation, reachability analysis (for external states) and combinationality analysis (for internal states) should be performed alternately. Two conditions need to be satisfied. First, under any input assignment ı ∈ [[VI]] and any reachable state s ∈ [[S]], the set S^ℓ_{M,VC,(ı,s)} of all internal states in loops of the corresponding SEG G^e_{M,VC,(ı,s)} must have the same observation label induced by VO, i.e., |Lo(S^ℓ_{M,VC,(ı,s)})| = 1. Second, under any input assignment ı and any reachable state s, the corresponding next (external) states must be sequentially equivalent. This set of next states is derived from the set of observation labels induced by S′ over S^ℓ_{M,VC,(ı,s)}.
A detailed computation is outlined as follows. Let R(j) be the reached state set for the state-holding elements at the jth iteration, and let R(0) ⊆ [[S]] be the initial state set. In the jth iteration, we perform the combinationality analysis detailed in Section 3.3.2 to certify that DM is combinational with respect to VO for any ı ∈ [[VI]] and s ∈ R(j). (If the certification fails, M is not deterministic in its input-output behavior and the procedure aborts.) The combinationality analysis also gives us the set S^ℓ_{M,VC,(ı,s)} for all ı ∈ [[VI]] and s ∈ R(j). From it, we obtain the set of next states under ı and s by computing the set of observation labels induced by S′ over S^ℓ_{M,VC,(ı,s)}. Denote this set of next states by N(j)_{ı,s}. Then, R(j+1) = R(j) ∪ {N(j)_{ı,s} | ı ∈ [[VI]], s ∈ R(j)}. The iterations terminate when R(k+1) = R(k) for some k. At this point, we need one more step to conclude whether DM can be rewritten with acyclic definitions. The answer is affirmative if and only if |Lo(N(j)_{ı,s})| = 1, for j = 0, . . . , k − 1, and for any ı ∈ [[VI]], s ∈ R(j). The rewriting procedure is similar to what was described in Section 3.3.2.
The corresponding computational complexity is PSPACE-complete in the number of
state-holding elements and the cutset size. The PSPACE-completeness is immediate from
the fact that the input-output determinism problem of state transition systems is in PSPACE and is at least as hard as the PSPACE-complete problem of combinationality analysis shown in Theorem 4.
3.4 Combinationality at the Circuit Level
Combinationality analysis at the functional level abstracts away timing information.
Certainly, it does not guarantee the feasibility of maintaining cyclic definitions in final
circuit implementations. On the other hand, Malik’s formulation based on ternary-valued
simulation turns out to be the right formulation at the circuit level. In his formulation,
effectively, all gates and wires are sources of uncertain delay7. Under the up-bounded inertial
delay model [BS95], ternary-valued simulation can be treated as an operational definition
of combinationality for cyclic circuits [SBT96, Shi96].
3.4.1 Synthesis of Cyclic Circuits
A recent attempt [RB03b, RB03a] at synthesizing cyclic circuits brings the formulation of ternary-valued simulation up to the functional level. Combinationality was checked with recursive marginal operations [RB03a]. However, it was overlooked that functional-level analysis by itself is not sufficient to guarantee the correctness of the final circuit implementation. Consider the following cyclic definitions over primary-input variables a and b:
f := ¬ah ∨ ¬b¬h
g := ¬a¬bf
h := ab ∨ ¬g
7 This timing assumption is very conservative in the sense that no asynchronous circuits can ever exist under this assumption if initialization is not allowed.
Figure 3.2. (i) The original circuit. (ii) The induced circuit under input assignment a = 0 and b = 0.
The reader can verify that the above definitions are functionally combinational under any
input assignment. (Indeed, under the analysis with recursive marginal operations, the
cyclic definitions are combinational.) However, functional analysis does not guarantee a
well-behaved circuit implementation. Consider the circuit netlist in Figure 3.2 (i) as an
implementation of the above cyclic definitions. The circuit may not be well-behaved. To
see this, consider input assignment a = 0 and b = 0. Assume the resultant induced circuit
is abstracted to that in Figure 3.2 (ii), where all the gates have one-unit delay and all the
wires have zero delay. Now, suppose the previous input assignment is a = 1 and b = 1 before
assignment a = 0 and b = 0. That is, internal signals x and y in Figure 3.2 (ii) are of value
0 initially. An examination shows that the circuit oscillates despite its combinationality
at the functional level. Essentially, the failure originates from the fact that some gates
and wires are not fully characterized in the analysis. Hence, functional-level analysis is not
sufficient to conclude the correctness of the gate-level implementation.
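The functional-level claim above can be checked by brute force: for each input assignment, enumerate all valuations of the cyclic signals and count those consistent with all three definitions simultaneously. The following Python sketch uses our own encoding of the definitions; the function name is hypothetical.

```python
from itertools import product

def consistent_valuations(a, b):
    """All (f, g, h) valuations satisfying the three cyclic definitions
    f := ~a h | ~b ~h,  g := ~a ~b f,  h := a b | ~g
    simultaneously under the input assignment (a, b)."""
    sols = []
    for f, g, h in product([False, True], repeat=3):
        if (f == ((not a and h) or (not b and not h)) and
                g == (not a and not b and f) and
                h == ((a and b) or not g)):
            sols.append((f, g, h))
    return sols

# Every input assignment yields exactly one consistent valuation, so the
# definitions are combinational at the functional level.
for a, b in product([False, True], repeat=2):
    assert len(consistent_valuations(a, b)) == 1
```

Note that under a = 0, b = 0 the unique solution relies on the valuation f = h ∨ ¬h = true, which is exactly the Boolean axiom that a delay-afflicted gate-level implementation need not honor transiently.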
Two approaches can be applied to rectify the deficiency in the analysis proposed by
[RB03b, RB03a]. One is to remove axioms x ∨ ¬x = true and x ∧ ¬x = false from the
recursive marginal operations when x is not a primary-input variable. The other is to add
more terms to functional expressions such that, for every input assignment, cyclic definitions
are broken for some functions valuating to either true or false purely depending on the
input assignment. For instance, in the previous example, product term ¬a¬b needs to be
added to the definition of f , i.e., f := ¬ah ∨ ¬b¬h ∨ ¬a¬b. For the second rectification, one
should be careful in any subsequent circuit optimization; the added terms should not be
removed without special care. Note that the necessity of adding some rectification terms
may nullify area gains claimed in [RB03a] due to allowing cyclic combinational circuits.
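The effect of the rectification term can be verified directly: under a = 0, b = 0 the added product term forces f to true from the inputs alone, independent of the feedback signal h. A small sketch (the function names are ours, for illustration):

```python
def f_original(a, b, h):
    # f := ~a h | ~b ~h
    return (not a and h) or (not b and not h)

def f_rectified(a, b, h):
    # f := ~a h | ~b ~h | ~a ~b  (rectification term ~a ~b added)
    return f_original(a, b, h) or (not a and not b)

# Under a = b = 0, f_original is true only via the axiom h | ~h = true,
# which involves the feedback signal h; f_rectified is true via a product
# term mentioning inputs only, so the cyclic definition is broken by the
# input assignment alone.
assert all(f_rectified(False, False, h) for h in (False, True))
# The rectification does not change the function elsewhere.
assert all(f_rectified(a, b, h) == f_original(a, b, h)
           for a in (True,) for b in (False, True) for h in (False, True))
```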
3.5 Related Work
3.5.1 SEG vs. GMW
Our SEG formalism is closely related to the general multiple winner (GMW) analysis
[BS95], which is commonly used in the analysis of asynchronous circuits. (Under the up-
bounded inertial delay model, GMW analysis is equivalent to ternary-valued simulation.)
To reason about the behavior of asynchronous circuits under physical effects such as glitches,
hazards, races, etc., the GMW analysis builds graphs similar to SEGs with additional non-
deterministic evolutions. Depending on how the current state and next state are coded, an
evolution branches out into several nondeterministic ones. Also, unlike an SEG existing for
a fixed input assignment, the graph built by the GMW analysis is connected for different
input assignments. These additional evolutions make GMW analysis a complicated pro-
cedure. Even worse, the GMW analysis declares a state variable for every delay element
(possibly, a gate or wire). In comparison, only a minimal cutset needs to be chosen in
our combinationality analysis. Therefore, the state space is substantially reduced for SEG
analysis. Under the legitimacy conditions in Section 3.3.4, all of the above complications in
the GMW analysis can be avoided, and the analysis simplifies to our SEG formalism.
3.5.2 Combinationalities
At the functional level, we contrast our formulation of combinationality with prior work
based on ternary-valued simulation. In the case where all valuations in the set of cyclic
definitions must stabilize, the functional-level extension of Malik’s formulation can be sum-
marized as follows. For any input assignment ı ∈ [[VI ]], there exists a set of definitions
valuating to either true or false such that the cyclic definitions are broken. This requirement
corresponds to the condition that, for every input assignment, the corresponding SEG has a single
stable loop, and all states of the SEG evolve directly (in one step) to this loop. (An SEG
with multiple stable loops corresponds to what was considered as having nondeterministic
multiple solutions; an SEG with an unstable loop, i.e., a loop with length greater than one,
corresponds to what was considered as having no consistent solution.) In comparison, our
formulation is much more general because SEGs are allowed to have multiple loops, which
can be stable or unstable, and to have long evolution paths.
Now consider a more relaxed case where signals are allowed to oscillate for some in-
put assignment as long as all output valuations are uniquely determined under this input
assignment regardless of internal states. To see how Malik’s formulation corresponds, we
partition input assignments into two sets: one with outputs fully determined, and the other
partially determined. Under the former set of input assignments, no restrictions need to be
imposed on SEGs, just like in our formulation. Under the latter set of input assignments,
however, the restrictions discussed in the case where all valuations must stabilize need to
be imposed. Although the generality is enhanced in the relaxed case, the combinationality
based on ternary-valued simulation is still a limited formulation. In comparison, our for-
mulation is strictly more general. In fact, it is the most general formulation that one can
hope for as stated in Theorem 7.
Example 4 Continuing Example 1, the observation function is only partially determined
under any input assignment. It is not hard to see that system M specified by DM is not
combinational under the functional-level extension of Malik’s combinationality formulation,
contrary to our combinationality analysis.
3.5.3 Sequential Extensions
In his thesis [Shi96], Shiple extended the analysis of combinational cycles for circuits
with state-holding elements. He defined sequential output-stability, which allows a circuit
to be initialized to some stable states and considers only initialized behavior. The GMW
analysis was adopted to replace ternary-valued simulation such that nondeterministic in-
ternal states were admitted to exist as long as the observable behavior is unaffected. An
equivalent acyclic circuit can be generated from the GMW analysis. Again, if the objective
is software synthesis or to break cyclic definitions, such sequential extension can be gen-
eralized substantially and simplified to our computation outlined in Section 3.3.6 without
resorting to the complicated GMW analysis.
3.6 Summary
We showed that, when cyclic definitions are to be broken in the final realization, the
formulation of combinationality can be much more general than previous formulations.
In addition, the analysis can be done at a higher abstraction level, i.e., the
functional level. The combinationality formulation is extended to an extreme — a system
is combinationally implementable if and only if it passes our combinationality test. When
cyclic definitions are to be maintained, we examine the legitimacy condition of our formula-
tion. It turns out that software synthesis of reactive systems may be an application domain,
where cyclic definitions can be maintained in the final realization. In addition, we show
that our analysis is independent of the choice of cutsets. Our results admit strictly more
flexible high-level specifications in hardware/software system design. For combinationality
at the circuit level, we comment on a pitfall in a recent attempt at synthesizing cyclic
circuits for area minimization. Two approaches are given to rectify the deficiency.
Chapter 4
Retiming and Resynthesis
Transformations using retiming and resynthesis operations are the most important and
practical (if not the only) techniques used in optimizing synchronous hardware systems.
Although these transformations have been studied extensively for over a decade, questions
about their optimization capability and verification complexity have not been fully answered.
Resolving these questions may be crucial in developing more effective synthesis and verification
algorithms.
This chapter settles the above two open problems. The optimization potential is resolved
through a constructive algorithm which determines if two given finite state machines (FSMs)
are transformable to each other via retiming and resynthesis operations. Verifying the
equivalence of two FSMs under such transformations when the transformation history is lost
is proved to be PSPACE-complete and hence just as hard as general equivalence checking,
contrary to a common belief. As a result, we advocate a conservative design methodology
for the optimization of synchronous hardware systems to ameliorate verifiability.
Our analysis reveals some properties about initializing FSMs transformed under retim-
ing and resynthesis. On the positive side, a lag-independent bound on the length increase
of initialization sequences for FSMs under retiming is established. This allows a simpler
incremental construction of initialization sequences compared to prior approaches. On the
negative side, we show that there is no analogous transformation-independent bound when
resynthesis and retiming are iterated. Fortunately, an algorithm computing the exact length
increase is presented.
4.1 Introduction
Retiming [LS83, LS91] is an elementary yet effective technique in optimizing syn-
chronous hardware systems. By simply repositioning registers, it is capable of rescheduling
computation tasks in an optimal way subject to some design criteria. As both an ad-
vantage and a disadvantage, retiming preserves the circuit structure of the system under
consideration. It is an advantage in that it supports incremental engineering change with
good predictability, and a disadvantage in that the optimization capability is somewhat
limited. Therefore, resynthesis [Mal90, DM91, MSBSV91] was proposed to be combined
with retiming, allowing modification of circuit structures. This combination of retiming
and resynthesis certainly extends the optimization power of retiming, but to what extent
remains an open problem, even though some notable progress has been made since [Mal90],
e.g., [Ran97, RSSB98, ZSA98]. Fully resolving this problem is crucial in understanding the
complexity of verifying the equivalence of systems transformed by retiming and resynthesis
and in constructing correct initialization sequences. In fact, despite its effectiveness, retim-
ing and resynthesis is not widely used in hardware synthesis flows due to the verification
hindrance and the initialization problem. Progress in these areas could enhance the prac-
ticality and application of retiming and resynthesis, and advance the development of more
effective synthesis and verification algorithms.
This chapter tackles three main problems regarding retiming and resynthesis:
Optimization power: What is the transformation power of retiming and resynthesis?
How can we tell if two synchronous systems are transformable to each other with
retiming and resynthesis operations?
Verification complexity: What is the computational complexity of verifying if two syn-
chronous systems are equivalent under retiming and resynthesis?
Initialization: How does the transformation of retiming and resynthesis affect the initial-
ization of a synchronous system? How can we correct initialization sequences?
Our main results include
• (Section 4.3) Characterize constructively the transformation power of retiming and resyn-
thesis.
• (Section 4.4) Prove the PSPACE-completeness of verifying the equivalence of systems
transformed by retiming and resynthesis operations when the transformation history
is lost.
• (Section 4.5) Demonstrate the effects of retiming and resynthesis on the initialization
sequences of synchronous systems. Present an algorithm correcting initialization se-
quences.
The chapter is organized as follows. After Section 4.2 introduces some preliminaries
and notation, our main results are presented in Sections 4.3, 4.4, and 4.5. In Section 4.6,
a closer comparison with prior work is detailed. Section 4.7 concludes this chapter and
outlines some future research directions.
4.2 Preliminaries
In this chapter, to avoid later complications we shall not restrict ourselves to binary
variables and Boolean functions. Thus, we assume that variables can take values from
arbitrary finite domains, and similarly functions can have arbitrary finite domains and co-
domains. When (co)domains are immaterial in the discussion, we shall omit specifying
them.
Synchronous hardware systems.
Based on [LS83], a syntactical definition of synchronous hardware systems can be for-
mulated as follows. A hardware system is abstracted as a directed graph, called a com-
munication graph, G = (V,E) with typed vertices V and weighted edges E. Every vertex
v ∈ V represents either the environment or a functional element. The vertex representing
the environment is the host, which is of type undefined; a vertex is of type ~f if the functional
element it represents is of function ~f (which can be a multiple-output function consisting
of f1, f2, . . .). Every edge e〈w〉 = (u, v)〈w〉 ∈ E with a nonnegative integer-valued weight
w corresponds to the interconnection from vertex u to vertex v interleaved by w state-
holding elements (or registers). (From the viewpoint of hardware systems, any component
in a communication graph disconnected from the host is redundant. Hence, in the sequel,
we assume that a communication graph is a single connected component.) A hardware
system is synchronous if, in its corresponding communication graph, every cycle contains
at least one positive-weighted edge. This chapter is concerned with synchronous hardware
systems whose registers are all triggered by the same clock ticks. Moreover, according to
the initialization mechanism, a register can be reset either explicitly or implicitly. For reg-
isters with explicit reset, their initial values are determined by some reset circuitry when
the system is powered up. In contrast, for registers with implicit reset, their initial values
can be arbitrary, but can be brought to an identified set of states (i.e. the set of initial
states1) by applying some input sequences, the so-called initialization (or reset) sequences
[Pix92]. It turns out that explicit-reset registers can be replaced with implicit-reset ones
plus some reset circuitry [MSBSV91, SMB96]. (Doing so admits a more systematic treat-
ment of retiming synchronous hardware systems because retiming explicit-reset registers
needs special attention to maintain equivalent initial states.) Without loss of generality,
this chapter assumes that all registers have implicit reset. In addition, we are concerned
with initializable systems, that is, there exist input sequences which bring the systems from
any state to some set of designated initial states.
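The synchrony condition above (every cycle contains at least one positive-weighted edge) amounts to checking that the subgraph of zero-weight edges is acyclic. A minimal sketch, under our own graph encoding as (u, v, w) edge triples:

```python
def is_synchronous(vertices, edges):
    """A communication graph is synchronous iff its zero-weight subgraph
    is acyclic; then every cycle contains a positive-weight edge."""
    succ = {v: [] for v in vertices}
    for u, v, w in edges:
        if w == 0:
            succ[u].append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in vertices}

    def acyclic_from(u):          # DFS cycle detection on zero-weight edges
        color[u] = GRAY
        for v in succ[u]:
            if color[v] == GRAY:
                return False      # back edge: a zero-weight cycle
            if color[v] == WHITE and not acyclic_from(v):
                return False
        color[u] = BLACK
        return True

    return all(color[v] != WHITE or acyclic_from(v) for v in vertices)

# host -> f -> g -> host with one register on (g, host): synchronous.
assert is_synchronous(["host", "f", "g"],
                      [("host", "f", 0), ("f", "g", 0), ("g", "host", 1)])
# Removing the register leaves a zero-weight cycle: not synchronous.
assert not is_synchronous(["host", "f", "g"],
                          [("host", "f", 0), ("f", "g", 0), ("g", "host", 0)])
```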
The semantical interpretation of synchronous hardware systems can be modelled as finite
state machines. To uniquely construct an FSM from a communication graph G = (V,E),
we divide each edge (u, v)〈w〉 ∈ E into w + 1 edges separated by w registers and connected
with the two end-vertices u and v. We then associate the outgoing (incoming) edges of
registers with current-state variables VS (next-state variables VS′); associate the outgoing
(incoming) edges of the host with variables VI (VO). All other edges are associated with
internal variables. The transition and output functions are obtained, starting from VS′ and
VO, respectively, by a sequence of recursive substitutions of variables with functions of their
input functional elements until the functions depend only on variables VI ∪ VS .
We define a strong form of state equivalence which will govern the study of the trans-
formation power of retiming.
Definition 4 Given an FSM M = (Q, I,Σ,Ω, ~δ, ~ω), two states q1, q2 ∈ Q are immedi-
ately equivalent if ~δ(σ, q1) ≡ ~δ(σ, q2) and ~ω(σ, q1) ≡ ~ω(σ, q2) for any σ ∈ Σ.
Also, we define dangling states inductively as follows.
1 When referring to “initial states,” we shall mean the starting states of a system after initialization.
Definition 5 Given an FSM, a state is dangling if either it has no predecessor state or
all of its predecessor states are dangling. All other states are non-dangling.
Retiming.
A retiming operation over a synchronous hardware system consists of a series of atomic
moves of registers across functional elements in either a forward or backward direction.
(The relocation of registers is crucial in exploring optimal synchronous hardware systems
with respect to various design criteria, such as area, performance, power, etc. As this is not our
focus, an exposition of retiming from the optimization perspective is omitted in this chapter.
Interested readers are referred to [LS91].) Formally speaking, retiming can be described
with a retime function [LS83] over a communication graph as follows.
Definition 6 Given a communication graph G = (V,E), a retime function ρ : V → Z
maps each vertex to an integer, called the lag of the vertex, such that w + ρ(v)− ρ(u) ≥ 0
for any edge (u, v)〈w〉 ∈ E. If ρ(host) = 0, ρ is called normalized; otherwise, ρ is
unnormalized.
Given a communication graph G = (V,E), any retime function ρ over G uniquely determines
a “legally” retimed communication graph G† = (V,E†) in which (u, v)〈w〉 ∈ E if, and only
if, (u, v)〈w + ρ(v)− ρ(u)〉 ∈ E†. It is immediate that the retime function −ρ reverses the
retiming from G† to G.
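Definition 6 and the reversal property can be exercised concretely. The sketch below (our own encoding; the graph and lags are made up for illustration) applies a retime function edge-wise and checks legality:

```python
def retime(edges, rho):
    """Apply a retime function rho (vertex -> lag) to a communication
    graph given as (u, v, w) edge triples; each weight becomes
    w + rho[v] - rho[u].  Raises if rho is illegal, i.e., some retimed
    weight would be negative."""
    retimed = []
    for u, v, w in edges:
        w2 = w + rho[v] - rho[u]
        if w2 < 0:
            raise ValueError("illegal retime function on edge (%s, %s)" % (u, v))
        retimed.append((u, v, w2))
    return retimed

edges = [("host", "f", 0), ("f", "g", 0), ("g", "host", 1)]
rho = {"host": 0, "f": 1, "g": 1}          # normalized: rho(host) = 0

g_dagger = retime(edges, rho)              # register moves backward across g, f
assert g_dagger == [("host", "f", 1), ("f", "g", 0), ("g", "host", 0)]

# The retime function -rho reverses the retiming from G-dagger to G ...
assert retime(g_dagger, {v: -l for v, l in rho.items()}) == edges
# ... and offsetting rho by a constant yields an equivalent retime
# function (Proposition 8): the offset cancels in w + rho(v) - rho(u).
assert retime(edges, {v: l + 5 for v, l in rho.items()}) == g_dagger
```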
Retime functions can be naturally classified by calibrating their equivalences as follows.
Definition 7 Given a communication graph G, two retime functions ρ1 and ρ2 are equiv-
alent if they result in the same retimed communication graph.
Proposition 8 Given a retime function ρ with respect to a communication graph, offsetting
ρ by an integer constant c results in an equivalent retime function.
Hence any retime function can be normalized. This equivalence relation, which will be useful
in the study of the increase of initialization sequences due to retiming, induces a partition
over retime functions. Equivalent retime functions (with respect to some communication
graph) form an equivalence class.
Proposition 9 Given a communication graph G, any equivalence class of retime functions
is of infinite size; any equivalence class of normalized retime functions is of size either one
or infinity (only when G contains components disconnected from the host). Furthermore,
any equivalence class of retime functions has a normalized member.
Resynthesis.
A resynthesis operation over a function f rewrites the syntactical formula structure of
f while maintaining its semantical functionality. Clearly, the set of all possible rewrites is
infinite (but countable, namely, with the same cardinality as the set N of natural numbers).
When a resynthesis operation is performed upon a synchronous hardware system, we shall
mean that the transition and output functions of the corresponding FSM are modified in
representations but preserved in functionalities. This modification in representations will
be reflected in the communication graph of the system. (Again, such rewrites are usually
subject to some optimization criteria. Since this is not our focus, the optimization aspects
of resynthesis operations are omitted. See, e.g., [DM91] for further treatment.)
4.3 Optimization Capability
The transformation power of retiming and resynthesis can be understood best with state
transition graphs (STGs) defined by FSMs. We investigate how retiming and resynthesis
operations can alter STGs.
4.3.1 Optimization Power of Retiming
Given a communication graph G = (V, E), we study how the atomic forward and
backward moves of retiming affect the corresponding FSM M = ([[VS ]], I, Σ,Ω, ~δ, ~ω).
To study the effect of an atomic backward move, consider a normalized retime function
ρ with ρ(v) = 1 for some vertex v ∈ V and ρ(u) = 0 for all u ∈ V \{v}. (Because a
retiming operation can be decomposed as a series of atomic moves, analyzing ρ defined
above suffices to demonstrate the effect.) Let VS = VS\ ∪ VS∗ be the state variables of
M, where VS\ = {s1, . . . , si} and VS∗ = {si+1, . . . , sn} are disjoint. Suppose v is of type
~f : [[t1, . . . , tj]] → [[s′1, . . . , s′i]], where the valuation of next-state variable s′k is defined by
fk(t1, . . . , tj) for k = 1, . . . , i. Let M† = ([[V†S ]], I†, Σ, Ω, ~δ†, ~ω†) be the FSM after retiming,
where state variables V†S = VT ∪ VS∗ with VT = {t1, . . . , tj}. For any two states q†1, q†2 ∈ [[V†S ]],
if q†1[VS∗ ] ≡ q†2[VS∗ ] and ~f(q†1[VT ]) ≡ ~f(q†2[VT ]), then q†1 and q†2 are immediately equivalent.
This immediate equivalence results from the fact that the transition and output functions
of M† can be valuated after the valuation of ~f , which filters out the difference between
q†1 and q†2. Comparing state pairs between M and M†, we can always find a relation
R ⊆ [[VS ]]× [[V†S ]] such that
1. Pairs (q1, q†1) and (q1, q†2) are both in R for the state q1 of M with q1[VS∗ ] ≡ q†1[VS∗ ] and q1[VS\ ] ≡ ~f(q†1[VT ]).
2. It preserves the immediate equivalence, that is, (q, q†) ∈ R if, and only if, ~ω(σ, q) ≡ ~ω†(σ, q†) and (~δ(σ, q), ~δ†(σ, q†)) ∈ R for any σ ∈ Σ.
Since ~f is a total function, every state of M† has a corresponding state in M related by
R. (It corresponds to the fact that backward moves of retiming cannot increase the length
of initialization sequences, the subject to be discussed in Section 4.5.) On the other hand,
since ~f may not be a surjective (or an onto) mapping in general, there may be some state
q of M such that ∀x ∈ [[VT ]]. q[VS\ ] 6≡ ~f(x), that is, no states can transition to q. In this
case, q can be seen as being annihilated after retiming. To summarize,
Lemma 10 An atomic backward move of retiming can 1) split a state into multiple imme-
diately equivalent states and/or 2) annihilate states which have no predecessor states.
With a similar reasoning by reversing the roles of M and M†, one can show
Lemma 11 An atomic forward move of retiming can 1) merge multiple immediately equiv-
alent states into a single state and/or 2) create states which have no predecessor states.
(Results similar to Lemmas 10 and 11 appeared in [RSSB98], where the phenomena of state
creation and annihilation were omitted.)
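Lemma 10 can be visualized with a concrete, made-up functional element. Moving a register backward across a vertex of type ~f replaces the registers on f's outputs by registers on its inputs, so each old state value s splits into its preimage f⁻¹(s), and values outside the image of f are annihilated:

```python
from itertools import product
from collections import defaultdict

# Hypothetical two-input, two-output functional element:
# outputs are (AND, OR) of the two input bits.
def f(t1, t2):
    return (t1 and t2, t1 or t2)

preimage = defaultdict(list)
for t1, t2 in product([False, True], repeat=2):
    preimage[f(t1, t2)].append((t1, t2))

# Value (AND = 0, OR = 1) has two preimages: the corresponding old state
# splits into two immediately equivalent states after the backward move.
assert len(preimage[(False, True)]) == 2
# Value (AND = 1, OR = 0) is outside the image of f: that state value is
# annihilated, since no state can transition to it.
assert (True, False) not in preimage
```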
Note that, in a single atomic forward move of retiming, transitions among the newly cre-
ated states are prohibited. In contrast, when a sequence of atomic forward moves m1, . . . ,mn
is performed, the newly created states at move mi can possibly have predecessor states
created in later moves mi+1, . . . , mn. Clearly all the newly created states not merged with
original existing states due to immediate equivalence are dangling. However, as will be shown in
Section 4.5.1, the transition paths among these dangling states cannot be arbitrarily long.
Since a retiming operation consists of a series of atomic moves, Lemmas 10 and 11 set
the fundamental rules of all possible changes of STGs by retiming. Observe that a retiming
operation is always associated with some structure (i.e. a communication graph). For a
fixed structure, a retiming operation has limited optimization power, e.g., the converses of
Lemmas 10 and 11 are not true. That is, there may not exist atomic moves of retiming
(over a communication graph) which meet arbitrary targeting changes on an STG. Unlike a
retiming operation, a resynthesis operation provides the capability of modifying the vertices
and connections of a communication graph.
4.3.2 Optimization Power of Retiming and Resynthesis
A resynthesis operation itself cannot contribute any changes to the STG of an FSM.
However, when combined with retiming, it becomes a handy tool. In essence, the combi-
nation of retiming and resynthesis validates the converses of Lemmas 10 and 11, as will be
shown in Theorem 13. Moreover, it determines the transitions of newly created states due
to forward retiming moves, and thus has decisive effects on initialization sequences as will
be discussed in Section 4.5.2. On the other hand, we shall mention an important property
about retiming and resynthesis operations.
Lemma 12 Given an FSM, the newly created states (not merged with original existing
states due to immediate equivalence) due to atomic forward moves of retiming remain dan-
gling throughout iterative retiming and resynthesis operations.
Remark 1 As an orthogonal issue to our discussion on how retiming and resynthesis
can alter the STG of an FSM, the transformation of retiming and resynthesis was shown
[MSBSV91] to have the capability of exploiting various state encodings (or assignments) of
the FSM.
Notice that the induced state space of the dangling states originating from atomic
moves of retiming is immaterial in our study of the optimization capability of retiming and
resynthesis because an FSM after initialization never reaches such dangling states. An exact
characterization of the optimization power of retiming and resynthesis is given as follows.
Theorem 13 Ignoring the (unreachable) dangling states created due to retiming, two FSMs
are transformable to each other through retiming and resynthesis if, and only if, their state
transition graphs are transformable to each other by a sequence of splitting a state into
multiple immediately equivalent states and of merging multiple immediately equivalent states
into a single state.
Proof. (=⇒) Since resynthesis does not change the transition functions of an FSM, the
proof is immediate from Lemmas 10 and 11.
(⇐=) Given a target sequence of merging and splitting of immediately equivalent states,
it can be accomplished by a sequence of retiming and resynthesis. Essentially, each merging
(resp. splitting) of states can be achieved with a resynthesis operation followed by a forward
(resp. backward) retiming operation. To see why, let Σ and Q be the set of input alphabets
and states of M, respectively. Without loss of generality, assume that q1, q2 ∈ Q are
immediately equivalent states to be merged. A resynthesis operation can rewrite the original
transition functions ~δ : Σ × Q → Q as a composition of two parts, ~δ = ~∆2 ◦ ~∆1, where
~∆1 : Q → Q\{q2}, ~∆2 : Σ × (Q\{q2}) → Q, and ~∆1(q2) = q1. Retiming registers to the
positions in-between ~∆1 and ~∆2 merges q2 into q1. It is not hard to see that the retiming
operation is always possible. On the other hand, assume q ∈ Q is the state to be split
into multiple immediately equivalent states Q†. A resynthesis operation can again rewrite
the original transition functions ~δ as a composition of two parts, ~δ = ~∆4 ◦ ~∆3, where
~∆3 : Σ × Q → Q† ∪ (Q\{q}), ~∆4 : Q† ∪ (Q\{q}) → Q, ~∆3(σ, qi) ∈ Q† for ~δ(σ, qi) = q, and
~∆4(q†) = q for q† ∈ Q†. Retiming registers to the positions in-between ~∆3 and ~∆4 splits q
into Q†. Notice that the retiming is always possible because the output functions, originally
depending on Q, can be rewritten (by resynthesis) as functions depending on Q† ∪ (Q\{q}).
Consequently, any sequence of merging and splitting of immediately equivalent states is
achievable using retiming and resynthesis operations.
(A result similar to Theorem 13 appeared in [RSSB98], where, however, the optimization
power of retiming and resynthesis was overstated, as will be detailed in Section 4.6.) From
Theorem 13, one can relate two FSMs before and after the transformation of retiming and
resynthesis as follows.
Corollary 14 Given two FSMs M = (Q, I,Σ, Ω, ~δ, ~ω) and M† = (Q†, I†,Σ, Ω, ~δ†, ~ω†), M
and M† are transformable to each other through retiming and resynthesis operations if, and
only if, there exists a relation R ⊆ Q×Q† satisfying
1. Any non-dangling state q ∈ Q (resp. q† ∈ Q†) has at least one non-dangling state
q† ∈ Q† (resp. q ∈ Q) such that (q, q†) ∈ R.
2. State pair (q, q†) ∈ R if, and only if, ~ω(σ, q) ≡ ~ω†(σ, q†) and (~δ(σ, q), ~δ†(σ, q†)) ∈ R
for any σ ∈ Σ.
Notice that the statements of Theorem 13 and Corollary 14 are nonconstructive in the sense
that no procedure is given to determine if two FSMs are transformable to each other under
retiming and resynthesis. This weakness motivates us to study a constructive alternative.
Remark 2 One can show that peripheral retiming [MSBSV91] combined with resynthesis
does not increase the transformation power of normal retiming combined with resynthesis.
That is, the former can be achieved with the latter as we discuss below.
Peripheral retiming and resynthesis work as follows. A peripheral retiming operation is
performed on a communication graph G = (V, E) such that edges with negative weights are
allowed to exist temporarily. A resynthesis operation is then performed on G, yielding a
new communication graph G† = (V †, E†). Another retiming operation on G†, yielding G‡ =
(V †, E‡), must ensure that all edges E‡ are of non-negative weights. (If the last step fails, the
entire transformation is illegal. We are only concerned with legal transformations.) Observe
that the edges with non-zero weights in E† survive throughout the above operations. That is,
these edges also appear in E and E‡ ignoring the weight changes. With a similar reasoning
of Lemmas 10 and 11, the state spaces Q and Q‡ induced by G and G‡, respectively,
can be related with the valuations of these edges. This relation makes the transformation
achievable with a resynthesis operation followed by a normal retiming operation. Essentially,
the resynthesis operation rewrites the original transition functions ~δ : Σ×Q → Q induced
by G as the composition ~∆2 ◦ ~∆1 according to the above relation, where ~∆1 : Σ×Q → Q‡
and ~∆2 : Q‡ → Q. The retiming operation, on the other hand, moves registers in-between
~∆1 and ~∆2.
It is worth mentioning that, although peripheral retiming in theory does not increase
the transformation power, it is useful in practice to find good rewrites.
4.3.3 Retiming-Resynthesis Equivalence and Canonical Representation
Given an FSM, the transformation of retiming and resynthesis operations can rewrite
it into a class of equivalent FSMs (constrained by Corollary 14). We ask if there exists a
computable canonical representative in each such class, and answer this question affirma-
tively by presenting a procedure constructing it. Rather than arguing directly over FSMs,
we simplify our exposition by arguing over STGs.
Because retiming and resynthesis operations are reversible, we know
Proposition 15 Given STGs G, G1, and G2, suppose G1 and G2 are derivable from G
using retiming and resynthesis operations. Then G1 and G2 are transformable to each other
under retiming and resynthesis.
We say that two FSMs (STGs) are equivalent under retiming and resynthesis if they are
transformable to each other under retiming and resynthesis. Thus, any such equivalence
class is complete in the sense that any member in the class is transformable to any other
member. To derive a canonical representative of any equivalence class, consider the algo-
rithm outlined in Figure 4.1. Similar to the general state minimization algorithm [Koh78],
the idea is to seek a representative minimized with respect to the immediate equivalence
of states. However, unlike the least-fixed-point computation of the general state minimiza-
tion, the computation in Figure 4.1 looks for a greatest fixed point. Given an STG, the
ConstructQuotientGraph
input: a state transition graph G
output: a state-minimized transition graph w.r.t. immediate equivalence
begin
01 remove dangling states from G
02 repeat
03     compute and merge immediately equivalent states of G
04 until no merging performed
05 return the reduced graph
end
Figure 4.1. Algorithm: Construct quotient graph.
algorithm first removes all the dangling states, and then iteratively merges immediately
equivalent states until no more states can be merged.
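The two phases of Figure 4.1 can be sketched concretely over an explicit STG. The encoding below is illustrative, not from this chapter: trans[(q, a)] gives the next state and out[(q, a)] the output of a Mealy-style STG, and immediate equivalence is realized as identical outputs and identical representative next states under every input symbol.

```python
def construct_quotient_graph(states, alphabet, trans, out):
    states = set(states)
    # Step 01: remove dangling states -- the greatest fixed point of
    # deleting states that have no predecessor among the survivors.
    while True:
        with_pred = {trans[(q, a)] for q in states for a in alphabet}
        survivors = states & with_pred
        if survivors == states:
            break
        states = survivors
    # Steps 02-04: repeatedly merge immediately equivalent states, i.e.
    # states whose (output, next-state-class) profiles coincide.
    rep = {q: q for q in states}
    changed = True
    while changed:
        changed = False
        sig = {}  # profile -> chosen representative
        for q in sorted(q for q in states if rep[q] == q):
            key = tuple((out[(q, a)], rep[trans[(q, a)]]) for a in alphabet)
            if key in sig:
                rep[q] = sig[key]  # merge q into an earlier state
                changed = True
            else:
                sig[key] = q
        for q in states:  # path-compress chains of merges
            while rep[rep[q]] != rep[q]:
                rep[q] = rep[rep[q]]
    # Step 05: the reduced graph over representative states.
    reps = {q for q in states if rep[q] == q}
    q_trans = {(q, a): rep[trans[(q, a)]] for q in reps for a in alphabet}
    return reps, q_trans
```

Each pass of the merging loop corresponds to one iteration of the repeat loop in Figure 4.1; convergence is the greatest fixed point discussed above.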
Theorem 16 Given an STG G, Algorithm ConstructQuotientGraph produces a canonical
state-minimized solution, which is equivalent to G under retiming and resynthesis.
Proof. It is clear that the algorithm always terminates for finite state transition graphs.
Recall our assumption that FSMs are of implicit reset. Since dangling states do not
affect the normal operation of an FSM (but affect its initialization), the algorithm can safely
remove the state space induced by the dangling states and consider only the remaining state
space. (See also Proposition 20.)
For the sake of contradiction, assume the algorithm produces two different (non-
isomorphic) quotient graphs G1/ and G2/ for two given STGs G1 and G2, respectively,
which are equivalent under retiming and resynthesis. Because the algorithm merges only
immediately equivalent states, G1/ and G2/ must also be equivalent under retiming and
resynthesis (but not isomorphic by assumption). Since G1/ and G2/ are not isomorphic,
there does not exist a bijection (a one-to-one and onto mapping) between states of G1/ and
states of G2/ such that the bijection preserves immediate equivalence. Two cases need to
be considered. First, there exists an onto but not one-to-one mapping from one graph to
VerifyEquivalenceUnderRetiming&Resynthesis
input: two state transition graphs G1 and G2
output: Yes, if G1 and G2 are equivalent under retiming and resynthesis
        No, otherwise
begin
01 G1/ := ConstructQuotientGraph(G1)
02 G2/ := ConstructQuotientGraph(G2)
03 if G1/ and G2/ are isomorphic
04 then return Yes
05 else return No
end
Figure 4.2. Algorithm: Verify equivalence under retiming and resynthesis.
the other which preserves immediate equivalence. In this case, at least one of G1/ and G2/
is not maximally reduced, contradicting the fact that no two states in a quotient graph are
immediately equivalent. Second, there exists no mapping preserving immediate equivalence.
However, from Proposition 15, we know that G1/ is transformable to G1, then to G2, and
finally to G2/. Hence a mapping that preserves immediate equivalence must exist between
G1/ and G2/. Again a contradiction arises. The theorem follows.
For a naïve explicit enumerative implementation, Algorithm ConstructQuotientGraph is of
time complexity O(kn³), where k is the size of the input alphabet and n is the number of states.
A prudent refinement similar to the Paige-Tarjan algorithm [PT87] can further reduce the
complexity to O(kn log n). (Notice that the complexity is exponential when the input is
an FSM, instead of an STG representation.) For an implicit symbolic implementation, the
complexity depends heavily on the internal symbolic representations. If Step 3 in Figure 4.1
computes and merges all immediately equivalent states at once in a breadth-first-search
manner, then the algorithm converges in a minimum number of iterations.
From the proof of Theorem 16, an algorithm outlined in Figure 4.2 can check if two
STGs are transformable to each other under retiming and resynthesis.
Theorem 17 Given two state transition graphs, Algorithm VerifyEquivalenceUnderRetim-
ing&Resynthesis verifies if they are equivalent under retiming and resynthesis.
Proof. A direct consequence of Theorem 16.
The complexity of the algorithm in Figure 4.2 is the same as that in Figure 4.1 since the
graph isomorphism check for STGs is O(kn), which is not the dominating factor. With the
presented algorithm, checking the equivalence under retiming and resynthesis is not easier
than general equivalence checking. In the following section, we investigate its intrinsic
complexity.
4.4 Verification Complexity
We show some complexity results of verifying if two FSMs are equivalent under iterative
retiming and resynthesis.
4.4.1 Verification with Unknown Transformation History
We investigate the complexity of verifying the equivalence of two FSMs with unknown
history of retiming and resynthesis operations.
Theorem 18 Determining if two FSMs are equivalent under retiming and resynthesis with
unknown transformation history is PSPACE-complete.
Proof. Certainly Algorithm VerifyEquivalenceUnderRetiming&Resynthesis can be per-
formed in polynomial space (even with inputs in FSM representations).
On the other hand, we need to reduce a PSPACE-complete problem to our problem at
hand. The following problem is chosen.
Given a total function f : {1, . . . , n} → {1, . . . , n}, is there a number k such that, by composing f k times, f^k(1) = n?
In other words, the problem asks if n is “reachable” from 1 through f . It was shown
[Jon75] to be deterministic2 LOGSPACE-complete in the unary representation and, thus,
PSPACE-complete in the binary representation [Pap94]. We show that the problem in
the unary (resp. binary) representation is log-space (resp. polynomial-time) reducible to
our problem with inputs in STG (resp. FSM) representations. We further establish that
the answer to the PSPACE-complete problem is positive if and only if the answer to the
corresponding equivalence verification problem (to be constructed) is negative. Since the
complexity class of nondeterministic space is closed under complementation [Imm88], the
theorem follows.
To complete the proof, we elaborate the reduction. Given a function f as stated earlier,
we construct two total functions f1, f2 : {0, 1, . . . , n} → {0, 1, . . . , n} as follows. Let f1 have
the same mapping as f over {1, . . . , n − 1} and have f1(0) = 1 and f1(n) = 1. Also let
f2 have the same mapping as f with f2(0) = 1 but f2(n) = 0. Clearly the constructions
of f1 and f2 can be done in log-space. Treating {0, 1, . . . , n} as the state set, f1 and f2
specify the transitions of two STGs, say G1 and G2 (which have empty input and output
alphabets). Observe that any state of G1 (similarly G2) has exactly one next-state. Thus,
every state is either in a single cycle or on a single path leading to a cycle. Observe also that
two states of G1 (similarly G2) are immediately equivalent if and only if they have the same
next-state. An important consequence of these observations is that all states not in cycles
can be merged through iterative retiming and resynthesis due to immediate equivalence.
To see the relationship between reachability and equivalence under retiming and resyn-
thesis, consider the case where n is reachable from 1 through f . States 1 and n of G1 must
2It is a well-known result by Savitch [Sav70] that deterministic and nondeterministic space complexities coincide.
be in a cycle excluding state 0; states 1 and n of G2 must be in a cycle including state 0.
Hence the state-minimized (with respect to immediate equivalence) graphs of G1 and G2
are not isomorphic. That is, G1 and G2 are not equivalent under retiming and resynthesis.
On the other hand, consider the case where n is unreachable from 1 through f . Then state
n of G1 and state n of G2 are dangling. From the mentioned observations, merging dangling
states in G1 and G2 yields two isomorphic graphs. That is, G1 and G2 are equivalent under
retiming and resynthesis. Therefore, n is reachable from 1 through f if, and only if, G1 and
G2 are not equivalent under retiming and resynthesis. (Notice that, unlike the discussion
of optimization capability, here we should not ignore the effects of retiming and resynthesis
over the unreachable state space.)
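The reduction in this proof can be exercised on small instances. In the sketch below (the function names are mine, not the thesis's), the empty alphabet makes immediate equivalence "same next state", so the quotient of a functional graph retains exactly its cycles; isomorphism of the quotients then amounts to comparing the multisets of cycle lengths.

```python
def make_f1_f2(f, n):
    # Build f1, f2 over {0, 1, ..., n} from f : {1..n} -> {1..n},
    # exactly as in the proof.
    f1 = {0: 1, n: 1}
    f2 = {0: 1, n: 0}
    for v in range(1, n):
        f1[v] = f[v]
        f2[v] = f[v]
    return f1, f2

def cycle_lengths(g):
    # States surviving iterated removal of predecessor-free states are
    # exactly the cycle states; report the sorted cycle lengths.
    states = set(g)
    while True:
        with_pred = {g[q] for q in states}
        if with_pred == states:
            break
        states &= with_pred
    lengths, seen = [], set()
    for s in states:
        if s in seen:
            continue
        length, v = 0, s
        while v not in seen:
            seen.add(v)
            v, length = g[v], length + 1
        lengths.append(length)
    return sorted(lengths)

def reachable(f, n):
    # Is n reachable from 1 by iterating f?
    seen, v = set(), 1
    while v not in seen:
        if v == n:
            return True
        seen.add(v)
        v = f[v]
    return False
```

For any total f on {1, . . . , n}, n is reachable from 1 if, and only if, the two quotients differ, mirroring the if-and-only-if argument above.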
4.4.2 Verification with Known Transformation History
By Theorem 18, verifying if two FSMs are equivalent under retiming and resynthesis
without knowing the transformation history is as hard as the general equivalence checking
problem. Thus, we advocate a conservative design methodology that optimizes synchronous
hardware systems while improving their verifiability.
An easy approach to circumvent the PSPACE-completeness is to record the history
of retiming and resynthesis operations as verification checkpoints, or alternatively to per-
form equivalence checking after every retiming or resynthesis operation. The reduction in
complexity results from the following well-known facts.
Proposition 19 Given two synchronous hardware systems, verifying if they are trans-
formable to each other with retiming only is of the same complexity as checking graph
isomorphism; verifying if they are transformable to each other with resynthesis only is of
the same complexity as combinational equivalence checking, which is NP-complete.
Therefore, if transformation history is completely known, the verification complexity reduces
to NP-complete.
4.5 Initialization Sequences
To discuss initialization sequences, we rely on the following proposition of Pixley [Pix92].
Proposition 20 ([Pix92]) An FSM is initializable only if its initial states are non-
dangling. (In fact, any non-dangling state can be used as an initial state by suitably modi-
fying initialization sequences.)
By Lemma 12, Corollary 14 and Proposition 20, it is immediate that
Corollary 21 The initializability of an FSM is an invariant under retiming and resynthe-
sis.
Hence we shall assume that the given FSM M is initializable. Furthermore, we assume
that its initialization sequence is given as a black box. That is, we have no knowledge of
how M is initialized. Under these assumptions, we study how the initialization sequence
is affected when M is retimed (and resynthesized). As shown earlier, the creation and
annihilation of dangling states are immaterial to the optimization capability of retiming
and resynthesis. However, they play a decisive role in affecting initialization sequences.
In essence, the longest transition path among the dangling states determines by how much
the initialization sequences must be lengthened.
4.5.1 Initialization Affected by Retiming
Lag-dependent bounds.
Effects of retiming on initialization sequences were studied by Leiserson and Saxe in
[LS83], where their Retiming Lemma can be rephrased as follows.
Lemma 22 ([LS83]) Given a communication graph G = (V, E) and a normalized retime
function ρ, let ℓ = max_{v∈V} −ρ(v) and let G† be the corresponding retimed communication
graph of G. Suppose M and M† are the FSMs specified by G and G†, respectively. Then
after M† is initialized with an arbitrary input sequence of length ℓ, any state of M† has an
equivalent3 state in M.
That is, ℓ (nonnegative for normalized ρ)4 gives an upper bound on the increase of initialization
sequences under retiming. This bound was further tightened in [EMMRM97, SPRB95]
by letting ℓ be the maximum of −ρ(v) over all vertices v of functional elements whose functions define
non-surjective mappings. Unfortunately, this strengthening still does not produce an exact
bound. Moreover, by Proposition 8, a normalized retime function among its equivalent
retime functions may not be the one that gives the tightest bound. A derivation of exact
bounds will be discussed in Section 4.5.2.
Lag-independent bounds.
Given a synchronous hardware system, a natural question is if there exists some bound
which is universally true for all possible retiming operations. Even though the bound may
be looser than lag-dependent bounds, it frees the construction of new initialization
3A state q of FSM M is equivalent to a state q† of FSM M† if M starting from q and M† starting from q† have the same input-output behavior.
4Recall that ρ is normalized when ρ(host) = 0.
sequences from knowledge of which retime functions have been applied. Indeed, such a bound
does exist as exemplified below.
Proposition 23 Given a communication graph G = (V, E) and a normalized retime func-
tion ρ, let r(v) denote the minimum number of registers along any path from the host to
vertex v. Then r(v) sets an upper bound of the number of registers that can be moved for-
ward across v, i.e., −r(v) ≤ ρ(v). (Similarly, r(v) on G with reversed edges sets an upper
bound of ρ(v).)
Thus, max_v r(v), which is intrinsic to a communication graph and is independent of retiming
operations, yields a lag-independent bound.
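Because register counts on edges are nonnegative, each r(v) is an ordinary shortest-path distance from the host and can be computed with a Dijkstra-style search. A minimal sketch, with an illustrative edge-weight encoding:

```python
import heapq

def min_registers_from_host(edges, host):
    # edges: dict (u, v) -> number of registers on the edge u -> v
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w))
    dist = {host: 0}            # dist[v] will equal r(v)
    heap = [(0, host)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue            # stale heap entry
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist
```

Here max(dist.values()) is the lag-independent bound max_v r(v) of Proposition 23.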
When initialization delay is not a concern for a synchronous system, one can even relax
the above lag-independent bound by saying that the total number of registers of the system is
another lag-independent bound. As an example, suppose a system has one million registers
and its retimed version runs at one gigahertz clock frequency. Then the initialization delay
increased due to retiming is less than a thousandth of a second.
4.5.2 Initialization Affected by Retiming and Resynthesis
So far we have focused on initialization issues arising when a system is retimed only. Here
we extend our study to issues arising when a system is iteratively retimed and resynthesized.
A difficulty emerges from directly applying Lemma 22 to bound the increase of ini-
tialization sequences under iterative retiming and resynthesis. Interleaving retiming with
resynthesis makes the union bound Σ_i u_i the only available bound from Lemma 22, where
u_i denotes the lag-dependent bound for the i-th retiming operation. Essentially, inaccuracies
accumulate along with the summation of the union bound. Thus, the bound derived this
way can be far beyond what is necessary. In the light of lag-independent bounds discussed
earlier, one might hope that there may exist some constant which upper bounds the increase
of initialization sequences due to any iterative retiming and resynthesis operations. (Notice
that, when no resynthesis operation is performed, the transformation of a series of retiming
operations can be achieved by a single retiming operation. Thus a lag-independent bound
exists for iterative retiming operations.) Unfortunately, such a transformation-independent
bound does not exist as shown in Theorem 25.
Lemma 24 Any dangling state of an FSM (with implicit reset) is removable through iter-
ative retiming and resynthesis operations.
Proof. By Proposition 20, the initial states of an FSM M with implicit reset must be
non-dangling. Removing dangling states cannot affect the behavior of M. Essentially,
states without predecessor states can be eliminated with a resynthesis operation followed
by a retiming operation. To see why this is the case, let Σ be the set of input alphabets,
Q be the set of states of M, and Q† ⊆ Q be the subset of states with predecessors. A
resynthesis operation can rewrite the original transition functions ~δ : Σ × Q → Q as a
composition of three parts, ~δ = ~∆−1 ◦ ~∆ ◦ ~δ, where ~∆ : Q → Q†, ~∆−1 : Q† → Q, and
~∆−1 ◦ ~∆ is an identity mapping. (Notice that ~∆−1 exists because the states Q\Q† have empty
pre-image.) Retiming registers to the positions in-between ~∆ and ~∆−1 eliminates states
with no predecessors. (The retiming operation is possible because the output functions of
M can take the intermediate valuation after ~∆ ◦ ~δ and before ~∆−1
as its state input.) Therefore, with iterative retiming and resynthesis, dangling states are
removable.
Theorem 25 Given a synchronous hardware system and an arbitrary constant c, there
always exist retiming and resynthesis operations on the system such that the length increase
of the initialization sequence exceeds c.
Proof. The theorem follows from Lemma 24 and the reversibility of the transformation of
retiming and resynthesis.
Since the mentioned union bound is inaccurate and requires knowing the applied retime
functions, we are motivated to compute the exact5 length increase of initialization sequences
without knowing the history of retiming and resynthesis operations.
The length increase can be derived by computing the length, say n, of the longest
transition paths among the dangling states because applying an arbitrary6 input sequence
of length greater than n drives the system to a non-dangling state. The length n can be
obtained using a symbolic computation. By breadth-first search, one can iteratively remove
states without predecessor states until a greatest fixed point is reached. The number of the
performed iterations is exactly n.
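The computation just described has a direct explicit analogue: peel off predecessor-free states layer by layer and count the rounds, which equals the longest transition path among the dangling states. A sketch under an illustrative explicit STG encoding (trans[(q, a)] gives the next state):

```python
def init_length_increase(states, alphabet, trans):
    # Greatest fixed point of deleting predecessor-free states; the
    # number of deletion rounds is the exact length increase n.
    states, n = set(states), 0
    while True:
        with_pred = {trans[(q, a)] for q in states for a in alphabet}
        survivors = states & with_pred
        if survivors == states:
            return n
        states, n = survivors, n + 1
```

On a chain of three dangling states feeding a self-looping state, three rounds are needed, matching the three steps required to leave the dangling region.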
4.6 Related Work
Optimization capability.
The closest to our work on the optimization power of retiming and resynthesis is
[RSSB98], where the optimization power was unfortunately over-stated contrary to the
claimed exactness. The mistake resulted from the claim that any 2-way switch operation
is achievable using 2-way merge and 2-way split operations (see [RSSB98] for their defi-
nitions). Figure 4.3 shows a counterexample illustrating a 2-way switch operation that is
not achievable with 2-way merge and split operations. (Essentially, a restriction needs to
be imposed — under any input assignment, the next state of a current state to be split
should be unique.) In fact, only 2-way merge and split operations are essential. Aside from
5The exactness is true under the assumption that the initialization sequence of the original FSM is given as a black box. If the initialization mechanism is explored, more accurate analysis may be achieved.
6Although exploiting some particular input sequence may shorten the length increase, it complicates thecomputation.
[Drawing of Figure 4.3, two STGs (i) and (ii), omitted in this transcription.]
Figure 4.3. The STG in (i) is transformable to the STG in (ii) by a 2-way switch operation while the reverse direction is not transformable. Since the operation is not reversible, it falls beyond the transformation power of retiming and resynthesis.
this minor error, no constructive algorithm was known to determine if two given FSMs are
equivalent under iterative retiming and resynthesis. In addition, not discussed were the
creation and annihilation of dangling states, which we show to be crucial in initializing
synchronous hardware systems.
Verification complexity.
Ranjan in [Ran97] examined a few verification complexities for cases under one retiming
operation and up to two resynthesis operations with unknown transformation history. The
complexity for the case under an arbitrary number of iterative retiming and resynthesis
operations was left open, and was conjectured in [ZSA98] to be easier than the general
equivalence checking problem. We disprove the conjecture.
Initialization sequences.
For systems with explicit reset, the effect of retiming on initial states was studied in
[TB93, ESS96, SMB96]. In the explicit reset case, incorporating resynthesis with retiming
does not contribute additional difficulty. Note that, for systems with explicit-reset registers,
forward moves of retiming are preferable to backward moves in maintaining equivalent initial
states, contrary to the case for systems with implicit-reset registers. To prevent backward
moves, Even et al. in [ESS96] proposed an algorithm to find a retime function such that the
maximum lag among all vertices is minimized. Interestingly enough, their algorithm can be
easily modified to obtain minimum lag-dependent bounds on the increase of initialization
sequences. As mentioned earlier, explicit reset can be seen as a special case of implicit reset
when reset circuitry is explicitly represented in the communication graph. Hence, the study
of the implicit reset case is more general, and is subtler when considering resynthesis in
addition to retiming.
Pixley in [Pix92] studied the initialization of synchronous hardware systems with im-
plicit reset in a general context. Leiserson and Saxe studied the effect of retiming on
initialization sequences in [LS83], where a lag-dependent bound was obtained and was later
improved by [EMMRM97, SPRB95]. We show a lag-independent bound instead. In recent
work [MSM04], a different approach was taken to tackle the initialization issue raised by
retiming. Rather than increasing initialization sequence lengths, a retimed system was
further modified to preserve its original initialization sequence. This modification might
incur area/performance penalties and could nullify the gains of retiming operations. In
addition, the modification requires expensive computation involving existential quantification,
which limits the scalability of the approach to large systems. In comparison, prefixing an
arbitrary input sequence of a certain length to the original initialization sequence provides
a much simpler solution (without modifying the system) to the initialization problem.
On the other hand, we extend our study to the unexplored case of iterative retiming
and resynthesis, and show the unboundability of the increase of initialization sequences.
Finally, our exact analysis on the increase of initialization sequences is applicable to the
case of iterative retiming and resynthesis and improves the bound of [EMMRM97, SPRB95].
4.7 Summary
This chapter demonstrated some transformation invariants under retiming and resyn-
thesis. Three main results about retiming and resynthesis were established. First, an algo-
rithm was presented to construct a canonical representative of an equivalence class of FSMs
transformed under retiming and resynthesis. It was extended to determine if two FSMs are
transformable to each other under retiming and resynthesis. Second, a PSPACE-complete
complexity was proved for the above problem when the transformation history of retim-
ing and resynthesis is unknown. Hence to reduce complexity (from PSPACE-complete to
NP-complete) it is indispensable to maintain transformation history or to check equivalence
after every retiming or resynthesis operation. Third, the effects of retiming and resynthesis
on initialization sequences were studied. A lag-independent bound was shown on the length
increase of initialization sequences of FSMs under retiming; in contrast, unboundability
was shown on the case under retiming and resynthesis. In addition, an exact analysis on
the length increase was presented. We believe our results will enhance the practicality of
retiming and resynthesis for the optimization of synchronous hardware systems.
Chapter 5
Equivalence Verification
The state-explosion problem limits formal verification to small- or medium-sized sequen-
tial circuits partly because the sizes of binary decision diagrams (BDDs) heavily depend on
the number of variables dealt with. In the worst case, a BDD size grows exponentially with
the number of variables. Thus, reducing this number can possibly increase the verification
capacity. In particular, this chapter shows how sequential equivalence checking can be done
in the sum state space.
Given two finite state machines M1 and M2 with numbers of state variables m1 and m2,
respectively, conventional formal methods verify equivalence by traversing the state space
of the product machine, with m1 + m2 registers. In contrast, this chapter introduces a
different possibility, based on partitioning the state space defined by a multiplexed machine,
which can have merely max{m1, m2} + 1 registers. This substantial reduction in state
variables potentially enables the verification of larger instances. Experimental results show
the approach can verify benchmarks with up to 312 registers, including all of the control
outputs of microprocessor 8085.
5.1 Introduction
Sequential equivalence checking plays a crucial role in very large scale integration de-
sign to ensure functional correctness. It has been greatly advanced since symbolic tech-
niques [CBM89] were used in formal methods based on state-space traversal. However,
these formal methods do not scale easily with the increasing complexity of system designs
due to the state explosion problem: the state space grows exponentially with the number
of state variables. Therefore, recent research [CQS00, KB01] has
focused on reducing the number of state variables by retiming [LS83], with the hope that
verification can be conducted on the reduced circuits. Unlike these circuit-based transfor-
mations, this chapter reduces the register count in the verification construction. Moreover,
the verification itself is structure-independent, that is, neither circuit similarities nor register
correspondences [vE00] are assumed.
In this chapter, we reason about sequential equivalence based on the fact that two FSMs
are equivalent if, and only if, their initial states are equivalent. To identify equivalent states
of an FSM, BDDs [Bry86] were used in [HJJ+96, LTN90] and [Pix92] for symbolic exe-
cution. The fixed-point computation in [LTN90] and [Pix92] is carried out on a product
machine constructed over two identical copies of the FSM. As shown in [Fil91], when the
product machine is constructed over two FSMs under comparison, the same computation
can be used for sequential equivalence checking. In contrast to the approach of [LTN90]
and [Pix92], the computation in [HJJ+96] for equivalent state identification is done on
the original FSM without constructing a product machine. However, an n-state FSM in
[HJJ+96] is represented by n shared n-terminal BDDs. This representation may be expen-
sive in practice. In contrast, we identify equivalent states by applying BDD-based functional
decomposition [LPV93] to keep the computation in the original FSM without any special
representation. Since the computation is in a single FSM, we introduce the multiplexed ma-
chine to combine two FSMs into one. Thereby we can transform the sequential equivalence
checking problem to the state equivalence problem of a multiplexed machine.
Our equivalence checking technique avoids state traversal, by partitioning the state
space based on equivalence relations among states [Koh78]. Rather than reason about
the sequential equivalence in the product state space of two sequential machines under
comparison, we achieve this attempt in the sum state space. Compared to product-machine-
based verification, the proposed approach almost halves the number of state variables. More
precisely, checking the equivalence of two n-input FSMs M1 and M2 with m1 and m2 state
variables respectively, our method can keep the total number of variables to be at most
n + maxm1,m2+ 1 + dlog2(minm1,m2+ 1)e. Hence, the BDD sizes in our verification
technique could be much smaller than those in product-machine-based techniques.
Unlike previous verification techniques of [CBM89] and [Fil91], the efficiency of our
approach depends heavily on the encountered number of equivalence classes of states. Since
each equivalence class is represented by a BDD node, our approach is limited to instances
with less than ∼106 equivalence classes per output. Fortunately, it is applicable in most
practical applications. On the other hand, because the number of equivalence classes in
the reachable state subspace is an invariant, our technique tends to be more robust than
previous approaches in verifying different implementations of a design. For high-speed
designs, registers are mostly added to reduce cycle time not to increase the number of
equivalence classes. (For example, backward retiming cannot increase equivalence classes.)
In such designs, our proposed technique should be preferable to those of [CBM89] and
[Fil91].
The contributions of this chapter are as follows. We apply BDD-based functional de-
composition to the identification of equivalent states. Two important consequences are the
elimination of universal and existential quantifications, and the possible simplification with
respect to the reachable state subspace. To extend the above computation for sequential
equivalence checking, we introduce the multiplexed machine such that the verification can
be done in the sum state space. In addition, several techniques are proposed to enhance the
computational robustness; several properties are analyzed to contrast different verification
techniques.
The remainder of this chapter is organized as follows. Preliminaries and definitions are
given in Section 5.2. After introducing the technique for equivalent state identification in
Section 5.3, we present our equivalence checking algorithm in Section 5.4 and analyze its
properties in Section 5.5. Experimental results are then given in Section 5.6, and conclusions
in Section 5.8.
5.2 Definitions, Notation and Preliminaries
5.2.1 Equivalence Relations and Partitions
An equivalence relation is a binary relation on a set S satisfying the reflexive, symmetric,
and transitive laws; it induces a unique partition π on S. The partition is a set π =
{E1, E2, . . .} of subsets of S such that
• Ei ≠ ∅ for all i;
• Ei ∩ Ej = ∅ for all i ≠ j;
• E1 ∪ E2 ∪ · · · = S.
Each Ei forms an equivalence class. Two elements in the same class satisfy the equivalence
relation, but elements in different classes do not. For two equivalence relations R1 and R2
with partitions π1 and π2, respectively, R1 ⊆ R2 if, and only if, π1 is a refinement of π2,
denoted as π1 ⪯ π2, i.e., each equivalence class of π1 is contained in some equivalence class
of π2. On the other hand, the product of two arbitrary partitions, π1 and π2, denoted
π1 · π2, is the partition corresponding to the relation R1 ∩ R2, i.e. two elements are in the
same equivalence class of π1 · π2 if, and only if, they are both in one equivalence class of π1
as well as in one of π2.
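These definitions are directly executable on small examples. A sketch, representing a partition as a list of frozensets (an illustrative choice, not a notation from this chapter):

```python
def partition_product(pi1, pi2):
    # Two elements share a class of pi1 * pi2 iff they share a class
    # in pi1 and also share a class in pi2.
    blocks = {}
    for e in set().union(*pi1):
        b1 = next(b for b in pi1 if e in b)
        b2 = next(b for b in pi2 if e in b)
        blocks.setdefault((b1, b2), set()).add(e)
    return [frozenset(b) for b in blocks.values()]

def refines(pi1, pi2):
    # pi1 refines pi2: every class of pi1 lies inside some class of pi2.
    return all(any(b1 <= b2 for b2 in pi2) for b1 in pi1)
```

The product of two partitions always refines both of them, as the definition requires.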
Given an FSM M, its output and transition functions define an equivalence relation,
denoted ≡_M, and, thus, induce a partition, denoted π_M, over the state space of M =
(Q, I, Σ, Ω, ~δ, ~ω). In this chapter, we concentrate on equivalence relations on a set of states.
Two states q1, q2 ∈ Q are equivalent, satisfying q1 ≡_M q2, if, and only if, by using them as
initial states, no input sequence can result in different output sequences. To approximate
state equivalence, we define a k-equivalence relation, denoted ≅^k_M, and say two states q1
and q2 are k-equivalent, satisfying q1 ≅^k_M q2, if, and only if, they are indistinguishable
under all input sequences with length up to k. Also, say two states (or FSMs) are k-distinguishable
if k is the shortest length of the input sequences that differentiate them. We
denote the partition associated with ≅^k_M as π^k_M. To derive π_M from the approximation, we
have π_M = π^k_M if π^k_M = π^{k−1}_M for large enough k, that is, a fixed point has been reached.
Similarly, we define a ⟨k⟩-equivalence relation, denoted as ≅^⟨k⟩_M. Two states q1 and q2 satisfy
q1 ≅^⟨k⟩_M q2 if, and only if, by using them as initial states, the outputs at the kth step are
equal for any length-k input sequence. The corresponding partition of ≅^⟨k⟩_M is denoted π^⟨k⟩_M.
By definition, we can derive the following lemma.
Lemma 26 For a Moore machine and k ≥ 1,
π^⟨k⟩_M = { Ei | q1, q2 ∈ Ei iff ~δ(σ, q1), ~δ(σ, q2) ∈ Ej ∈ π^⟨k−1⟩_M, for any σ ∈ Σ, for some j }
and
π^⟨0⟩_M = { Ei | q1, q2 ∈ Ei iff ~ω(q1) = ~ω(q2) }.
For a Mealy machine and k ≥ 2,
π^⟨k⟩_M = { Ei | q1, q2 ∈ Ei iff ~δ(σ, q1), ~δ(σ, q2) ∈ Ej ∈ π^⟨k−1⟩_M, for any σ ∈ Σ, for some j },
π^⟨0⟩_M = {Q}
and
π^⟨1⟩_M = { Ei | q1, q2 ∈ Ei iff ~ω(σ, q1) = ~ω(σ, q2), ∀σ ∈ Σ }.
Proof. The base cases are direct results of the definition. Now we show the connection
between π^⟨k⟩_M and π^⟨k−1⟩_M. There exists a length-(k − 1) input sequence distinguishing two
states q1 and q2 at the output of the (k − 1)st step if, and only if, q1 and q2 are in different
equivalence classes of π^⟨k−1⟩_M. Therefore, two states, say q3 and q4, cannot be distinguished
at the output of the kth step if, and only if, their successor states, i.e., ~δ(σ, q3) and ~δ(σ, q4),
are in the same equivalence class of π^⟨k−1⟩_M for every σ ∈ Σ.
The connection between π^k_M and π^⟨k⟩_M is indicated in Proposition 27.
Proposition 27 For an FSM M, two states are in the same equivalence class defined by
π^k_M if, and only if, they are in the same equivalence class of π^⟨0⟩_M, of π^⟨1⟩_M, . . . , and of π^⟨k⟩_M.
5.2.2 Functional Decomposition
In this chapter, we adopt functional decomposition [RK62] for partitioning the state
space to identify equivalent states and to verify sequential equivalence. In functional de-
composition, variables of a Boolean function are divided into two disjoint subsets, the
bound set and the free set. In BDD-based functional decomposition [LPV93], bound set
variables are ordered above free set ones. A cutset of a BDD is the set of (downward) edges
which cross the boundary defined by the bound set and free set variables. A BDD node is
called an equivalence node if some edge in the cutset points to it.
For a Boolean function f(~λ, ~µ), we can interpret the specification of the bound set
variables ~λ and free set variables ~µ as a partition over the space spanned by ~λ, denoted
Λ. That is, the set of all paths from the root of a BDD to an equivalence node forms an
equivalence class. Each such set represents a subspace of Λ. Two minterms λ1 and λ2 in
Λ are equivalent under arbitrary assignments of the free set variables, i.e., ∀~µ (f(λ1, ~µ) =
f(λ2, ~µ)), if, and only if, their corresponding paths in the BDD lead to the same equivalence
node.
Given a set of Boolean functions f1, . . . , fk, which do not necessarily have common
supports, we can always expand these to the same Boolean space spanned by the union of
the input variables of all functions. Let the bound set variables be ~λ. Then, the free set
variables ~µ are all variables excluding those in ~λ. Suppose we want to find the equivalence
classes of the minterms in Λ, such that two minterms λ1 and λ2 are equivalent under
arbitrary assignments of all other variables, i.e., ∀~µ,∀i (fi(λ1, ~µ) = fi(λ2, ~µ)), if, and only
if, these two minterms are in the same equivalence class. To represent equivalence classes
by a BDD as in the single function case, we can construct a hyperfunction F [JJH01] of
f1, . . . , fk by adding ⌈log2 k⌉ new free set binary variables ~η to encode these functions.
Assume the overall free set variables become ~µ′. Thus, two minterms λ1 and λ2 in Λ satisfy
∀~µ′ (F(λ1, ~µ′) = F(λ2, ~µ′)) if, and only if, their corresponding paths in the BDD of F lead
to the same equivalence node.
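The role of equivalence nodes can be illustrated without a BDD package by grouping bound-set minterms on their explicit cofactor signatures. The following is a minimal Python sketch under that simplification; the names cofactor_classes, f1, and f2 are ours, not from the thesis, and a real implementation would of course use BDDs rather than truth-table enumeration.

```python
from itertools import product

def cofactor_classes(fns, bound_vars, free_vars):
    """Group bound-set minterms by their residual cofactor functions.

    Mimics the equivalence nodes of a BDD cut: two bound-set minterms
    are equivalent iff every function agrees on them under all
    assignments of the free-set variables.
    """
    classes = {}
    for lam in product([0, 1], repeat=bound_vars):
        # The cofactor signature: outputs of every function under every
        # free-set assignment -- it plays the role of the BDD node
        # reached below the cut (i.e., the equivalence node).
        sig = tuple(f(lam, mu) for f in fns
                    for mu in product([0, 1], repeat=free_vars))
        classes.setdefault(sig, []).append(lam)
    return list(classes.values())

# Two functions over bound variables (a, b) and free variable (c):
f1 = lambda lam, mu: (lam[0] & lam[1]) | mu[0]   # ab + c
f2 = lambda lam, mu: lam[0] & lam[1] & mu[0]     # abc
# Two classes result: {00, 01, 10} (both cofactors fixed at c and 0)
# and {11} (cofactors 1 and c).
classes = cofactor_classes([f1, f2], bound_vars=2, free_vars=1)
```

Grouping on the joint signature of f1 and f2 corresponds exactly to decomposing the hyperfunction F: the extra encoding variables ~η select which fi is observed, so two minterms share an equivalence node iff they agree on every fi.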
5.3 Identification of State Equivalence
To find a minimum-state FSM equivalent to a given one, equivalent states must be identified.
Since each state in an equivalence class (of reachable states) can represent the entire class,
the number of states of the minimum-state FSM equals the number of equivalence
classes of the original FSM. This section proposes a more direct approach than those of
[LTN90] and [Pix92] to locate equivalent states, in the sense that we deal with equivalence
classes instead of equivalence relations. Given an FSM, we show that BDD-based functional
decomposition can be exploited to extract equivalence classes of states.
Our approach is conceptually similar to that in [HJJ+96], where an FSM with n
states is represented by n shared n-terminal BDDs. However, functional decomposition
does not apply in that representation, so the basic operations differ fundamentally.
Moreover, since our computation operates directly on the output and transition
functions, its representation is more efficient than that of the previous work.
5.3.1 State Equivalence vs. Functional Decomposition
In the base cases (π0M = π〈0〉M for a Moore machine M and π1M = π〈1〉M for a Mealy
machine), the output function ~ω plays the central role, as indicated in Lemma 26. Examining
the case for a Moore machine M, we can see that ~ω serves directly as the characteristic
function for π0M. On the other hand, the characteristic function of π1M of a Mealy machine
M is not clearly indicated by ~ω. We relate BDD-based functional decomposition to the
computation of this characteristic function. In general, ~ω is composed of a set of binary
functions ω1, ω2, . . . , ωk. According to Section 5.2.2, we have to construct the hyper-
function F of ω1, ω2, . . . , ωk. The supports of F consist of three parts: state variables
~s, primary inputs ~r, and newly added variables ~η for encoding the output functions. Let
~s be the bound set variables and the rest be the free set. Accordingly, the equivalence
nodes of the BDD of F represent the equivalence classes of π1M. Paths from the root to
an equivalence node are states in a corresponding equivalence class. At this point, we
can ignore the functions represented by these equivalence nodes. That is, we can get rid
of the BDD structures below these nodes. By re-encoding these nodes using an alphabet
Ψ (introducing ⌈log2 log2 |N|⌉ binary variables suffices to re-express the set N of equivalence nodes
because ⌈log2 log2 |N|⌉ variables can generate at least |N| different functions, i.e., |N| nodes in
a BDD), we obtain a characteristic function ψ : Q → Ψ for π1M of a Mealy machine M.
Playing a similar trick, we show how to compute the characteristic function of π〈k〉M, k = 1
input: ψ, the characteristic function of π〈k−1〉; τ, the function to be composed
output: characteristic function of π〈k〉
begin
01  form hyperfunction F of ψ ◦ τ
02  build BDD of F with state variables above others
03  re-encode equivalence nodes and simplify BDD
04  return new characteristic function
end
Figure 5.1. Algorithm CompNewPartition: Compute New Partition.
or 2 for a Moore or Mealy machine, respectively. Assume ψ : Q → Ψ is the characteristic
function derived from the last iteration (for both types of machines). Then the composition
function ψ ◦ ~δ, i.e., ψ(~δ(σ, q)), plays exactly the same role as ~ω in a Mealy machine, from
which we have shown how to derive a characteristic function of π〈1〉M. Consequently, by
functional decomposition of the hyperfunction of ψ ◦ ~δ, we have a characteristic function of
π〈1〉M for a Moore machine M or of π〈2〉M for a Mealy machine M. The algorithm is summarized in Figure 5.1.
The function call is denoted as CompNewPartition. By Lemma 26, we can derive the
following theorem.
Theorem 28 Given the characteristic function of π〈k−1〉M and ~δ : Σ × Q → Q as the function
to be composed, CompNewPartition generates the characteristic function of π〈k〉M, where
k ≥ 1 (≥ 2) for a Moore (Mealy) machine M.
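Over explicit states, the effect of CompNewPartition can be sketched as follows; psi plays the role of the characteristic function and the fresh labels stand in for the re-encoded equivalence nodes. The function name and the 4-state example are ours, and the symbolic BDD machinery of Figure 5.1 is deliberately replaced by direct enumeration.

```python
def comp_new_partition(psi, delta, states, inputs):
    """Refinement step in the spirit of CompNewPartition (Figure 5.1),
    over explicit states instead of BDDs.

    psi:   dict state -> class label (the previous partition)
    delta: transition function delta(sigma, q)
    Returns a fresh label map: two states share a label iff their
    successors agree classwise on every input symbol -- the role of
    the re-encoded equivalence nodes.
    """
    signatures = {}
    for q in states:
        # Signature = labels of all successors, one per input symbol.
        sig = tuple(psi[delta(sigma, q)] for sigma in inputs)
        signatures.setdefault(sig, []).append(q)
    return {q: label
            for label, group in enumerate(signatures.values())
            for q in group}

# 4-state example: psi is the output-induced partition {0,1 | 2,3};
# under input 1, states 0 and 1 move to 2 and 3, while 2 and 3 move to 0.
psi = {0: 0, 1: 0, 2: 1, 3: 1}
delta = lambda sigma, q: q if sigma == 0 else {0: 2, 1: 3, 2: 0, 3: 0}[q]
new = comp_new_partition(psi, delta, [0, 1, 2, 3], [0, 1])
```

On this example the refinement keeps the blocks {0, 1} and {2, 3}, since equivalent states have classwise-equal successors.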
5.3.2 Algorithm for Equivalent State Identification
To identify equivalent states, we have to compute πkM until it equals πk−1M; then πM =
πkM. Theorem 31 provides three alternatives to derive πM. Its proof is supported by
Lemma 29, which is restated as Lemma 30.
Lemma 29 Consider an FSM with transition function ~δ : Σ × Q → Q. Let π1 and π2 be
two arbitrary partitions on Q. For q1, q2 ∈ Q,
~δ(σ, q1) and ~δ(σ, q2) are in the same equivalence class of π1 and of π2 for any σ ∈ Σ
if, and only if,
~δ(σ, q1) and ~δ(σ, q2) are in the same equivalence class of π1 · π2 for any σ ∈ Σ.
Proof. Let R1 and R2 be the corresponding equivalence relations of π1 and π2, respectively.
(=⇒) The given condition implies (~δ(σ, q1), ~δ(σ, q2)) ∈ Ri, i = 1, 2, ∀σ ∈ Σ. Thus,
(~δ(σ, q1), ~δ(σ, q2)) ∈ R1 ∩ R2, ∀σ ∈ Σ. Since R1 ∩ R2 is the equivalence relation of π1 · π2,
the proof follows.
(⇐=) From (~δ(σ, q1), ~δ(σ, q2)) ∈ R1 ∩ R2, ∀σ ∈ Σ, we obtain (~δ(σ, q1), ~δ(σ, q2)) ∈ Ri,
i = 1, 2, ∀σ ∈ Σ. That is, ~δ(σ, q1) and ~δ(σ, q2) are in the same equivalence class of π1
and of π2 for any σ ∈ Σ.
Lemma 30 For an FSM with transition function ~δ, assume π1 and π2 are two partitions
over the state space. Let ψ1, ψ2 and ψ1·2 be the characteristic functions of π1, π2, and
π1 · π2, respectively. For characteristic functions ψ′1 = CompNewPartition(ψ1, ~δ), ψ′2 =
CompNewPartition(ψ2, ~δ), and ψ′1·2 = CompNewPartition(ψ1·2, ~δ), their corresponding
partitions satisfy π′1 · π′2 = π′1·2.
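The product of two partitions used in Lemmas 29 and 30 amounts to pairing labels, as in this small sketch (explicit label maps instead of characteristic functions; the names are ours):

```python
def product_partition(psi1, psi2, states):
    """Product pi1 * pi2 over explicit states: two states share a class
    iff they agree under both label maps -- the relation R1 ∩ R2."""
    return {q: (psi1[q], psi2[q]) for q in states}

# Example: pi1 = {0,1 | 2,3} and pi2 = {0,2 | 1,3}; their product
# separates all four states.
psi1 = {0: 'a', 1: 'a', 2: 'b', 3: 'b'}
psi2 = {0: 'x', 1: 'y', 2: 'x', 3: 'y'}
prod = product_partition(psi1, psi2, [0, 1, 2, 3])
```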
Theorem 31 Given an FSM M, for a positive integer k,

πkM = πk−M · πk−1M   (5.1)

    = πk−M · π0M, if M is a Moore machine
    = πk−M · π1M, if M is a Mealy machine   (5.2)

    = π〈0〉M · π〈1〉M · · · π〈k〉M   (5.3)

where πk−M = {Ei | q1, q2 ∈ Ei iff ~δ(σ, q1) and ~δ(σ, q2) are in the same equivalence class of
πk−1M for any σ ∈ Σ}.
Proof. We prove these equations by the order (5.3), (5.2), (5.1).
Equation (5.3): By the definition of πkM, states in an equivalence class are indistinguishable
under length-k input sequences. According to Proposition 27, no outputs at steps from
0 to k can distinguish two states if, and only if, the states lie in the same equivalence class
of π〈0〉M, of π〈1〉M, . . . , and of π〈k〉M. Thus, by Lemma 29, the states stay in the same equivalence
class of π〈0〉M · π〈1〉M · · · π〈k〉M.
Equation (5.2): Following the result of (5.3), we get πk−1M = π〈0〉M · π〈1〉M · · · π〈k−1〉M. Suppose
we use the characteristic function of πk−1M and transition function ~δ as the inputs to
CompNewPartition. By Theorem 28 and Lemma 30, the output of the algorithm is πk−M, that
is, the characteristic function of π〈1〉M · π〈2〉M · · · π〈k〉M for a Moore machine or of π〈2〉M · π〈3〉M · · · π〈k〉M
for a Mealy machine. Making a product partition with the initial partition induced by the
outputs, we derive πkM. (Note that π0M is redundant for a Mealy machine.)
Equation (5.1): By expressing πk−M and πk−1M in the product forms of π〈·〉M's as in the
proof of (5.2), the equation follows.
Based on Equations (5.1)–(5.3) for deriving πM, Figures 5.2–5.4 sketch three algorithms,
denoted IDES5.1, IDES5.2 and IDES5.3, respectively. In these pseudocodes, to “combine”
a set of characteristic functions means to apply the procedure in Figure 5.1, except that F is the
hyperfunction of the set of characteristic functions.
These algorithms terminate in a finite number of iterations. IDES5.1 and IDES5.2
converge because the partitions over finite states are refined continuously and the number
of equivalence classes grows monotonically. On the other hand, because π〈k〉M in general is
not a refinement of π〈k−1〉M, IDES5.3 cannot simply determine the fixed point by comparing
the numbers of equivalence nodes in ψ〈k−1〉M and ψ〈k〉M. Therefore, its fixed-point analysis is
more expensive: in general, one should check whether or not new equivalence classes
are created over previous partitions.
input: an FSM M = (Q, I, Σ, Ω, ~δ, ~ω)
output: characteristic function of πM
begin
01  if M is a Moore machine then ψ− := ~ω
02  else ψ− := CompNewPartition(identity fn, ~ω)
03  ψ+ := CompNewPartition(ψ−, ~δ)
04  while num. equiv. nodes of ψ+ ≠ that of ψ− do
05      ψ− := ψ+
06      ψ+ := CompNewPartition(ψ+, ~δ)
07  ψ+ := combine ψ+ and ψ−
08  return ψ+
end
Figure 5.2. Algorithm IDES5.1: Identify Equivalent States, Equation (5.1).
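An explicit-state analogue of this fixed-point loop can be sketched as follows. For brevity, the combine step (Line 07) is folded into every refinement signature, which makes each partition a proper refinement of the previous one; the function name and the example machine are ours, not the thesis's.

```python
def identify_equivalent_states(states, inputs, delta, initial_partition):
    """Explicit-state sketch in the spirit of IDES5.1 (Figure 5.2).

    initial_partition: dict state -> class label, the output-induced
    partition (pi<0> for Moore, pi<1> for Mealy). Each signature keeps
    the state's previous label, so successive partitions refine one
    another and the class count grows monotonically to the fixed point.
    """
    psi = initial_partition
    while True:
        new = {q: (psi[q],) + tuple(psi[delta(s, q)] for s in inputs)
               for q in states}
        # Fixed point: the number of classes stopped changing.
        if len(set(new.values())) == len(set(psi.values())):
            return psi  # pi_M
        psi = new

# Moore example: outputs split the states into {0,1 | 2,3}; the
# iteration confirms this is already the equivalence partition.
delta = lambda sigma, q: q if sigma == 0 else {0: 2, 1: 3, 2: 0, 3: 0}[q]
pi = identify_equivalent_states([0, 1, 2, 3], [0, 1], delta,
                                {0: 0, 1: 0, 2: 1, 3: 1})
```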
Although Figures 5.2 and 5.3 look quite similar, the major difference is in combining
two characteristic functions in Line 7 and Line 8, respectively. Despite keeping one more
characteristic function, IDES5.2 could require less memory than IDES5.1 because ψi has a
simpler BDD representation than ψ−. On the other hand, although IDES5.3 keeps all the
characteristic functions across iterations, it has maximal flexibility in arranging their
combination to reduce peak memory consumption.
5.3.3 Robust Equivalent State Identification
The limitations of equivalent state identification using BDD-based functional decompo-
sition result from the explicit representation of equivalence classes and the restricted BDD
variable ordering. In this section we propose some possible techniques to reduce BDD sizes.
Using any underapproximation of the unreachable states as the don't care set, we can assign each
such unreachable state to any equivalence class of reachable states. This flexibility enables
the simplification of characteristic functions. However, because these algorithms use the
number of equivalence classes to decide fixed points, the number of equivalence classes with
solely unreachable states should be kept as a constant during the iterations. (Note that if
input: an FSM M = (Q, I, Σ, Ω, ~δ, ~ω)
output: characteristic function of πM
begin
01  if M is a Moore machine then ψi := ~ω
02  else ψi := CompNewPartition(identity fn, ~ω)
03  ψ− := ψi
04  ψ+ := CompNewPartition(ψ−, ~δ)
05  while num. equiv. nodes of ψ+ ≠ that of ψ− do
06      ψ− := ψ+
07      ψ+ := CompNewPartition(ψ+, ~δ)
08  ψ+ := combine ψ+ and ψi
09  return ψ+
end

Figure 5.3. Algorithm IDES5.2: Identify Equivalent States, Equation (5.2).
unreachable states are not used as don’t cares, there is no such restriction.) Otherwise, we
have to complicate the fixed-point condition by testing if an equivalence class is contained in
the don’t care set. Claim 32 shows BDD constrain [CM90] is a good simplification operator
satisfying this requirement. On the contrary, BDD restrict [CM90] violates it. However, a
BDD restrict followed by a constrain is a good operation.
Claim 32 Given a Boolean function f(~λ, ~µ) with bound set and free set variables ~λ and ~µ,
respectively, assume Λ is the space spanned by ~λ. Let c(~λ) be the characteristic function of
the care set of Λ. Then, constrain(f, c) eliminates all equivalence nodes whose corresponding
equivalence classes are contained in the don’t care set, and preserves all other equivalence
nodes.
Proof. Since BDD structures below equivalence nodes are irrelevant, we can think of f as
another function g : Λ → N, where N is the set of equivalence nodes. As constrain(g, c)
has its range equal to the image {g(λ) | λ ∈ Λ, c(λ) = true}, equivalence nodes not in
this image disappear from the range and those in this image remain in the range. (On the
input: an FSM M = (Q, I, Σ, Ω, ~δ, ~ω)
output: characteristic function of πM
begin
01  ψ〈0〉 := identity function
02  if M is a Moore machine
03  then ψ〈0〉 := ~ω
04       ψ〈1〉 := CompNewPartition(ψ〈0〉, ~δ)
05  else ψ〈1〉 := CompNewPartition(ψ〈0〉, ~ω)
06  k := 1
07  while fixed point not reached do
08      k := k + 1
09      ψ〈k〉 := CompNewPartition(ψ〈k−1〉, ~δ)
10  return combine ψ〈i〉, i = 0, 1, . . . , k − 1
end

Figure 5.4. Algorithm IDES5.3: Identify Equivalent States, Equation (5.3).
other hand, the restrict operator could increase c to c′, c ⊆ c′. Although equivalence nodes
in the original image are kept, some with solely unreachable states might exist.)
To reduce the impact of the restricted BDD variable ordering, we can use the following
strategy. Within the allowed threshold of a BDD size, find the variable ordering such that
the lowest state variable is as high as possible. Treat this variable and those above it as
bound set variables; all others are the free set ones. Then, compact the BDD such that
every node under the cutset is an equivalence node. Work on the new smaller BDD, and
apply variable reordering to it based on the same strategy, incrementally throwing away
unnecessary variables. On the other hand, since this ordering restriction emerges only from
functional decomposition, arbitrary ordering can be used in other BDD manipulations.
This restricted ordering is needed only when counting the number of equivalence classes
and constraining BDDs with respect to the reachable state subspace.
Directly building a single hyperfunction of a set of (binary) functions f1, . . . , fk may
be impractical. Fortunately, this can be avoided by computing equivalence classes incre-
mentally. For instance, first perform functional decomposition on f1. For each resultant
equivalence class, use it as the care set and others as the don’t care set. Hence, there is
a greater chance to build a hyperfunction for the simplified functions of f2, . . . , fk. (If it
fails, we can deepen the recursion level to extract more don’t cares.) Conducting functional
decomposition on it, the equivalence classes in the care set are encoded using new binary
functions. In this way, BDD sizes are kept small. This approach trades time for memory.
We can also explore flexibility to reduce a partition before using it to compute a new
partition. Given two partitions π1 and π2, we say any π†1 (≠ π1) satisfying π†1 · π2 = π1 · π2 is
a reduced partition of π1 with respect to π2. In particular, a simpler reduced partition,
whose characteristic function has a smaller BDD size, is of interest. Theorem 35 states the
validity of this flexibility.
Proposition 33 If πd ⪯ πc holds for two partitions πc and πd, there exists a partition πx
such that πc · πx = πd.

Lemma 34 Assume partitions π, π′ and πc satisfy π · πc = π′ · πc. If any πd satisfies
πd ⪯ πc, then π · πd = π′ · πd.

Proof. By πd ⪯ πc and Proposition 33, there exists a partition πx such that πc · πx = πd.
From π · πc = π′ · πc, we derive π · πc · πx = π′ · πc · πx, i.e., π · πd = π′ · πd.
Assume that, after certain iterations of refinement, the overall (product) partition is πo. Let
πy be a new (not overall) partition after one more iteration, and let π†y be a reduced partition
of πy with respect to any πx such that πo ⪯ πx. (Let ψ⋄ denote the characteristic function
of π⋄ for any subscript ⋄.) We have

Theorem 35 For ψz = CompNewPartition(ψy, ~δ) and ψ′z = CompNewPartition(ψ†y, ~δ),
the equality πo · πy · πz = πo · πy · π′z holds.
Proof. Let ψ⋆·• denote the characteristic function of π⋆ · π•, for any subscripts ⋆
and •. In addition, (π)∗ is used to denote the partition with characteristic function
CompNewPartition(ψ, ~δ), for any partition π with characteristic function ψ.
By the definition of a reduced partition, π†y · πx = πy · πx. Since πo ⪯ πx, the equation
π†y · πo = πy · πo holds according to Lemma 34. So, (π†y · πo)∗ = (πy · πo)∗. From Lemma 30,
we get (π†y)∗ · (πo)∗ = (πy)∗ · (πo)∗. Since π′z = (π†y)∗ and πz = (πy)∗, we have π′z · (πo)∗ = πz · (πo)∗.
Also, from Theorem 31, πo · πy ⪯ (πo)∗. Hence, by Lemma 34, πo · πy · πz = πo · πy · π′z.
In light of Theorem 35, an algorithm can be implemented by modifying IDES5.2 and
IDES5.3 as follows. Keep a set of characteristic functions to represent the overall partition.
Compute new partitions based only on an essential partition, which consists of equivalence
classes that refine the previous overall partition. In this manner, the BDD size is kept small
and the iterative computation is sped up.
5.4 Verification of Sequential Equivalence
The proposed technique can be applied for sequential verification. The following two
propositions form the basis of our equivalence checking. The first states a property that
two equivalent FSMs must have.
Proposition 36 Given two equivalent FSMs M1 and M2 with sets of equivalence classes
πM1 and πM2, respectively, assume expunging unreachable states from πM1 and πM2 results
in π♭M1 and π♭M2, respectively. Then, there exists a bijection f : π♭M1 → π♭M2, where f
reflects the state isomorphism of M1 with M2.
On the other hand, to show the equivalence between two FSMs, Proposition 37 gives
necessary and sufficient conditions.

Proposition 37 M1 and M2, with initial states I1 and I2, respectively, are equivalent if,
and only if, there exists a bijection f : π♭M1 → π♭M2 (f reflects the state isomorphism of M1
with M2), and E2 = f(E1) with I1 ∈ E1 ∈ π♭M1 and I2 ∈ E2 ∈ π♭M2.
Based on Proposition 37, we can extend the identification of state equivalence to sequen-
tial equivalence checking. In order to pose the problem of verification as the identification
of state equivalence, the multiplexed machine is introduced.
5.4.1 Multiplexed Machine
To check equivalence between two FSMs M1 and M2 with m1 and m2 registers, respec-
tively, assume without loss of generality m2 ≥ m1. Their multiplexed machine, denoted
M1on2, is depicted in Figure 5.5. The two FSMs share the same primary inputs. Their
corresponding outputs are multiplexed as a set of global primary outputs. To minimize the
state variables of M1on2, for every next state variable of M1, we pair it arbitrarily with
one of M2. This pair is then multiplexed before being fed to a register, whose output is
then demultiplexed to recover the current state variables for M1 and M2. In addition,
one self-looped auxiliary state variable (aux ) is added which controls all multiplexers and
demultiplexers as indicated by the dotted lines in Figure 5.5. The value of aux remains
the same as its initial value. Let M1on2 select M1 and M2 when aux has values 0 and 1,
respectively. No matter what the initial value of aux is, the multiplexed machine functions
the same as M1 and M2, if they are equivalent. In the verification, we can imagine that
aux is in a superposition status, possessing values 0 and 1 simultaneously. (Note that,
without changing its functionality, the multiplexed machine can be simplified by omitting
the demultiplexers. That is, replacing each demultiplexer, we directly connect its input to
its outputs. Also, it is worth mentioning that choosing any subset of the next state variables of
M1 to be paired is valid. Suppose, in the extreme case, we choose an empty subset. Then,
aux and the multiplexers for outputs are unnecessary. The multiplexed machine, therefore,
degenerates into two separate machines. The corresponding verification is discussed
in Section 5.4.5.)

[Figure: M1 and M2 share the primary inputs; their corresponding outputs and paired
next-state variables (m1 bits paired, (m2 − m1) bits unpaired) are multiplexed under
control of the self-looped aux bit.]

Figure 5.5. Multiplexed Machine.
5.4.2 Algorithm for Sequential Equivalence Checking
Given two FSMs M1 and M2 with initial states¹ I1 and I2, respectively, without loss
of generality assume their multiplexed machine M1on2 selects M1 (M2) while aux equals
zero (one). Their equivalence can be verified based on Lemma 38, a consequence of Propo-
sition 37.
Lemma 38 M1 and M2 are equivalent if, and only if, πM1on2 has at least one reachable
state with aux bit 0 and at least one with 1 in every equivalence class containing any reachable
state, and has initial states (I1 with aux 0 and I2 with aux 1) in the same equivalence class.
¹To simplify the discussion, we assume each FSM has a single initial state. This can be straightforwardly generalized to a set of initial states.
Proof. Assume f : π♭M1 → π♭M2 reflects the state isomorphism between M1 and M2. Let
E2 = f(E1) for E1 ∈ π♭M1 and E2 ∈ π♭M2. Then, after adding the aux bit, original reachable
states (including initial states) q1 ∈ E1 and q2 ∈ E2 must be within the same equivalence
class of πM1on2. Thus, every equivalence class of πM1on2 containing any reachable state must
have at least one state (with aux 0) contributed from M1 and one (with aux 1) from M2.
By iterative refinement of the state space, as in the identification of state equivalence,
the equivalence classes of states of M1on2 can be derived once the fixed point has been
reached. According to Lemma 38, both conditions are checked. However, the first condition
implies that we need to know reachable states of both M1 and M2. Fortunately, the first
condition is redundant, i.e., as long as the second condition is satisfied, so is the first. This
property is stated in Theorem 39. As a result, reachability analysis can be completely
eliminated.
Theorem 39 M1 and M2 are equivalent if, and only if, πM1on2 has initial states, namely
I1 with aux 0 and I2 with aux 1, within the same equivalence class.
Proof. By contradiction, we show that the first condition in Lemma 38 is redundant. Assume
πM1on2 has initial states in the same equivalence class Ei, and there exists an equivalence
class E containing reachable states all with aux bits 0 (or 1, it does not matter). Therefore,
E 6= Ei. For any reachable state of E, there must be a reachable state, say q, (with aux bit
0) that transitions to it. This transition makes q have no equivalent reachable states from
M2. Therefore, the equivalence class containing q has all reachable states with aux bits
0. Continuing this argument, we conclude that Ei must exclude the state, I2 with aux 1.
Hence, a contradiction arises.
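Theorem 39 suggests the following explicit-state sketch: refine a partition of the disjoint union of the two state spaces (states tagged with the aux bit) and test whether the two initial states remain in one class. The dictionary encoding of the machines and all names here are ours; the thesis performs the same computation symbolically on the multiplexed machine.

```python
def equivalent(M1, M2, inputs):
    """Explicit-state analogue of the multiplexed-machine check.

    Each machine is a dict with 'states', 'init', 'delta', 'omega'
    (Mealy output omega(sigma, q)) -- a hypothetical encoding for
    this sketch. Returns True iff the machines are equivalent."""
    # Disjoint union: tag states with an aux bit (0 -> M1, 1 -> M2).
    union = [(0, q) for q in M1['states']] + [(1, q) for q in M2['states']]
    omega = lambda s, aq: (M1, M2)[aq[0]]['omega'](s, aq[1])
    delta = lambda s, aq: (aq[0], (M1, M2)[aq[0]]['delta'](s, aq[1]))
    # Initial partition (pi<1>): group states by outputs on every input.
    psi = {q: tuple(omega(s, q) for s in inputs) for q in union}
    while True:
        # Refine: a signature keeps the old label and adds the labels
        # of all successors, so each partition refines the previous.
        new = {q: (psi[q],) + tuple(psi[delta(s, q)] for s in inputs)
               for q in union}
        if len(set(new.values())) == len(set(psi.values())):
            break  # fixed point reached
        psi = new
    # Theorem 39: equivalent iff both initial states share one class.
    return psi[(0, M1['init'])] == psi[(1, M2['init'])]

# Example: a 2-state machine that ignores its state vs. a 1-state
# machine, both echoing the input (Mealy outputs) -- equivalent.
M1 = {'states': [0, 1], 'init': 0,
      'delta': lambda s, q: 1 - q, 'omega': lambda s, q: s}
M2 = {'states': ['a'], 'init': 'a',
      'delta': lambda s, q: 'a', 'omega': lambda s, q: s}
```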
Further, rather than checking that the condition of Theorem 39 is satisfied in the overall
partition of the state space, validity can be verified on the new partition at each iteration.
The correctness of this variant is based on Proposition 27. As the BDD representation
input: two FSMs under equivalence checking
output: yes if equivalent; no otherwise
begin
01  build the multiplexed machine M
02  compute the init. partition πiM
03  if init. states not in an equiv. class of πiM
04  then return no
05  while fixed point not reached do
06      compute πnewM
07      refine the overall partition and simplify πnewM
08      if init. states not in an equiv. class of πnewM
09      then return no
10  return yes
end
end
Figure 5.6. Algorithm: Verify Sequential Equivalence.
of the current partition is obtained, testing whether the two initial states are within the
same equivalence class takes time linear in the number of state variables. Consequently, this
check can be done efficiently in each iteration. Figure 5.6 outlines the overall procedure
for sequential equivalence checking.
Remark: In theory, k FSMs can be verified simultaneously by introducing ⌈log2 k⌉
auxiliary state variables to control the k-to-1 multiplexers of their corresponding multiplexed
machine.
5.4.3 Robust Sequential Equivalence Checking
To make the verification procedure more robust, the techniques and restrictions listed in
Section 5.3.3 are also applicable here. Instead of repeating them, this section is concerned
with those that are particular to verification.
Verifying each primary output and/or characteristic function separately could substan-
tially reduce the number of encountered equivalence classes. The numbers of equivalence
classes induced by individual primary outputs may be exponentially smaller than those
induced by all of the primary outputs. The correctness of this separation is inferred from
Lemma 29. It is interesting to notice that the cone of influence reduction has been auto-
matically taken care of due to this separation, i.e., irrelevant state variables with respect to
the considered primary output disappear.
Although reachability analysis is unnecessary, any under-estimation of unreachable
states of M1 and/or M2 can be used as a don’t care set to simplify BDD expressions
and to reduce unnecessary state refinements. Theorem 40 shows the correctness of such
simplification and the maximal don’t care set for the multiplexed machine. However, as
mentioned in Section 5.3.3, the fixed-point condition should be preserved to ensure the
algorithm terminates.
Theorem 40 The equivalence condition of M1 and M2 is invariant under don’t care sim-
plification by unreachable states of M1on2, that is, unreachable states of M1 with aux 0
together with those of M2 with aux 1.
Proof. Because state transition is irrelevant to the simplification of characteristic functions
of partitions, the proof of Theorem 39 still holds.
Assume the sets of reachable (unreachable) states of M1 and M2 are R1 (U1) and R2
(U2), respectively. Let α be the auxiliary state variable. Since the state space of M1on2 is
the direct sum of M1 and M2 distinguished with the auxiliary state variable, it consists of
four disjoint subsets ¬α ∧ R1, ¬α ∧ U1, α ∧ R2, and α ∧ U2. The reachable set of states of
M1on2 is (¬α ∧R1) ∪ (α ∧R2); the unreachable set is (¬α ∧ U1) ∪ (α ∧ U2).
Besides don’t care simplification, the partitioned state space can be reduced further
according to the following theorem.
Theorem 41 Let πkM1on2 be the partition associated with the k-equivalence relation of
M1on2. Then, equivalence checking is invariant under the reduction of πkM1on2 by collapsing
the set {E ∈ πkM1on2 | aux(q) = 0, ∀q ∈ E} of equivalence classes into one equivalence class
and collapsing {E ∈ πkM1on2 | aux(q) = 1, ∀q ∈ E} into another, where aux(q) denotes the
valuation of the aux bit of q.
Proof. It is clear that M1 and M2 are equivalent only if the collapsed equivalence classes
are unreachable from the initial states.
Since we collapse the equivalence classes of M1 and of M2 separately, states from one
machine which have transitions to these equivalence classes do not have corresponding equiv-
alent states from the other machine. Besides, as state transition relations are not affected
by the collapsing, the equivalence relation among other states, which cannot transition to
these equivalence classes, remains intact. Since the condition holds for all k ≥ 0, M1 and
M2 must be equivalent. Hence, the verification is invariant under this reduction.
Corollary 42 For two FSMs M1 and M2 with n1 and n2 equivalence classes, respectively,
the number of equivalence classes can be kept at most min{n1, n2} + 1 in our
sequential equivalence checking with the use of the collapsing process in Theorem 41.
That is, the number of variables introduced to generate equivalence nodes is at most
⌈log2 log2(min{n1, n2} + 1)⌉. Assume the n-input FSMs M1 and M2 have m1 and m2
state variables, respectively. Then, by verifying each output separately, the total number of
variables in our verification is at most (n + max{m1, m2} + 1 + ⌈log2 log2(min{n1, n2} + 1)⌉)
≤ (n + max{m1, m2} + 1 + ⌈log2(min{m1, m2} + 1)⌉).
In the construction of the multiplexed machine, a multiplexer, selecting state variables,
pairs a state variable from M1 with any unpaired one from M2. Since this pairing is ar-
bitrary (and, thus, can be adaptively changed on-the-fly), an optimization problem is to
maximize the BDD sharing between M1 and M2, and to simplify the consequent BDD
manipulations. Heuristics can be derived based on the cone of influence reduction and
functional similarity. The former pairs two state variables which are supports of two similar
sets of primary outputs; the latter pairs two state variables with similar transition func-
tionalities. In the extreme case, when comparing two identical copies of an FSM, we can
possibly reduce the BDD such that it is as if there is only one machine.
5.4.4 Error Tracing and Shortest Distinguishing Sequence
Given two states q1 and q2 which are k-distinguishable at an output of an FSM
M = (Q, I,Σ,Ω, ~δ, ~ω), this section illustrates how to derive a length-k input sequence
differentiating them.
Since q1 and q2 are k-distinguishable, their corresponding BDD paths lead to different
equivalence nodes in some characteristic function at the kth refinement. Let the functions
represented by these two BDD nodes be f1 and f2. (Notice that f1 and f2 should be the
functions before re-encoding and simplification mentioned in Section 5.3.1.) Then, any
solution, say σ∗, to (f1 xor f2) provides the kth distinguishing input vector. On the
other hand, two states q′1 = ~δ(σ∗, q1) and q′2 = ~δ(σ∗, q2) are (k − 1)-distinguishable. They
result in the distinguishability of q1 and q2 at the kth refinement. Similarly, the (k − 1)st
distinguishing input vector can be obtained. Repeating this process backward, one can
derive a shortest distinguishing sequence to trace an error.
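Over explicit states, the same shortest distinguishing sequence can also be found by a forward breadth-first search on state pairs, sketched below; the function name and the 3-state Mealy example are assumptions of ours rather than the thesis's backward BDD-based procedure.

```python
from collections import deque

def shortest_distinguishing_sequence(delta, omega, inputs, q1, q2):
    """Forward BFS over state pairs: returns a shortest input sequence
    on which the (Mealy) outputs of q1 and q2 differ, or None if the
    two states are equivalent. BFS level order guarantees minimality."""
    seen = {(q1, q2)}
    queue = deque([(q1, q2, [])])
    while queue:
        a, b, seq = queue.popleft()
        for sigma in inputs:
            if omega(sigma, a) != omega(sigma, b):
                return seq + [sigma]  # outputs differ at this step
            pair = (delta(sigma, a), delta(sigma, b))
            if pair not in seen:
                seen.add(pair)
                queue.append((pair[0], pair[1], seq + [sigma]))
    return None  # no distinguishing sequence: q1 and q2 are equivalent

# 3-state example: states 0 and 1 agree on every immediate output, but
# their successors 0 and 2 are told apart by input 1 -- a 2-step error.
delta = lambda sigma, q: {0: 0, 1: 2, 2: 2}[q]
omega = lambda sigma, q: 1 if (q == 2 and sigma == 1) else 0
trace = shortest_distinguishing_sequence(delta, omega, [0, 1], 0, 1)
```

On this example the returned trace has length 2, matching the fact that states 0 and 1 are 2-distinguishable.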
5.4.5 State-Space Partitioning on Separate Machines
The multiplexed machine is not the only construction that extends state equivalence to
machine equivalence. To prove the equivalence of M1 and M2, the state variables can be
kept disjoint while the inputs are shared. Therefore, their state spaces are partitioned
separately, but simultaneously, by maintaining two sets of shared BDDs during functional
decomposition. Again, they are equivalent if, and only if, their initial states lead to the
same equivalence node when the fixed point is reached.
In the case of the multiplexed machine, state variables of M1 and M2 are merged by
multiplexers. As mentioned in Section 5.4.3, the register pairing affects the cone of influence
and BDD manipulations. By state-space partitioning on separate machines, the interference
among state variables is removed. However the major drawback is that there is no BDD
sharing between M1 and M2 above the cutset. Notice that, although the number of state
variables in this case is the same as for the product machine, the verification is still in the
sum state space.
5.4.6 State-Space Partitioning on Product Machine
Verification by state-space partitioning works for the product machine as well. It can
be done by slight modifications of [LTN90] and [Pix92], previously known as the backward
state traversal [Fil91]. We refer to it as state-space partitioning on the product machine.
When compared to state-space partitioning on the multiplexed machine, this approach
has more flexibility in BDD variable ordering. However, this flexibility prevents simplifica-
tion by the restrict or constrain operator with respect to the reachable states because this
might corrupt the represented equivalence relation.
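In explicit-state terms, the idea behind state-space partitioning on the product machine can be sketched as a fixed-point refinement over ordered state pairs. The fragment below is a hypothetical Python illustration over tiny machines given as dictionaries, not the symbolic backward-traversal algorithm of [LTN90, Pix92]:

```python
# Explicit-state sketch: refine an "equivalence" over ordered pairs (q1, q2)
# of states of M1 and M2; the machines are assumed given as dictionaries.

def equivalent(states1, states2, inputs, d1, w1, d2, w2, init1, init2):
    """True iff init1 (of M1) and init2 (of M2) are output-equivalent."""
    # start from all pairs that agree on every immediate output
    equiv = {(a, b) for a in states1 for b in states2
             if all(w1[a, s] == w2[b, s] for s in inputs)}
    while True:
        # keep a pair only if every input keeps its successors paired
        refined = {(a, b) for (a, b) in equiv
                   if all((d1[a, s], d2[b, s]) in equiv for s in inputs)}
        if refined == equiv:              # fixed point reached
            return (init1, init2) in equiv
        equiv = refined
```

As in the text, the relation over ordered pairs need not be reflexive or symmetric; only transitivity is preserved across refinements.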
5.5 Analysis
This section consists of two parts. First, some verification properties, independent of
the implementation of a design, are analyzed. Second, we discuss circuit implementation
related effects on the sequential equivalence checking problem.
5.5.1 Implementation-Independent Aspects
Given an FSM taking a total of n iterations in state-space partitioning, its partition
structure is defined as an ordered sequence p = (p1, p2, . . . , pn), where pi denotes the
accumulated number of equivalence classes at the ith iteration. Thus, pi < pi+1, for i =
1, . . . , n− 1, and pn = pn+1.
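For a small explicit-state machine the partition structure can be computed directly. The following Python sketch (a hypothetical dictionary-based rendering, not the symbolic computation of this chapter) records the class count pi at each refinement until pn = pn+1:

```python
# Hypothetical explicit-state sketch: partition structure p = (p1, ..., pn)
# of a small Mealy machine given as dictionaries.

def partition_structure(states, inputs, delta, omega):
    # first refinement: states with identical immediate outputs share a class
    block = {q: tuple(omega[q, s] for s in inputs) for q in states}
    p = [len(set(block.values()))]
    while True:
        # split classes by the classes their successors fall into
        block = {q: (block[q],) + tuple(block[delta[q, s]] for s in inputs)
                 for q in states}
        n = len(set(block.values()))
        if n == p[-1]:                    # fixed point: p_n = p_{n+1}
            return p
        p.append(n)
```

Since refinement only splits classes, the recorded sequence is strictly increasing until the fixed point, matching pi < pi+1 above.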
Theorem 43 Any two equivalent FSMs must have the same partition structure in their
reachable state subspace.
Proof. Assume two equivalent FSMs M and M′ have sets of equivalence classes π and
π′, respectively, in their reachable state subspace. Therefore, according to Proposition 36,
there exists a bijection f : π → π′.
Suppose M and M′ have different partition structures. Since the state space is continuously
refined in the fixed-point computation, there must exist l- and k-distinguishable state
pairs (q1, q2) and (q′1, q′2), respectively, such that l > k, q1 ∈ E1 ∈ π, q′1 ∈ f(E1) ∈ π′,
q2 ∈ E2 ∈ π, and q′2 ∈ f(E2) ∈ π′. Let ~δ and ~δ′ be the transition functions of M and
M′, respectively. Then, the pairs {(qi, qj) | ∃σ. (~δ−1(σ, qi), ~δ−1(σ, qj)) = (q1, q2)} must be at
least (l − 1)-distinguishable, and at least one of them is exactly (l − 1)-distinguishable. Similarly,
the pairs {(q′i, q′j) | ∃σ. (~δ′−1(σ, q′i), ~δ′−1(σ, q′j)) = (q′1, q′2)} are at least (k − 1)-distinguishable,
and at least one of them, say (q′i∗ , q′j∗), is exactly (k − 1)-distinguishable. Let σ∗ be the input such
that (~δ′(σ∗, q′1), ~δ′(σ∗, q′2)) = (q′i∗ , q′j∗). Also, let (qi∗ , qj∗) = (~δ(σ∗, q1), ~δ(σ∗, q2)). Suppose
qi∗ ∈ Ei∗ ∈ π and qj∗ ∈ Ej∗ ∈ π. Then, q′i∗ ∈ f(Ei∗) ∈ π′ and q′j∗ ∈ f(Ej∗) ∈ π′. Now,
since (qi∗ , qj∗) is at least (l− 1)-distinguishable and (l− 1) > (k − 1), the same reasoning
applies recursively to (qi∗ , qj∗) and (q′i∗ , q′j∗). At some point of the recursion, we reach
a situation in which (q′i∗ , q′j∗) can be differentiated by some output while (qi∗ , qj∗) cannot.
This violates the base cases of Lemma 26. Hence, M and M′ must have the same partition
structure.
Therefore, partition structures in reachable state subspace form a signature for equiv-
alent FSMs. This may not be true for the entire state space. However, even without the
knowledge of state reachability, the following holds.
Theorem 44 Given two FSMs M1 and M2 converging in m and n steps, respectively, in
state-space partitioning, their product machine converges in no more than min{m, n} steps
in state-space partitioning.
Proof. In state-space partitioning, the product machine has a state “equivalence relation” ≡P
over (ordered) pairs of states (q1, q2) with q1 ∈ Q1 and q2 ∈ Q2, where Q1 and Q2 are the
sets of states of M1 and M2, respectively. Notice that ≡P may not satisfy the reflexive and
symmetric laws. Nevertheless, the transitive law holds for the ordered pairs of states. Since
the transitive law is maintained during the fixed-point computation, it is clear that once one
machine converges, so does the product machine. On the other hand, this state-partitioning
procedure does not refine the state subspace {q ∈ Q1 | (q, q2) ∉ ≡P , ∀q2 ∈ Q2} × Q2 ∪ Q1 × {q ∈ Q2 | (q1, q) ∉ ≡P , ∀q1 ∈ Q1}. Hence, it could converge in fewer than min{m, n} steps.
Theorem 45 Given two FSMs M1 and M2 converging in m and n steps, respectively,
in state-space partitioning, their multiplexed machine converges in exactly max{m, n} steps
in state-space partitioning. With the state space reduced by Theorem 41 in each iteration,
the computation converges in the same step as state-space partitioning on their
product machine.
Proof. The construction of the multiplexed machine is designed to match corresponding
equivalence classes betweenM1 andM2. State-space partitioning on the combined machine
has no effect on the partition of the state subspace spanned by any individual FSM. Once
each subspace of M1 and M2 has reached a fixed point in state partitioning, so has the
space of their combined machine. Therefore, the combined machine converges in exactly
max{m, n} steps.
When the state space is reduced by Theorem 41 in each iteration, the fixed-point com-
putation does not refine the state subspace spanned by the collapsed equivalence classes.
The state space is partitioned in the same way as that of the product machine. Hence, the
multiplexed and product machines converge in the same step in state-space partitioning.
In contrast, for state traversal of an FSM, although we can similarly define a traversal
structure to be the sequence of numbers of reached states, we cannot use it as a signature.
Moreover, even if the traversal depths for two FSMs are known, they merely provide a lower
bound on the depth of the product machine. No strong argument like Theorems 44 and 45
is possible.
The following theorem shows the connection between the number of refinements in state
partitioning and the depth of state traversal.
Theorem 46 Given two k-distinguishable FSMs M1 and M2, both state-traversal- and
state-partition-based approaches differentiate them at the kth step.
Proof. Since state traversal on the product machine of M1 and M2 implicitly enumerates
all possible transitions, any discrepancy is observed in the fewest possible steps, namely k.
On the other hand, for state partitioning, the initial states of M1 and M2 must
be k-distinguishable in their combined machine, so the kth refinement differentiates them. The theorem follows.
As a consequence, Corollary 47 follows.
Corollary 47 Given two FSMs M1 and M2, let M1×2 be their product machine. Assume
np is the number of refinements in state partitioning on M1×2, and nt is the depth of
state traversal on M1×2. Then, min{np, nt} is an upper bound on the number of iterations
required for equivalence checking.
In other words, following Corollary 47, if np > nt, we can conclude the equivalence of
M1 and M2 in nt refinements of state partitioning on M1×2. Similarly, if np < nt, their
equivalence can be confirmed in np steps of state traversal on M1×2. Also, Corollary 48
follows immediately from Theorem 44.
Corollary 48 Given two FSMs, M1 and M2, converging in m and n steps, respectively,
in state-space partitioning, their equivalence can be concluded in no more than min{m, n} steps of state partitioning on their multiplexed machine.
5.5.2 Implementation-Dependent Aspects
Retiming [LS83] is an important technique in sequential circuit optimization. There
are two types of atomic moves in retiming, namely forward (from inputs to outputs) moves
and backward (from outputs to inputs) moves across functional blocks. Here we investigate
their effects on the number of equivalence classes in the state space. Suppose an FSM Mb is
retimed from another FSM Mf using only backward moves across a functional block with
function f : Qb → Qf , where Qb and Qf are the state spaces of Mb and Mf , respectively.
(Equivalently, Mf is retimed from Mb using forward moves across the functional block
with function f .)
Proposition 49 Two states qb and q′b of Mb are equivalent, i.e., qb ≡Mb q′b, if and only if
their corresponding states f(qb) and f(q′b) of Mf are equivalent, i.e., f(qb) ≡Mf f(q′b).

Proposition 50 If qb ≡Mb q′b, then the corresponding states of qb, q′b, f(qb), and f(q′b) in
the multiplexed machine Mb⋈f of Mb and Mf are in the same equivalence class of Mb⋈f .
Theorem 51 The number of equivalence classes of Mb is not greater than that of Mf .
Proof. Since f is a total function, i.e., f is well defined for all states of Mb, the theorem
follows from Proposition 49.
Theorem 52 The number of equivalence classes of Mf is greater than that of Mb if, and
only if, there exists a state q of Mf such that f−1(q) = ∅ and q ≢Mf f(qb), ∀qb ∈ Qb.
Proof. The theorem follows from Proposition 49.
Arguments similar to those of Theorems 51 and 52 were used in [SPRB95] in discussing
the validity of retiming.
5.6 Experimental Results
Using the VIS [BHSV+96] environment, we compared three equivalence checking tech-
niques, namely,
STPM – state traversal on the product machine,
SPPM – state partitioning on the product machine, and
SPMM – state partitioning on the multiplexed machine.
The experiments were conducted on a Linux machine with a 700-MHz Pentium III Xeon
CPU and 2 GB of RAM.
For STPM and SPPM, the VIS sequential verification command is used. Dynamic
variable reordering is turned on and the hybrid method [MKRS00], considered the state-of-
the-art technique for image computation, is used. For SPMM, variable reordering is enabled
when appropriate.
To demonstrate the relative power of the three techniques, we first compare a set of
benchmark circuits against themselves. (Although combinational checking suffices in this
circumstance, we are only interested in sequential methods.) In general, combinational
equivalence checking should be tried in situations where there is structural similarity. The
techniques of this chapter aim at situations where there is no such similarity. The self-
comparison benchmarks are used to compare the methods on a large set of examples. Care is
taken not to exploit similarity by using a method for pairing state variables which considers
only the cones of influence of the primary outputs. To further emphasize that no similarity
is being exploited, a second set of experiments is done comparing circuits against their
retimed versions.
An argument detailing why self-comparison is sufficient for the experiments is Proposition
36, which states that two different implementations M1 and M2 must have corresponding
equivalence classes in the reachable set of states. Thus, the reachable state
spaces of M1⋈2, M1⋈1, and M2⋈2 all have the same number of equivalence classes. Also,
even if M1 and M2 have incomparable numbers of equivalence classes in the whole state
spaces, by Corollary 42, the number of equivalence classes encountered by SPMM is at most
min{n1, n2} + 1, where ni is the number of equivalence classes of Mi, i = 1, 2. Thus, conclusions
drawn from self-comparison experiments should remain valid for general comparisons.
In Tables 5.1 and 5.2, we provide the characteristics of the benchmark circuits; the
empirical results follow in Tables 5.3 and 5.4. Table 5.1 gives the profiles of the selected
benchmarks from iscas’89, lgsynth’91, texas’97, VIS and texas. Columns 2–4 indicate the
number of inputs, outputs, and registers respectively. In addition, the number of reachable
states and the corresponding traversal depth are provided in Column 5. (Here, we reset
uninitialized state variables to zero.)
Information about the equivalence classes is included in Table 5.2. As mentioned
in Section 5.4.3, we can verify sequential equivalence by examining each primary output
separately instead of treating them as a whole. The advantage is that we can reduce the
peak memory required to record the encountered equivalence classes. To provide strong
evidence, Table 5.2 contains two parts of data. The first part, “Overall Partition,” in
Columns 2 and 3 shows the number of equivalence classes induced by all primary outputs.
The number in the following parentheses indicates the depth of refinement in the corre-
sponding fixed-point computation. In contrast, the second part, “Worst Partial Partition,”
in Columns 4 and 5 lists the largest number of equivalence classes induced by some pri-
mary output. The number in the following parentheses indicates the maximum depth of
refinement among all outputs. Circuit s991 is an example where separating the verification
task per output yields a substantial reduction in the number of encountered equivalence
classes. In the extreme case, the number of equivalence classes induced by all outputs
can be exponentially (in the number of outputs) larger than those induced by individual
outputs. Usually, the separation of verification tasks lengthens the required refinement.
However, as BDD manipulations could be simplified substantially, the runtime can still be
reduced in most cases. Further, within each part, we compare the number (in the column
marked “whole”) of equivalence classes in the whole state space to the number (in the col-
umn marked “reach”) of equivalence classes in the reachable subspace. As can be seen, in
most instances, this subset is fairly small when compared to the entire space. Since SPMM
directly benefits from these reductions, it can easily verify some large instances which are
unverifiable for STPM and SPPM as indicated in Tables 5.3 and 5.4, where the results for
SPPM and STPM report the best of verifying combined outputs and verifying each out-
put separately. From experience, SPPM has better results in verifying combined outputs
for most circuits while SPMM has the opposite results. This might be explained by the
fact that the performance of SPPM is not directly related to the encountered number of
equivalence classes, while that of SPMM is.
From the experiment in Table 5.3, we observe that, for SPMM, using a monolithic BDD
as a characteristic function suffices for all verifiable benchmarks. The only exception is sbc,
where an array of characteristic functions needs to be maintained. Because using multiple
characteristic functions usually complicates the fixed-point computation, it is in general
more time consuming. Also, we find that SPMM takes more time than STPM and SPPM
for circuits, such as s382, s420.1, etc., with numerous equivalence classes and deep refining
processes. It is understandable because SPMM enumerates each equivalence class in every
refining process.
For circuits like s420.1, where the depths of traversal and refinement are both exponential
in the size of the inputs, none of the three techniques is adequate. However, for
s420.1, since the depth of refinement is half of that of traversal, SPPM is about twice as
fast as STPM. Notice that, as analyzed in Section 5.5.1, although the product machine has
a traversal depth of 65535 (due to self-comparison), we can conclude the equivalence by
traversing states at the 32768th step even before the fixed point is reached.
For cbp and minmax series of circuits, where depths are shallow, STPM and SPPM
perform much better than SPMM, which needs to take care of numerous equivalence classes
as listed in Table 5.2. On the other hand, for minmax circuits, as discussed in [Fil91],
SPPM has a polynomial complexity in input sizes while STPM has an exponential one. In
comparison, SPPM is the best choice for these cases.
Circuits key and bigkey are another extreme, which have a few equivalence classes.
SPMM verifies them quite easily while both STPM and SPPM fail. In general, for control
logic, SPMM performs much better than the other two. Microprocessor 8085 is an example,
where SPMM verifies all the outputs except the 16 outputs of the address bus. (The results of
8085 in Tables 5.2 and 5.3 exclude these unverifiable outputs.) Other examples are control,
IFetchControl2, and IFetchControl3. On the other hand, due to the large number of
outputs in IFetchControl2, IFetchControl3, clma, sbc, etc., SPMM takes a long time to
verify them because it processes each output one at a time. Fortunately, these tasks can
be verified in parallel to minimize the total completion time.
In Table 5.4, the equivalence between a circuit and its retimed implementation is
checked. Retimed circuits were obtained by using sis [SSL+92], except for texas bench-
marks, s641-retime, and tbk-retime. Other circuits, which are included in Table 5.3
but absent from Table 5.4, either take too long for sis to retime, or have incompatible
initial states created by the retiming. Table 5.4 suggests that SPMM does not benefit
particularly when self-comparison is done. (This is because state variables are paired only
by the cone of influence of the outputs; otherwise, corresponding state variables would be
prevented from being paired together, and doing so would destroy the BDD sharing in the
experiments of self-comparison.) This supports that the results of Table 5.3 are relevant for comparing the
three methods. Also, observe from Table 5.4 that SPMM is relatively stable when moving
from self-comparison to comparing against retimed versions. For example, for s526 and
s526n, the results in Tables 5.3 and 5.4 are similar for SPMM, but STPM and SPPM yield
substantial variances. The stability of SPMM derives from the fact that it depends mainly
on the maximum number of registers in the two designs plus the number of equivalence
classes encountered.
Another view of Tables 5.3 and 5.4 is shown in Table 5.5, where the second and third
columns denote the numbers of wins in terms of smaller memory and time usage, respec-
tively, and the last gives the number of examples on which the method failed. This analysis
indicates that SPMM is, on average, more efficient and more rugged than the other two
methods.
We did not experiment with the equivalence checking between inequivalent circuits.
However the expectation is that, according to Theorem 46, all of the three verification
techniques can report the nonequivalence in the same iteration, say in the nth iteration. To
generate a counterexample, on the other hand, both STPM and SPPM have time complexity
O(n) while SPMM has O(n²). This difference results from the fact that, in SPMM, the
input information of the previous iterations is thrown away when equivalence nodes are
re-expressed using newly introduced variables.
To summarize the results, the major limitation of SPMM is the number of equivalence
classes encountered during verification. In contrast, STPM and SPPM do not suffer the
same limitation because equivalence classes are not explicitly represented in the BDDs. For
a circuit with a not-so-deep depth of refinement and a “reasonable” number (up to about 10⁶) of
equivalence classes per output, SPMM has a good chance of verifying it. On the other hand,
due to the fact that the number of equivalence classes in the reachable state subspace is
invariant under different implementations, SPMM tends to be the most robust verification
technique.
5.7 Related Work
5.7.1 Computation of State Equivalence
Computing state equivalence is a key ingredient of FSM state minimization. Before
the implicit symbolic approach was proposed in [LTN90, Pix92], the explicit enumerative
approach [Koh78, HU79] had been the traditional way of doing it. The computation pro-
posed by Lin et al. [LTN90] builds a product machine of the considered FSM with itself
and reasons about the state equivalence using a relation over pairs of states. In contrast,
we demonstrated another symbolic computation which deals with equivalence classes rather
than equivalence relations. In essence, the prior approach represents the equivalence rela-
tions with BDD paths; our approach represents equivalence classes with BDD nodes. As its
strength, the prior approach imposes no particular limitation on the number of equivalence
classes to be handled. However, as its weakness, it is often unable to handle FSMs with
many state variables; the performance and capability of the approach are unpredictable es-
pecially for medium-sized FSMs. In comparison, our approach is more robust, but limited
to cases where the number of equivalence classes cannot exceed a few million.
5.7.2 Verification of FSM Equivalence
As mentioned earlier, our verification technique aimed for general sequential equivalence
checking. Structural similarities between two FSMs to be verified were not explored. The
forward [BCM90] and backward [Fil91] state traversals are the closest structure-independent
equivalence checking techniques to ours, especially the latter.
Also, there have been extensive studies on structure-dependent equivalence checking,
e.g., just to name a few [vE00, QCC+00, SWWK04]. In [vE00], signal correspondences
were identified and merged to simplify equivalence checking. In [QCC+00], two transition
systems under comparison need to be similar up to a one-to-one mapping between equivalent
states. Such a mapping is discovered by a reachability analysis that converges on their
combinational similarity. The structural traversal method in [SWWK04] is an over-approximative
reachability analysis based on circuit manipulations.
5.8 Summary
This chapter consists of two parts: the identification of equivalent states and the ver-
ification of sequential equivalence. We show that the former can be done efficiently by
BDD-based functional decomposition. By introducing the multiplexed machine, we can
verify sequential equivalence by means of state partitioning in the sum space, a new way
of doing formal equivalence checking. In high-speed designs, a large portion of the registers
serve timing speedup rather than increase the number of equivalence classes of states.
In such cases, state-space partitioning becomes preferable to state-space traversal.
A major advantage of the new verification technique is the substantial reduction in the
number of state variables. Compared to product-machine-based techniques, our approach
almost halves the number of state variables. Although there is an intrinsic restriction on
BDD variable ordering, several techniques are proposed to overcome it and to minimize the
BDD sizes. These make our algorithm even more promising.
Table 5.1. Profiles of Benchmark Circuits
Circuit         In    Out   Reg   Reach (Depth)
s1196           14    14    18    2616 (2)
s298            3     6     14    218 (18)
s349            9     11    15    2625 (6)
s400/s444       3     6     21    8865 (150)
s420.1          18    1     16    65536 (65535)
s499            1     22    22    22 (21)
s526/s526n      3     6     21    8868 (150)
s641            35    24    19    1544 (6)
s713            35    23    19    1544 (6)
s953            16    23    29    504 (10)
s967            16    23    29    549 (10)
s991            65    17    19    524288 (3)
bigkey          262   197   224   1.17e+67 (2)
clma            382   82    33    158908 (411)
mm4a            7     4     12    832 (3)
mm9a            12    9     27    2.25e+7 (3)
mm9b            12    9     26    2.25e+7 (3)
mult16a         17    1     16    65535 (16)
sbc             40    56    28    154593 (9)
control         33    21    35    119 (6)
IFetchControl2  27    38    59    2.50e+8 (27)
IFetchControl3  27    38    61    1.00e+9 (27)
parsepack       9     65    70    3.70e+19 (9)
parsesys        9     65    312   2.21e+48 (103)
8085            18    27    193   N/A
bpb             9     4     36    6.87e+10 (32)
cbp 16 4        17    17    16    131072 (1)
cbp 32 4        33    33    32    4.29e+9 (1)
key             258   193   228   N/A
minmax5         8     5     15    12032 (3)
minmax10        13    10    30    1.79e+8 (3)
tbk-retime      6     3     49    2048 (3)
Table 5.2. Characteristics of Equivalence Classes of Benchmark Circuits
             Overall Partition              Worst Partial Partition
Circuit      whole (rfn)     reach (rfn)    whole (rfn)     reach (rfn)
s1196        82944 (2)       1509 (2)       96 (3)          56 (3)
s298         8061 (16)       135 (12)       249 (24)        118 (20)
s349         18608 (5)       1801 (5)       248 (8)         35 (6)
s400         608448 (93)     8865 (93)      17174 (183)     8597 (183)
s420.1       65536 (32768)
s444         608448 (93)     8865 (93)      17174 (183)     8597 (183)
s499         4.19e+6 (1)     22 (1)         24 (21)         22 (21)
s526         1.43e+6 (119)   8868 (93)      43068 (199)     8597 (183)
s526n        1.43e+6 (119)   8868 (93)      43068 (199)     8597 (183)
s641         294912 (1)      1480 (1)       24750 (8)       1248 (8)
s713         294912 (1)      1480 (1)       24750 (8)       1248 (8)
s953         N/A             504 (2)        42 (10)         35 (10)
s967         N/A             549 (2)        42 (10)         35 (10)
s991         327680 (1)      10 (2)
bigkey       N/A             4 (2)
clma         N/A             N/A            5950 (178)
mm4a         3616 (1)        712 (1)        452 (2)         217 (1)
mm9a         N/A             522244 (2)     260617 (1)
mm9b         N/A             N/A            260617 (1)
mult16a      65536 (16)      65535 (16)     65536 (16)      65535 (16)
sbc          N/A             N/A            23048 (10)
control      N/A             43 (2)         14 (6)          8 (5)
IF’hC’l2     N/A             N/A            9434 (37)
IF’hC’l3     N/A             N/A            8442 (39)
parsepack    N/A             18 (9)         10 (9)
parsesys     N/A             164 (21)       N/A
8085∗        N/A             309619 (28)    N/A
bpb          N/A             512 (3)
cbp 16 4     65536 (1)
cbp 32 4     4.29e+9 (1)
key          N/A             64 (7)         N/A
minmax5      30784 (1)       5520 (1)       1924 (2)        965 (2)
minmax10     1.07e+9 (1)     N/A            2.09e+6 (2)     1.04e+6 (1)
tbk-retime   16 (1)          16 (3)
Table 5.3. Sequential Equivalence Checking between Identical Circuits
            STPM               SPPM               SPMM
            mem      time      mem      time      mem      time
Circuit     (Mb)     (sec)     (Mb)     (sec)     (Mb)     (sec)
s1196       28.3     2.3       25.1     1.5       12.4     2.1
s298        7.8      0.2       16.4     1.0       8.7      0.9
s349        12.7     1.5       25.4     1.3       10.8     1.9
s400        12.8     4.9       43.1     4.8       56.6     448.8
s420.1      45.1     669.2     37.9     290.9     62.0     2.98e+5
s444        12.7     4.8       42.2     4.5       55.8     438.9
s499        299      157.1     16.5     1.0       8.6      0.2
s526        22.5     7.1       117.0    293.8     50.4     358.2
s526n       16.6     4.4       82.7     150.9     50.4     357.8
s641        11.9     0.7       27.4     0.6       39.5     3.3
s713        11.8     0.7       27.6     0.6       39.2     6.4
s953        11.3     0.1       27.9     0.8       11.9     1.1
s967        11.4     0.9       27.5     0.8       10.3     0.5
s991        35.4     26.4      64.9     11.6      10.7     0.3
bigkey      >2G      N/A       >2G      N/A       21.4     1.3
clma        142      134.6     >2G      N/A       117      4.30e+4
mm4a        8.6      0.3       7.7      0.1       15.3     0.9
mm9a        82.1     1.24e+5   58.9     16.6      244      4673.7
mm9b        >2G      N/A       >2G      N/A       693      3.12e+4
mult16a     8.5      0.2       8.4      0.1       87.8     126.1
sbc         >2G      N/A       >2G      N/A       537      1.29e+5
control     191      79.4      46.1     7.9       23.3     1.1
IF’hC’l2    >2G      N/A       N/A      >1.0e+6   258      1.37e+4
IF’hC’l3    >2G      N/A       N/A      >1.0e+6   259      1.38e+4
parsepack   >2G      N/A       64.9     110.9     19.0     1.2
parsesys    >2G      N/A       458      2.91e+4   102      45.9
8085∗       >2G      N/A       >2G      N/A       793      3.06e+5
bpb         >2G      N/A       51.7     62.9      46.1     17.2
cbp 16 4    18.0     0.3       18.0     0.3       75.2     70.2
cbp 32 4    25.0     0.8       24.7     0.7       >2G      N/A
key         >2G      N/A       >2G      N/A       68.5     15.4
minmax5     27.3     0.8       28.1     0.6       26.0     12.2
minmax10    151      1694.9    47.2     2.3       733      8.75e+4
tbk-retime  >2G      N/A       >2G      N/A       84.2     112.3
Table 5.4. Sequential Equivalence Checking between Different Implementations of Same Design

                           STPM             SPPM             SPMM
                           mem     time     mem     time     mem     time
Circuit              Reg   (Mb)    (sec)    (Mb)    (sec)    (Mb)    (sec)
s208.1/s208.1-retime 8/16  12.4    0.3      11.8    0.5      8.9     2.3
s298/s298-retime     14/34 12.7    0.3      21.8    1.7      9.6     0.7
s386/s386-retime     6/15  12.6    0.2      13.0    0.3      7.3     0.1
s499/s499-retime     22/41 437     196.9    690     401.3    10.7    1.8
s510/s510-retime     6/34  13.6    0.4      19.5    1.8      12.3    0.4
s526/s526-retime     21/58 48.3    24.3     237     2012.2   55.4    552.5
s526n/s526n-retime   21/64 48.4    41.5     204     5238.7   53.2    325.9
s526-retime/s526n-retime 58/64 >2G N/A      982     1.26e+5  54.5    469.3
s641/s641-retime     19/18 37.5    1.9      41.1    1.9      29.3    9.7
s991/s991-retime     19/42 345     2431.9   139     760.8    74.3    134.6
mult16a/mult16a-retime 16/106 >2G  N/A      >2G     N/A      N/A     >1.0e+6
tbk/tbk-retime       5/49  56.1    10.3     70.1    79.2     46.2    6.6
Table 5.5. Overall Statistics
Method   Wins in Memory   Wins in Time   Failed
STPM     11               12             13
SPPM     7                15             10
SPMM     28               21             2
Chapter 6
Verification Reduction
The existence of functional dependency among the state variables of a state transition
system has been identified as a common cause of inefficient BDD representation in formal
verification. Eliminating such dependency from the system compacts the state space and
may significantly reduce the verification cost. Despite this importance, how to detect func-
tional dependency without or before knowing the reachable state set remains a challenge.
This chapter tackles this problem by unifying two closely related but scattered studies —
detecting signal correspondence and exploiting functional dependency. Prior work on ei-
ther subject is a special case of our formulation. Unlike previous approaches, we detect
dependency directly from transition functions rather than from reached state sets. Thus,
reachability analysis is not a necessity for exploiting dependency. In addition, our proce-
dure can be integrated into reachability analysis as an on-the-fly reduction. Preliminary
experiments demonstrate promising results of extracting functional dependency without
reachability analysis; dependencies that were underivable before, due to the limitation of
reachability analysis on large transition systems, can now be computed efficiently. When
applied to verification, reachability analysis shows substantial reductions in both
memory and runtime.
6.1 Introduction
Reduction [Kur94] is an important technique in extending the capacity of formal veri-
fication. This chapter is concerned with property-preserving reduction [CGP99], where the
reduced model satisfies a property if and only if the original model does. In particular, we
focus on reachability-preserving reduction for safety property verification using functional
dependency.
The existence of dependency among state variables frequently occurs in state tran-
sition systems in both high-level specifications and gate-level implementations [STB96].
Such dependency may cause inefficient BDD [Bry86] representation in formal verification
[HD93] and can be used also in logic minimization [LN91a, STB96]. Thus, its detection
has attracted extensive research in both domains (e.g., see [BCM90, LN91a, HD93, vEJ96,
STB96, YSBO99]). The essence of all prior efforts [BCM90, LN91a, vEJ96, STB96] can
be traced back to functional deduction [Bro03], where variable dependency was derived
from the characteristic function of a reached state set. However, state transition systems of
practical applications are often too complex to compute their reachable states, even though
these systems might be substantially reduced after variable dependency is known. An im-
provement was proposed in [vEJ96] to exploit the dependency from the currently reached
state set during every iteration of a reachability analysis. However, the computation may
still be too expensive and may simplify subsequent iterations very little.
To avoid such difficulty, we take a different path to exploit dependency. The observa-
tion is that dependency among state variables may originate from the dependency among
transition functions1. Some variable dependency can be concluded more efficiently using
transition functions rather than the characteristic function of a reached state set. Therefore,
the computation requires only local image computation. Because the derived dependency is
an invariant, it can be used by any BDD- or SAT-based model checking procedure to reduce
verification complexity. Since not all dependency can be discovered this way, due to the
imperfect information about state reachability, this method is approximative. To
complete the approximative computation, our procedure can be embedded into reachability
analysis as an on-the-fly detection. Reachability analysis is thus conducted on a reduced
model in each iteration. Our formulation leads to a unification of two closely related, but
scattered, studies on detecting signal correspondence [Fil92, vE00] and exploiting functional
dependency [HD93, vEJ96].
The chapter is organized as follows. After preliminaries and notation are given in
Section 6.2, our formulation of functional dependency and the corresponding calculations
are introduced in Section 6.3. Section 6.4 applies the developed algorithms to reachability
analysis as an on-the-fly reduction. Experimental results are provided in Section 6.5 to
demonstrate practical advantages. In Section 6.6, a closer comparison with prior work is
detailed. Section 6.7 concludes and outlines some future research directions.
6.2 Preliminaries and Notation
As a notational convention, the unordered version of a vector (an ordered set) ~v =
〈v1, . . . , vn〉 is written as {~v} = {v1, . . . , vn}. In this case, n is the cardinality (size) of both
~v and {~v}, i.e., |~v| = |{~v}| = n. Also, when a vector ~v is partitioned into k sub-vectors
~v1, . . . , ~vk, the convention 〈~v1; . . . ;~vk〉 denotes that ~v1, . . . , ~vk are combined into one vector
with a proper reordering of elements to recover the ordering of ~v.

Footnote 1: From experience, it is commonly recognized that, to represent state transition systems, transition functions are preferable to transition relations; complex transition systems are often compactly representable with transition functions but not with transition relations. This chapter assumes that transition functions are the underlying representation of state transition systems. Consequently, our formulation is not directly applicable to nondeterministic transition systems. The corresponding extension can apply the MOCB technique proposed in [HD93].
This chapter assumes, without loss of generality, that multi-valued functions are replaced
with vectors of Boolean functions. The image of a Boolean functional vector ~ψ over a subset
C of its domain is denoted as Image(~ψ,C); the range of ~ψ is denoted as Range(~ψ). Let
ψ : Bn → B be a Boolean function over variables x1, . . . , xn. The support set of ψ is
Supp(ψ) = {xi | (ψ|xi=0 ⊕ ψ|xi=1) ≠ false}. For a characteristic function F(~x) over the set ~x of Boolean variables, its projection on ~y ⊆ ~x is defined as F[~y/~x] = ∃xi ∈ ~x\~y. F(~x). Also, we denote the identity function and its complement as = and =†, respectively.
A state transition system is modelled as an FSM M = (Q, I, Σ, Ω, ~δ, ~ω). Since symbols and functions are in binary representations in this chapter, M will be specified, instead, with a five-tuple (I, ~r, ~s, ~δ, ~ω), where ~r (resp. ~s) is the vector of Boolean variables that encodes the input alphabet (resp. the states).
6.3 Functional Dependency
Dependency for a state transition system can be formulated in two steps. We first
define combinational dependency among a collection of functions. The formulation is
then extended to sequential dependency for a state transition system.
6.3.1 Combinational Dependency
Given two Boolean functional vectors ~φ : Bl → Bm and ~ϕ : Bl → Bn over the same domain, we are interested in rewriting ~φ in terms of a function of ~ϕ. The condition under which such
a rewrite is feasible can be captured by a refinement relation ⊑ ⊆ (Bl → Bm) × (Bl → Bn), defined as follows.
Definition 8 Given two Boolean functional vectors ~φ : Bl → Bm and ~ϕ : Bl → Bn, ~ϕ refines ~φ in C ⊆ Bl, denoted as ~φ ⊑C ~ϕ, if ~φ(a) ≠ ~φ(b) implies ~ϕ(a) ≠ ~ϕ(b) for all a, b ∈ C.
In other words, ~ϕ refines ~φ in C if and only if ~ϕ is more distinguishing than ~φ in C. (As the orderings within ~φ and ~ϕ are not a prerequisite, our definition of the refinement relation applies to two unordered sets of functions as well.) In the sequel, the subscript C will be omitted from the refinement relation ⊑ when C is the universe of the domain. Based on the
above definition, the following proposition forms the foundation of our later development.
Proposition 53 Given ~φ : Bl → Bm and ~ϕ : Bl → Bn, there exists a functional vector ~θ : Bn → Bm such that ~φ = ~θ ◦ ~ϕ = ~θ(~ϕ(·)) over C ⊆ Bl if and only if ~φ ⊑C ~ϕ. Moreover, ~θ is unique when restricting its domain to the range of ~ϕ.
For ~φ = ~θ ◦ ~ϕ, we call φ1, . . . , φm ∈ ~φ the functional dependents (or, briefly, dependents), ϕ1, . . . , ϕn ∈ ~ϕ the functional independents (or, briefly, independents), and θ1, . . . , θm ∈ ~θ the dependency functions.
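On explicit truth tables, the refinement test of Definition 8 and the construction of ~θ in Proposition 53 admit a compact illustration: ~ϕ refines ~φ exactly when the value of ~ϕ(a) determines the value of ~φ(a), so building a value map from ~ϕ-values to ~φ-values either succeeds (yielding ~θ on Range(~ϕ)) or exposes a conflicting pair. The Python sketch below assumes this explicit representation and is illustrative only, not the BDD-based computation used in this chapter; all names are hypothetical.

```python
def refines(phi, varphi, C):
    """Test phi ⊑_C varphi (Definition 8) on explicit truth tables.

    phi, varphi: lists of truth tables; a truth table is a tuple of 0/1
    output bits indexed by the integer index of the input assignment.
    C: iterable of input indices (the care set).
    Returns (True, theta) where theta maps each value of ~varphi to the
    unique corresponding value of ~phi (Proposition 53), or (False, None).
    """
    theta = {}
    for a in C:
        key = tuple(f[a] for f in varphi)   # ~varphi(a)
        val = tuple(f[a] for f in phi)      # ~phi(a)
        if theta.setdefault(key, val) != val:
            # varphi agrees with an earlier point where phi differs:
            # varphi does not refine phi on C.
            return False, None
    return True, theta

# Two coordinate functions and their xor over B^2 (input index = 2*x1 + x2):
x1, x2, xor_ = (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 0)
ok, theta = refines([xor_], [x1, x2], range(4))
assert ok and theta == {(0, 0): (0,), (0, 1): (1,), (1, 0): (1,), (1, 1): (0,)}
assert refines([xor_], [x1], range(4)) == (False, None)  # x1 alone is too coarse
```

The returned `theta` is exactly the unique dependency function on Range(~ϕ); outside that range it is unconstrained, which corresponds to the don't-care set of Theorem 57 below.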
Problem formulation.
The problem of detecting (combinational) functional dependency can be formulated as
follows. Given a collection of Boolean functions ~ψ, we are asked to partition ~ψ into two parts ~φ and ~ϕ such that ~φ = ~θ(~ϕ). Hence, the triple (~φ, ~ϕ, ~θ) characterizes the functional dependency of ~ψ; we call such a triple a dependency triplet. Suppose ~ϕ cannot be reduced further in (~φ, ~ϕ, ~θ), i.e., no more functional dependents can be recognized from ~ϕ under any modification of ~θ. That is, |~ϕ| is minimized; equivalently, |~φ| is maximized. Then
the triplet maximally characterizes the functional dependency of ~ψ. In this chapter, we are
interested in computing maximal functional dependency. (Although finding a maximum
dependency might be helpful, it is computationally much harder than finding a maximal
one as it is the supremum over the set of maximal ones.)
The computation.
In the discussion below, when we mention Boolean functional vectors ~φ(~x) and ~ϕ(~x),
we shall assume that ~φ : Bl → Bm and ~ϕ : Bl → Bn with variable vector ~x : Bl. Notice that
Supp(~φ) and Supp(~ϕ) are subsets of ~x. The following properties are useful in computing
combinational dependency.
Theorem 54 Given functional vectors ~φ and ~ϕ, ~φ ⊑ ~ϕ only if Supp(~φ) ⊆ Supp(~ϕ).
Corollary 55 Given a collection of Boolean functions ψ1(~x), . . . , ψk(~x), if, for some xi ∈ ~x, ψj is the only function such that xi ∈ Supp(ψj), then ψj is a functional independent.
With the support set information, Theorem 54 and Corollary 55 can be used as a fast
screening in finding combinational dependency.
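This screening can be sketched on the same explicit truth-table representation. The helper and example functions below are illustrative assumptions, not the chapter's implementation: `supp` computes Supp(ψ) by comparing cofactors (the Boolean difference of Section 6.2), and the `forced` set applies Corollary 55.

```python
def supp(f, n):
    """Support set of truth table f over n variables; variable i is the
    bit of weight 2^(n-1-i) in the input index."""
    s = set()
    for i in range(n):
        bit = 1 << (n - 1 - i)
        # xi is in the support iff the two cofactors f|xi=0 and f|xi=1 differ
        if any(f[a] != f[a | bit] for a in range(2 ** n) if not a & bit):
            s.add(i)
    return s

# psi1 = x0, psi2 = x0 & x1, psi3 = x2 over B^3 (index = 4*x0 + 2*x1 + x2):
psi1 = tuple((a >> 2) & 1 for a in range(8))
psi2 = tuple(((a >> 2) & 1) & ((a >> 1) & 1) for a in range(8))
psi3 = tuple(a & 1 for a in range(8))
supports = [supp(f, 3) for f in (psi1, psi2, psi3)]
assert supports == [{0}, {0, 1}, {2}]

# Corollary 55: a function whose support contains a variable occurring in
# no other support must be a functional independent.
forced = {j for j, s in enumerate(supports)
          if any(all(x not in t for k, t in enumerate(supports) if k != j)
                 for x in s)}
assert forced == {1, 2}   # x1 pins psi2; x2 pins psi3
```

Only psi1 remains a candidate dependent, so the expensive refinement tests need be run for it alone.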
Theorem 56 Given functional vectors ~φ and ~ϕ, ~φ ⊑ ~ϕ if and only if |Range(~ϕ)| = |Range(〈~φ, ~ϕ〉)|.
Theorem 57 Let θi ∈ ~θ be the corresponding dependency function of a dependent φi ∈ ~φ. Let Θ0_i = {~ϕ(~x) | φi(~x) = 0} and Θ1_i = {~ϕ(~x) | φi(~x) = 1}. Then φi ⊑ ~ϕ if and only if Θ0_i ∩ Θ1_i = ∅. Also, θi has Θ0_i, Θ1_i, and Bn \ (Θ0_i ∪ Θ1_i) as its off-set, on-set, and don't-care set, respectively. That is, θi(~ϕ(~x)) = φi(~x) for all valuations of ~x.
From Theorem 56, we know that the set ~ϕ of functional independents is as distinguishing
as the entire set ~φ ∪ ~ϕ of functions. Theorem 57, on the other hand, shows a way of
computing dependency functions.
CombinationalDependency
input:  a collection ~ψ of Boolean functions
output: a dependency triplet (~φ, ~ϕ, ~θ)
begin
01  for each ψi ∈ ~ψ
02      derive the minimal refining sets E_i^1, . . . , E_i^k
03  select a minimal basis ~ϕ that refines all ψi ∈ ~ψ
04  compute the dependency functions ~θ for ~φ = ~ψ \ ~ϕ
05  return (~φ, ~ϕ, ~θ)
end

Figure 6.1. Algorithm: CombinationalDependency.
Given a collection ~ψ of Boolean functions, its maximal dependency can be computed with the procedure outlined in Figure 6.1. First, by Theorem 56, for each function ψi ∈ ~ψ we obtain the minimal subsets of ~ψ which refine ψi. Let the minimal refining subsets for ψi be E_i^1, . . . , E_i^k. (Notice that k ≥ 1 since ψi refines itself and, thus, {ψi} is one of the subsets.) The calculation can be done with local image computation because, by Theorem 54 and Corollary 55, we only need to consider subsets of functions in ~ψ whose support sets overlap with that of ψi. Second, we heuristically derive a minimal set of functional independents that refines all the functions of ~ψ. Equivalently, for each ψi, some E_i^{j_i} is selected such that the cardinality of the union ∪_{i=1}^{|~ψ|} E_i^{j_i} is minimized. This union forms the basis for representing all the other functions. That is, the functions in the union are the functional independents; the others are the functional dependents. Finally, by Theorem 57, the dependency functions are obtained with respect to the selected basis.
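A simplified, greedy rendering of this procedure on truth tables may clarify the flow. Instead of enumerating all minimal refining sets and solving the basis-selection problem of Figure 6.1, the sketch below tests each function against the remaining independents in turn, so the result is a maximal (not necessarily minimum-basis) dependency triplet. The representation and names are illustrative assumptions, not the chapter's BDD implementation.

```python
def combinational_dependency(psis):
    """Greedy maximal combinational dependency over explicit truth tables.

    psis: dict name -> truth table (tuple of 0/1 over a shared input domain).
    Returns (dependents, independents, thetas); thetas[name] is a pair
    (basis names, value map from basis values to the dependent's value).
    For brevity, a theta is not re-expressed if its basis later shrinks;
    the chapter's procedures handle that by substitution.
    """
    indep = dict(psis)
    dep, thetas = {}, {}
    domain = range(len(next(iter(psis.values()))))
    for name in list(psis):
        basis = {k: v for k, v in indep.items() if k != name}
        if not basis:
            continue                        # keep at least one independent
        table, ok = {}, True
        for a in domain:
            key = tuple(f[a] for f in basis.values())
            val = psis[name][a]
            if table.setdefault(key, val) != val:
                ok = False                  # Theorem 57: on-set/off-set clash
                break
        if ok:
            dep[name] = indep.pop(name)
            thetas[name] = (list(basis), table)
    return dep, indep, thetas

# p1 = p2 xor p3, so p1 can be rewritten over the basis {p2, p3}:
dep, indep, th = combinational_dependency(
    {"p1": (0, 1, 1, 0), "p2": (0, 0, 1, 1), "p3": (0, 1, 0, 1)})
assert set(dep) == {"p1"} and set(indep) == {"p2", "p3"}
```

Different processing orders can declare different functions dependent, mirroring the non-uniqueness of maximal dependency noted above.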
A digression.
There are other variant definitions of dependency (see [Mar60] for more examples). The functional dependency defined in [Bro03] (Section 6.9), which follows [Mar60], is too weak for our application. We thus resort to a stronger definition. As noted below,
our definition turns out to be consistent with functional deduction (see [Bro03], Chapter 8),
which is concerned with the variable dependency in a single characteristic function.
We relate our formulation to functional deduction as follows. In functional deduction,
variable dependency is drawn from a single characteristic function. Thus, to exploit the
dependency among a collection of functions ~ψ(~x), a single relation Ψ(~x, ~y) = ∧_i (yi ≡ ψi(~x)) should be built, where the yi's are newly introduced Boolean variables. In addition, to derive dependency solely among ~y, the input variables ~x should be enforced in the eliminable subset [Bro03]. With the foregoing transformation, variable dependency in functional deduction coincides with our defined functional dependency. A result similar to Theorem 57 was known in the context of functional deduction. Compared to the relation-oriented functional deduction, our formulation can be understood as more function-oriented, which is computationally more practical.
6.3.2 Sequential Dependency
Given a state transition system M = (I, ~r, ~s, ~δ, ~ω), we consider the detection of functional dependency among the set ~δ of transition functions. More precisely, detecting the sequential dependency of M is equivalent to finding ~θ such that ~δ is partitioned into two vectors: the dependents ~δφ and the independents ~δϕ. Let ~s = ~sφ ∪ ~sϕ be such that the valuations of ~sφ and ~sϕ are updated by ~δφ and ~δϕ, respectively. Then ~θ specifies the dependency of M by ~sφ = ~θ(~sϕ) and ~δφ = ~θ(~δϕ), i.e., ~δφ(~r, 〈~θ(~sϕ); ~sϕ〉) = ~θ ◦ ~δϕ(~r, 〈~θ(~sϕ); ~sϕ〉).
Sequential dependency is more relaxed than its combinational counterpart because of
the reachability nature of M. The derivation of ~θ involves a fixed-point computation, which can be carried out in two different ways, the greatest fixed-point (gfp) and the least fixed-point (lfp) approaches, with different optimality and complexity. Our discussion starts from the easier gfp computation, and continues with the more complicated lfp one. The
optimality, on the other hand, is usually improved when changing from the gfp to the lfp
computation.
Remark 3 We mention a technicality regarding the set I of initial states. In general,
the combinational dependency among transition functions may not hold for the states in
I because I may contain predecessor-free states. (A state is called predecessor-free if
it has no predecessor states.) To overcome this difficulty, a new set I ′ of initial states is
defined. Let I ′ be the set of states which are one-step reachable from I. Now, since any
state in I ′ has at least one predecessor state, the calculated dependency holds for I ′. On
the other hand, the set of reachable states from I is identical to that from I ′ except for
some states in I. In the verification of safety properties, such a substitution is legitimate as
long as states in I satisfy the underlying property to be verified. In our discussion, unless
otherwise noted, we shall assume that the set of initial states consists of only states with
predecessors.
The greatest fixed-point calculation.
Figure 6.2 illustrates the gfp calculation of sequential dependency up to three iterations. In the computation, state variables are initially treated as functionally independent of each other. Their dependency is then discovered iteratively. Combinational dependency among the transition functions is computed in each iteration. The resultant dependency functions are substituted backward in the subsequent iteration for the state variables of their corresponding functional dependents. Thereby, the transition functions and previously derived
dependency functions are updated. More precisely, let ~θ(i) be the set of derived dependency functions for ~δ(i) at the ith iteration. For j from i−1 down to 1, the set ~θ(j)(~s(i−1)_ϕ) of dependency functions is updated in order with ~θ(j)(~s(i)_ϕ) = ~θ(j)(〈~θ(j+1)(~s(i)_ϕ); . . . ; ~θ(i)(~s(i)_ϕ); ~s(i)_ϕ〉). After the updates of the ~θ(j)'s, ~δ(i+1) is set to be ~δ(i)_ϕ(~r, 〈~θ(1)(~s(i)_ϕ); . . . ; ~θ(i)(~s(i)_ϕ); ~s(i)_ϕ〉), where
Figure 6.2. The greatest fixed-point calculation of sequential dependency. The first three iterations are illustrated in (i), (ii), and (iii). In each iteration, transition functions (and thus next-state variables) are partitioned into dependent and independent parts by the computation of combinational dependency. The derived dependency is used to reduce the state space in the subsequent iteration.
~δ(i)_ϕ ⊆ ~δ corresponds to the functional independents of ~δ(i). At the (i+1)st iteration, the combinational dependency among ~δ(i+1) is computed. The iteration terminates when the size of the set of functional independents cannot be reduced further. Termination is guaranteed since |~δ(i)| decreases monotonically. At the end of the computation, the final ~θ is simply the collection of the ~θ(i)'s, and the final set of functional independents is ~δ(k)_ϕ, where k is the last iteration. The computation is summarized in Figure 6.3, where the procedure CombinationalDependencyRestore is similar to CombinationalDependency with a slight difference. It computes the dependency among the set of functions given in the first argument in the same way as CombinationalDependency. However, the returned functional
SequentialDependencyGfp
input:  a state transition system M = (I, ~r, ~s, ~δ, ~ω)
output: a dependency triplet (~δφ, ~δϕ, ~θ) for ~δ
begin
01  i := 0; ~δ(1) := ~δ
02  repeat
03      if i ≥ 2
04          for j from i−1 down to 1
05              ~θ(j)(~s(i)_ϕ) := ~θ(j)(〈~θ(j+1)(~s(i)_ϕ); . . . ; ~θ(i)(~s(i)_ϕ); ~s(i)_ϕ〉)
06      if i ≥ 1
07          ~δ(i+1)(~r, ~s(i)_ϕ) := ~δ(i)_ϕ(~r, 〈~θ(1)(~s(i)_ϕ); . . . ; ~θ(i)(~s(i)_ϕ); ~s(i)_ϕ〉)
08      i := i + 1
09      (~δ(i)_φ, ~δ(i)_ϕ, ~θ(i)) := CombinationalDependencyRestore(~δ(i), ~δ)
10  until |~δ(i)| = |~δ(i)_ϕ|
11  return (〈~δ(1)_φ; . . . ; ~δ(i−1)_φ〉, ~δ(i−1)_ϕ, 〈~θ(1); . . . ; ~θ(i−1)〉)
end

Figure 6.3. Algorithm: SequentialDependencyGfp.
dependents and independents are the corresponding functions given in the second argument
instead of those in the first argument.
Notice that the final result of the gfp calculation may not be unique since, in each iteration, there are several possible choices for a maximal functional dependency. Once a choice has been made, it fixes the dependency functions for the state variables that are declared as dependents. Thereafter, the dependency becomes an invariant throughout the computation since the derivation is valid for the entire set of states with predecessors. For the same reason, the gfp calculation may be too conservative. Moreover, the optimality of the gfp calculation is limited because the state variables are initially treated as functionally independent of each other. This limitation becomes apparent especially when the dependency to be discovered is between two state transition systems (e.g., in equivalence checking). To discover more dependency, we need to adopt a least fixed-point strategy and refine the dependency iteratively.
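The shape of one gfp iteration can be illustrated on a toy two-variable system. This explicit-enumeration Python sketch omits the backward substitution of earlier ~θ(j)'s and the BDD machinery of Figure 6.3; all names are hypothetical illustrations, not the chapter's implementation.

```python
from itertools import product

# Toy FSM: input r, state bits s1, s2, where delta_s2 is always the
# complement of delta_s1 -- so one state variable is sequentially redundant.
delta = {
    "s1": lambda r, s: r ^ s["s1"],
    "s2": lambda r, s: 1 - (r ^ s["s1"]),
}

def combo_dependency(delta, state_vars, constraint):
    """One iteration of the gfp loop: combinational dependency among the
    transition functions, over state valuations satisfying `constraint`."""
    indep, thetas = dict(delta), {}
    for v in list(delta):
        basis = [u for u in indep if u != v]
        if not basis:
            continue                      # keep at least one independent
        table, ok = {}, True
        for bits in product((0, 1), repeat=1 + len(state_vars)):
            r, s = bits[0], dict(zip(state_vars, bits[1:]))
            if not constraint(s):
                continue                  # outside the current subspace
            key = tuple(indep[u](r, s) for u in basis)
            val = delta[v](r, s)
            if table.setdefault(key, val) != val:
                ok = False
                break
        if ok:                            # v's next state is a function of
            thetas[v] = (basis, table)    # the other next states
            del indep[v]
    return indep, thetas

# Iteration 1: no constraint yet; s1 is found dependent with theta = NOT.
indep, thetas = combo_dependency(delta, ["s1", "s2"], lambda s: True)
assert set(indep) == {"s2"}
assert thetas["s1"] == (["s2"], {(1,): 0, (0,): 1})
# A second iteration, restricted to the subspace s1 = NOT(s2), finds no
# further reduction, so the gfp is reached with one independent variable.
```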
Figure 6.4. The least fixed-point calculation of sequential dependency. The first three iterations are illustrated in (i), (ii), and (iii). In each iteration, transition functions (and thus next-state variables) are partitioned into dependent and independent parts by the computation of combinational dependency. The derived dependency is used to reduce the state space in the subsequent iteration.
The least fixed-point calculation.
Figure 6.4 illustrates the lfp calculation of sequential dependency up to three iterations. In the computation, unlike the gfp one, the initial dependency among state variables is exploited maximally based on the set of initial states. The dependency is then strengthened iteratively until a fixed point has been reached. The set of functional independents tends to increase during the iterations, in contrast to the decrease in the gfp calculation.
Consider the computation of initial dependency. For the simplest case, when |I| = 1,
any state variable sϕ can be selected as the basis. Any other variable is replaced with
either =(sϕ) or =†(sϕ), depending on whether its initial value equals that of sϕ or not. For
arbitrary I, the initial variable dependency can be derived using functional deduction on
the characteristic function of I. (As noted in Remark 3, excluding predecessor-free states
from I reveals more dependency.)
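The |I| = 1 case can be sketched directly: one variable serves as the basis, and every other variable is assigned the identity or complemented identity according to the initial values. The function below is an illustrative assumption (it tags = and =† as 'id' and 'not'), not the chapter's implementation.

```python
def initial_dependency(init_state):
    """Initial dependency for a single initial state (|I| = 1).

    init_state: dict var -> 0/1. The first variable is chosen as the basis;
    every other variable v is =(basis) ('id') if its initial value equals
    the basis value, and =†(basis) ('not') otherwise.
    """
    items = list(init_state.items())
    basis_var, basis_val = items[0]
    theta = {v: ("id" if val == basis_val else "not") for v, val in items[1:]}
    return basis_var, theta

basis, theta = initial_dependency({"s1": 0, "s2": 0, "s3": 1})
assert basis == "s1"
assert theta == {"s2": "id", "s3": "not"}
```

For arbitrary I, as the text notes, this role is played by functional deduction on the characteristic function of I.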
For the iterative computation, transition functions are updated in every iteration by
eliminating dependent state variables with the latest dependency functions. Combinational
dependency is then obtained for the new set of transition functions. Unlike the gfp iterations, the functional dependency obtained in the ith iteration may not be an invariant for the following iterations because the derived dependency may be valid only in the state subspace spanned by ~s(i−1)_ϕ. As the state subspace changes over the iterations due to different selections of independent state variables, the dependency may need to be rectified. Notice that the set of functional independents may not increase monotonically during the iterations. This non-convergent phenomenon is due to the existence of don't-care choices of ~θ(i), in addition to the imperfect information about the currently reachable state set. Therefore, additional requirements need to be imposed to guarantee termination. Here we require that, after a certain number of iterations, the set of independent state variables increases monotonically until ~θ(i) can be reused in the next iteration, that is, until the fixed point is reached. The algorithm is outlined in Figure 6.5. To simplify the presentation, it contains only the iterations where ~s(i)_ϕ increases monotonically. Procedure CombinationalDependencyReuse is the same as CombinationalDependency except that it tries to maximally reuse the dependency functions provided in its second argument.
In theory, the optimality of the lfp calculation lies somewhere between that of the gfp
calculation and that of the most general computation with reachability analysis. (The optimality of the lfp calculation subsumes that of the gfp counterpart because the dependency discovered by the gfp calculation can always be an invariant during the lfp calculation. However, in practice, the lfp calculation may not maintain such dependency throughout
SequentialDependencyLfp
input:  a state transition system M = (I, ~r, ~s, ~δ, ~ω)
output: a dependency triplet (~δφ, ~δϕ, ~θ) for ~δ
begin
01  i := 0; (~s(0)_φ, ~s(0)_ϕ, ~θ(0)) := InitialDependency(I)
02  repeat
03      i := i + 1
04      ~δ(i) := ~δ(~r, 〈~θ(i−1)(~s(i−1)_ϕ); ~s(i−1)_ϕ〉)
05      (~δ(i)_φ, ~δ(i)_ϕ, ~θ(i)) := CombinationalDependencyReuse(~δ(i), ~θ(i−1))
06  until ~θ(i) = ~θ(i−1)
07  return (~δ(i)_φ, ~δ(i)_ϕ, ~θ(i))
end

Figure 6.5. Algorithm: SequentialDependencyLfp.
its iterative computations if the reachable state space is approximated along an inappropriate path.) Since not all dependency in M can be detected by the lfp procedure, due to the imperfect information about the reachable states, the algorithm is incomplete in detecting dependency. To make it complete, reachability analysis should be incorporated. We postpone this integration to the next section and phrase it in the context of verification reduction.
Remark 4 Notice that when the ~θ(i)'s are restricted to consist of only identity functions and/or complemented identity functions, the refinement relation ⊑ among transition functions reduces to an equivalence relation. In this case, the lfp calculation of sequential dependency reduces to the detection of equivalent state variables. Hence, detecting signal correspondence [vE00] is a special case of our formulation.
6.4 Verification Reduction
Here we focus on using reduction for safety property verification, where reachability
analysis is the core computation. The verification problem asks if a state transition system
M = (I, ~r, ~s, ~δ, ~ω) satisfies a safety property P , denoted as M |= P , for all of its reachable
states.
Suppose that (~δφ, ~δϕ, ~θ) is a dependency triplet of ~δ; let ~sφ and ~sϕ be the corresponding
state variables of ~δφ and ~δϕ, respectively. To represent the reachable state set, either
~s or ~sϕ can be selected as the basis. Essentially, R(~s) = Expand(R⊥(~sϕ), (~sφ, ~sϕ, ~θ)) = R⊥(~sϕ) ∧ ∧_i (sφ_i ≡ θ_i(~sϕ)), where R and R⊥ are the characteristic functions representing the reachable state sets in the total space and, respectively, in the reduced space spanned by ~sϕ. Let P(~s) denote the states that satisfy P. Checking whether R(~s) ⇒ P(~s) is equivalent to checking whether R⊥(~sϕ) ⇒ P⊥(~sϕ), where P⊥(~sϕ) = P(〈~θ(~sϕ); ~sϕ〉). Hence,
the verification problem can be carried out solely over the reduced space. As noted in
Remark 3, the set I of initial states might require special handling.
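On an explicit-state toy instance, the reduction of the property check can be sketched as follows: R⊥ is kept over the independents only, and P is evaluated after expanding each reduced state through ~θ. The names are hypothetical, and explicit sets stand in for the characteristic functions used in the chapter.

```python
def holds_on_reduced(R_reduced, theta, P):
    """Check R(~s) => P(~s) entirely on the reduced basis: every reduced
    state is expanded through theta before evaluating P, i.e. the check
    R⊥(s_phi) => P⊥(s_phi) with P⊥(s_phi) = P(<theta(s_phi); s_phi>)."""
    return all(P(theta(sp), sp) for sp in R_reduced)

# Toy instance: one dependent bit that is always the negation of the first
# independent bit; P forbids that pair of bits from both being 1.
theta = lambda sp: (1 - sp[0],)
R_reduced = {(0, 0), (0, 1), (1, 1)}
P = lambda dep, sp: not (dep[0] == 1 and sp[0] == 1)
assert holds_on_reduced(R_reduced, theta, P)
```

The dependent bits never need to be enumerated: the reduced set has 3 states here, while the expanded R would range over a strictly larger variable set.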
For a given dependency, reachability analysis can be carried out solely upon the reduced basis. The validity of the given dependency can be tested in every iteration of the reachability analysis, as was done in [HD93]. Below we concentrate on the cases where the dependency is not given, and show how the detection of functional dependency can be embedded into the reachability analysis and simplify it.
To analyze the reachability of a transition system with unknown dependency, two approaches can be taken. One is to find the sequential dependency with the aforementioned gfp and/or lfp calculations, and then perform reachability analysis on the reduced state space based on the obtained dependency. The other is to embed the dependency detection into the reachability analysis as an on-the-fly reduction. Since the former is straightforward, we only detail the latter. Figure 6.6 sketches the algorithm. Procedure CombinationalDepen-
ComputeReachWithDependencyReduction
input:  a state transition system M = (I, ~r, ~s, ~δ, ~ω)
output: the set R of reachable states of M
begin
01  i := 0; (~s(0)_φ, ~s(0)_ϕ, ~θ(0)) := InitialDependency(I)
02  I⊥0 := I[~s(0)_ϕ/~s]
03  R⊥0 := I⊥0; F⊥0 := I⊥0
04  repeat
05      i := i + 1
06      ~δ(i) := ~δ(~r, 〈~θ(i−1)(~s(i−1)_ϕ); ~s(i−1)_ϕ〉)
07      (~δ(i)_φ, ~δ(i)_ϕ, ~θ(i)) := CombinationalDependencyReach(~δ(i), ~θ(i−1), R⊥i−1)
08      T⊥i := Image(~δ(i)_ϕ, F⊥i−1)
09      ~sν := ~s(i)_ϕ \ ~s(i−1)_ϕ; ~θν := ~sν's corresponding functions in ~θ(i−1)
10      R⊥i−1 := Expand(R⊥i−1, (~sν, ~s(i−1)_ϕ, ~θν))
11      R⊥i−1 := R⊥i−1[~s(i)_ϕ / ~s(i)_ϕ ∪ ~s(i−1)_ϕ]
12      F⊥i := simplify T⊥i with R⊥i−1 as don't care
13      R⊥i := R⊥i−1 ∪ T⊥i
14  until R⊥i = R⊥i−1
15  return Expand(R⊥i, (~s(i)_φ, ~s(i)_ϕ, ~θ(i)))
end

Figure 6.6. Algorithm: ComputeReachWithDependencyReduction.
dencyReach is similar to CombinationalDependencyReuse with two exceptions. First, the derived dependency is with respect to the reached state set provided in the third argument. Second, the set of independent state variables need not increase monotonically since the termination condition is taken care of by the reached state sets. In each iteration of the state traversal, the previously reached state set R is adjusted (by the expansion and projection operations) to a new basis according to the derived dependency triplet.
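A toy, explicit-state sketch of reachability with dependency reduction may help fix ideas. For brevity it restricts ~θ to identity functions (the signal-correspondence special case of Remark 4) and detects the dependency on the reached set after the traversal, rather than re-projecting the basis in every iteration as Figure 6.6 does; all names are hypothetical illustrations.

```python
def reach_with_reduction(delta, state_vars, init, n_inputs=1):
    """Explicit forward reachability, followed by a light reduction:
    whenever two state variables carry identical values on every reached
    state, the later one is dropped from the basis (functional dependency
    with theta restricted to the identity function).

    delta: dict var -> callable(r, state_dict) -> 0/1
    init:  set of state tuples ordered like state_vars.
    """
    reached, frontier = set(init), set(init)
    while frontier:
        new = set()
        for st in frontier:
            s = dict(zip(state_vars, st))
            for r in range(2 ** n_inputs):
                nxt = tuple(delta[v](r, s) for v in state_vars)
                if nxt not in reached:
                    new.add(nxt)
        reached |= new
        frontier = new
    n = len(state_vars)
    dependent = {j for j in range(n) for i in range(j)
                 if all(st[i] == st[j] for st in reached)}
    independents = [v for k, v in enumerate(state_vars) if k not in dependent]
    return reached, independents

# Toy FSM where s2 always mirrors s1: the reduced basis is just [s1].
delta = {"s1": lambda r, s: r, "s2": lambda r, s: r}
reached, indep = reach_with_reduction(delta, ["s1", "s2"], {(0, 0)})
assert reached == {(0, 0), (1, 1)}
assert indep == ["s1"]
```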
6.5 Experimental Results
The aforementioned algorithms have been implemented in the VIS [BHSV+96] environment. Experiments were conducted on a Sun machine with a 900-MHz CPU and 2-GB memory. The results of three sets of experiments are shown in Tables 6.1, 6.2, and 6.3, respectively. Table 6.1 demonstrates the relative power of exploiting dependency by the detection of signal correspondence and by the gfp and lfp calculations of sequential dependency. Table 6.2 compares their applicabilities to the equivalence checking problem. Finally, Table 6.3 shows how reachability analysis can benefit from our computation of functional dependency. In the experiments, all the approaches under comparison use the same BDD ordering (not optimized). In addition, no reordering is invoked.
Compared in Table 6.1 are three approaches: the computation of signal correspondence [vE00] and the gfp and lfp calculations of sequential dependency. The first two columns list the benchmark circuits and their sizes in state variables; for circuits retimed for timing optimization, the original sizes are listed in parentheses. For each compared
approach, four columns in order list the sizes of the computed independent state variables,
the required numbers of iterations, memory usage, and CPU time. Among these three
approaches, the minimum sizes of independent variables are highlighted in bold. It is
evident from Table 6.1 that the lfp calculation of sequential dependency subsumes the
detection of signal correspondence in both generality and optimality. On the other hand,
the powers of the lfp and gfp calculations are incomparable in practice. They have different
directions of approximating reachable state sets. For the gfp calculation, the unreachable
state set is gradually pruned each time dependency functions are substituted backward.
For the lfp one, the reachable state set grows with the iterative computation. It turns out
that the gfp computation is very effective in exploiting dependency for retimed circuits. For
instance, in circuit tbk-rt, 13 variables are identified as independents by the gfp calculation,
Table 6.1. Comparisons of Capabilities of Discovering Dependency. For each benchmark circuit (with its number of state variables and, for retimed circuits, the original size in parentheses), the table lists, under each of the three approaches (Signal Corr. [vE00], Seq. Dep. Gfp, and Seq. Dep. Lfp), the number of independent state variables, the number of iterations, memory usage (Mb), and CPU time (sec).
compared to 24 by the lfp one. In general, the gfp computation uses far fewer iterations than the other two approaches. In contrast, the lfp calculation outperforms the other two approaches on circuits that are not retimed. The table also reveals that none of the approaches suffers from memory explosion. Rather, time consumption may be a concern in the gfp
and lfp calculations of sequential dependency. This is understandable because testing the
refinement relation is more general and complicated than testing the equivalence relation
used in the detection of signal correspondence. Fortunately, the tradeoff between quality
and time can be easily controlled, for example, by imposing k-substitutability, which uses up
to k functions to substitute a dependent function. With our formulation, dependencies that
were underivable before, due to the limitation of reachability analysis on large transition
systems, can now be computed efficiently.
With a layout similar to that of Table 6.1, Table 6.2 compares the applicabilities of these three approaches to the equivalence checking problem. Here a product machine is built using
a circuit and its retimed version. As noted earlier, the gfp calculation itself cannot prove
the equivalence between two systems. It, essentially, computes the dependency inside each
individual system, but not the interdependency between them. On the other hand, the
detection of signal correspondence can rarely prove equivalence unless the two systems under
comparison are almost functionally identical. In contrast, the lfp calculation of sequential
dependency can easily prove the equivalence between two systems where one is forwardly
retimed from the other, and vice versa. Arbitrary retiming, however, may cause a failure, although in principle there always exists an lfp calculation that can conclude the equivalence.
In Table 6.2, since the retiming operations on the retimed circuits involve both forward and
backward moves, none of the approaches can directly conclude the equivalences. However,
as can be seen, the lfp calculation can compactly condense the product machines.
Although detecting dependency can reduce state space, it is not clear if the BDD sizes for
the dependency functions and the rewritten transition functions are small enough to benefit
Table 6.2. Comparisons of Capabilities of Checking Equivalence. The layout is the same as that of Table 6.1; each instance is a product machine built from a circuit and its retimed version.
Table 6.3. Comparisons of Capabilities of Analyzing Reachability. For reachability analysis without and with dependency reduction, the table lists, per circuit and number of traversal iterations, the peak number of live BDD nodes, the size (in BDD nodes) of the final reached state set, memory usage (Mb), and CPU time (sec).
reachability analysis. Table 6.3 justifies that it indeed can improve the analysis. Some hard instances for state traversal are studied. We compare reachability analyses without and with on-the-fly reduction using functional dependency. In the comparison, both analyses have the same implementation except for switching the reduction option off or on. The second column of Table 6.3 shows the number of steps of (partial) state traversal. For each reachability analysis, four columns in order show the peak number of live BDD nodes, the size of the BDD representing the final reached state set, memory usage, and CPU time. It is apparent that, with the help of functional dependency, reachability analysis yields substantial savings in both memory and time consumption, compared to the analysis without reduction.
6.6 Related Work
Among previous studies [HD93, vEJ96] on exploiting functional dependency, the one closest to ours is [vEJ96], while the functional dependency in [HD93] is assumed to be given. The method of [vEJ96] is similar to our reachability analysis with on-the-fly dependency detection. However, several differences need to be addressed. First, their dependency is drawn entirely from the currently reached state set (using functional deduction) rather than from the transition functions. Thus, in each iteration of their reachability analysis, image computation needs to be done before the detection of new functional dependency, and the image computation rarely benefits from the functional dependency. In contrast, our approach is more effective because the dependency is discovered before the image computation, which is then performed on the reduced basis. Second, as their dependency is obtained from the currently reached state set, not from the transition functions, it is not as robust as ours in remaining valid through the following iterations. Third, their approach cannot be used to detect functional dependency without reachability analysis, while our formulation
121
CHAPTER 6. VERIFICATION REDUCTION
can be used as a stand-alone technique. Also, we identify a new initial set of states with
predecessors. It uncovers more dependency to be exploited.
Among related work [QCC+00, AGM96, SWWK04, vE00] specific to sequential equivalence checking, the work of [vE00] is the most relevant to ours. As noted in Remark 4, finding signal correspondence [vE00] is a special case of our lfp calculation. As for the other related work, in [QCC+00] the two transition systems under comparison need to be similar up to a one-to-one mapping between equivalent states; such a mapping is discovered by reachability analysis until the combinational similarity converges. In comparison, the C-1-D approach of [AGM96] can handle one-to-many mappings by imposing the additional 1-distinguishability constraint. Compared to [QCC+00] and [AGM96], our formulation works for arbitrary mappings. We note that [AGM96] handles nondeterministic transition systems naturally; by contrast, we must incorporate the MOCB technique [HD93], a complication, to manage nondeterminism. The structural traversal method of [SWWK04] is an over-approximate reachability analysis based on circuit manipulations, whereas our computation is at the functional level. While these prior efforts focus on equivalence checking, ours is more general and applies to safety property checking.
6.7 Summary
We formulate the dependency among a collection of functions based on a refinement relation. When applied to a state transition system, it allows the detection of functional dependency without knowing reached state sets. Integrated into reachability analysis, it yields a complete verification procedure with the power of on-the-fly reduction. Our formulation unifies signal correspondence [vE00] and functional dependency [vEJ96] in the verification framework. Applied to the equivalence checking problem, our method bridges the complexity gap between combinational and sequential equivalence checking. Preliminary experiments show promising results in detecting dependency and in verification reduction.
Chapter 7
Conclusions and Future Work
With the help of invariants, we studied four topics in the analysis and verification
of finite state transition systems, namely, combinationality and sequential determinism,
retiming and resynthesis, sequential equivalence checking, and verification reduction.
Combinationality and sequential determinism. Cyclic definitions occur naturally in
high-level system specifications due to resource sharing, module composition, etc.
Without the time separation provided by state-holding elements, instantaneous val-
uations of cyclic definitions result in a causality problem. However, not all cyclic
definitions are hazardous. Prior work on differentiating good from bad cyclic defini-
tions was based on ternary-valued simulation at the circuit level with the up-bounded
inertial delay model. Different equivalent netlists may result in different conclusions
about combinationality. We argued that the previous differentiation (combination-
ality formulation) is too conservative because the timing model rules out legitimate
instances when cyclic definitions are to be broken by rewriting or when the synthesis
target is software. We investigated, at the functional level, the most general condition under which cyclic definitions are semantically combinational. Essentially, a set of
cyclic definitions is combinational at the functional level if and only if every state
evolution graph induced by an input assignment has all states in loops with a unique
output observation. This invariant characterizes combinationality at the functional level. Our result admits strictly more flexible high-level specifications and avoids inconsistent analyses across equivalent netlists. Furthermore, it allows combinationality to be analyzed at a higher level, so no costly synthesis of a high-level description into a circuit netlist is required before the analysis.
With our formulation, when the target is a software implementation, combinational cycles need not be broken as long as the execution of the underlying system obeys a sequencing execution rule. For hardware implementations, combinational cycles
should be broken and replaced with acyclic equivalents at the functional level to avoid
malfunctioning in the final physical realization.
Moreover, we extended our combinationality formulation to systems with state-
holding elements. We showed the exact condition under which a system with cyclic definitions is deterministic in its input-output behavior. Although the analysis of combinationality and input-output determinism is PSPACE-complete, it may still be practical as long as the cutset size is small.
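The functional-level criterion can be checked directly on small examples. The sketch below is one simplified reading of it, assuming the cyclic definitions are given as an update function on cut-variable valuations: for every input assignment, drive each state into a loop of the state evolution graph and require the states on loops to agree on a unique output observation. The mux-style example and all names are illustrative, not the thesis formulation.

```python
from itertools import product

def combinational(update, output, n_cut, inputs):
    """For each input assignment, collect the states lying on loops of the
    state evolution graph and require a unique output observation there."""
    states = list(product((0, 1), repeat=n_cut))
    for x in inputs:
        on_loops = set()
        for s in states:
            t = s
            for _ in range(len(states)):  # after |states| steps t is on a loop
                t = update(t, x)
            on_loops.add(t)
        if len({output(s, x) for s in on_loops}) != 1:
            return False
    return True

# A well-behaved cycle: y = x if c else z, z = y if c else x,
# with cut variables (y, z), input (c, x), and output y.
upd = lambda s, i: ((i[1] if i[0] else s[1]), (s[0] if i[0] else i[1]))
out = lambda s, i: s[0]
ok = combinational(upd, out, 2, list(product((0, 1), repeat=2)))  # True
```

An oscillating definition such as y = NOT z, z = y fails the check: its state evolution graph is a single loop whose states disagree on the output.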
As for future work, although the choice of cutset does not affect the analysis of com-
binationality, it does influence the resultant system rewritten with acyclic definitions.
It might be useful to choose a good cutset with respect to various optimization objectives. Also, as shown in Sections 3.3.2, 3.3.5 and 3.3.6, there are many ways to
rewrite cyclic definitions with acyclic equivalents. It would be interesting to explore
such flexibilities for further optimization.
Retiming and resynthesis. Transformations using retiming and resynthesis are considered among the most practical and important techniques for optimizing synchronous hardware systems. Since the transformations modify circuit structures directly without resorting to state space traversal, the computation is inexpensive and the improvement is transparent and predictable. Despite these advantages, the transformations are not widely adopted in current synthesis flows for synchronous hardware systems. The reason can be attributed to three unsolved problems: optimization capability, verification complexity, and the rectification of initialization sequences. Resolving these questions is crucial to developing effective synthesis and verification algorithms.
These problems were resolved in the thesis through identifying some transformation
invariants under retiming and resynthesis. The first problem was resolved through
a constructive algorithm which determines if two given FSMs are transformable to
each other via retiming and resynthesis operations. The second problem, verifying the equivalence of two FSMs under such transformations, was proved, contrary to a common belief, to be as hard as the PSPACE-complete problem of general equivalence checking if the transformation history is lost. As a result, we advocated a conservative design methodology for the optimization of synchronous hardware systems to improve verifiability: for instance, the transformation history should be recorded, or every retiming (resynthesis) operation should be followed by an equivalence check.
For the third problem, initializing FSMs transformed under iterative retiming and
resynthesis, we showed that there is no general transformation-independent bound
limiting the growth of initialization sequences, unlike the case when only retiming is
performed. An algorithm computing the length increase of initialization sequences
was presented. Essentially, an initialization sequence should be rectified by prefixing
it with an arbitrary input sequence of length greater than the computed length.
For future work, it is important to investigate more efficient computation, with reasonable accuracy, of the length increase of initialization sequences for FSMs transformed under iterative retiming and resynthesis. On the other hand, our lag-independent bound might be used to improve retiming algorithms by pruning spurious linear constraints, similarly to [MS98]. Moreover, since the result of [ESS96] can be modified to obtain a retiming function targeting area optimization with minimum increase of initialization sequences, it would be useful to study retiming under other objectives while avoiding lengthening initialization sequences.
Sequential equivalence checking. We extended our studies to general sequential equiv-
alence checking beyond that of checking retiming and resynthesis equivalence.
Checking the equivalence of two sequential systems is one of the most challenging problems, and a major obstacle, in designing correct hardware systems. The state-explosion problem limits formal verification to small- or medium-sized sequential circuits, partly because BDD sizes depend heavily on the number of variables involved. In the worst case, a BDD grows exponentially with the number of variables, so reducing this number can increase the verification capacity.
Given two FSMs M1 and M2 with m1 and m2 state variables, respectively, conventional formal methods verify equivalence by traversing the state space of the product machine, which has m1 + m2 registers. In contrast, we showed that the state equivalence of an FSM can be computed without building a product machine. Applying this result to equivalence checking, we introduced a different possibility, based on partitioning the state space defined by a multiplexed machine, which has merely max{m1, m2} + 1 registers. Essentially, sequential equivalence checking was done in the disjoint union of the two state spaces. For the invariants to be asserted, product-machine based verification checks that the outputs of the two FSMs under comparison are identical throughout reachability analysis; multiplexed-machine based verification checks that the initial state pair of the two FSMs remains in the same equivalence class throughout state space partitioning.
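In explicit-state terms, the multiplexed-machine check amounts to partition refinement over the disjoint union of the two state spaces: start from output-based classes, refine until stable, and test whether the two initial states share a class. The sketch below is a plain Moore-machine refinement loop; the machines and all names are illustrative, and the thesis computation is symbolic with BDDs.

```python
def equivalent(fsm1, fsm2, inputs):
    """Each fsm is (init, delta, out) with delta[state][input] -> state.
    Refine equivalence classes on the disjoint union of the state spaces;
    the machines are equivalent iff the initial states end up together."""
    tagged = [(1, fsm1), (2, fsm2)]
    states = [(m, s) for m, (_, d, _) in tagged for s in d]
    delta = {(m, s): {x: (m, d[s][x]) for x in inputs}
             for m, (_, d, _) in tagged for s in d}
    out = {(m, s): o[s] for m, (_, _, o) in tagged for s in o}
    cls = {q: out[q] for q in states}          # initial split: by output
    while True:
        sig = {q: (cls[q], tuple(cls[delta[q][x]] for x in inputs))
               for q in states}
        ids = {v: i for i, v in enumerate(sorted(set(sig.values()), key=repr))}
        new = {q: ids[sig[q]] for q in states}
        if len(set(new.values())) == len(set(cls.values())):
            cls = new
            break                               # refinement reached a fixpoint
        cls = new
    return cls[(1, fsm1[0])] == cls[(2, fsm2[0])]

# Two illustrative Moore machines; M2 duplicates one state of M1.
M1 = ('a', {'a': {0: 'a', 1: 'b'}, 'b': {0: 'b', 1: 'a'}}, {'a': 0, 'b': 1})
M2 = ('p', {'p': {0: 'p', 1: 'q'}, 'q': {0: 'r', 1: 'p'}, 'r': {0: 'q', 1: 'p'}},
      {'p': 0, 'q': 1, 'r': 1})
same = equivalent(M1, M2, [0, 1])  # True
```

Note that no product state space is built: the loop works over |M1| + |M2| states, mirroring the max{m1, m2} + 1 register count of the multiplexed machine rather than the m1 + m2 of the product machine.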
Empirical results demonstrated that the proposed approach is more robust than previous ones. The robustness can be attributed to three factors. First, the number of state variables encountered is almost half of that in product-machine based verification. Second, the cone-of-influence reduction is automatically taken care of when every output function is verified separately. Third, the number of equivalence classes in the reachable state space is invariant under any valid transformation.
On the other hand, since each equivalence class is represented with a BDD node, the verification capacity of the proposed approach is primarily limited by the number of equivalence classes encountered. The approach is feasible when this number is no more than a few million. Another weakness is the variable-ordering restriction on BDDs. Fortunately, since the restriction only needs to be maintained while counting the number of equivalence classes, it does not pose a notable hindrance.
A future research direction would be to develop specialized BDD operations to improve
our computations.
Verification reduction. We extended our studies further to general safety property check-
ing beyond equivalence checking, and proposed a reachability-preserving reduction
technique based on functional dependency. In essence, functional dependency is an invariant that acts as a catalyst, simplifying verification tasks.
The existence of functional dependency among the state variables of a state transition
system can cause needless inefficiency in BDD representations for formal verification.
Eliminating such dependency from the system compacts the state space and can significantly reduce the verification cost. Prior approaches to the detection of functional dependency relied on reachability analysis, which does not scale to large systems. Instead, we investigated how functional dependency can be derived
without or before knowing the reachable state set. Two previous studies on detect-
ing signal correspondence and exploiting functional dependency were unified in our
approach. We presented a direct derivation of dependency from transition functions
rather than from reached state sets. As a consequence, reachability analysis is not
a necessity for exploiting dependency. Dependencies that were underivable before,
due to the limitation of reachability analysis on large transition systems, can now be
computed efficiently. In addition, our derivation of functional dependency was integrated into reachability analysis as an on-the-fly reduction, with which reachability analysis achieved substantial savings in both memory and run time.
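The refinement-relation view of dependency admits a direct check on explicitly represented functions: f functionally depends on a set G iff whenever all functions in G agree on two domain points, f agrees as well, in which case the witness g with f = g(G) can be tabulated. The sketch below uses explicit truth tables with illustrative 3-bit transition functions; the thesis computation is symbolic.

```python
from itertools import product

def depends_on(f, G, domain):
    """Return a lookup table g with f(x) = g(g1(x), ..., gk(x)) for all x
    in the domain, or None if no such function exists."""
    table = {}
    for x in domain:
        key = tuple(g(x) for g in G)
        if table.setdefault(key, f(x)) != f(x):
            return None          # G agrees on two points where f differs
    return table

# Illustrative transition functions of a 3-bit machine: d2 = d0 XOR d1
# on every valuation, so state variable 2 is functionally dependent and
# can be eliminated before image computation.
dom = list(product((0, 1), repeat=3))
d0 = lambda s: 1 - s[0]
d1 = lambda s: s[0] ^ s[1]
d2 = lambda s: (1 - s[0]) ^ s[0] ^ s[1]
g = depends_on(d2, [d0, d1], dom)    # g[(u, v)] == u ^ v
```

A state variable whose transition function passes this check can be replaced by g applied to the remaining transition functions, shrinking the support of subsequent image computations; note that no reached state set is consulted.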
As a future research direction, our results can be reformulated in a SAT-solving frame-
work. An approach similar to that of [BC00], where van Eijk’s approach [vE00] was
adjusted, could be taken to prove safety properties with strengthened induction. We
believe that SAT-based verification can benefit from our results, because our approach
can impose more invariants than just signal correspondence; hence the search procedures of SAT solvers can be made more efficient. Furthermore, our current formulation does not handle transition relations directly. We would like to find an appropriate formulation for transition relations, rather than translating them into sets of functional vectors.
Bibliography
[AGM96] P. Ashar, A. Gupta, and S. Malik. Using complete-1-distinguishability for
FSM equivalence checking. In Proc. Int’l Conf. Computer-Aided Design,
pages 346–353, 1996.
[BC00] P. Bjesse and K. Claessen. SAT-based verification without state space traver-
sal. In Proc. Formal Methods in Computer-Aided Design, pages 372–389,
2000.
[BCM90] C. Berthet, O. Coudert, and J.-C. Madre. New ideas on symbolic manipula-
tions of finite state machines. In Proc. Int’l Conf. Computer Design, pages
224–227, 1990.
[Ber99] G. Berry. The Constructive Semantics of Pure Esterel. Draft book, 1999.
[Ber00] G. Berry. The foundations of Esterel. In Proof, Language, and Interaction:
Essays in Honour of Robin Milner. MIT Press, 2000.
[BHSV+96] R. K. Brayton, G. D. Hachtel, A. Sangiovanni-Vincentelli, F. Somenzi,
A. Aziz, S.-T. Cheng, S. Edwards, S. Khatri, Y. Kukimoto, A. Pardo,
S. Qadeer, R. K. Ranjan, S. Sarwary, T. R. Shiple, G. Swamy, and T. Villa.
VIS: a system for verification and synthesis. In Proc. Int’l Conf. Computer
Aided Verification, pages 428–432, 1996.
[Bro03] F. M. Brown. Boolean Reasoning: The Logic of Boolean Equations. Dover
Publications, 2003.
[Bry86] R. E. Bryant. Graph-based algorithms for Boolean function manipulation.
IEEE Trans. on Computers, C-35:677–691, August 1986.
[Bry87] R. E. Bryant. Boolean analysis of MOS circuits. IEEE Trans. on Computer-
Aided Design of Integrated Circuits and Systems, pages 634–649, 1987.
[Bry92] R. E. Bryant. Symbolic Boolean manipulation with ordered binary decision
diagrams. ACM Computing Surveys, 24(3):293–318, September 1992.
[BS95] J. Brzozowski and C.-J. Seger. Asynchronous Circuits. Springer-Verlag, 1995.
[CBM89] O. Coudert, C. Berthet, and J.-C. Madre. Verification of synchronous se-
quential machines based on symbolic execution. In Proc. Int’l Workshop
Automatic Verification Methods Finite State Syst., pages 365–373, 1989.
[CGP99] E. M. Clarke, O. Grumberg, and D. A. Peled. Model Checking. MIT Press,
1999.
[CM90] O. Coudert and J.-C. Madre. A unified framework for the formal verification
of sequential circuits. In Proc. Int’l Conf. Computer-Aided Design, pages
126–129, 1990.
[CQS00] G. Cabodi, S. Quer, and F. Somenzi. Optimizing sequential verification by
retiming transformations. In Proc. Design Automation Conf., pages 601–606,
2000.
[DM91] G. De Micheli. Synchronous logic synthesis: algorithms for cycle-time mini-
mization. IEEE Trans. on Computer-Aided Design of Integrated Circuits and
Systems, 10:63–73, January 1991.
[Edw03] S. Edwards. Making cyclic circuits acyclic. In Proc. Design Automation
Conference, pages 159–162, 2003.
[EL03] S. Edwards and E. Lee. The semantics and execution of a synchronous block-
diagram language. Science of Computer Programming, 48:21–42, 2003.
[EMMRM97] A. El-Maleh, T. E. Marchok, J. Rajski, and W. Maly. Behavior and testability
preservation under the retiming transformation. IEEE Trans. on Computer-
Aided Design of Integrated Circuits and Systems, 16:528–543, May 1997.
[ENSS98] G. Even, J. Naor, B. Schieber, and M. Sudan. Approximating minimum
feedback sets and multi-cuts in directed graphs. Algorithmica, 20:151–174,
1998.
[ESS96] G. Even, I. Y. Spillinger, and L. Stok. Retiming revisited and reversed.
IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems,
15:348–357, March 1996.
[Fil91] T. Filkorn. A method for symbolic verification of synchronous circuits. In
Proc. Int’l Symp. Comput. Hardware Description Lang. Applicat., pages 249–
259, 1991.
[Fil92] T. Filkorn. Symbolische Methoden für die Verifikation endlicher Zustandssysteme. Ph.D. Thesis, Institut für Informatik der Technischen Universität München, 1992.
[Hal93] N. Halbwachs. Synchronous Programming of Reactive Systems. Kluwer Aca-
demic Publishers, 1993.
[HD93] A. J. Hu and D. L. Dill. Reducing BDD size by exploiting functional depen-
dencies. In Proc. Design Automation Conference, pages 266–271, 1993.
[HJJ+96] J. G. Henriksen, J. Jensen, M. Jorgensen, N. Klarlund, B. Paige, T. Rauhe,
and A. Sandholm. Mona: monadic second-order logic in practice. In Proc.
Int’l Conf. on Tools and Algorithms for the Construction and Analysis of
Systems, pages 89–110, 1996.
[HM95] N. Halbwachs and F. Maraninchi. On the symbolic analysis of combinational
loops in circuits and synchronous programs. In Proc. Euromicro, 1995.
[HU79] J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Lan-
guages, and Computation. Addison-Wesley, 1979.
[Imm88] N. Immerman. Nondeterministic space is closed under complementation.
SIAM Journal on Computing, 17:935–938, 1988.
[JJH01] J.-H. R. Jiang, J.-Y. Jou, and J.-D. Huang. Unified functional decomposition
via encoding for FPGA technology mapping. IEEE Trans. on Very Large
Scale Integration Systems, 9:251–260, April 2001.
[Jon75] N. Jones. Space-bounded reducibility among combinatorial problems. Journal
of Computer and System Sciences, 11:68–85, 1975.
[Kar72] R. Karp. Reducibility among combinatorial problems. In Complexity of
Computer Computations, pages 85–104. Plenum Press, 1972.
[Kau70] W. Kautz. The necessity of closed circuit loops in minimal combinational
circuits. IEEE Trans. on Computers, pages 162–164, 1970.
[KB01] A. Kuehlmann and J. Baumgartner. Transformation-based verification using
generalized retiming. In Proc. Int’l Conf. Computer Aided Verification, pages
104–117, 2001.
[Koh78] Z. Kohavi. Switching and Finite Automata Theory. McGraw-Hill, New York,
1978.
[Kur94] R. P. Kurshan. Computer-Aided Verification of Coordinating Processes.
Princeton University Press, 1994.
[LN91a] B. Lin and A. R. Newton. Exact redundant state registers removal based on
binary decision diagrams. In Proc. Int’l Conf. Very Large Scale Integration,
pages 277–286, 1991.
[LN91b] B. Lin and A. R. Newton. Implicit manipulation of equivalence classes using
binary decision diagrams. In Proc. Int’l Conf. Computer Design, pages 81–85,
1991.
[LPV93] Y.-T. Lai, M. Pedram, and S. B. K. Vrudhula. BDD based decomposition
of logic functions with application to FPGA synthesis. In Proc. Design Au-
tomation Conf., pages 642–647, 1993.
[LS83] C. E. Leiserson and J. B. Saxe. Optimizing synchronous systems. Journal of
VLSI and Computer Systems, 1(1):41–67, Spring 1983.
[LS91] C. E. Leiserson and J. B. Saxe. Retiming synchronous circuitry. Algorithmica,
6:5–35, 1991.
[LTN90] B. Lin, H. J. Touati, and A. R. Newton. Don’t care minimization of multi-
level sequential logic networks. In Proc. Int’l Conf. Computer-Aided Design,
pages 414–417, 1990.
[Mac] The MacTutor History of Mathematics. Online archive, http://www-gap.dcs.st-and.ac.uk/~history/.
[Mal90] S. Malik. Combinational Logic Optimization Techniques in Sequential Logic
Synthesis. Ph.D. Thesis, University of California, Berkeley, 1990.
[Mal94] S. Malik. Analysis of cyclic combinational circuits. IEEE Trans. on
Computer-Aided Design of Integrated Circuits and Systems, 13(7):950–956,
July 1994.
[Mar60] E. Marczewski. Independence in algebras of sets and Boolean algebra. Fun-
damenta Mathematicae, 48:135–145, 1960.
[MKRS00] I.-H. Moon, J. H. Kukula, K. Ravi, and F. Somenzi. To split or to conjoin:
the question in image computation. In Proc. Design Automation Conf., pages
23–28, 2000.
[MS98] N. Maheshwari and S. Sapatnekar. Efficient retiming of large circuits. IEEE
Trans. on Very Large Scale Integration Systems, 6:74–83, March 1998.
[MSBSV91] S. Malik, E. M. Sentovich, R. K. Brayton, and A. Sangiovanni-Vincentelli.
Retiming and resynthesis: optimization of sequential networks with combi-
national techniques. IEEE Trans. on Computer-Aided Design of Integrated
Circuits and Systems, 10:74–84, January 1991.
[MSM04] M. N. Mneimneh, K. A. Sakallah, and J. Moondanos. Preserving synchroniz-
ing sequences of sequential circuits after retiming. In Proc. Asia and South
Pacific Design Automation Conference, January 2004.
[NK99] K. Namjoshi and R. Kurshan. Efficient analysis of cyclic definitions. In Proc.
Computer Aided Verification, pages 394–405, 1999.
[Pap94] C. H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.
[Pix92] C. Pixley. A theory and implementation of sequential hardware equivalence.
IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems,
11:1469–1478, December 1992.
[PT87] R. Paige and R. E. Tarjan. Three partition refinement algorithms. SIAM
Journal on Computing, 16:973–989, 1987.
[QCC+00] S. Quer, G. Cabodi, P. Camurati, L. Lavagno, and R. K. Brayton. Verification
of similar FSMs by mixing incremental re-encoding, reachability analysis, and
combinational check. Formal Methods in System Design, 17:107–134, 2000.
[Ran97] R. K. Ranjan. Design and Implementation Verification of Finite State Sys-
tems. Ph.D. Thesis, University of California, Berkeley, 1997.
[RB03a] M. Riedel and J. Bruck. Cyclic combinational circuits: analysis for synthesis.
In Proc. Int’l Workshop on Logic and Synthesis, pages 105–112, 2003.
[RB03b] M. Riedel and J. Bruck. The synthesis of cyclic combinational circuits. In
Proc. Design Automation Conference, pages 163–168, 2003.
[RK62] J. P. Roth and R. M. Karp. Minimization over Boolean graphs. IBM Journal
of Research and Development, pages 227–238, December 1962.
[RSSB98] R. K. Ranjan, V. Singhal, F. Somenzi, and R. K. Brayton. On the optimiza-
tion power of retiming and resynthesis transformations. In Proc. Int’l Conf.
on Computer-Aided Design, pages 402–407, 1998.
[Sav70] W. Savitch. Relationships between nondeterministic and deterministic tape
complexities. Journal of Computer and System Sciences, 4:177–192, 1970.
[SBT96] T. Shiple, G. Berry, and H. Touati. Constructive analysis of cyclic circuits.
In Proc. European Design and Test Conf., pages 328–333, 1996.
[Shi96] T. Shiple. Formal Analysis of Cyclic Circuits. Ph.D. Thesis, University of
California, Berkeley, 1996.
[SMB96] V. Singhal, S. Malik, and R. K. Brayton. The case for retiming with explicit
reset circuitry. In Proc. Int’l Conf. on Computer-Aided Design, pages 618–
625, 1996.
[SPRB95] V. Singhal, C. Pixley, R. L. Rudell, and R. K. Brayton. The validity of
retiming sequential circuits. In Proc. Design Automation Conference, pages
316–321, 1995.
[SSL+92] E. Sentovich, K. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha,
H. Savoj, P. Stephen, R. K. Brayton, and A. Sangiovanni-Vincentelli. SIS:
a system for sequential circuit synthesis. Tech. Report, UCB/ERL M92/41,
University of California, Berkeley, 1992.
[STB96] E. Sentovich, H. Toma, and G. Berry. Latch optimization in circuits generated
from high-level descriptions. In Proc. Int’l Conf. on Computer-Aided Design,
pages 428–435, 1996.
[Sto92] L. Stok. False loops through resource sharing. In Proc. Int’l Conf. on
Computer-Aided Design, pages 345–348, 1992.
[SWWK04] D. Stoffel, M. Wedler, P. Warkentin, and W. Kunz. Structural FSM traversal.
IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems,
23(5):598–619, May 2004.
[TB93] H. J. Touati and R. K. Brayton. Computing the initial states of retimed
circuits. IEEE Trans. on Computer-Aided Design of Integrated Circuits and
Systems, 12:157–162, January 1993.
[vE00] C. A. J. van Eijk. Sequential equivalence checking based on structural simi-
larities. IEEE Trans. on Computer-Aided Design of Integrated Circuits and
Systems, 19:814–819, July 2000.
[vEJ96] C. A. J. van Eijk and J. A. G. Jess. Exploiting functional dependencies in
finite state machine verification. In Proc. European Design and Test Conf., pages 9–14, 1996.
[YSBO99] B. Yang, R. Simmons, R. E. Bryant, and D. O’Hallaron. Optimizing symbolic
model checking for constraint-rich models. In Proc. Int’l Conf. Computer
Aided Verification, pages 328–340, 1999.
[ZSA98] H. Zhou, V. Singhal, and A. Aziz. How powerful is retiming? In Proc. Int’l
Workshop on Logic Synthesis, 1998.