Discovering Invariants in the Analysis and Verification of Finite State Transition Systems
by
Jie-Hong Roland Jiang
B.S. (National Chiao Tung University, Taiwan) 1996
M.S. (National Chiao Tung University, Taiwan) 1998
A dissertation submitted in partial satisfaction of the requirements for the degree of
Doctor of Philosophy
in
Engineering - Electrical Engineering and Computer Sciences
in the
GRADUATE DIVISION
of the
UNIVERSITY OF CALIFORNIA, BERKELEY
Committee in charge:
Professor Robert K. Brayton, Chair
Professor Alberto Sangiovanni-Vincentelli
Professor Kam-Biu Luk
Fall 2004
The dissertation of Jie-Hong Roland Jiang is approved.
University of California, Berkeley
Fall 2004
Discovering Invariants in the Analysis and Verification
of Finite State Transition Systems
Copyright © 2004
by
Jie-Hong Roland Jiang
Abstract
Discovering Invariants in the Analysis and Verification
of Finite State Transition Systems
by
Jie-Hong Roland Jiang
Doctor of Philosophy in Engineering - Electrical Engineering and Computer Sciences
University of California, Berkeley
Professor Robert K. Brayton, Chair
Hardware and software systems are evolving at a fascinating speed, thanks to the refinements of semiconductor technologies. However, verifying their correctness has become a daunting task because of the state explosion problem. Because simulation can validate only a small portion of a modern design's functional coverage, formal methods have become indispensable tools in certifying design correctness. Although significant progress has been achieved in this area, the state of the art still falls far short of what is required, and there is plenty of room for improvement. This thesis addresses several issues in the formal analysis and verification of finite state transition systems.
By identifying certain invariants, we study four subjects in the analysis and verification of finite state transition systems. First, we establish the most general definition of combinationality in designs with cyclic definitions, which occur naturally in systems specified in high-level description languages due to resource sharing, module composition, and the like. This result is further extended to determine the sequential determinism of systems with state-holding
elements. Second, we study the transformation invariants under retiming and resynthesis operations, which are among the most practical techniques for optimizing synchronous hardware systems. We characterize the optimization power of these operations and demonstrate the verification complexity of checking retiming and resynthesis equivalence. We also show how to rectify initialization sequences invalidated by these transformations. Third, we revisit the equivalence checking of two finite state transition systems, one of the most important problems in design verification. We demonstrate how the verification task can be accomplished with symbolic computations in the disjoint union state space, rather than in the traditional product state space, of the two systems. Finally, because abstraction is one of the most promising techniques for mitigating the state explosion problem, we investigate a reachability-preserving abstraction technique based on functional dependency. By extending combinational dependency to its sequential counterpart, the detection of functional dependency can be decoupled from reachability analysis. Moreover, our computation can be integrated into reachability analysis as an on-the-fly reduction.
Professor Robert K. Brayton
Dissertation Committee Chair
To My Parents
Contents
Contents ii
List of Figures vi
List of Tables viii
Acknowledgements ix
1 Introduction 1
1.1 Invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Main Results and Connections to Invariants . . . . . . . . . . . . . . . . . . 2
1.3 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Preliminaries 6
2.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Binary Decision Diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.4 Finite State Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3 Combinationality and Sequential Determinism 9
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3 Combinationality at the Functional Level . . . . . . . . . . . . . . . . . . . 17
3.3.1 Formulation of Combinationality . . . . . . . . . . . . . . . . . . . . 17
3.3.2 Computation Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3.3 Generality Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.3.4 Conditions of Legitimacy . . . . . . . . . . . . . . . . . . . . . . . . 23
3.3.5 Stable Cyclic Dependencies . . . . . . . . . . . . . . . . . . . . . . . 23
3.3.6 Input-Output Determinism of State Transition Systems . . . . . . . 25
3.4 Combinationality at the Circuit Level . . . . . . . . . . . . . . . . . . . . . 27
3.4.1 Synthesis of Cyclic Circuits . . . . . . . . . . . . . . . . . . . . . . . 27
3.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.5.1 SEG vs. GMW . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.5.2 Combinationalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.5.3 Sequential Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4 Retiming and Resynthesis 33
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.3 Optimization Capability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.3.1 Optimization Power of Retiming . . . . . . . . . . . . . . . . . . . . 40
4.3.2 Optimization Power of Retiming and Resynthesis . . . . . . . . . . . 42
4.3.3 Retiming-Resynthesis Equivalence and Canonical Representation . . 45
4.4 Verification Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.4.1 Verification with Unknown Transformation History . . . . . . . . . . 48
4.4.2 Verification with Known Transformation History . . . . . . . . . . . 50
4.5 Initialization Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.5.1 Initialization Affected by Retiming . . . . . . . . . . . . . . . . . . . 52
4.5.2 Initialization Affected by Retiming and Resynthesis . . . . . . . . . 53
4.6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5 Equivalence Verification 59
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.2 Definitions, Notation and Preliminaries . . . . . . . . . . . . . . . . . . . . 62
5.2.1 Equivalence Relations and Partitions . . . . . . . . . . . . . . . . . . 62
5.2.2 Functional Decomposition . . . . . . . . . . . . . . . . . . . . . . . . 64
5.3 Identification of State Equivalence . . . . . . . . . . . . . . . . . . . . . . . 65
5.3.1 State Equivalence vs. Functional Decomposition . . . . . . . . . . . 66
5.3.2 Algorithm for Equivalent State Identification . . . . . . . . . . . . . 67
5.3.3 Robust Equivalent State Identification . . . . . . . . . . . . . . . . . 70
5.4 Verification of Sequential Equivalence . . . . . . . . . . . . . . . . . . . . . 74
5.4.1 Multiplexed Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.4.2 Algorithm for Sequential Equivalence Checking . . . . . . . . . . . . 76
5.4.3 Robust Sequential Equivalence Checking . . . . . . . . . . . . . . . . 78
5.4.4 Error Tracing and Shortest Distinguishing Sequence . . . . . . . . . 81
5.4.5 State-Space Partitioning on Separate Machines . . . . . . . . . . . . 81
5.4.6 State-Space Partitioning on Product Machine . . . . . . . . . . . . . 82
5.5 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.5.1 Implementation-Independent Aspects . . . . . . . . . . . . . . . . . 83
5.5.2 Implementation-Dependent Aspects . . . . . . . . . . . . . . . . . . 86
5.6 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.7 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.7.1 Computation of State Equivalence . . . . . . . . . . . . . . . . . . . 92
5.7.2 Verification of FSM Equivalence . . . . . . . . . . . . . . . . . . . . 93
5.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6 Verification Reduction 100
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.2 Preliminaries and Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.3 Functional Dependency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.3.1 Combinational Dependency . . . . . . . . . . . . . . . . . . . . . . . 103
6.3.2 Sequential Dependency . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.4 Verification Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.6 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
6.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
7 Conclusions and Future Work 124
Bibliography 130
List of Figures
3.1 (i) SEG for x = 0. (ii) SEG for x = 1. . . . . . . . . . . . . . . . . . . . . . 16
3.2 (i) The original circuit. (ii) The induced circuit under input assignment a = 0 and b = 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1 Algorithm: Construct quotient graph. . . . . . . . . . . . . . . . . . . . . . 46
4.2 Algorithm: Verify equivalence under retiming and resynthesis. . . . . . . . . 47
4.3 The STG in (i) is transformable to the STG in (ii) by a 2-way switch operation while the reverse direction is not transformable. Since the operation is not reversible, it falls beyond the transformation power of retiming and resynthesis. 56
5.1 Algorithm CompNewPartition: Compute New Partition. . . . . . . . . . . . 67
5.2 Algorithm IDES5.1: Identify Equivalent States, Equation (5.1). . . . . . . . 70
5.3 Algorithm IDES5.2: Identify Equivalent States, Equation (5.2). . . . . . . . 71
5.4 Algorithm IDES5.3: Identify Equivalent States, Equation (5.3). . . . . . . . 72
5.5 Multiplexed Machine. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.6 Algorithm: Verify Sequential Equivalence. . . . . . . . . . . . . . . . . . . . 78
6.1 Algorithm: CombinationalDependency. . . . . . . . . . . . . . . . . . . . . . 106
6.2 The greatest fixed-point calculation of sequential dependency. The first three iterations are illustrated in (i), (ii), and (iii). In each iteration, transition functions (and thus next-state variables) are partitioned into dependent and independent parts by the computation of combinational dependency. The derived dependency is used to reduce the state space in the subsequent iteration. 109
6.3 Algorithm: SequentialDependencyGfp. . . . . . . . . . . . . . . . . . . . . . 110
6.4 The least fixed-point calculation of sequential dependency. The first three iterations are illustrated in (i), (ii), and (iii). In each iteration, transition functions (and thus next-state variables) are partitioned into dependent and independent parts by the computation of combinational dependency. The derived dependency is used to reduce the state space in the subsequent iteration. 111
6.5 Algorithm: SequentialDependencyLfp. . . . . . . . . . . . . . . . . . . . . . 113
6.6 Algorithm: ComputeReachWithDependencyReduction. . . . . . . . . . . . . 115
List of Tables
5.1 Profiles of Benchmark Circuits . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.2 Characteristics of Equivalence Classes of Benchmark Circuits . . . . . . . . 96
5.3 Sequential Equivalence Checking between Identical Circuits . . . . . . . . . 97
5.4 Sequential Equivalence Checking between Different Implementations of Same Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
5.5 Overall Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.1 Comparisons of Capabilities of Discovering Dependency . . . . . . . . . . . 117
6.2 Comparisons of Capabilities of Checking Equivalence . . . . . . . . . . . . . 119
6.3 Comparisons of Capabilities of Analyzing Reachability . . . . . . . . . . . . 120
Acknowledgements
I am deeply grateful to my advisor Bob Brayton for his guidance and support throughout my graduate study at Berkeley. His broad knowledge of, and enthusiastic interest in, the field of electronic design automation have always been my aim. Many thanks to him for giving me flexibility in research and for tolerating my diverse interests in unrelated fields, which was of great benefit to me in getting non-zero knowledge about other subjects such as quantum physics, theoretical computer science, and mathematical logic.
My thanks go to Alberto Sangiovanni-Vincentelli and Andreas Kuehlmann for their valuable feedback on my thesis work, and for their respective courses on embedded system design and logic synthesis. During summer trips to Dagstuhl, Germany, in 2002, Lisbon, Portugal, in 2003, and Pelion, Greece, in 2004, my interest in hardware verification grew greatly, thanks to Andreas for organizing these workshops on formal equivalence verification and to Bob for financial support. My understanding of verification in a more general setting is due to the class taught by Tom Henzinger.
I am grateful to Kam-Biu Luk for teaching me quantum mechanics and serving on my
thesis committee. I hope to produce some scientific work on quantum computation in the
near future, in addition to other “classical” work.
I would like to express my thanks to Tiziano Villa and Nina Yevtushenko for sharing their work on the language equation formulation of finite state machine synthesis. Because of them, I developed an interest in the theory of automata and languages. A meeting with them in Rome in 2003 was a joyful trip, full of research ideas as well as historical sightseeing led by Tiziano.
Alan Mishchenko is the one who made me realize how far I am from being a good programmer. His generosity in sharing his expertise and ideas is greatly appreciated. My thesis work has benefited from many discussions with him.
I cherish the interactions with my colleagues in our group, Yunjian Jiang, Philip Chong,
Subarna Sinha, Fan Mo, Minxi Gao, Yinghua Li, Satrajit Chatterjee, Donald Chai, and my
officemates: Farinaz Koushanfar, Guang Yang, Haibo Zeng. Also I thank Rupak Majumdar
for his patience in answering my many random questions.
My stay at Berkeley could not have been more joyful, thanks to my fellow Taiwanese students: Dah-Wei Chiou, En-Yi Lin, Te-Sheng Hsiao, Yu-Chuan Tai, Stanley Bo-Ting Wang, Cheng-En Wu, Mandy Yang, Cheng-Han Yu, and many others. Dah-Wei is the one who introduced me to quantum physics and many other subjects.
My study at Berkeley would not have been possible without the help of Jin-Yang Jou, Yao-Wen Chang, and the late Wen-Zen Shen, who were my mentors during my early development at National Chiao Tung University. I deeply appreciate Iris Hui-Ru Jiang for her help during my graduate study at Hsinchu.
The love and encouragement of my parents and brother have been my strongest support.
To my parents, I dedicate my thesis.
Chapter 1
Introduction
Give me a place to stand, and with a lever I will move the world.
— Archimedes
In this thesis, the place to stand on is finite state transition systems; the lever to use is
invariants. However, we are not sure if the world can be moved at all.
1 Engraving from Mechanics Magazine, London, 1824. http://www.mcs.drexel.edu/~crorres/Archimedes/Lever/LeverIntro.html
1.1 Invariants
Invariants are levers in mathematics! They play a fundamental role in many branches of
mathematics. Below we list two types of invariants, each with a well-known example among many others.
I. Invariants classify mathematical objects: The Euler number v−e+f , a topological
invariant, can be used as a demonstration of the topological non-equivalence of poly-
hedra, where v, e and f denote the numbers of vertices, edges and faces, respectively,
of a polyhedron.
II. Invariants simplify computation: At the age of seven, Gauss summed the integers from 1 to 100 instantly by observing that the sum consists of 50 pairs of numbers, each pair summing to an invariant of 101 [Mac].
The invariants of Type I are sometimes apparent from the context and often essential in the
studied subject-matter. In comparison, the invariants of Type II are somewhat opaque and
behave more like auxiliary catalysts. The invariants that we will encounter fall into both
categories. They form powerful tools in the analysis and verification of finite state transition systems.
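Both types of invariant can be made concrete in a few lines. The polyhedra below are illustrative choices, not examples from the text.

```python
# Type I: the Euler number v - e + f classifies surfaces. Every polyhedron
# homeomorphic to a sphere yields 2; a torus-shaped mesh yields 0, so the
# two can never be topologically equivalent.
def euler_number(v, e, f):
    return v - e + f

assert euler_number(8, 12, 6) == 2     # cube
assert euler_number(4, 6, 4) == 2      # tetrahedron
assert euler_number(16, 32, 16) == 0   # a 4x4 quadrilateral torus mesh

# Type II: Gauss's pairing. Each of the 50 pairs (1, 100), (2, 99), ...
# sums to the invariant 101, so the total is simply 50 * 101.
assert 50 * 101 == sum(range(1, 101))
```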
1.2 Main Results and Connections to Invariants
This thesis mainly covers four (somewhat independent) subjects regarding the analysis
and verification of finite state transition systems.
1. Combinationality and sequential determinism. We consider finite state transition
systems described by a set of definitions which specify the valuations of variables in
terms of functions of other variables. In the special case where a system has no
state-holding elements, definitions with an acyclic relation of information processing
induce a definite stateless, or combinational, behavior. However, the converse is not
necessarily true; definitions with a cyclic relation may induce a definite combinational
behavior. We show the most general condition under which a set of cyclic definitions
induces a combinational system at the functional level. Our analysis is legitimate
when the cyclic definitions are to be broken2 or the synthesis target is software. Our
condition admits a higher level combinationality analysis and yields more flexible
descriptions of combinational systems. Furthermore, we extend the results to finite
state transition systems to verify the determinism of their input-output behavior.
Our results are achieved through showing an invariant which exactly characterizes
combinationality.
2. Retiming and resynthesis. Transformations using retiming and resynthesis opera-
tions are among the most important and practical techniques in optimizing syn-
chronous hardware systems. We study their corresponding transformation power and
verification complexity by identifying some transformation invariants under retiming
and resynthesis. We present a constructive algorithm to determine if two given fi-
nite state machines are transformable to each other using retiming and resynthesis
operations. We show the above problem is PSPACE-complete. In addition, we study
the effect of retiming and resynthesis on initialization sequences of synchronous hard-
ware systems with implicit reset. It is known that the original initialization sequences
should be prefixed with an arbitrary input sequence of a certain length. An algorithm
is proposed to determine the length increase.
3. Equivalence checking. The above analysis was limited to verifying the restricted
2 We use the term “broken” in the sense that the definitions are rewritten such that there is no cycle of dependency in the definitions, i.e., all cyclic dependencies are broken.
equivalence of FSMs transformed under retiming and resynthesis. Here we consider
the equivalence checking of FSMs under arbitrary transformations. This is one of the
most challenging problems in VLSI design verification. Prior symbolic approaches to
the problem are based on reachability analysis over a product machine of the two finite
state machines. Two finite state machines are equivalent if the output of the product
machine is an invariant (a constant), demonstrating that no observable difference is produced throughout reachability analysis.
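As a concrete sketch of this traditional formulation (explicit-state, with hypothetical two-state machines; symbolic versions replace the set of reached pairs with a BDD), the product-machine check explores reachable state pairs and verifies that the "outputs agree" signal is an invariant:

```python
def equivalent_by_product(delta1, out1, q1_0, delta2, out2, q2_0, inputs):
    """Explicit-state product-machine equivalence check: explore reachable
    pairs (q1, q2); the machines are equivalent iff out1 == out2 holds on
    every reachable pair for every input."""
    reached, frontier = {(q1_0, q2_0)}, [(q1_0, q2_0)]
    while frontier:
        q1, q2 = frontier.pop()
        if any(out1(x, q1) != out2(x, q2) for x in inputs):
            return False                       # observable difference found
        for x in inputs:
            nxt = (delta1(x, q1), delta2(x, q2))
            if nxt not in reached:
                reached.add(nxt)
                frontier.append(nxt)
    return True

# Two encodings of the same toggle machine (hypothetical example).
enc = {'a': 0, 'b': 1}
dec = {0: 'a', 1: 'b'}
delta1 = lambda x, q: q ^ x
out1 = lambda x, q: q
delta2 = lambda x, q: dec[enc[q] ^ x]
out2 = lambda x, q: enc[q]
assert equivalent_by_product(delta1, out1, 0, delta2, out2, 'a', [0, 1])
```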
We present another possibility of verifying sequential equivalence. Rather than veri-
fying in the product state space of two state machines, we verify equivalence in their
disjoint union state space. In particular, the partition of equivalence classes over the
state space is iteratively refined. The corresponding invariant to be certified is that
initial states of the two machines remain in the same equivalence class throughout
the refinement process. The proposed approach differs from prior work in that the
verification efficiency is governed by the encountered number of equivalence classes
rather than the number of state variables. It is often more robust than reachability
analysis on the product machine.
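The refinement scheme can be sketched in explicit-state form (the thesis performs it symbolically; the machines below are hypothetical): treat both machines as one transition system over their disjoint union state space, refine the output-based partition to a fixed point, and check that the two initial states end up in the same block.

```python
from collections import defaultdict

def equivalent_by_refinement(states, delta, out, init1, init2, inputs):
    """Explicit-state sketch of equivalence checking in the disjoint union
    state space: refine the output-based partition until a fixed point;
    the machines are equivalent iff the two initial states remain in the
    same equivalence class."""
    def partition(label):
        groups = defaultdict(set)
        for q in states:
            groups[label[q]].add(q)
        return frozenset(frozenset(g) for g in groups.values())

    # Initial partition: states agreeing on outputs for every input.
    label = {q: tuple(out(x, q) for x in inputs) for q in states}
    while True:
        # Refine: a block splits if its states' successors land in
        # different blocks for some input.
        refined = {q: (label[q], tuple(label[delta(x, q)] for x in inputs))
                   for q in states}
        if partition(refined) == partition(label):
            return label[init1] == label[init2]
        label = refined

# Disjoint union of two encodings of a toggle machine (hypothetical).
states = [('A', 0), ('A', 1), ('B', 'a'), ('B', 'b')]
enc = {'a': 0, 'b': 1}
dec = {0: 'a', 1: 'b'}

def delta(x, s):
    m, q = s
    return (m, q ^ x) if m == 'A' else (m, dec[enc[q] ^ x])

def out(x, s):
    m, q = s
    return q if m == 'A' else enc[q]

assert equivalent_by_refinement(states, delta, out, ('A', 0), ('B', 'a'), [0, 1])
```

Note that the work per iteration is governed by the number of equivalence classes, mirroring the robustness claim above.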
4. Verification reduction. Abstraction is an important technique to cope with the state
explosion problem in formal verification of finite state transition systems. We focus
on a reachability-preserving reduction through functional dependency for safety prop-
erty verification. Essentially, functional dependency is an invariant characterizing the
representation redundancy of a given state transition system. Extracting functional
dependency allows us to reexpress the transition system using more compact transition
functions.
Prior derivations of functional dependency relied on reachability analysis, and thus
were computationally expensive and not scalable to large transition systems. We
propose another construction by detecting functional dependency directly from the
set of transition functions. Thus, reachability analysis is not a necessity for exploiting
dependency. In addition, the detection of functional dependency can be integrated
into reachability analysis as an on-the-fly reduction.
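A minimal explicit sketch of the core test, assuming small explicit truth tables (the thesis works symbolically): a transition function is functionally dependent on a set of other transition functions iff two state/input valuations that agree on all of the others never disagree on it.

```python
from itertools import product

def is_dependent(target, base, n_vars):
    """Combinational dependency check: `target` is a function of the
    functions in `base` iff equal base valuations never map to different
    target values. Each function takes a tuple of n_vars Boolean values."""
    seen = {}
    for v in product([0, 1], repeat=n_vars):
        key = tuple(f(v) for f in base)     # image under the base functions
        val = target(v)
        if seen.setdefault(key, val) != val:
            return False                     # same base image, different target
    return True

# Hypothetical transition functions over two state variables:
d1 = lambda v: v[0] ^ v[1]
d2 = lambda v: v[0] & v[1]
d3 = lambda v: v[0] | v[1]
assert is_dependent(d3, [d1, d2], 2)   # OR = XOR ∨ AND, so d3 is redundant
assert not is_dependent(d2, [d1], 2)   # AND is not a function of XOR alone
```

When `is_dependent` succeeds, the mapping recorded in `seen` tabulates the dependency function, so the dependent state variable can be dropped from the encoding.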
The invariants of the first three subjects are of Type I, while that of the last subject is of Type II.
1.3 Thesis Organization
The thesis is organized as follows. Common preliminaries and definitions are introduced
in Chapter 2. Chapter 3 studies the fundamental formulation of combinationality and
sequential determinism. The other three subjects to be discussed are ordered from specific
to general, with the most specific analysis of the transformation of retiming and resynthesis
in Chapter 4. The more general sequential equivalence checking is studied in Chapter
5. Verification reduction using functional dependency is discussed in Chapter 6. Finally,
concluding remarks are given in Chapter 7.
Chapter 2
Preliminaries
2.1 Notation
We use |S| to denote the cardinality (or size) of a set S. Also, suppose V1 is a set of
variables. Notation [[V1]] represents the set of all possible valuations over V1. Let V2 ⊆ V1.
For x ∈ [[V1]], we use x[V2] ∈ [[V2]] to denote the valuation over variables V2 which agrees
with x on V2.
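In explicit form (a sketch for small variable sets; this only illustrates the notation, not the symbolic machinery used later):

```python
from itertools import product

def valuations(variables):
    """[[V]]: the set of all assignments over a set of Boolean variables."""
    vs = sorted(variables)
    return [dict(zip(vs, bits)) for bits in product([0, 1], repeat=len(vs))]

def project(x, v2):
    """x[V2]: the restriction of valuation x to the variables in V2,
    which agrees with x on V2."""
    return {v: x[v] for v in v2}

V1 = {'a', 'b', 'c'}
V2 = {'a', 'c'}
assert len(valuations(V1)) == 2 ** len(V1)     # |[[V1]]| = 8
x = {'a': 1, 'b': 0, 'c': 1}
assert project(x, V2) == {'a': 1, 'c': 1}
```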
2.2 Graphs
A graph G consists of a vertex set V and an edge set E. Any edge e ∈ E connects
two vertices, say u, v ∈ V. For an undirected edge e = {u, v}, there is no ordering on u and v. For a directed edge e = (u, v), the connection goes from u to v. In this case we say that u is
the predecessor of v (or v is the successor of u) with respect to e. A graph is said to be
undirected (directed) if all of its edges are undirected (directed). A vertex is of degree n
(a non-negative integer) if it is contained in n edges. For directed graphs, the degree of a
vertex can be further distinguished: a vertex is of indegree j and outdegree k if it is the
successor vertex of j edges and the predecessor vertex of k edges, respectively.
In this thesis, the graphs we encounter are directed and contain no multi-edges between
any pair of vertices. We will be using graphs extensively to represent circuits, Boolean
functions, finite state machines, etc.
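A minimal directed-graph sketch consistent with the conventions above (no multi-edges; indegree and outdegree as defined):

```python
from collections import defaultdict

class Digraph:
    """Directed graph without multi-edges between any pair of vertices."""
    def __init__(self):
        self.succ = defaultdict(set)   # u -> set of successors of u
        self.pred = defaultdict(set)   # v -> set of predecessors of v

    def add_edge(self, u, v):
        self.succ[u].add(v)            # set semantics rule out multi-edges
        self.pred[v].add(u)

    def indegree(self, v):
        return len(self.pred[v])       # edges of which v is the successor

    def outdegree(self, v):
        return len(self.succ[v])       # edges of which v is the predecessor

g = Digraph()
for u, v in [('a', 'b'), ('a', 'c'), ('b', 'c'), ('a', 'b')]:  # dup ignored
    g.add_edge(u, v)
assert g.indegree('c') == 2 and g.outdegree('a') == 2
```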
2.3 Binary Decision Diagrams
A binary decision diagram (BDD) is a directed graphical data structure similar to
a binary tree. A vertex is either a terminal node (or leaf) representing logical value true
or false, or a nonterminal node representing a decision point. Any nonterminal node
is associated with a binary decision variable and has two outgoing edges, the then- and
else-edge, representing the two possible branches of the valuation of the decision variable.
Consequently, a BDD is capable of representing any Boolean function. A special type of
BDD gains the most attention: the reduced ordered BDD (ROBDD). A BDD is ordered
if the visited variable sequence (without repetitions) along any path from the root to a
leaf obeys some total order. An ordered BDD is reduced if no two BDD nodes represent the
same function. ROBDDs are a useful data structure, which supports efficient representation
and manipulation of Boolean functions. Two important properties of ROBDDs should be
mentioned. First, the efficiency of representing a Boolean function using an ROBDD is
strongly affected by the variable ordering. Second, given a function and a variable ordering,
the corresponding ROBDD is canonical. Due to its unique properties, the ROBDD has
pervasive applications in formal verification and logic synthesis. The reader is referred to
[Bry92] for a more detailed exposition. In the sequel, when a BDD is mentioned, we shall
mean an ROBDD.
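As an illustrative sketch (not the BDD packages used elsewhere in the thesis), a minimal ROBDD manager: a unique table enforces both reduction rules (no redundant tests, no duplicate nodes), and `apply` combines functions by Shannon expansion on the top variable in the order.

```python
class BDD:
    """A minimal ROBDD manager. Terminals: FALSE = 0, TRUE = 1."""
    def __init__(self, var_order):
        self.order = {v: i for i, v in enumerate(var_order)}
        self.unique = {}                  # (var, then, else) -> node id
        self.nodes = {0: None, 1: None}   # node id -> (var, then, else)
        self._next = 2

    def mk(self, var, t, e):
        if t == e:                        # reduction: redundant test
            return t
        key = (var, t, e)
        if key not in self.unique:        # hash-consing: canonicity
            self.unique[key] = self._next
            self.nodes[self._next] = key
            self._next += 1
        return self.unique[key]

    def var(self, v):
        return self.mk(v, 1, 0)

    def apply(self, op, f, g, memo=None):
        """Combine nodes f and g with Boolean operator op."""
        if memo is None:
            memo = {}
        if f in (0, 1) and g in (0, 1):
            return int(op(bool(f), bool(g)))
        key = (f, g)
        if key in memo:
            return memo[key]
        vf = self.nodes[f][0] if f > 1 else None
        vg = self.nodes[g][0] if g > 1 else None
        if vg is None or (vf is not None and self.order[vf] <= self.order[vg]):
            v = vf
        else:
            v = vg
        def cof(n, val):                  # cofactor of n with respect to v
            if n <= 1 or self.nodes[n][0] != v:
                return n
            _, t, e = self.nodes[n]
            return t if val else e
        r = self.mk(v,
                    self.apply(op, cof(f, True), cof(g, True), memo),
                    self.apply(op, cof(f, False), cof(g, False), memo))
        memo[key] = r
        return r

mgr = BDD(['x', 'y'])
x, y = mgr.var('x'), mgr.var('y')
AND = lambda a, b: a and b
# Canonicity: building the same function twice yields the same node id.
assert mgr.apply(AND, x, y) == mgr.apply(AND, x, y)
```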
2.4 Finite State Machines
A finite state transition system can be modelled as a finite state machine. A finite
state machine1 (FSM) M is a tuple (Q, I,Σ, Ω, ~δ, ~ω), where Q is a finite set of states,
I ⊆ Q is the set of initial states, Σ and Ω are the sets of input and output alphabets,
respectively, and ~δ : Σ × Q → Q is the transition function. For a Moore machine, the
output function ~ω : Q → Ω depends on the current state; for a Mealy machine, the output
function ~ω : Σ × Q → Ω depends on both the input and current state. (In most of our
discussions, we simply consider the Mealy machine since the extension to the Moore machine
is straightforward.) Let VS, VI, and VO be the sets of variables that encode the states, input alphabets, and output alphabets, respectively. Then Q = [[VS]], Σ = [[VI]], and Ω = [[VO]].
Given an FSM, we can construct a state transition graph, where a vertex represents a
state and a labelled edge represents a possible transition with the corresponding predicate.
In addition, vertices (for a Moore machine) or edges (for a Mealy machine) are further
labelled with the observations induced by output functions of the FSM.
1 In the thesis, we assume finite state machines are deterministic and completely specified.
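A minimal explicit-state Mealy machine consistent with the tuple definition above; the two-state parity machine is a hypothetical example:

```python
class MealyFSM:
    """A deterministic, completely specified Mealy machine
    M = (Q, I, Sigma, Omega, delta, omega)."""
    def __init__(self, states, initial, delta, omega):
        self.states, self.initial = states, initial
        self.delta = delta     # delta : Sigma x Q -> Q
        self.omega = omega     # omega : Sigma x Q -> Omega

    def run(self, q, word):
        """Feed an input word from state q; return the output word."""
        outs = []
        for x in word:
            outs.append(self.omega(x, q))  # output depends on input and state
            q = self.delta(x, q)
        return outs

# Parity machine: outputs the running XOR of the inputs seen so far.
parity = MealyFSM(states={0, 1}, initial={0},
                  delta=lambda x, q: q ^ x,
                  omega=lambda x, q: q ^ x)
assert parity.run(0, [1, 0, 1, 1]) == [1, 1, 0, 1]
```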
Chapter 3
Combinationality and Sequential
Determinism
In the course of hardware system design or real-time process control, high-level spec-
ifications may contain simultaneous definitions of concurrent modules whose information
flow forms cyclic dependencies without the separation of state-holding elements. The tem-
poral behavior of these cyclic definitions may be meant to be combinational rather than
sequential. Most prior approaches to analyzing cyclic combinational circuits were built
upon the formulation of ternary-valued simulation at the circuit level. This chapter shows
the limitation of this formulation and investigates, at the functional level, the most general
condition where cyclic definitions are semantically combinational. It turns out that the
prior formulation is a special case of our treatment. Our result admits strictly more flexible
high-level specifications. Furthermore, it allows a higher-level analysis of combinationality, and thus avoids the costly synthesis of a high-level description into a circuit netlist before combinationality analysis can be performed. With our formulation, when the target is software
implementations, combinational cycles need not be broken as long as the execution of the
underlying system obeys a sequencing execution rule. For hardware implementations, com-
binational cycles are broken and replaced with acyclic equivalents at the functional level to
avoid malfunctioning in the final physical realization.
3.1 Introduction
Cyclic definitions occur commonly in high-level system descriptions (e.g., due to resource
sharing, module composition, etc.) as was observed in [Sto92]. Checking if cyclic definitions
are semantically combinational is crucial in both hardware and software synthesis for two
reasons. First, the analysis certifies the legitimacy of cyclic definitions. Second, if the cyclic
definitions of a system are inappropriate but breakable, the analysis provides a means of
rewriting the system which breaks the cycles.
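To make the phenomenon concrete, here is a small illustrative system (not an example from the chapter): two definitions that depend on each other cyclically yet resolve to a unique value for every input, which a three-valued (0, 1, X) fixed-point simulation in the style of the circuit-level analyses discussed below can detect:

```python
X = 'X'   # the unknown value of ternary-valued simulation

def tnot(v):
    return X if v == X else 1 - v

def mux(c, t, e):
    if c == 1:
        return t
    if c == 0:
        return e
    return t if t == e and t != X else X   # pessimistic for unknown select

def combinational(defs, inputs):
    """Iterate the definitions from all-X to a least fixed point; the
    system behaves combinationally on this input iff no defined variable
    remains X."""
    vals = {v: X for v in defs}
    vals.update(inputs)
    changed = True
    while changed:
        changed = False
        for v, f in defs.items():
            nv = f(vals)
            if nv != vals[v]:
                vals[v], changed = nv, True
    return all(vals[v] != X for v in defs)

# x and y define each other cyclically, but the select signal c schedules
# the cycle so that the behavior is combinational for every input.
cyclic = {'x': lambda s: mux(s['c'], tnot(s['y']), s['a']),
          'y': lambda s: mux(s['c'], s['b'], tnot(s['x']))}
assert all(combinational(cyclic, {'a': a, 'b': b, 'c': c})
           for a in (0, 1) for b in (0, 1) for c in (0, 1))

# By contrast, x = not y, y = not x has two stable solutions and hence no
# unique combinational behavior: both variables stay X.
assert not combinational({'x': lambda s: tnot(s['y']),
                          'y': lambda s: tnot(s['x'])}, {})
```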
The analysis of cyclic combinational circuits was first formulated by Malik [Mal94],
based on ternary-valued simulation [Bry87, BS95], at the circuit (or gate) level. Subsequent
efforts [HM95, SBT96, Shi96, NK99] were built upon this formulation and bore much the
same foundation. However, the quest for solutions to analyzing combinationality remained
because the analysis at the functional level was left open.
Combinationality analysis for cyclic circuits is an essential step in the compilation of
synchronous languages [Hal93], such as Esterel [Ber00, Ber99], which allow simultaneous
cyclic definitions. Before applying Malik’s approach for static analysis1, a synchronous
program typically needs to be translated into a circuit netlist. Depending on how a program
is written, the same specification can be translated into different netlists. Because the
analysis based on ternary-valued simulation heavily depends on circuit structures (more
precisely, on the arrangement of delay elements over circuit netlists) as observed in [SBT96,
Shi96], a netlist may be declared not combinational even though there exists a functionally
1For dynamic or runtime analysis, translating a program to a netlist may be avoided, e.g., see [EL03].
equivalent one that behaves combinationally. This phenomenon corresponds to the so-called
schizophrenia problem in the compilation of Esterel programs [Ber99].
We propose a functional-level analysis that avoids the complication of translating pro-
grams into circuit netlists and eliminates the discrepancy problem of analyzing equivalent
different netlists. Essentially, our formulation of combinationality is extended to an extreme
at the functional level. That is, there exists a combinational implementation (where cyclic
definitions might need to be broken) if and only if the system in consideration passes our
combinationality test. As will be clear, ternary-valued simulation, when extended to the
functional level, is a limited special case of our formulation.
Although combinational circuits with feedback offer potential savings in area [Kau70], they are hard to manipulate and analyze, e.g., in timing analysis and logic minimization. Breaking cyclic dependencies is sometimes necessary to avoid later complications
binationality. Earlier efforts [HM95, Edw03] on breaking combinational cycles were done
for circuit netlists. In fact, there is no need to wait until circuit structures are derived. In
addition, analyzing combinationality at the functional level broadens the generality.
Our results are based upon the following principle. When cyclic definitions are to be
broken or the synthesis target is software, the combinationality analysis should be general-
ized to the extreme and performed at the highest level possible (i.e., the functional level).
On the other hand, when the target is hardware synthesis and combinational cycles are
allowed to exist, then the analysis should be conservative enough to tolerate undesirable
physical effects but general enough to abstract away unnecessary details at the appropriate
level (i.e., the circuit, or gate level). We emphasize this asymmetry in analyzing combina-
tionality, which was overlooked in prior work.
For combinationality at the circuit level, we comment on a recent development in the
synthesis of cyclic combinational circuits. Targeting area minimization, an attempt was made in [RB03b, RB03a] to synthesize cyclic combinational circuits by extending the formulation of ternary-valued simulation to the functional level. Unfortunately, it was overlooked that functional-level analysis is not sufficient to guarantee well-behaved circuitry. We show the pitfall and suggest two cures. Essentially, additional conditions beyond purely functional ones need to be applied in order to guarantee well-behaved circuitry.
The chapter is organized as follows. After preliminaries and notations are given in
Section 3.2, our formulation of combinationality at the functional level is introduced in
Section 3.3. Section 3.4 discusses some issues about combinationality at the circuit level.
Section 3.5 compares our formalism with other work. Finally, Section 3.6 concludes this
chapter and lists some future research directions.
3.2 Preliminaries
Unless otherwise noted, this chapter assumes that the variables and functions under
consideration are of type Boolean, B. Moreover, we concentrate on output-deterministic
systems, whose output valuation is uniquely determined under any input assignment and
under any current state designated by the systems’ state-holding elements.
A functional-level description of a system M consists of a set DM of atomic definitions. Each atomic definition is of the form a_j := φ_j, where a_j and φ_j are a Boolean variable and formula, respectively. In particular, for a circuit-level description of a system, φ_j could be a formula of an identity function representing a wire (with delay), or a formula of an elementary Boolean function representing a primitive logic gate (with delay) in the gate library for technology mapping. We distinguish between functional- and circuit-level descriptions. At first glance, this may seem obscure, and simply a matter of granularity. However, we distinguish these two levels by saying that the valuations of atomic definitions
take no time at the functional level, but take time at the circuit level. (We assume that every definition a_j := φ_j in DM is deterministic; that is, variable a_j valuates to a definite value under any assignment to the variables in formula φ_j. Thus, φ_j is a total Boolean function.)
Associated to DM is a definition graph G^d_M = (V^d, E^d) characterizing the information flow among the atomic definitions. A vertex v_j ∈ V^d represents the left-hand variable a_j of an atomic definition a_j := φ_j in DM. A directed edge (v_j, v_k) ∈ E^d indicates that variable a_j appears in the right-hand formula φ_k of atomic definition a_k := φ_k. The set DM of atomic definitions is cyclic if G^d_M is a cyclic graph.
Given a system M (without state-holding elements, i.e., registers or latches), three
sets of variables are distinguished: the set VI of primary-input variables, VO of primary-
output variables, and VX of all the other (internal) variables. Notice that the primary-input
variables are the definition-free variables, and vice versa. A subset VC ⊆ VX ∪VO is selected
as a cutset such that the information flow among DM becomes acyclic if VC were exposed
as primary-input variables in addition to the original ones. That is, the corresponding
vertices of the cutset variables form a feedback vertex set in GdM. It turns out that any
such VC out of VX ∪VO provides enough information in analyzing the combinationality (its
precise definition will be given later) of M at the functional level. Selecting a minimal2
cutset helps simplify the analysis. (Previous studies, e.g. [ENSS98], on computing minimum
feedback vertex sets [Kar72] can be applied.) Furthermore, as will be proved, the analysis of
combinationality is independent of the choice of VC as long as VC is minimal. With a chosen
cutset VC , the behavior of M can be captured by two sets of definitions: the definitions of
cj ∈ VC , i.e., cj := ξj , and the definitions of ok ∈ VO, i.e., ok := ωk. Here, VI and VC are
the only variable occurrences in formulae ξj ’s and ωk’s. These formulae are obtained by a
sequence of recursive substitutions of the definitions in DM until the formulae for variables
in VC ∪ VO have VI ∪ VC as the only variable occurrences. Thus, the original intermediate
2A cutset VC is minimal if removing any element from VC makes the resultant information flow cyclic.
definitions of M are collapsed away. We call ξj the excitation function of cj ∈ VC , and
ωk the observation function of ok ∈ VO. (Cutset variables here are analogous to state
variables of a state transition system, while excitation functions are analogous to transition
functions.)
Example 1 Let DM = {a := ¬xa ∨ c, b := ¬x(a ∨ ¬b) ∨ c, c := xb, y := ¬x(¬a ∨ ab) ∨ x(a¬c ∨ ¬ac)}, VI = {x}, and VO = {y}. Suppose we choose VC to be {a, b}. Then, rewriting DM with respect to VC, we have

a := ¬xa ∨ xb
b := ¬x(a ∨ ¬b) ∨ xb
y := ¬x(¬a ∨ ab) ∨ x(a¬b ∨ ¬ab)

The above right-hand formulae for a and b are the excitation functions; the formula for y is the observation function.
Combinationality analysis for cyclic definitions of systems with state-holding elements
can be approximated as follows. Expose the outputs of state-holding elements as primary
inputs; expose the inputs of state-holding elements as primary outputs. If the unreach-
able state set of the system is available, it can be used as a don’t care condition in the
combinationality analysis. Also, if the state equivalence relation is known, it can be used
as a nondeterministic flexibility in the valuation of state-holding elements. Therefore, we
mainly focus on systems without state-holding elements. The exact analysis for systems
with state-holding elements is postponed to Section 3.3.6. Unless otherwise noted, we shall
assume the systems under consideration are without state-holding elements.
If a system consists of acyclic definitions, then it is combinational. However, the converse is not true: a combinational system may have breakable cyclic definitions. Hence, only systems with cyclic definitions are of interest here. Let M be such a system with cutset
VC. Given an input assignment for M, the valuation of the cutset variables evolves with time. The evolution can be captured by state evolution graphs (SEGs), analogous to state transition graphs for state transition systems. However, unlike a state transition graph, an SEG G^e_{M,VC,ı} = (V^e_ı, E^e_ı) exists with respect to a particular fixed input assignment ı ∈ [[VI]]. An SEG is used to study how a system with cyclic definitions evolves when the primary inputs are held constant. There is no concept of an initial state. A vertex v_s ∈ V^e_ı corresponds to an intermediate valuation (or, a state) s ∈ [[VC]] of the cutset variables (in the sequel, we shall not distinguish between a vertex and the state it represents); each directed edge (v_{s1}, v_{s2}) ∈ E^e_ı corresponds to an evolution of intermediate valuations from s1 to s2. That is, s2 = ~ξ(ı, s1), where ~ξ : [[VI]] × [[VC]] → [[VC]] is the vector of excitation functions of VC. Therefore, the evolution is deterministic at the functional level. (By contrast, the evolution may be nondeterministic due to races, hazards, glitches, etc., at the circuit level.) On the other hand, the vector ~ω : [[VI]] × [[VC]] → [[VO]] of observation functions imposes a labelling over [[VC]] with respect to some ı ∈ [[VI]].
Below we define and explore some basics about SEGs.
Definition 1 A walk W of length k, denoted as len(W) = k, on an SEG G^e_{M,VC,ı} = (V^e_ı, E^e_ı) is a sequence v_{s_0}, v_{s_1}, . . . , v_{s_k} of vertices with (v_{s_{j−1}}, v_{s_j}) ∈ E^e_ı. A path is a walk without repeated vertices. A loop of length k is a walk of length k without repeated edges and with v_{s_0} = v_{s_k}.
In this chapter, we use the term “loops” for SEGs and reserve “cycles” for definition graphs.
Proposition 1 Any vertex of an SEG is in a loop and/or on a path leading to a loop.
Proof. Since the evolution is total, every state of an SEG has at least one outgoing edge. Following outgoing edges from any vertex in the finite graph must therefore eventually revisit some vertex; hence any vertex is in a loop and/or on a path leading to a loop.
Figure 3.1. (i) SEG for x = 0. (ii) SEG for x = 1.
Proposition 2 Any two loops of an SEG with deterministic evolution are disjoint.
Proof. Since every vertex of an SEG with deterministic evolution has exactly one outgoing
edge, any two loops of the SEG must be disjoint.
In the sequel, we shall assume SEGs are deterministic unless otherwise stated.
Definition 2 A loop L is stable if len(L) = 1; L is unstable if len(L) > 1.
It will be clear later why a loop’s stability is determined by its length.
Definition 3 An equilibrium loop L of an SEG G^e_{M,VC,ı} is a loop whose vertices all have the same observation label, i.e., ∀v_{s_j}, v_{s_k} ∈ L. ~ω(ı, s_j) = ~ω(ı, s_k).
Let S^ℓ_{M,VC,ı} denote the set {s ∈ [[VC]] | v_s is a vertex in a loop of G^e_{M,VC,ı}}. For S ⊆ [[VC]], let Lo(S) denote the set {~ω(ı, s) ∈ [[VO]] | s ∈ S} of observation labels.
Example 2 Continue the set DM of definitions and cutset VC of Example 1. Figure 3.1
shows the two state evolution graphs. Vertices are indexed with states, i.e., valuations of
(a, b). States are distinguished by solid and dotted circles to reflect different observation
labels induced by the observation function. The SEG of x = 0 has two loops, one stable and
the other unstable; the SEG of x = 1 has two stable loops. All of the loops are equilibrium
loops.
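The two SEGs of this example can be rebuilt explicitly. A minimal sketch follows, with `step` encoding the excitation functions and loop states recovered by iterating the forward image, which suffices because every vertex has out-degree one.

```python
from itertools import product

def step(x, s):
    """Excitation functions of Example 1 applied to state s = (a, b)."""
    a, b = s
    return (((not x) and a) or (x and b),
            ((not x) and (a or not b)) or (x and b))

def seg(x):
    """Successor map of the SEG under the fixed input assignment x."""
    return {s: step(x, s) for s in product([False, True], repeat=2)}

def loop_states(edges):
    """States lying on loops: |V| - 1 forward images of the full state set."""
    cur = set(edges)
    for _ in range(len(edges) - 1):
        cur = {edges[s] for s in cur}
    return cur

# x = 0: an unstable 2-loop {00, 01} plus the stable loop {11}.
assert loop_states(seg(False)) == {(False, False), (False, True), (True, True)}
# x = 1: two stable loops {00} and {11}.
assert loop_states(seg(True)) == {(False, False), (True, True)}
```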
3.3 Combinationality at the Functional Level
Given the functional-level description of a system M with cyclic definitions, we study the conditions under which M is combinational.
3.3.1 Formulation of Combinationality
In the functional-level formulation of combinationality, physical timing effects are ab-
stracted away by assuming that all valuations of functions are instantaneous. However, the
order of valuations matters.
A system M is said to be combinational at the functional level (or function-
ally combinational) if, under any input assignment, M would eventually (i.e., within a
bounded number of steps) evolve into a status in which the observation labelling settles to
a definite value3 independent of the initial internal state in [[VC ]]. Here, the dynamics of
M’s evolution is with respect to a cutset VC .
Theorem 3 A system M with cutset VC is combinational at the functional level if and only if, for any input assignment ı ∈ [[VI]], all states s ∈ S^ℓ_{M,VC,ı} have the same observation label ~ω(ı, s), i.e., |Lo(S^ℓ_{M,VC,ı})| = 1.
Proof. (−→) Suppose not. Then, for some input assignment, M may produce different outputs depending on the initial state in [[VC]]. In these cases, M is not combinational.

3 For simplicity, here we focus on the case where there is a unique output valuation in [[VO]] for any input assignment. Our results can be straightforwardly generalized to a set of possible output valuations.
(←−) Since every vertex of G^e_{M,VC,ı} = (V^e_ı, E^e_ı) is either in a loop or on a path leading to a loop, any possible initial state s_j evolves into a state s_k ∈ S^ℓ_{M,VC,ı} after at most |V^e_ı| − 1 steps of evolution. Since all s ∈ S^ℓ_{M,VC,ı} have the same observation label, M eventually produces a unique output under input ı. Because this is true for any input assignment, the proof follows.
Example 3 Continue Example 2. The system described by DM is functionally combinational because, in each of its two SEGs, all states in loops have the same observation label. Under input assignment x = 0, output y valuates to 1; under x = 1, y valuates to 0.
3.3.2 Computation Algorithms
Combinationality test.
From Theorem 3, we conduct a combinationality test on M using a symbolic computation (e.g., BDD-based computation) as follows. First, derive the set S^ℓ_{M,VC,ı} for all ı ∈ [[VI]] by a greatest fixed-point computation. Let Σ : [[VI]] × [[VC]] → B be the characteristic function of S^ℓ_{M,VC,ı} for all ı ∈ [[VI]]. The fixed-point computation proceeds as follows. In the initial step, let Σ(0) be the characteristic function of [[VC]] for any ı ∈ [[VI]]. In the iterative steps, to derive Σ(k+1) from Σ(k), states without predecessors are successively removed from Σ(k) by a forward image computation. That is, Σ(k+1) is computed by ∃c ∈ VC. Σ(k) ∧ Ξ and a subsequent replacement of the variables in V_C′ with their counterparts in VC, where Ξ = ⋀_j (c′_j ≡ ξ_j) is the characteristic function of the evolution relation of M under cutset VC with the newly introduced “next-state” cutset variables V_C′ = {c′_j | c_j ∈ VC}. The process terminates when Σ(m) equals Σ(m−1) for some m ≥ 1, i.e., no more states can be removed from Σ. Upon termination, Σ is the characteristic function of S^ℓ_{M,VC,ı}. Notice that, with symbolic computation, S^ℓ_{M,VC,ı} can be derived simultaneously for all ı ∈ [[VI]] since the variables in VI are not quantified out in the fixed-point computation. Second, we derive the characteristic
function Λ of Lo(S^ℓ_{M,VC,ı}) by setting Λ = ∃c ∈ VC. Ω ∧ Σ, where Ω : [[VI]] × [[VC]] × [[VO]] → B is the characteristic function Ω = ⋀_j (o_j ≡ ω_j) of the output relation of M under cutset VC. Again, this can be computed simultaneously for all ı ∈ [[VI]] since primary-input variables are not quantified out in the computation. Finally, we check if there exists an ı such that |Lo(S^ℓ_{M,VC,ı})| > 1. If the answer is positive, then M is not combinational. Otherwise, it is. The computation can be done with a SAT-solving formulation, or with a BDD-based formulation. For the latter, the computation can be performed effectively using the compatible projection operator [LN91b], cprojection. That is, M is combinational if and only if Λ equals cprojection(Λ, o), where o is an arbitrary minterm in [[VO]].
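The two phases of the test can be mimicked with explicit sets standing in for BDD characteristic functions. In this sketch, Σ lives on the joint space [[VI]] × [[VC]] so the input variables are never quantified out, and the cprojection check is replaced by a direct cardinality test per input.

```python
from itertools import product

def combinational(step, obs, n_in, n_cut):
    """Greatest fixed point: shrink Sigma by forward images until only
    loop states remain, then check one observation label per input."""
    inputs = list(product([False, True], repeat=n_in))
    states = list(product([False, True], repeat=n_cut))
    sigma = {(i, s) for i in inputs for s in states}
    while True:
        # a state survives iff it has a predecessor in sigma
        image = {(i, step(i, s)) for (i, s) in sigma}
        if image == sigma:
            break
        sigma = image
    labels = {i: set() for i in inputs}     # Lambda = exists c. Omega AND Sigma
    for (i, s) in sigma:
        labels[i].add(obs(i, s))
    return all(len(v) == 1 for v in labels.values())

# Example 1 with cutset {a, b}: functionally combinational.
step = lambda i, s: (((not i[0]) and s[0]) or (i[0] and s[1]),
                     ((not i[0]) and (s[0] or not s[1])) or (i[0] and s[1]))
obs = lambda i, s: (((not i[0]) and (not s[0] or (s[0] and s[1])))
                    or (i[0] and ((s[0] and not s[1]) or (not s[0] and s[1]))),)
assert combinational(step, obs, n_in=1, n_cut=2)
```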
The computational complexity of the combinationality test is the same as that of state traversal on the space spanned by the cutset variables. That is, the problem is PSPACE-complete in the size of the selected cutset.

Theorem 4 The problem of analyzing combinationality at the functional level is PSPACE-complete with respect to the selected cutset size.
Proof. The combinationality analysis can be done in nondeterministic polynomial space. To determine if a state s is in a loop under some input assignment, one can record two consecutive states of the state evolution trace starting from s. As this “window” slides along the trace, the recurrence of s can be checked in at most |[[VC]]| steps. In addition, one can test whether different output observation labels ever appear in the sliding windows. Hence the combinationality analysis can be achieved within space bounded by a polynomial in the cutset size.
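For deterministic evolution, the loop-membership check from this argument can be illustrated with constant extra storage, since the walk itself serves as the witness. A small sketch:

```python
def in_loop(step, s0, n_states):
    """s0 lies on a loop iff it recurs within n_states steps of the
    deterministic evolution; only the current state is stored."""
    s = step(s0)
    for _ in range(n_states):
        if s == s0:
            return True
        s = step(s)
    return False

# Example 1 under x = 0: {00, 01} form an unstable loop, 10 leads into {11}.
step0 = lambda s: (s[0], s[0] or not s[1])
assert in_loop(step0, (False, False), 4)      # on the 2-loop
assert not in_loop(step0, (True, False), 4)   # transient state
```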
On the other hand, we need to reduce a PSPACE-complete problem to the problem of combinationality analysis. The following problem can be used:

Given a total function f : {1, . . . , n} → {1, . . . , n}, is there a k such that f^k(1) = n?
It was shown [Jon75] to be deterministic4 LOGSPACE-complete in n and, thus, PSPACE-
complete in log n [Pap94]. We establish that the answer to the PSPACE-complete problem is
positive if and only if the answer to the corresponding problem of combinationality analysis
(to be constructed) is negative. Since the complexity class of nondeterministic space is
closed under complementation [Imm88], the theorem follows.
To complete the proof, given f : {1, . . . , n} → {1, . . . , n}, an excitation function ξ : {1, . . . , n} → {1, . . . , n} and an observation function ω : {1, . . . , n} → B are constructed as follows. Let ξ have the same mapping as f but with ξ(n) = 1. Also, let ω(j) = false for 1 ≤ j ≤ n − 1, and ω(n) = true. With the above construction, n is reachable from 1 under f if and only if the system defined by ξ and ω is not combinational. (Note that, since an n-valued variable can be encoded with O(log n) binary variables, multiple-valued representations fit our framework.)
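The reduction is easy to sanity-check by machine. The sketch below builds ξ and ω from a random total function f, runs the explicit-state combinationality test on the resulting input-free system, and confirms that the two answers coincide:

```python
import random

def reachable(f, n):
    """Core question: is n reachable from 1 by iterating f?"""
    seen, v = set(), 1
    while v not in seen:
        if v == n:
            return True
        seen.add(v)
        v = f[v]
    return False

def reduction_fails_test(f, n):
    """Build xi, omega as in the proof and run the combinationality test
    on the input-free system: collect loop states by n - 1 forward
    images, then compare their observation labels."""
    xi = dict(f)
    xi[n] = 1                                  # redirect n back to 1
    omega = {j: (j == n) for j in range(1, n + 1)}
    loops = set(range(1, n + 1))
    for _ in range(n - 1):
        loops = {xi[s] for s in loops}
    return len({omega[s] for s in loops}) > 1  # True iff NOT combinational

# Randomized check that the two answers coincide, as the proof claims:
random.seed(0)
for _ in range(500):
    n = random.randint(2, 8)
    f = {j: random.randint(1, n) for j in range(1, n + 1)}
    assert reduction_fails_test(f, n) == reachable(f, n)
```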
Cycle breaking.
Suppose M is combinational. From the above combinationality test, we can derive a set of equivalent acyclic definitions for M. In fact, there are at least two ways of doing so: one is to rewrite the definitions of primary-output variables as functions (as determined from the combinationality test) of primary-input variables; the other is to rewrite the definitions of cutset variables as functions of primary-input variables. An advantage of the latter is that the original definitions of M can be reused, except for the definitions c_j := φ_j, for c_j ∈ VC, and any definitions left unused as a result. The new definitions are derived as follows.
For the rewriting of the primary-output variables, the new definitions can be inferred from the input-output relation ∃c ∈ VC. Ω ∧ Σ, which has been computed in the combinationality test. For the rewriting of the cutset variables, for every ı ∈ [[VI]], some state s_ı ∈ S^ℓ_{M,VC,ı} is selected as the representative of S^ℓ_{M,VC,ı}. Then the new definitions for the cutset variables can be inferred from the relation ⋁_ı ((⋀_j (i_j ≡ ı[j])) ∧ (⋀_k (c_k ≡ s_ı[k]))), where i_j ∈ VI, c_k ∈ VC, and ı[j] (resp. s_ı[k]), which denotes the jth (resp. kth) bit of ı (resp. s_ı), is a Boolean constant of value either true or false.

4 It is a well-known fact, proved by Savitch in [Sav70], that deterministic and nondeterministic space complexities coincide.
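The second rewriting can be sketched explicitly for Example 1: pick any representative loop state per input assignment and tabulate the cutset variables as functions of the inputs alone. This is an explicit-state stand-in for reading the new definitions off the relation above.

```python
from itertools import product

def step(x, s):
    a, b = s
    return (((not x) and a) or (x and b),
            ((not x) and (a or not b)) or (x and b))

def obs(x, s):
    a, b = s
    return ((not x) and (not a or (a and b))) or \
           (x and ((a and not b) or (not a and b)))

rep = {}
for x in (False, True):
    loops = set(product([False, True], repeat=2))
    for _ in range(len(loops) - 1):              # keep only loop states
        loops = {step(x, s) for s in loops}
    rep[x] = min(loops)                          # any representative works

# New acyclic definitions: a := rep[x][0], b := rep[x][1]. Because the
# system is combinational, the output is unchanged for every input:
for x in (False, True):
    assert obs(x, rep[x]) == (x is False)        # y = 1 under x = 0, else 0
```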
3.3.3 Generality Analysis
Theorem 5 Let VC1 and VC2 be two choices of minimal cutsets for a system M with cyclic definitions. Then, under any input assignment ı ∈ [[VI]], there exists a bijection between the loops of G^e_{M,VC1,ı} and those of G^e_{M,VC2,ı}.
Proof. First observe that, since both VC1 and VC2 are cutsets, under a specific input assignment ı ∈ [[VI]], the variables in VC1 can be expressed as functions of the variables in VC2, and vice versa. Thus, there exist mappings f21 : [[VC1]] → [[VC2]] and f12 : [[VC2]] → [[VC1]] such that, for a valuation s of VC1 (resp. t of VC2), f21(s) (resp. f12(t)) is the corresponding valuation of the VC2 (resp. VC1) variables. In addition, since VC1 and VC2 are minimal, we have f12(f21(s)) = ~ξ1(ı, s) and f21(f12(t)) = ~ξ2(ı, t), where ~ξ1 and ~ξ2 are the vectors of excitation functions of M with cutsets VC1 and VC2, respectively.

To see the relation between the loops of G^e_{M,VC1,ı} and those of G^e_{M,VC2,ı}, consider a state evolution sequence σ1 = s1, . . . , sj, . . . , sk over [[VC1]] such that sk is the first recurrent state in σ1, with sk = sj. Clearly, σ2 = f21(s1), . . . , f21(sj), . . . , f21(sk) is a state evolution sequence over [[VC2]] because f21(f12(f21(s))) = ~ξ2(ı, f21(s)). Now, we need to show that f21(sk) is the only recurrent state in σ2, with f21(sk) = f21(sj). For contradiction, suppose there exists another recurrent state in σ2 such that f21(sm) = f21(sl), l < m < k. However, this implies sm+1 = sl+1 in σ1 because f12(f21(sm)) = f12(f21(sl)). This contradicts the assumption that sk is the first recurrent state in σ1 unless m = k − 1 and l = j − 1. Similarly, one can show
that, given a state evolution sequence of [[VC2]] with a loop, there exists a corresponding sequence of [[VC1]] with a loop. Also, by Propositions 1 and 2, there exists a bijection between the loops of G^e_{M,VC1,ı} and those of G^e_{M,VC2,ı}.
Corollary 6 A system’s combinationality at the functional level is independent of the
choice of minimal cutset in the analysis.
Proof. Let VC1 and VC2 be two choices of minimal cutsets, and ~ω1 : [[VI]] × [[VC1]] → [[VO]] and ~ω2 : [[VI]] × [[VC2]] → [[VO]] be the resultant vectors of observation functions. Let f21 and f12 be the mappings defined in the proof of Theorem 5. Then, ~ω1(ı, s1) = ~ω2(ı, f21(s1)) and, similarly, ~ω1(ı, f12(s2)) = ~ω2(ı, s2), for any ı ∈ [[VI]], s1 ∈ [[VC1]], and s2 ∈ [[VC2]]. In addition to the result of Theorem 5, we need to show that all corresponding loops of G^e_{M,VC1,ı} and G^e_{M,VC2,ı} have the same output observation for all ı ∈ [[VI]].

Suppose M is combinational under an analysis with cutset VC1. Then, all the states in S^ℓ_{M,VC1,ı} must have the same output observation label, say o1 ∈ [[VO]]. For the sake of contradiction, assume there exists a state s2 ∈ S^ℓ_{M,VC2,ı} with ~ω2(ı, s2) ≠ o1. This implies ~ω1(ı, f12(s2)) ≠ o1. Since f12(s2) is in S^ℓ_{M,VC1,ı}, this contradicts the assumption that all the states in S^ℓ_{M,VC1,ı} have observation label o1. Hence, all the states in S^ℓ_{M,VC2,ı} must have the same observation label o1 as well. The corollary follows.
Notice that the result holds even when the cutset changes dynamically.
Assuming a system M operates without a special pre-initialization, our combinationality analysis at the functional level is the most general formulation one can hope for, in the sense that

Theorem 7 There exists a feasible combinational implementation of M if and only if M satisfies our combinationality test.
Proof. (−→) Suppose that M fails our combinationality test. It implies that there exists
some input assignment such that the corresponding output valuation cannot settle to a
unique value. This violates the definition of combinationality.
(←−) Trivial.
3.3.4 Conditions of Legitimacy
The legitimacy of our combinationality formulation is confirmed if a system’s cyclic
definitions are to be broken in the final realization. However, if some cyclic definitions are
to be maintained in the final realization, then the certification of combinationality at the
functional level of abstraction is not sufficient to guarantee correctness. Essentially, two
restrictions need to be imposed to ensure the correctness. First, all excitation functions
should be valuated synchronously such that state evolutions follow the combinationality
analysis. Second, the time interval between two consecutive input assignments should be
much larger than the time spent on internal valuations such that the state of the system has
enough time to evolve to an equilibrium loop. Certainly, the first restriction is hard to satisfy in hardware realizations of cyclic definitions, due to undesirable effects such as races, hazards, and glitches. In contrast, software realization is more suitable, since the above effects can be eliminated. A possible application domain is software synthesis for reactive systems, where the common assumption is that internal computations are much faster than environmental responses. Hence, the second restriction is satisfied under this assumption.
3.3.5 Stable Cyclic Dependencies
Although maintaining cyclic definitions is legitimate for software synthesis, it may be
undesirable if SEGs contain unstable loops. Unstable loops result in persistent updates of
state information (even though observation functions have settled to definite values), and
hence consume power.5 To avoid this persistent power consumption, we require that all the loops in SEGs be stable. To make a system with unstable loops stable, the definitions of the system M under consideration should be rewritten. Such rewrites can be done in various ways. For instance, an unstable loop L can be broken by redirecting the evolution of a state in L to itself or to another state not leading to an unstable loop. Note that replacing cyclic definitions with acyclic equivalents is just a special case of such rewrites.
On the other hand, a rewrite is not necessary if all loops are stable. One can devise an algorithm to test if a system M with cutset VC is stably combinational at the functional level. Essentially, M is stably combinational if and only if, for any input assignment, any state s ∈ [[VC]] can reach (i.e., evolve to) a self-looped state. Let Σ : [[VI]] × [[VC]] → B be the characteristic function denoting the set of states that can reach self-looped states with respect to some input assignment. Then the algorithm can be outlined as follows. First, compute the set of self-looped states of G^e_{M,VC,ı} for all ı ∈ [[VI]] using the characteristic function ⋀_j (c_j ≡ ξ_j), where c_j ∈ VC is a cutset variable and ξ_j : [[VI]] × [[VC]] → B is an excitation function in ~ξ. Second, let Σ(0) = ⋀_j (c_j ≡ ξ_j) initially, and perform the standard backward reachability analysis (except that the variables in VI are not quantified out). That is, in the iterative steps, we derive Σ(k+1) from Σ(k). Let Σ′(k) be Σ(k) with the variables in VC replaced by their counterparts in V_C′, the “next-state” cutset variables. Then, Σ(k+1) = Σ(k) ∨ ∃c′ ∈ V_C′. Σ′(k) ∧ Ξ, where Ξ = ⋀_j (c′_j ≡ ξ_j) is the characteristic function of the evolution relation of M under cutset VC. The iteration terminates when Σ(m) equals Σ(m−1) for some m ≥ 1, i.e., no more states can be added to Σ. Using a symbolic approach, the computation is done for all input assignments simultaneously since the variables in VI are not quantified out during the fixed-point computation. The system is stably combinational if and only if the final Σ is a tautology.
5 Although it might be possible to implement some way of detecting when the output has settled, so as to stop this evaluation, it would seem to be expensive.
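The backward fixed point above can be sketched with explicit sets; inputs are enumerated here rather than kept symbolic, and `step` is an assumed name for the deterministic evolution.

```python
from itertools import product

def stably_combinational(step, n_in, n_cut):
    """Backward reachability from self-looped states, per input assignment:
    stably combinational iff every state can reach a self-looped state."""
    for i in product([False, True], repeat=n_in):
        states = set(product([False, True], repeat=n_cut))
        sigma = {s for s in states if step(i, s) == s}      # Sigma(0)
        while True:
            grown = sigma | {s for s in states if step(i, s) in sigma}
            if grown == sigma:
                break
            sigma = grown
        if sigma != states:          # final Sigma must be a tautology
            return False
    return True

# Example 1 is combinational but NOT stably combinational: under x = 0
# the states 00 and 01 ride an unstable 2-loop forever.
step = lambda i, s: (((not i[0]) and s[0]) or (i[0] and s[1]),
                     ((not i[0]) and (s[0] or not s[1])) or (i[0] and s[1]))
assert not stably_combinational(step, 1, 2)
assert stably_combinational(lambda i, s: s, 1, 2)   # all self-loops: stable
```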
In addition to the stability requirement, one may want to bound the maximum length of
evolution paths to equilibrium loops. The number of iterations spent in a combinationality
test corresponds to the length of the longest evolution path(s). If the length is greater than
the upper bound, say n, state evolutions need to be redirected to shorten long paths. One
approach would be to memorize the newly removed state sets for every n− 1 iterations in
the combinationality test. After the test, redirect the evolutions of the memorized states to
proper equilibrium loops.
3.3.6 Input-Output Determinism of State Transition Systems
We extend combinationality analysis to the set DM of cyclic definitions of a system M with state-holding elements. Note that DM contains only simultaneous definitions and thus excludes the delayed definitions of the state-holding elements. Let VI and VO be the sets of primary-input and primary-output variables, respectively. Also, let S (resp. S′) be the set of output (resp. input) variables of the state-holding elements, and VC be a cutset of DM. We distinguish two types of states: external states [[S]] are those designated by the state-holding elements; internal states [[VC]] are those emerging from the cyclic definitions. Also, the terms “transition” and “evolution” are used to differentiate the dynamics of external and internal states, respectively.
Our objective here is to analyze whether the cyclic definitions of M can be replaced
with acyclic ones such that the sequential behavior of M remains unchanged. Essentially,
such a substitution is possible if and only if M has deterministic input-output behavior6.
As mentioned in Section 3.2, the inputs and outputs of the state-holding elements can
be treated as primary outputs and primary inputs, respectively, of the set of the cyclic
6 Nevertheless, state transitions may be nondeterministic due to the cyclic definitions. Since internal states in loops of an SEG may have different observation labels induced by S′, these observation labels constitute the possible next external states. Hence, state transitions are nondeterministic in general.
definitions. However, a direct combinationality test on the cyclic definitions only yields an approximate analysis because it requires the valuations of S′ to be deterministic.
To achieve an exact analysis, with the above input and output transformation, reachability analysis (for external states) and combinationality analysis (for internal states) should be performed alternately. Two conditions need to be satisfied. First, under any input assignment ı ∈ [[VI]] and any reachable state s ∈ [[S]], the set S^ℓ_{M,VC,(ı,s)} of all internal states in loops of the corresponding SEG G^e_{M,VC,(ı,s)} must have the same observation label induced by VO, i.e., |Lo(S^ℓ_{M,VC,(ı,s)})| = 1. Second, under any input assignment ı and any reachable state s, the corresponding next (external) states must be sequentially equivalent. This set of next states is derived from the set of observation labels induced by S′ over S^ℓ_{M,VC,(ı,s)}.
A detailed computation is outlined as follows. Let R(j) be the reached state set for the state-holding elements at the jth iteration, and let R(0) ⊆ [[S]] be the initial state set. In the jth iteration, we perform the combinationality analysis detailed in Section 3.3.2 to certify that DM is combinational with respect to VO for any ı ∈ [[VI]] and s ∈ R(j). (If the certification fails, M is not deterministic in its input-output behavior and the procedure aborts.) The combinationality analysis also gives us the set S^ℓ_{M,VC,(ı,s)} for all ı ∈ [[VI]] and s ∈ R(j). From it, we obtain the set of next states under ı and s by computing the set of observation labels induced by S′ over S^ℓ_{M,VC,(ı,s)}. Denote this set of next states by N(j)_{ı,s}. Then, R(j+1) = R(j) ∪ {N(j)_{ı,s} | ı ∈ [[VI]], s ∈ R(j)}. The iterations terminate when R(k+1) = R(k) for some k. At this point, we need one more step to conclude whether DM can be rewritten with acyclic definitions. The answer is affirmative if and only if |Lo(N(j)_{ı,s})| = 1, for j = 0, . . . , k − 1, and for any ı ∈ [[VI]], s ∈ R(j). The rewriting procedure is similar to what was described in Section 3.3.2.
The corresponding computational complexity is PSPACE-complete in the number of
state-holding elements and the cutset size. The PSPACE-completeness is immediate from
the fact that the input-output determinism problem of state transition systems is in PSPACE and is at least as hard as the PSPACE-complete problem of combinationality analysis shown in Theorem 4.
3.4 Combinationality at the Circuit Level
Combinationality analysis at the functional level abstracts away timing information.
Certainly, it does not guarantee the feasibility of maintaining cyclic definitions in final
circuit implementations. On the other hand, Malik’s formulation based on ternary-valued
simulation turns out to be the right formulation at the circuit level. In his formulation,
effectively, all gates and wires are sources of uncertain delay7. Under the up-bounded inertial
delay model [BS95], ternary-valued simulation can be treated as an operational definition
of combinationality for cyclic circuits [SBT96, Shi96].
3.4.1 Synthesis of Cyclic Circuits
A recent attempt [RB03b, RB03a] at synthesizing cyclic circuits brings the formulation of ternary-valued simulation up to the functional level. Combinationality was checked with recursive marginal operations [RB03a]. However, it was overlooked that functional-level analysis by itself is not sufficient to guarantee the correctness of the final circuit implementation. Consider the following cyclic definitions over primary-input variables a and b:
f := ¬ah ∨ ¬b¬h
g := ¬a¬bf
h := ab ∨ ¬g
7 This timing assumption is very conservative in the sense that no asynchronous circuits can ever exist under this assumption if initialization is not allowed.
Figure 3.2. (i) The original circuit. (ii) The induced circuit under input assignment a = 0 and b = 0.
The reader can verify that the above definitions are functionally combinational under any
input assignment. (Indeed, under the analysis with recursive marginal operations, the
cyclic definitions are combinational.) However, functional analysis does not guarantee a
well-behaved circuit implementation. Consider the circuit netlist in Figure 3.2 (i) as an
implementation of the above cyclic definitions. The circuit may not be well-behaved. To
see this, consider input assignment a = 0 and b = 0. Assume the resultant induced circuit
is abstracted to that in Figure 3.2 (ii), where all the gates have one-unit delay and all the
wires have zero delay. Now, suppose the previous input assignment is a = 1 and b = 1 before
assignment a = 0 and b = 0. That is, internal signals x and y in Figure 3.2 (ii) are of value
0 initially. An examination shows that the circuit oscillates despite its combinationality
at the functional level. Essentially, the failure originates from the fact that some gates
and wires are not fully characterized in the analysis. Hence, functional-level analysis is not
sufficient to conclude the correctness of the gate-level implementation.
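The functional-level claim above can be checked by brute force: for each input assignment, enumerate all valuations of the cyclic signals and count those consistent with all three definitions simultaneously. The following Python sketch uses our own encoding of the definitions; the function name is hypothetical.

```python
from itertools import product

def consistent_valuations(a, b):
    """All (f, g, h) valuations satisfying the three cyclic definitions
    f := ~a h | ~b ~h,  g := ~a ~b f,  h := a b | ~g
    simultaneously under the input assignment (a, b)."""
    sols = []
    for f, g, h in product([False, True], repeat=3):
        if (f == ((not a and h) or (not b and not h)) and
                g == (not a and not b and f) and
                h == ((a and b) or not g)):
            sols.append((f, g, h))
    return sols

# Every input assignment yields exactly one consistent valuation, so the
# definitions are combinational at the functional level.
for a, b in product([False, True], repeat=2):
    assert len(consistent_valuations(a, b)) == 1
```

Note that under a = 0, b = 0 the unique solution relies on the valuation f = h ∨ ¬h = true, which is exactly the Boolean axiom that a delay-afflicted gate-level implementation need not honor transiently.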
Two approaches can be applied to rectify the deficiency in the analysis proposed by
[RB03b, RB03a]. One is to remove axioms x ∨ ¬x = true and x ∧ ¬x = false from the
recursive marginal operations when x is not a primary-input variable. The other is to add
more terms to functional expressions such that, for every input assignment, cyclic definitions
are broken for some functions valuating to either true or false purely depending on the
input assignment. For instance, in the previous example, product term ¬a¬b needs to be
added to the definition of f , i.e., f := ¬ah ∨ ¬b¬h ∨ ¬a¬b. For the second rectification, one
should be careful in any subsequent circuit optimization; the added terms should not be
removed without special care. Note that the necessity of adding some rectification terms
may nullify area gains claimed in [RB03a] due to allowing cyclic combinational circuits.
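The effect of the rectification term can be verified directly: under a = 0, b = 0 the added product term forces f to true from the inputs alone, independent of the feedback signal h. A small sketch (the function names are ours, for illustration):

```python
def f_original(a, b, h):
    # f := ~a h | ~b ~h
    return (not a and h) or (not b and not h)

def f_rectified(a, b, h):
    # f := ~a h | ~b ~h | ~a ~b  (rectification term ~a ~b added)
    return f_original(a, b, h) or (not a and not b)

# Under a = b = 0, f_original is true only via the axiom h | ~h = true,
# which involves the feedback signal h; f_rectified is true via a product
# term mentioning inputs only, so the cyclic definition is broken by the
# input assignment alone.
assert all(f_rectified(False, False, h) for h in (False, True))
# The rectification does not change the function elsewhere.
assert all(f_rectified(a, b, h) == f_original(a, b, h)
           for a in (True,) for b in (False, True) for h in (False, True))
```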
3.5 Related Work
3.5.1 SEG vs. GMW
Our SEG formalism is closely related to the general multiple winner (GMW) analysis
[BS95], which is commonly used in the analysis of asynchronous circuits. (Under the up-
bounded inertial delay model, GMW analysis is equivalent to ternary-valued simulation.)
To reason about the behavior of asynchronous circuits under physical effects such as glitches,
hazards, races, etc., the GMW analysis builds graphs similar to SEGs with additional non-
deterministic evolutions. Depending on how the current state and next state are coded, an
evolution branches out into several nondeterministic ones. Also, unlike an SEG existing for
a fixed input assignment, the graph built by the GMW analysis is connected for different
input assignments. These additional evolutions make GMW analysis a complicated pro-
cedure. Even worse, the GMW analysis declares a state variable for every delay element
(possibly, a gate or wire). In comparison, only a minimal cutset needs to be chosen in
our combinationality analysis. Therefore, the state space is substantially reduced for SEG
analysis. Under the legitimacy conditions in Section 3.3.4, all of the above complications in
the GMW analysis can be avoided, and the analysis simplifies to our SEG formalism.
3.5.2 Combinationalities
At the functional level, we contrast our formulation of combinationality with prior work
based on ternary-valued simulation. In the case where all valuations in the set of cyclic
definitions must stabilize, the functional-level extension of Malik’s formulation can be sum-
marized as follows. For any input assignment ı ∈ [[VI ]], there exists a set of definitions
valuating to either true or false such that the cyclic definitions are broken. This requirement
corresponds to the condition that, for every input assignment, the corresponding SEG has a single
stable loop, and all states of the SEG evolve directly (in one step) to this loop. (An SEG
with multiple stable loops corresponds to what was considered as having nondeterministic
multiple solutions; an SEG with an unstable loop, i.e., a loop with length greater than one,
corresponds to what was considered as having no consistent solution.) In comparison, our
formulation is much more general because SEGs are allowed to have multiple loops, which
can be stable or unstable, and to have long evolution paths.
Now consider a more relaxed case where signals are allowed to oscillate for some in-
put assignment as long as all output valuations are uniquely determined under this input
assignment regardless of internal states. To see how Malik’s formulation corresponds, we
partition input assignments into two sets: one with outputs fully determined, and the other
partially determined. Under the former set of input assignments, no restrictions need to be
imposed on SEGs, just like in our formulation. Under the latter set of input assignments,
however, the restrictions discussed in the case where all valuations must stabilize need to
be imposed. Although the generality is enhanced in the relaxed case, the combinationality
based on ternary-valued simulation is still a limited formulation. In comparison, our for-
mulation is strictly more general. In fact, it is the most general formulation that one can
hope for as stated in Theorem 7.
Example 4 Continuing Example 1, the observation function is only partially determined
under any input assignment. It is not hard to see that system M specified by DM is not
combinational under the functional-level extension of Malik’s combinationality formulation,
contrary to our combinationality analysis.
3.5.3 Sequential Extensions
In his thesis [Shi96], Shiple extended the analysis of combinational cycles for circuits
with state-holding elements. He defined sequential output-stability, which allows a circuit
to be initialized to some stable states and considers only initialized behavior. The GMW
analysis was adopted to replace ternary-valued simulation such that nondeterministic in-
ternal states were admitted to exist as long as the observable behavior is unaffected. An
equivalent acyclic circuit can be generated from the GMW analysis. Again, if the objective
is software synthesis or to break cyclic definitions, such sequential extension can be gen-
eralized substantially and simplified to our computation outlined in Section 3.3.6 without
resorting to the complicated GMW analysis.
3.6 Summary
We showed that, when cyclic definitions are to be broken in the final realization, the
formulation of combinationality can be much more general than previous formulations.
In addition, the analysis can be done at a higher abstraction level, i.e., the
functional level. The combinationality formulation is extended to an extreme — a system
is combinationally implementable if and only if it passes our combinationality test. When
cyclic definitions are to be maintained, we examine the legitimacy condition of our formula-
tion. It turns out that software synthesis of reactive systems may be an application domain,
where cyclic definitions can be maintained in the final realization. In addition, we show
that our analysis is independent of the choice of cutsets. Our results admit strictly more
flexible high-level specifications in hardware/software system design. For combinationality
at the circuit level, we comment on a pitfall in a recent attempt at synthesizing cyclic
circuits for area minimization. Two approaches are given to rectify the deficiency.
Chapter 4
Retiming and Resynthesis
Transformations using retiming and resynthesis operations are the most important and
practical (if not the only) techniques used in optimizing synchronous hardware systems.
Although these transformations have been studied extensively for over a decade, questions
about their optimization capability and verification complexity have not been fully answered.
Resolving these questions may be crucial in developing more effective synthesis and verification
algorithms.
This chapter settles the above two open problems. The optimization potential is resolved
through a constructive algorithm which determines if two given finite state machines (FSMs)
are transformable to each other via retiming and resynthesis operations. Verifying the
equivalence of two FSMs under such transformations when the transformation history is lost
is proved to be PSPACE-complete and hence just as hard as general equivalence checking,
contrary to a common belief. As a result, we advocate a conservative design methodology
for the optimization of synchronous hardware systems to ameliorate verifiability.
Our analysis reveals some properties about initializing FSMs transformed under retim-
ing and resynthesis. On the positive side, a lag-independent bound on the length increase
of initialization sequences for FSMs under retiming is established. This allows a simpler
incremental construction of initialization sequences compared to prior approaches. On the
negative side, we show that there is no analogous transformation-independent bound when
resynthesis and retiming are iterated. Fortunately, an algorithm computing the exact length
increase is presented.
4.1 Introduction
Retiming [LS83, LS91] is an elementary yet effective technique in optimizing syn-
chronous hardware systems. By simply repositioning registers, it is capable of rescheduling
computation tasks in an optimal way subject to some design criteria. As both an ad-
vantage and a disadvantage, retiming preserves the circuit structure of the system under
consideration. It is an advantage in that it supports incremental engineering change with
good predictability, and a disadvantage in that the optimization capability is somewhat
limited. Therefore, resynthesis [Mal90, DM91, MSBSV91] was proposed to be combined
with retiming, allowing modification of circuit structures. This combination of retiming
and resynthesis certainly extends the optimization power of retiming, but to what extent
remains an open problem, even though some notable progress has been made since [Mal90],
e.g., [Ran97, RSSB98, ZSA98]. Fully resolving this problem is crucial in understanding the
complexity of verifying the equivalence of systems transformed by retiming and resynthesis
and in constructing correct initialization sequences. In fact, despite its effectiveness, retim-
ing and resynthesis is not widely used in hardware synthesis flows due to the verification
hindrance and the initialization problem. Progress in these areas could enhance the prac-
ticality and application of retiming and resynthesis, and advance the development of more
effective synthesis and verification algorithms.
This chapter tackles three main problems regarding retiming and resynthesis:
Optimization power: What is the transformation power of retiming and resynthesis?
How can we tell if two synchronous systems are transformable to each other with
retiming and resynthesis operations?
Verification complexity: What is the computational complexity of verifying if two syn-
chronous systems are equivalent under retiming and resynthesis?
Initialization: How does the transformation of retiming and resynthesis affect the initial-
ization of a synchronous system? How can we correct initialization sequences?
Our main results include
• (Section 4.3) Characterize constructively the transformation power of retiming and resyn-
thesis.
• (Section 4.4) Prove the PSPACE-completeness of verifying the equivalence of systems
transformed by retiming and resynthesis operations when the transformation history
is lost.
• (Section 4.5) Demonstrate the effects of retiming and resynthesis on the initialization
sequences of synchronous systems. Present an algorithm correcting initialization se-
quences.
The chapter is organized as follows. After Section 4.2 introduces some preliminaries
and notation, our main results are presented in Sections 4.3, 4.4, and 4.5. In Section 4.6,
a closer comparison with prior work is detailed. Section 4.7 concludes this chapter and
outlines some future research directions.
4.2 Preliminaries
In this chapter, to avoid later complications we shall not restrict ourselves to binary
variables and Boolean functions. Thus, we assume that variables can take values from
arbitrary finite domains, and similarly functions can have arbitrary finite domains and co-
domains. When (co)domains are immaterial in the discussion, we shall omit specifying
them.
Synchronous hardware systems.
Based on [LS83], a syntactical definition of synchronous hardware systems can be for-
mulated as follows. A hardware system is abstracted as a directed graph, called a com-
munication graph, G = (V,E) with typed vertices V and weighted edges E. Every vertex
v ∈ V represents either the environment or a functional element. The vertex representing
the environment is the host, which is of type undefined; a vertex is of type ~f if the functional
element it represents is of function ~f (which can be a multiple-output function consisting
of f1, f2, . . .). Every edge e〈w〉 = (u, v)〈w〉 ∈ E with a nonnegative integer-valued weight
w corresponds to the interconnection from vertex u to vertex v interleaved by w state-
holding elements (or registers). (From the viewpoint of hardware systems, any component
in a communication graph disconnected from the host is redundant. Hence, in the sequel,
we assume that a communication graph is a single connected component.) A hardware
system is synchronous if, in its corresponding communication graph, every cycle contains
at least one positive-weighted edge. This chapter is concerned with synchronous hardware
systems whose registers are all triggered by the same clock ticks. Moreover, according to
the initialization mechanism, a register can be reset either explicitly or implicitly. For reg-
isters with explicit reset, their initial values are determined by some reset circuitry when
the system is powered up. In contrast, for registers with implicit reset, their initial values
can be arbitrary, but can be brought to an identified set of states (i.e. the set of initial
states1) by applying some input sequences, the so-called initialization (or reset) sequences
[Pix92]. It turns out that explicit-reset registers can be replaced with implicit-reset ones
plus some reset circuitry [MSBSV91, SMB96]. (Doing so admits a more systematic treat-
ment of retiming synchronous hardware systems because retiming explicit-reset registers
needs special attention to maintain equivalent initial states.) Without loss of generality,
this chapter assumes that all registers have implicit reset. In addition, we are concerned
with initializable systems, that is, there exist input sequences which bring the systems from
any state to some set of designated initial states.
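The synchrony condition above (every cycle contains at least one positive-weighted edge) amounts to checking that the subgraph of zero-weight edges is acyclic. A minimal sketch, under our own graph encoding as (u, v, w) edge triples:

```python
def is_synchronous(vertices, edges):
    """A communication graph is synchronous iff its zero-weight subgraph
    is acyclic; then every cycle contains a positive-weight edge."""
    succ = {v: [] for v in vertices}
    for u, v, w in edges:
        if w == 0:
            succ[u].append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in vertices}

    def acyclic_from(u):          # DFS cycle detection on zero-weight edges
        color[u] = GRAY
        for v in succ[u]:
            if color[v] == GRAY:
                return False      # back edge: a zero-weight cycle
            if color[v] == WHITE and not acyclic_from(v):
                return False
        color[u] = BLACK
        return True

    return all(color[v] != WHITE or acyclic_from(v) for v in vertices)

# host -> f -> g -> host with one register on (g, host): synchronous.
assert is_synchronous(["host", "f", "g"],
                      [("host", "f", 0), ("f", "g", 0), ("g", "host", 1)])
# Removing the register leaves a zero-weight cycle: not synchronous.
assert not is_synchronous(["host", "f", "g"],
                          [("host", "f", 0), ("f", "g", 0), ("g", "host", 0)])
```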
The semantical interpretation of synchronous hardware systems can be modelled as finite
state machines. To uniquely construct an FSM from a communication graph G = (V,E),
we divide each edge (u, v)〈w〉 ∈ E into w + 1 edges separated by w registers and connected
with the two end-vertices u and v. We then associate the outgoing (incoming) edges of
registers with current-state variables VS (next-state variables VS′); associate the outgoing
(incoming) edges of the host with variables VI (VO). All other edges are associated with
internal variables. The transition and output functions are obtained, starting from VS′ and
VO, respectively, by a sequence of recursive substitutions of variables with functions of their
input functional elements until the functions depend only on variables VI ∪ VS .
We define a strong form of state equivalence which will govern the study of the trans-
formation power of retiming.
Definition 4 Given an FSM M = (Q, I,Σ,Ω, ~δ, ~ω), two states q1, q2 ∈ Q are immedi-
ately equivalent if ~δ(σ, q1) ≡ ~δ(σ, q2) and ~ω(σ, q1) ≡ ~ω(σ, q2) for any σ ∈ Σ.
Also, we define dangling states inductively as follows.
1 When referring to “initial states,” we shall mean the starting states of a system after initialization.
Definition 5 Given an FSM, a state is dangling if either it has no predecessor state or
all of its predecessor states are dangling. All other states are non-dangling.
Retiming.
A retiming operation over a synchronous hardware system consists of a series of atomic
moves of registers across functional elements in either a forward or backward direction.
(The relocation of registers is crucial in exploring optimal synchronous hardware systems
with respect to various design criteria, such as area, performance, power, etc. As this is not our
focus, an exposition of retiming from the optimization perspective is omitted in this chapter.
Interested readers are referred to [LS91].) Formally speaking, retiming can be described
with a retime function [LS83] over a communication graph as follows.
Definition 6 Given a communication graph G = (V,E), a retime function ρ : V → Z
maps each vertex to an integer, called the lag of the vertex, such that w + ρ(v)− ρ(u) ≥ 0
for any edge (u, v)〈w〉 ∈ E. If ρ(host) = 0, ρ is called normalized; otherwise, ρ is
unnormalized.
Given a communication graph G = (V,E), any retime function ρ over G uniquely determines
a “legally” retimed communication graph G† = (V,E†) in which (u, v)〈w〉 ∈ E if, and only
if, (u, v)〈w + ρ(v)− ρ(u)〉 ∈ E†. It is immediate that the retime function −ρ reverses the
retiming from G† to G.
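Definition 6 and the reversal property can be exercised concretely. The sketch below (our own encoding; the graph and lags are made up for illustration) applies a retime function edge-wise and checks legality:

```python
def retime(edges, rho):
    """Apply a retime function rho (vertex -> lag) to a communication
    graph given as (u, v, w) edge triples; each weight becomes
    w + rho[v] - rho[u].  Raises if rho is illegal, i.e., some retimed
    weight would be negative."""
    retimed = []
    for u, v, w in edges:
        w2 = w + rho[v] - rho[u]
        if w2 < 0:
            raise ValueError("illegal retime function on edge (%s, %s)" % (u, v))
        retimed.append((u, v, w2))
    return retimed

edges = [("host", "f", 0), ("f", "g", 0), ("g", "host", 1)]
rho = {"host": 0, "f": 1, "g": 1}          # normalized: rho(host) = 0

g_dagger = retime(edges, rho)              # register moves backward across g, f
assert g_dagger == [("host", "f", 1), ("f", "g", 0), ("g", "host", 0)]

# The retime function -rho reverses the retiming from G-dagger to G ...
assert retime(g_dagger, {v: -l for v, l in rho.items()}) == edges
# ... and offsetting rho by a constant yields an equivalent retime
# function (Proposition 8): the offset cancels in w + rho(v) - rho(u).
assert retime(edges, {v: l + 5 for v, l in rho.items()}) == g_dagger
```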
Retime functions can be naturally classified by calibrating their equivalences as follows.
Definition 7 Given a communication graph G, two retime functions ρ1 and ρ2 are equiv-
alent if they result in the same retimed communication graph.
Proposition 8 Given a retime function ρ with respect to a communication graph, offsetting
ρ by an integer constant c results in an equivalent retime function.
Hence any retime function can be normalized. This equivalence relation, which will be useful
in the study of the increase of initialization sequences due to retiming, induces a partition
over retime functions. Equivalent retime functions (with respect to some communication
graph) form an equivalence class.
Proposition 9 Given a communication graph G, any equivalence class of retime functions
is of infinite size; any equivalence class of normalized retime functions is of size either one
or infinity (only when G contains components disconnected from the host). Furthermore,
any equivalence class of retime functions has a normalized member.
Resynthesis.
A resynthesis operation over a function f rewrites the syntactical formula structure of
f while maintaining its semantical functionality. Clearly, the set of all possible rewrites is
infinite (but countable, namely, with the same cardinality as the set N of natural numbers).
When a resynthesis operation is performed upon a synchronous hardware system, we shall
mean that the transition and output functions of the corresponding FSM are modified in
representations but preserved in functionalities. This modification in representations will
be reflected in the communication graph of the system. (Again, such rewrites are usually
subject to some optimization criteria. Since this is not our focus, the optimization aspects
of resynthesis operations are omitted. See, e.g., [DM91] for further treatment.)
4.3 Optimization Capability
The transformation power of retiming and resynthesis can be understood best with state
transition graphs (STGs) defined by FSMs. We investigate how retiming and resynthesis
operations can alter STGs.
4.3.1 Optimization Power of Retiming
Given a communication graph G = (V, E), we study how the atomic forward and
backward moves of retiming affect the corresponding FSM M = ([[VS ]], I, Σ,Ω, ~δ, ~ω).
To study the effect of an atomic backward move, consider a normalized retime function
ρ with ρ(v) = 1 for some vertex v ∈ V and ρ(u) = 0 for all u ∈ V \{v}. (Because a
retiming operation can be decomposed as a series of atomic moves, analyzing ρ defined
above suffices to demonstrate the effect.) Let VS = VS\ ∪ VS∗ be the state variables of
M, where VS\ = {s1, . . . , si} and VS∗ = {si+1, . . . , sn} are disjoint. Suppose v is of type
~f : [[t1, . . . , tj]] → [[s′1, . . . , s′i]], where the valuation of next-state variable s′k is defined by
fk(t1, . . . , tj) for k = 1, . . . , i. Let M† = ([[V†S ]], I†, Σ, Ω, ~δ†, ~ω†) be the FSM after retiming,
where state variables V†S = VT ∪ VS∗ with VT = {t1, . . . , tj}. For any two states q†1, q†2 ∈ [[V†S ]],
if q†1[VS∗ ] ≡ q†2[VS∗ ] and ~f(q†1[VT ]) ≡ ~f(q†2[VT ]), then q†1 and q†2 are immediately equivalent.
This immediate equivalence results from the fact that the transition and output functions
of M† can be valuated after the valuation of ~f , which filters out the difference between
q†1 and q†2. Comparing state pairs between M and M†, we can always find a relation
R ⊆ [[VS ]]× [[V†S ]] such that
1. Pairs (q1, q†1) and (q1, q†2) are both in R for the state q1 of M with q1[VS∗ ] ≡ q†1[VS∗ ] and q1[VS\ ] ≡ ~f(q†1[VT ]).
2. It preserves the immediate equivalence, that is, (q, q†) ∈ R if, and only if, ~ω(σ, q) ≡ ~ω†(σ, q†) and (~δ(σ, q), ~δ†(σ, q†)) ∈ R for any σ ∈ Σ.
Since ~f is a total function, every state of M† has a corresponding state in M related by
R. (It corresponds to the fact that backward moves of retiming cannot increase the length
of initialization sequences, the subject to be discussed in Section 4.5.) On the other hand,
since ~f may not be a surjective (or an onto) mapping in general, there may be some state
q of M such that ∀x ∈ [[VT ]]. q[VS\ ] 6≡ ~f(x), that is, no states can transition to q. In this
case, q can be seen as being annihilated after retiming. To summarize,
Lemma 10 An atomic backward move of retiming can 1) split a state into multiple imme-
diately equivalent states and/or 2) annihilate states which have no predecessor states.
With a similar reasoning by reversing the roles of M and M†, one can show
Lemma 11 An atomic forward move of retiming can 1) merge multiple immediately equiv-
alent states into a single state and/or 2) create states which have no predecessor states.
(Results similar to Lemmas 10 and 11 appeared in [RSSB98], where the phenomena of state
creation and annihilation were omitted.)
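Lemma 10 can be visualized with a concrete, made-up functional element. Moving a register backward across a vertex of type ~f replaces the registers on f's outputs by registers on its inputs, so each old state value s splits into its preimage f⁻¹(s), and values outside the image of f are annihilated:

```python
from itertools import product
from collections import defaultdict

# Hypothetical two-input, two-output functional element:
# outputs are (AND, OR) of the two input bits.
def f(t1, t2):
    return (t1 and t2, t1 or t2)

preimage = defaultdict(list)
for t1, t2 in product([False, True], repeat=2):
    preimage[f(t1, t2)].append((t1, t2))

# Value (AND = 0, OR = 1) has two preimages: the corresponding old state
# splits into two immediately equivalent states after the backward move.
assert len(preimage[(False, True)]) == 2
# Value (AND = 1, OR = 0) is outside the image of f: that state value is
# annihilated, since no state can transition to it.
assert (True, False) not in preimage
```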
Note that, in a single atomic forward move of retiming, transitions among the newly cre-
ated states are prohibited. In contrast, when a sequence of atomic forward moves m1, . . . ,mn
is performed, the newly created states at move mi can possibly have predecessor states
created in later moves mi+1, . . . , mn. Clearly all the newly created states not merged with
original existing states due to immediate equivalence are dangling. However, as will be shown in
Section 4.5.1, the transition paths among these dangling states cannot be arbitrarily long.
Since a retiming operation consists of a series of atomic moves, Lemmas 10 and 11 set
the fundamental rules of all possible changes of STGs by retiming. Observe that a retiming
operation is always associated with some structure (i.e. a communication graph). For a
fixed structure, a retiming operation has limited optimization power, e.g., the converses of
Lemmas 10 and 11 are not true. That is, there may not exist atomic moves of retiming
(over a communication graph) which meet arbitrary targeting changes on an STG. Unlike a
retiming operation, a resynthesis operation provides the capability of modifying the vertices
and connections of a communication graph.
4.3.2 Optimization Power of Retiming and Resynthesis
A resynthesis operation itself cannot contribute any changes to the STG of an FSM.
However, when combined with retiming, it becomes a handy tool. In essence, the combi-
nation of retiming and resynthesis validates the converses of Lemmas 10 and 11, as will be
shown in Theorem 13. Moreover, it determines the transitions of newly created states due
to forward retiming moves, and thus has decisive effects on initialization sequences as will
be discussed in Section 4.5.2. On the other hand, we shall mention an important property
about retiming and resynthesis operations.
Lemma 12 Given an FSM, the newly created states (not merged with original existing
states due to immediate equivalence) due to atomic forward moves of retiming remain dan-
gling throughout iterative retiming and resynthesis operations.
Remark 1 As an orthogonal issue to our discussion on how retiming and resynthesis
can alter the STG of an FSM, the transformation of retiming and resynthesis was shown
[MSBSV91] to have the capability of exploiting various state encodings (or assignments) of
the FSM.
Notice that the induced state space of the dangling states originating from atomic
moves of retiming is immaterial in our study of the optimization capability of retiming and
resynthesis because an FSM after initialization never reaches such dangling states. An exact
characterization of the optimization power of retiming and resynthesis is given as follows.
Theorem 13 Ignoring the (unreachable) dangling states created due to retiming, two FSMs
are transformable to each other through retiming and resynthesis if, and only if, their state
transition graphs are transformable to each other by a sequence of splitting a state into
multiple immediately equivalent states and of merging multiple immediately equivalent states
into a single state.
Proof. (=⇒) Since resynthesis does not change the transition functions of an FSM, the
proof is immediate from Lemmas 10 and 11.
(⇐=) Given a target sequence of merging and splitting of immediately equivalent states,
it can be accomplished by a sequence of retiming and resynthesis. Essentially, each merging
(resp. splitting) of states can be achieved with a resynthesis operation followed by a forward
(resp. backward) retiming operation. To see why, let Σ and Q be the set of input alphabets
and states of M, respectively. Without loss of generality, assume that q1, q2 ∈ Q are
immediately equivalent states to be merged. A resynthesis operation can rewrite the original
transition functions ~δ : Σ × Q → Q as a composition of two parts, ~δ = ~∆2 ◦ ~∆1, where
~∆1 : Q → Q\{q2}, ~∆2 : Σ × (Q\{q2}) → Q, and ~∆1(q2) = q1. Retiming registers to the
positions in-between ~∆1 and ~∆2 merges q2 into q1. It is not hard to see that the retiming
operation is always possible. On the other hand, assume q ∈ Q is the state to be split
into multiple immediately equivalent states Q†. A resynthesis operation can again rewrite
the original transition functions ~δ as a composition of two parts, ~δ = ~∆4 ◦ ~∆3, where
~∆3 : Σ × Q → Q† ∪ (Q\{q}), ~∆4 : Q† ∪ (Q\{q}) → Q, ~∆3(σ, qi) ∈ Q† for ~δ(σ, qi) = q, and
~∆4(q†) = q for q† ∈ Q†. Retiming registers to the positions in-between ~∆3 and ~∆4 splits q
into Q†. Notice that the retiming is always possible because the output functions, originally
depending on Q, can be rewritten (by resynthesis) as functions depending on Q† ∪ (Q\{q}).
Consequently, any sequence of merging and splitting of immediately equivalent states is
achievable using retiming and resynthesis operations.
(A result similar to Theorem 13 appeared in [RSSB98], where, however, the optimization
power of retiming and resynthesis was overstated, as will be detailed in Section 4.6.) From
Theorem 13, one can relate two FSMs before and after the transformation of retiming and
resynthesis as follows.
Corollary 14 Given two FSMs M = (Q, I,Σ, Ω, ~δ, ~ω) and M† = (Q†, I†,Σ, Ω, ~δ†, ~ω†), M
and M† are transformable to each other through retiming and resynthesis operations if, and
only if, there exists a relation R ⊆ Q×Q† satisfying
1. Any non-dangling state q ∈ Q (resp. q† ∈ Q†) has at least one non-dangling state
q† ∈ Q† (resp. q ∈ Q) such that (q, q†) ∈ R.
2. State pair (q, q†) ∈ R if, and only if, ~ω(σ, q) ≡ ~ω†(σ, q†) and (~δ(σ, q), ~δ†(σ, q†)) ∈ R
for any σ ∈ Σ.
Notice that the statements of Theorem 13 and Corollary 14 are nonconstructive in the sense
that no procedure is given to determine if two FSMs are transformable to each other under
retiming and resynthesis. This weakness motivates us to study a constructive alternative.
Remark 2 One can show that peripheral retiming [MSBSV91] combined with resynthesis
does not increase the transformation power of normal retiming combined with resynthesis.
That is, the former can be achieved with the latter as we discuss below.
Peripheral retiming and resynthesis work as follows. A peripheral retiming operation is
performed on a communication graph G = (V, E) such that edges with negative weights are
allowed to exist temporarily. A resynthesis operation is then performed on G, yielding a
new communication graph G† = (V †, E†). Another retiming operation on G†, yielding G‡ =
(V †, E‡), must ensure that all edges E‡ are of non-negative weights. (If the last step fails, the
entire transformation is illegal. We are only concerned with legal transformations.) Observe
that the edges with non-zero weights in E† survive throughout the above operations. That is,
these edges also appear in E and E‡ ignoring the weight changes. With a similar reasoning
of Lemmas 10 and 11, the state spaces Q and Q‡ induced by G and G‡, respectively,
can be related with the valuations of these edges. This relation makes the transformation
achievable with a resynthesis operation followed by a normal retiming operation. Essentially,
the resynthesis operation rewrites the original transition functions ~δ : Σ×Q → Q induced
by G as the composition ~∆2 ◦ ~∆1 according to the above relation, where ~∆1 : Σ×Q → Q‡
and ~∆2 : Q‡ → Q. The retiming operation, on the other hand, moves registers in-between
~∆1 and ~∆2.
It is worth mentioning that, although peripheral retiming in theory does not increase
the transformation power, it is useful in practice to find good rewrites.
4.3.3 Retiming-Resynthesis Equivalence and Canonical Representation
Given an FSM, the transformation of retiming and resynthesis operations can rewrite
it into a class of equivalent FSMs (constrained by Corollary 14). We ask if there exists a
computable canonical representative in each such class, and answer this question affirma-
tively by presenting a procedure constructing it. Rather than arguing directly over FSMs,
we simplify our exposition by arguing over STGs.
Because retiming and resynthesis operations are reversible, we know
Proposition 15 Given STGs G, G1, and G2, suppose G1 and G2 are derivable from G
using retiming and resynthesis operations. Then G1 and G2 are transformable to each other
under retiming and resynthesis.
We say that two FSMs (STGs) are equivalent under retiming and resynthesis if they are
transformable to each other under retiming and resynthesis. Thus, any such equivalence
class is complete in the sense that any member in the class is transformable to any other
member. To derive a canonical representative of any equivalence class, consider the algo-
rithm outlined in Figure 4.1. Similar to the general state minimization algorithm [Koh78],
the idea is to seek a representative minimized with respect to the immediate equivalence
of states. However, unlike the least-fixed-point computation of the general state minimiza-
tion, the computation in Figure 4.1 looks for a greatest fixed point. Given an STG, the
ConstructQuotientGraph
input: a state transition graph G
output: a state-minimized transition graph w.r.t. immediate equivalence
begin
01 remove dangling states from G
02 repeat
03     compute and merge immediately equivalent states of G
04 until no merging performed
05 return the reduced graph
end
Figure 4.1. Algorithm: Construct quotient graph.
algorithm first removes all the dangling states, and then iteratively merges immediately
equivalent states until no more states can be merged.
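The two phases of Figure 4.1 can be sketched concretely over an explicit STG. The encoding below is illustrative, not from this chapter: trans[(q, a)] gives the next state and out[(q, a)] the output of a Mealy-style STG, and immediate equivalence is realized as identical outputs and identical representative next states under every input symbol.

```python
def construct_quotient_graph(states, alphabet, trans, out):
    states = set(states)
    # Step 01: remove dangling states -- the greatest fixed point of
    # deleting states that have no predecessor among the survivors.
    while True:
        with_pred = {trans[(q, a)] for q in states for a in alphabet}
        survivors = states & with_pred
        if survivors == states:
            break
        states = survivors
    # Steps 02-04: repeatedly merge immediately equivalent states, i.e.
    # states whose (output, next-state-class) profiles coincide.
    rep = {q: q for q in states}
    changed = True
    while changed:
        changed = False
        sig = {}  # profile -> chosen representative
        for q in sorted(q for q in states if rep[q] == q):
            key = tuple((out[(q, a)], rep[trans[(q, a)]]) for a in alphabet)
            if key in sig:
                rep[q] = sig[key]  # merge q into an earlier state
                changed = True
            else:
                sig[key] = q
        for q in states:  # path-compress chains of merges
            while rep[rep[q]] != rep[q]:
                rep[q] = rep[rep[q]]
    # Step 05: the reduced graph over representative states.
    reps = {q for q in states if rep[q] == q}
    q_trans = {(q, a): rep[trans[(q, a)]] for q in reps for a in alphabet}
    return reps, q_trans
```

Each pass of the merging loop corresponds to one iteration of the repeat loop in Figure 4.1; convergence is the greatest fixed point discussed above.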
Theorem 16 Given an STG G, Algorithm ConstructQuotientGraph produces a canonical
state-minimized solution, which is equivalent to G under retiming and resynthesis.
Proof. It is clear that the algorithm always terminates for finite state transition graphs.
Recall our assumption that FSMs are of implicit reset. Since dangling states do not
affect the normal operation of an FSM (but affect its initialization), the algorithm can safely
remove the state space induced by the dangling states and consider only the remaining state
space. (See also Proposition 20.)
For the sake of contradiction, assume the algorithm produces two different (non-
isomorphic) quotient graphs G1/ and G2/ for two given STGs G1 and G2, respectively,
which are equivalent under retiming and resynthesis. Because the algorithm merges only
immediately equivalent states, G1/ and G2/ must also be equivalent under retiming and
resynthesis (but not isomorphic by assumption). Since G1/ and G2/ are not isomorphic,
there does not exist a bijection (a one-to-one and onto mapping) between states of G1/ and
states of G2/ such that the bijection preserves immediate equivalence. Two cases need to
be considered. First, there exists an onto but not one-to-one mapping from one graph to
VerifyEquivalenceUnderRetiming&Resynthesis
input: two state transition graphs G1 and G2
output: Yes, if G1 and G2 are equivalent under retiming and resynthesis
        No, otherwise
begin
01 G1/ := ConstructQuotientGraph(G1)
02 G2/ := ConstructQuotientGraph(G2)
03 if G1/ and G2/ are isomorphic
04 then return Yes
05 else return No
end
Figure 4.2. Algorithm: Verify equivalence under retiming and resynthesis.
the other which preserves immediate equivalence. In this case, at least one of G1/ and G2/
is not maximally reduced, contradicting the fact that no two states in a quotient graph are
immediately equivalent. Second, there exists no mapping preserving immediate equivalence.
However, from Proposition 15, we know that G1/ is transformable to G1, then to G2, and
finally to G2/. Hence a mapping that preserves immediate equivalence must exist between
G1/ and G2/. Again a contradiction arises. The theorem follows.
For a naïve explicit enumerative implementation, Algorithm ConstructQuotientGraph is of
time complexity O(kn³), where k is the size of the input alphabet and n is the number of states.
A prudent refinement similar to the Paige-Tarjan algorithm [PT87] can further reduce the
complexity to O(kn log n). (Notice that the complexity is exponential when the input is
an FSM, instead of an STG representation.) For an implicit symbolic implementation, the
complexity depends heavily on the internal symbolic representations. If Step 3 in Figure 4.1
computes and merges all immediately equivalent states at once in a breadth-first-search
manner, then the algorithm converges in a minimum number of iterations.
From the proof of Theorem 16, an algorithm outlined in Figure 4.2 can check if two
STGs are transformable to each other under retiming and resynthesis.
Theorem 17 Given two state transition graphs, Algorithm VerifyEquivalenceUnderRetim-
ing&Resynthesis verifies if they are equivalent under retiming and resynthesis.
Proof. A direct consequence of Theorem 16.
The complexity of the algorithm in Figure 4.2 is the same as that in Figure 4.1 since the
graph isomorphism check for STGs is O(kn), which is not the dominating factor. With the
presented algorithm, checking the equivalence under retiming and resynthesis is not easier
than general equivalence checking. In the following section, we investigate its intrinsic
complexity.
4.4 Verification Complexity
We show some complexity results of verifying if two FSMs are equivalent under iterative
retiming and resynthesis.
4.4.1 Verification with Unknown Transformation History
We investigate the complexity of verifying the equivalence of two FSMs with unknown
history of retiming and resynthesis operations.
Theorem 18 Determining if two FSMs are equivalent under retiming and resynthesis with
unknown transformation history is PSPACE-complete.
Proof. Certainly Algorithm VerifyEquivalenceUnderRetiming&Resynthesis can be per-
formed in polynomial space (even with inputs in FSM representations).
On the other hand, we need to reduce a PSPACE-complete problem to our problem at
hand. The following problem is chosen.
Given a total function f : {1, . . . , n} → {1, . . . , n}, is there a number k such that, by composing f k times, f^k(1) = n?
In other words, the problem asks if n is “reachable” from 1 through f . It was shown
[Jon75] to be deterministic2 LOGSPACE-complete in the unary representation and, thus,
PSPACE-complete in the binary representation [Pap94]. We show that the problem in
the unary (resp. binary) representation is log-space (resp. polynomial-time) reducible to
our problem with inputs in STG (resp. FSM) representations. We further establish that
the answer to the PSPACE-complete problem is positive if and only if the answer to the
corresponding equivalence verification problem (to be constructed) is negative. Since the
complexity class of nondeterministic space is closed under complementation [Imm88], the
theorem follows.
To complete the proof, we elaborate the reduction. Given a function f as stated earlier,
we construct two total functions f1, f2 : {0, 1, . . . , n} → {0, 1, . . . , n} as follows. Let f1 have
the same mapping as f over {1, . . . , n − 1} and have f1(0) = 1 and f1(n) = 1. Also let
f2 have the same mapping as f with f2(0) = 1 but f2(n) = 0. Clearly the constructions
of f1 and f2 can be done in log-space. Treating {0, 1, . . . , n} as the state set, f1 and f2
specify the transitions of two STGs, say G1 and G2 (which have empty input and output
alphabets). Observe that any state of G1 (similarly G2) has exactly one next-state. Thus,
every state is either in a single cycle or on a single path leading to a cycle. Observe also that
two states of G1 (similarly G2) are immediately equivalent if and only if they have the same
next-state. An important consequence of these observations is that all states not in cycles
can be merged through iterative retiming and resynthesis due to immediate equivalence.
To see the relationship between reachability and equivalence under retiming and resyn-
thesis, consider the case where n is reachable from 1 through f . States 1 and n of G1 must
2It is a well-known result by Savitch [Sav70] that deterministic and nondeterministic space complexities coincide.
be in a cycle excluding state 0; states 1 and n of G2 must be in a cycle including state 0.
Hence the state-minimized (with respect to immediate equivalence) graphs of G1 and G2
are not isomorphic. That is, G1 and G2 are not equivalent under retiming and resynthesis.
On the other hand, consider the case where n is unreachable from 1 through f . Then state
n of G1 and state n of G2 are dangling. From the mentioned observations, merging dangling
states in G1 and G2 yields two isomorphic graphs. That is, G1 and G2 are equivalent under
retiming and resynthesis. Therefore, n is reachable from 1 through f if, and only if, G1 and
G2 are not equivalent under retiming and resynthesis. (Notice that, unlike the discussion
of optimization capability, here we should not ignore the effects of retiming and resynthesis
over the unreachable state space.)
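The reduction in this proof can be exercised on small instances. In the sketch below (the function names are mine, not the thesis's), the empty alphabet makes immediate equivalence "same next state", so the quotient of a functional graph retains exactly its cycles; isomorphism of the quotients then amounts to comparing the multisets of cycle lengths.

```python
def make_f1_f2(f, n):
    # Build f1, f2 over {0, 1, ..., n} from f : {1..n} -> {1..n},
    # exactly as in the proof.
    f1 = {0: 1, n: 1}
    f2 = {0: 1, n: 0}
    for v in range(1, n):
        f1[v] = f[v]
        f2[v] = f[v]
    return f1, f2

def cycle_lengths(g):
    # States surviving iterated removal of predecessor-free states are
    # exactly the cycle states; report the sorted cycle lengths.
    states = set(g)
    while True:
        with_pred = {g[q] for q in states}
        if with_pred == states:
            break
        states &= with_pred
    lengths, seen = [], set()
    for s in states:
        if s in seen:
            continue
        length, v = 0, s
        while v not in seen:
            seen.add(v)
            v, length = g[v], length + 1
        lengths.append(length)
    return sorted(lengths)

def reachable(f, n):
    # Is n reachable from 1 by iterating f?
    seen, v = set(), 1
    while v not in seen:
        if v == n:
            return True
        seen.add(v)
        v = f[v]
    return False
```

For any total f on {1, . . . , n}, n is reachable from 1 if, and only if, the two quotients differ, mirroring the if-and-only-if argument above.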
4.4.2 Verification with Known Transformation History
By Theorem 18, verifying if two FSMs are equivalent under retiming and resynthesis
without knowing the transformation history is as hard as the general equivalence checking
problem. Thus, we advocate a conservative design methodology that optimizes synchronous
hardware systems while improving their verifiability.
An easy approach to circumvent the PSPACE-completeness is to record the history
of retiming and resynthesis operations as verification checkpoints, or alternatively to per-
form equivalence checking after every retiming or resynthesis operation. The reduction in
complexity results from the following well-known facts.
Proposition 19 Given two synchronous hardware systems, verifying if they are trans-
formable to each other with retiming only is of the same complexity as checking graph
isomorphism; verifying if they are transformable to each other with resynthesis only is of
the same complexity as combinational equivalence checking, which is NP-complete.
Therefore, if transformation history is completely known, the verification complexity reduces
to NP-complete.
4.5 Initialization Sequences
To discuss initialization sequences, we rely on the following proposition of Pixley [Pix92].
Proposition 20 ([Pix92]) An FSM is initializable only if its initial states are non-
dangling. (In fact, any non-dangling state can be used as an initial state by suitably modi-
fying initialization sequences.)
By Lemma 12, Corollary 14 and Proposition 20, it is immediate that
Corollary 21 The initializability of an FSM is an invariant under retiming and resynthe-
sis.
Hence we shall assume that the given FSM M is initializable. Furthermore, we assume
that its initialization sequence is given as a black box. That is, we have no knowledge of
how M is initialized. Under these assumptions, we study how the initialization sequence
is affected when M is retimed (and resynthesized). As shown earlier, the creation and
annihilation of dangling states are immaterial to the optimization capability of retiming
and resynthesis. However, they play a decisive role in affecting initialization sequences.
In essence, the longest transition path among the dangling states determines by how much
the initialization sequences must be lengthened.
4.5.1 Initialization Affected by Retiming
Lag-dependent bounds.
Effects of retiming on initialization sequences were studied by Leiserson and Saxe in
[LS83], where their Retiming Lemma can be rephrased as follows.
Lemma 22 ([LS83]) Given a communication graph G = (V, E) and a normalized retime
function ρ, let ℓ = max_{v∈V} −ρ(v) and let G† be the corresponding retimed communication
graph of G. Suppose M and M† are the FSMs specified by G and G†, respectively. Then
after M† is initialized with an arbitrary input sequence of length ℓ, any state of M† has an
equivalent3 state in M.
That is, ℓ (nonnegative for normalized ρ)4 gives an upper bound on the increase of initialization
sequences under retiming. This bound was further tightened in [EMMRM97, SPRB95]
by letting ℓ be the maximum of −ρ(v) over all vertices v of functional elements whose functions define
non-surjective mappings. Unfortunately, this strengthening still does not produce an exact
bound. Moreover, by Proposition 8, a normalized retime function among its equivalent
retime functions may not be the one that gives the tightest bound. A derivation of exact
bounds will be discussed in Section 4.5.2.
Lag-independent bounds.
Given a synchronous hardware system, a natural question is if there exists some bound
which is universally true for all possible retiming operations. Even though the bound may
be looser than lag-dependent bounds, it frees the construction of new initialization
3A state q of FSM M is equivalent to a state q† of FSM M† if M starting from q and M† starting from q† have the same input-output behavior.
4Recall that ρ is normalized when ρ(host) = 0.
sequences from knowledge of which retime functions have been applied. Indeed, such a bound
does exist as exemplified below.
Proposition 23 Given a communication graph G = (V, E) and a normalized retime func-
tion ρ, let r(v) denote the minimum number of registers along any path from the host to
vertex v. Then r(v) sets an upper bound of the number of registers that can be moved for-
ward across v, i.e., −r(v) ≤ ρ(v). (Similarly, r(v) on G with reversed edges sets an upper
bound of ρ(v).)
Thus, max_v r(v), which is intrinsic to a communication graph and is independent of retiming
operations, yields a lag-independent bound.
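Because register counts on edges are nonnegative, each r(v) is an ordinary shortest-path distance from the host and can be computed with a Dijkstra-style search. A minimal sketch, with an illustrative edge-weight encoding:

```python
import heapq

def min_registers_from_host(edges, host):
    # edges: dict (u, v) -> number of registers on the edge u -> v
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w))
    dist = {host: 0}            # dist[v] will equal r(v)
    heap = [(0, host)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue            # stale heap entry
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist
```

Here max(dist.values()) is the lag-independent bound max_v r(v) of Proposition 23.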
When initialization delay is not a concern for a synchronous system, one can even relax
the above lag-independent bound by saying that the total number of registers of the system is
another lag-independent bound. As an example, suppose a system has one million registers
and its retimed version runs at one gigahertz clock frequency. Then the initialization delay
increased due to retiming is less than a thousandth of a second.
4.5.2 Initialization Affected by Retiming and Resynthesis
So far we have focused on initialization issues arising when a system is retimed only. Here
we extend our study to issues arising when a system is iteratively retimed and resynthesized.
A difficulty emerges from directly applying Lemma 22 to bound the increase of ini-
tialization sequences under iterative retiming and resynthesis. Interleaving retiming with
resynthesis makes the union bound Σ_i u_i the only available bound from Lemma 22, where
u_i denotes the lag-dependent bound for the i-th retiming operation. Essentially, inaccuracies
accumulate along with the summation of the union bound. Thus, the bound derived this
way can be far beyond what is necessary. In the light of lag-independent bounds discussed
earlier, one might hope that there may exist some constant which upper bounds the increase
of initialization sequences due to any iterative retiming and resynthesis operations. (Notice
that, when no resynthesis operation is performed, the transformation of a series of retiming
operations can be achieved by a single retiming operation. Thus a lag-independent bound
exists for iterative retiming operations.) Unfortunately, such a transformation-independent
bound does not exist as shown in Theorem 25.
Lemma 24 Any dangling state of an FSM (with implicit reset) is removable through iter-
ative retiming and resynthesis operations.
Proof. By Proposition 20, the initial states of an FSM M with implicit reset must be
non-dangling. Removing dangling states cannot affect the behavior of M. Essentially,
states without predecessor states can be eliminated with a resynthesis operation followed
by a retiming operation. To see why this is the case, let Σ be the set of input alphabets,
Q be the set of states of M, and Q† ⊆ Q be the subset of states with predecessors. A
resynthesis operation can rewrite the original transition functions ~δ : Σ × Q → Q as a
composition of three parts, ~δ = ~∆−1 ◦ ~∆ ◦ ~δ, where ~∆ : Q → Q†, ~∆−1 : Q† → Q, and
~∆−1 ◦ ~∆ is an identity mapping. (Notice that ~∆−1 exists because the states Q\Q† have empty
pre-image.) Retiming registers to the positions in-between ~∆ and ~∆−1 eliminates states
with no predecessors. (The retiming operation is possible because the output functions of
M can take the intermediate valuation after ~∆ ◦ ~δ and before ~∆−1
as its state input.) Therefore, with iterative retiming and resynthesis, dangling states are
removable.
Theorem 25 Given a synchronous hardware system and an arbitrary constant c, there
always exist retiming and resynthesis operations on the system such that the length increase
of the initialization sequence exceeds c.
Proof. The theorem follows from Lemma 24 and the reversibility of the transformation of
retiming and resynthesis.
Since the mentioned union bound is inaccurate and requires knowing the applied retime
functions, we are motivated to compute the exact5 length increase of initialization sequences
without knowing the history of retiming and resynthesis operations.
The length increase can be derived by computing the length, say n, of the longest
transition paths among the dangling states because applying an arbitrary6 input sequence
of length greater than n drives the system to a non-dangling state. The length n can be
obtained using a symbolic computation. By breadth-first search, one can iteratively remove
states without predecessor states until a greatest fixed point is reached. The number of the
performed iterations is exactly n.
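The computation just described has a direct explicit analogue: peel off predecessor-free states layer by layer and count the rounds, which equals the longest transition path among the dangling states. A sketch under an illustrative explicit STG encoding (trans[(q, a)] gives the next state):

```python
def init_length_increase(states, alphabet, trans):
    # Greatest fixed point of deleting predecessor-free states; the
    # number of deletion rounds is the exact length increase n.
    states, n = set(states), 0
    while True:
        with_pred = {trans[(q, a)] for q in states for a in alphabet}
        survivors = states & with_pred
        if survivors == states:
            return n
        states, n = survivors, n + 1
```

On a chain of three dangling states feeding a self-looping state, three rounds are needed, matching the three steps required to leave the dangling region.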
4.6 Related Work
Optimization capability.
The closest to our work on the optimization power of retiming and resynthesis is
[RSSB98], where the optimization power was unfortunately over-stated contrary to the
claimed exactness. The mistake resulted from the claim that any 2-way switch operation
is achievable using 2-way merge and 2-way split operations (see [RSSB98] for their defi-
nitions). Figure 4.3 shows a counterexample illustrating a 2-way switch operation that is
not achievable with 2-way merge and split operations. (Essentially, a restriction needs to
be imposed — under any input assignment, the next state of a current state to be split
should be unique.) In fact, only 2-way merge and split operations are essential. Aside from
5The exactness is true under the assumption that the initialization sequence of the original FSM is given as a black box. If the initialization mechanism is explored, more accurate analysis may be achieved.
6Although exploiting some particular input sequence may shorten the length increase, it complicates thecomputation.
[Drawing of Figure 4.3, two STGs (i) and (ii), omitted in this transcription.]
Figure 4.3. The STG in (i) is transformable to the STG in (ii) by a 2-way switch operation while the reverse direction is not transformable. Since the operation is not reversible, it falls beyond the transformation power of retiming and resynthesis.
this minor error, no constructive algorithm was known to determine if two given FSMs are
equivalent under iterative retiming and resynthesis. In addition, not discussed were the
creation and annihilation of dangling states, which we show to be crucial in initializing
synchronous hardware systems.
Verification complexity.
Ranjan in [Ran97] examined a few verification complexities for cases under one retiming
operation and up to two resynthesis operations with unknown transformation history. The
complexity for the case under an arbitrary number of iterative retiming and resynthesis
operations was left open, and was conjectured in [ZSA98] to be easier than the general
equivalence checking problem. We disprove the conjecture.
Initialization sequences.
For systems with explicit reset, the effect of retiming on initial states was studied in
[TB93, ESS96, SMB96]. In the explicit reset case, incorporating resynthesis with retiming
does not contribute additional difficulty. Note that, for systems with explicit-reset registers,
forward moves of retiming are preferable to backward moves in maintaining equivalent initial
states, contrary to the case for systems with implicit-reset registers. To prevent backward
moves, Even et al. in [ESS96] proposed an algorithm to find a retime function such that the
maximum lag among all vertices is minimized. Interestingly enough, their algorithm can be
easily modified to obtain minimum lag-dependent bounds on the increase of initialization
sequences. As mentioned earlier, explicit reset can be seen as a special case of implicit reset
when reset circuitry is explicitly represented in the communication graph. Hence, the study
of the implicit reset case is more general, and is subtler when considering resynthesis in
addition to retiming.
Pixley in [Pix92] studied the initialization of synchronous hardware systems with im-
plicit reset in a general context. Leiserson and Saxe studied the effect of retiming on
initialization sequences in [LS83], where a lag-dependent bound was obtained and was later
improved by [EMMRM97, SPRB95]. We show a lag-independent bound instead. In recent
work [MSM04], a different approach was taken to tackle the initialization issue raised by
retiming. Rather than increasing initialization sequence lengths, a retimed system was
further modified to preserve its original initialization sequence. This modification might
incur area/performance penalties and could nullify the gains of retiming operations. In
addition, the modification requires expensive computation involving existential quantification,
which limits the scalability of the approach to large systems. In comparison, prefixing an
arbitrary input sequence of a certain length to the original initialization sequence provides
a much simpler solution (without modifying the system) to the initialization problem.
On the other hand, we extend our study to the unexplored case of iterative retiming
and resynthesis, and show the unboundability of the increase of initialization sequences.
Finally, our exact analysis on the increase of initialization sequences is applicable to the
case of iterative retiming and resynthesis and improves the bound of [EMMRM97, SPRB95].
4.7 Summary
This chapter demonstrated some transformation invariants under retiming and resyn-
thesis. Three main results about retiming and resynthesis were established. First, an algo-
rithm was presented to construct a canonical representative of an equivalence class of FSMs
transformed under retiming and resynthesis. It was extended to determine if two FSMs are
transformable to each other under retiming and resynthesis. Second, a PSPACE-complete
complexity was proved for the above problem when the transformation history of retim-
ing and resynthesis is unknown. Hence to reduce complexity (from PSPACE-complete to
NP-complete) it is indispensable to maintain transformation history or to check equivalence
after every retiming or resynthesis operation. Third, the effects of retiming and resynthesis
on initialization sequences were studied. A lag-independent bound was shown on the length
increase of initialization sequences of FSMs under retiming; in contrast, unboundability
was shown on the case under retiming and resynthesis. In addition, an exact analysis on
the length increase was presented. We believe our results will enhance the practicality of
retiming and resynthesis for the optimization of synchronous hardware systems.
Chapter 5
Equivalence Verification
The state-explosion problem limits formal verification to small- or medium-sized sequen-
tial circuits partly because the sizes of binary decision diagrams (BDDs) heavily depend on
the number of variables dealt with. In the worst case, a BDD size grows exponentially with
the number of variables. Thus, reducing this number can possibly increase the verification
capacity. In particular, this chapter shows how sequential equivalence checking can be done
in the sum state space.
Given two finite state machines M1 and M2 with numbers of state variables m1 and m2,
respectively, conventional formal methods verify equivalence by traversing the state space
of the product machine, with m1 + m2 registers. In contrast, this chapter introduces a
different possibility, based on partitioning the state space defined by a multiplexed machine,
which can have merely max{m1, m2} + 1 registers. This substantial reduction in state
variables potentially enables the verification of larger instances. Experimental results show
the approach can verify benchmarks with up to 312 registers, including all of the control
outputs of microprocessor 8085.
5.1 Introduction
Sequential equivalence checking plays a crucial role in very large scale integration de-
sign to ensure functional correctness. It has been greatly advanced since symbolic tech-
niques [CBM89] were used in formal methods based on state-space traversal. However,
these formal methods do not scale easily with the increasing complexity of system designs
due to the state explosion problem: the state space grows exponentially with the number
of state variables. Therefore, recent research [CQS00, KB01] has
focused on reducing the number of state variables by retiming [LS83], with the hope that
verification can be conducted on the reduced circuits. Unlike these circuit-based transfor-
mations, this chapter reduces the register count in the verification construction. Moreover,
the verification itself is structure-independent, that is, neither circuit similarities nor register
correspondences [vE00] are assumed.
In this chapter, we reason about sequential equivalence based on the fact that two FSMs
are equivalent if, and only if, their initial states are equivalent. To identify equivalent states
of an FSM, BDDs [Bry86] were used in [HJJ+96, LTN90] and [Pix92] for symbolic exe-
cution. The fixed-point computation in [LTN90] and [Pix92] is carried out on a product
machine constructed over two identical copies of the FSM. As shown in [Fil91], when the
product machine is constructed over two FSMs under comparison, the same computation
can be used for sequential equivalence checking. In contrast to the approach of [LTN90]
and [Pix92], the computation in [HJJ+96] for equivalent state identification is done on
the original FSM without constructing a product machine. However, an n-state FSM in
[HJJ+96] is represented by n shared n-terminal BDDs. This representation may be expen-
sive in practice. In contrast, we identify equivalent states by applying BDD-based functional
decomposition [LPV93] to keep the computation in the original FSM without any special
representation. Since the computation is in a single FSM, we introduce the multiplexed ma-
chine to combine two FSMs into one. Thereby we can transform the sequential equivalence
checking problem to the state equivalence problem of a multiplexed machine.
Our equivalence checking technique avoids state traversal, by partitioning the state
space based on equivalence relations among states [Koh78]. Rather than reason about
the sequential equivalence in the product state space of two sequential machines under
comparison, we achieve this attempt in the sum state space. Compared to product-machine-
based verification, the proposed approach almost halves the number of state variables. More
precisely, checking the equivalence of two n-input FSMs M1 and M2 with m1 and m2 state
variables respectively, our method can keep the total number of variables to be at most
n + maxm1,m2+ 1 + dlog2(minm1,m2+ 1)e. Hence, the BDD sizes in our verification
technique could be much smaller than those in product-machine-based techniques.
Unlike previous verification techniques of [CBM89] and [Fil91], the efficiency of our
approach depends heavily on the encountered number of equivalence classes of states. Since
each equivalence class is represented by a BDD node, our approach is limited to instances
with less than ∼106 equivalence classes per output. Fortunately, it is applicable in most
practical applications. On the other hand, because the number of equivalence classes in
the reachable state subspace is an invariant, our technique tends to be more robust than
previous approaches in verifying different implementations of a design. For high-speed
designs, registers are mostly added to reduce cycle time not to increase the number of
equivalence classes. (For example, backward retiming cannot increase equivalence classes.)
In such designs, our proposed technique should be preferable to those of [CBM89] and
[Fil91].
The contributions of this chapter are as follows. We apply BDD-based functional de-
composition to the identification of equivalent states. Two important consequences are the
elimination of universal and existential quantifications, and the possible simplification with
respect to the reachable state subspace. To extend the above computation for sequential
equivalence checking, we introduce the multiplexed machine such that the verification can
be done in the sum state space. In addition, several techniques are proposed to enhance the
computational robustness; several properties are analyzed to contrast different verification
techniques.
The remainder of this chapter is organized as follows. Preliminaries and definitions are
given in Section 5.2. After introducing the technique for equivalent state identification in
Section 5.3, we present our equivalence checking algorithm in Section 5.4 and analyze its
properties in Section 5.5. Experimental results are then given in Section 5.6, and conclusions
in Section 5.8.
5.2 Definitions, Notation and Preliminaries
5.2.1 Equivalence Relations and Partitions
An equivalence relation is a binary relation on a set S satisfying the reflexive, symmetric,
and transitive laws; it induces a unique partition π on S. The partition is a set π =
{E1, E2, . . .} of subsets of S such that
• Ei ≠ ∅ for all i;
• Ei ∩ Ej = ∅ for all i ≠ j;
• E1 ∪ E2 ∪ · · · = S.
Each Ei forms an equivalence class. Two elements in the same class satisfy the equivalence
relation, but elements in different classes do not. For two equivalence relations R1 and R2
with partitions π1 and π2, respectively, R1 ⊆ R2 if, and only if, π1 is a refinement of π2,
denoted as π1 ⪯ π2, i.e., each equivalence class of π1 is contained in some equivalence class
of π2. On the other hand, the product of two arbitrary partitions, π1 and π2, denoted
π1 · π2, is the partition corresponding to the relation R1 ∩ R2, i.e. two elements are in the
same equivalence class of π1 · π2 if, and only if, they are both in one equivalence class of π1
as well as in one of π2.
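These definitions are directly executable on small examples. A sketch, representing a partition as a list of frozensets (an illustrative choice, not a notation from this chapter):

```python
def partition_product(pi1, pi2):
    # Two elements share a class of pi1 * pi2 iff they share a class
    # in pi1 and also share a class in pi2.
    blocks = {}
    for e in set().union(*pi1):
        b1 = next(b for b in pi1 if e in b)
        b2 = next(b for b in pi2 if e in b)
        blocks.setdefault((b1, b2), set()).add(e)
    return [frozenset(b) for b in blocks.values()]

def refines(pi1, pi2):
    # pi1 refines pi2: every class of pi1 lies inside some class of pi2.
    return all(any(b1 <= b2 for b2 in pi2) for b1 in pi1)
```

The product of two partitions always refines both of them, as the definition requires.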
Given an FSM M, its output and transition functions define an equivalence relation,
denoted ≡_M, and, thus, induce a partition, denoted π_M, over the state space of M =
(Q, I, Σ, Ω, ~δ, ~ω). In this chapter, we concentrate on equivalence relations on a set of states.
Two states q1, q2 ∈ Q are equivalent, satisfying q1 ≡_M q2, if, and only if, by using them as
initial states, no input sequence can result in different output sequences. To approximate
state equivalence, we define a k-equivalence relation, denoted ≅^k_M, and say two states q1
and q2 are k-equivalent, satisfying q1 ≅^k_M q2, if, and only if, they are indistinguishable
under all input sequences with length up to k. Also, say two states (or FSMs) are k-distinguishable
if k is the shortest length of the input sequences that differentiate them. We
denote the partition associated with ≅^k_M as π^k_M. To derive π_M from the approximation, we
have π_M = π^k_M if π^k_M = π^{k−1}_M for large enough k, that is, a fixed point has been reached.
Similarly, we define a ⟨k⟩-equivalence relation, denoted as ≅^⟨k⟩_M. Two states q1 and q2 satisfy
q1 ≅^⟨k⟩_M q2 if, and only if, by using them as initial states, the outputs at the kth step are
equal for any length-k input sequence. The corresponding partition of ≅^⟨k⟩_M is denoted π^⟨k⟩_M.
By definition, we can derive the following lemma.
Lemma 26 For a Moore machine and k ≥ 1,
π^⟨k⟩_M = { Ei | q1, q2 ∈ Ei iff ~δ(σ, q1), ~δ(σ, q2) ∈ Ej ∈ π^⟨k−1⟩_M, for any σ ∈ Σ, for some j }
and
π^⟨0⟩_M = { Ei | q1, q2 ∈ Ei iff ~ω(q1) = ~ω(q2) }.
For a Mealy machine and k ≥ 2,
π^⟨k⟩_M = { Ei | q1, q2 ∈ Ei iff ~δ(σ, q1), ~δ(σ, q2) ∈ Ej ∈ π^⟨k−1⟩_M, for any σ ∈ Σ, for some j },
π^⟨0⟩_M = {Q}
and
π^⟨1⟩_M = { Ei | q1, q2 ∈ Ei iff ~ω(σ, q1) = ~ω(σ, q2), ∀σ ∈ Σ }.
Proof. The base cases are direct results of the definition. Now we show the connection
between π^⟨k⟩_M and π^⟨k−1⟩_M. There exists a length-(k − 1) input sequence distinguishing two
states q1 and q2 at the output of the (k − 1)st step if, and only if, q1 and q2 are in different
equivalence classes of π^⟨k−1⟩_M. Therefore, two states, say q3 and q4, cannot be distinguished
at the output of the kth step if, and only if, their successor states, i.e., ~δ(σ, q3) and ~δ(σ, q4),
are in the same equivalence class of π^⟨k−1⟩_M for every σ ∈ Σ.
The connection between π^k_M and π^⟨k⟩_M is indicated in Proposition 27.
Proposition 27 For an FSM M, two states are in the same equivalence class defined by
π^k_M if, and only if, they are in the same equivalence class of π^⟨0⟩_M, of π^⟨1⟩_M, . . . , and of π^⟨k⟩_M.
5.2.2 Functional Decomposition
In this chapter, we adopt functional decomposition [RK62] for partitioning the state
space to identify equivalent states and to verify sequential equivalence. In functional de-
composition, variables of a Boolean function are divided into two disjoint subsets, the
bound set and the free set. In BDD-based functional decomposition [LPV93], bound set
variables are ordered above free set ones. A cutset of a BDD is the set of (downward) edges
which cross the boundary defined by the bound set and free set variables. A BDD node is
called an equivalence node if some edge in the cutset points to it.
For a Boolean function f(~λ, ~µ), we can interpret the specification of the bound set
variables ~λ and free set variables ~µ as a partition over the space spanned by ~λ, denoted
Λ. That is, the set of all paths from the root of a BDD to an equivalence node forms an
equivalence class. Each such set represents a subspace of Λ. Two minterms λ1 and λ2 in
Λ are equivalent under arbitrary assignments of the free set variables, i.e., ∀~µ (f(λ1, ~µ) =
f(λ2, ~µ)), if, and only if, their corresponding paths in the BDD lead to the same equivalence
node.
Given a set of Boolean functions f1, . . . , fk, which do not necessarily have common
supports, we can always expand these to the same Boolean space spanned by the union of
the input variables of all functions. Let the bound set variables be ~λ. Then, the free set
variables ~µ are all variables excluding those in ~λ. Suppose we want to find the equivalence
classes of the minterms in Λ, such that two minterms λ1 and λ2 are equivalent under
arbitrary assignments of all other variables, i.e., ∀~µ,∀i (fi(λ1, ~µ) = fi(λ2, ~µ)), if, and only
if, these two minterms are in the same equivalence class. To represent equivalence classes
by a BDD as in the single function case, we can construct a hyperfunction F [JJH01] of
f1, . . . , fk by adding ⌈log2 k⌉ new free set binary variables ~η to encode these functions.
Assume the overall free set variables become ~µ′. Thus, two minterms λ1 and λ2 in Λ satisfy
∀~µ′ (F(λ1, ~µ′) = F(λ2, ~µ′)) if, and only if, their corresponding paths in the BDD of F lead
to the same equivalence node.
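The role of equivalence nodes can be illustrated without a BDD package by grouping bound-set minterms on their explicit cofactor signatures. The following is a minimal Python sketch under that simplification; the names cofactor_classes, f1, and f2 are ours, not from the thesis, and a real implementation would of course use BDDs rather than truth-table enumeration.

```python
from itertools import product

def cofactor_classes(fns, bound_vars, free_vars):
    """Group bound-set minterms by their residual cofactor functions.

    Mimics the equivalence nodes of a BDD cut: two bound-set minterms
    are equivalent iff every function agrees on them under all
    assignments of the free-set variables.
    """
    classes = {}
    for lam in product([0, 1], repeat=bound_vars):
        # The cofactor signature: outputs of every function under every
        # free-set assignment -- it plays the role of the BDD node
        # reached below the cut (i.e., the equivalence node).
        sig = tuple(f(lam, mu) for f in fns
                    for mu in product([0, 1], repeat=free_vars))
        classes.setdefault(sig, []).append(lam)
    return list(classes.values())

# Two functions over bound variables (a, b) and free variable (c):
f1 = lambda lam, mu: (lam[0] & lam[1]) | mu[0]   # ab + c
f2 = lambda lam, mu: lam[0] & lam[1] & mu[0]     # abc
# Two classes result: {00, 01, 10} (both cofactors fixed at c and 0)
# and {11} (cofactors 1 and c).
classes = cofactor_classes([f1, f2], bound_vars=2, free_vars=1)
```

Grouping on the joint signature of f1 and f2 corresponds exactly to decomposing the hyperfunction F: the extra encoding variables ~η select which fi is observed, so two minterms share an equivalence node iff they agree on every fi.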
5.3 Identification of State Equivalence
To find a minimum-state FSM equivalent to a given one, equivalent states must be identified.
Since each state in an equivalence class (of reachable states) can represent the entire class,
the number of states of the minimum-state FSM equals the number of equivalence
classes of the original FSM. This section proposes a more direct approach than those of
[LTN90] and [Pix92] to locate equivalent states, in the sense that we deal with equivalence
classes instead of equivalence relations. Given an FSM, we show that BDD-based functional
decomposition can be exploited to extract equivalence classes of states.
Our approach is conceptually similar to that in [HJJ+96], where an FSM with n
states is represented by n shared n-terminal BDDs. However, functional decomposition
does not apply in that representation, so the basic operations differ fundamentally.
Moreover, since our computation operates directly on the output and transition
functions, its representation is more efficient than that of the previous work.
5.3.1 State Equivalence vs. Functional Decomposition
In the base cases (π0M = π〈0〉M for a Moore machine M and π1M = π〈1〉M for a Mealy
machine), the output function ~ω plays the central role, as indicated in Lemma 26. Examining
the case for a Moore machine M, we can see that ~ω serves directly as the characteristic
function for π0M. On the other hand, the characteristic function of π1M of a Mealy machine
M is not clearly indicated by ~ω. We relate BDD-based functional decomposition to the
computation of this characteristic function. In general, ~ω is composed of a set of binary
functions ω1, ω2, . . . , ωk. According to Section 5.2.2, we have to construct the hyper-
function F of ω1, ω2, . . . , ωk. The supports of F consist of three parts: state variables
~s, primary inputs ~r, and newly added variables ~η for encoding the output functions. Let
~s be the bound set variables and the rest be the free set. Accordingly, the equivalence
nodes of the BDD of F represent the equivalence classes of π1M. Paths from the root to
an equivalence node are states in a corresponding equivalence class. At this point, we
can ignore the functions represented by these equivalence nodes. That is, we can get rid
of the BDD structures below these nodes. By re-encoding these nodes using an alphabet
Ψ (introducing ⌈log2 log2 |N|⌉ binary variables suffices to re-express the set N of equivalence nodes
because ⌈log2 log2 |N|⌉ variables can generate at least |N| different functions, i.e., |N| nodes in
a BDD), we obtain a characteristic function ψ : Q → Ψ for π1M of a Mealy machine M.
Playing a similar trick, we show how to compute the characteristic function of π〈k〉M, k = 1
input: ψ, the characteristic function of π〈k−1〉; τ, the function to be composed
output: characteristic function of π〈k〉
begin
01  form hyperfunction F of ψ ◦ τ
02  build BDD of F with state variables above others
03  re-encode equivalence nodes and simplify BDD
04  return new characteristic function
end
Figure 5.1. Algorithm CompNewPartition: Compute New Partition.
or 2 for a Moore or Mealy machine, respectively. Assume ψ : Q → Ψ is the characteristic
function derived from the last iteration (for both types of machines). Then the composition
function ψ ◦ ~δ, i.e., ψ(~δ(σ, q)), plays exactly the same role as ~ω in a Mealy machine, from
which we have shown how to derive a characteristic function of π〈1〉M. Consequently, by
functional decomposition of the hyperfunction of ψ ◦ ~δ, we have a characteristic function of
π〈1〉M for a Moore machine M or of π〈2〉M for a Mealy machine M. The algorithm is summarized in Figure 5.1.
The function call is denoted as CompNewPartition. By Lemma 26, we can derive the
following theorem.
Theorem 28 Given the characteristic function of π〈k−1〉M and ~δ : Σ × Q → Q as the function
to be composed, CompNewPartition generates the characteristic function of π〈k〉M, where
k ≥ 1 (≥ 2) for a Moore (Mealy) machine M.
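Over explicit states, the effect of CompNewPartition can be sketched as follows; psi plays the role of the characteristic function and the fresh labels stand in for the re-encoded equivalence nodes. The function name and the 4-state example are ours, and the symbolic BDD machinery of Figure 5.1 is deliberately replaced by direct enumeration.

```python
def comp_new_partition(psi, delta, states, inputs):
    """Refinement step in the spirit of CompNewPartition (Figure 5.1),
    over explicit states instead of BDDs.

    psi:   dict state -> class label (the previous partition)
    delta: transition function delta(sigma, q)
    Returns a fresh label map: two states share a label iff their
    successors agree classwise on every input symbol -- the role of
    the re-encoded equivalence nodes.
    """
    signatures = {}
    for q in states:
        # Signature = labels of all successors, one per input symbol.
        sig = tuple(psi[delta(sigma, q)] for sigma in inputs)
        signatures.setdefault(sig, []).append(q)
    return {q: label
            for label, group in enumerate(signatures.values())
            for q in group}

# 4-state example: psi is the output-induced partition {0,1 | 2,3};
# under input 1, states 0 and 1 move to 2 and 3, while 2 and 3 move to 0.
psi = {0: 0, 1: 0, 2: 1, 3: 1}
delta = lambda sigma, q: q if sigma == 0 else {0: 2, 1: 3, 2: 0, 3: 0}[q]
new = comp_new_partition(psi, delta, [0, 1, 2, 3], [0, 1])
```

On this example the refinement keeps the blocks {0, 1} and {2, 3}, since equivalent states have classwise-equal successors.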
5.3.2 Algorithm for Equivalent State Identification
To identify equivalent states, we have to compute πkM until it equals πk−1M; then πM =
πkM. Theorem 31 provides three alternatives to derive πM. Its proof is supported by
Lemma 29, which is restated as Lemma 30.
Lemma 29 Consider an FSM with transition function ~δ : Σ × Q → Q. Let π1 and π2 be
two arbitrary partitions on Q. For q1, q2 ∈ Q,
~δ(σ, q1) and ~δ(σ, q2) are in the same equivalence class of π1 and of π2 for any σ ∈ Σ
if, and only if,
~δ(σ, q1) and ~δ(σ, q2) are in the same equivalence class of π1 · π2 for any σ ∈ Σ.
Proof. Let R1 and R2 be the corresponding equivalence relations of π1 and π2, respectively.
(=⇒) The given condition implies (~δ(σ, q1), ~δ(σ, q2)) ∈ Ri, i = 1, 2, ∀σ ∈ Σ. Thus,
(~δ(σ, q1), ~δ(σ, q2)) ∈ R1 ∩ R2, ∀σ ∈ Σ. Since R1 ∩ R2 is the equivalence relation of π1 · π2,
the proof follows.
(⇐=) From (~δ(σ, q1), ~δ(σ, q2)) ∈ R1 ∩ R2, ∀σ ∈ Σ, we obtain (~δ(σ, q1), ~δ(σ, q2)) ∈ Ri,
i = 1, 2, ∀σ ∈ Σ. That is, ~δ(σ, q1) and ~δ(σ, q2) are in the same equivalence class of π1
and of π2 for any σ ∈ Σ.
Lemma 30 For an FSM with transition function ~δ, assume π1 and π2 are two partitions
over the state space. Let ψ1, ψ2 and ψ1·2 be the characteristic functions of π1, π2, and
π1 · π2, respectively. For characteristic functions ψ′1 = CompNewPartition(ψ1, ~δ), ψ′2 =
CompNewPartition(ψ2, ~δ), and ψ′1·2 = CompNewPartition(ψ1·2, ~δ), their corresponding
partitions satisfy π′1 · π′2 = π′1·2.
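The product of two partitions used in Lemmas 29 and 30 amounts to pairing labels, as in this small sketch (explicit label maps instead of characteristic functions; the names are ours):

```python
def product_partition(psi1, psi2, states):
    """Product pi1 * pi2 over explicit states: two states share a class
    iff they agree under both label maps -- the relation R1 ∩ R2."""
    return {q: (psi1[q], psi2[q]) for q in states}

# Example: pi1 = {0,1 | 2,3} and pi2 = {0,2 | 1,3}; their product
# separates all four states.
psi1 = {0: 'a', 1: 'a', 2: 'b', 3: 'b'}
psi2 = {0: 'x', 1: 'y', 2: 'x', 3: 'y'}
prod = product_partition(psi1, psi2, [0, 1, 2, 3])
```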
Theorem 31 Given an FSM M, for a positive integer k,

πkM = πk−M · πk−1M   (5.1)

    = πk−M · π0M, if M is a Moore machine
    = πk−M · π1M, if M is a Mealy machine   (5.2)

    = π〈0〉M · π〈1〉M · · · π〈k〉M   (5.3)

where πk−M = {Ei | q1, q2 ∈ Ei iff ~δ(σ, q1) and ~δ(σ, q2) are in the same equivalence class of
πk−1M for any σ ∈ Σ}.
Proof. We prove these equations by the order (5.3), (5.2), (5.1).
Equation (5.3): By the definition of πkM, states in an equivalence class are indistinguishable
under length-k input sequences. According to Proposition 27, no outputs at steps from
0 to k can distinguish two states if, and only if, the states lie in the same equivalence class
of π〈0〉M, of π〈1〉M, . . . , and of π〈k〉M. Thus, by Lemma 29, the states stay in the same equivalence
class of π〈0〉M · π〈1〉M · · · π〈k〉M.
Equation (5.2): Following the result of (5.3), we get πk−1M = π〈0〉M · π〈1〉M · · · π〈k−1〉M. Suppose
we use the characteristic function of πk−1M and transition function ~δ as the inputs to
CompNewPartition. By Theorem 28 and Lemma 30, the output of the algorithm is πk−M, that
is, the characteristic function of π〈1〉M · π〈2〉M · · · π〈k〉M for a Moore machine or of π〈2〉M · π〈3〉M · · · π〈k〉M
for a Mealy machine. Making a product partition with the initial partition induced by the
outputs, we derive πkM. (Note that π0M is redundant for a Mealy machine.)
Equation (5.1): By expressing πk−M and πk−1M in the product forms of π〈·〉M's as in the
proof of (5.2), the equation follows.
Based on Equations (5.1)–(5.3) for deriving πM, Figures 5.2–5.4 sketch three algorithms,
denoted IDES5.1, IDES5.2 and IDES5.3, respectively. In these pseudocodes, to “combine”
a set of characteristic functions means to apply the procedure in Figure 5.1, except that F is the
hyperfunction of the set of characteristic functions.
These algorithms terminate in a finite number of iterations. IDES5.1 and IDES5.2
converge because the partitions over finite states are refined continuously and the number
of equivalence classes grows monotonically. On the other hand, because π〈k〉M in general is
not a refinement of π〈k−1〉M, IDES5.3 cannot simply determine the fixed point by comparing
the numbers of equivalence nodes in ψ〈k−1〉M and ψ〈k〉M. Therefore, its fixed-point analysis is
more expensive: in general, one should check whether or not new equivalence classes
are created over previous partitions.
input: an FSM M = (Q, I, Σ, Ω, ~δ, ~ω)
output: characteristic function of πM
begin
01  if M is a Moore machine then ψ− := ~ω
02  else ψ− := CompNewPartition(identity fn, ~ω)
03  ψ+ := CompNewPartition(ψ−, ~δ)
04  while num. equiv. nodes of ψ+ ≠ that of ψ− do
05      ψ− := ψ+
06      ψ+ := CompNewPartition(ψ+, ~δ)
07  ψ+ := combine ψ+ and ψ−
08  return ψ+
end
Figure 5.2. Algorithm IDES5.1: Identify Equivalent States, Equation (5.1).
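An explicit-state analogue of this fixed-point loop can be sketched as follows. For brevity, the combine step (Line 07) is folded into every refinement signature, which makes each partition a proper refinement of the previous one; the function name and the example machine are ours, not the thesis's.

```python
def identify_equivalent_states(states, inputs, delta, initial_partition):
    """Explicit-state sketch in the spirit of IDES5.1 (Figure 5.2).

    initial_partition: dict state -> class label, the output-induced
    partition (pi<0> for Moore, pi<1> for Mealy). Each signature keeps
    the state's previous label, so successive partitions refine one
    another and the class count grows monotonically to the fixed point.
    """
    psi = initial_partition
    while True:
        new = {q: (psi[q],) + tuple(psi[delta(s, q)] for s in inputs)
               for q in states}
        # Fixed point: the number of classes stopped changing.
        if len(set(new.values())) == len(set(psi.values())):
            return psi  # pi_M
        psi = new

# Moore example: outputs split the states into {0,1 | 2,3}; the
# iteration confirms this is already the equivalence partition.
delta = lambda sigma, q: q if sigma == 0 else {0: 2, 1: 3, 2: 0, 3: 0}[q]
pi = identify_equivalent_states([0, 1, 2, 3], [0, 1], delta,
                                {0: 0, 1: 0, 2: 1, 3: 1})
```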
Although Figures 5.2 and 5.3 look quite similar, the major difference is in combining
two characteristic functions in Line 7 and Line 8, respectively. Despite keeping one more
characteristic function, IDES5.2 could require less memory than IDES5.1 because ψi has a
simpler BDD representation than ψ−. On the other hand, although IDES5.3 keeps all the
characteristic functions across iterations, it has maximal flexibility in arranging their
combination to reduce peak memory consumption.
5.3.3 Robust Equivalent State Identification
The limitations of equivalent state identification using BDD-based functional decompo-
sition result from the explicit representation of equivalence classes and the restricted BDD
variable ordering. In this section we propose some possible techniques to reduce BDD sizes.
Using any underapproximation of the unreachable states as the don't care set, we can assign each
such unreachable state to any equivalence class of reachable states. This flexibility enables
the simplification of characteristic functions. However, because these algorithms use the
number of equivalence classes to decide fixed points, the number of equivalence classes with
solely unreachable states should be kept as a constant during the iterations. (Note that if
input: an FSM M = (Q, I, Σ, Ω, ~δ, ~ω)
output: characteristic function of πM
begin
01  if M is a Moore machine then ψi := ~ω
02  else ψi := CompNewPartition(identity fn, ~ω)
03  ψ− := ψi
04  ψ+ := CompNewPartition(ψ−, ~δ)
05  while num. equiv. nodes of ψ+ ≠ that of ψ− do
06      ψ− := ψ+
07      ψ+ := CompNewPartition(ψ+, ~δ)
08  ψ+ := combine ψ+ and ψi
09  return ψ+
end

Figure 5.3. Algorithm IDES5.2: Identify Equivalent States, Equation (5.2).
unreachable states are not used as don’t cares, there is no such restriction.) Otherwise, we
have to complicate the fixed-point condition by testing if an equivalence class is contained in
the don’t care set. Claim 32 shows BDD constrain [CM90] is a good simplification operator
satisfying this requirement. On the contrary, BDD restrict [CM90] violates it. However, a
BDD restrict followed by a constrain is a good operation.
Claim 32 Given a Boolean function f(~λ, ~µ) with bound set and free set variables ~λ and ~µ,
respectively, assume Λ is the space spanned by ~λ. Let c(~λ) be the characteristic function of
the care set of Λ. Then, constrain(f, c) eliminates all equivalence nodes whose corresponding
equivalence classes are contained in the don’t care set, and preserves all other equivalence
nodes.
Proof. Since BDD structures below equivalence nodes are irrelevant, we can think of f as
another function g : Λ → N, where N is the set of equivalence nodes. As constrain(g, c)
has its range equal to the image {g(λ) | λ ∈ Λ, c(λ) = true}, equivalence nodes not in
this image disappear from the range and those in this image remain in the range. (On the
input: an FSM M = (Q, I, Σ, Ω, ~δ, ~ω)
output: characteristic function of πM
begin
01  ψ〈0〉 := identity function
02  if M is a Moore machine
03  then ψ〈0〉 := ~ω
04       ψ〈1〉 := CompNewPartition(ψ〈0〉, ~δ)
05  else ψ〈1〉 := CompNewPartition(ψ〈0〉, ~ω)
06  k := 1
07  while fixed point not reached do
08      k := k + 1
09      ψ〈k〉 := CompNewPartition(ψ〈k−1〉, ~δ)
10  return combine ψ〈i〉, i = 0, 1, . . . , k − 1
end

Figure 5.4. Algorithm IDES5.3: Identify Equivalent States, Equation (5.3).
other hand, the restrict operator could increase c to c′, c ⊆ c′. Although equivalence nodes
in the original image are kept, some with solely unreachable states might exist.)
To reduce the impact of the restricted BDD variable ordering, we can use the following
strategy. Within the allowed threshold of a BDD size, find the variable ordering such that
the lowest state variable is as high as possible. Treat this variable and those above it as
bound set variables; all others are the free set ones. Then, compact the BDD such that
every node under the cutset is an equivalence node. Work on the new smaller BDD, and
apply variable reordering to it based on the same strategy, incrementally throwing away
unnecessary variables. On the other hand, since this ordering restriction emerges only from
functional decomposition, arbitrary ordering can be used in other BDD manipulations.
This restricted ordering is needed only when counting the number of equivalence classes
and constraining BDDs with respect to the reachable state subspace.
Directly building a single hyperfunction of a set of (binary) functions f1, . . . , fk may
be impractical. Fortunately, this can be avoided by computing equivalence classes incre-
mentally. For instance, first perform functional decomposition on f1. For each resultant
equivalence class, use it as the care set and others as the don’t care set. Hence, there is
a greater chance to build a hyperfunction for the simplified functions of f2, . . . , fk. (If it
fails, we can deepen the recursion level to extract more don’t cares.) Conducting functional
decomposition on it, the equivalence classes in the care set are encoded using new binary
functions. In this way, BDD sizes are kept small. This approach trades time for memory.
We can also explore flexibility to reduce a partition before using it to compute a new
partition. Given two partitions π1 and π2, we say any π†1 (≠ π1) satisfying π†1 · π2 = π1 · π2 is
a reduced partition of π1 with respect to π2. In particular, a simpler reduced partition,
whose characteristic function has a smaller BDD size, is of interest. Theorem 35 states the
validity of this flexibility.
Proposition 33 If πd ⪯ πc holds for two partitions πc and πd, there exists a partition πx
such that πc · πx = πd.

Lemma 34 Assume partitions π, π′ and πc satisfy π · πc = π′ · πc. If any πd satisfies
πd ⪯ πc, then π · πd = π′ · πd.

Proof. By πd ⪯ πc and Proposition 33, there exists a partition πx such that πc · πx = πd.
From π · πc = π′ · πc, we derive π · πc · πx = π′ · πc · πx, i.e., π · πd = π′ · πd.
Assume that, after certain iterations of refinement, the overall (product) partition is πo. Let
πy be a new (not overall) partition after one more iteration, and let π†y be a reduced partition
of πy with respect to any πx such that πo ⪯ πx. (Let ψ⋄ denote the characteristic function
of π⋄ for any subscript ⋄.) We have

Theorem 35 For ψz = CompNewPartition(ψy, ~δ) and ψ′z = CompNewPartition(ψ†y, ~δ),
the equality πo · πy · πz = πo · πy · π′z holds.
Proof. Let ψ⋆·• denote the characteristic function of π⋆ · π•, for any subscripts ⋆
and •. In addition, (π)∗ is used to denote the partition with characteristic function
CompNewPartition(ψ, ~δ), for any partition π with characteristic function ψ.
By the definition of a reduced partition, π†y · πx = πy · πx. Since πo ⪯ πx, the equation
π†y · πo = πy · πo holds according to Lemma 34. So, (π†y · πo)∗ = (πy · πo)∗. From Lemma 30,
we get (π†y)∗ · (πo)∗ = (πy)∗ · (πo)∗. Since π′z = (π†y)∗ and πz = (πy)∗, we have π′z · (πo)∗ = πz · (πo)∗.
Also, from Theorem 31, πo · πy ⪯ (πo)∗. Hence, by Lemma 34, πo · πy · πz = πo · πy · π′z.
In light of Theorem 35, an algorithm can be implemented by modifying IDES5.2 and
IDES5.3 as follows. Keep a set of characteristic functions to represent the overall partition.
Compute new partitions based only on an essential partition, which consists of equivalence
classes that refine the previous overall partition. In this manner, the BDD size is kept small
and the iterative computation is sped up.
5.4 Verification of Sequential Equivalence
The proposed technique can be applied for sequential verification. The following two
propositions form the basis of our equivalence checking. The first states a property that
two equivalent FSMs must have.
Proposition 36 Given two equivalent FSMs M1 and M2 with sets of equivalence classes
πM1 and πM2, respectively, assume expunging unreachable states from πM1 and πM2 results
in π♭M1 and π♭M2, respectively. Then, there exists a bijection f : π♭M1 → π♭M2, where f
reflects the state isomorphism of M1 with M2.
On the other hand, to show the equivalence between two FSMs, Proposition 37 gives
necessary and sufficient conditions.

Proposition 37 M1 and M2, with initial states I1 and I2, respectively, are equivalent if,
and only if, there exists a bijection f : π♭M1 → π♭M2 (f reflects the state isomorphism of M1
with M2), and E2 = f(E1) with I1 ∈ E1 ∈ π♭M1 and I2 ∈ E2 ∈ π♭M2.
Based on Proposition 37, we can extend the identification of state equivalence to sequen-
tial equivalence checking. In order to pose the problem of verification as the identification
of state equivalence, the multiplexed machine is introduced.
5.4.1 Multiplexed Machine
To check equivalence between two FSMs M1 and M2 with m1 and m2 registers, respec-
tively, assume without loss of generality m2 ≥ m1. Their multiplexed machine, denoted
M1on2, is depicted in Figure 5.5. The two FSMs share the same primary inputs. Their
corresponding outputs are multiplexed as a set of global primary outputs. To minimize the
state variables of M1on2, for every next state variable of M1, we pair it arbitrarily with
one of M2. This pair is then multiplexed before being fed to a register, whose output is
then demultiplexed to recover the current state variables for M1 and M2. In addition,
one self-looped auxiliary state variable (aux ) is added which controls all multiplexers and
demultiplexers as indicated by the dotted lines in Figure 5.5. The value of aux remains
the same as its initial value. Let M1on2 select M1 and M2 when aux has values 0 and 1,
respectively. No matter what the initial value of aux is, the multiplexed machine functions
the same as M1 and M2, if they are equivalent. In the verification, we can imagine that
aux is in a superposition status, possessing values 0 and 1 simultaneously. (Note that,
without changing its functionality, the multiplexed machine can be simplified by omitting
the demultiplexers. That is, replacing each demultiplexer, we directly connect its input to
its outputs. Also, it is worth mentioning that choosing any subset of the next state variables of
M1 to be paired is valid. Suppose, in the extreme case, we choose an empty subset. Then,
aux and the multiplexers for outputs are unnecessary. The multiplexed machine, therefore,
degenerates into two separate machines. The corresponding verification is discussed
in Section 5.4.5.)

[Figure: M1 and M2 share the primary inputs; their corresponding outputs and paired
next-state variables (m1 bits paired, (m2 − m1) bits unpaired) are multiplexed under
control of the self-looped aux bit.]

Figure 5.5. Multiplexed Machine.
5.4.2 Algorithm for Sequential Equivalence Checking
Given two FSMs M1 and M2 with initial states¹ I1 and I2, respectively, without loss
of generality assume their multiplexed machine M1on2 selects M1 (M2) while aux equals
zero (one). Their equivalence can be verified based on Lemma 38, a consequence of Propo-
sition 37.
Lemma 38 M1 and M2 are equivalent if, and only if, πM1on2 has at least one reachable
state with aux bit 0 and at least one with 1 in every equivalence class containing any reachable
state, and has initial states (I1 with aux 0 and I2 with aux 1) in the same equivalence class.
¹To simplify the discussion, we assume each FSM has a single initial state. This can be straightforwardly generalized to a set of initial states.
Proof. Assume f : π♭M1 → π♭M2 reflects the state isomorphism between M1 and M2. Let
E2 = f(E1) for E1 ∈ π♭M1 and E2 ∈ π♭M2. Then, after adding the aux bit, original reachable
states (including initial states) q1 ∈ E1 and q2 ∈ E2 must be within the same equivalence
class of πM1on2. Thus, every equivalence class of πM1on2 containing any reachable state must
have at least one state (with aux 0) contributed from M1 and one (with aux 1) from M2.
By iterative refinement of the state space, as in the identification of state equivalence,
the equivalence classes of states of M1on2 can be derived once the fixed point has been
reached. According to Lemma 38, both conditions are checked. However, the first condition
implies that we need to know reachable states of both M1 and M2. Fortunately, the first
condition is redundant, i.e., as long as the second condition is satisfied, so is the first. This
property is stated in Theorem 39. As a result, reachability analysis can be completely
eliminated.
Theorem 39 M1 and M2 are equivalent if, and only if, πM1on2 has initial states, namely
I1 with aux 0 and I2 with aux 1, within the same equivalence class.
Proof. By contradiction, we show that the first condition in Lemma 38 is redundant. Assume
πM1on2 has initial states in the same equivalence class Ei, and there exists an equivalence
class E containing reachable states all with aux bits 0 (or 1, it does not matter). Therefore,
E 6= Ei. For any reachable state of E, there must be a reachable state, say q, (with aux bit
0) that transitions to it. This transition makes q have no equivalent reachable states from
M2. Therefore, the equivalence class containing q has all reachable states with aux bits
0. Continuing this argument, we conclude that Ei must exclude the state, I2 with aux 1.
Hence, a contradiction arises.
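Theorem 39 suggests the following explicit-state sketch: refine a partition of the disjoint union of the two state spaces (states tagged with the aux bit) and test whether the two initial states remain in one class. The dictionary encoding of the machines and all names here are ours; the thesis performs the same computation symbolically on the multiplexed machine.

```python
def equivalent(M1, M2, inputs):
    """Explicit-state analogue of the multiplexed-machine check.

    Each machine is a dict with 'states', 'init', 'delta', 'omega'
    (Mealy output omega(sigma, q)) -- a hypothetical encoding for
    this sketch. Returns True iff the machines are equivalent."""
    # Disjoint union: tag states with an aux bit (0 -> M1, 1 -> M2).
    union = [(0, q) for q in M1['states']] + [(1, q) for q in M2['states']]
    omega = lambda s, aq: (M1, M2)[aq[0]]['omega'](s, aq[1])
    delta = lambda s, aq: (aq[0], (M1, M2)[aq[0]]['delta'](s, aq[1]))
    # Initial partition (pi<1>): group states by outputs on every input.
    psi = {q: tuple(omega(s, q) for s in inputs) for q in union}
    while True:
        # Refine: a signature keeps the old label and adds the labels
        # of all successors, so each partition refines the previous.
        new = {q: (psi[q],) + tuple(psi[delta(s, q)] for s in inputs)
               for q in union}
        if len(set(new.values())) == len(set(psi.values())):
            break  # fixed point reached
        psi = new
    # Theorem 39: equivalent iff both initial states share one class.
    return psi[(0, M1['init'])] == psi[(1, M2['init'])]

# Example: a 2-state machine that ignores its state vs. a 1-state
# machine, both echoing the input (Mealy outputs) -- equivalent.
M1 = {'states': [0, 1], 'init': 0,
      'delta': lambda s, q: 1 - q, 'omega': lambda s, q: s}
M2 = {'states': ['a'], 'init': 'a',
      'delta': lambda s, q: 'a', 'omega': lambda s, q: s}
```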
Further, rather than checking that the condition of Theorem 39 is satisfied in the overall
partition of the state space, validity can be verified on the new partition at each iteration.
The correctness of this variant is based on Proposition 27. As the BDD representation
input: two FSMs under equivalence checking
output: yes if equivalent; no otherwise
begin
01  build the multiplexed machine M
02  compute the init. partition πiM
03  if init. states not in an equiv. class of πiM
04  then return no
05  while fixed point not reached do
06      compute πnewM
07      refine the overall partition and simplify πnewM
08      if init. states not in an equiv. class of πnewM
09      then return no
10  return yes
end
end
Figure 5.6. Algorithm: Verify Sequential Equivalence.
of the current partition is obtained, testing whether the two initial states are within the
same equivalence class takes time linear in the number of state variables. Consequently, this
check can be done efficiently in each iteration. Figure 5.6 outlines the overall procedure
for sequential equivalence checking.
Remark: In theory, k FSMs can be verified simultaneously by introducing ⌈log2 k⌉
auxiliary state variables to control the k-to-1 multiplexers of their corresponding multiplexed
machine.
5.4.3 Robust Sequential Equivalence Checking
To make the verification procedure more robust, the techniques and restrictions listed in
Section 5.3.3 are also applicable here. Instead of repeating them, this section is concerned
with those that are particular to verification.
Verifying each primary output and/or characteristic function separately could substan-
tially reduce the number of encountered equivalence classes. The numbers of equivalence
classes induced by individual primary outputs may be exponentially smaller than those
induced by all of the primary outputs. The correctness of this separation is inferred from
Lemma 29. It is interesting to notice that the cone of influence reduction has been auto-
matically taken care of due to this separation, i.e., irrelevant state variables with respect to
the considered primary output disappear.
Although reachability analysis is unnecessary, any under-estimation of unreachable
states of M1 and/or M2 can be used as a don’t care set to simplify BDD expressions
and to reduce unnecessary state refinements. Theorem 40 shows the correctness of such
simplification and the maximal don’t care set for the multiplexed machine. However, as
mentioned in Section 5.3.3, the fixed-point condition should be preserved to ensure the
algorithm terminates.
Theorem 40 The equivalence condition of M1 and M2 is invariant under don’t care sim-
plification by unreachable states of M1on2, that is, unreachable states of M1 with aux 0
together with those of M2 with aux 1.
Proof. Because state transition is irrelevant to the simplification of characteristic functions
of partitions, the proof of Theorem 39 still holds.
Assume the sets of reachable (unreachable) states of M1 and M2 are R1 (U1) and R2
(U2), respectively. Let α be the auxiliary state variable. Since the state space of M1on2 is
the direct sum of M1 and M2 distinguished with the auxiliary state variable, it consists of
four disjoint subsets ¬α ∧ R1, ¬α ∧ U1, α ∧ R2, and α ∧ U2. The reachable set of states of
M1on2 is (¬α ∧R1) ∪ (α ∧R2); the unreachable set is (¬α ∧ U1) ∪ (α ∧ U2).
Besides don’t care simplification, the partitioned state space can be reduced further
according to the following theorem.
Theorem 41 Let πkM1on2 be the partition associated with the k-equivalence relation of
M1on2. Then, equivalence checking is invariant under the reduction of πkM1on2 by collapsing
the set {E ∈ πkM1on2 | aux(q) = 0, ∀q ∈ E} of equivalence classes into one equivalence class
and collapsing {E ∈ πkM1on2 | aux(q) = 1, ∀q ∈ E} into another, where aux(q) denotes the
valuation of the aux bit of q.
Proof. It is clear that M1 and M2 are equivalent only if the collapsed equivalence classes
are unreachable from the initial states.
Since we collapse the equivalence classes of M1 and of M2 separately, states from one
machine which have transitions to these equivalence classes do not have corresponding equiv-
alent states from the other machine. Besides, as state transition relations are not affected
by the collapsing, the equivalence relation among other states, which cannot transition to
these equivalence classes, remains intact. Since the condition holds for all k ≥ 0, M1 and
M2 must be equivalent. Hence, the verification is invariant under this reduction.
Corollary 42 For two FSMs M1 and M2 with n1 and n2 equivalence classes, respectively,
the number of equivalence classes can be kept at most min{n1, n2} + 1 in our
sequential equivalence checking with the use of the collapsing process in Theorem 41.
That is, the number of variables introduced to generate equivalence nodes is at most
⌈log2 log2(min{n1, n2} + 1)⌉. Assume the n-input FSMs M1 and M2 have m1 and m2
state variables, respectively. Then, by verifying each output separately, the total number of
variables in our verification is at most (n + max{m1, m2} + 1 + ⌈log2 log2(min{n1, n2} + 1)⌉)
≤ (n + max{m1, m2} + 1 + ⌈log2(min{m1, m2} + 1)⌉).
In the construction of the multiplexed machine, a multiplexer, selecting state variables,
pairs a state variable from M1 with any unpaired one from M2. Since this pairing is ar-
bitrary (and, thus, can be adaptively changed on-the-fly), an optimization problem is to
maximize the BDD sharing between M1 and M2, and to simplify the consequent BDD
manipulations. Heuristics can be derived based on the cone of influence reduction and
functional similarity. The former pairs two state variables which are supports of two similar
sets of primary outputs; the latter pairs two state variables with similar transition func-
tionalities. In the extreme case, when comparing two identical copies of an FSM, we can
possibly reduce the BDD such that it is as if there is only one machine.
5.4.4 Error Tracing and Shortest Distinguishing Sequence
Given two states q1 and q2 which are k-distinguishable at an output of an FSM
M = (Q, I,Σ,Ω, ~δ, ~ω), this section illustrates how to derive a length-k input sequence
differentiating them.
Since q1 and q2 are k-distinguishable, their corresponding BDD paths lead to different
equivalence nodes in some characteristic function at the kth refinement. Let the functions
represented by these two BDD nodes be f1 and f2. (Notice that f1 and f2 should be the
functions before re-encoding and simplification mentioned in Section 5.3.1.) Then, any
solution, say σ∗, to (f1 xor f2) provides the kth distinguishing input vector. On the
other hand, two states q′1 = ~δ(σ∗, q1) and q′2 = ~δ(σ∗, q2) are (k − 1)-distinguishable. They
result in the distinguishability of q1 and q2 at the kth refinement. Similarly, the (k − 1)st
distinguishing input vector can be obtained. Repeating this process backward, one can
derive a shortest distinguishing sequence to trace an error.
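Over explicit states, the same shortest distinguishing sequence can also be found by a forward breadth-first search on state pairs, sketched below; the function name and the 3-state Mealy example are assumptions of ours rather than the thesis's backward BDD-based procedure.

```python
from collections import deque

def shortest_distinguishing_sequence(delta, omega, inputs, q1, q2):
    """Forward BFS over state pairs: returns a shortest input sequence
    on which the (Mealy) outputs of q1 and q2 differ, or None if the
    two states are equivalent. BFS level order guarantees minimality."""
    seen = {(q1, q2)}
    queue = deque([(q1, q2, [])])
    while queue:
        a, b, seq = queue.popleft()
        for sigma in inputs:
            if omega(sigma, a) != omega(sigma, b):
                return seq + [sigma]  # outputs differ at this step
            pair = (delta(sigma, a), delta(sigma, b))
            if pair not in seen:
                seen.add(pair)
                queue.append((pair[0], pair[1], seq + [sigma]))
    return None  # no distinguishing sequence: q1 and q2 are equivalent

# 3-state example: states 0 and 1 agree on every immediate output, but
# their successors 0 and 2 are told apart by input 1 -- a 2-step error.
delta = lambda sigma, q: {0: 0, 1: 2, 2: 2}[q]
omega = lambda sigma, q: 1 if (q == 2 and sigma == 1) else 0
trace = shortest_distinguishing_sequence(delta, omega, [0, 1], 0, 1)
```

On this example the returned trace has length 2, matching the fact that states 0 and 1 are 2-distinguishable.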
5.4.5 State-Space Partitioning on Separate Machines
The multiplexed machine is not the only construction that extends state equivalence to
machine equivalence. To prove the equivalence of M1 and M2, the state variables can be
kept disjoint while the inputs are shared. Therefore, their state spaces are partitioned
separately, but simultaneously, by maintaining two sets of shared BDDs during functional
decomposition. Again, they are equivalent if, and only if, their initial states lead to the
same equivalence node when the fixed point is reached.
In the case of the multiplexed machine, state variables of M1 and M2 are merged by
multiplexers. As mentioned in Section 5.4.3, the register pairing affects the cone of influence
and BDD manipulations. By state-space partitioning on separate machines, the interference
among state variables is removed. However the major drawback is that there is no BDD
sharing between M1 and M2 above the cutset. Notice that, although the number of state
variables in this case is the same as for the product machine, the verification is still in the
sum state space.
5.4.6 State-Space Partitioning on Product Machine
Verification by state-space partitioning works for the product machine as well. It can
be done by slight modifications of [LTN90] and [Pix92], previously known as the backward
state traversal [Fil91]. We refer to it as state-space partitioning on the product machine.
When compared to state-space partitioning on the multiplexed machine, this approach
has more flexibility in BDD variable ordering. However, this flexibility prevents simplifica-
tion by the restrict or constrain operator with respect to the reachable states because this
might corrupt the represented equivalence relation.
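In explicit-state terms, the idea behind state-space partitioning on the product machine can be sketched as a fixed-point refinement over ordered state pairs. The fragment below is a hypothetical Python illustration over tiny machines given as dictionaries, not the symbolic backward-traversal algorithm of [LTN90, Pix92]:

```python
# Explicit-state sketch: refine an "equivalence" over ordered pairs (q1, q2)
# of states of M1 and M2; the machines are assumed given as dictionaries.

def equivalent(states1, states2, inputs, d1, w1, d2, w2, init1, init2):
    """True iff init1 (of M1) and init2 (of M2) are output-equivalent."""
    # start from all pairs that agree on every immediate output
    equiv = {(a, b) for a in states1 for b in states2
             if all(w1[a, s] == w2[b, s] for s in inputs)}
    while True:
        # keep a pair only if every input keeps its successors paired
        refined = {(a, b) for (a, b) in equiv
                   if all((d1[a, s], d2[b, s]) in equiv for s in inputs)}
        if refined == equiv:              # fixed point reached
            return (init1, init2) in equiv
        equiv = refined
```

As in the text, the relation over ordered pairs need not be reflexive or symmetric; only transitivity is preserved across refinements.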
5.5 Analysis
This section consists of two parts. First, some verification properties, independent of
the implementation of a design, are analyzed. Second, we discuss circuit implementation
related effects on the sequential equivalence checking problem.
5.5.1 Implementation-Independent Aspects
Given an FSM taking a total of n iterations in state-space partitioning, its partition
structure is defined as an ordered sequence p = (p1, p2, . . . , pn), where pi denotes the
accumulated number of equivalence classes at the ith iteration. Thus, pi < pi+1, for i =
1, . . . , n− 1, and pn = pn+1.
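For a small explicit-state machine the partition structure can be computed directly. The following Python sketch (a hypothetical dictionary-based rendering, not the symbolic computation of this chapter) records the class count pi at each refinement until pn = pn+1:

```python
# Hypothetical explicit-state sketch: partition structure p = (p1, ..., pn)
# of a small Mealy machine given as dictionaries.

def partition_structure(states, inputs, delta, omega):
    # first refinement: states with identical immediate outputs share a class
    block = {q: tuple(omega[q, s] for s in inputs) for q in states}
    p = [len(set(block.values()))]
    while True:
        # split classes by the classes their successors fall into
        block = {q: (block[q],) + tuple(block[delta[q, s]] for s in inputs)
                 for q in states}
        n = len(set(block.values()))
        if n == p[-1]:                    # fixed point: p_n = p_{n+1}
            return p
        p.append(n)
```

Since refinement only splits classes, the recorded sequence is strictly increasing until the fixed point, matching pi < pi+1 above.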
Theorem 43 Any two equivalent FSMs must have the same partition structure in their
reachable state subspace.
Proof. Assume two equivalent FSMs M and M′ have sets of equivalence classes π and
π′, respectively, in their reachable state subspace. Therefore, according to Proposition 36,
there exists a bijection f : π → π′.
Suppose M and M′ have different partition structures. Since the state space is continuously
refined in the fixed-point computation, there must exist l- and k-distinguishable state
pairs (q1, q2) and (q′1, q′2), respectively, such that l > k, q1 ∈ E1 ∈ π, q′1 ∈ f(E1) ∈ π′,
q2 ∈ E2 ∈ π, and q′2 ∈ f(E2) ∈ π′. Let ~δ and ~δ′ be the transition functions of M and
M′, respectively. Then, the pairs {(qi, qj) | ∃σ. (~δ−1(σ, qi), ~δ−1(σ, qj)) = (q1, q2)} must be at
least (l − 1)-distinguishable, and at least one of them is exactly (l − 1)-distinguishable. Similarly,
the pairs {(q′i, q′j) | ∃σ. (~δ′−1(σ, q′i), ~δ′−1(σ, q′j)) = (q′1, q′2)} are at least (k − 1)-distinguishable,
and at least one of them, say (q′i∗ , q′j∗), is exactly (k − 1)-distinguishable. Let σ∗ be the input such
that (~δ′(σ∗, q′1), ~δ′(σ∗, q′2)) = (q′i∗ , q′j∗). Also, let (qi∗ , qj∗) = (~δ(σ∗, q1), ~δ(σ∗, q2)). Suppose
qi∗ ∈ Ei∗ ∈ π and qj∗ ∈ Ej∗ ∈ π. Then, q′i∗ ∈ f(Ei∗) ∈ π′ and q′j∗ ∈ f(Ej∗) ∈ π′. Now,
since (qi∗ , qj∗) is at least (l− 1)-distinguishable and (l− 1) > (k − 1), the same reasoning
applies recursively to (qi∗ , qj∗) and (q′i∗ , q′j∗). At some point of the recursion, we reach
a situation in which (q′i∗ , q′j∗) can be differentiated by some output while (qi∗ , qj∗) cannot.
This violates the base cases of Lemma 26. Hence, M and M′ must have the same partition
structure.
Therefore, partition structures in reachable state subspace form a signature for equiv-
alent FSMs. This may not be true for the entire state space. However, even without the
knowledge of state reachability, the following holds.
Theorem 44 Given two FSMs M1 and M2 converging in m and n steps, respectively, in
state-space partitioning, their product machine converges in no more than min{m, n} steps
in state-space partitioning.
Proof. In state-space partitioning, the product machine has a state “equivalence relation” ≡P
over (ordered) pairs of states (q1, q2) with q1 ∈ Q1 and q2 ∈ Q2, where Q1 and Q2 are the
sets of states of M1 and M2, respectively. Notice that ≡P may not satisfy the reflexive and
symmetric laws. Nevertheless, the transitive law holds for the ordered pairs of states. Since
the transitive law is maintained during the fixed-point computation, it is clear that once one
machine converges, so does the product machine. On the other hand, this state-partitioning
procedure does not refine the state subspace {q ∈ Q1 | (q, q2) ∉ ≡P , ∀q2 ∈ Q2} × Q2 ∪ Q1 × {q ∈ Q2 | (q1, q) ∉ ≡P , ∀q1 ∈ Q1}. Hence, it could converge in fewer than min{m, n} steps.
Theorem 45 Given two FSMs M1 and M2 converging in m and n steps, respectively,
in state-space partitioning, their multiplexed machine converges in exactly max{m, n} steps
in state-space partitioning. With the state space reduced by Theorem 41 in each iteration,
the computation converges in the same step as state-space partitioning on their
product machine.
Proof. The construction of the multiplexed machine is designed to match corresponding
equivalence classes betweenM1 andM2. State-space partitioning on the combined machine
has no effect on the partition of the state subspace spanned by any individual FSM. Once
each subspace of M1 and M2 has reached a fixed point in state partitioning, so has the
space of their combined machine. Therefore, the combined machine converges in exactly
max{m, n} steps.
When the state space is reduced by Theorem 41 in each iteration, the fixed-point com-
putation does not refine the state subspace spanned by the collapsed equivalence classes.
The state space is partitioned in the same way as that of the product machine. Hence, the
multiplexed and product machines converge in the same step in state-space partitioning.
In contrast, for state traversal of an FSM, although we can similarly define a traversal
structure to be the sequence of numbers of reached states, we cannot use it as a signature.
Moreover, even if the traversal depths for two FSMs are known, they merely provide a lower
bound on the depth of the product machine. No strong argument like Theorems 44 and 45
is possible.
The following theorem shows the connection between the number of refinements in state
partitioning and the depth of state traversal.
Theorem 46 Given two k-distinguishable FSMs M1 and M2, both state-traversal- and
state-partition-based approaches differentiate them at the kth step.
Proof. Since state traversal on the product machine of M1 and M2 implicitly enumerates
all possible transitions, any discrepancy is observed in the fewest possible steps, namely k.
On the other hand, for state partitioning, the initial states of M1 and M2 must
be k-distinguishable in their combined machine, so the kth refinement differentiates them. The theorem follows.
As a consequence, Corollary 47 follows.
Corollary 47 Given two FSMs M1 and M2, let M1×2 be their product machine. Assume
np is the number of refinements in state partitioning on M1×2, and nt is the depth of
state traversal on M1×2. Then, min{np, nt} is an upper bound on the number of iterations
required for equivalence checking.
In other words, following Corollary 47, if np > nt, we can conclude the equivalence of
M1 and M2 in nt refinements of state partitioning on M1×2. Similarly, if np < nt, their
equivalence can be confirmed in np steps of state traversal on M1×2. Also, Corollary 48
follows immediately from Theorem 44.
Corollary 48 Given two FSMs, M1 and M2, converging in m and n steps, respectively,
in state-space partitioning, their equivalence can be concluded in no more than min{m, n} steps of state partitioning on their multiplexed machine.
5.5.2 Implementation-Dependent Aspects
Retiming [LS83] is an important technique in sequential circuit optimization. There
are two types of atomic moves in retiming, namely forward (from inputs to outputs) moves
and backward (from outputs to inputs) moves across functional blocks. Here we investigate
their effects on the number of equivalence classes in the state space. Suppose an FSM Mb is
retimed from another FSM Mf using only backward moves across a functional block with
function f : Qb → Qf , where Qb and Qf are the state spaces of Mb and Mf , respectively.
(Equivalently, Mf is retimed from Mb using forward moves across the functional block
with function f .)
Proposition 49 Two states qb and q′b of Mb are equivalent, i.e., qb ≡Mb q′b, if and only if
their corresponding states f(qb) and f(q′b) of Mf are equivalent, i.e., f(qb) ≡Mf f(q′b).

Proposition 50 If qb ≡Mb q′b, then the corresponding states of qb, q′b, f(qb), and f(q′b) in
the multiplexed machine Mb⋈f of Mb and Mf are in the same equivalence class of Mb⋈f .
Theorem 51 The number of equivalence classes of Mb is not greater than that of Mf .
Proof. Since f is a total function, i.e., f is well defined for all states of Mb, the theorem
follows from Proposition 49.
Theorem 52 The number of equivalence classes of Mf is greater than that of Mb if, and
only if, there exists a state q of Mf such that f−1(q) = ∅ and q ≢Mf f(qb), ∀qb ∈ Qb.
Proof. The theorem follows from Proposition 49.
Arguments similar to those of Theorems 51 and 52 were used in [SPRB95] in discussing
the validity of retiming.
5.6 Experimental Results
Using the VIS [BHSV+96] environment, we compared three equivalence checking tech-
niques, namely,
STPM – state traversal on the product machine,
SPPM – state partitioning on the product machine, and
SPMM – state partitioning on the multiplexed machine.
The experiments were conducted on a Linux machine with a 700-MHz Pentium III Xeon
CPU and 2 GB of RAM.
For STPM and SPPM, the VIS sequential verification command is used. Dynamic
variable reordering is turned on and the hybrid method [MKRS00], considered the state-of-
the-art technique for image computation, is used. For SPMM, variable reordering is enabled
when appropriate.
To demonstrate the relative power of the three techniques, we first compare a set of
benchmark circuits against themselves. (Although combinational checking suffices in this
circumstance, we are only interested in sequential methods.) In general, combinational
equivalence checking should be tried in situations where there is structural similarity. The
techniques of this chapter aim at situations where there is no such similarity. The self-
comparison benchmarks are used to compare the methods on a large set of examples. Care is
taken not to exploit similarity by using a method for pairing state variables which considers
only the cones of influence of the primary outputs. To further emphasize that no similarity
is being exploited, a second set of experiments is done comparing circuits against their
retimed versions.
An argument detailing why self-comparison is sufficient for the experiments is Proposition
36, which states that two different implementations M1 and M2 must have corresponding
equivalence classes in the reachable set of states. Thus, the reachable state
spaces of M1⋈2, M1⋈1, and M2⋈2 all have the same number of equivalence classes. Also,
even if M1 and M2 have incomparable numbers of equivalence classes in the whole state
spaces, by Corollary 42, the number of equivalence classes encountered by SPMM is at most
min{n1, n2} + 1, where ni is the number of equivalence classes of Mi, i = 1, 2. Thus, conclusions
drawn from self-comparison experiments should remain valid for general comparisons.
In Tables 5.1 and 5.2, we provide the characteristics of the benchmark circuits; the
empirical results follow in Tables 5.3 and 5.4. Table 5.1 gives the profiles of the selected
benchmarks from iscas’89, lgsynth’91, texas’97, VIS and texas. Columns 2–4 indicate the
number of inputs, outputs, and registers respectively. In addition, the number of reachable
states and the corresponding traversal depth are provided in Column 5. (Here, we reset
uninitialized state variables to zero.)
Information about the equivalence classes is included in Table 5.2. As mentioned
in Section 5.4.3, we can verify sequential equivalence by examining each primary output
separately instead of treating them as a whole. The advantage is that we can reduce the
peak memory required to record the encountered equivalence classes. To provide strong
evidence, Table 5.2 contains two parts of data. The first part, “Overall Partition,” in
Columns 2 and 3 shows the number of equivalence classes induced by all primary outputs.
The number in the following parentheses indicates the depth of refinement in the corre-
sponding fixed-point computation. In contrast, the second part, “Worst Partial Partition,”
in Columns 4 and 5 lists the largest number of equivalence classes induced by some pri-
mary output. The number in the following parentheses indicates the maximum depth of
refinement among all outputs. Circuit s991 is an example where separating the verification
task per output yields a substantial reduction in the number of encountered equivalence
classes. In the extreme case, the number of equivalence classes induced by all outputs
can be exponentially (in the number of outputs) larger than those induced by individual
outputs. Usually, the separation of verification tasks lengthens the required refinement.
However, as BDD manipulations could be simplified substantially, the runtime can still be
reduced in most cases. Further, within each part, we compare the number (in the column
marked “whole”) of equivalence classes in the whole state space to the number (in the col-
umn marked “reach”) of equivalence classes in the reachable subspace. As can be seen, in
most instances, this subset is fairly small when compared to the entire space. Since SPMM
directly benefits from these reductions, it can easily verify some large instances which are
unverifiable for STPM and SPPM as indicated in Tables 5.3 and 5.4, where the results for
SPPM and STPM report the best of verifying combined outputs and verifying each out-
put separately. From experience, SPPM has better results in verifying combined outputs
for most circuits while SPMM has the opposite results. This might be explained by the
fact that the performance of SPPM is not directly related to the encountered number of
equivalence classes, while that of SPMM is.
From the experiment in Table 5.3, we observe that, for SPMM, using a monolithic BDD
as a characteristic function suffices for all verifiable benchmarks. The only exception is sbc,
where an array of characteristic functions needs to be maintained. Because using multiple
characteristic functions usually complicates the fixed-point computation, it is in general
more time consuming. Also, we find that SPMM takes more time than STPM and SPPM
for circuits, such as s382, s420.1, etc., with numerous equivalence classes and deep refining
processes. It is understandable because SPMM enumerates each equivalence class in every
refining process.
For circuits like s420.1, where the depths of traversal and refinement are both exponential
in the size of the inputs, none of the three techniques is adequate. However, for
s420.1, since the depth of refinement is half of that of traversal, SPPM is about twice as
fast as STPM. Notice that, as analyzed in Section 5.5.1, although the product machine has
a traversal depth of 65535 (due to self-comparison), we can conclude the equivalence by
traversing states at the 32768th step even before the fixed point is reached.
For cbp and minmax series of circuits, where depths are shallow, STPM and SPPM
perform much better than SPMM, which needs to take care of numerous equivalence classes
as listed in Table 5.2. On the other hand, for minmax circuits, as discussed in [Fil91],
SPPM has a polynomial complexity in input sizes while STPM has an exponential one. In
comparison, SPPM is the best choice for these cases.
Circuits key and bigkey are another extreme, which have a few equivalence classes.
SPMM verifies them quite easily while both STPM and SPPM fail. In general, for control
logic, SPMM performs much better than the other two. Microprocessor 8085 is an example,
where SPMM verifies all the outputs except the 16 outputs of the address bus. (The results of
8085 in Tables 5.2 and 5.3 exclude these unverifiable outputs.) Other examples are control,
IFetchControl2, and IFetchControl3. On the other hand, due to the large number of
outputs in IFetchControl2, IFetchControl3, clma, sbc, etc., SPMM takes a long time to
verify them because it processes each output one at a time. Fortunately, these tasks can
be verified in parallel to minimize the total completion time.
In Table 5.4, the equivalence between a circuit and its retimed implementation is
checked. Retimed circuits were obtained by using sis [SSL+92], except for texas bench-
marks, s641-retime, and tbk-retime. Other circuits, which are included in Table 5.3
but absent from Table 5.4, either take too long for sis to retime, or have incompatible
initial states created by the retiming. Table 5.4 suggests that SPMM does not benefit
particularly when self-comparison is done. (This is because state variables are paired only
by the cone of influence of the outputs; otherwise, corresponding state variables would be
prevented from being paired together, and doing so would destroy the BDD sharing in the
experiments of self-comparison.) This supports that the results of Table 5.3 are relevant for comparing the
three methods. Also, observe from Table 5.4 that SPMM is relatively stable when moving
from self-comparison to comparing against retimed versions. For example, for s526 and
s526n, the results in Tables 5.3 and 5.4 are similar for SPMM, but STPM and SPPM yield
substantial variances. The stability of SPMM derives from the fact that it depends mainly
on the maximum number of registers in the two designs plus the number of equivalence
classes encountered.
Another view of Tables 5.3 and 5.4 is shown in Table 5.5, where the second and third
columns denote the numbers of wins in terms of smaller memory and time usage, respec-
tively, and the last gives the number of examples on which the method failed. This analysis
indicates that SPMM is, on average, more efficient and more rugged than the other two
methods.
We did not experiment with the equivalence checking between inequivalent circuits.
However the expectation is that, according to Theorem 46, all of the three verification
techniques can report the nonequivalence in the same iteration, say in the nth iteration. To
generate a counterexample, on the other hand, both STPM and SPPM have time complexity
O(n) while SPMM has O(n²). This difference results from the fact that, in SPMM, the
input information of the previous iterations is thrown away when equivalence nodes are
re-expressed using newly introduced variables.
To summarize the results, the major limitation of SPMM is the number of equivalence
classes encountered during verification. In contrast, STPM and SPPM do not suffer the
same limitation because equivalence classes are not explicitly represented in the BDDs. For
a circuit with a not-so-deep depth of refinement and a “reasonable” number (up to about 10⁶) of
equivalence classes per output, SPMM has a good chance of verifying it. On the other hand,
due to the fact that the number of equivalence classes in the reachable state subspace is
invariant under different implementations, SPMM tends to be the most robust verification
technique.
5.7 Related Work
5.7.1 Computation of State Equivalence
Computing state equivalence is a key ingredient of FSM state minimization. Before
the implicit symbolic approach was proposed in [LTN90, Pix92], the explicit enumerative
approach [Koh78, HU79] had been the traditional way of doing it. The computation pro-
posed by Lin et al. [LTN90] builds a product machine of the considered FSM with itself
and reasons about the state equivalence using a relation over pairs of states. In contrast,
we demonstrated another symbolic computation which deals with equivalence classes rather
than equivalence relations. In essence, the prior approach represents the equivalence rela-
tions with BDD paths; our approach represents equivalence classes with BDD nodes. As its
strength, the prior approach imposes no particular limitation on the number of equivalence
classes to be handled. However, as its weakness, it is often unable to handle FSMs with
many state variables; the performance and capability of the approach are unpredictable es-
pecially for medium-sized FSMs. In comparison, our approach is more robust, but limited
to cases where the number of equivalence classes cannot exceed a few million.
5.7.2 Verification of FSM Equivalence
As mentioned earlier, our verification technique aimed for general sequential equivalence
checking. Structural similarities between two FSMs to be verified were not explored. The
forward [BCM90] and backward [Fil91] state traversals are the closest structure-independent
equivalence checking techniques to ours, especially the latter.
Also, there have been extensive studies on structure-dependent equivalence checking,
e.g., just to name a few [vE00, QCC+00, SWWK04]. In [vE00], signal correspondences
were identified and merged to simplify equivalence checking. In [QCC+00], two transition
systems under comparison need to be similar up to a one-to-one mapping between equivalent
states. Such a mapping is discovered by a reachability analysis that converges on their
combinational similarity. The structural traversal method in [SWWK04] is an over-approximative
reachability analysis based on circuit manipulations.
5.8 Summary
This chapter consists of two parts: the identification of equivalent states and the ver-
ification of sequential equivalence. We show that the former can be done efficiently by
BDD-based functional decomposition. By introducing the multiplexed machine, we can
verify sequential equivalence by means of state partitioning in the sum space, a new way
of doing formal equivalence checking. In high-speed designs, a large portion of the registers
serve timing speedup rather than increase the number of equivalence classes of states.
In such cases, state-space partitioning becomes preferable to state-space traversal.
A major advantage of the new verification technique is the substantial reduction in the
number of state variables. Compared to product-machine-based techniques, our approach
almost halves the number of state variables. Although there is an intrinsic restriction on
BDD variable ordering, several techniques are proposed to overcome it and to minimize the
BDD sizes. These make our algorithm even more promising.
Table 5.1. Profiles of Benchmark Circuits
Circuit         In    Out   Reg   Reach (Depth)
s1196           14    14    18    2616 (2)
s298            3     6     14    218 (18)
s349            9     11    15    2625 (6)
s400/s444       3     6     21    8865 (150)
s420.1          18    1     16    65536 (65535)
s499            1     22    22    22 (21)
s526/s526n      3     6     21    8868 (150)
s641            35    24    19    1544 (6)
s713            35    23    19    1544 (6)
s953            16    23    29    504 (10)
s967            16    23    29    549 (10)
s991            65    17    19    524288 (3)
bigkey          262   197   224   1.17e+67 (2)
clma            382   82    33    158908 (411)
mm4a            7     4     12    832 (3)
mm9a            12    9     27    2.25e+7 (3)
mm9b            12    9     26    2.25e+7 (3)
mult16a         17    1     16    65535 (16)
sbc             40    56    28    154593 (9)
control         33    21    35    119 (6)
IFetchControl2  27    38    59    2.50e+8 (27)
IFetchControl3  27    38    61    1.00e+9 (27)
parsepack       9     65    70    3.70e+19 (9)
parsesys        9     65    312   2.21e+48 (103)
8085            18    27    193   N/A
bpb             9     4     36    6.87e+10 (32)
cbp 16 4        17    17    16    131072 (1)
cbp 32 4        33    33    32    4.29e+9 (1)
key             258   193   228   N/A
minmax5         8     5     15    12032 (3)
minmax10        13    10    30    1.79e+8 (3)
tbk-retime      6     3     49    2048 (3)
Table 5.2. Characteristics of Equivalence Classes of Benchmark Circuits
             Overall Partition              Worst Partial Partition
Circuit      whole (rfn)     reach (rfn)    whole (rfn)     reach (rfn)
s1196        82944 (2)       1509 (2)       96 (3)          56 (3)
s298         8061 (16)       135 (12)       249 (24)        118 (20)
s349         18608 (5)       1801 (5)       248 (8)         35 (6)
s400         608448 (93)     8865 (93)      17174 (183)     8597 (183)
s420.1       65536 (32768)
s444         608448 (93)     8865 (93)      17174 (183)     8597 (183)
s499         4.19e+6 (1)     22 (1)         24 (21)         22 (21)
s526         1.43e+6 (119)   8868 (93)      43068 (199)     8597 (183)
s526n        1.43e+6 (119)   8868 (93)      43068 (199)     8597 (183)
s641         294912 (1)      1480 (1)       24750 (8)       1248 (8)
s713         294912 (1)      1480 (1)       24750 (8)       1248 (8)
s953         N/A             504 (2)        42 (10)         35 (10)
s967         N/A             549 (2)        42 (10)         35 (10)
s991         327680 (1)      10 (2)
bigkey       N/A             4 (2)
clma         N/A             N/A            5950 (178)
mm4a         3616 (1)        712 (1)        452 (2)         217 (1)
mm9a         N/A             522244 (2)     260617 (1)
mm9b         N/A             N/A            260617 (1)
mult16a      65536 (16)      65535 (16)     65536 (16)      65535 (16)
sbc          N/A             N/A            23048 (10)
control      N/A             43 (2)         14 (6)          8 (5)
IF’hC’l2     N/A             N/A            9434 (37)
IF’hC’l3     N/A             N/A            8442 (39)
parsepack    N/A             18 (9)         10 (9)
parsesys     N/A             164 (21)       N/A
8085∗        N/A             309619 (28)    N/A
bpb          N/A             512 (3)
cbp 16 4     65536 (1)
cbp 32 4     4.29e+9 (1)
key          N/A             64 (7)         N/A
minmax5      30784 (1)       5520 (1)       1924 (2)        965 (2)
minmax10     1.07e+9 (1)     N/A            2.09e+6 (2)     1.04e+6 (1)
tbk-retime   16 (1)          16 (3)
Table 5.3. Sequential Equivalence Checking between Identical Circuits
            STPM               SPPM               SPMM
            mem      time      mem      time      mem      time
Circuit     (Mb)     (sec)     (Mb)     (sec)     (Mb)     (sec)
s1196       28.3     2.3       25.1     1.5       12.4     2.1
s298        7.8      0.2       16.4     1.0       8.7      0.9
s349        12.7     1.5       25.4     1.3       10.8     1.9
s400        12.8     4.9       43.1     4.8       56.6     448.8
s420.1      45.1     669.2     37.9     290.9     62.0     2.98e+5
s444        12.7     4.8       42.2     4.5       55.8     438.9
s499        299      157.1     16.5     1.0       8.6      0.2
s526        22.5     7.1       117.0    293.8     50.4     358.2
s526n       16.6     4.4       82.7     150.9     50.4     357.8
s641        11.9     0.7       27.4     0.6       39.5     3.3
s713        11.8     0.7       27.6     0.6       39.2     6.4
s953        11.3     0.1       27.9     0.8       11.9     1.1
s967        11.4     0.9       27.5     0.8       10.3     0.5
s991        35.4     26.4      64.9     11.6      10.7     0.3
bigkey      >2G      N/A       >2G      N/A       21.4     1.3
clma        142      134.6     >2G      N/A       117      4.30e+4
mm4a        8.6      0.3       7.7      0.1       15.3     0.9
mm9a        82.1     1.24e+5   58.9     16.6      244      4673.7
mm9b        >2G      N/A       >2G      N/A       693      3.12e+4
mult16a     8.5      0.2       8.4      0.1       87.8     126.1
sbc         >2G      N/A       >2G      N/A       537      1.29e+5
control     191      79.4      46.1     7.9       23.3     1.1
IF’hC’l2    >2G      N/A       N/A      >1.0e+6   258      1.37e+4
IF’hC’l3    >2G      N/A       N/A      >1.0e+6   259      1.38e+4
parsepack   >2G      N/A       64.9     110.9     19.0     1.2
parsesys    >2G      N/A       458      2.91e+4   102      45.9
8085∗       >2G      N/A       >2G      N/A       793      3.06e+5
bpb         >2G      N/A       51.7     62.9      46.1     17.2
cbp 16 4    18.0     0.3       18.0     0.3       75.2     70.2
cbp 32 4    25.0     0.8       24.7     0.7       >2G      N/A
key         >2G      N/A       >2G      N/A       68.5     15.4
minmax5     27.3     0.8       28.1     0.6       26.0     12.2
minmax10    151      1694.9    47.2     2.3       733      8.75e+4
tbk-retime  >2G      N/A       >2G      N/A       84.2     112.3
Table 5.4. Sequential Equivalence Checking between Different Implementations of Same Design

                           STPM             SPPM             SPMM
                           mem     time     mem     time     mem     time
Circuit              Reg   (Mb)    (sec)    (Mb)    (sec)    (Mb)    (sec)
s208.1/s208.1-retime 8/16  12.4    0.3      11.8    0.5      8.9     2.3
s298/s298-retime     14/34 12.7    0.3      21.8    1.7      9.6     0.7
s386/s386-retime     6/15  12.6    0.2      13.0    0.3      7.3     0.1
s499/s499-retime     22/41 437     196.9    690     401.3    10.7    1.8
s510/s510-retime     6/34  13.6    0.4      19.5    1.8      12.3    0.4
s526/s526-retime     21/58 48.3    24.3     237     2012.2   55.4    552.5
s526n/s526n-retime   21/64 48.4    41.5     204     5238.7   53.2    325.9
s526-retime/s526n-retime 58/64 >2G N/A      982     1.26e+5  54.5    469.3
s641/s641-retime     19/18 37.5    1.9      41.1    1.9      29.3    9.7
s991/s991-retime     19/42 345     2431.9   139     760.8    74.3    134.6
mult16a/mult16a-retime 16/106 >2G  N/A      >2G     N/A      N/A     >1.0e+6
tbk/tbk-retime       5/49  56.1    10.3     70.1    79.2     46.2    6.6
Table 5.5. Overall Statistics
Method   Wins in Memory   Wins in Time   Failed
STPM     11               12             13
SPPM     7                15             10
SPMM     28               21             2
Chapter 6
Verification Reduction
The existence of functional dependency among the state variables of a state transition
system has been identified as a common cause of inefficient BDD representation in formal
verification. Eliminating such dependency from the system compacts the state space and
may significantly reduce the verification cost. Despite this importance, how to detect func-
tional dependency without or before knowing the reachable state set remains a challenge.
This chapter tackles this problem by unifying two closely related but scattered studies —
detecting signal correspondence and exploiting functional dependency. Prior work on ei-
ther subject is a special case of our formulation. Unlike previous approaches, we detect
dependency directly from transition functions rather than from reached state sets. Thus,
reachability analysis is not a necessity for exploiting dependency. In addition, our proce-
dure can be integrated into reachability analysis as an on-the-fly reduction. Preliminary
experiments demonstrate promising results of extracting functional dependency without
reachability analysis; dependencies that were underivable before, due to the limitation of
reachability analysis on large transition systems, can now be computed efficiently. When
applied to verification, reachability analysis shows substantial reductions in both
memory and runtime.
6.1 Introduction
Reduction [Kur94] is an important technique in extending the capacity of formal veri-
fication. This chapter is concerned with property-preserving reduction [CGP99], where the
reduced model satisfies a property if and only if the original model does. In particular, we
focus on reachability-preserving reduction for safety property verification using functional
dependency.
The existence of dependency among state variables frequently occurs in state tran-
sition systems in both high-level specifications and gate-level implementations [STB96].
Such dependency may cause inefficient BDD [Bry86] representation in formal verification
[HD93] and can be used also in logic minimization [LN91a, STB96]. Thus, its detection
has attracted extensive research in both domains (e.g., see [BCM90, LN91a, HD93, vEJ96,
STB96, YSBO99]). The essence of all prior efforts [BCM90, LN91a, vEJ96, STB96] can
be traced back to functional deduction [Bro03], where variable dependency was derived
from the characteristic function of a reached state set. However, state transition systems of
practical applications are often too complex to compute their reachable states, even though
these systems might be substantially reduced after variable dependency is known. An im-
provement was proposed in [vEJ96] to exploit the dependency from the currently reached
state set during every iteration of a reachability analysis. However, the computation may
still be too expensive and may simplify subsequent iterations very little.
To avoid such difficulty, we take a different path to exploit dependency. The observa-
tion is that dependency among state variables may originate from the dependency among
transition functions1. Some variable dependency can be concluded more efficiently using
transition functions rather than the characteristic function of a reached state set. Therefore,
the computation requires only local image computation. Because the derived dependency is
an invariant, it can be used by any BDD- or SAT-based model checking procedure to reduce
verification complexity. Since not all dependency can be discovered this way, due to the
imperfect information about state reachability, this method is approximative. To
complete the approximative computation, our procedure can be embedded into reachability
analysis as an on-the-fly detection. Reachability analysis is thus conducted on a reduced
model in each iteration. Our formulation leads to a unification of two closely related, but
scattered, studies on detecting signal correspondence [Fil92, vE00] and exploiting functional
dependency [HD93, vEJ96].
The chapter is organized as follows. After preliminaries and notation are given in
Section 6.2, our formulation of functional dependency and the corresponding calculations
are introduced in Section 6.3. Section 6.4 applies the developed algorithms to reachability
analysis as an on-the-fly reduction. Experimental results are provided in Section 6.5 to
demonstrate practical advantages. In Section 6.6, a closer comparison with prior work is
detailed. Section 6.7 concludes and outlines some future research directions.
6.2 Preliminaries and Notation
As a notational convention, the unordered version of a vector (an ordered set) ~v =
〈v1, . . . , vn〉 is written as {~v} = {v1, . . . , vn}. In this case, n is the cardinality (size) of both
~v and {~v}, i.e., |~v| = |{~v}| = n. Also, when a vector ~v is partitioned into k sub-vectors
~v1, . . . , ~vk, the convention 〈~v1; . . . ;~vk〉 denotes that ~v1, . . . , ~vk are combined into one vector
with a proper reordering of elements to recover the ordering of ~v.

Footnote 1: From experience, it is commonly recognized that, to represent state transition systems, transition functions are preferable to transition relations; complex transition systems are often compactly representable with transition functions but not with transition relations. This chapter assumes that transition functions are the underlying representation of state transition systems. Consequently, our formulation is not directly applicable to nondeterministic transition systems. The corresponding extension can apply the MOCB technique proposed in [HD93].
This chapter assumes, without loss of generality, that multi-valued functions are replaced
with vectors of Boolean functions. The image of a Boolean functional vector ~ψ over a subset
C of its domain is denoted as Image(~ψ,C); the range of ~ψ is denoted as Range(~ψ). Let
ψ : Bn → B be a Boolean function over variables x1, . . . , xn. The support set of ψ is
Supp(ψ) = {xi | (ψ|xi=0 ⊕ ψ|xi=1) ≠ false}. For a characteristic function F(~x) over the set ~x of Boolean variables, its projection on ~y ⊆ ~x is defined as F[~y/~x] = ∃xi ∈ ~x\~y. F(~x). Also, we denote the identity function and its complement as = and =†, respectively.
A state transition system is modelled as an FSM M = (Q, I, Σ, Ω, ~δ, ~ω). Since symbols and functions are in binary representations in this chapter, M will be specified, instead, with a five-tuple (I, ~r, ~s, ~δ, ~ω), where ~r (resp. ~s) is the vector of Boolean variables that encodes the input alphabet (resp. the states).
6.3 Functional Dependency
Dependency for a state transition system can be formulated in two steps. We first
define combinational dependency among a collection of functions. The formulation is
then extended to sequential dependency for a state transition system.
6.3.1 Combinational Dependency
Given two Boolean functional vectors ~φ : Bl → Bm and ~ϕ : Bl → Bn over the same domain, we are interested in rewriting ~φ in terms of a function of ~ϕ. The condition under which such
a rewrite is feasible can be captured by a refinement relation ⊑ ⊆ (Bl → Bm) × (Bl → Bn), defined as follows.
Definition 8 Given two Boolean functional vectors ~φ : Bl → Bm and ~ϕ : Bl → Bn, ~ϕ refines ~φ in C ⊆ Bl, denoted as ~φ ⊑C ~ϕ, if ~φ(a) ≠ ~φ(b) implies ~ϕ(a) ≠ ~ϕ(b) for all a, b ∈ C.
In other words, ~ϕ refines ~φ in C if and only if ~ϕ is more distinguishing than ~φ in C. (As the orderings within ~φ and ~ϕ are not a prerequisite, our definition of the refinement relation applies to two unordered sets of functions as well.) In the sequel, the subscript C will be omitted from the refinement relation ⊑ when C is the universe of the domain. Based on the
above definition, the following proposition forms the foundation of our later development.
Proposition 53 Given ~φ : Bl → Bm and ~ϕ : Bl → Bn, there exists a functional vector ~θ : Bn → Bm such that ~φ = ~θ ◦ ~ϕ = ~θ(~ϕ(·)) over C ⊆ Bl if and only if ~φ ⊑C ~ϕ. Moreover, ~θ is unique when restricting its domain to the range of ~ϕ.
For ~φ = ~θ ◦ ~ϕ, we call φ1, . . . , φm ∈ ~φ the functional dependents (or, briefly, dependents), ϕ1, . . . , ϕn ∈ ~ϕ the functional independents (or, briefly, independents), and θ1, . . . , θm ∈ ~θ the dependency functions.
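On explicit truth tables, the refinement test of Definition 8 and the construction of ~θ in Proposition 53 admit a compact illustration: ~ϕ refines ~φ exactly when the value of ~ϕ(a) determines the value of ~φ(a), so building a value map from ~ϕ-values to ~φ-values either succeeds (yielding ~θ on Range(~ϕ)) or exposes a conflicting pair. The Python sketch below assumes this explicit representation and is illustrative only, not the BDD-based computation used in this chapter; all names are hypothetical.

```python
def refines(phi, varphi, C):
    """Test phi ⊑_C varphi (Definition 8) on explicit truth tables.

    phi, varphi: lists of truth tables; a truth table is a tuple of 0/1
    output bits indexed by the integer index of the input assignment.
    C: iterable of input indices (the care set).
    Returns (True, theta) where theta maps each value of ~varphi to the
    unique corresponding value of ~phi (Proposition 53), or (False, None).
    """
    theta = {}
    for a in C:
        key = tuple(f[a] for f in varphi)   # ~varphi(a)
        val = tuple(f[a] for f in phi)      # ~phi(a)
        if theta.setdefault(key, val) != val:
            # varphi agrees with an earlier point where phi differs:
            # varphi does not refine phi on C.
            return False, None
    return True, theta

# Two coordinate functions and their xor over B^2 (input index = 2*x1 + x2):
x1, x2, xor_ = (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 0)
ok, theta = refines([xor_], [x1, x2], range(4))
assert ok and theta == {(0, 0): (0,), (0, 1): (1,), (1, 0): (1,), (1, 1): (0,)}
assert refines([xor_], [x1], range(4)) == (False, None)  # x1 alone is too coarse
```

The returned `theta` is exactly the unique dependency function on Range(~ϕ); outside that range it is unconstrained, which corresponds to the don't-care set of Theorem 57 below.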
Problem formulation.
The problem of detecting (combinational) functional dependency can be formulated as
follows. Given a collection of Boolean functions ~ψ, we are asked to partition ~ψ into two parts ~φ and ~ϕ such that ~φ = ~θ(~ϕ). Hence, the triple (~φ, ~ϕ, ~θ) characterizes the functional dependency of ~ψ; we call such a triple a dependency triplet. Suppose ~ϕ cannot be reduced further in (~φ, ~ϕ, ~θ), i.e., no more functional dependents can be recognized from ~ϕ under any modification of ~θ. That is, |~ϕ| is minimized; equivalently, |~φ| is maximized. Then
the triplet maximally characterizes the functional dependency of ~ψ. In this chapter, we are
interested in computing maximal functional dependency. (Although finding a maximum
dependency might be helpful, it is computationally much harder than finding a maximal
one as it is the supremum over the set of maximal ones.)
The computation.
In the discussion below, when we mention Boolean functional vectors ~φ(~x) and ~ϕ(~x),
we shall assume that ~φ : Bl → Bm and ~ϕ : Bl → Bn with variable vector ~x : Bl. Notice that
Supp(~φ) and Supp(~ϕ) are subsets of ~x. The following properties are useful in computing
combinational dependency.
Theorem 54 Given functional vectors ~φ and ~ϕ, ~φ ⊑ ~ϕ only if Supp(~φ) ⊆ Supp(~ϕ).
Corollary 55 Given a collection of Boolean functions ψ1(~x), . . . , ψk(~x), if, for some xi ∈ ~x, ψj is the only function such that xi ∈ Supp(ψj), then ψj is a functional independent.
With the support set information, Theorem 54 and Corollary 55 can be used as a fast
screening in finding combinational dependency.
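This screening can be sketched on the same explicit truth-table representation. The helper and example functions below are illustrative assumptions, not the chapter's implementation: `supp` computes Supp(ψ) by comparing cofactors (the Boolean difference of Section 6.2), and the `forced` set applies Corollary 55.

```python
def supp(f, n):
    """Support set of truth table f over n variables; variable i is the
    bit of weight 2^(n-1-i) in the input index."""
    s = set()
    for i in range(n):
        bit = 1 << (n - 1 - i)
        # xi is in the support iff the two cofactors f|xi=0 and f|xi=1 differ
        if any(f[a] != f[a | bit] for a in range(2 ** n) if not a & bit):
            s.add(i)
    return s

# psi1 = x0, psi2 = x0 & x1, psi3 = x2 over B^3 (index = 4*x0 + 2*x1 + x2):
psi1 = tuple((a >> 2) & 1 for a in range(8))
psi2 = tuple(((a >> 2) & 1) & ((a >> 1) & 1) for a in range(8))
psi3 = tuple(a & 1 for a in range(8))
supports = [supp(f, 3) for f in (psi1, psi2, psi3)]
assert supports == [{0}, {0, 1}, {2}]

# Corollary 55: a function whose support contains a variable occurring in
# no other support must be a functional independent.
forced = {j for j, s in enumerate(supports)
          if any(all(x not in t for k, t in enumerate(supports) if k != j)
                 for x in s)}
assert forced == {1, 2}   # x1 pins psi2; x2 pins psi3
```

Only psi1 remains a candidate dependent, so the expensive refinement tests need be run for it alone.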
Theorem 56 Given functional vectors ~φ and ~ϕ, ~φ ⊑ ~ϕ if and only if |Range(~ϕ)| = |Range(〈~φ, ~ϕ〉)|.
Theorem 57 Let θi ∈ ~θ be the corresponding dependency function of a dependent φi ∈ ~φ. Let Θ0_i = {~ϕ(~x) | φi(~x) = 0} and Θ1_i = {~ϕ(~x) | φi(~x) = 1}. Then φi ⊑ ~ϕ if and only if Θ0_i ∩ Θ1_i = ∅. Also, θi has Θ0_i, Θ1_i, and Bn \ (Θ0_i ∪ Θ1_i) as its off-set, on-set, and don't-care set, respectively. That is, θi(~ϕ(~x)) = φi(~x) for all valuations of ~x.
From Theorem 56, we know that the set ~ϕ of functional independents is as distinguishing
as the entire set ~φ ∪ ~ϕ of functions. Theorem 57, on the other hand, shows a way of
computing dependency functions.
CombinationalDependency
input:  a collection ~ψ of Boolean functions
output: a dependency triplet (~φ, ~ϕ, ~θ)
begin
01  for each ψi ∈ ~ψ
02      derive the minimal refining sets E_i^1, . . . , E_i^k
03  select a minimal basis ~ϕ that refines all ψi ∈ ~ψ
04  compute the dependency functions ~θ for ~φ = ~ψ \ ~ϕ
05  return (~φ, ~ϕ, ~θ)
end

Figure 6.1. Algorithm: CombinationalDependency.
Given a collection ~ψ of Boolean functions, its maximal dependency can be computed with the procedure outlined in Figure 6.1. First, by Theorem 56, for each function ψi ∈ ~ψ we obtain the minimal subsets of ~ψ which refine ψi. Let the minimal refining subsets for ψi be E_i^1, . . . , E_i^k. (Notice that k ≥ 1 since ψi refines itself and, thus, {ψi} is one of the subsets.) The calculation can be done with local image computation because, by Theorem 54 and Corollary 55, we only need to consider subsets of functions in ~ψ whose support sets overlap with that of ψi. Second, we heuristically derive a minimal set of functional independents that refines all the functions of ~ψ. Equivalently, for each ψi, some E_i^{j_i} is selected such that the cardinality of the union ∪_{i=1}^{|~ψ|} E_i^{j_i} is minimized. This union forms the basis for representing all the other functions. That is, the functions in the union are the functional independents; the others are the functional dependents. Finally, by Theorem 57, the dependency functions are obtained with respect to the selected basis.
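A simplified, greedy rendering of this procedure on truth tables may clarify the flow. Instead of enumerating all minimal refining sets and solving the basis-selection problem of Figure 6.1, the sketch below tests each function against the remaining independents in turn, so the result is a maximal (not necessarily minimum-basis) dependency triplet. The representation and names are illustrative assumptions, not the chapter's BDD implementation.

```python
def combinational_dependency(psis):
    """Greedy maximal combinational dependency over explicit truth tables.

    psis: dict name -> truth table (tuple of 0/1 over a shared input domain).
    Returns (dependents, independents, thetas); thetas[name] is a pair
    (basis names, value map from basis values to the dependent's value).
    For brevity, a theta is not re-expressed if its basis later shrinks;
    the chapter's procedures handle that by substitution.
    """
    indep = dict(psis)
    dep, thetas = {}, {}
    domain = range(len(next(iter(psis.values()))))
    for name in list(psis):
        basis = {k: v for k, v in indep.items() if k != name}
        if not basis:
            continue                        # keep at least one independent
        table, ok = {}, True
        for a in domain:
            key = tuple(f[a] for f in basis.values())
            val = psis[name][a]
            if table.setdefault(key, val) != val:
                ok = False                  # Theorem 57: on-set/off-set clash
                break
        if ok:
            dep[name] = indep.pop(name)
            thetas[name] = (list(basis), table)
    return dep, indep, thetas

# p1 = p2 xor p3, so p1 can be rewritten over the basis {p2, p3}:
dep, indep, th = combinational_dependency(
    {"p1": (0, 1, 1, 0), "p2": (0, 0, 1, 1), "p3": (0, 1, 0, 1)})
assert set(dep) == {"p1"} and set(indep) == {"p2", "p3"}
```

Different processing orders can declare different functions dependent, mirroring the non-uniqueness of maximal dependency noted above.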
A digression.
There are other variant definitions of dependency (see [Mar60] for more examples). The functional dependency defined in [Bro03] (Section 6.9), which follows [Mar60], is too weak for our application. We thus resort to a stronger definition. As noted below,
our definition turns out to be consistent with functional deduction (see [Bro03], Chapter 8),
which is concerned with the variable dependency in a single characteristic function.
We relate our formulation to functional deduction as follows. In functional deduction,
variable dependency is drawn from a single characteristic function. Thus, to exploit the
dependency among a collection of functions ~ψ(~x), a single relation Ψ(~x, ~y) = ∧_i (yi ≡ ψi(~x)) should be built, where the yi's are newly introduced Boolean variables. In addition, to derive dependency solely among ~y, the input variables ~x should be enforced in the eliminable subset [Bro03]. With the foregoing transformation, variable dependency in functional deduction coincides with our defined functional dependency. A result similar to Theorem 57 was known in the context of functional deduction. Compared to the relation-oriented functional deduction, our formulation can be understood as more function-oriented, which is computationally more practical.
6.3.2 Sequential Dependency
Given a state transition system M = (I, ~r, ~s, ~δ, ~ω), we consider the detection of functional dependency among the set ~δ of transition functions. More precisely, detecting the sequential dependency of M is equivalent to finding ~θ such that ~δ is partitioned into two vectors: the dependents ~δφ and the independents ~δϕ. Let ~s = ~sφ ∪ ~sϕ be such that the valuations of ~sφ and ~sϕ are updated by ~δφ and ~δϕ, respectively. Then ~θ specifies the dependency of M by ~sφ = ~θ(~sϕ) and ~δφ = ~θ(~δϕ), i.e., ~δφ(~r, 〈~θ(~sϕ); ~sϕ〉) = ~θ ◦ ~δϕ(~r, 〈~θ(~sϕ); ~sϕ〉).
Sequential dependency is more relaxed than its combinational counterpart because of
the reachability nature of M. The derivation of ~θ involves a fixed-point computation, which can be carried out in two different ways, the greatest fixed-point (gfp) and the least fixed-point (lfp) approaches, with different optimality and complexity. Our discussion starts from the easier gfp computation, and continues with the more complicated lfp one. The
optimality, on the other hand, is usually improved when changing from the gfp to the lfp
computation.
Remark 3 We mention a technicality regarding the set I of initial states. In general,
the combinational dependency among transition functions may not hold for the states in
I because I may contain predecessor-free states. (A state is called predecessor-free if
it has no predecessor states.) To overcome this difficulty, a new set I ′ of initial states is
defined. Let I ′ be the set of states which are one-step reachable from I. Now, since any
state in I ′ has at least one predecessor state, the calculated dependency holds for I ′. On
the other hand, the set of reachable states from I is identical to that from I ′ except for
some states in I. In the verification of safety properties, such a substitution is legitimate as
long as states in I satisfy the underlying property to be verified. In our discussion, unless
otherwise noted, we shall assume that the set of initial states consists of only states with
predecessors.
The greatest fixed-point calculation.
Figure 6.2 illustrates the gfp calculation of sequential dependency up to three iterations. In the computation, state variables are initially treated as functionally independent of each other. Their dependency is then discovered iteratively. Combinational dependency among the transition functions is computed in each iteration. The resultant dependency functions are substituted backward in the subsequent iteration for the state variables of their corresponding functional dependents. Thereby, the transition functions and previously derived
dependency functions are updated. More precisely, let ~θ(i) be the set of derived dependency functions for ~δ(i) at the ith iteration. For j from i−1 down to 1, the set ~θ(j)(~s(i−1)_ϕ) of dependency functions is updated in order with ~θ(j)(~s(i)_ϕ) = ~θ(j)(〈~θ(j+1)(~s(i)_ϕ); . . . ; ~θ(i)(~s(i)_ϕ); ~s(i)_ϕ〉). After the updates of the ~θ(j)'s, ~δ(i+1) is set to be ~δ(i)_ϕ(~r, 〈~θ(1)(~s(i)_ϕ); . . . ; ~θ(i)(~s(i)_ϕ); ~s(i)_ϕ〉), where
Figure 6.2. The greatest fixed-point calculation of sequential dependency. The first three iterations are illustrated in (i), (ii), and (iii). In each iteration, transition functions (and thus next-state variables) are partitioned into dependent and independent parts by the computation of combinational dependency. The derived dependency is used to reduce the state space in the subsequent iteration.
~δ(i)_ϕ ⊆ ~δ corresponds to the functional independents of ~δ(i). At the (i+1)st iteration, the combinational dependency among ~δ(i+1) is computed. The iteration terminates when the size of the set of functional independents cannot be reduced further. Termination is guaranteed since |~δ(i)| decreases monotonically. At the end of the computation, the final ~θ is simply the collection of the ~θ(i)'s, and the final set of functional independents is ~δ(k)_ϕ, where k is the last iteration. The computation is summarized in Figure 6.3, where the procedure CombinationalDependencyRestore is similar to CombinationalDependency with a slight difference. It computes the dependency among the set of functions given in the first argument in the same way as CombinationalDependency. However, the returned functional
SequentialDependencyGfp
input:  a state transition system M = (I, ~r, ~s, ~δ, ~ω)
output: a dependency triplet (~δφ, ~δϕ, ~θ) for ~δ
begin
01  i := 0; ~δ(1) := ~δ
02  repeat
03      if i ≥ 2
04          for j from i−1 down to 1
05              ~θ(j)(~s(i)_ϕ) := ~θ(j)(〈~θ(j+1)(~s(i)_ϕ); . . . ; ~θ(i)(~s(i)_ϕ); ~s(i)_ϕ〉)
06      if i ≥ 1
07          ~δ(i+1)(~r, ~s(i)_ϕ) := ~δ(i)_ϕ(~r, 〈~θ(1)(~s(i)_ϕ); . . . ; ~θ(i)(~s(i)_ϕ); ~s(i)_ϕ〉)
08      i := i + 1
09      (~δ(i)_φ, ~δ(i)_ϕ, ~θ(i)) := CombinationalDependencyRestore(~δ(i), ~δ)
10  until |~δ(i)| = |~δ(i)_ϕ|
11  return (〈~δ(1)_φ; . . . ; ~δ(i−1)_φ〉, ~δ(i−1)_ϕ, 〈~θ(1); . . . ; ~θ(i−1)〉)
end

Figure 6.3. Algorithm: SequentialDependencyGfp.
dependents and independents are the corresponding functions given in the second argument
instead of those in the first argument.
Notice that the final result of the gfp calculation may not be unique since, in each iteration, there are several possible choices for a maximal functional dependency. Once a choice has been made, it fixes the dependency functions for the state variables that are declared as dependents. Thereafter, the dependency becomes an invariant throughout the computation since the derivation is valid for the entire set of states with predecessors. For the same reason, the gfp calculation may be too conservative. Moreover, the optimality of the gfp calculation is limited because the state variables are initially treated as functionally independent of each other. This limitation becomes apparent especially when the dependency to be discovered is between two state transition systems (e.g., in equivalence checking). To discover more dependency, we need to adopt a least fixed-point strategy and refine the dependency iteratively.
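The shape of one gfp iteration can be illustrated on a toy two-variable system. This explicit-enumeration Python sketch omits the backward substitution of earlier ~θ(j)'s and the BDD machinery of Figure 6.3; all names are hypothetical illustrations, not the chapter's implementation.

```python
from itertools import product

# Toy FSM: input r, state bits s1, s2, where delta_s2 is always the
# complement of delta_s1 -- so one state variable is sequentially redundant.
delta = {
    "s1": lambda r, s: r ^ s["s1"],
    "s2": lambda r, s: 1 - (r ^ s["s1"]),
}

def combo_dependency(delta, state_vars, constraint):
    """One iteration of the gfp loop: combinational dependency among the
    transition functions, over state valuations satisfying `constraint`."""
    indep, thetas = dict(delta), {}
    for v in list(delta):
        basis = [u for u in indep if u != v]
        if not basis:
            continue                      # keep at least one independent
        table, ok = {}, True
        for bits in product((0, 1), repeat=1 + len(state_vars)):
            r, s = bits[0], dict(zip(state_vars, bits[1:]))
            if not constraint(s):
                continue                  # outside the current subspace
            key = tuple(indep[u](r, s) for u in basis)
            val = delta[v](r, s)
            if table.setdefault(key, val) != val:
                ok = False
                break
        if ok:                            # v's next state is a function of
            thetas[v] = (basis, table)    # the other next states
            del indep[v]
    return indep, thetas

# Iteration 1: no constraint yet; s1 is found dependent with theta = NOT.
indep, thetas = combo_dependency(delta, ["s1", "s2"], lambda s: True)
assert set(indep) == {"s2"}
assert thetas["s1"] == (["s2"], {(1,): 0, (0,): 1})
# A second iteration, restricted to the subspace s1 = NOT(s2), finds no
# further reduction, so the gfp is reached with one independent variable.
```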
Figure 6.4. The least fixed-point calculation of sequential dependency. The first three iterations are illustrated in (i), (ii), and (iii). In each iteration, transition functions (and thus next-state variables) are partitioned into dependent and independent parts by the computation of combinational dependency. The derived dependency is used to reduce the state space in the subsequent iteration.
The least fixed-point calculation.
Figure 6.4 illustrates the lfp calculation of sequential dependency up to three iterations. In the computation, unlike the gfp one, the initial dependency among state variables is exploited maximally based on the set of initial states. The dependency is then strengthened iteratively until a fixed point has been reached. The set of functional independents tends to increase during the iterations, in contrast to the decrease in the gfp calculation.
Consider the computation of initial dependency. For the simplest case, when |I| = 1,
any state variable sϕ can be selected as the basis. Any other variable is replaced with
either =(sϕ) or =†(sϕ), depending on whether its initial value equals that of sϕ or not. For
arbitrary I, the initial variable dependency can be derived using functional deduction on
the characteristic function of I. (As noted in Remark 3, excluding predecessor-free states
from I reveals more dependency.)
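The |I| = 1 case can be sketched directly: one variable serves as the basis, and every other variable is assigned the identity or complemented identity according to the initial values. The function below is an illustrative assumption (it tags = and =† as 'id' and 'not'), not the chapter's implementation.

```python
def initial_dependency(init_state):
    """Initial dependency for a single initial state (|I| = 1).

    init_state: dict var -> 0/1. The first variable is chosen as the basis;
    every other variable v is =(basis) ('id') if its initial value equals
    the basis value, and =†(basis) ('not') otherwise.
    """
    items = list(init_state.items())
    basis_var, basis_val = items[0]
    theta = {v: ("id" if val == basis_val else "not") for v, val in items[1:]}
    return basis_var, theta

basis, theta = initial_dependency({"s1": 0, "s2": 0, "s3": 1})
assert basis == "s1"
assert theta == {"s2": "id", "s3": "not"}
```

For arbitrary I, as the text notes, this role is played by functional deduction on the characteristic function of I.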
For the iterative computation, transition functions are updated in every iteration by
eliminating dependent state variables with the latest dependency functions. Combinational
dependency is then obtained for the new set of transition functions. Unlike the gfp iterations, the functional dependency obtained in the ith iteration may not be an invariant for the following iterations because the derived dependency may be valid only in the state subspace spanned by ~s(i−1)_ϕ. As the state subspace changes over the iterations due to different selections of independent state variables, the dependency may need to be rectified. Notice that the set of functional independents may not increase monotonically during the iterations. This non-convergent phenomenon is due to the existence of don't-care choices of ~θ(i), in addition to the imperfect information about the currently reachable state set. Therefore, additional requirements need to be imposed to guarantee termination. Here we require that, after a certain number of iterations, the set of independent state variables increases monotonically until ~θ(i) can be reused in the next iteration, that is, until the fixed point is reached. The algorithm is outlined in Figure 6.5. To simplify the presentation, it contains only the iterations where ~s(i)_ϕ increases monotonically. Procedure CombinationalDependencyReuse is the same as CombinationalDependency except that it tries to maximally reuse the dependency functions provided in its second argument.
In theory, the optimality of the lfp calculation lies somewhere between that of the gfp
calculation and that of the most general computation with reachability analysis. (The optimality of the lfp calculation subsumes that of the gfp counterpart because the dependency discovered by the gfp calculation can always be an invariant during the lfp calculation. However, in practice, the lfp calculation may not maintain such dependency throughout
SequentialDependencyLfp
input:  a state transition system M = (I, ~r, ~s, ~δ, ~ω)
output: a dependency triplet (~δφ, ~δϕ, ~θ) for ~δ
begin
01  i := 0; (~s(0)_φ, ~s(0)_ϕ, ~θ(0)) := InitialDependency(I)
02  repeat
03      i := i + 1
04      ~δ(i) := ~δ(~r, 〈~θ(i−1)(~s(i−1)_ϕ); ~s(i−1)_ϕ〉)
05      (~δ(i)_φ, ~δ(i)_ϕ, ~θ(i)) := CombinationalDependencyReuse(~δ(i), ~θ(i−1))
06  until ~θ(i) = ~θ(i−1)
07  return (~δ(i)_φ, ~δ(i)_ϕ, ~θ(i))
end

Figure 6.5. Algorithm: SequentialDependencyLfp.
its iterative computations if the reachable state space is approximated along an inappropriate path.) Since not all dependency in M can be detected by the lfp procedure, due to the imperfect information about the reachable states, the algorithm is incomplete in detecting dependency. To make it complete, reachability analysis should be incorporated. We postpone this integration to the next section and phrase it in the context of verification reduction.
Remark 4 Notice that when the ~θ(i)'s are restricted to consist of only identity functions and/or complemented identity functions, the refinement relation ⊑ among transition functions reduces to an equivalence relation. In this case, the lfp calculation of sequential dependency reduces to the detection of equivalent state variables. Hence, detecting signal correspondence [vE00] is a special case of our formulation.
6.4 Verification Reduction
Here we focus on using reduction for safety property verification, where reachability
analysis is the core computation. The verification problem asks if a state transition system
M = (I, ~r, ~s, ~δ, ~ω) satisfies a safety property P , denoted as M |= P , for all of its reachable
states.
Suppose that (~δφ, ~δϕ, ~θ) is a dependency triplet of ~δ; let ~sφ and ~sϕ be the corresponding
state variables of ~δφ and ~δϕ, respectively. To represent the reachable state set, either
~s or ~sϕ can be selected as the basis. Essentially, R(~s) = Expand(R⊥(~sϕ), (~sφ, ~sϕ, ~θ)) = R⊥(~sϕ) ∧ ∧_i (sφ_i ≡ θ_i(~sϕ)), where R and R⊥ are the characteristic functions representing the reachable state sets in the total space and, respectively, in the reduced space spanned by ~sϕ. Let P(~s) denote the states that satisfy P. Checking whether R(~s) ⇒ P(~s) is equivalent to checking whether R⊥(~sϕ) ⇒ P⊥(~sϕ), where P⊥(~sϕ) = P(〈~θ(~sϕ); ~sϕ〉). Hence,
the verification problem can be carried out solely over the reduced space. As noted in
Remark 3, the set I of initial states might require special handling.
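On an explicit-state toy instance, the reduction of the property check can be sketched as follows: R⊥ is kept over the independents only, and P is evaluated after expanding each reduced state through ~θ. The names are hypothetical, and explicit sets stand in for the characteristic functions used in the chapter.

```python
def holds_on_reduced(R_reduced, theta, P):
    """Check R(~s) => P(~s) entirely on the reduced basis: every reduced
    state is expanded through theta before evaluating P, i.e. the check
    R⊥(s_phi) => P⊥(s_phi) with P⊥(s_phi) = P(<theta(s_phi); s_phi>)."""
    return all(P(theta(sp), sp) for sp in R_reduced)

# Toy instance: one dependent bit that is always the negation of the first
# independent bit; P forbids that pair of bits from both being 1.
theta = lambda sp: (1 - sp[0],)
R_reduced = {(0, 0), (0, 1), (1, 1)}
P = lambda dep, sp: not (dep[0] == 1 and sp[0] == 1)
assert holds_on_reduced(R_reduced, theta, P)
```

The dependent bits never need to be enumerated: the reduced set has 3 states here, while the expanded R would range over a strictly larger variable set.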
For a given dependency, reachability analysis can be carried out solely upon the reduced basis. The validity of the given dependency can be tested in every iteration of the reachability analysis, as was done in [HD93]. Below we concentrate on the cases where the dependency is not given, and show how the detection of functional dependency can be embedded into the reachability analysis and simplify it.
To analyze the reachability of a transition system with unknown dependency, two approaches can be taken. One is to find the sequential dependency with the aforementioned gfp and/or lfp calculations, and then perform reachability analysis on the reduced state space based on the obtained dependency. The other is to embed the dependency detection into the reachability analysis as an on-the-fly reduction. Since the former is straightforward, we only detail the latter. Figure 6.6 sketches the algorithm. Procedure CombinationalDepen-
ComputeReachWithDependencyReduction
input:  a state transition system M = (I, ~r, ~s, ~δ, ~ω)
output: the set R of reachable states of M
begin
01  i := 0; (~s(0)_φ, ~s(0)_ϕ, ~θ(0)) := InitialDependency(I)
02  I⊥0 := I[~s(0)_ϕ/~s]
03  R⊥0 := I⊥0; F⊥0 := I⊥0
04  repeat
05      i := i + 1
06      ~δ(i) := ~δ(~r, 〈~θ(i−1)(~s(i−1)_ϕ); ~s(i−1)_ϕ〉)
07      (~δ(i)_φ, ~δ(i)_ϕ, ~θ(i)) := CombinationalDependencyReach(~δ(i), ~θ(i−1), R⊥i−1)
08      T⊥i := Image(~δ(i)_ϕ, F⊥i−1)
09      ~sν := ~s(i)_ϕ \ ~s(i−1)_ϕ; ~θν := ~sν's corresponding functions in ~θ(i−1)
10      R⊥i−1 := Expand(R⊥i−1, (~sν, ~s(i−1)_ϕ, ~θν))
11      R⊥i−1 := R⊥i−1[~s(i)_ϕ / ~s(i)_ϕ ∪ ~s(i−1)_ϕ]
12      F⊥i := simplify T⊥i with R⊥i−1 as don't care
13      R⊥i := R⊥i−1 ∪ T⊥i
14  until R⊥i = R⊥i−1
15  return Expand(R⊥i, (~s(i)_φ, ~s(i)_ϕ, ~θ(i)))
end

Figure 6.6. Algorithm: ComputeReachWithDependencyReduction.
dencyReach is similar to CombinationalDependencyReuse with two exceptions. First, the derived dependency is with respect to the reached state set provided in the third argument. Second, the set of independent state variables need not increase monotonically since the termination condition is taken care of by the reached state sets. In each iteration of the state traversal, the previously reached state set R is adjusted (by the expansion and projection operations) to a new basis according to the derived dependency triplet.
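A toy, explicit-state sketch of reachability with dependency reduction may help fix ideas. For brevity it restricts ~θ to identity functions (the signal-correspondence special case of Remark 4) and detects the dependency on the reached set after the traversal, rather than re-projecting the basis in every iteration as Figure 6.6 does; all names are hypothetical illustrations.

```python
def reach_with_reduction(delta, state_vars, init, n_inputs=1):
    """Explicit forward reachability, followed by a light reduction:
    whenever two state variables carry identical values on every reached
    state, the later one is dropped from the basis (functional dependency
    with theta restricted to the identity function).

    delta: dict var -> callable(r, state_dict) -> 0/1
    init:  set of state tuples ordered like state_vars.
    """
    reached, frontier = set(init), set(init)
    while frontier:
        new = set()
        for st in frontier:
            s = dict(zip(state_vars, st))
            for r in range(2 ** n_inputs):
                nxt = tuple(delta[v](r, s) for v in state_vars)
                if nxt not in reached:
                    new.add(nxt)
        reached |= new
        frontier = new
    n = len(state_vars)
    dependent = {j for j in range(n) for i in range(j)
                 if all(st[i] == st[j] for st in reached)}
    independents = [v for k, v in enumerate(state_vars) if k not in dependent]
    return reached, independents

# Toy FSM where s2 always mirrors s1: the reduced basis is just [s1].
delta = {"s1": lambda r, s: r, "s2": lambda r, s: r}
reached, indep = reach_with_reduction(delta, ["s1", "s2"], {(0, 0)})
assert reached == {(0, 0), (1, 1)}
assert indep == ["s1"]
```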
6.5 Experimental Results
The aforementioned algorithms have been implemented in the VIS [BHSV+96] environment. Experiments were conducted on a Sun machine with a 900-MHz CPU and 2-GB memory. The results of three sets of experiments are shown in Tables 6.1, 6.2, and 6.3, respectively. Table 6.1 demonstrates the relative power of exploiting dependency by the detection of signal correspondence and by the gfp and lfp calculations of sequential dependency. Table 6.2 compares their applicabilities to the equivalence checking problem. Finally, Table 6.3 shows how reachability analysis can benefit from our computation of functional dependency. In the experiments, all the approaches under comparison use the same BDD ordering (not optimized). In addition, no reordering is invoked.
Compared in Table 6.1 are three approaches: the computation of signal correspondence [vE00] and the gfp and lfp calculations of sequential dependency. The first two columns list the benchmark circuits and their sizes in state variables; for circuits retimed for timing optimization, the original sizes are listed in parentheses. For each compared
approach, four columns in order list the sizes of the computed independent state variables,
the required numbers of iterations, memory usage, and CPU time. Among these three
approaches, the minimum sizes of independent variables are highlighted in bold. It is
evident from Table 6.1 that the lfp calculation of sequential dependency subsumes the
detection of signal correspondence in both generality and optimality. On the other hand,
the powers of the lfp and gfp calculations are incomparable in practice. They have different
directions of approximating reachable state sets. For the gfp calculation, the unreachable
state set is gradually pruned each time dependency functions are substituted backward.
For the lfp one, the reachable state set grows with the iterative computation. It turns out
that the gfp computation is very effective in exploiting dependency for retimed circuits. For
instance, in circuit tbk-rt, 13 variables are identified as independents by the gfp calculation,
Table 6.1. Comparisons of Capabilities of Discovering Dependency. For each benchmark circuit (with its number of state variables and, for retimed circuits, the original size in parentheses), the table lists, under each of the three approaches (Signal Corr. [vE00], Seq. Dep. Gfp, and Seq. Dep. Lfp), the number of independent state variables, the number of iterations, memory usage (Mb), and CPU time (sec).
compared to 24 by the lfp one. In general, the gfp computation uses far fewer iterations than the other two approaches. In contrast, the lfp calculation outperforms the other two approaches on circuits that are not retimed. The table also reveals that none of the approaches suffers from memory explosion. Rather, time consumption may be a concern in the gfp
and lfp calculations of sequential dependency. This is understandable because testing the
refinement relation is more general and complicated than testing the equivalence relation
used in the detection of signal correspondence. Fortunately, the tradeoff between quality
and time can be easily controlled, for example, by imposing k-substitutability, which uses up
to k functions to substitute a dependent function. With our formulation, dependencies that
were underivable before, due to the limitation of reachability analysis on large transition
systems, can now be computed efficiently.
With a layout similar to that of Table 6.1, Table 6.2 compares the applicabilities of these three approaches to the equivalence checking problem. Here a product machine is built using
a circuit and its retimed version. As noted earlier, the gfp calculation itself cannot prove
the equivalence between two systems. It, essentially, computes the dependency inside each
individual system, but not the interdependency between them. On the other hand, the
detection of signal correspondence can rarely prove equivalence unless the two systems under
comparison are almost functionally identical. In contrast, the lfp calculation of sequential
dependency can easily prove the equivalence between two systems where one is forwardly
retimed from the other, and vice versa. Arbitrary retiming, however, may cause a failure, although in principle there always exists an lfp calculation that can conclude the equivalence.
In Table 6.2, since the retiming operations on the retimed circuits involve both forward and
backward moves, none of the approaches can directly conclude the equivalences. However,
as can be seen, the lfp calculation can compactly condense the product machines.
Although detecting dependency can reduce state space, it is not clear if the BDD sizes for
the dependency functions and the rewritten transition functions are small enough to benefit
Table 6.2. Comparisons of Capabilities of Checking Equivalence. The layout is the same as that of Table 6.1; each instance is a product machine built from a circuit and its retimed version.
Table 6.3. Comparisons of Capabilities of Analyzing Reachability. For reachability analysis without and with dependency reduction, the table lists, per circuit and number of traversal iterations, the peak number of live BDD nodes, the size (in BDD nodes) of the final reached state set, memory usage (Mb), and CPU time (sec).
reachability analysis. Table 6.3 justifies that it indeed can improve the analysis. Some hard instances for state traversal are studied. We compare reachability analyses without and with on-the-fly reduction using functional dependency. In the comparison, both analyses have the same implementation except for switching the reduction option off or on. The second column of Table 6.3 shows the number of steps of (partial) state traversal. For each reachability analysis, four columns in order show the peak number of live BDD nodes, the size of the BDD representing the final reached state set, memory usage, and CPU time. It is apparent that, with the help of functional dependency, reachability analysis yields substantial savings in both memory and time consumption, compared to the analysis without reduction.
6.6 Related Work
Among previous studies [HD93, vEJ96] on exploiting functional dependency, the one closest to ours is [vEJ96], while the functional dependency in [HD93] is assumed to be given. The method of [vEJ96] is similar to our reachability analysis with on-the-fly dependency detection. However, several differences need to be addressed. First, their dependency is drawn entirely from the currently reached state set (using functional deduction) rather than from the transition functions. Thus, in each iteration of their reachability analysis, image computation needs to be done before the detection of new functional dependency, and the image computation rarely benefits from the functional dependency. In contrast, our approach is more effective because the dependency is discovered before the image computation, which is then performed on the reduced basis. Second, as their dependency is obtained from the currently reached state set, not from the transition functions, it is not as robust as ours in remaining valid through the following iterations. Third, their approach cannot be used to detect functional dependency without reachability analysis, while our formulation
121
CHAPTER 6. VERIFICATION REDUCTION
can be used as a stand-alone technique. Also, we identify a new initial set of states with
predecessors. It uncovers more dependency to be exploited.
Among related work [QCC+00, AGM96, SWWK04, vE00] specific to sequential equivalence checking, the work of [vE00] is the most relevant to ours. As noted in Remark 4, finding signal correspondence [vE00] is a special case of our lfp calculation. As for the other related work, in [QCC+00] the two transition systems under comparison need to be similar up to a one-to-one mapping between equivalent states; such a mapping is discovered by reachability analysis until the combinational similarity converges. In comparison, the C-1-D approach of [AGM96] can handle one-to-many mappings by imposing the additional 1-distinguishability constraint. Compared to [QCC+00] and [AGM96], our formulation works for arbitrary mappings. We note that [AGM96] handles nondeterministic transition systems naturally; by contrast, we must incorporate the MOCB technique [HD93], a complication, to manage nondeterminism. The structural traversal method of [SWWK04] is an over-approximate reachability analysis based on circuit manipulations, whereas our computation is at the functional level. While these prior efforts focus on equivalence checking, ours is more general and applies to safety property checking.
6.7 Summary
We formulate the dependency among a collection of functions based on a refinement relation. When applied to a state transition system, it allows the detection of functional dependency without knowing reached state sets. Integrated into reachability analysis, it yields a complete verification procedure with the power of on-the-fly reduction. Our formulation unifies signal correspondence [vE00] and functional dependency [vEJ96] in the verification framework. Applied to the equivalence checking problem, our method bridges the complexity gap between combinational and sequential equivalence checking. Preliminary experiments show promising results in detecting dependency and in verification reduction.
Chapter 7
Conclusions and Future Work
With the help of invariants, we studied four topics in the analysis and verification
of finite state transition systems, namely, combinationality and sequential determinism,
retiming and resynthesis, sequential equivalence checking, and verification reduction.
Combinationality and sequential determinism. Cyclic definitions occur naturally in
high-level system specifications due to resource sharing, module composition, etc.
Without the time separation provided by state-holding elements, instantaneous val-
uations of cyclic definitions result in a causality problem. However, not all cyclic
definitions are hazardous. Prior work on differentiating good from bad cyclic defini-
tions was based on ternary-valued simulation at the circuit level with the up-bounded
inertial delay model. Different equivalent netlists may result in different conclusions
about combinationality. We argued that the previous differentiation (combination-
ality formulation) is too conservative because the timing model rules out legitimate
instances when cyclic definitions are to be broken by rewriting or when the synthesis
target is software. We investigated, at the functional level, the most general condition under which cyclic definitions are semantically combinational. Essentially, a set of
cyclic definitions is combinational at the functional level if and only if every state
evolution graph induced by an input assignment has all states in loops with a unique
output observation. This invariant characterizes combinationality at the functional level. Our result admits strictly more flexible high-level specifications and avoids inconsistent analyses across equivalent netlists. Furthermore, it allows combinationality to be analyzed at a higher level, so no costly synthesis of a high-level description into a circuit netlist is required before the analysis.
With our formulation, when the target is a software implementation, combinational cycles need not be broken as long as the execution of the underlying system obeys a sequencing execution rule. For hardware implementations, combinational cycles
should be broken and replaced with acyclic equivalents at the functional level to avoid
malfunctioning in the final physical realization.
Moreover, we extended our combinationality formulation to systems with state-
holding elements. We showed the exact condition under which a system with cyclic definitions is deterministic in its input-output behavior. Although the analysis of combinationality and input-output determinism is PSPACE-complete, it may still be practical as long as the cutset size is small.
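The functional-level criterion can be checked directly on small examples. The sketch below is one simplified reading of it, assuming the cyclic definitions are given as an update function on cut-variable valuations: for every input assignment, drive each state into a loop of the state evolution graph and require the states on loops to agree on a unique output observation. The mux-style example and all names are illustrative, not the thesis formulation.

```python
from itertools import product

def combinational(update, output, n_cut, inputs):
    """For each input assignment, collect the states lying on loops of the
    state evolution graph and require a unique output observation there."""
    states = list(product((0, 1), repeat=n_cut))
    for x in inputs:
        on_loops = set()
        for s in states:
            t = s
            for _ in range(len(states)):  # after |states| steps t is on a loop
                t = update(t, x)
            on_loops.add(t)
        if len({output(s, x) for s in on_loops}) != 1:
            return False
    return True

# A well-behaved cycle: y = x if c else z, z = y if c else x,
# with cut variables (y, z), input (c, x), and output y.
upd = lambda s, i: ((i[1] if i[0] else s[1]), (s[0] if i[0] else i[1]))
out = lambda s, i: s[0]
ok = combinational(upd, out, 2, list(product((0, 1), repeat=2)))  # True
```

An oscillating definition such as y = NOT z, z = y fails the check: its state evolution graph is a single loop whose states disagree on the output.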
As for future work, although the choice of cutset does not affect the analysis of com-
binationality, it does influence the resultant system rewritten with acyclic definitions.
It might be useful to choose a good cutset with respect to various optimization objectives. Also, as shown in Sections 3.3.2, 3.3.5 and 3.3.6, there are many ways to
rewrite cyclic definitions with acyclic equivalents. It would be interesting to explore
such flexibilities for further optimization.
Retiming and resynthesis. Transformations using retiming and resynthesis are considered among the most practical and important techniques for optimizing synchronous hardware systems. Since the transformations modify circuit structures directly without resorting to state space traversal, the computation is inexpensive and the improvement is transparent and predictable. Despite these advantages, the transformations are not widely adopted in current synthesis flows for synchronous hardware systems. The reason can be attributed to three unsolved problems: optimization capability, verification complexity, and the rectification of initialization sequences. Resolving these questions is crucial to developing effective synthesis and verification algorithms.
These problems were resolved in the thesis through identifying some transformation
invariants under retiming and resynthesis. The first problem was resolved through
a constructive algorithm which determines if two given FSMs are transformable to
each other via retiming and resynthesis operations. The second problem, verifying the equivalence of two FSMs under such transformations, was proved, contrary to a common belief, to be as hard as the PSPACE-complete problem of general equivalence checking if the transformation history is lost. As a result, we advocated a conservative design methodology for the optimization of synchronous hardware systems to improve verifiability: for instance, the transformation history should be recorded, or every retiming (resynthesis) operation should be followed by an equivalence check.
For the third problem, initializing FSMs transformed under iterative retiming and
resynthesis, we showed that there is no general transformation-independent bound
limiting the growth of initialization sequences, unlike the case when only retiming is
performed. An algorithm computing the length increase of initialization sequences
was presented. Essentially, an initialization sequence should be rectified by prefixing
it with an arbitrary input sequence of length greater than the computed length.
For future work, it is important to investigate more efficient computation, with reasonable accuracy, of the length increase of initialization sequences for FSMs transformed under iterative retiming and resynthesis. On the other hand, our lag-independent bound might be used to improve retiming algorithms by pruning spurious linear constraints, similarly to [MS98]. Moreover, since the result of [ESS96] can be modified to obtain a retiming function targeting area optimization with minimum increase of initialization sequences, it would be useful to study retiming under other objectives while avoiding lengthening initialization sequences.
Sequential equivalence checking. We extended our studies to general sequential equiv-
alence checking beyond that of checking retiming and resynthesis equivalence.
Checking the equivalence of two sequential systems is one of the most challenging problems, and a major obstacle, in designing correct hardware systems. The state-explosion problem limits formal verification to small- or medium-sized sequential circuits, partly because BDD sizes depend heavily on the number of variables involved. In the worst case, a BDD grows exponentially with the number of variables, so reducing this number can increase the verification capacity.
Given two FSMs M1 and M2 with m1 and m2 state variables, respectively, conventional formal methods verify equivalence by traversing the state space of the product machine, which has m1 + m2 registers. In contrast, we showed that the state equivalence of an FSM can be computed without building a product machine. Applying this result to equivalence checking, we introduced a different possibility, based on partitioning the state space defined by a multiplexed machine, which has merely max{m1, m2} + 1 registers. Essentially, sequential equivalence checking was done in the disjoint union of the two state spaces. For the invariants to be asserted, product-machine based verification checks that the outputs of the two FSMs under comparison are identical throughout reachability analysis; multiplexed-machine based verification checks that the initial state pair of the two FSMs remains in the same equivalence class throughout state space partitioning.
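In explicit-state terms, the multiplexed-machine check amounts to partition refinement over the disjoint union of the two state spaces: start from output-based classes, refine until stable, and test whether the two initial states share a class. The sketch below is a plain Moore-machine refinement loop; the machines and all names are illustrative, and the thesis computation is symbolic with BDDs.

```python
def equivalent(fsm1, fsm2, inputs):
    """Each fsm is (init, delta, out) with delta[state][input] -> state.
    Refine equivalence classes on the disjoint union of the state spaces;
    the machines are equivalent iff the initial states end up together."""
    tagged = [(1, fsm1), (2, fsm2)]
    states = [(m, s) for m, (_, d, _) in tagged for s in d]
    delta = {(m, s): {x: (m, d[s][x]) for x in inputs}
             for m, (_, d, _) in tagged for s in d}
    out = {(m, s): o[s] for m, (_, _, o) in tagged for s in o}
    cls = {q: out[q] for q in states}          # initial split: by output
    while True:
        sig = {q: (cls[q], tuple(cls[delta[q][x]] for x in inputs))
               for q in states}
        ids = {v: i for i, v in enumerate(sorted(set(sig.values()), key=repr))}
        new = {q: ids[sig[q]] for q in states}
        if len(set(new.values())) == len(set(cls.values())):
            cls = new
            break                               # refinement reached a fixpoint
        cls = new
    return cls[(1, fsm1[0])] == cls[(2, fsm2[0])]

# Two illustrative Moore machines; M2 duplicates one state of M1.
M1 = ('a', {'a': {0: 'a', 1: 'b'}, 'b': {0: 'b', 1: 'a'}}, {'a': 0, 'b': 1})
M2 = ('p', {'p': {0: 'p', 1: 'q'}, 'q': {0: 'r', 1: 'p'}, 'r': {0: 'q', 1: 'p'}},
      {'p': 0, 'q': 1, 'r': 1})
same = equivalent(M1, M2, [0, 1])  # True
```

Note that no product state space is built: the loop works over |M1| + |M2| states, mirroring the max{m1, m2} + 1 register count of the multiplexed machine rather than the m1 + m2 of the product machine.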
Empirical results demonstrated that the proposed approach is more robust than previous ones. The robustness can be attributed to three factors. First, the number of state variables encountered is almost half of that in product-machine based verification. Second, the cone-of-influence reduction is automatically taken care of when every output function is verified separately. Third, the number of equivalence classes in the reachable state space is invariant under any valid transformation.
On the other hand, since each equivalence class is represented with a BDD node, the verification capacity of the proposed approach is primarily limited by the number of equivalence classes encountered. The approach is feasible when this number is no more than a few million. Another weakness is the variable-ordering restriction on BDDs. Fortunately, since the restriction only needs to be maintained while counting the number of equivalence classes, it does not pose a notable hindrance.
A future research direction would be to develop specialized BDD operations to improve
our computations.
Verification reduction. We extended our studies further to general safety property check-
ing beyond equivalence checking, and proposed a reachability-preserving reduction
technique based on functional dependency. In essence, functional dependency is an invariant that acts as a catalyst, simplifying verification tasks.
The existence of functional dependency among the state variables of a state transition
system can cause needless inefficiency in BDD representations for formal verification.
Eliminating such dependency from the system compacts the state space and can significantly reduce the verification cost. Prior approaches to the detection of functional dependency relied on reachability analysis, which does not scale to large systems. Instead, we investigated how functional dependency can be derived
without or before knowing the reachable state set. Two previous studies on detect-
ing signal correspondence and exploiting functional dependency were unified in our
approach. We presented a direct derivation of dependency from transition functions
rather than from reached state sets. As a consequence, reachability analysis is not
a necessity for exploiting dependency. Dependencies that were underivable before,
due to the limitation of reachability analysis on large transition systems, can now be
computed efficiently. In addition, our derivation of functional dependency was integrated into reachability analysis as an on-the-fly reduction, with which reachability analysis achieved substantial savings in both memory and run time.
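The refinement-relation view of dependency admits a direct check on explicitly represented functions: f functionally depends on a set G iff whenever all functions in G agree on two domain points, f agrees as well, in which case the witness g with f = g(G) can be tabulated. The sketch below uses explicit truth tables with illustrative 3-bit transition functions; the thesis computation is symbolic.

```python
from itertools import product

def depends_on(f, G, domain):
    """Return a lookup table g with f(x) = g(g1(x), ..., gk(x)) for all x
    in the domain, or None if no such function exists."""
    table = {}
    for x in domain:
        key = tuple(g(x) for g in G)
        if table.setdefault(key, f(x)) != f(x):
            return None          # G agrees on two points where f differs
    return table

# Illustrative transition functions of a 3-bit machine: d2 = d0 XOR d1
# on every valuation, so state variable 2 is functionally dependent and
# can be eliminated before image computation.
dom = list(product((0, 1), repeat=3))
d0 = lambda s: 1 - s[0]
d1 = lambda s: s[0] ^ s[1]
d2 = lambda s: (1 - s[0]) ^ s[0] ^ s[1]
g = depends_on(d2, [d0, d1], dom)    # g[(u, v)] == u ^ v
```

A state variable whose transition function passes this check can be replaced by g applied to the remaining transition functions, shrinking the support of subsequent image computations; note that no reached state set is consulted.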
As a future research direction, our results can be reformulated in a SAT-solving frame-
work. An approach similar to that of [BC00], where van Eijk’s approach [vE00] was
adjusted, could be taken to prove safety properties with strengthened induction. We
believe that SAT-based verification can benefit from our results, because our approach
can impose more invariants than just signal correspondence; hence the search procedures of SAT solvers can be made more efficient. Furthermore, our current formulation does not handle transition relations directly. We would like to find an appropriate formulation for transition relations, rather than translating them into sets of functional vectors.
Bibliography
[AGM96] P. Ashar, A. Gupta, and S. Malik. Using complete-1-distinguishability for
FSM equivalence checking. In Proc. Int’l Conf. Computer-Aided Design,
pages 346–353, 1996.
[BC00] P. Bjesse and K. Claessen. SAT-based verification without state space traver-
sal. In Proc. Formal Methods in Computer-Aided Design, pages 372–389,
2000.
[BCM90] C. Berthet, O. Coudert, and J.-C. Madre. New ideas on symbolic manipula-
tions of finite state machines. In Proc. Int’l Conf. Computer Design, pages
224–227, 1990.
[Ber99] G. Berry. The Constructive Semantics of Pure Esterel. Draft book, 1999.
[Ber00] G. Berry. The foundations of Esterel. In Proof, Language, and Interaction:
Essays in Honour of Robin Milner. MIT Press, 2000.
[BHSV+96] R. K. Brayton, G. D. Hachtel, A. Sangiovanni-Vincentelli, F. Somenzi,
A. Aziz, S.-T. Cheng, S. Edwards, S. Khatri, Y. Kukimoto, A. Pardo,
S. Qadeer, R. K. Ranjan, S. Sarwary, T. R. Shiple, G. Swamy, and T. Villa.
VIS: a system for verification and synthesis. In Proc. Int’l Conf. Computer
Aided Verification, pages 428–432, 1996.
[Bro03] F. M. Brown. Boolean Reasoning: The Logic of Boolean Equations. Dover
Publications, 2003.
[Bry86] R. E. Bryant. Graph-based algorithms for Boolean function manipulation.
IEEE Trans. on Computers, C-35:677–691, August 1986.
[Bry87] R. E. Bryant. Boolean analysis of MOS circuits. IEEE Trans. on Computer-
Aided Design of Integrated Circuits and Systems, pages 634–649, 1987.
[Bry92] R. E. Bryant. Symbolic Boolean manipulation with ordered binary decision
diagrams. ACM Computing Surveys, 24(3):293–318, September 1992.
[BS95] J. Brzozowski and C.-J. Seger. Asynchronous Circuits. Springer-Verlag, 1995.
[CBM89] O. Coudert, C. Berthet, and J.-C. Madre. Verification of synchronous se-
quential machines based on symbolic execution. In Proc. Int’l Workshop
Automatic Verification Methods Finite State Syst., pages 365–373, 1989.
[CGP99] E. M. Clarke, O. Grumberg, and D. A. Peled. Model Checking. MIT Press,
1999.
[CM90] O. Coudert and J.-C. Madre. A unified framework for the formal verification
of sequential circuits. In Proc. Int’l Conf. Computer-Aided Design, pages
126–129, 1990.
[CQS00] G. Cabodi, S. Quer, and F. Somenzi. Optimizing sequential verification by
retiming transformations. In Proc. Design Automation Conf., pages 601–606,
2000.
[DM91] G. De Micheli. Synchronous logic synthesis: algorithms for cycle-time mini-
mization. IEEE Trans. on Computer-Aided Design of Integrated Circuits and
Systems, 10:63–73, January 1991.
[Edw03] S. Edwards. Making cyclic circuits acyclic. In Proc. Design Automation
Conference, pages 159–162, 2003.
[EL03] S. Edwards and E. Lee. The semantics and execution of a synchronous block-
diagram language. Science of Computer Programming, 48:21–42, 2003.
[EMMRM97] A. El-Maleh, T. E. Marchok, J. Rajski, and W. Maly. Behavior and testability
preservation under the retiming transformation. IEEE Trans. on Computer-
Aided Design of Integrated Circuits and Systems, 16:528–543, May 1997.
[ENSS98] G. Even, J. Naor, B. Schieber, and M. Sudan. Approximating minimum
feedback sets and multi-cuts in directed graphs. Algorithmica, 20:151–174,
1998.
[ESS96] G. Even, I. Y. Spillinger, and L. Stok. Retiming revisited and reversed.
IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems,
15:348–357, March 1996.
[Fil91] T. Filkorn. A method for symbolic verification of synchronous circuits. In
Proc. Int’l Symp. Comput. Hardware Description Lang. Applicat., pages 249–
259, 1991.
[Fil92] T. Filkorn. Symbolische Methoden für die Verifikation endlicher Zustandssysteme. Ph.D. Thesis, Institut für Informatik der Technischen Universität München, 1992.
[Hal93] N. Halbwachs. Synchronous Programming of Reactive Systems. Kluwer Aca-
demic Publishers, 1993.
[HD93] A. J. Hu and D. L. Dill. Reducing BDD size by exploiting functional depen-
dencies. In Proc. Design Automation Conference, pages 266–271, 1993.
[HJJ+96] J. G. Henriksen, J. Jensen, M. Jorgensen, N. Klarlund, B. Paige, T. Rauhe,
and A. Sandholm. Mona: monadic second-order logic in practice. In Proc.
Int’l Conf. on Tools and Algorithms for the Construction and Analysis of
Systems, pages 89–110, 1996.
[HM95] N. Halbwachs and F. Maraninchi. On the symbolic analysis of combinational
loops in circuits and synchronous programs. In Proc. Euromicro, 1995.
[HU79] J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Lan-
guages, and Computation. Addison-Wesley, 1979.
[Imm88] N. Immerman. Nondeterministic space is closed under complementation.
SIAM Journal on Computing, 17:935–938, 1988.
[JJH01] J.-H. R. Jiang, J.-Y. Jou, and J.-D. Huang. Unified functional decomposition
via encoding for FPGA technology mapping. IEEE Trans. on Very Large
Scale Integration Systems, 9:251–260, April 2001.
[Jon75] N. Jones. Space-bounded reducibility among combinatorial problems. Journal
of Computer and System Sciences, 11:68–85, 1975.
[Kar72] R. Karp. Reducibility among combinatorial problems. In Complexity of
Computer Computations, pages 85–104. Plenum Press, 1972.
[Kau70] W. Kautz. The necessity of closed circuit loops in minimal combinational
circuits. IEEE Trans. on Computers, pages 162–164, 1970.
[KB01] A. Kuehlmann and J. Baumgartner. Transformation-based verification using
generalized retiming. In Proc. Int’l Conf. Computer Aided Verification, pages
104–117, 2001.
[Koh78] Z. Kohavi. Switching and Finite Automata Theory. McGraw-Hill, New York,
1978.
[Kur94] R. P. Kurshan. Computer-Aided Verification of Coordinating Processes.
Princeton University Press, 1994.
[LN91a] B. Lin and A. R. Newton. Exact redundant state registers removal based on
binary decision diagrams. In Proc. Int’l Conf. Very Large Scale Integration,
pages 277–286, 1991.
[LN91b] B. Lin and A. R. Newton. Implicit manipulation of equivalence classes using
binary decision diagrams. In Proc. Int’l Conf. Computer Design, pages 81–85,
1991.
[LPV93] Y.-T. Lai, M. Pedram, and S. B. K. Vrudhula. BDD based decomposition
of logic functions with application to FPGA synthesis. In Proc. Design Au-
tomation Conf., pages 642–647, 1993.
[LS83] C. E. Leiserson and J. B. Saxe. Optimizing synchronous systems. Journal of
VLSI and Computer Systems, 1(1):41–67, Spring 1983.
[LS91] C. E. Leiserson and J. B. Saxe. Retiming synchronous circuitry. Algorithmica,
6:5–35, 1991.
[LTN90] B. Lin, H. J. Touati, and A. R. Newton. Don’t care minimization of multi-
level sequential logic networks. In Proc. Int’l Conf. Computer-Aided Design,
pages 414–417, 1990.
[Mac] The MacTutor History of Mathematics. Online archive, http://www-gap.dcs.st-and.ac.uk/~history/.
[Mal90] S. Malik. Combinational Logic Optimization Techniques in Sequential Logic
Synthesis. Ph.D. Thesis, University of California, Berkeley, 1990.
[Mal94] S. Malik. Analysis of cyclic combinational circuits. IEEE Trans. on
Computer-Aided Design of Integrated Circuits and Systems, 13(7):950–956,
July 1994.
[Mar60] E. Marczewski. Independence in algebras of sets and Boolean algebra. Fun-
damenta Mathematicae, 48:135–145, 1960.
[MKRS00] I.-H. Moon, J. H. Kukula, K. Ravi, and F. Somenzi. To split or to conjoin:
the question in image computation. In Proc. Design Automation Conf., pages
23–28, 2000.
[MS98] N. Maheshwari and S. Sapatnekar. Efficient retiming of large circuits. IEEE
Trans. on Very Large Scale Integration Systems, 6:74–83, March 1998.
[MSBSV91] S. Malik, E. M. Sentovich, R. K. Brayton, and A. Sangiovanni-Vincentelli.
Retiming and resynthesis: optimization of sequential networks with combi-
national techniques. IEEE Trans. on Computer-Aided Design of Integrated
Circuits and Systems, 10:74–84, January 1991.
[MSM04] M. N. Mneimneh, K. A. Sakallah, and J. Moondanos. Preserving synchroniz-
ing sequences of sequential circuits after retiming. In Proc. Asia and South
Pacific Design Automation Conference, January 2004.
[NK99] K. Namjoshi and R. Kurshan. Efficient analysis of cyclic definitions. In Proc.
Computer Aided Verification, pages 394–405, 1999.
[Pap94] C. H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.
[Pix92] C. Pixley. A theory and implementation of sequential hardware equivalence.
IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems,
11:1469–1478, December 1992.
[PT87] R. Paige and R. E. Tarjan. Three partition refinement algorithms. SIAM
Journal on Computing, 16:973–989, 1987.
[QCC+00] S. Quer, G. Cabodi, P. Camurati, L. Lavagno, and R. K. Brayton. Verification
of similar FSMs by mixing incremental re-encoding, reachability analysis, and
combinational check. Formal Methods in System Design, 17:107–134, 2000.
[Ran97] R. K. Ranjan. Design and Implementation Verification of Finite State Sys-
tems. Ph.D. Thesis, University of California, Berkeley, 1997.
[RB03a] M. Riedel and J. Bruck. Cyclic combinational circuits: analysis for synthesis.
In Proc. Int’l Workshop on Logic and Synthesis, pages 105–112, 2003.
[RB03b] M. Riedel and J. Bruck. The synthesis of cyclic combinational circuits. In
Proc. Design Automation Conference, pages 163–168, 2003.
[RK62] J. P. Roth and R. M. Karp. Minimization over Boolean graphs. IBM Journal
of Research and Development, pages 227–238, December 1962.
[RSSB98] R. K. Ranjan, V. Singhal, F. Somenzi, and R. K. Brayton. On the optimiza-
tion power of retiming and resynthesis transformations. In Proc. Int’l Conf.
on Computer-Aided Design, pages 402–407, 1998.
[Sav70] W. Savitch. Relationships between nondeterministic and deterministic tape
complexities. Journal of Computer and System Sciences, 4:177–192, 1970.
[SBT96] T. Shiple, G. Berry, and H. Touati. Constructive analysis of cyclic circuits.
In Proc. European Design and Test Conf., pages 328–333, 1996.
[Shi96] T. Shiple. Formal Analysis of Cyclic Circuits. Ph.D. Thesis, University of
California, Berkeley, 1996.
[SMB96] V. Singhal, S. Malik, and R. K. Brayton. The case for retiming with explicit
reset circuitry. In Proc. Int’l Conf. on Computer-Aided Design, pages 618–
625, 1996.
[SPRB95] V. Singhal, C. Pixley, R. L. Rudell, and R. K. Brayton. The validity of
retiming sequential circuits. In Proc. Design Automation Conference, pages
316–321, 1995.
[SSL+92] E. Sentovich, K. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha,
H. Savoj, P. Stephen, R. K. Brayton, and A. Sangiovanni-Vincentelli. SIS:
a system for sequential circuit synthesis. Tech. Report, UCB/ERL M92/41,
University of California, Berkeley, 1992.
[STB96] E. Sentovich, H. Toma, and G. Berry. Latch optimization in circuits generated
from high-level descriptions. In Proc. Int’l Conf. on Computer-Aided Design,
pages 428–435, 1996.
[Sto92] L. Stok. False loops through resource sharing. In Proc. Int’l Conf. on
Computer-Aided Design, pages 345–348, 1992.
[SWWK04] D. Stoffel, M. Wedler, P. Warkentin, and W. Kunz. Structural FSM traversal.
IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems,
23(5):598–619, May 2004.
[TB93] H. J. Touati and R. K. Brayton. Computing the initial states of retimed
circuits. IEEE Trans. on Computer-Aided Design of Integrated Circuits and
Systems, 12:157–162, January 1993.
[vE00] C. A. J. van Eijk. Sequential equivalence checking based on structural simi-
larities. IEEE Trans. on Computer-Aided Design of Integrated Circuits and
Systems, 19:814–819, July 2000.
[vEJ96] C. A. J. van Eijk and J. A. G. Jess. Exploiting functional dependencies in
finite state machine verification. In Proc. European Design and Test Conf., pages 9–14, 1996.
[YSBO99] B. Yang, R. Simmons, R. E. Bryant, and D. O’Hallaron. Optimizing symbolic
model checking for constraint-rich models. In Proc. Int’l Conf. Computer
Aided Verification, pages 328–340, 1999.
[ZSA98] H. Zhou, V. Singhal, and A. Aziz. How powerful is retiming? In Proc. Int’l
Workshop on Logic Synthesis, 1998.