Simulation meets formal verification

Post on 22-Jan-2016


Simulation meets formal verification

David L. Dill, Stanford University

Serdar Tasiran, U.C. Berkeley

David Dill, Serdar Tasiran

Why do we care?

Verification is increasingly a bottleneck

Large verification teams

Huge costs

Increases time-to-market

Bugs are being shipped

Simulation and emulation are not keeping up

Formal verification is hard

We need alternatives to fill the gap.


Outline

General observations

Conventional answers

Semi-formal methods

Conclusion


Orientation

Focus of this talk: late-stage bugs in register transfer level descriptions (and above).

Late stage bugs are hard to find

few bugs per simulation cycle, person-hour

delays time-to-market

Functional errors in RTL are

not eliminated by synthesis

not discovered by equivalence checking.


Where do bugs come from?

Incorrect specifications

Misinterpretation of specifications

Misunderstandings between designers

Missed cases

Protocol non-conformance

Resource conflicts

Cycle-level timing errors


Design scales

Now: Single FSM: ~12 bits of state, ~30 states

Individual designer subsystem: ~50K gates, 10 FSMs

Major subsystem: ~ 250K gates, 50 FSMs

ASIC: ~2M gates

In a few years: 10 Billion transistor chips

Lots of reusable IP


Properties

Verification requires something to check

Properties can be represented in many ways

Temporal logic

Checkers in HDL or other language

Properties can be specified at various points:

End-to-end (black-box) properties.

Internal properties (white-box). [0-In]

White-box properties are easier to check, because results don't have to be propagated to a system output.


“Coverage” is the key concept

Maximize the probability of

stimulating and detecting bugs,

at minimum cost

(in time, labor, and computation)


Outline

General observations

Conventional answers

Semi-formal methods

Conclusion


Simulation

Simulation is predominant verification method

Gate level or register transfer level (RTL)

Test cases

manually defined, or

randomly generated


Typical verification experience

[Figure: bugs found per week vs. weeks of functional testing; the bug-find rate climbs, then tails off toward tapeout, ending in "purgatory".]

Near-term improvements

Faster simulators

compiled code

cycle simulation

emulation

Testbench authoring tools (Verisity, Vera (Synopsys))

make pseudo-random better/easier

Incremental improvements won’t be enough.


Formal verification

Ensures consistency with the specification for all possible inputs (equivalent to 100% coverage of . . . something).

Methods

Equivalence checking

Model checking

Theorem proving

Valuable, but not a general solution.


Equivalence checking

Compare high level (RTL) with gate level

Gaining acceptance in practice

Products: Abstract, Avant!, Cadence, Synopsys, Verplex, …

Internal: Veritas (IBM)

But the hard bugs are usually in both descriptions

Targets implementation errors, not design errors.
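
The idea can be illustrated with a deliberately tiny Python sketch (not from the talk; real equivalence checkers use BDD/SAT engines rather than enumeration). `rtl_mux` and `gate_mux` below are hypothetical stand-ins for the high-level and gate-level descriptions:

```python
from itertools import product

def rtl_mux(a, b, s):
    # High-level (RTL) intent: select a when s == 0, b when s == 1
    return b if s else a

def gate_mux(a, b, s):
    # Gate-level netlist of the same mux: (a AND NOT s) OR (b AND s)
    return (a and not s) or (b and s)

def equivalent(f, g, n_inputs):
    # Exhaustively compare two combinational functions on all 2^n input vectors
    return all(bool(f(*v)) == bool(g(*v))
               for v in product([0, 1], repeat=n_inputs))

same = equivalent(rtl_mux, gate_mux, 3)   # True for this pair
```

Note what the slide warns about: if the same conceptual mistake is made in both descriptions, `equivalent` happily returns True.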


Model checking

Enumerates all states in state machine.

Gaining acceptance, but not yet widely used.

Abstract, Avant!, IBM, Cadence,…

Internally supported at Intel, Motorola, ...

Barrier: Low capacity (~200 register bits).

Requires extraction (of FSM controllers) or abstraction (of the design).

Both tend to cause costly false errors.
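
A toy explicit-state model checker (a sketch, not any of the products named above) makes the "enumerate all states" idea concrete; the counter design and its invariant are invented for illustration:

```python
from collections import deque

def check_invariant(initial, next_states, invariant):
    # Explicit-state model checking: enumerate all reachable states by BFS
    # and return the first one that violates the invariant (or None)
    seen = {initial}
    queue = deque([initial])
    while queue:
        s = queue.popleft()
        if not invariant(s):
            return s                      # counterexample state
        for t in next_states(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return None                           # invariant holds on all reachable states

# Toy design: a counter that wraps at 6
def step(s):
    return [(s + 1) % 6]

violation = check_invariant(0, step, lambda s: s < 6)   # None: invariant holds
```

The capacity barrier on the slide is visible even here: `seen` stores every reachable state explicitly, which is exactly what blows up on real designs.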


Theorem proving

Theorem prover checks formal proof

Mostly checks detailed manual proofs.

Sometimes provides some automatic help.

Useful for
  verifying algorithms [Russinoff, AMD K7 floating point]
  integrating verification results [Aagard, et al., DAC 98]: many parts of a big problem can be solved automatically; the theorem prover ensures that the parts fit together with no gaps

Not a general solution (too hard!)


Outline

General observations

Conventional answers

Semi-formal methods

Coverage measurement

Test generation

Symbolic simulation

Directed model checking

Conclusion


Semi-formal methods

Coverage measurement

Test generation

Symbolic simulation

Model checking for bugs


How to make simulation smarter

[Diagram, after Keutzer & Devadas: a conventional flow (simulation driver, simulation engine, monitors) augmented with novel components: vector generation, symbolic simulation, coverage analysis, and diagnosis of unverified portions.]

IDEAL: Comprehensive validation without redundant effort


Coverage Analysis: Why?

IDEAL: Comprehensive validation without redundant effort

What aspects of design haven’t been exercised?

Guides vector generation

How comprehensive is the verification so far?

A heuristic stopping criterion

Coordinate and compare
  separate sets of simulation runs
  model checking, symbolic simulation, …

Helps allocate verification resources


Coverage Metrics

A metric identifies important
  structures in a design representation: HDL lines, FSM states, paths in a netlist
  classes of behavior: transactions, event sequences

Metric classification based on level of representation.

Code-based metrics (HDL code)

Circuit structure-based metrics (Netlist)

State-space based metrics (State transition graph)

Functionality-based metrics (User defined tasks)

Spec-based metrics (Formal or executable spec)


Desirable scenario

IDEAL: Direct correspondence with design errors: 100% coverage = all bugs of a certain type detected

[Figure: a spectrum of coverage metrics (Metric 1 … Metric n), ranging from simple and cheap to elaborate and expensive, each measured on a 0%-100% scale.]

Desirable Qualities Of Coverage Metrics

IDEAL: Direct correspondence with bugs

PROBLEM: No good model for design errors
  No analog of "stuck-at faults" for design errors
  Bugs are much harder to characterize formally
  Difficult to prove that a metric is a good proxy for bugs

Then why use metrics?
  Need to gauge the status of verification: heuristic measures of verification adequacy
  Coverage-guided validation uncovers more bugs

Must look for empirical correlation with bug detection
  Higher coverage → higher chance of finding bugs
  ~100% coverage → few bugs remain


Desirable Qualities Of Coverage Metrics

Direct correspondence with bugs

Ease of use

Tolerable overhead to measure coverage

Reasonable computational and human effort to:
  interpret coverage data
  achieve high coverage
  generate stimuli to exercise uncovered aspects

Minimal modification to validation framework

Every metric is a trade-off between these requirements


Coverage Metrics

Code-based metrics

Circuit structure-based metrics

State-space based metrics

Functionality-based metrics

Spec-based metrics


Code-Based Coverage Metrics

On the HDL description

Line/code block coverage

Branch/conditional coverage

Expression coverage

Path coverage

Tag coverage (more detail later)

Useful guide for writing test cases

Little overhead

A good start, but not sufficient
  less than maximum code coverage → must test more
  does not address concurrency

always @ (a or b or s)   // mux
begin
  if ( ~s && p )
    d = a;
  else if ( s )
    d = b;
  else
    d = 'bx;
end

if ( sel == 1 )
  q = d;
else if ( sel == 0 )
  q = z;
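
As a rough illustration (in Python rather than an HDL, and not how any of the commercial tools work internally), line coverage amounts to logging which statements a test suite executes; the "line" labels here are hypothetical:

```python
# Hypothetical instrumented model: each executed "HDL line" logs its id
executed = set()

def mux_model(a, b, s, p):
    executed.add("if ~s && p")
    if not s and p:
        executed.add("d = a")
        return a
    elif s:
        executed.add("d = b")
        return b
    else:
        executed.add("d = x")
        return None

ALL_LINES = {"if ~s && p", "d = a", "d = b", "d = x"}

def line_coverage(tests):
    # Fraction of instrumented lines hit by the test suite
    executed.clear()
    for t in tests:
        mux_model(*t)
    return len(executed & ALL_LINES) / len(ALL_LINES)
```

One test exercises half the lines; three well-chosen tests reach 100%, which (per the slide) still says nothing about concurrency or observability.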


Code-Based Coverage Metrics

Many commercial tools can handle large-scale designs

VeriCover (Veritools)

SureCov (SureFire, now Verisity)

Coverscan (DAI, now Cadence)

HDLScore, VeriCov (Summit Design)

HDLCover, VeriSure (TransEDA)

Polaris (formerly CoverIt) (interHDL, now Avant!)

Covermeter (ATC, now Synopsys)

...


Circuit Structure-Based Metrics

Toggle coverage: Is each node in the circuit toggled?

Register activity: Is each register initialized? Loaded? Read?

Counters: Are they reset? Do they reach the max/min value?

Register-to-register interactions: Are all feasible paths exercised?

Datapath-control interface: Are all possible combinations of control and status signals exercised?

[Diagram: a control FSM (states sinit, s2, s3, s4, s5, s6) interacting with a datapath.]

(0-In checkers have these kinds of measures.)
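
Toggle coverage is simple to state precisely; a minimal Python sketch (invented here, not a tool's actual algorithm) over a per-cycle value trace:

```python
def toggle_coverage(trace, nodes):
    # A node is covered once it has been observed at both 0 and 1
    # somewhere in the trace (a list of {node: value} dicts, one per cycle)
    seen = {n: set() for n in nodes}
    for cycle in trace:
        for n, v in cycle.items():
            seen[n].add(v)
    covered = [n for n in nodes if seen[n] == {0, 1}]
    return len(covered) / len(nodes), covered

# Hypothetical two-node trace: clk toggles, en is stuck at 0
trace = [{"clk": 0, "en": 0}, {"clk": 1, "en": 0}, {"clk": 0, "en": 0}]
cov, covered = toggle_coverage(trace, ["clk", "en"])
```

Here `cov` is 0.5 with only `clk` covered, flagging `en` as never exercised.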


Circuit Structure-Based Metrics

Useful guide for test writers. Intuitive, easy to interpret.

Not sufficient by themselves. More of a sanity check.

Difficult to determine if a path is false, i.e., whether a combination of assignments to variables is possible

Problem with all metrics: “Is . . . coverable?”

Ask user or use heuristics



Design Fault Coverage

During test, faulty and original designs behave differently → fault is detected by a test

Use faults as proxy for actual design errors.

Faults are local mutations in
  HDL code
  gate-level structural description (netlist)
  state transition diagram of a finite state machine, …

COVERAGE: Fraction of faults detected by test suite.

Measurement methods similar to fault simulation for mfg. test: [Abadir, Ferguson, Kirkland, TCAD '88]; [Kang & Szygenda, ICCD '92]; [Fallah, Devadas, Keutzer, DAC '98]; . . .
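
The fault-as-proxy idea can be sketched in Python (a made-up two-input design and three made-up mutations; real tools mutate HDL or netlists and use fault-simulation machinery):

```python
# A "fault" is a small mutation of the design function. Coverage is the
# fraction of faults on which some test's output differs from the original.
def original(a, b):
    return a & b

faults = [
    lambda a, b: a | b,       # gate substitution: AND -> OR
    lambda a, b: a,           # input-b omission
    lambda a, b: a & ~b & 1,  # inverted input b
]

def fault_coverage(tests):
    detected = 0
    for f in faults:
        # A fault counts as detected if any test distinguishes it
        if any(original(*t) != f(*t) for t in tests):
            detected += 1
    return detected / len(faults)
```

A single test (1, 1) detects only the inverted-input fault (1/3 coverage); adding (0, 1) and (1, 0) detects all three.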


Design Fault Coverage: Critique

Various fault models have been considered
  gate (or input) omission/insertion/substitution
  wrong output or wrong next state for a given input
  error in an assignment on an HDL line

Fault models motivated more by ease of use and definition
  not really "common denominators" for design errors
  additional restrictions, e.g. "single fault assumption"

But they provide a fine grain measure of how adequately the design is exercised and observed.


Observability

Simulation detects a bug only if a monitor flags an error, or design and reference model differ on a variable

Portion of design covered only when

it is exercised (controllability)

a discrepancy originating there causes discrepancy in a monitored variable (observability)

Low observability → false sense of security

Most of the design is exercised → looks like high coverage

But most bugs not detected by monitors or ref. model

Observability missing from most metrics



Tag Coverage [Devadas, Keutzer, Ghosh ‘96]

HDL code coverage metrics + observability requirement.

Bugs modeled as errors in HDL assignments.

A buggy assignment may be stimulated, but still missed

EXAMPLES:
  Wrong value generated speculatively, but never used.
  Wrong value computed and stored in memory; read 1M cycles later, but the simulation doesn't run that long.


Tag Coverage [Devadas, Keutzer, Ghosh ‘96]

IDEA: Tag each assignment with + or -: a deviation from the intended value
  1+ : symbolic representation of all values > 1

Run simulation vectors, tagging one variable assignment at a time

Use tag calculus

Tag Coverage: Subset of tags that propagate to observed variables

Confirms that tag is activated and its effect propagated.

A+ = 1
C- = 4 - k·A+    // k ≥ 0
D  = C- + A+
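
A loose Python sketch of the tag-propagation idea (the published calculus is more refined; the cancellation rule here is deliberately crude, and k is assumed strictly positive):

```python
# tag = +1: value larger than intended; -1: smaller; 0: as intended
def tag_neg(t):
    # Negation flips the deviation
    return -t

def tag_add(t1, t2):
    # Equal or one-sided tags propagate; opposite tags may cancel,
    # so the tag is (crudely) dropped here
    if t1 == t2:
        return t1
    if t1 == 0 or t2 == 0:
        return t1 or t2
    return 0

tag_A = +1                      # A+ = 1
tag_C = tag_neg(tag_A)          # C = 4 - k*A with k > 0: deviation flips sign
tag_D = tag_add(tag_C, tag_A)   # D = C + A: opposite tags may cancel; tag lost
```

This mirrors the slide's example: the + tag on A reaches C as a -, but at D the two deviations can cancel, so the tag does not necessarily propagate to an observed variable.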


Tag Coverage: Critique

Easily incorporated
  can use commercial simulators
  simulation overhead is reasonable

Easy to interpret
  can identify what blocks propagation of a tag
  can use ATPG techniques to cover a tag

Error model doesn’t directly address design errors

BUT a better measure of how well the design is tested than standard code coverage


State-Space-Based Metrics (FSM Coverage)

State, transition, or path coverage of a "core" FSM: projection of the design onto selected variables

Control event coverage [Ho et al., '96, FLASH processor]: transition coverage for variables controlling the datapath

Pair-arcs (introduced by 0-In): for each pair of controller FSMs, exercise all feasible pairs of transitions. Catches synchronization errors, resource conflicts, ...

Benjamin, Geist, et al. [DAC '99]: hand-written abstract model of a processor

Shen, Abraham, et al.: extract an FSM for the "most important" control variable; cover all paths of a given length on this FSM
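
In Python, transition coverage and the pair-arcs idea reduce to set bookkeeping over simulation traces (a sketch with invented state names, not any vendor's implementation):

```python
def transition_coverage(trace, fsm_transitions):
    # Fraction of the FSM's transitions exercised by a trace of states
    seen = set(zip(trace, trace[1:]))
    return len(seen & fsm_transitions) / len(fsm_transitions)

def pair_arc_coverage(trace_a, trace_b, feasible_pairs):
    # Pair-arcs: which feasible *simultaneous* transitions of two
    # controller FSMs were exercised together
    arcs_a = zip(trace_a, trace_a[1:])
    arcs_b = zip(trace_b, trace_b[1:])
    seen = set(zip(arcs_a, arcs_b))
    return len(seen & feasible_pairs) / len(feasible_pairs)

fsm = {("idle", "busy"), ("busy", "idle"), ("busy", "busy")}
cov = transition_coverage(["idle", "busy", "idle"], fsm)   # 2 of 3 arcs
```

Pair-arc coverage squares the bookkeeping but is exactly the kind of multiple-FSM interaction metric the next slide argues catches the difficult bugs.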


State-Space-Based Metrics

Probably the most appropriate metrics for "bug coverage"

Experience: rare FSM interactions cause difficult bugs; addressed best by multiple-FSM coverage

Trade-off: sophisticated metric on a small FSM vs. simple metric on a large FSM or multiple FSMs. Relative benefits are design dependent.

Difficult to check if something is coverable; may require knowledge of the entire design

Most code-coverage companies also provide FSM coverage
  automatic extraction, user-defined FSMs
  reasonable simulation overhead


Functional Coverage

Define monitors, tasks, assertions, …
  Check for specific conditions, activity, …

User-defined Coverage [Grinwald, et al., DAC ‘98] (IBM)

User defines "coverage tasks" using a simple language: first-order temporal logic + arithmetic operators
  Snapshot tasks: condition on events in one cycle
  Temporal tasks: refer to events over different cycles

User expressions (Covermeter), Vera, Verisity

Assertion synthesis (checkers) (0-in)

Event Sequence Coverage Metrics (ESCMs) [Moundanos & Abraham, VLSI Test Symp. '98]
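
Snapshot and temporal coverage tasks can be sketched in Python over a trace of per-cycle signal dictionaries (the req/gnt signals are hypothetical, and this is far simpler than the IBM language):

```python
def snapshot_task(trace, condition):
    # Snapshot task: did the condition hold in some single cycle?
    return any(condition(cycle) for cycle in trace)

def temporal_task(trace, first, then, within):
    # Temporal task: after `first` holds, does `then` hold
    # within the next `within` cycles?
    for i, cycle in enumerate(trace):
        if first(cycle):
            if any(then(c) for c in trace[i + 1 : i + 1 + within]):
                return True
    return False

# Hypothetical request/grant trace
trace = [{"req": 1, "gnt": 0}, {"req": 0, "gnt": 0}, {"req": 0, "gnt": 1}]
hit = temporal_task(trace, lambda c: c["req"], lambda c: c["gnt"], within=2)
```

The grant arrives two cycles after the request, so the task is covered with `within=2` but not with `within=1`.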


Functional Coverage

Good because they make the designer think about the design in a different and redundant way

BUT:
  May require a lot of user effort (unless synthesized): user needs to write monitors
  May not test corner cases: designers will write monitors for the expected case
  Are design specific: monitors and assertions need to be redefined for each new design


Spec-Based Metrics

Model-based metrics are weak at detecting missing functionality

The spec encapsulates required functionality Apply (generalize) design coverage metrics to formal spec

PROBLEMS:

Spec-based metrics alone may not exercise design thoroughly

Spec is often incomplete

Two cases that look equivalent according to the spec may be implemented differently

A formal spec may not exist for the unit being tested

Model and spec-based metrics complement each other


Semi-formal methods

Coverage measurement

Test generation

Symbolic simulation

Model checking for bugs


Verification test generation

Approach: generate tests automatically that maximize coverage per simulation cycle.

Automatic test generation is crucial for high productivity.

Tests can be generated

off-line: vectors saved in files, or

on-line: vectors generated as you simulate them.

Specific topics
  ATPG methods (design fault coverage)

FSM-based methods (FSM coverage)

Test amplification


ATPG methods

Use gate-level design fault model

maybe just standard stuck-at model.

Generate tests automatically using ATPG (automatic test pattern generation) techniques

Takes into account “observability” of error.

Oriented towards combinational designs.

General solution would need sequential ATPG [hard].


FSM-based test generation

Generate FSM tests using model checking techniques (e.g. BDD-based, explicit).

Map FSM test to design test vector [ hard! ]

[Diagram: an abstract FSM is related to the concrete design; an FSM test must be mapped down to a design test.]


Test vector mapping

User defines mapping rules from FSM events to input vectors. [Ho PhD, Stanford 1996; Geist, et al., FMCAD 96]

Mapping must be relatively simple.

Automatically map to test vectors using sequential ATPG techniques. [Moundanos, et al., IEEE TOC Jan. 1998]

Published examples are small.


Coverage-driven search

[Ganai, Aziz, Kuehlmann DAC ‘99]

Identify signals that were not toggled in user tests; attempt to solve for inputs in the current cycle that will make a signal toggle, using BDDs and ATPG methods.

Similar approach could be taken for other coverage metrics.

General problem: controllability (as in FSM coverage).
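
A sketch of the goal, with random sampling standing in for the BDD/ATPG solving step the paper actually uses (the internal node below is invented for illustration):

```python
import random

def find_toggling_input(node_fn, current_value, n_inputs, tries=1000, seed=0):
    # Search for an input vector that toggles a so-far-untoggled node.
    # Real coverage-driven search *solves* for such inputs with BDDs/ATPG;
    # this sketch just samples random vectors until the node flips.
    rng = random.Random(seed)
    for _ in range(tries):
        vec = tuple(rng.randint(0, 1) for _ in range(n_inputs))
        if node_fn(*vec) != current_value:
            return vec
    return None   # node may be untogglable (the controllability problem)

# Hypothetical internal node that is 0 unless all three inputs are 1
node = lambda a, b, c: a & b & c
vec = find_toggling_input(node, current_value=0, n_inputs=3)
```

Random sampling works here because the toggling condition is easy to hit; for deeply buried nodes the hit probability collapses, which is exactly why the paper reaches for BDDs and ATPG.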


Test Amplification

Approach: leverage interesting behavior generated by the user.

Explore behavior “near” user tests, to catch near misses.

Many methods could be used
  Satisfiability
  BDDs
  Symbolic simulation

[Figure: Simulation + Formal = 0-In Search.]


Semi-formal methods

Coverage measurement

Test generation

Symbolic simulation

Model checking for bugs


Symbolic simulation

Approach: Get a lot of coverage from a few simulations.

Inputs are variables or expressions

Operation may compute an expression instead of a value.

Advantage: more coverage per simulation

one expr can cover a huge set of values.

[Figure: an adder with symbolic inputs "a" and "b - c" producing the symbolic output "a + b - c".]
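
A minimal Python sketch of the idea (real symbolic simulators use BDDs or decision procedures, not strings): operations on symbolic values build expressions, so one symbolic run stands for all concrete runs:

```python
class Sym:
    # A symbolic value: arithmetic builds an expression instead of a number
    def __init__(self, expr):
        self.expr = expr
    def __add__(self, other):
        return Sym(f"({self.expr} + {other.expr})")

def evaluate(sym, env):
    # Substitute concrete values to cross-check against ordinary simulation
    return eval(sym.expr, {}, env)

# The slide's adder: one symbolic run covers every concrete (a, b, c)
out = Sym("a") + Sym("b - c")   # out.expr == "(a + b - c)"
```

Checking any single concrete assignment, e.g. a=2, b=5, c=1, reproduces what a conventional simulation of that vector would compute.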


BDD-based symbolic simulation

Symbolic expressions are represented as BDDs.

Symbolic trajectory evaluation (STE): Special logic for specifying input/output tests.

Used at MOS transistor or gate level.

COSMOS [Bryant, DAC 90] (freeware), Voss [Seger]

Used at Intel, Motorola

Transistor and RTL simulation: Innologic (commercial)


Higher-level symbolic simulation

Symbolic simulation doesn’t have to be bit-level.

RTL symbolic simulation can have built-in datatypes for:

Bitvectors, Integers (linear inequalities)

Arrays

Especially useful if combined with an automatic decision procedure for these constructs.

[Barrett et al. FMCAD 96, DAC 98]


Semi-formal verification using symbolic simulation

Symbolic simulation is a tool that can be used for full or partial formal verification. Many papers are about full formal verification.

But tools naturally encourage partial verification.

Partial verification
  Use constants for some inputs

Convert variables to constants “on-the-fly” [Innologic]

Start with constant state, simulate a few cycles with symbolic inputs

May miss states with errors. Example: Robert Jones's PhD thesis (Stanford/Intel): symbolic simulation of the retirement logic of the Pentium Pro.


Semi-formal methods

Coverage measurement

Test generation

Symbolic simulation

Model checking for bugs


Partial model checking

When the BDD starts to blow up, delete part of the state space
  High-density BDDs [Ravi, Somenzi, ICCAD '95]: subset the state space to maximize state count / BDD size
  Prune BDDs using multiple-FSM coverage ("saturated simulation") [Aziz, Kukula, Shiple, DAC 98]

Prioritized model checking
  Use best-first search for assertion-violation states
  Useful with BDDs or explicit model checking
  Metrics:
    Hamming distance [Yang, Dill, HLDVT 96; Yuan et al., CAV 97]
    "Tracks" [Yang & Dill, DAC 98]
    Estimated probability of reaching the target state in a random walk [Kuehlmann, McMillan, Brayton, ICCAD 99]
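
The Hamming-distance heuristic can be sketched as a best-first search in Python (a toy over bit-vector states; the cited work applies this inside BDD or explicit model checkers):

```python
import heapq

def directed_search(initial, next_states, is_violation, target_hint):
    # Prioritized (directed) model checking: best-first search that expands
    # states closest, in Hamming distance, to a hinted violation state
    def hamming(s):
        return sum(a != b for a, b in zip(s, target_hint))
    seen = {initial}
    heap = [(hamming(initial), initial)]
    while heap:
        _, s = heapq.heappop(heap)
        if is_violation(s):
            return s              # found an assertion-violation state
        for t in next_states(s):
            if t not in seen:
                seen.add(t)
                heapq.heappush(heap, (hamming(t), t))
    return None

# Toy example: 3-bit state vectors; a step flips any single bit
def flip_one_bit(s):
    return [s[:i] + (1 - s[i],) + s[i + 1:] for i in range(len(s))]

bad = directed_search((0, 0, 0), flip_one_bit,
                      lambda s: s == (1, 1, 1), target_hint=(1, 1, 1))
```

Unlike the BFS of exhaustive model checking, the priority queue steers the search toward the suspected violation, trading completeness guarantees for finding bugs sooner.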


Comments on model-checking for bugs

Topic is not mature.

Published examples are small.

Big increases in capacity needed.


Outline

General observations

Conventional answers

Semi-formal methods

Research issues

Conclusion


Research methodology

Research in this area is empirical. The "scientific method" is important!

How do we measure success (can it find bugs)?

What do we use for controls?

What is the “null hypothesis”?

Apparent effectiveness depends on
  Design methodology (language, processes)

Type of design

Designer style, training, and psychology

Size of design!

Design examples need to be large, realistic, and varied.


State of the art

Research and product development are immature

There are many ideas.

Experiments are encouraging, but not conclusive.

No clear winner has emerged.

Commercial products are on the way, but no clear winners (yet).


Coverage vs. scale

[Figure, based on published results: achievable coverage vs. design scale (from a single FSM through 50K, 250K, and 2M gates) for model checking, symbolic simulation, FSM-based generation, manual tests with coverage, and random simulation.]

The future

How can we verify huge systems with many reusable components?

System-level simulation won’t find bugs efficiently enough.

Maybe: vendors help with semi-formal verification
  Supply designs with checkers
    inside the design
    at interfaces
    environmental constraints, also
  Supply information about the component
    coverage info (e.g. conditions to trigger)
    hints for efficient vector generation


Predictions

This is going to be an important area
  Many papers
  Verification products

Simulation & emulation will continue to be heavily used.

Formal verification will be crucial, when applicable

Special application domains: protocols, FSMs, floating point, etc.

Design for verification would increase scope


Web page

http://verify.stanford.edu
