18. Design Verification
Jacob Abraham
Department of Electrical and Computer Engineering, The University of Texas at Austin
VLSI Design, Fall 2017
November 6, 2017
Reliability in the Life of an Integrated Circuit – I
Design → Fabrication → Wafer
Design “bugs” are addressed by Verification (Simulation, Formal)
Process variations and defects are addressed by Process Monitors
Reliability in the Life of an Integrated Circuit – II
Wafer Probe → Package → System → Application
At the tester: test cost and coverage are addressed by Design for Test and Built-In Self-Test
In the system: test escapes, wearout, and the environment are addressed by System Self-Test, Error Detection, Fault Tolerance, and Resilience
Analyzing Complex Designs
Need to (implicitly) search a very large state space
Find bugs in a design – verification process
Generate tests for faults in a manufactured chip
Basic algorithms for analyzing even combinational blocks (SAT, ATPG) are NP-complete
Approaches to deal with real designs
Exploit hierarchy in the design
Develop abstractions for parts of a design
Cost of a new mask set can be on the order of $1+ million for a large chip
Cannot afford mistakes
Want working “first silicon”
Many Aspects of Verification
Verifying the functional correctness of the design
Performance verification – architecture level (number of clocks to perform a function)
Timing verification – circuit level (how fast can we clock?)
Verifying power consumption
Verifying signal integrity and variation tolerance
Checking correct implementation of specifications at each level
The Verification Problem
Need to deal with this complexity
A subtle bug could produce an incorrect result in a specific state for a specific data input
Seen as a “sequence dependency” when simulating a design (a specific sequence of inputs is needed to reach the erroneous state)
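To make this concrete, here is a minimal SystemVerilog sketch of a sequence-dependent bug; the FSM, signal names, and the planted flaw are hypothetical illustrations, not from the lecture:

```systemverilog
// A bug tied to a specific transition from a specific state: the
// output is wrong only in S2 when input a drops to 0, so only the
// input sequence a = 1, 1, 0 (from reset) exposes it.
module seq_bug (
  input  logic clk, rst,
  input  logic a,
  output logic out
);
  typedef enum logic [1:0] {S0, S1, S2, S3} state_t;
  state_t st;

  always_ff @(posedge clk) begin
    if (rst)      st <= S0;
    else case (st)
      S0: st <= a ? S1 : S0;
      S1: st <= a ? S2 : S0;
      S2: st <= a ? S3 : S0;
      S3: st <= S0;
    endcase
  end

  // Intended behavior: out is high only in S3.
  // Planted bug: out also (wrongly) goes high in S2 when a == 0.
  assign out = (st == S3) || (st == S2 && !a);
endmodule
```

Random simulation catches this only if it happens to drive two 1s followed by a 0 starting from reset, which is exactly the “specific sequence of inputs” problem described above.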
The (In)Famous Pentium FDIV Problem
Graph of x, y, x/y in a small region by Larry Hoyle
State-Space Explosion
May need to check a very large number of states to uncover a bug
Problem: the number of protons in the universe is around 10^80, which is less than the number of states for a system with 300 storage elements!
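A back-of-the-envelope check of the claim:

2^300 = (2^10)^30 ≈ (10^3)^30 = 10^90 ≫ 10^80

so a design with only 300 flip-flops has roughly ten orders of magnitude more states than the estimated number of protons in the universe.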
What is a “Bug”?
Design does not match the specification
One problem: complete (and consistent) specifications may not exist for many products
For example, the difficulty in designing an x86-compatible chip is not in implementing the x86 instruction set architecture, but in matching the behavior of Intel chips
Something which the customer will complain about
Marketing: “It’s not a bug, it’s a feature”
Design Flaws
About half of the flaws are functional flaws
Need verification methods to find and fix logical and functional flaws
Design Bug Distribution in Pentium 4
Type of Bug %
“Goof” 12.7
Miscommunication 11.4
Microarchitecture 9.3
Logic/Microcode Changes 9.3
Corner Cases 8.0
Power Down 5.7
Documentation 4.4
Complexity 3.9
Initialization 3.4
Incorrect RTL Assertions 2.8
Design Mistake 2.6
Source: EE Times, July 4, 2001
42 Million Transistors
High-level description: 1+ million lines of RTL
100 high-level bugs found through formal verification
Design Effort
Verification is becoming an increasing part of the design effort
Verification Effort
Source: 1999 ITRS
Between 50% and 80% of the design effort is spent in verification
Design and Verification Gap
Blame it on Moore’s Law
The number of transistors on a chip is increasing every year (approx. 58% per year)
Design productivity, facilitated by EDA tool improvements, grows only about 21% per year
These numbers have held constant for two decades
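A rough calculation shows how quickly these rates compound: the complexity-to-productivity ratio grows by 1.58 / 1.21 ≈ 1.31 per year, so over a decade the gap widens by about 1.31^10 ≈ 14×.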
“Bug” Introduction and Detection
Verification Consumes the Majority of Project Time
Source: Wilson Research Group and Mentor Graphics, 2014 Functional Verification Study
More Verification Engineers than Designers
Source: Wilson Research Group and Mentor Graphics, 2014 Functional Verification Study
Re-Spins because of Functional Flaws
Tom Fitzpatrick, EE Times, December 5, 2005
Hardware Bugs Force Respins
Source: Wilson Research Group and Mentor Graphics, 2014 Functional Verification Study
Bugs Can Be Very Subtle – Example: the “Memory Sinkhole” Security Vulnerability
x86 Protection Rings (Wikipedia)
System Management Mode (SMM), introduced in 1990
The CPU executes code from a separate area of memory (SMRAM)
SMRAM is accessible only by the processor (not even by the OS)
SMM handles system-wide functions like power management, hardware control, and proprietary OEM code
Intended for use only by system firmware
Sinkhole Security Vulnerability, Cont’d
Remap the Local Advanced Programmable Interrupt Controller (APIC) over the SMRAM range during critical execution states
This causes memory accesses that should be sent to the MCH to be prematurely accepted by the APIC instead
Permits malicious ring 0 code to influence SMM
https://www.blackhat.com/docs/us-15/materials/us-15-Domas-The-Memory-Sinkhole-Unleashing-An-x86-Design-Flaw-Allowing-Universal-Privilege-Escalation.pdf
Design and Implementation Verification
Verification Approaches
Simulation (the most popular verification method)
Cycle-based, functional simulation for billions of cycles
Good coverage metrics are usually not available
Computationally very expensive; the slightest optimization has a huge impact
Emulation
Capital intensive
Map the design to be verified onto FPGAs
Run the OS and applications at MHz rates
Formal verification
Exhaustive verification of small modules
Evaluating the Complete Design
Is there a verification technique which can be applied to the entire chip?
Only one approach which scales with the design: Simulation
Most common technique now used in industry
Cycle-based simulation can exercise the design for millions of cycles
Unfortunately, the question of when to stop simulating is open
No good measures of coverage
Emulation
Used to verify the first Pentium (Windows was booted on an FPGA system)
Developing another accurate model is an issue
Currently used for post-silicon validation of the Intel Atom platform
Simulation Hierarchy
Levels of modeling the design
Behavioral level – SystemC, SystemVerilog, SpecC
Register level – work distributed among clock cycles, all registers modeled
Gate level – structural model with timing: zero, unit, or detailed delays
Switch level – models each transistor as a switch, simple timing model
Circuit level – complete timing, used for small sections of a design
Mixed-level: combination of techniques to improve speed
Simulation Speeds
Comparing speeds of simulating a microprocessor on a computer
Performance timers: 10K – 50K cycles/sec.
Behavior level: 1000 – 10K cycles/sec.
R-T level: 20 – 1000 cycles/sec.
Gate level: 4 – 25 cycles/sec.
Switch level: 1/4 – 1 cycles/sec.
When are we Done Simulating?
When do you tape out?
Motorola criteria (EE Times, July 4, 2001)
40 billion random cycles without finding a bug
Directed tests in verification plan are completed
Source code and/or functional coverage goals are met
Diminishing bug rate is observed
A certain date on the calendar is reached
Emulation Technologies
Architecture Verification
Verify that the design matches the architectural specification
Extensive testing is the common approach
Conformance testing
Approaches used in industry
Manually writing tests
Generating pseudo-random instruction sequences
Using biased pseudo-random instructions (see the sketch after this list)
Generating instruction sequences from typical workloads
Example: to verify an x86 clone, capture an instruction trace on another x86 machine while it is running an application
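As a minimal sketch of the biased pseudo-random approach, here is a SystemVerilog constrained-random instruction class; the opcode encodings, field widths, and bias weights are hypothetical, not taken from any of the generators above:

```systemverilog
// Biased pseudo-random instruction generation (sketch).
// Opcode values and weights below are invented for illustration.
class Instr;
  rand bit [3:0] opcode;
  rand bit [4:0] rd, rs1, rs2;

  // Bias the distribution: loads, stores, and branches are weighted
  // more heavily because memory and control-flow corner cases tend
  // to hide more bugs than straight-line ALU operations.
  constraint opcode_bias {
    opcode dist {
      4'h0 := 1,   // nop
      4'h1 := 4,   // add
      4'h2 := 4,   // sub
      4'h6 := 8,   // load
      4'h7 := 8,   // store
      4'hA := 10   // branch
    };
  }
endclass

module gen_tb;
  initial begin
    Instr i = new();
    repeat (1000) begin
      if (!i.randomize()) $fatal(1, "randomize() failed");
      // In a real flow the instruction would be driven into the DUT
      // and checked against a reference model; here we just print it.
      $display("opcode=%h rd=%0d rs1=%0d rs2=%0d",
               i.opcode, i.rd, i.rs1, i.rs2);
    end
  end
endmodule
```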
Example of Verifying Processors
Verifying PowerPC Processors
Model-based test generator
An expert system which contains a formal model of the processor architecture and a heuristic database of testing knowledge
For example, the testing knowledge accumulated during the verification of the first PowerPC design included about 1,000 generation and validation functions (120,000 lines of C code)
The PowerPC behavioral simulator has about 40,000 lines of C++ code
MAP1000 Media Processor Verification (Shen, 1999)
System Architecture
Many instructions issued in parallel – very long instruction word (VLIW)
Move scheduling and resource management out of hardware to optimizing compilers
Specialized function units for serial processing
Wide fast memory buses
Large data and instruction caches
Intelligent memory I/O
Including “strips/strides” and background transfers
Sophisticated data transfer engines
Verification Challenges
Complex data/instruction caches, memory I/O mechanisms, and a large number of functional units
Pre-Silicon Design Verification
Post-Silicon Design Verification
Effectiveness of Tests
How do we Measure the Quality of Simulation Vectors?
Use “coverage” metrics
Current industry metrics borrowed from software testing
White box testing
Grey box testing
Black box testing
Black Box Test Cases
Example: testing IP cores without knowledge of the structural design
Based on “functional” fault or behavior models
Functional test cases
Partition behavior to reduce complexity
Boundary value analysis (see the sketch after this list)
Domain testing
Corner cases (“Bugs lurk in corners and congregate at boundaries”, Beizer)
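A minimal sketch of boundary value analysis in SystemVerilog; the 8-bit input domain and the DUT-free stimulus loop are assumed for illustration:

```systemverilog
// Boundary value analysis (sketch): exercise both ends of a
// hypothetical 8-bit input domain plus their nearest interior
// neighbors, where bugs "lurk in corners and congregate at
// boundaries" (Beizer).
module tb_boundary;
  logic [7:0] data;
  byte unsigned vals[] = '{8'd0, 8'd1, 8'd254, 8'd255};
  initial
    foreach (vals[i]) begin
      data = vals[i];
      // A real test bench would drive `data` into the DUT and check
      // the response; here we only log the applied stimulus.
      #1 $display("applied boundary value %0d", data);
    end
endmodule
```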
White Box Test Cases
Derive test cases based on design structure
Simple metric: “statement coverage”
Path tests are better, but the number of execution paths is huge
Path Testing Criteria
Execute all control flow paths – generally impossible to achieve
Statement testing – attempt 100% “code coverage” of all statements in the HDL
Branch testing – ensure that all branch alternatives are exercised at least once (a sketch contrasting statement and path coverage follows)
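The gap between statement and path testing can be seen in a small SystemVerilog sketch (the module and the two tests are hypothetical): the two tests below achieve 100% statement coverage, yet the saturating path is never exercised.

```systemverilog
// Statement coverage vs. path coverage (sketch).
module sat_add (
  input  logic [7:0] a, b,
  input  logic       clamp_en,
  output logic [8:0] y
);
  always_comb begin
    y = a + b;                          // statement 1
    if (clamp_en)
      y = (y > 9'd255) ? 9'd255 : y;    // statement 2
  end
endmodule

module tb;
  logic [7:0] a, b;
  logic clamp_en;
  logic [8:0] y;
  sat_add dut (.*);
  initial begin
    // Test 1 executes statement 1; test 2 also executes statement 2:
    // together, 100% statement coverage. But the clamping path
    // (clamp_en asserted with a + b > 255) is never exercised, so a
    // bug on that path would escape.
    a = 8'd10;  b = 8'd20;  clamp_en = 1'b0; #1 $display("y=%0d", y);
    a = 8'd100; b = 8'd100; clamp_en = 1'b1; #1 $display("y=%0d", y);
  end
endmodule
```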
Simulating Black Boxes
Focus on functional requirements
Generating Test Cases for Black-Box Designs
Check for incorrect or missing functions
Check for interface errors
Check for access to boundary signals
Check corner cases
What is the Fundamental Problem with Simulation?
Traversing States in the Design
Simulation traverses paths in the state graph
A bug may be associated with a specific transition from a specific state
Cannot guarantee that we will exercise that transition
Coverage-Driven Verification
Attempt to Verify that the Design Meets Verification Goals
Define all the verification goals up front in terms of “functional coverage points”
Each piece of functionality required to be tested in the design is described in terms of events, values, and combinations
Functional coverage points are coded into the HVL (Hardware Verification Language) environment (e.g., Specman ‘e’); a sketch follows this list
Simulation runs can be measured for the coverage they accomplish
Focus on tests that will accomplish the coverage (“coverage-driven testing”)
Then fix bugs, relax constraints, and improve the test environment
This gives a measurable metric for the verification effort
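A minimal sketch of functional coverage points; the slide names Specman ‘e’ as the HVL, but a SystemVerilog covergroup plays the same role here, and the signal names and bins are hypothetical:

```systemverilog
// Functional coverage points (sketch): events, values, and their
// combinations, sampled on every clock edge.
module cov_example (
  input logic       clk,
  input logic [3:0] opcode,
  input logic [7:0] level    // e.g., a FIFO fill level
);
  covergroup cg @(posedge clk);
    cp_op : coverpoint opcode {
      bins read  = {4'h2};
      bins write = {4'h3};
      bins other = default;
    }
    cp_lvl : coverpoint level {
      bins empty = {0};
      bins mid   = {[1:254]};
      bins full  = {255};
    }
    // Combinations: e.g., was a write ever attempted while full?
    op_x_lvl : cross cp_op, cp_lvl;
  endgroup
  cg cov = new();
endmodule
```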
Open Questions
Are There Better Measures of Coverage?
Coverage of statements in the RTL is necessary but not sufficient
Coverage of all states is impractical even for a design with a few hundred state variables
Is there a way to identify a subset of state variables that would be tractable, and would lead to better bug detection?
How would these variables be related to the behavior of the design?
Assertions
Assertions capture knowledge about how a design should behave
Used in coverage-based verification techniques
Assertions help to increase observability into a design, as well as the controllability of a design
Each assertion specifies
legal behavior of some part of the design, or
illegal behavior of part of the design
Examples of assertions (specified in a formal language; see the SVA sketch below)
The FIFO should not overflow
Some set of signals should be “one-hot”
If a signal occurs, then . . .
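The three examples can be written directly as a minimal SystemVerilog Assertions (SVA) sketch; all signal names and the 1-to-4-cycle response bound are assumptions for illustration:

```systemverilog
// The example assertions in SVA form (sketch).
module assertion_examples (
  input logic       clk, rst_n,
  input logic       wr_en, full,   // FIFO write interface
  input logic [3:0] state,         // one-hot encoded state
  input logic       req, gnt
);
  // "The FIFO should not overflow": never write while full.
  a_no_overflow: assert property
    (@(posedge clk) disable iff (!rst_n) !(wr_en && full));

  // "Some set of signals should be one-hot".
  a_onehot: assert property
    (@(posedge clk) disable iff (!rst_n) $onehot(state));

  // "If a signal occurs, then ...": here, a request must be
  // granted within 1 to 4 cycles (the bound is an assumption).
  a_req_gnt: assert property
    (@(posedge clk) disable iff (!rst_n) req |-> ##[1:4] gnt);
endmodule
```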
Simulation Monitors and Assertions
Commercial Tools
Mentor Graphics
Synopsys
Dramatic Example of Design Bug Detection and Recovery
BAE RAD750 (133 MHz) Science Lab on Mars
Mars Science Laboratory – MSL (“Curiosity”) relies extensively on the BAE RAD750 chip running at 133 MHz
During the cruise to Mars (circa January 2012), MSL processes were unexpectedly reset – no code bug was found (with 7 months remaining before landing)
The misbehavior was eventually traced to a processor hardware malfunction: instruction flow depended on processor temperature
The only possible fix was via software – done successfully