Post on 20-Dec-2015
Opportunities and Challenges for Better Than Worst-Case Design
Todd Austin (presenter)
Valeria Bertacco
David Blaauw
Trevor Mudge
University of Michigan
Traditional Worst-Case Design
Design-time verification and optimization
[Figure: Time-to-Market and Performance, each shown on a low (L) to high (H) scale]

Better Than Worst-Case Design
Run-time verification (online checker hardware) and typical-case optimization
[Figure: Time-to-Market and Performance, each shown on a low (L) to high (H) scale]
Addressing Challenges in the Nanometer Regime
Design complexity
  Billions and billions of transistors lead to untenable designs…
Soft errors: upsets in logic and memory
  Cosmic rays, alpha particles, neutrons, etc.
Uncertainty in design parameters
  Process and temperature variation, supply noise…
Power/performance demands
  Bounding performance, area, and battery life
Example BTWC Design: DIVA Checker
All core function is validated by the checker
  Simple checker detects and corrects faulty results, restarts core
Checker relaxes burden of correctness on core processor
  Tolerates design errors, electrical faults, defects, and failures
  Core has burden of accurate prediction, as checker is 15x slower
  Core does heavy lifting, removes hazards that slow checker
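[Figure: DIVA pipeline. The core (IF, ID, REN, REG, SCHEDULER, EX/MEM), responsible for performance, passes speculative instructions in order, with PC, instruction, inputs, and address, to the checker (CHK, CT), responsible for correctness. The checker is the online checker hardware.]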
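The division of labor between core and checker can be sketched in a few lines of Python. This is only an illustrative model, not the DIVA implementation: the three-operation instruction set, the `CoreResult` record, and the `check_and_commit` function are hypothetical names introduced here. The core hands over each instruction in program order together with its inputs and speculative result. The checker recomputes the result with simple logic and, on a mismatch, commits its own value and flags a core restart.

```python
from dataclasses import dataclass
import operator

# Hypothetical three-operation ISA, used only for this illustration.
OPS = {"add": operator.add, "sub": operator.sub, "and": operator.and_}

@dataclass
class CoreResult:
    """What the core sends to the checker, in program order."""
    pc: int
    op: str
    a: int
    b: int
    result: int  # speculative result computed by the complex core

def check_and_commit(stream):
    """Re-execute each instruction with simple logic and commit the checked value.

    Returns (committed_values, restart_pcs). A mismatch means the core was
    wrong: the checker's value is committed and the core is restarted.
    """
    committed, restarts = [], []
    for r in stream:
        correct = OPS[r.op](r.a, r.b)
        if correct != r.result:
            restarts.append(r.pc)   # flush the core and restart after this instruction
        committed.append(correct)   # architected state only ever sees checked values
    return committed, restarts

stream = [
    CoreResult(pc=0, op="add", a=2, b=3, result=5),
    CoreResult(pc=4, op="sub", a=9, b=4, result=6),  # injected fault: 9 - 4 is 5
    CoreResult(pc=8, op="and", a=6, b=3, result=2),
]
committed, restarts = check_and_commit(stream)
print(committed, restarts)
```

Because only checked values are committed, a rare fault in the core costs a restart rather than correctness, which is why the core can be designed for the typical case.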
Another BTWC Design: Razor Logic
[Figure: Razor flip-flop. A main flip-flop clocked by clk is paired with a shadow latch clocked by the delayed clock clk_del.]
Double-sampling metastability-tolerant latches detect timing errors
  Second sample is correct-by-design
Microarchitectural support restores state
  Timing errors treated like branch mispredictions
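A behavioral sketch of the double-sampling idea follows. It is a simplified model under stated assumptions, not the Razor circuit: the data signal is reduced to a single arrival time, metastability is not modeled, and the function name `razor_sample` and its parameters are hypothetical. The main flip-flop samples at the clock edge, the shadow latch samples `shadow_delay` later, and a disagreement between the two is reported as a timing error.

```python
def razor_sample(old_value, new_value, arrival_time, clock_period, shadow_delay):
    """Model one clock cycle of a double-sampling flip-flop.

    The main flip-flop samples at `clock_period`; the shadow latch samples
    `shadow_delay` later. Data that arrives after the main edge but before
    the delayed edge is caught as a timing error.
    """
    if arrival_time > clock_period + shadow_delay:
        # The shadow latch is only correct by design if the data always
        # arrives before the delayed clock edge.
        raise ValueError("arrival violates the shadow latch timing constraint")
    main = new_value if arrival_time <= clock_period else old_value
    shadow = new_value
    error = main != shadow
    return main, shadow, error

on_time = razor_sample(0, 1, arrival_time=0.8, clock_period=1.0, shadow_delay=0.5)
late = razor_sample(0, 1, arrival_time=1.2, clock_period=1.0, shadow_delay=0.5)
print(on_time)  # main and shadow agree, no error
print(late)     # main holds the stale value, shadow holds the correct one
```

On an error, the shadow value is the one the pipeline would restore, in the same spirit as recovering from a branch misprediction.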
Distributed Pipeline Recovery
[Figure: five-stage pipeline (IF, ID, EX, MEM (read-only), WB (reg/mem)) with Razor FFs between stages and a stabilizer FF. Each stage has error, bubble, recover, and flushID signals; a flush control unit drives the PC. The slide animates a cycle-by-cycle timeline for instructions inst1 through inst8. This logic is the online checker hardware.]
Builds on existing branch prediction framework
Multiple cycle penalty for timing failure
Scalable design as all communication is local
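To get a feel for what a multiple-cycle penalty costs, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that timing failures strike instructions independently with probability `error_rate` and that each failure adds a fixed `penalty_cycles`; the function name and the example values are hypothetical and not taken from the slides.

```python
def effective_ipc(error_rate, penalty_cycles, base_ipc=1.0):
    """Average IPC when a fraction `error_rate` of instructions each add `penalty_cycles`."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be a probability")
    cycles_per_inst = 1.0 / base_ipc + error_rate * penalty_cycles
    return 1.0 / cycles_per_inst

for rate in (0.0, 0.001, 0.01, 0.1):
    print(f"error rate {rate:>5}: IPC = {effective_ipc(rate, penalty_cycles=5):.3f}")
```

Under this model the throughput loss stays small as long as failures are infrequent, which is the regime the design targets.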
Opportunities for CAD
Key observation: Infrequent faults in the core design are tolerable.
Opportunities:
  Focus only on the critical components, no need to verify ad infinitum
  Optimize performance/power for the most common scenarios (typical-case optimization)
Razor Opportunity: Typical-Case Energy Reduction
[Figure: voltage control loop. Error signals from the pipeline give Esample; Ediff = Eref - Esample feeds the voltage control function, which drives the voltage regulator that sets the pipeline Vdd. The pipeline also has a reset input.]
Energy reduction can be realized with a simple proportional control function
  Control algorithm implemented in software
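A minimal sketch of such a proportional control function is shown below, using the slide's definition Ediff = Eref - Esample. Everything else is an assumption made for illustration: the gain, the voltage limits, and the `toy_error_rate` pipeline model are hypothetical and not taken from the Razor work.

```python
def proportional_step(vdd, e_ref, e_sample, gain, v_min, v_max):
    """One control iteration: compute Ediff = Eref - Esample, then adjust Vdd."""
    e_diff = e_ref - e_sample
    # Fewer errors than the target (e_diff > 0) means there is slack: lower Vdd.
    # More errors than the target (e_diff < 0) means Vdd is too low: raise it.
    new_vdd = vdd - gain * e_diff
    return min(v_max, max(v_min, new_vdd))

def toy_error_rate(vdd, v_onset=1.6, slope=0.5):
    """Hypothetical pipeline: no errors at or above v_onset, linear growth below it."""
    return min(1.0, max(0.0, slope * (v_onset - vdd)))

vdd = 1.8
for _ in range(200):
    e_sample = toy_error_rate(vdd)
    vdd = proportional_step(vdd, e_ref=0.01, e_sample=e_sample,
                            gain=1.0, v_min=1.0, v_max=1.8)
print(f"settled Vdd = {vdd:.3f} V, error rate = {toy_error_rate(vdd):.4f}")
```

In this toy setting the loop lowers the supply until the sampled error rate matches the reference. A real controller would also have to account for regulator latency and noise in the error samples.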
Energy/Performance Characteristics
[Figure: energy and pipeline throughput (IPC) plotted against decreasing supply voltage. Curves: energy of processor operations, Eproc; energy of pipeline recovery, Erecovery; total energy, Etotal = Eproc + Erecovery, with the optimal Etotal marked; energy of processor without Razor support. Annotations: 50% and 1%.]
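The trade-off Etotal = Eproc + Erecovery can be explored numerically with a toy model. The curves below are assumptions chosen only to reproduce the qualitative shape of the plot: Eproc is taken as proportional to Vdd squared, and the error rate is a made-up exponential that grows as the supply drops. None of the constants come from the slides.

```python
import math

def e_proc(v):
    """Energy per instruction for normal operation, taken as proportional to Vdd^2."""
    return v * v

def error_rate(v, v_onset=1.5, steepness=25.0):
    """Hypothetical error rate: negligible above v_onset, rising steeply below it."""
    return min(1.0, 1e-6 * math.exp(steepness * (v_onset - v)))

def e_recovery(v, penalty_cycles=10):
    """Extra energy spent re-executing after timing errors."""
    return error_rate(v) * penalty_cycles * e_proc(v)

def e_total(v):
    return e_proc(v) + e_recovery(v)

# Sweep the supply voltage from 0.90 V to 1.80 V in 10 mV steps.
voltages = [0.9 + 0.01 * i for i in range(91)]
v_opt = min(voltages, key=e_total)
print(f"optimal Vdd ~ {v_opt:.2f} V, Etotal = {e_total(v_opt):.3f}, "
      f"error rate = {error_rate(v_opt):.4f}")
```

In this model the minimum of Etotal sits below the point where errors first appear: accepting a small error rate saves more in Eproc than it costs in Erecovery.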
Razor Opportunity: Typical-Case Optimized Adder
Kogge-Stone Adder
[Figure: Kogge-Stone adder with generate/propagate inputs G0/P0 through G15/P15 and carry-in Cin …]
Carry Propagations for Random Data
[Figure: probability plotted against bit position and carry distance; the probability axis runs from 0 to 0.01.]
Carry Propagations for Typical Data
[Figure: probability plotted against bit position and carry distance; the probability axis runs from 0 to 0.16.]
Typical Case Optimized Adder
[Figure: adder with generate/propagate inputs G0/P0 through G15/P15 and carry-in Cin …, built from a ripple carry circuit and a carry-lookahead circuit.]
Benefits of Typical Case Optimization
Latency (in gate delays)
Adder Topology   Worst-Case   Typical-Case   Random
Kogge-Stone      8            5.08           7.09
TCO Adder        128          3.03           3.69
Typical-case performance much better than worst case
  Especially for typical-case optimized design
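The gap between worst-case and typical-case latency comes from how far carries actually travel. The sketch below measures the longest carry chain in an addition and compares uniformly random 64-bit operands with small operands. Treating small 8-bit values as "typical" data is a crude stand-in chosen here for illustration; the slides do not say which workloads produced the typical-case numbers, and the function names are hypothetical.

```python
import random

def longest_carry_chain(a, b, width=64):
    """Longest distance (in bit positions) that any single carry travels in a + b."""
    run, longest = 0, 0
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        if x & y:                # generate: a new carry starts at this bit
            run = 1
        elif (x ^ y) and run:    # propagate: an incoming carry moves one bit further
            run += 1
        else:                    # kill, or propagate with no incoming carry
            run = 0
        longest = max(longest, run)
    return longest

def mean_chain(operand_bits, samples=2000, seed=0):
    """Average longest carry chain over random operands of the given bit width."""
    rng = random.Random(seed)
    total = sum(
        longest_carry_chain(rng.getrandbits(operand_bits), rng.getrandbits(operand_bits))
        for _ in range(samples)
    )
    return total / samples

print("worst case, 0xFFFF + 1:", longest_carry_chain(0xFFFF, 1))
print("mean, random 64-bit operands:", mean_chain(64))
print("mean, small 8-bit operands:", mean_chain(8))
```

Long carry chains are rare even for random data, and rarer still for small operands, which is the property a typical-case optimized adder exploits while relying on error detection for the rare long chain.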
Core CAD Requirement: Observability of Circuit-Level Characteristics
[Figure: circuit-aware simulation flow. The application (App) and architecture configuration (Arch Config) feed the architectural simulator (IF, ID, EX, MEM, WB), which outputs architectural metrics and provides speed and scope. It passes inputs, voltage, and constraints to the circuit simulator, which uses module circuit models and technology models, returns delay, power, and switching, outputs circuit metrics, and provides fidelity and observability.]
Circuit-Aware Architectural Simulator efficiently melds circuit simulation with architectural simulation
Additional CAD Opportunities
For synthesis:
  Typical-case library characterization (e.g., pdf of delay)
  Synthesize design for target performance, power, etc.
  TCO-style optimizations possible for macro-modules
For verification:
  Full formal verification for checker components
  Profile-directed simulation-based verification for core
For testing:
  Checker component can facilitate software-based manufacturing test of core components
Conclusions
Better than worst-case design abandons traditional worst-case design constraints
Couples complex designs with checkers
Enables CAD opportunities for typical-case optimization
Requires tool support for observability, synthesis and verification
For more information:
http://www.eecs.umich.edu/razor
First tutorial at DATE, Munich, March 2005