addressing power issues within an integrated rtl-to- gdsii flow fusion 2003 addressing power issues...
TRANSCRIPT
Addressing Power Issues Addressing Power Issues within an Integrated RTL-to-within an Integrated RTL-to-
GDSII FlowGDSII Flow
Fusion 2003Fusion 2003
Patrick GroeneveldPatrick Groeneveld
Magma Design Automation, Magma Design Automation,
P.G. Fusion2003 - 2
Agenda
• IntroductionIntroduction
• Tight integration through Magma data modelTight integration through Magma data model
• Using the tool: Using the tool: • TCL and GUI interface TCL and GUI interface
• Gate sizing technology for silicon integrityGate sizing technology for silicon integrity
• Power & voltage drop analysis with Blast RailPower & voltage drop analysis with Blast Rail
P.G. Fusion2003 - 3
Engineering an RTL to GDSII flow
Timing closure (parasitic cap.)Timing closure (parasitic cap.)
Tim
ing
analysis
Tim
ing
analysis
Parasitic extractio
n &
estim
ation
Parasitic extractio
n &
estim
ation
IR-d
rop
and
Po
wer an
alysisIR
-dro
p an
d P
ow
er analysis
DR
C/E
RC
DR
C/E
RC
synthesissynthesis
placeplace
routeroute
Gate sizing (gain based synthesis)Gate sizing (gain based synthesis)
Cloning, logic restructuring Cloning, logic restructuring
Load buffering Load buffering
Delay buffering Delay buffering
Timing/sizing driven placement Timing/sizing driven placement
Mapping for speed Mapping for speed
Useful skew clock synthesis Useful skew clock synthesis
Routing closureRouting closure
Congestion control Congestion control
Rip-up and reroute Rip-up and reroute
Design scale, concurrent designDesign scale, concurrent designHierarchy, Partitioning, design planningHierarchy, Partitioning, design planning
Large capacity and fast algorithms Large capacity and fast algorithms
Correct-by-construction tools Correct-by-construction tools
TestabilityTestability
BIST insertion BIST insertion
Scan chain reordering and routing Scan chain reordering and routing
ECO capabilityECO capability
Spare cell insertion Spare cell insertion
Clock skewClock skew
Balanced clock trees Balanced clock trees
Clock shielding Clock shielding
Dual-hierarchy supportDual-hierarchy support
Low power requirementsLow power requirements
Clock gating Clock gating
Multi-VDD regions Multi-VDD regions
Dual Vt support Dual Vt support
IR voltage drop, ElectromigrationIR voltage drop, Electromigration
Power infrastructure Power infrastructure
Decoupling caps, package designDecoupling caps, package design
High I/O countHigh I/O count
Flip-chip packaging Flip-chip packaging
Antenna rulesAntenna rules
Diode insertionDiode insertion
Antenna-friendly routing, jumper insertionAntenna-friendly routing, jumper insertionDSM mask rulesDSM mask rules
Filling, slotting, router adaptations Filling, slotting, router adaptations
Crosstalk noise & delayCrosstalk noise & delay
Wire shielding Wire shielding
Wire spacing Wire spacing
Hold time violationsHold time violations
Noise bufferingNoise buffering
Hold time bufferingHold time buffering
Yield, reliability, PVTYield, reliability, PVT
Wire wideningWire wideningWire spacing Wire spacing Etc. etc. etc.Etc. etc. etc.
Battling parasitic capacitancesBattling parasitic capacitancesBattling wire congestionBattling wire congestion
Block/macro placement Block/macro placement Block/macro placement Block/macro placement
P.G. Fusion2003 - 4
Compare: ‘Bolting’ a flow together using files
Net ListNet List
Physical Physical CompilerCompiler
PrimeTimePrimeTime
PLIBPLIB
LEFLEF
gds2lefgds2lef
lef2pliblef2plib
PrimeTimePrimeTime
LibertyLibertymodelmodel
FloorFloorplanplan
Jupiter/PlanetJupiter/Planet
RTLRTL
DesignDesignCompilerCompiler
Timing Timing ConstraintsConstraints
LIBLIB
Net ListNet List PDEFPDEF
Apollo IIApollo II
GDS IIGDS II
StarRCStarRC
SPEFSPEF
GDS IIGDS II(lib)(lib)
• The real value is in the The real value is in the flowflow, not in the , not in the algorithmsalgorithms
• Yet, design data is spread over many filesYet, design data is spread over many files• Files are big and slow to read.Files are big and slow to read.
• Redundant information is stored:Redundant information is stored:• less memory efficient.less memory efficient.
• Relevant information gets lost in the translationRelevant information gets lost in the translation• interpretation may not be consistentinterpretation may not be consistent
• DSM issues are everywhere, the file interface DSM issues are everywhere, the file interface makes dealing with them harder.makes dealing with them harder.
P.G. Fusion2003 - 5
Addressing Signal and Power Integrity
• Signal and power integrity pose major concerns for today’s DSM designsSignal and power integrity pose major concerns for today’s DSM designs
• Problem is compounded as timing, SI, and power are all inter-relatedProblem is compounded as timing, SI, and power are all inter-related
• Solutions available today rely on multiple point toolsSolutions available today rely on multiple point tools• Involves file transfers and data translation - leads to potential errorsInvolves file transfers and data translation - leads to potential errors
• Too many iterations between analysis and implementation toolsToo many iterations between analysis and implementation tools
• Lack of predictability may cause problems to be discovered too late – high cost of fixingLack of predictability may cause problems to be discovered too late – high cost of fixing
Voltage drop
Voltage-drop-Induced delay
P&R
Power consumption
Power planning
Timing
Crosstalk glitch
Crosstalk delay
Rail EM/Signal EM
Designers spend too much time converging to a solution
P.G. Fusion2003 - 6
Tool integration through a database
common data basewith all data.
api
internaldatastructure
Tool 1
api
internaldatastructure
Tool 2
api
internaldatastructure
Tool 3
api
internaldatastructure
Tool 4
api
internaldatastructure
Tool 189
api
internaldatastructure
Tool 190
e.g. GUIe.g. timer e.g. placer
• The database interfaces with the tools through and API or files.The database interfaces with the tools through and API or files.
• Each tool makes its own copy (data structure) of the data.Each tool makes its own copy (data structure) of the data.
• It takes memory and time to haul all this data aroundIt takes memory and time to haul all this data around
P.G. Fusion2003 - 7
The data model is key
in-corein-coreData ModelData Model
PlacementAlg.
RoutingAlg.
Tool nAlg.
...
TCLaccess
TimingAlg.
• Use a simple (data) Use a simple (data) model that can be used model that can be used by multiple algorithms by multiple algorithms without elaborate without elaborate translationstranslations
• Many interdependent Many interdependent problems will have to problems will have to be intertwined into a be intertwined into a flow.flow.
• Consistency of Consistency of interpretation is keyinterpretation is key
GUIaccess
VerificationAlg.
Volcano on disk
Externalformatsor tools
P.G. Fusion2003 - 8
Tight & efficient tool integration
RTLRTL
CDFG Net list of Super Cells
RoutingRouting
Mask layoutMask layout
Placement
P.G. Fusion2003 - 9
Victim netAggressors attacking
Victim netAggressors spaced out
Supporting cross talk fixing
• Post-route Silicon repair flow:Post-route Silicon repair flow:• parasitic extractionparasitic extraction• run timerrun timer• Determine/filter problem (e.g. crosstalk Determine/filter problem (e.g. crosstalk
noise)noise)• re-route victim:re-route victim:
• redo-global routeredo-global route• redo-detailed routeredo-detailed route
• And/or resize and buffer:And/or resize and buffer:• insert & place gateinsert & place gate• redo-global routeredo-global route• redo-detailed routeredo-detailed route
• parasitic extractionparasitic extraction• run timerrun timer• etc. etc. etc. etc. etc. etc.
• Traditionally: Export gds2, Traditionally: Export gds2, generate SPEF, run delay calc, generate SPEF, run delay calc, run timing, fix designrun timing, fix design
P.G. Fusion2003 - 10
Magma Data model supports incremental design and analysis
• You can add, change or delete You can add, change or delete • any objectany object
• at any timeat any time
• Such changes are tracked by the data model, it keeps Such changes are tracked by the data model, it keeps itself consistent at any time.itself consistent at any time.
• The incremental analysis tools detect such change The incremental analysis tools detect such change automatically automatically • … … and update of only the affected partsand update of only the affected parts
• The timer, extractor and DRC engine are brutally The timer, extractor and DRC engine are brutally incrementalincremental• They do not require to be started explicitly.They do not require to be started explicitly.
• Result: fast and simple operation.Result: fast and simple operation.
P.G. Fusion2003 - 11
Timingconstraints
‘Volcanoes’: snapshots of the flow
• The contents of the magma data structure can be written to disk at any The contents of the magma data structure can be written to disk at any time during the design flow. A volcano contains a complete snapshot of time during the design flow. A volcano contains a complete snapshot of allall design data. design data.
• Resume operation, or use as backup. Resume operation, or use as backup.
Magma flowMagma flowRTLRTL
Verilog/vhdl
GDSIIGDSIImaskdata
Designrules (.lef gds2)
Floorplan(.def).lib
floorplan.volcanologic.volcanortl.volcanolibrary.volcano place.volcano route.volcano final.volcano
600Mb30Mb
P.G. Fusion2003 - 12
Interfacing with the data model through TCL
• Complete access:Complete access:• inspect, modify or inspect, modify or
delete any object or delete any object or attributeattribute
• TCL scripts also TCL scripts also drive the flowdrive the flow
• Also the Graphical Also the Graphical User Interface User Interface interfaces through interfaces through TCLTCL• easy configuration and easy configuration and
adaptationadaptation
P.G. Fusion2003 - 13
Magma RTL-to-GDS flow outline in TCL (batch)
set m [import verilog mydesign.v]set m [import verilog mydesign.v]
import volcano library.volcanoimport volcano library.volcano
fix rtl $m $lfix rtl $m $l
fix time $m $lfix time $m $l
fix plan $m $lfix plan $m $l
fix cell $m $lfix cell $m $l
fix clock $m $lfix clock $m $l
fix wire $m $lfix wire $m $l
export volcano mydesign.volcanoexport volcano mydesign.volcano
export gdsii $m mydesign.gdsexport gdsii $m mydesign.gds
check model $m -level finalcheck model $m -level final
run route stub $mrun route stub $m
run route global $m -antennarun route global $m -antenna
run route track $m -optimize noiserun route track $m -optimize noise
run route power $m -finalrun route power $m -final
check route spacing_short $mcheck route spacing_short $m
check route open -segment $mcheck route open -segment $m
run route final $m -singlepassrun route final $m -singlepass
run route antenna $mrun route antenna $m
run route refine $mrun route refine $m
run route final -incremental $mrun route final -incremental $m
check route drc $mcheck route drc $m
P.G. Fusion2003 - 15
Lets start the tool
• Accounts are called magma1, magma2, Accounts are called magma1, magma2, through magma15.through magma15.
• Just log in and have fun.Just log in and have fun.
• In unix shell, type ‘mantle’In unix shell, type ‘mantle’
P.G. Fusion2003 - 16
Timing is poorly predictable
Gate-to-gate delay depends on:Gate-to-gate delay depends on:• Driver gate size. Driver gate size. • Wire length (unknown during logic synthesis)Wire length (unknown during logic synthesis)• The configuration of the neighboring wires:The configuration of the neighboring wires: distance, near/far (unknown before detailed routing) distance, near/far (unknown before detailed routing) • The layer of the wire (determined during routing)The layer of the wire (determined during routing)• Timing window and slope of the neighboring wires.Timing window and slope of the neighboring wires.
P.G. Fusion2003 - 17
slack
Conventional layout synthesis
slackCdream
s
Creal
P.G. Fusion2003 - 18
place & route
logic synthesis
Result: many iterations
P.G. Fusion2003 - 19
slack
Gain-based physical synthesis:
Cdream
s
Creal
P.G. Fusion2003 - 20
What happens when gates are sized?
Both inverters have Both inverters have
approx. the same approx. the same delay.delay.
Notice that the Notice that the
input capacitanceinput capacitance
increased 2xincreased 2x
inin
outoutvddvdd
vssvssinin
outoutvddvdd
vssvss vssvss
vddvdd
1x 2x
P.G. Fusion2003 - 21
Key: simple model for step B
Fixed part,Fixed part,
parasitic delayparasitic delayDelay of theDelay of the
gate + its loadgate + its load
ElectricalElectrical effort effort
proportional to output loadproportional to output loadCCloadload / C / Cinin
LogicalLogical effort effort
depends on depends on
function of gatefunction of gate
Delay = (g * h) + p Delay = (g * h) + p
Ivan Sutherland (1991):Ivan Sutherland (1991):
CloadCin
For details: See the book: ‘Logical Effort’ by Sutherland, Sproull, HarrisFor details: See the book: ‘Logical Effort’ by Sutherland, Sproull, Harris
Morgan Kaufmann publishers, ISBN 1-55860-557-6Morgan Kaufmann publishers, ISBN 1-55860-557-6
P.G. Fusion2003 - 22
Logical effort: g
• To keep the same output drive strength, the 2 n-transistors in series must double their To keep the same output drive strength, the 2 n-transistors in series must double their size.size.
• As a result, the input capacitance of the nand is larger.As a result, the input capacitance of the nand is larger.• For the same output drive strength, an inverter needs less input capacitance. For the same output drive strength, an inverter needs less input capacitance. • More complex gates have less gainMore complex gates have less gain
Inverter: Cin = 1Inverter: Cin = 1Nand 2: Cin = 4/3Nand 2: Cin = 4/3
P.G. Fusion2003 - 23
Logical effort: g
• Assuming that in static Assuming that in static CMOS gates the mobility of CMOS gates the mobility of the p-transistor is half of the p-transistor is half of the n-mobility:the n-mobility:
Gate 1 2 3 n
Inverter 1 - - -
Nand - 4/3 5/3 (n+2)/3
Nor - 5/4 7/3 (2n+1)/3
• 3-input nor3-input nor
P.G. Fusion2003 - 24
Gain-based synthesis: supercells
• We extract the super cells from the library description (.lib, lef)We extract the super cells from the library description (.lib, lef)
Super!
• Contains:Contains:• g, h, pg, h, p
• size-range size-range
P.G. Fusion2003 - 25
Aggressive gate sizing
• The gain ratio (=Cload/Cin) is maintained is The gain ratio (=Cload/Cin) is maintained is placementplacement
• Sizes change Sizes change duringduring placement. placement.• As a result, delay is kept (almost) constant.As a result, delay is kept (almost) constant.
Cload/Cin = fixed
P.G. Fusion2003 - 26
Pre-layout timing check
0.5ns 0.5ns 0.5ns 0.5ns
FFffffffff
• If there is no feasible gain assignment, the sizes literally ‘explode’.If there is no feasible gain assignment, the sizes literally ‘explode’.
• ESP: gain is a measure of the ‘tightness’ of the constraints ESP: gain is a measure of the ‘tightness’ of the constraints
P.G. Fusion2003 - 27
Load violations
• Maximum drive strength in the library might be too smallMaximum drive strength in the library might be too small• Drive information is stored in super cell, and managed pre-placement.Drive information is stored in super cell, and managed pre-placement.• Buffering, cloning and restructuring are used to maintain delay during placementBuffering, cloning and restructuring are used to maintain delay during placement
Cload
Cin
1x
2x4x
Permissiblerange
Load violation
P.G. Fusion2003 - 28
Keeping delay fixed
0.6ns 0.6ns 0.6ns 0.6ns
FF
• Actively managing wire Actively managing wire delay:delay:• Through automatic sizing Through automatic sizing
(sizing-driven placement)(sizing-driven placement)
• Through buffer insertionThrough buffer insertion
• CloningCloning
P.G. Fusion2003 - 29
Recap: What happened
……. at the logical-physical boundary?. at the logical-physical boundary?
• Delay fixedDelay fixed
• Cell Area unknownCell Area unknown• Sum of areas determines Sum of areas determines
chip sizechip size
• No iterations requiredNo iterations required• benefit: speedbenefit: speed
• All cells are sized, so All cells are sized, so each gate has exactly each gate has exactly the right drive strength:the right drive strength:
• Not too littleNot too little
• Not too much (waste of Not too much (waste of area)area)
• Avoids SI problemsAvoids SI problems
• Cell Area fixedCell Area fixed• Delay is a gambleDelay is a gamble• Worst case delay Worst case delay
determines timing (max)determines timing (max)• Iterate to make ends meet.Iterate to make ends meet.• Only critical paths are Only critical paths are
addressed. Therefore addressed. Therefore gates will be too big:gates will be too big:• waste of areawaste of area• waste of powerwaste of power• Cause unnecessary Cause unnecessary
agressorsagressors
Power analysis with blast railPower analysis with blast rail
P.G. Fusion2003 - 31
Integrated Signal & Power Integrity Solution
• Concurrently addresses Concurrently addresses crosstalk, power, voltage crosstalk, power, voltage drop, electromigration, OCV, drop, electromigration, OCV, and timing problemsand timing problems
• Correct-by-construction Correct-by-construction methodology eliminates methodology eliminates iterations and reduces iterations and reduces design cycle timedesign cycle time
• Patented unified data model Patented unified data model architecture enables on-the-architecture enables on-the-fly correctionfly correction
• Unified Implementation flow Unified Implementation flow – single executable– single executable
P.G. Fusion2003 - 32
Power & Voltage Drop Analysis Throughout The Flow• Enables analysis early and often to ensure predictabilityEnables analysis early and often to ensure predictability
• Power analysis – static and dynamicPower analysis – static and dynamic• Accounts for leakage, switching, and short-circuit (crowbar) powerAccounts for leakage, switching, and short-circuit (crowbar) power
• Easy, flexible setup from .libEasy, flexible setup from .lib• Switching activity can be specified or read in (VCD, GAF or SAIF)Switching activity can be specified or read in (VCD, GAF or SAIF)• Activity propagationActivity propagation
• Voltage drop analysis - dynamic and transientVoltage drop analysis - dynamic and transient• Dynamic looks at switching activities over multiple clock cyclesDynamic looks at switching activities over multiple clock cycles• Transient analysis accounts for voltage glitches within a cycleTransient analysis accounts for voltage glitches within a cycle
• Uses network parasitics - decoupling caps, package inductancesUses network parasitics - decoupling caps, package inductances
• Addresses timing changes due to voltage dropsAddresses timing changes due to voltage drops• Performs cell-based deratingPerforms cell-based derating• Calculates timing changes without going to external STACalculates timing changes without going to external STA• Concurrent optimization makes necessary design changesConcurrent optimization makes necessary design changes
P.G. Fusion2003 - 33
Concurrent Optimization For Timing, Power & SI
• Unique FixedTiming methodology is the keyUnique FixedTiming methodology is the key• Defers sizing decisions until later stageDefers sizing decisions until later stage• Sizes drivers optimally to match actual loadSizes drivers optimally to match actual load• Slew balancing minimizes victim/aggressor pairsSlew balancing minimizes victim/aggressor pairs
• Analyze-Avoid-Adjust methodology ensures signal integrityAnalyze-Avoid-Adjust methodology ensures signal integrity• Advanced analysis models identify potential problemsAdvanced analysis models identify potential problems• Optimization engines make on-the-fly changes to address themOptimization engines make on-the-fly changes to address them
• Power management throughout the flowPower management throughout the flow• Reduces power & area by ideal sizingReduces power & area by ideal sizing• Additional techniques used to reduce static and dynamic powerAdditional techniques used to reduce static and dynamic power
Design A Design A (100K)(100K)
Current Current Flow - CFlow - C
Magma Magma FlowFlow
Avg PowerAvg Power 160 mw160 mw 115 mw115 mw
Clock Tree Clock Tree componentscomponents
703703 383383
Std cell areaStd cell area 2.26 mm22.26 mm2 1.96 mm21.96 mm2
Critical pathCritical path 4.3 ns4.3 ns 4.1 ns4.1 ns
TATTAT 1 week1 week 5 hours5 hours
13% area reduction
30% power reduction
Better timing
Faster TAT
P.G. Fusion2003 - 34
Blast Rail power analysis steps in magma flow
Floor planning
Activity annotation &Activity annotation &propagationpropagation
Power AnalysisPower Analysis
Current & voltage dropCurrent & voltage dropcalculationcalculation
Power infrastructuregeneration (rails,
mesh)
RTL synthesis
Rail network extractionRail network extraction
Voltage dropand EM textual
reports
net listnet list
floor plan with orfloor plan with orwithout placed gateswithout placed gates
Physical synthesisand optimization flow
Voltage drop inducedVoltage drop induceddelaydelay
VCD file
Librarydata
Voltage & current sources, Voltage & current sources, resistancesresistances
Power consumption
report
Slews from Slews from the built-in the built-in timer/extractortimer/extractor
P.G. Fusion2003 - 35
A metal wire in a routing A metal wire in a routing layer is modeled by a layer is modeled by a rectangle, called a rectangle, called a segmentsegment, that has up to , that has up to four connection points and four connection points and four resistances:four resistances:
WW
LL
Rail Network Description Rail Extraction
Rx = Rsheet W/L
Ry = Rsheet L/W
Rx/2Rx/2
Ry/2
Ry/2
cp1
cp2
cp3
cp4
The network combinesThe network combines• Physical description of the power gridPhysical description of the power grid• Electrical description of the power gridElectrical description of the power grid
P.G. Fusion2003 - 36
Network Features
Electrical nodesElectrical nodes• Node numberNode number
• LayerLayer
• List of segmentsList of segments
• Voltage (user-Voltage (user-specified, extracted, specified, extracted, calculated)calculated)
SegmentsSegments• Rectangular shapeRectangular shape• Layer *Layer *• Resistance (or more) *Resistance (or more) *• Up to four nodesUp to four nodes• List of pins (cell pins List of pins (cell pins
and model pins)and model pins)
Rail Extraction
P.G. Fusion2003 - 37
Blast Rail power analysis steps in magma flow
Floor planning
Activity annotation &Activity annotation &propagationpropagation
Power AnalysisPower Analysis
Current & voltage dropCurrent & voltage dropcalculationcalculation
Power infrastructuregeneration (rails,
mesh)
RTL synthesis
Rail network extractionRail network extraction
Voltage dropand EM textual
reports
net listnet list
floor plan with orfloor plan with orwithout placed gateswithout placed gates
Physical synthesisand optimization flow
Voltage drop inducedVoltage drop induceddelaydelay
VCD file
Librarydata
Voltage & current sources, Voltage & current sources, resistancesresistances
Power consumption
report
Slews from Slews from the built-in the built-in timer/extractortimer/extractor
P.G. Fusion2003 - 38
• The switching activity of a signal The switching activity of a signal describes the following characteristics:describes the following characteristics:
• The probability that the signal is logic one The probability that the signal is logic one
• The rising toggle rate The rising toggle rate
• The falling toggle rateThe falling toggle rate
• By default, the tool is looking at the By default, the tool is looking at the switching activity over a one second time switching activity over a one second time interval.interval.
Defining Switching Activity Activityannotation
P.G. Fusion2003 - 39
• Option 1: Specify switching activity using:Option 1: Specify switching activity using:
force activity annotateforce activity annotate
• Example: Basic clock running at 150MHz and non-Example: Basic clock running at 150MHz and non-clock nets at 10% of clockclock nets at 10% of clock
force activity annotate $m -probability 0.2 -toggle force activity annotate $m -probability 0.2 -toggle 15e615e6
data loop net model_net $m {data loop net model_net $m {
if { [query net is_clock $net] } {if { [query net is_clock $net] } {
force activity annotate $net -probability 0.5force activity annotate $net -probability 0.5 -toggle 300e6 -toggle 300e6
}}
}}
Ways to define Switching Activity Activityannotation
P.G. Fusion2003 - 40
• Option 2: Import VCD (value change dump) format Option 2: Import VCD (value change dump) format data from logic simulation using:data from logic simulation using:
import vcdimport vcd
• Option 3: Let tool set switching activity based on Option 3: Let tool set switching activity based on clock defined for static timing analysisclock defined for static timing analysis• The clock nets are switching twice (rise/fall) per cycleThe clock nets are switching twice (rise/fall) per cycle
• The non-clock nets are set to 50% of maximum clock The non-clock nets are set to 50% of maximum clock frequency found from input conefrequency found from input cone
• Nets without clock phase information are set to 0 unless Nets without clock phase information are set to 0 unless default activity is defineddefault activity is defined• Example: Example: config activity default 1e6config activity default 1e6
Activityannotation
Ways to define Switching Activity
P.G. Fusion2003 - 41
• Option 4: Propagate switching activity using:Option 4: Propagate switching activity using:
force activity propagateforce activity propagate
• Propagates switching activity from primary inputs Propagates switching activity from primary inputs and sequential element outputsand sequential element outputs
• Logic function of gate is used to obtain switching Logic function of gate is used to obtain switching activity at the gate outputsactivity at the gate outputs
Propagating switching activity Activityannotation
Notation: { probability, toggle rate }
{0.2,10}
{0.7,5}
{0.3,2}
{0.21,2.9}
{0.8,10}
{0.842,8.48}
P.G. Fusion2003 - 42
Blast Rail power analysis steps in magma flow
Floor planning
Activity annotation &Activity annotation &propagationpropagation
Power AnalysisPower Analysis
Current & voltage dropCurrent & voltage dropcalculationcalculation
Power infrastructuregeneration (rails,
mesh)
RTL synthesis
Rail network extractionRail network extraction
Voltage dropand EM textual
reports
net listnet list
floor plan with orfloor plan with orwithout placed gateswithout placed gates
Physical synthesisand optimization flow
Voltage drop inducedVoltage drop induceddelaydelay
VCD file
Librarydata
Voltage & current sources, Voltage & current sources, resistancesresistances
Power consumption
report
Slews from Slews from the built-in the built-in timer/extractortimer/extractor
P.G. Fusion2003 - 43
Power Analysis - CMOS dissipation
Gnd
Vdd
Cload
Vin Vout
Gnd Gnd
Vdd
Cload
Vin Vout
Gnd
What influences What influences current?current?
• event/stateevent/state
• slopeslope
• loadload
• PVTPVT
P.G. Fusion2003 - 44
Power calculation
Given cell with pins:Given cell with pins:
• State = stable state of pin-values State = stable state of pin-values
• Event = change of (any) pin-valueEvent = change of (any) pin-value
D
CL
Q
• State related dissipation State related dissipation • leakageleakage
• bias currents (typ. zero for CMOS)bias currents (typ. zero for CMOS)
• Event related dissipationEvent related dissipation• dynamic (cap. loading)dynamic (cap. loading)
• short circuitshort circuit
• Where do we get it from?Where do we get it from?• .lib, CV.lib, CV22
P.G. Fusion2003 - 45
Blast Rail power analysis steps in magma flow
Floor planning
Activity annotation &Activity annotation &propagationpropagation
Power AnalysisPower Analysis
Current & voltage dropCurrent & voltage dropcalculationcalculation
Power infrastructuregeneration (rails,
mesh)
RTL synthesis
Rail network extractionRail network extraction
Voltage dropand EM textual
reports
net listnet list
floor plan with orfloor plan with orwithout placed gateswithout placed gates
Physical synthesisand optimization flow
Voltage drop inducedVoltage drop induceddelaydelay
VCD file
Librarydata
Voltage & current sources, Voltage & current sources, resistancesresistances
Power consumption
report
Slews from Slews from the built-in the built-in timer/extractortimer/extractor
P.G. Fusion2003 - 46
Viewing the Voltage Drop Report
############################################################################################################################################
# Mantle analysis report# Mantle analysis report
# Command:# Command:
# report rail analysis vdrop \# report rail analysis vdrop \
# /work/incrementer/incrementer/net:VDD –max_nodes 4# /work/incrementer/incrementer/net:VDD –max_nodes 4
# Date: Fri Oct 4 01:40:10 2002# Date: Fri Oct 4 01:40:10 2002
# Version: mantle.linux version 3.2.a.50-linux22_x86# Version: mantle.linux version 3.2.a.50-linux22_x86
# IR-Drop Analysis Configuration:# IR-Drop Analysis Configuration:
# IR-drop iteration mode: off# IR-drop iteration mode: off
# Automatic power pad pin detection is turned on.# Automatic power pad pin detection is turned on.
############################################################################################################################################
IR-drop on net: /work/design/design/net:VDD (2.300 V).IR-drop on net: /work/design/design/net:VDD (2.300 V).
Node x (um) y (um) layer voltage dropNode x (um) y (um) layer voltage drop
------ -------- -------- ------ ------------------ -------- -------- ------ ------------
70 19.020 45.760 metal1 783.1 uV70 19.020 45.760 metal1 783.1 uV
81 26.940 45.760 metal2 783.1 uV81 26.940 45.760 metal2 783.1 uV
132 17.820 45.760 metal2 782.8 uV132 17.820 45.760 metal2 782.8 uV
67 11.055 45.760 metal1 782.0 uV67 11.055 45.760 metal1 782.0 uV
P.G. Fusion2003 - 47
Viewing Results in the GUI
Open the Open the Power/Rail AnalyzerPower/Rail Analyzer form form
from the Layout windowfrom the Layout window
Tools -> Power/Rail AnalyzerTools -> Power/Rail Analyzer
GUI
P.G. Fusion2003 - 48
Power Consumption Map GUI
P.G. Fusion2003 - 49
Power consumption in a single clock cycle
P.G. Fusion2003 - 50
Power Density Map GUI
P.G. Fusion2003 - 51
Current Map GUI
P.G. Fusion2003 - 52
Wire VDrop Map GUI
P.G. Fusion2003 - 53
Cell VDrop Map GUI
P.G. Fusion2003 - 54
Global VDrop Map (Oilmap) GUI
P.G. Fusion2003 - 55
Using the Query Panel GUI
P.G. Fusion2003 - 56
Using the Query Panel GUI
P.G. Fusion2003 - 57
Pre- and post-placement voltage drop maps
Unplaced
Placed
P.G. Fusion2003 - 58
Using vdrop analysis to debug power design
Missing connection
to the pad in the lower left corner
P.G. Fusion2003 - 59
Next round …
… other problems
The power mesh is stopping too early. This leads to weak connected cell rows.
P.G. Fusion2003 - 60
Leakage Power Optimization - Multi-Vt Libraries
Magma provides automated solution to analyze and Magma provides automated solution to analyze and optimize using multiple Vt librariesoptimize using multiple Vt libraries
• Optimization to reduce leakage powerOptimization to reduce leakage power
• Optimizes to fast, high leakage cells for timing critical pathsOptimizes to fast, high leakage cells for timing critical paths
• Utilizes slower, low leakage cells for non-timing critical pathsUtilizes slower, low leakage cells for non-timing critical paths
• Understands timing vs power and area vs power tradeoffs to Understands timing vs power and area vs power tradeoffs to achieve best performanceachieve best performance
• Example - leakage power numbers after optimizationExample - leakage power numbers after optimization
Regular flow: 370 μWMulti-Vt flow: 96.2 μW (1.1% low Vt cells)
P.G. Fusion2003 - 61
Blast Rail power analysis steps in magma flow
Floor planning
Activity annotation &Activity annotation &propagationpropagation
Power AnalysisPower Analysis
Current & voltage dropCurrent & voltage dropcalculationcalculation
Power infrastructuregeneration (rails,
mesh)
RTL synthesis
Rail network extractionRail network extraction
Voltage dropand EM textual
reports
net listnet list
floor plan with orfloor plan with orwithout placed gateswithout placed gates
Physical synthesisand optimization flow
Voltage drop inducedVoltage drop induceddelaydelay
VCD file
Librarydata
Voltage & current sources, Voltage & current sources, resistancesresistances
Power consumption
report
Slews from Slews from the built-in the built-in timer/extractortimer/extractor
P.G. Fusion2003 - 62
Voltage-Drop-Induced Delay
• Voltage drops can affect cell performance – this in turn can affect Voltage drops can affect cell performance – this in turn can affect critical path timingcritical path timing
• Sizing changes to address timing can affect power – which in turn Sizing changes to address timing can affect power – which in turn affects voltage dropaffects voltage drop
• Blast Rail provides concurrent timing, power, and voltage drop Blast Rail provides concurrent timing, power, and voltage drop analysis to measure these inter-relationshipsanalysis to measure these inter-relationships
• Uses k-factor based derating functions specified in .lib format.Uses k-factor based derating functions specified in .lib format.• Supports multiple libraries, each with its own specific NVPTSupports multiple libraries, each with its own specific NVPT
• Concurrent optimization tunes cell sizing as required to avoid Concurrent optimization tunes cell sizing as required to avoid potential timing problemspotential timing problems
timing powervoltage
dropP V(0) T
PV(1)T…PV(n)T
P.G. Fusion2003 - 63
Electromigration: wires wear out!
electronselectrons
Contact(tungsten)
‘reservoir’
‘End-of-line’overhang
Cavities in wire
Nor
te-D
ame
Mic
roel
ectr
onic
s La
bora
tory
P.G. Fusion2003 - 64
Dealing with Electromigration
• A statistical effect, resulting in a gradual increase of the wire resistance, A statistical effect, resulting in a gradual increase of the wire resistance, followed by failure.followed by failure.
• The time that 50% of the wires fail is given by:The time that 50% of the wires fail is given by:
kT
E
f
a
eJ
At
*1
*2
• Depends on the current density JDepends on the current density J
• Widening wires would helpWidening wires would help
• AC and DC Electromigration are very different.AC and DC Electromigration are very different.
P.G. Fusion2003 - 65
Rail Electromigration Analysis
• EM induced failuresEM induced failures• Short term failure: Short term failure: peak current density analysispeak current density analysis
• Long term failure: Long term failure: average current density analysisaverage current density analysis
• Joule heating: Joule heating: RMS current density analysisRMS current density analysis
• Width dependent, peak/average/rms dc current Width dependent, peak/average/rms dc current density limits for each layer and viadensity limits for each layer and via• Current density checks against limitsCurrent density checks against limits
• MTF calculation using Black’s formulaMTF calculation using Black’s formula
• report rail analysis em report rail analysis em objectobject
• UseUse rule em limit wire rule em limit wire to define current density limitsto define current density limits..
P.G. Fusion2003 - 66
Viewing Rail EM Results
Rail EM ReportRail EM Report
###################################################################### ###################################################################### # Mantle analysis report# Mantle analysis report# Command:# Command:# report rail analysis em \# report rail analysis em \# /work/incrementer/incrementer \# /work/incrementer/incrementer \# -number 5 \# -number 5 \# -all_segments \# -all_segments \# -file em.rpt # -file em.rpt # Date: Fri Jan 24 08:50:53 2003# Date: Fri Jan 24 08:50:53 2003# Version: mantle.linux version 4.0.a.8-linux22_x86 # Version: mantle.linux version 4.0.a.8-linux22_x86 ############################################################################################################################################
Net Actual Allowed Slack Layer Location Width/ Req. Width/ RelativeNet Actual Allowed Slack Layer Location Width/ Req. Width/ Relative
Current Current #Vias #Vias SlackCurrent Current #Vias #Vias Slack
--- ------------ ----------- ---------- ------ ------ ------ -------- ------- ---- --- ------------ ----------- ---------- ------ ------ ------ -------- ------- ---- VSS 0.00540 mA/um 0.10 mA/um 0.04 mA/um metal1 31.92u 30.64u 0.340 um 0.000 um 1 VSS 0.00540 mA/um 0.10 mA/um 0.04 mA/um metal1 31.92u 30.64u 0.340 um 0.000 um 1 VDD 0.00220 mA/um 0.10 mA/um 0.07 mA/um metal1 29.04u 44.96u 0.800 um 0.000 um 1 VDD 0.00220 mA/um 0.10 mA/um 0.07 mA/um metal1 29.04u 44.96u 0.800 um 0.000 um 1 VDD 0.00150 mA/um 0.10 mA/um 0.08 mA/um metal1 29.04u 34.88u 0.660 um 0.000 um 1 VDD 0.00150 mA/um 0.10 mA/um 0.08 mA/um metal1 29.04u 34.88u 0.660 um 0.000 um 1 VDD 0.00610 mA/um 0.10 mA/um 0.03 mA/um metal1 27.72u 24.80u 0.800 um 0.000 um 1 VDD 0.00610 mA/um 0.10 mA/um 0.03 mA/um metal1 27.72u 24.80u 0.800 um 0.000 um 1 VDD 0.00450 mA/um 0.10 mA/um 0.05 mA/um metal1 28.61u 16.00u 0.340 um 0.000 um 1 VDD 0.00450 mA/um 0.10 mA/um 0.05 mA/um metal1 28.61u 16.00u 0.340 um 0.000 um 1
Rail EM