ee241 - spring 2013bwrcs.eecs.berkeley.edu/classes/icdesign/ee241_s13/lectures/lectu… ·...
TRANSCRIPT
1
EE241 - Spring 2013Advanced Digital Integrated Circuits
Lecture 13: Timing revisited
2
Announcements
Homework 2 due today
Quiz #2 on Monday
Midterm project report due next Wednesday
2
3
Outline
Last lectureECC in SRAM
SRAM scaling options
This lectureBack to timing
Latch-based timing
Variability and timing
4. Design for performance
A. Timing
3
5
Flip-Flop Parameters
D
Clk
Q
D
Q
Clk
TCQ
TH
PWm
TSU
Delays can be different for rising and falling data transitions
6
Latch Parameters
D
Clk
Q
D
Q
Clk
TCQ
TH
PWmTSU
TDQ
Delays can be different for rising and falling data transitions
Unger and TanTrans. on Comp.10/86
4
7
Example Clock System
Courtesy of IEEE Press, New York. 2000
8
Clock Nonidealities
Clock skewSpatial variation in temporally equivalent clock edges; deterministic + random, tSK
Clock jitterTemporal variations in consecutive edges of the clock signal; modulation + random noise
Cycle-to-cycle (short-term) - tJSLong-term - tJL
Variation of the pulse width for level-sensitive clocking
5
9
Clock Skew and Jitter
Both skew and jitter affect the effective cycle time
Only skew affects the race margin, if jitter is from the sourceDistribution-induced jitter affects both
Clk1
Clk2
tSK
tJS
10
Clock Uncertainties
2
4
3
Power Supply
Interconnect
5 Temperature
6 Capacitive Load
7 Coupling to Adjacent Lines
1 Clock Generation
Devices
Sources of clock uncertainty
6
11
Clock Constraints in Edge-Triggered Systems
Courtesy of IEEE Press, New York. 2000
12
Latch timing
D
Clk
Q
tDQ
tCQ
When data arrives to transparent latch
When data arrives to closed latch
Data has to be ‘re-launched’
Latch is a ‘soft’ barrier
7
13
Single-Phase Clock with Latches
Latch
Logic
Clk
Clk
tCY
PW
Tskl Tskl TsktTskt
Unger and TanTrans. on Comp.10/86
sktsklsk TTT
In Rabaey Chapter 10:
14
Preventing Late Arrivals
Clk
tCY PW
TSU Datamustarrive
Clk
TCQ TLM
TSU
Clk
TDQ TLM
TSUPW
TSU
8
15
Preventing Late Arrivals
,max
skl skt SU CQMCY LM
DQM
T T T T PWt T
T
CY CQM LM SU skl sktt T T T T T PW
CY DQM LMt T T
Or:
16
Preventing Premature Arrivals
ClkTHPW
TCQ TLm
Two cases, reduce to one:
Lm skl skt H CQmT T T T PW T
9
17
Single-Latch Timing Summary
Latch
Logic
Clk
,max
skl skt SU CQMCY LM
DQM
T T T T PWt T
T
Lm skl skt H CQmT T T T PW T
Bounds on logic delay:
Solutions: 1) Balance logic delays2) Locally generated short PW
18
Latch-Based Design
L1Latch
Logic
Logic
L2Latch
Clk
L1 latch is transparentwhen Clk = 0
L2 latch is transparent when Clk = 1
10
19
Latch-Based Timing
As long as transitions are within the assertion period of the latch,no impact of position of clock edges
20
Latch Design and Hold Times
11
21
Latch-Based Timing
Longest path
2CY DQM LHM LLMT T T T
l Short paths
CLLm SK H CQmT T T T
CLHm SK H CQmT T T T
Same as register-based design but holdsfor both clock edges
Independent of skew
22
Latch-Based Timing
L1Latch
Logic
Logic
L2Latch
Clk
Clk
Clk
L1 latch
L2 latch
Skew
Can tolerate skew!
Longpath
Shortpath
Static logic
12
23
Soft-Edge Properties of Latches
Slack passing – logical partition uses left over time (slack) from the previous partitionTime borrowing – logical partition utilizes a portion of time allotted to the next partitionMakes most impact in unbalanced pipelines
Bernstein et al, Chapter 8, Chandrakasan, Chap 11 (by Partovi)
24
Slack-Passing and Cycle Borrowing
For N stage pipeline, overall logic delay should be < N Tcl
13
4. Design for performance
B. Statistical timing
26
Pictorial view of setup and hold tests
0 or moreswitching(s)allowed
Latest clock arrival time Earliest clock arrival time(next cycle)
Data must be stable
Hold time
Early RAT
Data must be stable
Setup time
Late RAT
Actual early AT Actual late AT
Earlyslack
Lateslack
ICCAD '07 Tutorial Chandu Visweswariah
14
27
LF CF
Hold test
Handling of across-chip variation
Each gate has a range of delay: [lb, ub]The lower bound is used for early timingThe upper bound is used for late timing
This is called an early/late splitStatic timing obtains bounds on timing slacks
Timing is sperformed as one forward pass and one backward pass
LF CF
Setup test
Capturing early path
Launching late path
Capturing late path
Launching early path
ICCAD '07 Tutorial Chandu Visweswariah
28
How is the early/late split computed?
The best way is to take known effects into account during characterization of library cells
History effect, simultaneous switching, pre-charging of internal nodes, etc.This drives separate characterization for early and late; this is the most accurate method
Failing that, the most common method is derating factorsExample: Late delay = library delay * 1.05
Early delay = library delay * 0.95The IBM way of achieving derating is LCD factors (Linear Combination of Delay) (FC=fast chip, SC=slow chip, see next page)
Late delay = L * FC_delay + L * NOM_delay + L * SC_delayEarly delay = E * FC_delay + E * NOM_delay + E * SC_delayAcross-chip variation is therefore assumed to be a fixed proportion of chip-to-chip variation for each cell type
ICCAD '07 Tutorial Chandu Visweswariah
15
29
IBM delay modeling*
Intrinsic:Chip means
SystematicACV
RandomACV
Early Late Early Late*P. S. Zuchowski, ICCAD’04
At a given cornerlate delay = intrinsic + systematic + randomearly delay = intrinsic – systematic – random
ICCAD '07 Tutorial Chandu Visweswariah
30
Traditional timing corners
Fast
chi
p ea
rly
Fast
chi
p la
te
Intra-chip variation
Slow
chi
p ea
rly
Slow
chi
p la
te
Intra-chip variation
Chip-to-chip variation
Fast chip Slow chip
ICCAD '07 Tutorial Chandu Visweswariah
16
31
The problem with an early/late splitThe early/late split is very useful
Allows bounds during delay modelingAny unknown or hard-to-model effect can be swept under the rug of an early/late split
But, it has problemsAdditional pessimism (which may be desirable)Unnecessary pessimism (which is never desirable)
LF CF
Setup test
Capturing early path
Launching late path
This physically commonportion can’t be both fastand slow at the same time
ICCAD '07 Tutorial Chandu Visweswariah
32
Additional pessimism: Clock tree view
LF1 LF2 CFLF3 Comb.
Comb.
Comb.
FP1 and FP2
FP3
ICCAD '07 Tutorial Chandu Visweswariah
17
33
How to have less pessimism?
Common path pessimism removal
Account for correlations
Credit for statistical averaging of random
34
Statistical timing
Deterministic
Statistical
a
b
c+
+ MAX
b
a
c+
+ MAX
ICCAD '07 Tutorial Chandu Visweswariah
18
35
The problem of correlationsThere are many reasons for correlations
Chip-to-chip variations are perfectly correlated within a single chipSame circuit typesSame device familiesSame metal levelsSame voltage islandsSame regions of the chipDependence on common sources of variationReconvergent fanoutEtc.
In a reasonable-sized chip, there may be 100 million timing quantities, so we don’t handle correlations in the classical way
Not by storing and manipulating a 100M x 100M covariance matrix
36
Canonical form
All timing quantities are parameterized by the sources of variation
Correlation can be judged on-demand by inspection
annn RaXaXaXaa 122110
Constant(nominalvalue)
Independentlyrandomuncertainty
Sensitivities Deviation ofglobal sourcesof variation fromnominal values
19
37
Statistical timing basicsRepresent all timing quantities in canonical form
Delays, slews, guard times, ATs, RATs, slacks, PLL adjusts, constraints, CPPR adjusts
Propagate ATs forward through the timing graphAddition of two canonical forms is easyMax/min operations are also easy with the help of some analytic formulas
Propagate RATs backward through the timing graphSubtraction of two canonical forms is easyUse statistical max/min operations
Slack is simply the difference between AT and RATSince this is available in canonical form, we get sensitivities of circuit performance to sources of variation for freeThese can be used to ensure a robust design
38
Statistical max operation
*C. E. Clark, “The greatest of a finite set of random variables,” OR Journal, March-April 1961, pp. 145—162**M. Cain, “The moment-generating function of the minimum of bivariate normal random variables,” American Statistician, May ’94, 48(2)
20
39
Unified view of correlations
Independentlyrandom part
Spatially correlated part:within-chip distance-related correlation
Globally correlated part: chip-to-chip, wafer-to-wafer, batch-to-batch variation
1
0 Distance
Correlation Coefficient
Rrii XaXaaD 0
ICCAD '07 Tutorial Chandu Visweswariah
40
Spatial correlation vs. early/late split
LF1 LF2 CFLF3
early clock
0
1
2
3
4 6
7
8
9
10
5 11
12
13
14
15
16
17 19
20
21
22
23
18 24
25
26
27
28
29
30 32
33
34
35
36
31 37
38
39
40
41
42
43 45
46
47
48
49
44 50
51
52
53
54
55
56 58
59
60
61
62
57 63
64
65
66
67
68
69 71
72
73
74
75
70 76
77
7879
8081
82 84
85
86
87
88
83 89
90
Dependence on common virtual variables cancels out at the timing testICCAD '07 Tutorial Chandu Visweswariah
21
41
Next Lecture
Latches and flip-flops