Spring 2014, Mar 17 . . . ELEC 7770: Advanced VLSI Design (Agrawal)
ELEC 7770 Advanced VLSI Design
Spring 2014
Zero-Skew Clock Routing
Vishwani D. Agrawal
James J. Danaher Professor
ECE Department, Auburn University
Auburn, AL 36849
[email protected]
http://www.eng.auburn.edu/~vagrawal/COURSE/E7770_Spr14/course.html
Zero-Skew Clock Routing
[Figure: clock signal CK distributed through a routing tree to the flip-flops (FF).]
Zero-Skew: References
H-Tree
  A. L. Fisher and H. T. Kung, “Synchronizing Large Systolic Arrays,” Proc. SPIE, vol. 341, pp. 44-52, May 1982.
A. Kahng, J. Cong and G. Robins, “High-Performance Clock Routing Based on Recursive Geometric Matching,” Proc. Design Automation Conf., June 1991, pp. 322-327.
M. A. B. Jackson, A. Srinivasan and E. S. Kuh, “Clock Routing for High-Performance IC’s,” Proc. Design Automation Conf., June 1990, pp. 573-579.
Zero-Skew Routing
Build clock tree bottom up:
  Leaf nodes are all equal-loading flip-flops.
  Two zero-skew subtrees are joined to form a larger zero-skew subtree.
  Entire clock tree is built recursively.
R.-S. Tsay, “An Exact Zero-Skew Clock Routing Algorithm,” IEEE Trans. CAD, vol. 12, no. 2, pp. 242-249, Feb. 1993.
J. Rubinstein, P. Penfield and M. A. Horowitz, “Signal Delay in RC Tree Networks,” IEEE Trans. CAD, vol. 2, no. 3, pp. 202-211, July 1983.
Balancing Subtrees (1)
[Figure: Subtree 1 (delay t1, capacitance C1, root A) and Subtree 2 (delay t2, capacitance C2, root B) joined by a wire of length L. The tapping point splits the wire into a segment of length xL toward A (resistance r1, with capacitance c1/2 at each end) and a segment of length (1 – x)L toward B (resistance r2, with capacitance c2/2 at each end).]
Balancing Subtrees (2)
Subtrees 1 and 2 are each balanced (zero-skew) trees, with delays t1 and t2 to respective leaf nodes.
Total capacitances of subtrees are C1 and C2, respectively.
Connect points A and B by a minimum-length wire of length L.
Determine a tapping point x such that wire lengths xL and (1 – x)L produce zero skew.
Balancing Subtrees (3)
Use Elmore delay formula:
  0.69 r1(C1 + c1/2) + t1 = 0.69 r2(C2 + c2/2) + t2
Substitute (a and b are the wire resistance and capacitance per unit length; 1.45 ≈ 1/0.69):
  r1 = axL, r2 = a(1 – x)L
  c1 = bxL, c2 = b(1 – x)L
  abL²x + aL(C1 + C2)x = 1.45 (t2 – t1) + aL(C2 + bL/2)
Then solve for x:
  x = [1.45 (t2 – t1) + aL(C2 + bL/2)] / [aL(bL + C1 + C2)]
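The closed-form tapping point can be sketched in a few lines of Python. This is only an illustration of the formula above, not code from the lecture: the function name `tapping_point` and the constant name `INV_ELMORE` are assumptions, and the symmetric input values are made up to provide a sanity check (two identical subtrees should be tapped at the midpoint of the wire).

```python
# Slide formula for the tapping point; 1.45 is the slides' rounding of 1/0.69.
INV_ELMORE = 1.45


def tapping_point(t1, C1, t2, C2, L, a, b):
    """Return x, the fraction of the wire of length L on the subtree-1 side.

    Assumed units: t in ps, C in pF, L in cm, a in ohm/cm, b in pF/cm,
    so that ohm * pF gives ps.
    """
    numerator = INV_ELMORE * (t2 - t1) + a * L * (C2 + b * L / 2)
    denominator = a * L * (b * L + C1 + C2)
    return numerator / denominator


# Hypothetical symmetric case: identical subtrees, so x should be one half.
x_sym = tapping_point(t1=5.0, C1=3.0, t2=5.0, C2=3.0, L=0.1, a=100.0, b=1.0)
print(f"x = {x_sym:.4f}")
```

A value of x outside the interval from 0 to 1 means the tapping point cannot lie on the minimum-length wire, which is the situation the later slides on wire elongation deal with.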
Balancing Subtrees Example 1
Subtree parameters:
  Subtree 1: t1 = 5ps, C1 = 3pF
  Subtree 2: t2 = 10ps, C2 = 6pF
Interconnect:
  L = 1mm
  Wire parameters: a = 100Ω/cm, b = 1pF/cm
Tapping point:
  x = [1.45(t2 – t1) + aL(C2 + bL/2)] / [aL(bL + C1 + C2)]
    = [1.45(10 – 5) + 100×0.1(6 + 1×0.1/2)] / [100×0.1(1×0.1 + 3 + 6)]
    = (7.25 + 60.5)/(10×9.1) = 0.7445
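As a check on Example 1, the two sides of the Elmore delay equation can be evaluated at x = 0.7445. The sketch below is an assumption-laden illustration (the helper name `branch_delay` is made up); it uses the same units as the slide, where ohm times pF gives ps. Because x is rounded to four digits and 1.45 is itself a rounding of 1/0.69, a residual skew of a few thousandths of a picosecond should be expected rather than an exact zero.

```python
ELMORE = 0.69  # Elmore delay factor used on the slides


def branch_delay(t_sub, C_sub, length, a, b):
    """Elmore delay (ps) from the tapping point through one wire segment
    of the given length (cm) to the leaves of the attached subtree."""
    r = a * length  # segment resistance, ohm
    c = b * length  # segment capacitance, pF
    return ELMORE * r * (C_sub + c / 2) + t_sub


a, b, L = 100.0, 1.0, 0.1  # ohm/cm, pF/cm, cm
x = 0.7445                 # tapping point computed in Example 1

d1 = branch_delay(5.0, 3.0, x * L, a, b)         # toward subtree 1
d2 = branch_delay(10.0, 6.0, (1 - x) * L, a, b)  # toward subtree 2
skew = d1 - d2
print(f"d1 = {d1:.2f} ps, d2 = {d2:.2f} ps, skew = {skew:.4f} ps")
```

Both delays come out at about 20.6 ps, which would also be the delay t of the merged subtree seen by the next level of the tree.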
Example 1
[Figure: Subtree 1 (t1 = 5ps, C1 = 3pF) and Subtree 2 (t2 = 10ps, C2 = 6pF), each a tree of flip-flops (FF), joined by a 1mm wire. The tapping point, which connects to the next level, is 0.7445mm from subtree 1 and 0.2555mm from subtree 2.]
Balancing Subtrees, x > 1
Tapping point set at root of tree with larger loading (C2, t2).
Wire to the root of other tree is elongated to provide additional delay. Wire length L is found as follows:
  Set x = 1 in abL²x + aL(C1 + C2)x = 1.45(t2 – t1) + aL(C2 + bL/2),
  i.e., L² + (2C1/b)L – 2.9 (t2 – t1)/(ab) = 0
Wire length is given by:
  L = {[(aC1)² + 2.9 ab(t2 – t1)]^½ – aC1} / (ab)
R.-S. Tsay, “An Exact Zero-Skew Clock Routing Algorithm,” IEEE Trans. CAD, vol. 12, no. 2, pp. 242-249, Feb. 1993.
Balancing Subtrees Example 2
Subtree parameters:
  Subtree 1: t1 = 2ps, C1 = 1pF
  Subtree 2: t2 = 15ps, C2 = 10pF
Interconnect:
  L = 1mm
  Wire parameters: a = 100Ω/cm, b = 1pF/cm
Tapping point:
  x = [1.45(t2 – t1) + aL(C2 + bL/2)] / [aL(bL + C1 + C2)]
    = [1.45(15 – 2) + 100×0.1(10 + 1×0.1/2)] / [100×0.1(1×0.1 + 1 + 10)]
    = (18.85 + 100.5)/(10×11.1) = 1.0752
Example 2, x = 1.0752
Setting x = 1.0,
  L = {[(aC1)² + 2.9ab(t2 – t1)]^½ – aC1} / (ab)
    = {[(100×1)² + 290 (15 – 2)]^½ – 100×1} / (100×1)
    = 0.1735cm
For a wire of 1.735mm length, place the clock feed at one end.
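The elongated wire length can be reproduced, and the resulting balance checked, with a short Python sketch. The function name `elongated_length` and its argument names are assumptions made for illustration. The check evaluates the Elmore delay through the whole wire to subtree 1 and compares it with t2; since 0.69 × 1.45 is not exactly 1, a mismatch of under 0.01 ps is expected.

```python
import math


def elongated_length(t_fast, C_fast, t_slow, a, b):
    """Positive root of L^2 + (2*C_fast/b)*L - 2.9*(t_slow - t_fast)/(a*b) = 0.

    t_fast, C_fast describe the subtree that needs extra wire delay;
    t_slow is the delay of the subtree whose root becomes the tapping point.
    """
    disc = (a * C_fast) ** 2 + 2.9 * a * b * (t_slow - t_fast)
    return (math.sqrt(disc) - a * C_fast) / (a * b)


a, b = 100.0, 1.0            # ohm/cm, pF/cm
t1, C1, t2 = 2.0, 1.0, 15.0  # ps, pF, ps (Example 2)

L = elongated_length(t1, C1, t2, a, b)  # cm
# Elmore delay from the tap at the root of subtree 2, through the wire, to subtree 1.
d1 = 0.69 * (a * L) * (C1 + b * L / 2) + t1
print(f"L = {L * 10:.3f} mm, d1 = {d1:.2f} ps, t2 = {t2:.2f} ps")
```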
Example 2, L = 1.735mm
[Figure: Subtree 1 (t1 = 2ps, C1 = 1pF) and Subtree 2 (t2 = 15ps, C2 = 10pF), each a tree of flip-flops (FF). The connection to the next level is made at the root of subtree 2, and a wire of length L = 1.735mm runs from there to the root of subtree 1.]
Balancing Subtrees, x < 0
Tapping point set at root of tree with larger loading (C1, t1).
Wire to the root of other tree is elongated to provide additional delay. Wire length L is found as follows:
  Set x = 0 in abL²x + aL(C1 + C2)x = 1.45(t2 – t1) + aL(C2 + bL/2),
  i.e., L² + (2C2/b)L – 2.9 (t1 – t2)/(ab) = 0
Wire length is given by:
  L = {[(aC2)² + 2.9 ab(t1 – t2)]^½ – aC2} / (ab)
R.-S. Tsay, “An Exact Zero-Skew Clock Routing Algorithm,” IEEE Trans. CAD, vol. 12, no. 2, pp. 242-249, Feb. 1993.
Balancing Subtrees Example 3
Subtree parameters:
  Subtree 1: t1 = 15ps, C1 = 10pF
  Subtree 2: t2 = 2ps, C2 = 1pF
Interconnect:
  L = 1mm
  Wire parameters: a = 100Ω/cm, b = 1pF/cm
Tapping point:
  x = [1.45(t2 – t1) + aL(C2 + bL/2)] / [aL(bL + C1 + C2)]
    = [1.45(2 – 15) + 100×0.1(1 + 1×0.1/2)] / [100×0.1(1×0.1 + 10 + 1)]
    = (– 18.85 + 10.5)/(10×11.1) = – 0.0752
Example 3, x = – 0.0752
Setting x = 0.0,
  L = {[(aC2)² + 2.9ab(t1 – t2)]^½ – aC2} / (ab)
    = {[(100×1)² + 290 (15 – 2)]^½ – 100×1} / (100×1)
    = 0.1735cm
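The three cases (tap on the wire, x > 1, x < 0) can be combined into one merge step. The following Python sketch is one possible arrangement under the slides' formulas, not Tsay's published implementation; the function name `merge` and the returned pair (x, wire length) are assumptions chosen for illustration.

```python
import math

INV_ELMORE = 1.45  # the slides' rounding of 1/0.69


def merge(t1, C1, t2, C2, L, a, b):
    """Return (x, wire_length_cm) for joining two zero-skew subtrees.

    x is the fraction of the wire on the subtree-1 side. If the tap would
    fall outside the minimum-length wire, x is clamped to 0 or 1 and the
    wire is elongated to delay the faster subtree.
    """
    x = (INV_ELMORE * (t2 - t1) + a * L * (C2 + b * L / 2)) / (
        a * L * (b * L + C1 + C2)
    )
    if 0.0 <= x <= 1.0:
        return x, L  # tapping point lies on the minimum-length wire
    if x > 1.0:
        # Tap at the root of subtree 2; extra wire delays subtree 1.
        x, t_fast, C_fast, t_slow = 1.0, t1, C1, t2
    else:
        # Tap at the root of subtree 1; extra wire delays subtree 2.
        x, t_fast, C_fast, t_slow = 0.0, t2, C2, t1
    disc = (a * C_fast) ** 2 + 2 * INV_ELMORE * a * b * (t_slow - t_fast)
    return x, (math.sqrt(disc) - a * C_fast) / (a * b)


examples = {
    "Example 1": (5.0, 3.0, 10.0, 6.0),
    "Example 2": (2.0, 1.0, 15.0, 10.0),
    "Example 3": (15.0, 10.0, 2.0, 1.0),
}
results = {name: merge(*p, L=0.1, a=100.0, b=1.0) for name, p in examples.items()}
for name, (x, length) in results.items():
    print(f"{name}: x = {x:.4f}, wire length = {length * 10:.3f} mm")
```

Examples 2 and 3 are mirror images of each other, so both end up with the same elongated wire of about 1.735mm, attached to opposite subtrees.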
Example 3, L = 1.735mm
[Figure: Subtree 1 (t1 = 15ps, C1 = 10pF) and Subtree 2 (t2 = 2ps, C2 = 1pF), each a tree of flip-flops (FF). The connection to the next level is made at the root of subtree 1, and a wire of length L = 1.735mm runs from there to the root of subtree 2.]
Zero-Skew Design
[Figure: FF A feeds FF B through combinational logic (Delay = 75ns), and FF B feeds FF C through combinational logic (Delay = 50ns). All flip-flops receive CK with zero skew. Timing diagram: the single-cycle path delay sets Tck = 75ns.]
Nonzero-Skew Design
[Figure: FF A feeds FF B through combinational logic (Delay = 75ns), and FF B feeds FF C through combinational logic (Delay = 50ns). A delay of 25ns is inserted in the clock network. Timing diagram: the single-cycle path delay is met with Tck = 50ns.]
Optimized Skew Design
[Figure: FF A feeds FF B through combinational logic (Delay = 75ns), FF B feeds FF C through combinational logic (Delay = 50ns), and FF C feeds back to FF A through combinational logic (Delay = 75ns). CK, with period T, reaches FF A, FF B and FF C through delays SA, SB and SC.]
Linear program:
  Objective: Minimize T
  Constraints (subject to):
    SB – SA + T ≥ 75
    SC – SB + T ≥ 50
    SA – SC + T ≥ 75
Online LP Solvers
PHPSimplex solver
  http://www.phpsimplex.com/simplex/simplex.htm?l=en
LINDO
  http://www.lindo.com/ (Download)
File Lec12.ltx
  ! Lecture 11 example
  MIN T
  SUBJECT TO
  1) SB - SA + T >= 75
  2) SC - SB + T >= 50
  3) SA - SC + T >= 75
  END
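For this small problem the optimum can also be found by hand, which gives a way to cross-check a solver's answer. Adding the three constraints cancels every skew term and leaves 3T ≥ 75 + 50 + 75, so T cannot be less than 200/3 ns. The Python sketch below is not a general LP solver: it assumes the three paths form a single cycle A, B, C, A, fixes SC = 0 as an arbitrary reference, and uses made-up dictionary names (`delays`, `skew`).

```python
from fractions import Fraction

# Combinational path delays (ns) around the cycle A -> B -> C -> A.
delays = {("A", "B"): 75, ("B", "C"): 50, ("C", "A"): 75}

# Summing all constraints cancels the skews: 3T >= 200, so T >= 200/3.
T = Fraction(sum(delays.values()), len(delays))

# Fix SC = 0 and make two of the constraints tight in turn.
skew = {"C": Fraction(0)}
skew["A"] = delays[("C", "A")] - T + skew["C"]  # SA - SC + T = 75
skew["B"] = delays[("A", "B")] - T + skew["A"]  # SB - SA + T = 75

# Every path constraint S_dst - S_src + T >= delay must hold.
feasible = all(
    skew[dst] - skew[src] + T >= d for (src, dst), d in delays.items()
)
print(f"T = {float(T):.2f} ns")
print({ff: round(float(s), 2) for ff, s in sorted(skew.items())})
print("feasible:", feasible)
```

Exact fractions are used so that the tight constraints compare equal without floating-point tolerance.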
Optimized Skew Design
[Figure: FF A feeds FF B through combinational logic (Delay = 75ns), FF B feeds FF C through combinational logic (Delay = 50ns), and FF C feeds back to FF A through combinational logic (Delay = 75ns). With clock period T = 66.67ns, CK reaches FF A, FF B and FF C through delays of 8.33ns, 16.67ns and 0ns, respectively.]
Conclusion
Zero-skew design is possible at the layout level.
Zero-skew usually results in higher clock speed.
Nonzero clock skews can improve the design with reduced hardware and/or higher speed.