

Annual Review September 2001

Floorplan Evaluation with Timing-Driven Global Wireplanning, Pin Assignment, and Buffer/Wire Sizing

I. Mandoiu, A.B. Kahng

Abstract

We describe a new algorithm for floorplan evaluation using timing- and congestion-driven buffered routing according to a prescribed buffer site map. Specifically, we describe a provably good multicommodity flow based algorithm that finds a global routing minimizing routing area subject to buffer/wire congestion constraints. This permits detailed floorplan evaluation, i.e., computing the tradeoff curve between routing area (wirelength and number of buffers) and buffer/wire congestion under nearly any combination of delay and capacity constraints. Our algorithm simultaneously takes into account:

• maximum source/buffer wireloads

• buffer and wire congestion constraints

• individual sink delay constraints

• buffer and wire sizing

• layer and pin assignment

Preliminary experiments show near-optimal results with practical runtime.

REFERENCES

C. Albrecht, A.B. Kahng, I.I. Mandoiu, and A. Zelikovsky, Floorplan Evaluation with Timing-Driven Global Wireplanning, Pin Assignment, and Buffer/Wire Sizing, submitted.

Buffer Planning Methodologies

• VDSM requires buffer insertion for all global nets

- 50nm technology: >1,000,000 buffers

• Buffer-block methodology (Cong et al. [ICCAD’99], Tang&Wong [ISPD’00], Dragan et al. [ICCAD’00 + ASPDAC’01]):

- Buffers inserted in blocks located within available free space

- Simplifies design by isolating buffer insertion from circuit block implementations

• Buffer-site methodology (Alpert et al. [DAC’01]):

- Block designers leave “holes” in circuit blocks to be used for buffer insertion

- Alleviates congestion problems of buffer blocks

[Figures: buffer-block methodology; buffer-site methodology]

Floorplan Evaluation Problem

• Use tile graph to model congestion:

- b(v) = # buffer sites available in tile v

- w(u,v) = # routing channels between tiles u and v

- Buffer congestion = max ratio between # buffers used in a tile v and b(v)

- Wire congestion = max ratio between # wires between u and v and w(u,v)

• Upper-bounded maximum wireload of buffers/sources

- Guarantees bounded input rise/fall times at buffers and sinks

- Improves coupling noise immunity + reliability w.r.t. hot-carrier & AC self-heating
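The two congestion measures of the tile-graph model can be illustrated with a minimal sketch. It assumes buffer congestion is the maximum ratio of buffers used in a tile to b(v), and wire congestion the maximum ratio of wires between two tiles to w(u,v). The tile names, capacities, and usage counts below are hypothetical values chosen for illustration only.

```python
def congestion(used, capacity):
    """Return the maximum usage-to-capacity ratio over all capacity entries."""
    return max(used.get(key, 0) / cap for key, cap in capacity.items())

# Hypothetical 3-tile example (not from the poster).
b = {"t1": 4, "t2": 2, "t3": 5}                  # b(v): buffer sites per tile
w = {("t1", "t2"): 10, ("t2", "t3"): 8}          # w(u,v): routing channels per tile pair
buffers_used = {"t1": 1, "t2": 2}                # buffers placed in each tile
wires_used = {("t1", "t2"): 5, ("t2", "t3"): 6}  # wires crossing each tile boundary

buffer_congestion = congestion(buffers_used, b)  # largest ratio is tile t2: 2/2
wire_congestion = congestion(wires_used, w)      # largest ratio is (t2, t3): 6/8
```

A routing meets the congestion bounds when both values stay at or below the prescribed limits.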

Floorplan Evaluation Problem (FEP)

Given:

- Tile graph with buffer capacities b(v) and wire capacities w(u,v)

- Netlist; each net has unassigned source and sink pins (given as sets of tiles)

- Buffer wireload bound U, buffer and wire congestion bounds μ_0 and ν_0

Find:

- Pin assignment and feasible buffered routing for each net, meeting the buffer and wire congestion bounds and minimizing the total routing area α(#buffers) + β(total wirelength), where α, β ≥ 0 are given scaling constants

Reduction to Multicommodity Flow

• Key ingredient: gadget construction giving a 1-to-1 correspondence between feasible buffered paths in the tile graph G = (V, E) and s-t paths in a directed graph H = (V’, E’):

- Each vertex v of G is replaced in H by vertices v_0, v_1, …, v_U

- Each edge (u,v) is replaced in H by the set E_{u,v} of edges of the form (u_{j-1}, v_j) and (v_{j-1}, u_j)

- For each vertex v of G, we add to H the set E_v containing all edges of the form (v_j, v_0); these edges correspond to buffer insertion

- For each net, we add to H source and sink vertices connected to the 0th copies of the tiles to which they can be assigned

[Figure: tile graph with two 2-pin nets]

[Figure: gadget replacing (u,v) for U = 5]

• Integer program formulation of FEP:

\[
\begin{aligned}
\min\ & \sum_{p} \Big( \alpha \sum_{v \in V(G)} |p \cap E_v| + \beta \sum_{(u,v) \in E(G)} |p \cap E_{u,v}| \Big)\, x_p \\
\text{s.t.}\ & \sum_{p} |p \cap E_v|\, x_p \le \mu_0\, b(v), && v \in V(G) \\
& \sum_{p} |p \cap E_{u,v}|\, x_p \le \nu_0\, w(u,v), && (u,v) \in E(G) \\
& \sum_{p \text{ routing net } i} x_p = 1, && \text{for each net } i \\
& x_p \in \{0,1\}, && p \text{ a feasible path}
\end{aligned}
\]
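The gadget construction can be sketched in a few lines. This is an illustrative reading of the construction, not the authors' implementation: the function name `build_gadget_graph` and the edge tags are hypothetical, buffer edges (v_j, v_0) are generated for j = 1, …, U only (an assumption that excludes the self-loop at j = 0), and the per-net source and sink vertices are omitted.

```python
def build_gadget_graph(tiles, tile_edges, U):
    """Build the directed graph H for a tile graph G and wireload bound U.

    A vertex (v, j) of H is the j-th copy of tile v, where j counts the
    tile-to-tile hops driven since the last buffer (or the source).
    Returns a dict mapping each directed edge of H to a tag:
      ("wire", (u, v))  for edges in E_{u,v}
      ("buffer", v)     for edges in E_v
    """
    H = {}
    for (u, v) in tile_edges:
        for j in range(1, U + 1):
            # Crossing the boundary between u and v adds one unit of wireload.
            H[((u, j - 1), (v, j))] = ("wire", (u, v))
            H[((v, j - 1), (u, j))] = ("wire", (u, v))
    for v in tiles:
        for j in range(1, U + 1):
            # Inserting a buffer in tile v resets the wireload counter to 0.
            H[((v, j), (v, 0))] = ("buffer", v)
    return H

# Gadget for a single tile edge (u, v) with U = 5.
H = build_gadget_graph(["u", "v"], [("u", "v")], 5)
```

Because no copy with index above U exists, every path in H inserts a buffer before the wireload exceeds U, which is what makes s-t paths in H correspond to feasible buffered paths in G.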

Relax+Round Approach

• Introduced by Raghavan & Thompson [COMB’87], consists of two steps:

1. Solve the fractional relaxation

- Relaxation = multicommodity flow with set capacity constraints

- Exact linear programming algorithms are impractical for large instances

2. Round to integer solution

- Very fast (linear time); provably good if capacities are large

• Our main contribution is a fast approximation algorithm for multicommodity flow with set capacity constraints

- Based on the general framework for multicommodity flow approximation introduced by Garg and Konemann [FOCS’98]

- Incorporates speed-up idea due to Fleischer [SIDMA’00]

- Recently applied with great success to global routing (Albrecht [TCAD’01]) and buffered global routing via buffer-blocks (Dragan et al. [ICCAD’00, ASPDAC’01])

- Simple code with very predictable runtime, main subroutine is Dijkstra’s shortest path algorithm
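The rounding step is not spelled out here. A common form of randomized rounding, shown below as a sketch under that assumption, picks one path per net with probability proportional to its fractional flow value. The function name `round_paths` and the data layout are hypothetical.

```python
import random

def round_paths(fractional, rng):
    """Pick one path per net at random, weighted by its fractional value.

    fractional: {net: {path: value}} where the values of each net are
    non-negative and not all zero (in the relaxation they sum to 1).
    rng: a random.Random instance, passed in so results are reproducible.
    """
    chosen = {}
    for net, paths in fractional.items():
        options = list(paths)
        weights = [paths[p] for p in options]
        chosen[net] = rng.choices(options, weights=weights, k=1)[0]
    return chosen

# Hypothetical fractional solution for two nets.
fractional = {
    "n1": {("a", "b"): 1.0},
    "n2": {("a", "c"): 0.5, ("a", "d", "c"): 0.5},
}
routing = round_paths(fractional, random.Random(0))
```

Each net is handled with a single weighted draw, so the work is proportional to the number of fractional paths, in line with the linear-time claim above.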

Approximation Algorithm

x_p = 0, y_v = δ/(μ_0 b(v)), z_e = δ/(ν_0 w(e)), u = δ/D, p_i = ∅

While Σ_v b(v) y_v + Σ_e w(e) z_e + D u < 1

    For i = 1, …, #nets do

        If p_i = ∅ or weight(p_i) > (1+ε) l_i

            Find path p_i with min weight l_i among s_i-t_i paths of H

        End If

        x_{p_i} = x_{p_i} + 1

        For every v ∈ V(G) and e ∈ E(G):

            y_v = y_v ( 1 + ε |p_i ∩ E_v| / (μ_0 b(v)) )

            z_e = z_e ( 1 + ε |p_i ∩ E_e| / (ν_0 w(e)) )

            u = u ( 1 + ε (α Σ_v |p_i ∩ E_v| + β Σ_e |p_i ∩ E_e|) / D )

        End For

    End For

End While

Output x scaled by the number of ‘While’ iterations

• The algorithm finds a feasible solution with total cost (1+ε_0)D if the optimum is D

• Running time for fixed D is O(ε^-2 K ln n), where K = # nets and n = |V(H)| = O(# tiles)

• Optimum can be found by binary search on D; in practice, binary search is not necessary
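The loop structure of the approximation algorithm can be sketched as follows. This is a simplified illustration under stated assumptions, not the reference implementation: all names (`shortest_path`, `mcf_approx`, `group`, `cap`, `cost`) are hypothetical, `cap[g]` stands for an already scaled capacity (μ_0·b(v) or ν_0·w(e)), `cost[e]` stands for the area cost of an edge (α for buffer edges, β for wire edges), and every sink is assumed reachable from its source.

```python
import heapq

def shortest_path(adj, length, s, t):
    """Dijkstra on adj = {node: [(next_node, edge_id), ...]}; returns edge ids from s to t."""
    dist, prev, done = {s: 0.0}, {}, set()
    heap = [(0.0, s)]
    while heap:
        d, x = heapq.heappop(heap)
        if x in done:
            continue
        done.add(x)
        if x == t:
            break
        for nxt, e in adj.get(x, []):
            nd = d + length(e)
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = (x, e)
                heapq.heappush(heap, (nd, nxt))
    path, x = [], t
    while x != s:  # walk predecessor links back to the source
        x, e = prev[x]
        path.append(e)
    return path[::-1]

def mcf_approx(adj, group, cap, cost, nets, D, eps, delta):
    """Fractional routing by multiplicative weight updates.

    group[e]: capacity set of edge e (a tile for buffer edges, a tile pair for wire edges)
    cap[g]:   scaled capacity of set g; cost[e]: area cost of edge e
    nets:     list of (source, sink) pairs; D: target total cost
    Returns {(net_index, path): fractional value}; the values of each net sum to 1.
    """
    y = {g: delta / c for g, c in cap.items()}  # one dual weight per capacity set
    u = delta / D                               # dual weight of the cost budget
    x, cached, phases = {}, [None] * len(nets), 0

    def length(e):
        return y[group[e]] + u * cost[e]

    while sum(cap[g] * y[g] for g in cap) + D * u < 1:
        phases += 1
        for i, (s, t) in enumerate(nets):
            # Recompute the path only when the cached one has become too heavy.
            if cached[i] is None or sum(map(length, cached[i][0])) > (1 + eps) * cached[i][1]:
                p = shortest_path(adj, length, s, t)
                cached[i] = (p, sum(map(length, p)))
            p = cached[i][0]
            x[(i, tuple(p))] = x.get((i, tuple(p)), 0) + 1
            use = {}
            for e in p:
                use[group[e]] = use.get(group[e], 0) + 1
            for g, k in use.items():
                y[g] *= 1 + eps * k / cap[g]
            u *= 1 + eps * sum(cost[e] for e in p) / D
    return {key: val / phases for key, val in x.items()}
```

In this sketch delta must be small enough that the initial weighted sum is below 1, otherwise the loop never runs. Reusing a cached path until its weight exceeds (1+ε) times its weight at computation time is the step that cuts down the number of Dijkstra calls.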

Extensions

• Sink delay upper-bounds

- The delay (e.g., Elmore delay) of any buffer-to-buffer path in H can be charged to the last path edge, which is of the form (v_j, v_0)

- Dijkstra’s algorithm is replaced by an algorithm for finding the minimum-weight delay-bounded source-sink path

• Buffer/wire sizing and layer assignment

- Discrete buffer and wire libraries can be modeled by constructing a product directed graph H (size of H grows linearly with the size of libraries)

• Multipin nets

- Feasible routings of a net may not correspond to trees in the grid graph, since multiple buffers (edges) may be needed in a tile (between two tiles)

- For small nets (3-4 pins), min-weight feasible routings can be found by trying all Steiner points and connecting them with min-weight paths

- A heuristic for larger nets is to compute a fixed set of Steiner points for each net (using, e.g., a Steiner tree algorithm) and then update min-weight paths in each iteration

[Figure: feasible multipin routing which is not a tree in G]
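For the sink delay extension, the poster does not say which delay-bounded path algorithm is used. One simple option, shown here purely as an assumed stand-in, is a label-setting search over (node, accumulated delay) states. It presumes non-negative edge weights and non-negative integer delays; the function name and graph layout are hypothetical.

```python
import heapq

def min_weight_delay_bounded(adj, s, t, max_delay):
    """Minimum total weight of an s-t path whose total delay is at most max_delay.

    adj: {node: [(next_node, weight, delay), ...]} with weight >= 0 and
    delay a non-negative integer. Returns None if no such path exists.
    """
    best = {(s, 0): 0.0}   # best known weight for each (node, delay) state
    heap = [(0.0, 0, s)]   # entries are (weight, delay, node)
    while heap:
        wgt, d, x = heapq.heappop(heap)
        if wgt > best.get((x, d), float("inf")):
            continue       # stale heap entry
        if x == t:
            return wgt     # first time t is settled, its weight is minimal
        for nxt, w_e, d_e in adj.get(x, []):
            nd = d + d_e
            if nd > max_delay:
                continue   # this extension would violate the delay bound
            nw = wgt + w_e
            if nw < best.get((nxt, nd), float("inf")):
                best[(nxt, nd)] = nw
                heapq.heappush(heap, (nw, nd, nxt))
    return None

# Hypothetical graph: a cheap but slow route via "a", a costly but fast route via "b".
adj = {
    "s": [("a", 1.0, 5), ("b", 3.0, 1)],
    "a": [("t", 1.0, 5)],
    "b": [("t", 3.0, 1)],
}
loose = min_weight_delay_bounded(adj, "s", "t", 10)
tight = min_weight_delay_bounded(adj, "s", "t", 9)
```

The state space grows with the delay bound, so this approach is pseudo-polynomial; it is meant only to show where such a routine would replace Dijkstra's algorithm.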

Experimental Results

Testcase              Algo     Wirelen.  %LB Gap  #Buffers  %LB Gap  W_congest  B_congest   CPU
ami49 (324 nets)      RABID        7592    11.87      1339    21.51       0.93       0.36   167
                      MCF          6792     0.07      1135     2.99       1.00       0.47   314
                      MCF+PA       6041     0.01       991     4.87       1.00       0.50   304
playout (1663 nets)   RABID       27601     6.38      3840    15.04       0.45       0.64   813
                      MCF         25946     0.00      3428     2.70       0.51       0.32  1393
                      MCF+PA      23138     0.00      3004     4.12       0.40       0.32  1386
xc5 (2149 nets)       RABID       27060     8.35      4410    23.25       0.84       0.81   694
                      MCF         25155     0.73      3841     7.35       0.96       0.60  1641
                      MCF+PA      22265     0.05      3340     4.87       0.98       0.50  1644
a9c3 (1526 nets)      RABID       30723     5.64      4225    11.95       0.60       0.44   502
                      MCF         29082     0.00      3801     0.72       0.63       0.31  1082
                      MCF+PA      26057     0.00      3376     0.75       0.58       0.30  1079

• MCF uses less wirelength than the RABID heuristic of Alpert et al. (DAC’01)

• MCF wirelength is always within 1% of the lower bound; runtime is about 2x the RABID runtime

• Simultaneous pin assignment further decreases wiring resources by 10%
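The pin assignment savings can be checked directly from the reported wirelengths. The short calculation below compares the MCF and MCF+PA wirelength values from the results table; only the variable names are introduced here.

```python
# Wirelength reported for MCF and MCF+PA on each testcase.
wirelength = {
    "ami49":   {"MCF": 6792,  "MCF+PA": 6041},
    "playout": {"MCF": 25946, "MCF+PA": 23138},
    "xc5":     {"MCF": 25155, "MCF+PA": 22265},
    "a9c3":    {"MCF": 29082, "MCF+PA": 26057},
}

# Percentage wirelength reduction obtained by adding simultaneous pin assignment.
savings = {
    testcase: 100 * (row["MCF"] - row["MCF+PA"]) / row["MCF"]
    for testcase, row in wirelength.items()
}

for testcase, pct in savings.items():
    print(f"{testcase}: {pct:.1f}% less wirelength with pin assignment")
```

On all four testcases the reduction falls between 10% and 12%, consistent with the stated figure of about 10%.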

Conclusions

We have proposed the first coherent approach to floorplan definition, timing- and congestion-driven buffered global route planning, wire/buffer sizing, layer assignment, and pin assignment. In the past, each of these optimizations has been considered in isolation (see, e.g., Albrecht [TCAD’01], Dragan et al. [ICCAD’00, ASPDAC’01], Cong et al. [ICCAD’99], Tang&Wong [ISPD’00], Alpert et al. [DAC’01]). Experimental results show that our method significantly outperforms approaches based on cascading individual optimizations, such as the recent RABID algorithm of Alpert et al. Future work aims to incorporate in our implementation practical improvements such as the use of uneven size tiles, window constraints on buffer usage (as opposed to tile constraints), and faster-converging weight-updating rules.

GSRC bookshelf

A reference implementation of the multicommodity flow based algorithm described in this poster will soon be added to the GSRC bookshelf. The number of packages contributed to the GSRC bookshelf is continuously increasing. Among the recent additions are reference implementations of three rectilinear minimum spanning tree algorithms and an experimental comparison of their performance (authors A.B. Kahng and I. Mandoiu, with code contributed in part by L. Scheffer). To learn more about the Bookshelf and how to become a contributor, see http://www.gigascale.org/bookshelf/