i. mandoiu, a.b. kahng

Annual Review September 2001

Floorplan Evaluation with Timing-Driven Global Wireplanning, Pin Assignment, and Buffer/Wire Sizing

I. Mandoiu, A.B. Kahng

Abstract

We describe a new algorithm for floorplan evaluation using timing- and congestion-driven buffered routing according to a prescribed buffer site map. Specifically, we describe a provably good multicommodity flow based algorithm that finds a global routing minimizing routing area subject to buffer/wire congestion constraints. This permits detailed floorplan evaluation, i.e., computing the tradeoff curve between routing area (wirelength and number of buffers) and buffer/wire congestion under nearly any combination of delay and capacity constraints. Our algorithm simultaneously takes into account:

• maximum source/buffer wireloads

• buffer and wire congestion constraints

• individual sink delay constraints

• buffer and wire sizing

• layer and pin assignment

Preliminary experiments show near-optimal results with practical runtime.

REFERENCES

C. Albrecht, A.B. Kahng, I.I. Mandoiu, and A. Zelikovsky, Floorplan Evaluation with Timing-Driven Global Wireplanning, Pin Assignment, and Buffer/Wire Sizing, submitted.

Buffer Planning Methodologies

• VDSM requires buffer insertion for all global nets

- At 50nm technology: more than 1,000,000 buffers

• Buffer-block methodology (Cong et al. [ICCAD’99], Tang&Wong [ISPD’00], Dragan et al. [ICCAD’00 + ASPDAC’01]):

- Buffers inserted in blocks located within available free space

- Simplifies design by isolating buffer insertion from circuit block implementations

• Buffer-site methodology (Alpert et al. [DAC’01]):

- Block designers leave “holes” in circuit blocks to be used for buffer insertion

- Alleviates congestion problems of buffer blocks

[Figure: buffer-block methodology and buffer-site methodology]

Floorplan Evaluation Problem

• Use tile graph to model congestion:

- b(v) = # buffer sites available in tile v

- w(u,v) = # routing channels between tiles u and v

- Buffer congestion = max ratio between # buffers used in a tile v and b(v)

- Wire congestion = max ratio between # wires between u and v and w(u,v)

• Upper-bounded maximum wireload of buffers/sources

- Guarantees bounded input rise/fall times at buffers and sinks

- Improves coupling noise immunity + reliability w.r.t. hot-carrier & AC self-heating
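As a small illustration of the tile-graph congestion measures, the Python sketch below computes buffer congestion (the maximum over tiles of buffers used divided by b(v)) and wire congestion (the maximum over adjacent tile pairs of wires used divided by w(u,v)). The function name, the dictionary layout, and the three-tile data are hypothetical and not taken from the poster.

```python
def congestion(buffers_used, wires_used, b, w):
    """Return (buffer_congestion, wire_congestion) for a tile graph.

    b maps tile -> number of buffer sites, w maps (u, v) -> number of
    routing channels; the *_used dicts hold the resources a routing consumes.
    """
    buffer_cong = max(buffers_used.get(v, 0) / cap for v, cap in b.items())
    wire_cong = max(wires_used.get(e, 0) / cap for e, cap in w.items())
    return buffer_cong, wire_cong


# Hypothetical three-tile example (illustrative numbers only).
b = {"A": 4, "B": 2, "C": 5}
w = {("A", "B"): 10, ("B", "C"): 8}
buffers_used = {"A": 1, "B": 2}
wires_used = {("A", "B"): 5, ("B", "C"): 6}

buffer_cong, wire_cong = congestion(buffers_used, wires_used, b, w)
print(buffer_cong, wire_cong)
```

A routing meets the congestion bounds when both returned values are at most the prescribed buffer and wire congestion limits.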

Floorplan Evaluation Problem (FEP)

Given:

- Tile graph with buffer capacities b(v) and wire capacities w(u,v)

- Netlist; each net has unassigned source and sink pins (given as sets of tiles)

- Buffer wireload upper bound U, buffer and wire congestion bounds μ_0 and ν_0

Find:

- Pin assignment and feasible buffered routing for each net, meeting the buffer and wire congestion bounds and minimizing the total routing area α(#buffers) + β(total wirelength), where α, β ≥ 0 are given scaling constants

Reduction to Multicommodity Flow

• Key ingredient: gadget construction giving a 1-to-1 correspondence between feasible buffered paths in the tile graph G = (V,E) and s-t paths in a directed graph H = (V',E'):

- Each vertex v of G is replaced in H by vertices v_0, v_1, …, v_U

- Each edge (u,v) of G is replaced in H by the set E_{u,v} of edges of the form (u_{j-1}, v_j) and (v_{j-1}, u_j)

- For each vertex v of G, we add to H the set E_v containing all edges of the form (v_j, v_0); these edges correspond to buffer insertion

- For each net, we add to H source and sink vertices connected to the 0th copies of the tiles to which they can be assigned

[Figure: tile graph with two 2-pin nets]

[Figure: gadget replacing (u,v) for U = 5]

• Integer program formulation of FEP:

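\[
\begin{aligned}
\min\quad & \alpha \sum_{v \in V(G)} \sum_{p} |p \cap E_v|\, x_p \;+\; \beta \sum_{(u,v) \in E(G)} \sum_{p} |p \cap E_{u,v}|\, x_p \\
\text{s.t.}\quad & \sum_{p} |p \cap E_v|\, x_p \le \mu_0\, b(v), && v \in V(G) \\
& \sum_{p} |p \cap E_{u,v}|\, x_p \le \nu_0\, w(u,v), && (u,v) \in E(G) \\
& \sum_{p \text{ routing net } i} x_p = 1, && \text{for each net } i \\
& x_p \in \{0,1\}, && \text{for each feasible path } p
\end{aligned}
\]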
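The gadget construction behind the reduction can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: vertices of H are represented as (tile, j) pairs, and one reading of the construction is that copy j of a tile stands for "j units of wire driven since the last buffer", so that no path can use more than U wire edges in a row without taking a buffer edge. The function name and the three-tile input are hypothetical, and source/sink vertices for nets are left out.

```python
def build_gadget_graph(tiles, tile_edges, U):
    """Build the tile part of H from the tile graph G.

    Returns (wire_edges, buffer_edges):
      wire_edges[(u, v)] is the edge set E_{u,v} replacing tile edge (u, v),
      buffer_edges[v]    is the edge set E_v of buffer-insertion edges at v.
    Each H-edge is a (tail, head) pair of (tile, level) vertices.
    """
    wire_edges = {}
    buffer_edges = {}
    for (u, v) in tile_edges:
        group = []
        for j in range(1, U + 1):
            group.append(((u, j - 1), (v, j)))  # wire from u to v, level goes up by 1
            group.append(((v, j - 1), (u, j)))  # wire from v to u, level goes up by 1
        wire_edges[(u, v)] = group
    for v in tiles:
        # Inserting a buffer in tile v resets the wireload level to 0.
        buffer_edges[v] = [((v, j), (v, 0)) for j in range(1, U + 1)]
    return wire_edges, buffer_edges


# Hypothetical 3-tile chain with wireload bound U = 5.
tiles = ["a", "b", "c"]
tile_edges = [("a", "b"), ("b", "c")]
wire_edges, buffer_edges = build_gadget_graph(tiles, tile_edges, U=5)
print(len(wire_edges[("a", "b")]), len(buffer_edges["a"]))
```

Grouping the H-edges by the tile or tile edge they come from is what lets the congestion constraints be written as capacities on edge sets rather than on single edges.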

Relax+Round Approach

• Introduced by Raghavan & Thompson [COMB’87], consists of two steps:

1. Solve the fractional relaxation

- Relaxation = multicommodity flow with set capacity constraints

- Exact linear programming algorithms are impractical for large instances

2. Round to integer solution

- Very fast (linear time); provably good if capacities are large

• Our main contribution is a fast approximation algorithm for multicommodity flow with set capacity constraints

- Based on the general framework for multicommodity flow approximation introduced by Garg and Könemann [FOCS’98]

- Incorporates a speed-up idea due to Fleischer [SIDMA’00]

- Recently applied with great success to global routing (Albrecht [TCAD’01]) and buffered global routing via buffer-blocks (Dragan et al. [ICCAD’00, ASPDAC’01])

- Simple code with very predictable runtime; the main subroutine is Dijkstra’s shortest path algorithm
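The rounding step can be illustrated with a short Python sketch. It assumes the standard randomized rounding scheme: for each net, the fractional solution assigns flow values summing to 1 to a set of candidate paths, and one path is picked with probability equal to its flow. The function name and the input data are hypothetical, and the poster does not spell out the rounding procedure at this level of detail.

```python
import random


def round_paths(fractional, rng):
    """Pick one path per net, with probability equal to its fractional flow.

    fractional maps net -> {path: flow}, where the flows of a net sum to 1
    and each path is a tuple of vertices.
    """
    chosen = {}
    for net, flows in fractional.items():
        paths = list(flows)
        weights = [flows[p] for p in paths]
        chosen[net] = rng.choices(paths, weights=weights, k=1)[0]
    return chosen


# Hypothetical fractional solution for two nets.
fractional = {
    "n1": {("s1", "a", "t1"): 0.7, ("s1", "b", "t1"): 0.3},
    "n2": {("s2", "t2"): 1.0},
}
rng = random.Random(0)  # fixed seed so that runs are repeatable
chosen = round_paths(fractional, rng)
print(chosen)
```

The work is linear in the number of (net, path) pairs of the fractional solution, which matches the "very fast" characterization of the rounding step.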

Approximation Algorithm

x_p = 0, y_v = δ/(μ_0 b(v)), z_e = δ/(ν_0 w(e)), u = δ/D, p_i = ∅
While Σ_v b(v) y_v + Σ_e w(e) z_e + D u < 1
    For i = 1, …, #nets do
        If p_i = ∅ or weight(p_i) > (1+ε) l_i
            Find path p_i with min weight l_i among s_i-t_i paths of H
        End If
        x_{p_i} = x_{p_i} + 1
        For every v ∈ V(G) and e ∈ E(G):
            y_v = y_v (1 + ε |p_i ∩ E_v| / (μ_0 b(v)))
            z_e = z_e (1 + ε |p_i ∩ E_e| / (ν_0 w(e)))
            u = u (1 + ε (α Σ_v |p_i ∩ E_v| + β Σ_e |p_i ∩ E_e|) / D)
        End For
    End For
End While
Output x scaled by the number of ‘While’ iterations

• The algorithm finds a feasible solution with total cost (1+ε_0)D if the optimum is D

• Running time for fixed D is O(ε^-2 K ln n), where K = # nets and n = |V(H)| = O(# tiles)

• Optimum can be found by binary search on D; in practice binary search is not necessary
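The loop structure of the approximation algorithm can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the authors' code: every edge of H carries a "group" id (the tile or tile edge it belongs to), `cap[g]` is assumed to be the capacity already scaled by the corresponding congestion bound, `cost[g]` is the area cost per use (the buffer or wire scaling constant), and the stopping test uses these scaled capacities. Paths are identified by their sequence of group ids, and the names `approx_mcf`, `dijkstra`, `eps`, and `delta` are hypothetical.

```python
import heapq


def dijkstra(adj, weight, s, t):
    """Min-weight s-t path; returns (weight, tuple of group ids on the path)."""
    dist, prev, done = {s: 0.0}, {}, set()
    heap = [(0.0, s)]
    while heap:
        d, x = heapq.heappop(heap)
        if x in done:
            continue
        done.add(x)
        if x == t:
            break
        for y, g in adj.get(x, []):
            nd = d + weight(g)
            if nd < dist.get(y, float("inf")):
                dist[y], prev[y] = nd, (x, g)
                heapq.heappush(heap, (nd, y))
    groups, node = [], t
    while node != s:  # walk back from t to s (assumes t is reachable)
        node, g = prev[node]
        groups.append(g)
    return dist[t], tuple(reversed(groups))


def approx_mcf(adj, cap, cost, nets, D, eps=0.1, delta=0.01):
    """Fractional routing: returns {(net index, path): flow}, flows sum to 1 per net."""
    dual = {g: delta / cap[g] for g in cap}   # y_v and z_e in one dictionary
    u = delta / D                             # dual variable of the area budget D
    weight = lambda g: dual[g] + cost[g] * u  # current weight of an H-edge in group g
    path = {i: None for i in range(len(nets))}
    best, x, iters = {}, {}, 0
    while sum(cap[g] * dual[g] for g in cap) + D * u < 1:
        iters += 1
        for i, (s, t) in enumerate(nets):
            # Recompute the path only if the stored one has become too heavy.
            if path[i] is None or sum(weight(g) for g in path[i]) > (1 + eps) * best[i]:
                best[i], path[i] = dijkstra(adj, weight, s, t)
            x[(i, path[i])] = x.get((i, path[i]), 0) + 1
            for g in set(path[i]):
                dual[g] *= 1 + eps * path[i].count(g) / cap[g]
            u *= 1 + eps * sum(cost[g] for g in path[i]) / D
    return {key: val / iters for key, val in x.items()}


# Hypothetical instance: one net, two parallel unit-capacity routes "A" and "B".
adj = {"s": [("t", "A"), ("t", "B")]}
cap = {"A": 1.0, "B": 1.0}
cost = {"A": 1.0, "B": 1.0}
flow = approx_mcf(adj, cap, cost, nets=[("s", "t")], D=2.0)
print(flow)
```

On this toy instance the multiplicative weight updates make the two routes alternate, so the returned fractional flow is split roughly evenly between them.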

Extensions

• Sink delay upper-bounds

- The delay (e.g., Elmore delay) of any buffer-to-buffer path in H can be charged to the last path edge, which is of the form (vj,v0)

- Dijkstra’s algorithm is replaced by an algorithm for finding the minimum-weight delay-bounded source-sink path

• Buffer/wire sizing and layer assignment

- Discrete buffer and wire libraries can be modeled by constructing a product directed graph H (size of H grows linearly with the size of libraries)

• Multipin nets

- Feasible routings of a net may not correspond to trees in the grid graph, since multiple buffers (edges) may be needed in a tile (between two tiles)

- For small nets (3-4 pins) min-weight feasible routings can be found by trying all Steiner points and connecting them with min-weight paths

- A heuristic for larger nets is to compute a fixed set of Steiner points for each net (using, e.g., a Steiner tree algorithm) and then update min-weight paths in each iteration
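For the sink-delay extension, the shortest-path subroutine has to return a minimum-weight path whose total delay stays within a bound. The poster does not say which algorithm is used; the Python sketch below shows one simple possibility, assuming non-negative integer delays charged to edges, by running Dijkstra's algorithm over (vertex, accumulated delay) states. The function name and the small example graph are hypothetical.

```python
import heapq


def min_weight_delay_bounded(adj, s, t, max_delay):
    """Minimum weight of an s-t path with total delay <= max_delay, or None.

    adj maps vertex -> list of (next_vertex, weight, delay), with
    non-negative weights and non-negative integer delays.
    """
    best = {(s, 0): 0.0}
    heap = [(0.0, s, 0)]
    while heap:
        wt, x, d = heapq.heappop(heap)
        if x == t:
            return wt  # first time t is popped, its weight is minimal
        if wt > best.get((x, d), float("inf")):
            continue   # stale heap entry
        for y, w_xy, d_xy in adj.get(x, []):
            nd = d + d_xy
            if nd > max_delay:
                continue  # this extension would violate the delay bound
            nw = wt + w_xy
            if nw < best.get((y, nd), float("inf")):
                best[(y, nd)] = nw
                heapq.heappush(heap, (nw, y, nd))
    return None


# Hypothetical graph: a cheap but slow direct edge and a costlier, faster detour.
adj = {
    "s": [("t", 1.0, 10), ("a", 2.0, 3)],
    "a": [("t", 2.0, 3)],
}
print(min_weight_delay_bounded(adj, "s", "t", 10))
print(min_weight_delay_bounded(adj, "s", "t", 6))
print(min_weight_delay_bounded(adj, "s", "t", 5))
```

The state space grows with the delay bound, so this variant is pseudo-polynomial; it is meant only to show how a delay bound changes the path search.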

[Figure: feasible multipin routing that is not a tree in G]

Experimental Results

Testcase      Algo     Wirelen.  %LB Gap  #Buffers  %LB Gap  W_congest  B_congest  CPU
a9c3          RABID    30723     5.64     4225      11.95    0.60       0.44       502
(1526 nets)   MCF      29082     0.00     3801      0.72     0.63       0.31       1082
              MCF+PA   26057     0.00     3376      0.75     0.58       0.30       1079
xc5           RABID    27060     8.35     4410      23.25    0.84       0.81       694
(2149 nets)   MCF      25155     0.73     3841      7.35     0.96       0.60       1641
              MCF+PA   22265     0.05     3340      4.87     0.98       0.50       1644
playout       RABID    27601     6.38     3840      15.04    0.45       0.64       813
(1663 nets)   MCF      25946     0.00     3428      2.70     0.51       0.32       1393
              MCF+PA   23138     0.00     3004      4.12     0.40       0.32       1386
ami49         RABID    7592      11.87    1339      21.51    0.93       0.36       167
(324 nets)    MCF      6792      0.07     1135      2.99     1.00       0.47       314
              MCF+PA   6041      0.01     991       4.87     1.00       0.50       304

• MCF uses less wirelength than the RABID heuristic of Alpert et al. (DAC’01)

• MCF wirelength is always within 1% of the lower bound; runtime is about 2x that of RABID

• Simultaneous pin assignment further decreases wiring resources by about 10%

Conclusions

We have proposed the first coherent approach to floorplan definition, timing- and congestion-driven buffered global route planning, wire/buffer sizing, layer assignment, and pin assignment. In the past, each of these optimizations has been considered in isolation (see, e.g., Albrecht [TCAD’01], Dragan et al. [ICCAD’00, ASPDAC’01], Cong et al. [ICCAD’99], Tang&Wong [ISPD’00], Alpert et al. [DAC’01]). Experimental results show that our method significantly outperforms approaches based on cascading individual optimizations such as the recent RABID algorithm of Alpert et al. Future work aims to incorporate in our implementation practical improvements such as the use of unevenly sized tiles, window constraints on buffer usage (as opposed to tile constraints), and faster-converging weight-updating rules.

GSRC bookshelf

A reference implementation of the multicommodity flow based algorithm described in this poster will soon be added to the GSRC bookshelf. The number of packages contributed to the GSRC bookshelf is continuously increasing. Among the recent additions are reference implementations of three rectilinear minimum spanning tree algorithms and an experimental comparison of their performance (authors A.B. Kahng and I. Mandoiu, with code contributed in part by L. Scheffer). To learn more about the Bookshelf and how to become a contributor, see http://www.gigascale.org/bookshelf/
