Multi-Supply Digital Layout


SAME 2001, November 15th


Abstract

In this paper, we describe the principle of a technique called "multi-supply digital layout". This technique allows reliable back-annotation between digital blocks that are not powered from the same supplies, within an analog top cell. The supplies do not need to have the same voltage levels, thanks to level shifters integrated within the digital layout for voltage adaptation. The technique is also applicable to systems where one supply can be turned off while another stays alive. It also optimizes the die size with no extra effort, shortens the layout phase and optimizes scan insertion and ATPG.

Index Terms – Layout, level shifter, back-annotation, scan, low power, multiple supplies, standard cells.

I. Introduction

The goal of this paper is to present why and how to make a multiple supply digital layout. We present a flow that covers all the steps from the RTL design down to the layout, using only standard CAD tools. We also compare this technique with the existing literature on the subject, and explain why it is best suited to our needs in terms of resulting area, layout development time, and scan test.

1. What is post-layout back-annotation?

It is the process of calculating the cell delays based

on the final routing, and putting these delays into

the cell models for simulation or static timing

analysis.

Back-annotation is needed in order to ensure that

the functionality is kept from RTL design down to

silicon. Back-annotated simulations and static

timing analysis allow the designer to ensure that all

the timing constraints of the design are met.

2. Example of timing constraints: the setup time

Usually, there are two levels of complexity for calculating the timing constraints of a flip-flop:

a) Before layout, when the clock is considered perfect (no skew),

b) After layout, when a clock skew shows up.

Dealing with a multiple supply layout adds another

level of complexity because we need level shifters

on some data and clock paths. The diagram below

summarizes the situation.

[Figure: setup-time diagram with level shifters on the data and clock paths]

Here δclk(i) is the delay from the clock root driver to the clock pin of flip-flop i, δck2q is the clock-to-output delay of the flip-flop, δd is the data path delay and Tclk is the clock period.
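With these symbols, the setup condition between a launching flip-flop 1 and a capturing flip-flop 2 can be sketched as follows. This is the usual textbook setup inequality, not a formula recovered from the original diagram; the setup time $t_{setup}$ of the capturing flip-flop is a symbol introduced here for the illustration.

```latex
\delta_{clk}(1) + \delta_{ck2q} + \delta_{d} + t_{setup} \;\le\; T_{clk} + \delta_{clk}(2)
```

Before layout, the clock is considered perfect, so $\delta_{clk}(1) = \delta_{clk}(2)$ and the condition reduces to $\delta_{ck2q} + \delta_{d} + t_{setup} \le T_{clk}$. After layout, the two clock delays differ (clock skew). In a multiple supply layout, the delay of any level shifter on the data path adds to $\delta_{d}$, and the delay of any level shifter on a clock branch adds to the corresponding $\delta_{clk}(i)$.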

In order for the layout tool to generate a balanced

clock tree, one needs to have a logical and a timing

model for the level shifters. The level shifters are

presented in section III.4.

3. Why do we need several power supplies in a design?

There are two reasons for using several power supplies, both of which apply to power management chips. This kind of circuit is very common in mobile phones, where it regulates and distributes the power supplies to the other chips in the telephone.

SAME 2001

Session 2: DEA METHODOLOGY

MULTI-SUPPLY DIGITAL LAYOUT

Regis Santonja, Motorola

Volker Wahl, Motorola

Toulouse

a) Reducing Power Consumption

Reducing the power consumption of portable devices such as mobile phones, PDAs or portable PCs has become one of the most important goals of the semiconductor industry. As explained in section II.3 of this paper, using several power supplies is one of the most effective techniques to reduce power consumption.

b) Interfacing circuits that operate at different voltage levels

The second reason for using several power supplies is to interface circuits that operate at different voltages. Power management chips include a variety of programmable functions (such as an audio codec, an ADC used to monitor the supply levels, a touch screen interface, a USB port, an RS232 port, etc.). The most effective technique is to have each of these functions controlled by logic powered from the same voltage as the one required by the function's interface.

A simplified example of how a power management chip can be at the heart of a multiple supply system is presented below: we have a processor with inputs and outputs operating at 1.8V and a core at 2.5V. The power management chip communicates through its serial interface (SPI) operating at 1.8V with an embedded real time clock powered from an external lithium cell at 3.2V.

This paper is organized as follows. In section II, we present the prior art in multiple supply layout and show why it is not adapted to our needs. In section III, we present our layout solution. In section IV, we present the design flow and how to integrate the analog level shifters in the digital flow. In section V, we present a program which generates scripts for Silicon Ensemble. In section VI, we present our multiple voltage clock tree solution. Finally, section VII presents a possibility for enhancing the flow in the future.

II. Prior Art

1. Interfacing circuits that operate at different voltage levels

On analog-oriented chips where several digital blocks powered from different supplies have to be laid out on the same silicon, the traditional approach was to design and lay out the digital blocks separately, place them as macro cells in the analog top cell of the chip, then use an analog router such as IC Craftsman to interconnect the blocks.

This method had the following disadvantages:

a) There was no way to use the parasitics of the inter-block connections to generate a standard SDF file for back-annotation.

b) Three digital layouts had to be done separately, with no way to globally re-order the scan chain.

c) Three tools and environments had to be used: Silicon Ensemble, Cadence Framework II (Virtuoso) and IC Craftsman.

d) Tools such as IC Craftsman and Virtuoso from Cadence are analog tools and are unfamiliar to most digital designers.

2. Sophisticated layout techniques found in the literature

The authors of [1], [2], [3], [4] and [5] have already proposed techniques to lay out multiple supply circuits. However, they start from a different situation: they have a single supply circuit and want to save power by increasing the number of its supplies. To do so, they split the circuit at the gate level and assign to each gate the power supply that best matches its timing requirements, without regard to the function implemented, so that a given function can be spread over several supplies. As the number of connections within a function is statistically much larger than the number of connections between functions, this method (called gate-level voltage scaling) generates a lot of routing between the supplies. Because of this, these authors have developed sophisticated techniques to minimise the routing. The drawback is that the placement algorithm has to be modified. For example, Chingwei Yeh and Yin-Shuin Kang in [1] and [4] have proposed a modification of simulated annealing that introduces a new cost function associated with voltage clustering.

These methods cannot be used for our designs, as we are required to use standard CAD tools.

3. How can we reduce power by using multiple supplies?

This technique, called gate-level voltage scaling, consists of using a low supply voltage for the parts of the circuit that can tolerate the resulting degradation in transistor performance, and keeping a higher voltage level for the critical paths of the circuit. Lowering the voltage is the most effective technique for reducing CMOS power consumption because the dynamic power is proportional to the square of the supply voltage.
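The quadratic dependence can be illustrated with a short calculation. This is a minimal sketch based on the usual first-order expression for CMOS switching power (activity × C × Vdd² × f); the function names are hypothetical, and the 1.8 V and 2.5 V values are only an example, not measurements from our chip.

```python
def dynamic_power(c_load_f, vdd_v, freq_hz, activity=1.0):
    """First-order switching power of a CMOS node: activity * C * Vdd^2 * f."""
    return activity * c_load_f * vdd_v ** 2 * freq_hz


def power_ratio(vdd_low_v, vdd_high_v):
    """Dynamic power at the low supply relative to the high supply,
    all other parameters (capacitance, frequency, activity) being equal."""
    return (vdd_low_v / vdd_high_v) ** 2


# Hypothetical example: the same logic powered from 1.8 V instead of 2.5 V.
ratio = power_ratio(1.8, 2.5)
print(f"Dynamic power at 1.8 V is {ratio:.0%} of the power at 2.5 V")
```

With these example values, the dynamic power is roughly halved, which is why only the timing-critical parts are kept on the higher supply.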

4. What about clock distribution?

Many papers have been published since 1990 about generating a zero skew clock tree [7]. Various algorithms have been proposed for single supply as well as for dual supply circuits [2] [8]. However, in [2], Usami et al. propose a clock tree structure where the leaves have to be in the low voltage region: the tree does not reach the flip-flops in the other region.

In section VI, we propose a technique allowing a given clock tree to drive flip-flops in both low and high voltage regions.

III. Our layout solution

1. Supply isolation within the epi

Because we are in a mixed-signal environment, we have to pay attention to transitions in the digital domain that might generate switching noise on sensitive analog blocks. For this reason, the digital section has to be surrounded by an isolation ring. In the same manner, we isolate the digital blocks operating at different voltages from each other, especially if they do not share the same ground. The picture below represents two inverters. Without the isolation, vss1 and vss2 would be shorted together.

Note that there is a minimum ring width and a minimum distance required between the rings.

2. Layout style

In contrast to the prior art, our starting point is to develop a chip which is already, by nature, a multiple supply circuit. In fact, we could say that another type of voltage scaling technique (architecture voltage scaling) was used at the system level, resulting in the definition of a chip in which every function (control, real time clock, SPI interface, etc.) has been assigned to a voltage supply. For this reason, we do not encounter the same routing issues as these authors. Thus, our layout solution has the following advantages:

• it is the simplest,

• it works fine with standard cell-based layout tools (no need to modify the placement algorithm),

• it includes all the necessary level shifters,

• it makes it easy to isolate the voltage regions from each other with a negligible impact on the overall area,

• cells can be abutted in each voltage region as in usual single-supply layouts.

These last two points can result in significant area savings compared to the prior art. Moreover, compared with section II.1, the listed disadvantages have disappeared:

a) We can now generate a single standard SDF file for back-annotation. All the inter-region connections are taken into account.

b) Only one digital layout has to be done, with the possibility to globally re-order the scan chain.

c) Only one layout tool is used: Silicon Ensemble, and no analog tool.

d) Silicon Ensemble is familiar to most digital designers.

In practice, we grouped the cells powered by the

same voltage in 3 voltage regions, as presented

below. Note that the three regions are separated by

the necessary isolation ring.


Two issues have to be taken into account when a

signal goes from one voltage to another one:

3. The signal goes from a high voltage to a low voltage

The first issue that can show up is associated with antenna diodes, which can allow a static current to flow from the high to the low voltage region. Charge-collecting antennas are formed during wafer processing when an interconnect (field poly or metal) is connected to a poly gate that does not yet have an electrical connection to diffusion. A connection to diffusion is typically completed at the top level of metal, so conductors below the top level of metal are generally considered responsible for damage from collecting charge during plasma processing. Therefore, antenna area ratio design rules are commonly used in the semiconductor industry to ensure that the remaining charges do not damage circuits [6].

Many companies in the industry systematically add antenna diodes to their standard cells, connected to all the input pins of the gates. These antenna diodes are connected either to the supply (P-type diode) or to the ground (N-type diode), depending on the area cost for the cell.

As a consequence, the type of the diodes appears to be random, leading to the risk of a static current flowing from the higher to the lower voltage through a P-type diode, as presented on the right.

In order to avoid this leakage, we can take advantage of the cells which happen to have only N-type antenna diodes, such as all the simple buffers in the technology we used. The inserted cell has to be powered from the low supply, as presented in the figure below.

4. The signal goes from a low voltage to a high voltage

Whenever a gate has to drive the input of another gate operating at a higher voltage, a voltage conversion is needed at the interface. Connecting the low voltage signal directly to the high voltage gate is not acceptable, even though it would be the simplest solution. The simulation plot below shows this situation with two inverters, the first one operating at a lower supply than the second one. When a falling edge is presented at the input of the first inverter, there is a static current consumption in the second inverter because its PMOS is weakly turned on.

[Figure: simulation plot showing the input, the output and the current in the second inverter; annotated values: 130 mV and 50 µA]

The solution we adopted is to use a differential cascode voltage switch (DCVS), which we call a "level shifter" in this paper. However, a usual level shifter as presented in [3] has its output undefined whenever the input supply is turned off. For this reason, we have added a 2-input AND gate in order to force the output low, and an NMOS transistor in order to cut any current which could flow to the ground, as shown below. The NMOS and the AND gate are controlled by a signal which is low when the input voltage supply is switched off.
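The two crossing rules of sections III.3 and III.4 can be summarized in a small helper. This is a minimal sketch, not part of our flow: the `Crossing` class, the `interface_cell` function and the returned labels are hypothetical names, and the sketch only compares voltage levels (it ignores the case where two supplies have the same level but one of them can be turned off).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crossing:
    """A net that leaves one voltage region and enters another (hypothetical)."""
    net: str
    driver_vdd: float  # supply of the driving gate, in volts
    load_vdd: float    # supply of the receiving gate, in volts


def interface_cell(crossing: Crossing) -> str:
    """Return the kind of cell to insert on a net that crosses regions."""
    if crossing.driver_vdd > crossing.load_vdd:
        # High to low: a buffer that has only N-type antenna diodes,
        # powered from the low supply, avoids the P-type diode leakage path.
        return "n_diode_buffer_on_low_supply"
    if crossing.driver_vdd < crossing.load_vdd:
        # Low to high: a level shifter restores a full swing so that the
        # PMOS of the receiving gate is completely turned off.
        return "level_shifter"
    return "none"


# Hypothetical nets between a 3.2 V region and a 1.8 V region.
for crossing in (Crossing("rtc_irq", 3.2, 1.8), Crossing("spi_cs", 1.8, 3.2)):
    print(crossing.net, "->", interface_cell(crossing))
```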


IV. Design Flow and Libraries

The principle of the technique presented here is to avoid the need for analog tools and tool environments from RTL down to the layout. All the CAD tools have to be digital and standard. In order to stay in a purely digital environment, we had to write all the digital library views for the level shifters, just like those used for normal standard cells:

1. Verilog (HDL description)

The Verilog model of a standard level shifter is similar to that of a buffer. In our case, the model we used is similar to a 2-input AND gate. RTL design is performed as usual, without any reference to the power supplies. The level shifters are instantiated within the RTL code.
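The logical view of the gated level shifter can be sketched in a few lines. The sketch below is written in Python rather than Verilog and is not our library model; the names `level_shifter`, `data_in` and `supply_ok` are hypothetical. It only shows the 2-input AND behaviour: the control input is low when the input supply is switched off, which forces the output low.

```python
def level_shifter(data_in: int, supply_ok: int) -> int:
    """Logical view of the gated level shifter: a 2-input AND gate.

    supply_ok is 0 when the input supply is switched off, which forces
    the output to 0 whatever the value seen on data_in.
    """
    return data_in & supply_ok


# Truth table of the model: data_in, supply_ok, output.
for data_in in (0, 1):
    for supply_ok in (0, 1):
        print(data_in, supply_ok, level_shifter(data_in, supply_ok))
```

Because the model has no notion of voltage, synthesis, simulation and ATPG can all treat the cell as an ordinary AND gate.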

2. Design Compiler (Synthesis)

The level shifter's timing parameters (fall/rise slew rate and fall/rise transition delays) under all the necessary PVT (process, voltage and temperature) corners have been extracted from SPICE simulations. A Design Compiler .lib file has been generated and compiled to a .db file so that synthesis treats the level shifter as a standard cell.
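As an illustration of what "all the necessary PVT corners" means in terms of simulation runs, the corner list can be enumerated as below. This is only a sketch: the corner values are hypothetical placeholders, and a real level shifter would need one voltage axis for each of its two supplies.

```python
from itertools import product

# Hypothetical corner values; the real ones come from the technology and
# from the supply specification of each voltage region.
PROCESS = ("slow", "typical", "fast")
VOLTAGE_V = (1.62, 1.80, 1.98)
TEMPERATURE_C = (-40, 25, 125)


def pvt_corners(process=PROCESS, voltage=VOLTAGE_V, temperature=TEMPERATURE_C):
    """Return every (process, voltage, temperature) combination."""
    return list(product(process, voltage, temperature))


corners = pvt_corners()
print(len(corners), "simulation corners per timing arc")
```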

3. FastScan (ATPG)

A FastScan model of the level shifters has been generated, too, so that we can automatically generate scan patterns for the production test. FastScan does not need any timing information. The logical function is a 2-input AND gate, as for the Verilog model. From there, FastScan treats the level shifter as if it were a digital cell. Running ATPG is easier because we can read the complete design in FastScan, rather than generating a set of scan vectors for each region. In addition, the fault coverage is most probably higher.

4. Silicon Ensemble (Place&Route)

Silicon Ensemble needs 2 library files for the level shifter. The first one is the LEF file, which is a view of the layout of the cell. The second one is the TLF (Timing Library Format) file. It can be automatically derived from the Design Compiler library using the syn2tlf program provided by Cadence. The TLF file is needed by CT-Gen (the clock tree generator) in order to estimate the clock skew and the insertion delay of the clock tree.

Silicon Ensemble generates a post-layout netlist which includes the level shifters, and an RC file which lists all the capacitances and resistances of the routed nets. These two files can then be read by the delay calculator, which generates an SDF file used for the back-annotation. The delay calculator can be Design Compiler or PrimeTime from Synopsys, or any internal tool (quite often, foundries have their own golden delay calculator).
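To give an idea of how routed resistances and capacitances turn into a wire delay, the sketch below computes a first-order Elmore estimate for an unbranched net. This is an illustrative approximation chosen for this example, not the algorithm of any of the delay calculators mentioned above; the function name and the segment representation are assumptions.

```python
def elmore_delay(segments, c_load_f):
    """First-order (Elmore) delay of an unbranched RC line, in seconds.

    segments: list of (resistance_ohm, capacitance_f) pairs ordered from
    the driver to the load; each capacitance sits at the far end of its
    resistance. c_load_f is the input capacitance of the driven pin and
    is added to the last node.
    """
    delay_s = 0.0
    r_upstream = 0.0
    for index, (r_ohm, c_f) in enumerate(segments):
        r_upstream += r_ohm  # total resistance between the driver and this node
        is_last = index == len(segments) - 1
        c_node = c_f + (c_load_f if is_last else 0.0)
        delay_s += r_upstream * c_node
    return delay_s


# Hypothetical net: two 100 ohm / 10 fF segments driving a 20 fF input pin.
print(elmore_delay([(100.0, 10e-15), (100.0, 10e-15)], 20e-15))
```

A net that crosses two voltage regions is handled like any other net here, which is the point of keeping all regions in one layout database.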

V. Automated floorplan and placement

A small program has been developed in order to ease the floorplan generation. Based on the number of level shifters and the desired utilization percentage of each voltage region, it proposes a selection of floorplans with different aspect ratios. For each of them, it generates Silicon Ensemble scripts that initialize the floorplan, place the level shifters automatically and route the horizontal and vertical power stripes, as represented below.

Finally, the cells are gathered in groups, and each group is assigned to a region, so that the placement tool locates each cell in the correct region.
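The sizing step of such a program can be sketched as follows. This is not the program described above: the function name, its parameters and the area-based model are assumptions made for the illustration, and the generation of the Silicon Ensemble scripts themselves is left out.

```python
import math


def propose_floorplans(cell_area_um2, n_level_shifters, level_shifter_area_um2,
                       utilization, aspect_ratios=(0.5, 1.0, 2.0)):
    """Return (width_um, height_um) candidates for one voltage region.

    cell_area_um2: total area of the standard cells assigned to the region.
    utilization: desired fraction of the region filled by cells, in (0, 1].
    An aspect ratio is defined here as height / width.
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    used_area = cell_area_um2 + n_level_shifters * level_shifter_area_um2
    region_area = used_area / utilization
    candidates = []
    for ratio in aspect_ratios:
        width = math.sqrt(region_area / ratio)
        height = ratio * width
        candidates.append((round(width, 1), round(height, 1)))
    return candidates


# Hypothetical region: 7000 um2 of cells, 10 level shifters of 100 um2, 80% utilization.
for width, height in propose_floorplans(7000.0, 10, 100.0, 0.8):
    print(f"{width} um x {height} um")
```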

VI. Clock tree synthesis

The clock tree structure with dual supply voltages presented in [2] handles clock domains in which all the flip-flops are only allowed to operate at the low voltage while meeting the timing constraints.

We propose here a technique allowing a given clock tree to drive flip-flops in both low and high voltage regions. However, the clock tree generator is not allowed to place clock buffers in a voltage region which is different from that of the clock's root driver.

The level shifter's layout has been done in such a way that it looks like a standard cell's layout, except that it is "dual-rail", as shown on the right.


Indeed, we have to prevent a clock buffer from being placed in a voltage region that is turned off if the corresponding branch is supposed to drive functions that are in use (powered on). The dashed line on the diagram below symbolizes a dead branch of the clock tree, which makes some functions in voltage regions 1 and 3 fail if voltage 2 is turned off.
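The dead-branch problem can be checked on a small model of the clock tree. The sketch below is an illustration only: the tree representation, the function name `unclocked_flops` and the instance names are hypothetical, and every node without children is assumed to be a flip-flop.

```python
def unclocked_flops(tree, root, region_of, powered_off):
    """Return the powered flip-flops that lose their clock.

    tree: dict mapping the root and each clock buffer to its children.
    region_of: dict mapping every buffer and flip-flop to its voltage region.
    powered_off: set of regions whose supply is turned off.
    Nodes without children are treated as flip-flops.
    """
    dead = []

    def walk(node, alive):
        # A branch dies as soon as one node on it sits in a region that is off.
        alive = alive and region_of[node] not in powered_off
        children = tree.get(node, [])
        if not children:
            if not alive and region_of[node] not in powered_off:
                dead.append(node)  # powered flip-flop fed by a dead branch
            return
        for child in children:
            walk(child, alive)

    walk(root, True)
    return sorted(dead)


# Hypothetical tree: buf_b was placed in region 2 but also drives ff2 in region 3.
tree = {"root": ["buf_a", "buf_b"], "buf_a": ["ff1"], "buf_b": ["ff2", "ff3"]}
region_of = {"root": 1, "buf_a": 1, "buf_b": 2, "ff1": 1, "ff2": 3, "ff3": 2}
print(unclocked_flops(tree, "root", region_of, powered_off={2}))
```

In this example, ff2 is reported because its own region is powered while the buffer feeding it is not; ff3 is not reported because its region is off anyway.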

The correct placement of the clock tree buffers is handled in several steps, automated in a Unix shell script. There are as many CT-Gen runs as voltage regions. The diagram below presents an example of a clock tree generation in voltage region 3: all possible "holes" in the rows of regions 1 and 2 are filled with dummy filler cells. Then all cells in these regions are assigned the FIXED property in the DEF file (Silicon Ensemble ASCII database). Finally, CT-Gen is launched.

Once all the clock trees have been generated, the routing can be launched as for a usual layout, and the RC parasitics file can be generated in the standard way.
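The per-region sequence can be planned with a few lines of code. This is a sketch under stated assumptions, not our shell script: the function name and the data layout are hypothetical, and the tool-specific steps (filler insertion, editing the DEF file, launching CT-Gen) are deliberately left out.

```python
def clock_tree_runs(cells_by_region):
    """Plan one clock-tree run per voltage region.

    cells_by_region: dict mapping a region name to the list of its cells.
    For each run, every cell outside the target region is listed as fixed,
    so that the clock-tree generator can only add buffers in the target region.
    """
    runs = []
    for target in sorted(cells_by_region):
        fixed = sorted(
            cell
            for region, cells in cells_by_region.items()
            if region != target
            for cell in cells
        )
        runs.append({"target_region": target, "fixed_cells": fixed})
    return runs


# Hypothetical design with three voltage regions.
cells = {"region1": ["u1", "u2"], "region2": ["u3"], "region3": ["u4", "u5"]}
for run in clock_tree_runs(cells):
    print(run["target_region"], "fixed:", run["fixed_cells"])
```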

VII. Future enhancements

With the chosen flow, all voltage regions are back-annotated using the same PVT conditions, because only one SDF file is generated. A region could impose its own voltage range (best case, worst case) on the others, even if the latter have weaker voltage constraints. This problem could be eliminated by splitting the RC file, generating an SDF file for each region, and merging them together with a simple Perl script.
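The merging step could look like the sketch below. It is written in Python instead of Perl, it has not been part of our flow, and it makes simplifying assumptions: each input is a well-formed SDF text with a single DELAYFILE form, no parenthesis appears inside a quoted string or a comment, and the header of the first file is valid for the merged result. The function names are hypothetical.

```python
def top_level_entries(sdf_text):
    """Split the body of a (DELAYFILE ...) form into its direct children."""
    start = sdf_text.index("(DELAYFILE") + len("(DELAYFILE")
    entries, depth, begin = [], 0, None
    for pos in range(start, len(sdf_text)):
        char = sdf_text[pos]
        if char == "(":
            if depth == 0:
                begin = pos
            depth += 1
        elif char == ")":
            if depth == 0:
                break  # closing parenthesis of DELAYFILE itself
            depth -= 1
            if depth == 0:
                entries.append(sdf_text[begin:pos + 1])
    return entries


def merge_sdf(sdf_texts):
    """Merge per-region SDF texts: header of the first, CELL entries of all."""
    header = [e for e in top_level_entries(sdf_texts[0]) if not e.startswith("(CELL")]
    cells = [
        e
        for text in sdf_texts
        for e in top_level_entries(text)
        if e.startswith("(CELL")
    ]
    body = "\n".join("  " + entry for entry in header + cells)
    return "(DELAYFILE\n" + body + "\n)\n"


# Hypothetical per-region files, reduced to one cell each.
region1 = '(DELAYFILE (SDFVERSION "2.1") (CELL (CELLTYPE "BUF") (INSTANCE u1)))'
region2 = '(DELAYFILE (SDFVERSION "2.1") (CELL (CELLTYPE "INV") (INSTANCE u2)))'
print(merge_sdf([region1, region2]))
```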

VIII. References

[1] Chingwei Yeh and Yin-Shuin Kang, Cell-Based Layout Techniques Supporting Gate-Level Voltage Scaling for Low Power. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 8, No. 5, October 2000.

[2] Kimiyoshi Usami, Mitsunori Igarashi, Fumihiro Minami, Takashi Ishikawa, Masahiro Kanazawa, Makoto Ichida, and Kazutaka Nogami, Automated Low-Power Technique Exploiting Multiple Supply Voltages Applied to a Media Processor. IEEE Journal of Solid-State Circuits, Vol. 33, No. 3, March 1998.

[3] C. Yeh and M.-C. Chang, Gate-level voltage scaling for low-power design using multiple supply voltages. IEE Proc. Circuits Devices Syst., Vol. 146, No. 6, December 1999.

[4] Chingwei Yeh, Yin-Shuin Kang, Shan-Jih Shieh, and Jinn-Shyan Wang, Layout Techniques Supporting the Use of Dual Supply Voltages for Cell-Based Designs. Proceedings of the 36th Design Automation Conference, 1999.

[5] Yi-Jong Yeh and Sy-Yen Kuo, An Optimization-based low-power voltage scaling technique using multiple supply voltages. The 2001 IEEE International Symposium on Circuits and Systems (ISCAS 2001), Volume 5, 2001.

[6] Martin Polzl, A Strategy to Detect Charge Damaging Process Steps within a Multilayer Metallization Technology. 1997 2nd International Symposium on Plasma Process-Induced Damage.

[7] G. E. Tellez and M. Sarrafzadeh, Clock period constrained minimal buffer insertion in clock trees. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 1994.

[8] Jatuchai Pangjun and Sachin S. Sapatnekar, Clock Distribution Using Multiple Voltages. In Proceedings of the 1999 International Symposium on Low Power Electronics and Design, 1999.

[9] Alain Guyot and Sélim Abou-Samra, Low Power CMOS Digital Design. In ICM'98, December 14-16, 1998.

[10] Anantha P. Chandrakasan, Samuel Sheng, and Robert W. Brodersen, Low-Power CMOS Digital Design. IEEE Journal of Solid-State Circuits, Vol. 27, No. 4, April 1992.