3. Low Power Estimation

8/8/2019 3.Low Power Estimation

1/82

Chapter 3 Power Estimation

Simulation based

Vectors are given, the circuit is known, and simulation is performed. The instantaneous currents are averaged.

Probabilistic analysis based

Averaging of the inputs is performed first, and probabilistic measures are extracted.


2/82


3/82

Power is also dissipated due to glitching

activity in a circuit. Glitches occur due to

different delays through different paths of

the circuit.

A hazardous transition occurs at the output of the AND gate due to different delays through two different paths converging at the inputs to the AND gate.


4/82


5/82

The glitches can die while propagating

through a logic gate if the width of the

glitch is much smaller than the inertial delay

of the logic gate.


6/82

3.1 Modeling of Signals

Stochastic process: Let $g(t)$, $t \in (-\infty, +\infty)$, be a stochastic process that takes the values of logical 0 or logical 1, transitioning from one to the other at random times.

Strict-sense stationary (SSS): A stochastic

process is said to be strict-sense stationary if its

statistical properties are invariant to a shift of the

time origin. More importantly, the mean of such a

process does not change with time.


7/82

Mean ergodic: If a constant-mean process $g(t)$ has a finite variance and is such that $g(t)$ and $g(t + \tau)$ become uncorrelated as $\tau \to \infty$, then $g(t)$ is mean ergodic.

Definition 3.1 (Signal Probability) The signal probability of signal $g(t)$ is given by

$P(g) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} g(t)\,dt$


8/82

Definition 3.2 (Signal Activity) The signal activity of a logic signal $g(t)$ is given by

$A(g) = \lim_{T \to \infty} \frac{n_g(T)}{T}$

where $n_g(T)$ is the number of transitions of $g(t)$ in the time interval between $-T/2$ and $+T/2$.


9/82

If the primary inputs to the circuit are modeled as mutually independent SSS mean-ergodic 0-1 processes, then the probability of signal g(t) assuming the logic value 1 at any given time t becomes a constant, independent of time. It is referred to as the equilibrium signal probability of the random quantity g(t) and is denoted by P(g = 1), which we refer to simply as signal probability.

Hence, A(g) becomes the expected number of transitions per unit time.
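
As a rough illustration of Definitions 3.1 and 3.2, the limits can be approximated over a finite observation window of uniformly spaced samples. The sketch below is not from the slides: the function names, the sample waveform, and the 1 ns sampling step are all hypothetical.

```python
from typing import Sequence


def signal_probability(samples: Sequence[int]) -> float:
    """Fraction of samples at logic 1: a finite-window estimate of P(g)."""
    return sum(samples) / len(samples)


def signal_activity(samples: Sequence[int], dt: float) -> float:
    """Transitions per unit time over the window: a finite-window estimate of A(g)."""
    transitions = sum(1 for a, b in zip(samples, samples[1:]) if a != b)
    window = dt * (len(samples) - 1)  # time spanned by the samples
    return transitions / window


# Hypothetical waveform sampled every 1 ns (assumption, for illustration only).
wave = [0, 0, 1, 1, 1, 0, 1, 0, 0, 1]
p = signal_probability(wave)        # 5 of the 10 samples are 1
a = signal_activity(wave, dt=1e-9)  # 5 transitions over a 9 ns window
```

Longer windows bring both estimates closer to the limits in the definitions, provided the signal is stationary and mean ergodic as assumed above.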


10/82


11/82

3.2 Signal Probability Calculation

Inputs: Signal probabilities of all the inputs to a circuit

Output: Signal probabilities for all nodes of the circuit

Step 1: For each input signal and gate output in the circuit, assign a unique variable.

Step 2: Starting at the inputs and proceeding to the outputs, write the expression for the output of each gate as a function of its input expressions.


12/82

Step 3: Suppress all exponents in a given expression to obtain the correct probability for that signal.

Reconvergent fanout can produce expressions for the signal probability of internal nodes having exponents greater than 1. Intuitively, in probability expressions involving independent primary inputs, such exponents cannot be present.


13/82

Let f be written in a canonical sum of products of primary inputs as follows:

$f = \sum_{i=1}^{p} \left( \prod_{k=1}^{n} s_k \right)$, where $s_k$ is either $x_k$ or $\bar{x}_k$.

Since the product terms inside the summation are mutually exclusive and the primary inputs are mutually independent, we have

$P(f) = \sum_{i=1}^{p} \left( \prod_{k=1}^{n} P(s_k) \right)$. This expression is defined as the canonical sum of probability products of f.


14/82

$P(x_k) P(\bar{x}_k) = P(x_k) (1 - P(x_k))$

$= P(x_k) - P^2(x_k)$

$= P(x_k) - P(x_k)$ (exponent suppressed)

$= 0$
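
The three-step procedure with exponent suppression can be sketched by representing each probability expression as a polynomial whose monomials are sets of variables, so that multiplying x by x collapses back to x. This is a minimal sketch under that representation, not the slides' own implementation; every function name here is hypothetical.

```python
from collections import defaultdict

# A polynomial maps a frozenset of variable names (one monomial, with every
# exponent already suppressed to 1) to its integer coefficient.


def var(name):
    return {frozenset([name]): 1}


def const(c):
    return {frozenset(): c}


def add(p, q, sign=1):
    out = defaultdict(int)
    for m, c in p.items():
        out[m] += c
    for m, c in q.items():
        out[m] += sign * c
    return {m: c for m, c in out.items() if c != 0}


def mul(p, q):
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[m1 | m2] += c1 * c2  # set union: x * x collapses to x
    return {m: c for m, c in out.items() if c != 0}


def p_not(p):
    return add(const(1), p, sign=-1)  # 1 - p


def p_and(p, q):
    return mul(p, q)  # p * q


def p_or(p, q):
    return add(add(p, q), mul(p, q), sign=-1)  # p + q - p * q


def evaluate(poly, probs):
    total = 0.0
    for monomial, coeff in poly.items():
        term = coeff
        for v in monomial:
            term *= probs[v]
        total += term
    return total


# x AND (NOT x): with exponents suppressed the expression vanishes.
x = var("x")
contradiction = p_and(x, p_not(x))

# Reconvergent fanout: f = (a AND b) OR (a AND c), where a fans out twice.
a, b, c = var("a"), var("b"), var("c")
f = p_or(p_and(a, b), p_and(a, c))
prob_f = evaluate(f, {"a": 0.5, "b": 0.5, "c": 0.5})
```

Without the set union in `mul`, the reconvergent example would contain a term in a squared and give the wrong probability.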


15/82

3.3 Probabilistic Techniques for

Signal Activity Estimation


16/82

3.3.1 Switching Activity in Combinational Logic

The Boolean difference of $f_j$ with respect to $x_i$ is defined as follows:

$\partial f_j / \partial x_i = f_j|_{x_i=1} \oplus f_j|_{x_i=0}$, where $\oplus$ denotes the exclusive-or operation.

The Boolean difference signifies the condition under which output $f_j$ is sensitized to input $x_i$.


17/82

If the primary inputs $x_i$, $i = 1, \ldots, n$, to logic gate M are not spatially correlated, then the signal activity at output $f_j$ is given by

$A(f_j) = \sum_{i=1}^{n} P(\partial f_j / \partial x_i) \, A(x_i)$ (3.6)

$P(\partial f_j / \partial x_i)$ signifies the probability of sensitizing input $x_i$ to output $f_j$, while $P(\partial f_j / \partial x_i) \, A(x_i)$ is the contribution to the switching activity at output $f_j$ due to input $x_i$ only.


18/82


19/82

$\partial f_{and} / \partial x_1 = f_{and}|_{x_1=1} \oplus f_{and}|_{x_1=0} = x_2 \oplus 0 = x_2$

$A(f_{and}) = P(x_2) A(x_1) + P(x_1) A(x_2)$
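
The AND-gate result can be checked numerically by enumerating the Boolean difference over a truth table and applying Equation (3.6). The sketch below assumes spatially uncorrelated inputs, as the equation does; the function names and the numeric probabilities and activities are hypothetical.

```python
from itertools import product


def sensitization_probability(f, names, probs, xi):
    """Probability that f|xi=1 XOR f|xi=0 evaluates to 1."""
    others = [n for n in names if n != xi]
    total = 0.0
    for bits in product([0, 1], repeat=len(others)):
        env = dict(zip(others, bits))
        diff = f({**env, xi: 1}) ^ f({**env, xi: 0})
        if diff:
            weight = 1.0
            for n, b in env.items():
                weight *= probs[n] if b else 1.0 - probs[n]
            total += weight
    return total


def activity(f, names, probs, acts):
    """Equation (3.6): sum of sensitization probability times input activity."""
    return sum(
        sensitization_probability(f, names, probs, x) * acts[x] for x in names
    )


def and2(v):
    return v["x1"] & v["x2"]


# Hypothetical input statistics.
probs = {"x1": 0.5, "x2": 0.25}
acts = {"x1": 0.1, "x2": 0.2}
a_and = activity(and2, ["x1", "x2"], probs, acts)
# Matches P(x2) A(x1) + P(x1) A(x2) = 0.25 * 0.1 + 0.5 * 0.2
```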


20/82

Equation (3.6) fails to consider the effect of

simultaneous switching of signals at logic

gate inputs and, hence, can grossly

overestimate signal activity.


21/82

The output switching activity is zero


22/82

3.4 Statistical techniques

The circuit is simulated repeatedly using a

logic simulator and the switching activities

at various nodes are noted.

Statistical mean estimation techniques are

used in determining the stopping criteria in

the Monte Carlo simulations.


23/82

3.4.1 Estimating Average Power in Combinational Circuits

Burch et al. experimentally determined that the power consumed by a circuit over a period T has a normal distribution.

Let p and s be the measured average and the standard deviation of a random sample of N power measurements, each taken over a period T. Then with $(1 - \alpha) \times 100\%$ confidence we can write the following inequality:

$|p - P_{avg}| < t_{\alpha/2} \, s / \sqrt{N}$


24/82

where $t_{\alpha/2}$ is obtained from a t-distribution with N - 1 degrees of freedom and $P_{avg}$ is the true average power.

$|p - P_{avg}| / p < t_{\alpha/2} \, s / (p \sqrt{N}) < \epsilon$

$\epsilon$: the desired percentage error for the given confidence level $(1 - \alpha) \times 100\%$.
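
The inequality above translates directly into a Monte Carlo stopping test: keep sampling until the bound on the relative error drops below the desired percentage error. The sketch below is one possible reading, not the procedure of Burch et al.; the function names and sample values are hypothetical, and the t-distribution value is passed in by the caller (for example from a t-table) rather than computed.

```python
import math
import statistics


def relative_error_bound(samples, t_crit):
    """t * s / (p * sqrt(N)) for the power samples collected so far."""
    n = len(samples)
    p = statistics.mean(samples)
    s = statistics.stdev(samples)  # sample standard deviation (N - 1 denominator)
    return t_crit * s / (p * math.sqrt(n))


def converged(samples, t_crit, max_rel_error):
    """True when the relative error bound is below the desired error."""
    return relative_error_bound(samples, t_crit) < max_rel_error


# Hypothetical power samples (arbitrary units). 3.182 is the two-sided 95%
# t value for 3 degrees of freedom, taken from a standard t-table.
power_samples = [1.0, 1.1, 0.9, 1.0]
bound = relative_error_bound(power_samples, t_crit=3.182)
done = converged(power_samples, t_crit=3.182, max_rel_error=0.05)
```

Because the t value depends on N, a real loop would look it up again each time a sample is added.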


25/82

3.4.2 Estimating Average Power in Sequential Circuits

The basic idea of Monte Carlo methods for estimating the activity of individual nodes is to simulate a circuit by applying random-pattern inputs. The simulation is considered converged when the activities of the individual nodes satisfy some stopping criteria.


26/82


27/82


28/82

3.5 Estimation of Glitching Power

Static Hazard: A static hazard is defined as the possible occurrence of a transient pulse on a signal line whose static value is not supposed to change.

Dynamic Hazard: A dynamic hazard is the possible occurrence of a spurious transition during the occurrence of a functional 0 → 1 or 1 → 0 transition.


29/82

Three-valued logic simulation for AND gate

AND   0   1   X
0     0   0   0
1     0   1   X
X     0   X   X


30/82

Logic simulation can be used to detect

probable static hazards by using a six-

valued logic.

The estimate is pessimistic because some of

these hazards might not be present under

certain delay conditions.


31/82

Six-valued logic for static hazard analysis

Logic   Representation     Bit sequence from t to t+1
0       static 0           000
1       static 1           111
R       rising             0U1
F       falling            1U0
SH0     static 0 hazard    0U0
SH1     static 1 hazard    1U1


32/82

AND Operation with Six-Valued Logic

AND   0   1     R     F     SH0   SH1
0     0   0     0     0     0     0
1     0   1     R     F     SH0   SH1
R     0   R     R     SH0   SH0   R
F     0   F     SH0   F     SH0   F
SH0   0   SH0   SH0   SH0   SH0   SH0
SH1   0   SH1   R     F     SH0   SH1
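
The six-valued AND table can be reproduced by applying a three-valued AND position by position to the bit sequences of the two operands, treating U as the unknown value. The sketch below assumes that reading of the bit sequences; the names `SEQ`, `NAME`, `and3` and `and6` are hypothetical.

```python
# Bit sequences of the six logic values, as listed for static hazard analysis.
SEQ = {"0": "000", "1": "111", "R": "0U1", "F": "1U0", "SH0": "0U0", "SH1": "1U1"}
NAME = {seq: name for name, seq in SEQ.items()}


def and3(a, b):
    """Three-valued AND on the symbols '0', '1' and 'U' (unknown)."""
    if a == "0" or b == "0":
        return "0"
    if a == "1" and b == "1":
        return "1"
    return "U"


def and6(a, b):
    """Six-valued AND: AND the two bit sequences position by position."""
    bits = "".join(and3(p, q) for p, q in zip(SEQ[a], SEQ[b]))
    return NAME[bits]


# A rising input ANDed with a falling input may glitch at a static 0 output.
result = and6("R", "F")  # "SH0"
```

The same approach should carry over to other gates by swapping `and3` for the corresponding three-valued operator.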


33/82

{1000, 1100, 1110} corresponding to fast,

medium, and slow falling signals.

Eight-valued logic is required for logic

simulation to detect dynamic hazard.


34/82

Eight-valued logic for dynamic hazard analysis

Logic   Representation      Bit sequence from t to t+1
0       static 0            0000
1       static 1            1111
R       rising              0001, 0011, 0111
F       falling             1000, 1100, 1110
SH0     static 0 hazard     0100, 0010, 0110
SH1     static 1 hazard     1011, 1101, 1001
DH0     dynamic 0 hazard    1010
DH1     dynamic 1 hazard    0101
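
The eight values partition all sixteen 4-bit sequences, and the class follows from the first bit, the last bit, and the number of transitions in between. The sketch below is a hypothetical classifier built on that observation, not something given on the slides.

```python
def classify(seq: str) -> str:
    """Map a 4-bit sequence such as '0110' to one of the eight logic values."""
    transitions = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    first, last = seq[0], seq[-1]
    if first == last:
        if transitions == 0:
            return first       # static "0" or "1"
        return "SH" + first    # pulse on a static line: static hazard
    if transitions == 1:
        return "R" if last == "1" else "F"  # clean rising or falling edge
    return "DH" + last         # extra transitions during an edge: dynamic hazard


labels = {format(i, "04b"): classify(format(i, "04b")) for i in range(16)}
```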


35/82

3.5.2 Delay Models

A circuit node where two reconvergent paths with different delays meet may have a large number of spurious transitions.

However, even in a tree-structured circuit with balanced paths there can be a large number of spurious transitions due to slight variations in delays.


36/82


37/82


38/82


39/82


40/82

3.5.2.1 Statistical Estimation

Delays are modeled as random variables and should be regenerated from time to time during the simulation.

Whenever we generate a new set of delays, they correspond to another die, or to the same die under different operating conditions such as temperature and power supply voltage.


41/82

Activity a: a = F(PI, D), where PI is the set of primary input vectors and D is a random vector consisting of all the random variables of gate delays.


42/82


43/82


44/82

The difference is due to the

glitching activity


45/82

While the different nonzero-delay models do track each other (except for one circuit, C6288, which has a depth of about 120 levels), it is clear that the nonzero-delay models can produce very different results compared to the zero-delay case. The difference is due to the glitching activity.


46/82

For some circuits minimum and maximum

average power can vary widely if uncertain

specifications of primary inputs exist.


47/82


48/82

The delay mismatch of different

paths causes spurious transitions.


49/82

They are 20 times greater than those

obtained using the zero-delay model.


50/82

3.8 Power Dissipation in Domino CMOS

Domino logic circuits do not have direct-path short-circuit currents except when static pull-up devices are used to moderate the charge redistribution problem or when clock skew is not well controlled.


51/82


52/82


53/82


54/82

[Figure 3.36: CMOS gate y = (x1 + x2)x3. Signal B (fixed) drives x1 and signal A (varying) drives x2.]


55/82

3.10 Power Estimation at the Circuit Level

The gate presents a variable capacitance to the power/ground rails. The magnitude of this capacitance depends on the logic values at the inputs to the gate.

Two signals A and B are to be connected to the two equivalent inputs x1 and x2 of the gate in Figure 3.36. If A has frequent transitions while B stays at zero, then A should be connected to x2 and B to x1, as this results in lower power consumption than the opposite assignment.


56/82

3.11 High Level Power

Estimation


57/82

The lower order bits of a word are essentially uncorrelated in space and time, with a signal probability of 0.5 and a switching activity of 0.25, and are essentially independent of the data distribution.

The higher order bits show complete dependence because they represent the sign extension in two's complement representation.


58/82

3.12 Information-Theory-Based Approaches

The output entropy of Boolean functions can be used to predict the average minimized area of CMOS combinational circuits.

If x is a random variable with a signal probability p, then the entropy of x is defined as

$H(x) = p \log_2 \frac{1}{p} + (1 - p) \log_2 \frac{1}{1 - p}$


59/82

For a discrete variable x, which can take n different values, the entropy is defined as

$H(x) = \sum_{i=1}^{n} p_i \log_2 \frac{1}{p_i}$, where $p_i$ is the probability that x takes the ith value $x_i$.

Given input signal probabilities of 0.5, the output entropy of the Boolean function can be used to predict the area of its average minimized implementation as

$A = K \cdot \frac{2^n}{n} \cdot H(Y)$, where A is the area of the implementation, K is the proportionality constant, and Y = f(X).
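
The entropy definitions and the area estimate are straightforward to evaluate. The sketch below is illustrative only: the function names are hypothetical, and the proportionality constant K is set to 1 purely as a placeholder, since the slides do not give a value.

```python
import math


def binary_entropy(p):
    """H(x) for a signal with probability p of being 1."""
    if p in (0.0, 1.0):
        return 0.0  # a constant signal carries no information
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))


def entropy(probs):
    """H(x) for a discrete variable taking value i with probability probs[i]."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)


def estimated_area(n_inputs, output_entropy, k=1.0):
    """A = K * (2^n / n) * H(Y); k=1.0 is a placeholder, not a calibrated value."""
    return k * (2 ** n_inputs / n_inputs) * output_entropy


# Two-input AND with both inputs at probability 0.5: the output is 1 with
# probability 0.25.
h_and = binary_entropy(0.25)
area_and = estimated_area(2, h_and)
```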


60/82


61/82


62/82

RTL: Register Transfer Level

A high level technique to estimate power can be used in the following three steps:

Determine the input/output entropies of the combinational logic blocks by running RTL simulation of the sequential circuit.

From the input/output entropies, determine the switching activity, area, and an estimate of average power.

Combine with latch and clock power to determine the total power dissipation.


63/82

Two approaches to determining lower

bounds of maximum dynamic power in

static CMOS circuits:

deterministic (automatic test generation

based) and

simulation-based approaches.


64/82

The instantaneous power dissipation due to

two consecutive input binary vectors is

proportional to

$P_i = \sum_{\text{all gates } g} T(g) \cdot C(g)$, where C(g) denotes the output capacitance of gate g and T(g) is a binary variable that indicates whether gate g switches or not in response to the two input vectors.


65/82

To justify the transition [i.e., to see if T(*) = 1 is achievable], the modified justification mechanism in a 5-V D algorithm (an ATG algorithm for stuck-at faults) is used.

In CMOS circuits, the capacitive load of a logic gate can be approximated by the fanout of the gate.

$P_i = \sum_{\text{all gates } g} [g(V_1) \oplus g(V_2)] \cdot F(g)$, where $V_1$, $V_2$ denote two consecutive input binary vectors to the circuit, g(*) represents the Boolean function of gate g in terms of the PIs, and F(g) denotes the number of fanouts of gate g.
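
For a given pair of vectors, the fanout-weighted switching measure can be evaluated with two zero-delay evaluations of the netlist. The sketch below is a hypothetical illustration: the netlist, net names and vectors are made up, the gates are assumed to be listed in topological order, and a gate that drives no other gate is assumed to have a fanout of 1 (a primary output load).

```python
import operator


def evaluate(netlist, inputs):
    """Zero-delay evaluation; gates must be listed in topological order."""
    values = dict(inputs)
    for gate, (func, fanins) in netlist.items():
        values[gate] = func(*(values[n] for n in fanins))
    return values


def fanout_counts(netlist):
    """Number of gate inputs driven by each gate (at least 1, by assumption)."""
    counts = {gate: 0 for gate in netlist}
    for _func, fanins in netlist.values():
        for n in fanins:
            if n in counts:
                counts[n] += 1
    return {gate: (c or 1) for gate, c in counts.items()}


def switching_cost(netlist, v1, v2):
    """Sum over gates of [g(V1) XOR g(V2)] * F(g)."""
    before, after = evaluate(netlist, v1), evaluate(netlist, v2)
    fanout = fanout_counts(netlist)
    return sum((before[g] ^ after[g]) * fanout[g] for g in netlist)


# Hypothetical three-gate circuit.
netlist = {
    "n1": (operator.and_, ("a", "b")),
    "n2": (operator.or_, ("n1", "c")),
    "n3": (operator.xor, ("n1", "n2")),
}
v1 = {"a": 0, "b": 1, "c": 0}
v2 = {"a": 1, "b": 1, "c": 0}
cost = switching_cost(netlist, v1, v2)  # n1 and n2 toggle, n3 does not
```

A simulation-based search for a lower bound on maximum power would evaluate this cost over many vector pairs and keep the largest.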


66/82

The justification mechanism in the algorithm includes two processes: backtracing and implication.

Two composite values are said to be in conflict if they have 0 and 1 at the same position.

Experiments show the test generation approach is superior to the traditional simulation-based technique in both efficiency and the quality of the results.


67/82

[Fig 7.6: The D algorithm, sensitization step]


68/82

Each gate g is associated with a stack to store all the composite logic values a/b that have been assigned to g [a and b denote g(V1) and g(V2), respectively]. The variables a and b can be 1, 0, or u (unknown). At each gate, the top of the stack stores the most recently updated value for the gate.


69/82

After assigning a rising transition

(0/1) to x, y(V2) is forced to be 1.


70/82


71/82

Circuit level simulation

Extract circuit netlist description from layout

Captures internal (diffusion) and external (wiring and gate fanout) capacitances

Run an analog simulation

Characterization of device models (nfets, pfets)

Solution of a large system of equations, so very computationally intensive (practical for fewer than a few thousand transistors)

Can accurately estimate (within a few %) dynamic and leakage power dissipation

HSPICE, Spectre (Cadence), PowerMill (Synopsys)


72/82

Gate level simulation

Perform logic simulation to obtain the switching events for each net (signal)

Logic description in structural VHDL or Verilog

Zero-delay or unit-delay timing models

Determine the frequency of each net, $f_y = t_y / (2T)$, where $t_y$ is the number of logic switches of net y and T is the simulation time, to compute dynamic power

$P_{dyn} = \sum_y C_y V_{DD}^2 f_y$

Pre-layout so must estimate Cy
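
Once logic simulation has produced a toggle count per net, the dynamic power sum is a short loop. The sketch below is illustrative: the function name, net names, toggle counts, capacitances, supply voltage and simulation time are all hypothetical.

```python
def dynamic_power(toggles, caps, vdd, sim_time):
    """Sum of C_y * VDD^2 * f_y over nets, with f_y = t_y / (2 * T)."""
    total = 0.0
    for net, t_y in toggles.items():
        f_y = t_y / (2.0 * sim_time)  # two switches make one full cycle
        total += caps[net] * vdd ** 2 * f_y
    return total


# Hypothetical simulation results over a 1 microsecond run.
toggles = {"n1": 200, "n2": 50}          # logic switches per net
caps = {"n1": 10e-15, "n2": 20e-15}      # estimated net capacitances in farads
p_dyn = dynamic_power(toggles, caps, vdd=1.0, sim_time=1e-6)  # watts
```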


73/82

Gate internal and leakage power

Use gate characterization (E(g, e)) and the logic simulation event count (f(g, e)) to calculate the gate's dynamic internal power (short circuit and charging/discharging of internal capacitors)

$P_{int} = \sum_g \sum_e E(g, e) \, f(g, e)$

During simulation, record the fraction of time T(g, s)/T that each gate g stays in a particular state s to calculate leakage power

$P_{leak} = \sum_g \sum_s E(g, s) \, T(g, s) / T$, where E(g, s) is the characterized leakage power of gate g in state s


74/82

Capacitance estimation

Device (diffusion and gate) capacitance

Depends on the width/length of the driving gate's source/drain diffusion and of the fanout gates

Part of characterization of cell based designs

Wiring capacitance

Depends on placement and routing

Wire load model: predict the wire length of a net from the number of pins incident to the net

Mapping table can be constructed from historical data of existing designs


75/82


76/82

Gate level simulation considerations

Simulation vectors need to be chosen carefully (application dependent)

Internal power really depends on operating voltage, temperature, process, → multidimensional characterization

Accuracy within 5-10% of HSPICE

Signal glitches may not be modeled precisely

(glitches depend on delays in the circuit)


77/82

Gate level probabilistic analysis

For each internal net y, determine the signal probability of the net wrt the given signal probabilities of the primary inputs

From the signal probabilities, determine the transition density D(y) of each internal net y

Compute the total power

$P = \sum_y 0.5 \, C_y V_{DD}^2 D(y)$

Pre-layout so must estimate Cy


78/82

Determining signal probabilities

Signal probability definition

$P_1 = t_1 / (t_0 + t_1)$ and $P_0 = 1 - P_1$

Propagate the given statistical quantities

from the primary inputs to the internal

signal nets and outputs of the circuit

Propagate quantities using a probabilistic signal propagation model


79/82

Signal propagation model

Apply Shannon's decomposition to the n-input Boolean function $y = f(x_1, \ldots, x_n)$

$y = x_i f_{x_i} + !x_i f_{!x_i}$, where $f_{x_i}$ ($f_{!x_i}$) is the new Boolean function obtained by setting $x_i = 1$ ($x_i = 0$) in $f(x_1, \ldots, x_n)$

$P(y) = P(x_i f_{x_i}) + P(!x_i f_{!x_i}) = P(x_i) P(f_{x_i}) + P(!x_i) P(f_{!x_i})$

Apply recursively (note: $P(!x_i) = 1 - P(x_i)$)
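
The recursion can be sketched by fixing one input at a time to 1 and to 0 and weighting the two cofactor probabilities. This is a minimal illustration assuming independent inputs; the function name, the example function and the probabilities are hypothetical. It visits every input assignment, so it is only practical for small functions.

```python
def signal_probability(f, names, probs, fixed=None):
    """P(y) by recursive Shannon decomposition over the inputs in `names`."""
    fixed = fixed or {}
    if len(fixed) == len(names):
        return float(f(fixed))  # all inputs fixed: f is a constant 0 or 1
    xi = names[len(fixed)]      # next input to decompose on
    p_hi = signal_probability(f, names, probs, {**fixed, xi: 1})  # P(f with xi=1)
    p_lo = signal_probability(f, names, probs, {**fixed, xi: 0})  # P(f with xi=0)
    return probs[xi] * p_hi + (1.0 - probs[xi]) * p_lo


def example(v):
    """Hypothetical function y = x1 x2 + x3."""
    return (v["x1"] & v["x2"]) | v["x3"]


p_y = signal_probability(
    example, ["x1", "x2", "x3"], {"x1": 0.5, "x2": 0.5, "x3": 0.5}
)
```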


80/82

Determine transition density

For a transition (1-to-0 or 0-to-1) on $x_i$ to cause a transition on y, $f_{x_i} \oplus f_{!x_i} = 1$; this expression is the Boolean difference of y wrt $x_i$, denoted $dy/dx_i$

$P(dy/dx_i)$ is the probability that $dy/dx_i$ evaluates to 1 and $D(x_i)$ is the transition density of $x_i$

Then the total transition density of the net y is

$D(y) = \sum_i P(dy/dx_i) \, D(x_i)$
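
Putting the pieces of the gate level probabilistic analysis together, signal probabilities and transition densities can be propagated gate by gate and then fed into the total power sum of 0.5 C_y VDD^2 D(y) over the nets. The sketch below makes simplifying assumptions that are not from the slides: each gate treats its fanins as independent (so reconvergent fanout is ignored), gates are listed in topological order, and only gate output nets are counted. The netlist, names and numeric values are hypothetical.

```python
from itertools import product


def weight(nets, bits, prob):
    """Probability of one input assignment, assuming independent nets."""
    w = 1.0
    for n, b in zip(nets, bits):
        w *= prob[n] if b else 1.0 - prob[n]
    return w


def prob_and_density(f, fanins, prob, dens):
    """Return (P(y), D(y)) for one gate with output y."""
    p_y = sum(
        weight(fanins, bits, prob)
        for bits in product([0, 1], repeat=len(fanins))
        if f(*bits)
    )
    d_y = 0.0
    for i, xi in enumerate(fanins):
        others = fanins[:i] + fanins[i + 1:]
        p_sens = 0.0  # P(dy/dxi): the cofactors with xi=1 and xi=0 differ
        for bits in product([0, 1], repeat=len(others)):
            hi = bits[:i] + (1,) + bits[i:]
            lo = bits[:i] + (0,) + bits[i:]
            if f(*hi) != f(*lo):
                p_sens += weight(others, bits, prob)
        d_y += p_sens * dens[xi]
    return p_y, d_y


def total_power(netlist, prob, dens, caps, vdd):
    """Sum of 0.5 * C_y * VDD^2 * D(y) over all gate output nets."""
    prob, dens = dict(prob), dict(dens)
    power = 0.0
    for net, (f, fanins) in netlist.items():  # topological order assumed
        prob[net], dens[net] = prob_and_density(f, fanins, prob, dens)
        power += 0.5 * caps[net] * vdd ** 2 * dens[net]
    return power, dens


# Hypothetical circuit: n1 = a AND b, y = n1 OR c.
netlist = {
    "n1": (lambda a, b: a & b, ("a", "b")),
    "y": (lambda n1, c: n1 | c, ("n1", "c")),
}
in_prob = {"a": 0.5, "b": 0.5, "c": 0.5}
in_dens = {"a": 1e6, "b": 1e6, "c": 1e6}  # transitions per second
caps = {"n1": 10e-15, "y": 20e-15}        # farads
power, dens = total_power(netlist, in_prob, in_dens, caps, vdd=1.0)
```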


81/82

Gate level probabilistic analysis considerations

Computationally efficient

Must only compute signal probabilities and transition densities for each net to evaluate $P = \sum_y 0.5 \, C_y V_{DD}^2 D(y)$

Assumes correct signal probabilities are given for the primary inputs (if they are wrong, large errors are possible)

Gives average power dissipation values


82/82

Architectural level simulation

Perform RTL simulation to obtain the input activity for each major functional unit

Architectural description in behavioral VHDL or Verilog or C, C++

Energy characterization of functional units

Transition-sensitive energy models: system busses, ALUs, register file, pipeline registers

Analytical energy models: caches, DRAMs
