Spring 2006 EE 5324 - VLSI Design II - © Kia Bazargan


EE 5324 – VLSI Design II

Kia Bazargan

University of Minnesota

Part VIII: Timing Issues


References and Copyright

• Textbooks referenced
    [Rab96] J. M. Rabaey, “Digital Integrated Circuits: A Design Perspective”, Prentice Hall, 1996.

• Slides used (modified by Kia when necessary)
    [©Prentice Hall] © Prentice Hall 1995, © UCB 1996. Slides for [Rab96]: http://bwrc.eecs.berkeley.edu/Classes/IcBook/instructors.html


Why Deal With Timing?

• Clock
    Makes sure signals are settled before being written
    Controls the order of operations

• Problem?
    Physical implementation of the circuit ≠ what we planned
    Why?
      o Wires incur delay on signals
      o Clock edge might arrive too early or too late

• Challenges
    Clock routing
    Synchronization protocols


Clock Skew

• Clock signal
    Connects to all registers/flip-flops
    Connects to all pre-charge/evaluate inputs of dynamic logic
    Huge fanout → large capacitive load
    Routed to all parts of the chip
    Huge capacitance of the clock net itself
    Example: Alpha processor: 3.24 nF (40% chip C)

• Clock skew
    Clock net has huge RC
    Signal arrival time depends on the distance of the destination from the source
    Not the “same” clock signal for different destinations

• Why important?
    Timing violated
    Larger chips even worse


Clock Wire Delay

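[Figure: clock wire modeled as a distributed RC line with resistance r and capacitance c, driven through source resistance Rs and loaded by CL.]

r = 0.07 Ω/□, c = 0.04 fF/µm² (Tungsten wire)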

[©Prentice Hall]
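To get a feel for why long clock wires are slow, the RC model on this slide can be evaluated numerically. The following is a minimal sketch, not taken from the slides: it uses the Elmore time constant for a driver, a uniform RC wire, and a lumped load, and it assumes the slide's values read as a sheet resistance of 0.07 Ω/□ and an area capacitance of 0.04 fF/µm². The function names and any wire dimensions passed in are hypothetical.

```python
def wire_rc(length_um, width_um, r_sheet=0.07, c_area=0.04e-15):
    """Total resistance (ohm) and capacitance (F) of a rectangular wire.

    r_sheet is in ohm/square and c_area in F/um^2 (assumed reading of the slide).
    """
    resistance = r_sheet * length_um / width_um   # number of squares times r_sheet
    capacitance = c_area * length_um * width_um   # wire area times c_area
    return resistance, capacitance


def elmore_delay(r_source, r_wire, c_wire, c_load):
    """Elmore time constant of driver + distributed RC wire + lumped load."""
    # The driver charges all capacitance; the distributed wire resistance
    # effectively sees half of the wire's own capacitance plus the full load.
    return r_source * (c_wire + c_load) + r_wire * (c_wire / 2 + c_load)
```

With an ideal driver and no load, the wire term grows with the square of the length, which is why skew gets worse on larger chips.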


Reference Circuit: Pipelined Datapath

• We use this circuit to analyze the problem

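[Figure: pipelined datapath from In to Out built from combinational logic blocks CL1, CL2, CL3 and registers R1, R2, R3; the clock reaches the three registers at times $t'$, $t''$, $t'''$. Delays annotated: $t_i$, $t_{l,\min}$, $t_{l,\max}$, $t_{r,\min}$, $t_{r,\max}$.]

Skew: $\delta = t'' - t'$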


Skew in Single-Phase Edge-Triggered Clocking

• Race between clock and data

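[Figure: R1 clocked by $\phi'$ at $t'$ and R2 clocked by $\phi''$ at $t'' = t' + \delta$; the earliest new data reaches R2 at $t' + t_{r,\min} + t_{l,\min} + t_i$.]

$\delta \le t_{r,\min} + t_{l,\min} + t_i$ (skew bound)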

[Rab96] p513


Skew in Single-Phase Edge-Triggered Clocking

• Data stable before clock applied

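[Figure: R1 clocked by $\phi'$ at $t'$ and R2 clocked by $\phi''$ at $t'' + T = t' + \delta + T$; the latest data reaches R2 at $t' + t_{r,\max} + t_{l,\max} + t_i$.]

$t'' + T \ge t' + t_{r,\max} + t_{l,\max} + t_i$

$T \ge t_{r,\max} + t_{l,\max} + t_i - \delta$ (clock period bound)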

[Rab96] p513
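The skew bound and the clock period bound can be checked together for a given stage. The sketch below is illustrative only: it reads $t_r$, $t_l$, and $t_i$ as register, logic, and interconnect delays (an interpretation of the slide's symbols), and the function names and the numbers in the usage note are made up.

```python
def max_skew(t_r_min, t_l_min, t_i):
    """Skew bound: delta <= t_r,min + t_l,min + t_i avoids the clock/data race."""
    return t_r_min + t_l_min + t_i


def min_clock_period(t_r_max, t_l_max, t_i, delta):
    """Clock period bound: T >= t_r,max + t_l,max + t_i - delta."""
    return t_r_max + t_l_max + t_i - delta


def check_timing(delta, period, t_r, t_l, t_i):
    """t_r and t_l are (min, max) pairs; returns (race_free, period_ok)."""
    race_free = delta <= max_skew(t_r[0], t_l[0], t_i)
    period_ok = period >= min_clock_period(t_r[1], t_l[1], t_i, delta)
    return race_free, period_ok
```

For example, with t_r = (0.2, 0.4), t_l = (0.5, 3.0), t_i = 0.1 (all in ns, hypothetical), a skew of 0.3 ns passes both checks at T = 3.5 ns, while a skew of 1.0 ns fails the race check no matter how large T is made.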


Clock Signal Direction

• Same direction as data: $\delta > 0$
    Skew constraint (bound) must be strictly controlled
      - : If the constraint is not met, even reducing the clock frequency would not help!
      + : Positive skew increases throughput (the minimum clock period shrinks by $\delta$; see “clock period bound”)
        o Not worth it: high risk

• Opposite direction as data: $\delta < 0$
    Skew constraint always met
    Throughput decreases (the minimum clock period grows by $|\delta|$)


Skew in Two-Phase Master-Slave Clocking

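[Figure: pipelined datapath from In with combinational logic blocks CL1, CL2, CL3 and two-phase master-slave registers (masters M1, M2, M3; slaves S1, S2, S3).]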


Two-Phase Clock Timing

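[Figure: two-phase clock waveforms $\phi_1$, $\phi_2$, and the skewed $\phi_1'$ over one clock period $T$, showing the phase widths $T_{\phi 1}$ and $T_{\phi 2}$, the non-overlap times $T_{\phi 12}$ and $T_{\phi 21}$, and the clock overlap $\delta - T_{\phi 12}$. Marked events: new data applied to CL2; previous data latched into M2.]

$t_{\min} > \delta - T_{\phi 12}$

$t_{\max} < T + \delta - T_{\phi 12}$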

[©Prentice Hall]


Two-Phase vs. Single-Phase

• Comparing the skew bounds, $T_{\phi 12}$ acts as a buffer for the skew
    Skew can always be countered by increasing $T_{\phi 12}$

• Performance
    Increasing $T_{\phi 12}$ could mean longer clock periods

• Positive vs. negative skew
    Same as single-phase
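The trade-off between the non-overlap time and the clock period can be made concrete. This is a rough sketch that assumes the two-phase constraints read as $t_{\min} > \delta - T_{\phi 12}$ and $t_{\max} < T + \delta - T_{\phi 12}$; the function names are hypothetical and the returned values are the boundary cases of those inequalities.

```python
def min_nonoverlap(delta, t_min):
    """Boundary of t_min > delta - T12: the non-overlap time the skew demands."""
    return max(0.0, delta - t_min)


def min_period(delta, t_max, t12):
    """Boundary of t_max < T + delta - T12, i.e. T > t_max - delta + T12."""
    return t_max - delta + t12
```

Raising T12 to absorb more skew raises the boundary returned by `min_period` by the same amount, which is the performance cost noted on the slide.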


How to Counter Clock Skew Problems?

• Routing the clock in the opposite direction of data
    Local solution only, not always an option (see below)

• Controlling the non-overlap periods of the clock
    Only for 2-phase clocks
    Could decrease clock frequency

• Perform the routing of the clock such that skew is minimum

[Figure: datapath with feedback between In and Out, built from registers (Reg) and logic, with one clock path labeled Positive Skew and the other Negative Skew.]

[©Prentice Hall]


Clock Routing

[Figure: H-Tree Network distributing CLOCK.]

Observe: Only Relative Skew is Important

[Figure: Comb-Tree Network: CLOCK feeds a main clock driver, then secondary clock drivers, each serving a local area of modules.]

    Reduces absolute delay
    Makes power-down easier
    Sensitive to variations in buffer delay

[©Prentice Hall]
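The point that only relative skew matters can be illustrated with an idealized H-tree: every leaf sits at the same wire distance from the root, so the absolute delay is nonzero but identical everywhere. The sketch below is a geometric toy under that assumption (uniform wires, no buffers, no load mismatch); the function name and parameters are hypothetical.

```python
def h_tree(x, y, half, levels, horizontal=True, path=0.0):
    """Return [((x, y), wire_length_from_root), ...] for the leaves of an H-tree.

    Each level branches in two directions, alternating horizontal and vertical;
    the branch length halves after every vertical level.
    """
    if levels == 0:
        return [((x, y), path)]
    next_half = half if horizontal else half / 2
    leaves = []
    for sign in (-1.0, 1.0):
        nx = x + sign * half if horizontal else x
        ny = y if horizontal else y + sign * half
        leaves.extend(
            h_tree(nx, ny, next_half, levels - 1, not horizontal, path + half)
        )
    return leaves
```

Calling `h_tree(0.0, 0.0, 8.0, 4)` yields 16 leaves on a 4 x 4 grid, all with the same path length from the root, so the skew between any two leaves is zero in this idealized model.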


Example: DEC Alpha 21164

• Clock frequency: 300 MHz; 9.3 million transistors

• Total clock load: 3.75 nF

• Power in clock distribution network: 20 W (40% of the total!)

• Uses two-level clock distribution
    Single 6-stage driver at center
    Secondary buffers drive left and right side

• Clock grid in metal3 and metal4

[©Prentice Hall]
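A back-of-the-envelope estimate shows why a clock load of a few nF costs watts. The sketch below uses the standard switching-power relation P = C·V²·f for a net that makes one full charge/discharge cycle per clock period. The supply voltage is not given on the slide, so the 3.3 V in the usage note is purely an assumption, and the function name is hypothetical.

```python
def clock_switching_power(c_load, vdd, f_clock):
    """Dynamic power (W) of a net switching once per cycle: P = C * Vdd^2 * f."""
    return c_load * vdd ** 2 * f_clock
```

With the slide's 3.75 nF and 300 MHz and an assumed 3.3 V supply, `clock_switching_power(3.75e-9, 3.3, 300e6)` gives roughly 12 W for the load alone. This is an order-of-magnitude check, not an attempt to reproduce the 20 W quoted for the whole distribution network.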


DEC Alpha 21164

Clock Drivers

[©Prentice Hall]


DEC Alpha 21164: Clock Skew

[©Prentice Hall]


Self-Timed and Asynchronous Circuits

• Functions of clock in synchronous designs
    Acts as completion signal (data stable before latched)
    Ensures correct ordering of events
    Based on worst-case delay of the circuit

• Truly asynchronous design
    Completion is ensured by careful timing analysis
    Ordering of events is implicit in logic
    Very risky

• Self-timed design
    Completion ensured by a completion signal
    Ordering imposed by handshaking protocol
    “Local” solution to the timing problem
    Based on average delay of the circuit

[©Prentice Hall]


Example of Self-Timed Pipeline (Handshaking)

• “Start” and “done” signals ensure physical timing constraints are met

• Request/Acknowledge signals (aka handshaking protocol) ensure correct ordering of the operations

[Figure: self-timed pipeline from In through registers R1, R2, R3 and logic blocks CL1, CL2, CL3 with delays tCL1, tCL2, tCL3; each stage has a handshaking (HS) block that exchanges Req and Ack with its neighbors and connects to the stage's start and done signals.]


Self-Timed Circuits: Advantages and Disadvantages

• Advantages over synchronous:
    Timing signals generated locally
      o No clock routing problems
      o Saving in power consumption of the clock net
    Potential increase in performance
      o Separate physical and logical ordering mechanism
      o Self-timed: average, synchronous: worst-case
    Robust to variations (manufacturing + environment)

• Disadvantage:
    Larger area
      o Redundancy
      o Control circuit (handshaking)
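The “average vs. worst-case” advantage can be sketched with a toy comparison. This is an illustration only: the delay values and the fixed per-operation handshake overhead are hypothetical, and a real design would also have to account for completion-detection delay.

```python
def synchronous_time(delays):
    """Every operation takes one clock period, sized for the slowest case."""
    return len(delays) * max(delays)


def self_timed_time(delays, handshake_overhead=0.0):
    """Each operation takes its own delay plus a fixed handshake cost."""
    return sum(d + handshake_overhead for d in delays)
```

For the delay sequence `[2, 3, 2, 8, 2, 3]`, the synchronous version needs 48 time units while the self-timed version needs 20, or 23 with a handshake overhead of 0.5 per operation. The gain shrinks as the overhead grows or as the delays become more uniform.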


Completion Signal Generation Methods

• Delay module method
    Mimic the delay of the logic circuit using a separate delay element
    Not much area overhead
    Not aggressive in obtaining average speed
    Used in memories (internal timing)

• Dual-rail computation
    Use redundant signal representation
    Denote 1, 0, “in transition”

[Figure: logic network from In to Out, with a parallel delay module that turns start into done.]

B               B0   B1
In transition   0    0
0               0    1
1               1    0
Illegal         1    1


Completion Signal Generation: Redundant Code

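[Figure: dual-rail dynamic gate: two pull-down networks (PDN), precharged from Vdd under control of Start and driven by In1 and In2 (true and complemented), produce B0 and B1; Done is derived from B0 and B1.]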

[©Prentice Hall]


Redundant Signal Coding (cont.)

• When “start” is low
    Circuit precharged
    (B0, B1) in the “transition” state

• When “start” is high
    ONLY ONE of the pull-down networks evaluates
    Only one of the B0, B1 signals goes high

• “Done” defined as the OR of B0 and B1
    For an N-bit word, all “done” signals must be combined → more area, more delay
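The redundant coding and its completion signal can be modeled in a few lines. This is a behavioral sketch, not the circuit itself: it assumes the rail assignment 0 → (B0, B1) = (0, 1) and 1 → (1, 0) as read from the table on the earlier slide, and all function names are hypothetical.

```python
IN_TRANSITION = (0, 0)   # precharged state: neither rail has fired yet


def encode(bit):
    """Dual-rail code (assumed rail assignment): 0 -> (0, 1), 1 -> (1, 0)."""
    return (1, 0) if bit else (0, 1)


def decode(b0, b1):
    """Return 0 or 1, None while in transition, and reject the illegal state."""
    if (b0, b1) == (1, 1):
        raise ValueError("illegal dual-rail state (1, 1)")
    if (b0, b1) == IN_TRANSITION:
        return None
    return 1 if b0 else 0


def done(b0, b1):
    """Per-bit completion: OR of the two rails."""
    return bool(b0 or b1)


def word_done(pairs):
    """An N-bit word is complete only when every bit's done signal is high."""
    return all(done(b0, b1) for b0, b1 in pairs)
```

`word_done` is the software analogue of the N-input combination mentioned above; in hardware that combining logic is what costs the extra area and delay.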


Example: Self-Timed Adder

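[Figure: self-timed adder. (a) Differential carry generation: two carry chains precharged from VDD under control of Start; one uses P0 to P3 and G0 to G3 to produce C1 to C4 from C0, the other uses P0 to P3 and K0 to K3 to produce the complementary carries. (b) Completion signal: Done generated from Start and the true and complementary carries C1 to C4.]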

[©Prentice Hall]


Example: Self-Timed Adder (cont.)

• Dual evaluation network used only for the carry chain (critical path)

• Using K (kill) instead of G (generate) inverts the function

• “Done” evaluation assumed to be slower than sum evaluation

• Example:
    Self-timed: 0.23 nsec/bit, 3300 [?]²
    Synchronous: same delay, less area
    BUT, actual performance of self-timed substantially better (average vs. worst-case delays)
    Self-timed: O(log N) delay – similar to tree-structured synchronous
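The average-case behavior of the carry chain can be explored with a small Monte Carlo experiment. The sketch below is an assumption-laden illustration: it draws operand bits uniformly at random and measures the longest stretch a carry travels (one generate followed by consecutive propagates), which is what a self-timed adder would wait for. The function names, trial count, and seed are hypothetical.

```python
import random


def longest_carry_chain(a_bits, b_bits):
    """Longest carry run for bit lists given least significant bit first."""
    longest = run = 0
    for a, b in zip(a_bits, b_bits):
        if a and b:            # generate: a new carry starts here
            run = 1
        elif a != b:           # propagate: extends a live carry, if there is one
            run = run + 1 if run else 0
        else:                  # kill: the carry chain ends
            run = 0
        longest = max(longest, run)
    return longest


def average_longest_chain(n_bits, trials=2000, seed=0):
    """Average longest carry run over random n_bits-wide operand pairs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = [rng.randint(0, 1) for _ in range(n_bits)]
        b = [rng.randint(0, 1) for _ in range(n_bits)]
        total += longest_carry_chain(a, b)
    return total / trials
```

Comparing `average_longest_chain(n)` for n = 16, 32, 64 against n itself gives a feel for how far the average case sits below the worst case of a full-length ripple.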


Handshaking

• Protocol for the logical ordering of operations
    Avoid races
    Avoid hazards

• Extra hardware to implement
    State machine
    Queues possible

• Exact protocol depends on:
    Architecture
    Environment
    Must accommodate:
      o New data available (sender)
      o Request computation (sender)
      o Acknowledge receipt (receiver)
      o Ready for new computation (receiver)


Four-Phase Handshaking

[Figure: sender-receiver configuration: Sender and Receiver connected by Data, Req, and Ack. Timing diagram: Req, Data, and Ack over Cycle 1 and Cycle 2, with the sender's and the receiver's actions marked.]

[©Prentice Hall]
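The event order of a four-phase (return-to-zero) handshake can be written out explicitly. This is a minimal sketch of the standard protocol rather than a transcription of the slide's timing diagram; the event tuple layout and function names are hypothetical.

```python
def four_phase_transfer(data):
    """Yield (actor, signal, value) events for one four-phase transfer."""
    yield ("sender", "Data", data)   # data is valid before the request is raised
    yield ("sender", "Req", 1)       # phase 1: sender requests
    yield ("receiver", "Ack", 1)     # phase 2: receiver has taken the data
    yield ("sender", "Req", 0)       # phase 3: sender withdraws the request
    yield ("receiver", "Ack", 0)     # phase 4: receiver is ready again


def replay(events):
    """Apply events in order and return the final wire values."""
    wires = {"Req": 0, "Ack": 0, "Data": None}
    for _actor, signal, value in events:
        wires[signal] = value
    return wires
```

After one transfer both Req and Ack are back at 0, so the next transfer starts from the same state. That return-to-zero property is what distinguishes four-phase from two-phase signaling, at the cost of twice as many transitions per transfer.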


Event Logic: the Muller C-element

[©Prentice Hall]

(a) Schematic

[Figure: C-element symbol (C) with inputs A and B and output F.]

(b) Truth table

A   B   Fn+1
0   0   0
0   1   Fn
1   0   Fn
1   1   1

[Figure: static and dynamic implementations of the C-element.]
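The truth table translates directly into a small behavioral model: the output follows the inputs when they agree and holds its previous value otherwise. The class below is a sketch for experimentation; the class and method names are hypothetical.

```python
class CElement:
    """Behavioral model of a Muller C-element with output state F."""

    def __init__(self, initial=0):
        self.f = initial

    def update(self, a, b):
        """Apply inputs A and B and return the new output F(n+1)."""
        if a == b:          # both inputs agree: output follows them
            self.f = a
        return self.f       # inputs differ: output keeps its previous value
```

Starting from F = 0, the input sequence (0,1), (1,1), (1,0), (0,0) produces outputs 0, 1, 1, 0, which shows the hold behavior on the two mixed-input rows.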


Two-Phase Handshaking Implementation

[©Prentice Hall]

[Figure: implementation: Sender Logic and Receiver Logic connected by Data, using the signals Data Ready, Data Accepted, Req, and Ack and a C-element (C). Timing diagram: Req, Ack, and Data over cycle 1 and cycle 2, with the sender's and the receiver's actions marked.]

“Edge-sensitive” to HS signals

0 Data Ready (DR) = 1
1 Receiver: “ready for new data” (Ack)
2 Sender: “new data ready” (DR), Req
3 Receiver: “done, ready for new data” (Ack)


Example: Self-Timed FIFO

[©Prentice Hall]

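[Figure: self-timed FIFO: registers R1, R2, R3 between In and Out, each with an enable (En) controlled through C-elements (C); handshake signals Reqi/Acki on the input side and Reqo/Acko on the output side, with Done signals between stages. Timing traces shown for Reqi, Acki, En1, Done1, En2, Done2, En3, and Reqo.]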


Asynchronous Systems

• Outside world usually asynchronous

• Synchronization usually by polling

• Perfect synchronization impossible
    Sample input at transition

[Figure: an asynchronous system (input rate fin) feeding a synchronous system (clock rate f) through a synchronization block.]

[©Prentice Hall]


A Simple Synchronizer

[Figure: latch-based synchronizer with input Vin and output Vout.]

• Data sampled on falling edge of clock

• Latch will eventually resolve the signal value, but ... this might take infinite time!
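Because the resolution time is unbounded, synchronizers are usually characterized by a mean time between failures rather than a guarantee. The sketch below uses the commonly quoted exponential model MTBF = exp(t_resolve / τ) / (T0 · f_clock · f_data). This model is not stated on the slide; τ and T0 are device parameters that must come from characterization, and the function name is hypothetical.

```python
import math


def synchronizer_mtbf(t_resolve, tau, t_window, f_clock, f_data):
    """Mean time between synchronization failures, in seconds.

    t_resolve: time allowed for the latch to settle before its output is used
    tau:       regeneration time constant of the latch
    t_window:  width of the vulnerable sampling window (often written T0)
    f_clock:   sampling clock frequency
    f_data:    average rate of asynchronous input transitions
    """
    return math.exp(t_resolve / tau) / (t_window * f_clock * f_data)
```

Under this model, every additional τ of settling time multiplies the MTBF by e, which is why waiting one more clock cycle (for example with a second latch stage) is such an effective remedy.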


System Level Synchronization

[©Prentice Hall]

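[Figure: PC board with a crystal-based clock generator that produces the reference clock; Chip 1 and Chip 2 each contain a clock generator producing local clocks ($\phi_1'$, $\phi_2'$ on Chip 1; $\phi_1''$, $\phi_2''$ on Chip 2) for their logic, and the chips exchange I/O data.]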


Skew of Local Clocks vs Reference

(a) Skew of local clock signals with respect to reference clock.

(b) Local clock signals as produced by PLL-based clock generator.

[©Prentice Hall]


Phase-Locked Loop Based Clock Generator

[©Prentice Hall]

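[Figure: PLL-based clock generator: a phase detector compares the reference clock with the divided local clock and drives a charge pump through Up/Down signals; the loop filter produces the control voltage Vcontr for the VCO; the VCO output passes through clock decode & buffer to give the local clocks $\phi_1$, $\phi_2$, ..., and is fed back through a divide-by-N block.]

Acts also as Clock Multiplier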
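The clock-multiplier remark follows from the feedback divider: in lock, the divided VCO output matches the reference, so the VCO runs at N times the reference frequency. The toy loop below illustrates this in the frequency domain only. It is a deliberate simplification (a real PLL compares phase, and the loop filter and VCO are lumped into a single gain here); the function name and default values are hypothetical.

```python
def pll_lock(f_ref, n, f_start, gain=0.5, steps=60):
    """Toy frequency-domain loop: adjust the VCO until f_vco / n matches f_ref."""
    f_vco = f_start
    for _ in range(steps):
        error = f_ref - f_vco / n      # difference seen after the divide-by-N
        f_vco += gain * n * error      # loop filter and VCO lumped into one gain
    return f_vco
```

For instance, `pll_lock(10e6, 8, 50e6)` settles at about 80 MHz, eight times the 10 MHz reference, regardless of the starting frequency.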


To Probe Further ...

• Clock skew visualization (cool animations!!)
    P. J. Restle, “Technical Visualizations in VLSI Design”, Design Automation Conference, pp. 494-499, 2001.

• Asynchronous FIFO design (system-level comm)
    T. Chelcea and S. Nowick, “Robust Interfaces for Mixed-Timing Systems with Application to Latency-Insensitive Protocols”, Design Automation Conference, pp. 21-26, 2001.