Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Upload: ulysses-smail

Post on 15-Dec-2015


TRANSCRIPT

Page 1: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Elastic circuits

Jordi Cortadella

Universitat Politècnica de Catalunya, Barcelona

EMicro 2013

Page 2: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Goals

• Convince ourselves that:
  – designing an asynchronous circuit is easy
  – synchronous and asynchronous circuits are similar
  – asynchronous circuits bring new advantages

• Not to cover exotic asynchronous schemes

• Elasticity can also be synchronous

EMicro 2013 Elastic circuits 2

Page 3: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Clocking

EMicro 2013 Elastic circuits

Nvidia Kepler™ GK110: 28 nm, 7.1B transistors, 550 mm², 2688 CUDA cores, base clock 836 MHz, memory clock 6 GHz

• How to distribute the clock?

• How to determine the clock frequency?

• How to implement robust communications?

• How to reduce and manage energy?

Page 4: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

EMicro 2013 Elastic circuits 4

Page 5: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Outline

• Synchronous and Source-synchronous circuits
• Completion detection
• Handshaking
• Performance analysis
• Why asynchronous?
• Design automation
• Synchronous elasticity
• Globally-asynchronous Locally-synchronous

EMicro 2013 Elastic circuits 5

Page 6: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Synchronous and Source-Synchronous

Page 7: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Synchronous circuit

EMicro 2013 Elastic circuits

PLL

7

Page 8: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013


Synchronous circuit

EMicro 2013 Elastic circuits

CL

Two competing paths:
• Launching path
• Capturing path

Launching path < Capturing path + Period

CLKtree + CL < CLKtree + Period

CL < Period (no clock skew)
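The inequality above can be checked numerically. The following is a minimal sketch, assuming all delays are given in the same unit (picoseconds here); the function and parameter names are hypothetical and do not come from the slides.

```python
def meets_setup(clk_launch, logic_delay, clk_capture, period):
    """Launching path < capturing path + period."""
    launching_path = clk_launch + logic_delay   # clock tree + combinational logic (CL)
    capturing_path = clk_capture                # clock tree only
    return launching_path < capturing_path + period


def period_lower_bound(clk_launch, logic_delay, clk_capture):
    """The period must be strictly greater than this value."""
    return clk_launch + logic_delay - clk_capture


# With equal clock-tree delays (no skew) the bound reduces to CL < Period.
print(period_lower_bound(300, 900, 300))
```

With no skew the two clock-tree terms cancel, which is why only the combinational delay remains in the last line of the slide.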

PLL

8

Page 9: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Source-synchronous

EMicro 2013 Elastic circuits

CLKgen matched delay matched delay matched delay

• No global clock required

• More tolerance to PVT variations

• Period > longest combinational path

• Good for acyclic pipelines

Launching path

Capturing path

9

Page 10: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

CLKgen

?

Source-synchronous with forks and joins

EMicro 2013 Elastic circuits

How to synchronize incoming events?

10

Page 11: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

C element (Muller 1959)

EMicro 2013 Elastic circuits

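[Figure: C element with inputs A and B and output C]

A  B  |  C
0  0  |  0
0  1  |  C (keeps its previous value)
1  0  |  C (keeps its previous value)
1  1  |  1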

11

Page 12: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

C element (Muller 1959)

EMicro 2013 Elastic circuits

[Figure: C element built from a majority gate (MAJ)]

A  B  |  C
0  0  |  0
0  1  |  C
1  0  |  C
1  1  |  1

(many implementations exist)
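The truth table can be modelled behaviourally in a few lines. This is a sketch only: the class name is hypothetical, and the majority-gate form assumes the output C is fed back as the third input of MAJ, which is one of the many possible implementations.

```python
def majority(a, b, c):
    """Three-input majority gate."""
    return (a & b) | (a & c) | (b & c)


class CElement:
    """Muller C element: the output follows the inputs when they agree, else it holds."""

    def __init__(self, init=0):
        self.c = init

    def update(self, a, b):
        # Feeding the current output back into a majority gate gives the same
        # behaviour as the truth table: 00 -> 0, 11 -> 1, otherwise hold.
        self.c = majority(a, b, self.c)
        return self.c


c = CElement()
for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]:
    print(a, b, c.update(a, b))
```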

Page 13: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Completion detection

Page 14: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Completion detection

EMicro 2013 Elastic circuits

CLKgen

fixed delay

The fixed delay must be longer than the worst-case logic delay (plus variability)

Q: could we detect when a computation has completed ASAP ?

14

Page 15: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Delay-insensitive codes: Dual Rail

• Dual rail: every bit encoded with two signals

A.t  A.f  |  A
0    0    |  Spacer
0    1    |  0
1    0    |  1
1    1    |  Not used

[Waveform: A takes the values 1, SP, 0, SP, 1, SP, 1, SP on the wires A.t and A.f]
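The dual-rail code in the table can be sketched as a pair of helper functions. The names `encode`, `decode` and `SPACER` are hypothetical; each value is represented as a tuple `(t, f)` standing for the wires A.t and A.f.

```python
SPACER = (0, 0)  # (A.t, A.f) = (0, 0): no data present


def encode(bit):
    """Map a bit to its dual-rail pair (t, f)."""
    return (1, 0) if bit else (0, 1)


def decode(pair):
    """Return 0 or 1 for valid data, None for the spacer."""
    if pair == SPACER:
        return None
    if pair == (1, 1):
        raise ValueError("(1, 1) is not a valid dual-rail code word")
    t, _ = pair
    return t


# A stream alternates data and spacers, as in the waveform on the slide.
stream = [encode(1), SPACER, encode(0), SPACER]
print([decode(p) for p in stream])
```

Because a spacer separates consecutive values, the receiver can tell when new data has arrived by observing the wires alone.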

Page 16: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Dual Rail AND gate

EMicro 2013 Elastic circuits

A   B   |  C
SP  SP  |  SP
0   -   |  0
-   0   |  0
SP  1   |  SP
1   SP  |  SP
1   1   |  1

[Figure: dual-rail AND gate with inputs A.t, A.f, B.t, B.f and outputs C.t, C.f]

Page 17: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Dual Rail Inverter

EMicro 2013 Elastic circuits

A   |  Z
SP  |  SP
0   |  1
1   |  0

[Figure: dual-rail inverter with inputs A.t, A.f and outputs Z.t, Z.f]

Page 18: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Dual Rail AND/OR gate

EMicro 2013 Elastic circuits

[Figure: dual-rail AND gate (inputs A.t, A.f, B.t, B.f; outputs C.t, C.f) next to a dual-rail OR gate, drawn with the .t and .f rails listed in swapped order]
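The dual-rail gates of the last three slides can be sketched as functions on `(t, f)` pairs. This assumes the simple gate-level form (one AND gate and one OR gate per dual-rail AND); it is an illustration and not necessarily the exact netlist drawn in the figures.

```python
SP, ZERO, ONE = (0, 0), (0, 1), (1, 0)  # (t, f) pairs


def dr_and(a, b):
    """C.t = A.t AND B.t, C.f = A.f OR B.f."""
    (at, af), (bt, bf) = a, b
    return (at & bt, af | bf)


def dr_not(a):
    """A dual-rail inverter only swaps the two rails."""
    t, f = a
    return (f, t)


def dr_or(a, b):
    """OR obtained from AND by swapping rails (De Morgan)."""
    return dr_not(dr_and(dr_not(a), dr_not(b)))


for a, b in [(SP, SP), (ZERO, SP), (SP, ONE), (ONE, ONE)]:
    print(a, b, dr_and(a, b))
```

Note that a single 0 input is enough to produce a 0 output, which matches the rows marked "-" in the AND truth table.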

Page 19: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Dual rail: completion detection

[Figure: dual-rail logic whose outputs feed a completion detection tree; a C element produces the done signal]

EMicro 2013 Elastic circuits 19

Page 20: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Multi-input C element

EMicro 2013 Elastic circuits

[Figure: multi-input C element with inputs a1 … a7 and output c, built as a tree of two-input C elements]
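Completion detection for a dual-rail word can be sketched as follows. This is a behavioural model under stated assumptions: each bit is reported valid when either of its rails is 1, and the multi-input C element is modelled as a single stateful object rather than as the tree of two-input elements shown in the figure. All names are hypothetical.

```python
class MultiInputC:
    """Output rises when all inputs are 1, falls when all are 0, otherwise holds."""

    def __init__(self):
        self.c = 0

    def update(self, inputs):
        if all(inputs):
            self.c = 1
        elif not any(inputs):
            self.c = 0
        return self.c


def done(word, detector):
    """word is a list of dual-rail (t, f) pairs; returns the 'done' signal."""
    valid_bits = [t | f for t, f in word]
    return detector.update(valid_bits)


d = MultiInputC()
print(done([(0, 0), (0, 0)], d))  # all spacers
print(done([(1, 0), (0, 0)], d))  # only one bit has arrived
print(done([(1, 0), (0, 1)], d))  # all bits have arrived
```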

Page 21: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Dual rail: completion detection

EMicro 2013 Elastic circuits

AND

OR

INV

AND

CLKgen

21

Page 22: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Dual rail: completion detection

EMicro 2013 Elastic circuits

AND

OR

INV

AND

C, CLKgen

22

Page 23: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Dual rail: operation

EMicro 2013 Elastic circuits

AND

OR

INV

AND

C, CLKgen

Phases: Reset, Compute

For correct operation, all internal signals should be reset before the compute phase:
• Use a more complex implementation of dual-rail (e.g., DIMS), or
• Have internal completion detection, or
• Use timing assumptions

23

Page 24: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Other DI codes

• There are many DI codes:
  – k-out-of-n, Berger, Knuth, …

• Example: 1-out-of-4
  – 2 bits with 4 wires
  – Same wire efficiency as DR
  – Less power consuming
  – Good for communication
  – Bad for logic

Wires  |  Value
0000   |  Spacer
0001   |  0
0010   |  1
0100   |  2
1000   |  3
others |  not used
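The 1-out-of-4 table maps directly onto a one-hot encoding. The sketch below represents the four wires as a 4-bit integer; the function names are hypothetical.

```python
def encode_1of4(value):
    """Encode a 2-bit value (0..3) by raising exactly one of four wires."""
    if not 0 <= value <= 3:
        raise ValueError("value must be in 0..3")
    return 1 << value


def decode_1of4(wires):
    """Return the value, or None for the spacer 0000."""
    if wires == 0:
        return None
    if wires in (0b0001, 0b0010, 0b0100, 0b1000):
        return wires.bit_length() - 1
    raise ValueError("not a 1-out-of-4 code word")


for v in range(4):
    print(v, format(encode_1of4(v), "04b"))
```

Sending two bits raises one wire here, whereas two dual-rail bits raise two wires, which is consistent with the lower power noted on the slide.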

24

Page 25: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Single rail data vs. dual rail

Some back-of-the-envelope estimations:

               |  Single rail  |  Dual rail
Area           |  1            |  2
Delay          |  1            |  << 1
Static power   |  1            |  2
Dynamic power  |  < 0.2        |  2

Dual rail:
• Good for speed
• Large area
• High power consumption

25

Page 26: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Handshaking

Page 27: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Handshaking

EMicro 2013 Elastic circuits

CLKgen unknown delay

Assume that the source module can provide data at any rate:

• When should the CLK generator send an event if the internal delays of the circuit are unknown?

Solution: handshaking

27

Page 28: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Handshaking

EMicro 2013 Elastic circuits

I have data

I want data

Data

Request

Acknowledge

28

Page 29: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Asynchronous elastic pipeline

C

ReqIn ReqOut

AckIn AckOut

C C C

• David Muller’s pipeline (late 1950s)
• Sutherland’s Micropipelines (Turing Award lecture, 1989)
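The behaviour of such a pipeline can be sketched with a chain of C elements, where stage i takes the output of stage i-1 as its request and the inverted output of stage i+1 as its acknowledge. This is a simplified control-only model (no data, no delays); `sweep` updates the stages once from left to right, and all names are hypothetical.

```python
def c_element(a, b, prev):
    """Two-input C element: follow the inputs when they agree, else hold."""
    return a if a == b else prev


def sweep(state, req_in, ack_in):
    """Update every stage once, left to right, and return the new state."""
    s = list(state)
    n = len(s)
    for i in range(n):
        left = req_in if i == 0 else s[i - 1]        # request from the left
        right = ack_in if i == n - 1 else s[i + 1]   # acknowledge from the right
        s[i] = c_element(left, 1 - right, s[i])
    return s


# The receiver never acknowledges (ack_in = 0): the pipeline fills and stalls.
s = [0, 0, 0]
for req in [1, 0, 1, 0]:
    s = sweep(s, req, 0)
    print(req, s)
```

Once the state stops changing, further requests are not absorbed: the pipeline is full and back-pressure reaches the sender, which is the elastic behaviour the slide refers to.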

EMicro 2013 Elastic circuits 29

Page 30: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Multiple inputs and outputs

EMicro 2013 Elastic circuits 30

Page 31: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Multiple inputs and outputs

EMicro 2013 Elastic circuits

delay

31

Page 32: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Multiple inputs and outputs

EMicro 2013 Elastic circuits

C

Req

Ack Req

Ack

32

Page 33: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Channel-based communication

• A channel contains data and handshake wires

[Figure: a single-rail channel carries Data, Req and Ack; a dual-rail channel carries dual-rail Data and Ack]

33

Page 34: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Push/pull channels

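• Push: the sender initiates the communication
• Pull: the receiver initiates the communication

[Figure: sender and receiver connected by a single-rail channel. Push channel: Data and Req (push) go from sender to receiver, Ack returns. Pull channel: Req (pull) goes from receiver to sender, Data and Ack return]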

34

Page 35: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Four-phase protocol

• Valid data on the active edge of Req
• Req/Ack must return to zero before the next transfer
• Different variations of the 4-phase protocol exist

EMicro 2013 Elastic circuits

Data 1 Data 2 Data 3

Req

Ack

Data

Data transfer Data transfer

35

Page 36: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Two-phase protocol

• Every edge is active
• It may require double-edge triggered flip-flops or pulse generators

EMicro 2013 Elastic circuits

Data 1 Data 2 Data 3

Req

Ack

Data

Data transfer Data transfer
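The difference between the two protocols can be made concrete by listing the signal events for a push channel. This is a sketch of one common ordering (data valid before the request edge); the function names and the trace format are hypothetical.

```python
def four_phase(items):
    """Req+ / Ack+ / Req- / Ack- for every transfer (return to zero)."""
    trace = []
    for d in items:
        trace += [("data", d), ("req", 1), ("ack", 1), ("req", 0), ("ack", 0)]
    return trace


def two_phase(items):
    """Every edge of Req and Ack is active: one toggle each per transfer."""
    trace, req, ack = [], 0, 0
    for d in items:
        req ^= 1
        trace += [("data", d), ("req", req)]
        ack ^= 1
        trace.append(("ack", ack))
    return trace


def handshake_events(trace):
    return [e for e in trace if e[0] != "data"]


items = ["Data 1", "Data 2", "Data 3"]
print(len(handshake_events(four_phase(items))))  # 4 transitions per transfer
print(len(handshake_events(two_phase(items))))   # 2 transitions per transfer
```

The two-phase protocol halves the number of handshake transitions, at the cost of logic that reacts to both edges.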

36

Page 37: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

How to memorize?

EMicro 2013 Elastic circuits

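[Figure: combinational logic between two latches (L), with a matched delay and two C elements (C) in the control path; the connections to the latches are marked "?"]

2-phase or 4-phase?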

37

Page 38: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

How to memorize?

EMicro 2013 Elastic circuits

[Figure: combinational logic between two latches (L), with a matched delay, two C elements (C) and a pulse generator]

2-phase

38

Page 39: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

How to memorize?

EMicro 2013 Elastic circuits

[Figure: combinational logic between two latches (L), with a matched delay and two C elements (C)]

4-phase

39

Page 40: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Performance analysis

Page 41: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Ring oscillators

EMicro 2013 Elastic circuits

CC

CC

C

• Every ring requires an odd number of inverters

• The cycle period is determined by the slowest ring

• The cycle period is adapted to the operating conditions (temperature, voltage)

Page 42: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Global Rings

EMicro 2013 Elastic circuits 43

C

C C

C

CC

Page 43: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Global Rings

EMicro 2013 Elastic circuits

Th = 1 / 6

• Ramamoorthy and Ho, Performance evaluation of asynchronous concurrent systems with Petri nets, 1980
• T. Williams et al., A self-timed chip for division, 1987
• Greenstreet and Steiglitz, Bubbles can make self-timed pipelines fast, 1990
• Manohar and Martin, Slack elasticity in concurrent computing, 1998

Page 44: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Global Rings

EMicro 2013 Elastic circuits

Th = 2 / 6


Page 45: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Global Rings

EMicro 2013 Elastic circuits

Th = 3 / 6


Page 46: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Global Rings

EMicro 2013 Elastic circuits

Th = 1 / 6


Page 47: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Global Rings

EMicro 2013 Elastic circuits

[Plot: throughput Th versus number of tokens, from 0 to N. Th peaks at 1/2 with N/2 tokens; the ring is token-limited below N/2 tokens and bubble-limited above]
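The values on the preceding slides (Th = 1/6, 2/6, 3/6 and back to 1/6 for a six-stage ring) are consistent with a simple normalized model. The sketch below assumes every stage has unit forward and backward latency, so throughput is limited by whichever is scarcer, tokens or bubbles; the function name is hypothetical.

```python
from fractions import Fraction


def ring_throughput(stages, tokens):
    """Throughput of a ring under the unit-latency assumption stated above."""
    bubbles = stages - tokens
    return Fraction(min(tokens, bubbles), stages)


N = 6
for k in range(N + 1):
    th = ring_throughput(N, k)
    regime = "token-limited" if k <= N - k else "bubble-limited"
    print(k, th, regime)
```

The maximum of 1/2 is reached with N/2 tokens, and a ring with no tokens or no bubbles cannot move at all.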

Page 48: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

A latch-based view of synchronous circuits

EMicro 2013 Elastic circuits

Flip-flop = Master + Slave

49

Page 49: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Multiple Rings

EMicro 2013 Elastic circuits

Rings: 2 / 4, 2 / 4, 2 / 5

5 / 7? It is bubble-limited: 2 / 7

50

Page 50: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Slack matching

EMicro 2013 Elastic circuits

Rings: 2 / 4, 2 / 4, 2 / 5

2 / 7? With two more bubbles: 4 / 9

• We can add as many bubbles as we want (but not tokens!)
• Slack matching can be solved optimally in polynomial time
• Slack matching is conceptually equivalent to buffer (FIFO) sizing or recycling
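The numbers on the two slides can be reproduced with a small model in which the system throughput is the minimum over all rings. It assumes unit-latency stages, reads "a / b" as tokens / stages, and uses hypothetical names; it only evaluates a given bubble insertion and does not search for the optimal one.

```python
from fractions import Fraction


def ring_throughput(tokens, stages):
    """min(tokens, bubbles) / stages under the unit-latency assumption."""
    return Fraction(min(tokens, stages - tokens), stages)


def system_throughput(rings):
    """The slowest ring determines the throughput of the whole system."""
    return min(ring_throughput(t, s) for t, s in rings)


def add_bubbles(ring, extra):
    """Slack matching step: more stages, same number of tokens."""
    tokens, stages = ring
    return (tokens, stages + extra)


rings = [(2, 4), (2, 4), (2, 5), (5, 7)]
print(system_throughput(rings))            # limited by the 5-token, 7-stage ring

rings[3] = add_bubbles(rings[3], 2)        # 5 tokens in 9 stages
print(ring_throughput(*rings[3]), system_throughput(rings))
```

After the two bubbles are added, the 2 / 5 ring becomes the bottleneck, so adding further bubbles to the large ring would not help.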

51

Page 51: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Performance analysis

EMicro 2013 Elastic circuits 52

C

C C

C

CC

(Mean Cycle Ratio)

Page 52: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Latch-based design

EMicro 2013 Elastic circuits

L3L2L1 L4

L1

L2

L3

L4

53

Launching path

Capturing path

Page 53: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Matched delays can be adjustable

EMicro 2013 Elastic circuits

L3L2L1 L4

54

delay selection

Delays can be adjusted:

• At testing/boot time (to adjust to static variability)

• At runtime (to compensate dynamic variability)

Page 54: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Why asynchronous?

Page 55: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Exploiting elasticity

CLK

Rigid clock

High performance

Low energy

Page 56: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Exploiting elasticity

[Plot: voltage (1 V, 0.9 V, 0.8 V, 0.7 V) versus performance (2 GHz, 1 GHz, 500 MHz), marking the rigid-clock operating point, the high-performance and low-energy regions, and voltage scaling]

Page 57: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Voltage scaling and power savings

-24%, -14%

3 ARM926 cores on the same die

EMicro 2013 Elastic circuits 58

Page 58: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Tracking variability

EMicro 2013 Elastic circuits 59

matched delay

Page 59: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Tracking variability

delay

best typ worst

multi-corner matched delay

critical paths

Good correlation for:

• Process variability (systematic)

• Global voltage fluctuations

• Temperature

• Aging (partially)

Page 60: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Margins

[Figure: breakdown of the cycle period]

Rigid clocks: gate and wire delays (typ), P, V, T, aging, PLL jitter, skew

Elastic clocks: gate and wire delays (typ), P, V, T, aging, skew

Margin reduction: speed-up / power savings

EMicro 2013 Elastic circuits 61

Page 61: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Clock elasticity

[Figure: with a rigid clock, each cycle period contains computation time and wasted time; with an elastic clock, the cycle period is shown against the computation time only]

EMicro 2013 Elastic circuits 62

Page 62: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Design Automation

Page 63: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Design automation paradigms

• Synthesis of asynchronous controllers
  – Logic synthesis from Petri nets or asynchronous FSMs

• Syntax-directed translation
  – Correct-by-construction composition of handshake components

• De-synchronization
  – Automatic transformation from synchronous to asynchronous

Page 64: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Synthesis of asynchronous controllers

EMicro 2013 Elastic circuits

[Figure: VME bus controller placed between the bus (DSr, DSw, DTACK) and the device (LDS, LDTACK), with a data transceiver (D). Timing diagram of the read cycle: DSr, LDS, LDTACK, D, DTACK]

65

Page 65: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Synthesis of asynchronous controllers

EMicro 2013 Elastic circuits

Signal Transition Graph

[Figure: VME bus controller with signals DSr, DTACK, LDS, LDTACK and D, and its signal transition graph over the transitions DSr+, LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK-]

66

Page 66: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Synthesis of asynchronous controllers

EMicro 2013 Elastic circuits

[Figure: synthesized circuit with signals DTACK, D, DSr, LDS and LDTACK, next to the signal transition graph (DSr+, LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK-)]

Cortadella et al., Petrify

Page 67: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Syntax-directed translation

EMicro 2013 Elastic circuits

[Figure: handshake circuit for the program below, with components labelled SEQ, do, RWMUX and DMX, the variables x and y, and the operators <>, < and -]

int = type [0..255]
& gcd: main proc (in? chan <<int,int>> & out! chan int)
begin
  x, y: var int
| forever do
    in?<<x,y>>
  ; do x <> y then
      if x < y then y:=y-x
      else x:=x-y
      fi
    od
  ; out!x
  od
end
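For readers unfamiliar with the notation, the inner loop of the program above corresponds to the subtraction-based GCD below. This is a Python rendering of the algorithm only, not of the handshake circuit; the function name and the explicit check for non-positive inputs are additions for illustration.

```python
def gcd_by_subtraction(x, y):
    """Repeat: subtract the smaller value from the larger until both are equal."""
    if x <= 0 or y <= 0:
        # With a zero operand the loop below would never terminate.
        raise ValueError("both inputs must be positive")
    while x != y:        # do x <> y then ... od
        if x < y:
            y = y - x    # y := y - x
        else:
            x = x - y    # x := x - y
    return x             # out!x


print(gcd_by_subtraction(12, 18))
```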

Sources:

J. Kessels and A. Peeters. DESCALE: A Design Experiment for a SmartCard Application Consuming Low Energy, in Principles of Asynchronous Circuit Design: A Systems Perspective, Eds. J. Sparsø and S. Furber, Kluwer Academic Publishers, 2001.

P.A. Beerel, R.O. Ozdag and M. Ferretti. A Designer’s Guide to Asynchronous VLSI, Cambridge University Press, 2010.

Page 68: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

De-synchronization

• Strategy: substitute the clock tree by local clocks and handshakes

• Combinational logic and latches are not modified

• More tolerance to variability
  – Similar area, less power and/or more speed

• Cortadella, Kondratyev, Lavagno and Sotiriou. Desynchronization: Synthesis of asynchronous circuits from synchronous specifications. IEEE TCAD, Oct 2006.

EMicro 2013 Elastic circuits 69

Page 69: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Synchronous operation

EMicro 2013 Elastic circuits

CLKgen

Transforming a synchronous circuit into asynchronous (automatically)

70

Page 70: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

De-synchronization

EMicro 2013 Elastic circuits

Transforming a synchronous circuit into asynchronous (automatically)

72

Page 71: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

System-level de-synchronization

EMicro 2013 Elastic circuits 74

CLK

Page 72: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

System-level de-synchronization

EMicro 2013 Elastic circuits 75

Page 73: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

System-level de-synchronization

EMicro 2013 Elastic circuits 76

Page 74: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Synchronous elasticity

Page 75: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Different flavors of elasticity

EMicro 2013 Elastic circuits

[Figure: an adder (inputs … 147 and … 201, output … 348) drawn in three flavors: Rigid, Elastic (+e) and Synchronous Elastic (+s)]

Carloni et al., Latency-insensitive systems.

Page 76: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Asynchronous elasticity

req

ack

EMicro 2013 Elastic circuits 80

Page 77: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Synchronous elasticity

valid

stop

Ring oscillator

CLK

PLL

EMicro 2013 Elastic circuits 81
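The valid/stop pair can be read as a per-cycle handshake. The sketch below assumes the usual interpretation in the synchronous elastic literature: a transfer happens in a cycle where valid is 1 and stop is 0, and the sender retries while stop is 1. The state names and function names are labels chosen for illustration and do not appear on the slides.

```python
def channel_state(valid, stop):
    """Classify one clock cycle of a valid/stop channel."""
    if not valid:
        return "idle"
    return "retry" if stop else "transfer"


def send(items, stop_pattern):
    """Deliver items over the channel; stop_pattern holds the receiver's stop bit per cycle."""
    received, i = [], 0
    for stop in stop_pattern:
        valid = i < len(items)          # the sender keeps valid high until accepted
        if channel_state(valid, stop) == "transfer":
            received.append(items[i])
            i += 1
    return received


print(send([1, 2, 3], [0, 1, 0, 1, 1, 0]))
```

The data order is preserved regardless of how many cycles the receiver stalls, which is what makes the design tolerant to latency changes.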

Page 78: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Latch-based elasticity

[Figure: sender and receiver connected through a chain of latches; each stage has Data, Valid (V) and Stop signals and a latch enable (En)]

EMicro 2013 Elastic circuits 82

Page 79: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Elastic netlists

[Figure: elastic netlist with Fork, Join and Join / Fork controllers and four EB blocks; the controllers provide the enable signal to the data latches]

EMicro 2013 Elastic circuits 83

Page 80: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Variable Latency Units

EMicro 2013 Elastic circuits

[Figure: variable-latency unit taking [0 - k] cycles, with go, done and clear signals and V/S interfaces]

Page 81: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Globally-asynchronous Locally-synchronous

GALS

Page 82: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

SoC design with GALS

• Most IPs are synchronous

• Different components may have different operating frequencies

• Some components have variable latencies (e.g., cache hit/miss latency)

• Multiple clock domains are essential

EMicro 2013 Elastic circuits 86

[Figure: SoC with DSP, P and Mem blocks on a fast bus and a slow bus, connected through bridges with CDC, in three clock domains: CLK1, CLK2, CLK3]

Page 83: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Multiple clock domains

EMicro 2013 Elastic circuits

[Figure: three clocking schemes]

• Single clock (mesochronous): CLK
• Rational clock frequencies: CLK(f0) with f1/f0, f2/f0, f3/f0
• Independent clocks: CLK0, CLK1, CLK2, CLK3

(controllable skew)

87

Page 84: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Synchronous handshakes

EMicro 2013 Elastic circuits

[Figure: sender in clock domain CLK1 and receiver in clock domain CLK2, connected by Data, Valid and Ack]

• The arrival of data is unpredictable
• Handshakes solve the problem

88

Page 85: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

The problem: metastability

EMicro 2013 Elastic circuits

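[Figure: a flip-flop clocked by ΦT drives a flip-flop clocked by ΦR. Timing diagram of D and Q: when D changes inside the setup/hold window of ΦR, the value of Q is uncertain (?)]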

Page 86: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

How long does it take to resolve metastability?

EMicro 2013 Elastic circuits

Metastability

MTBF: Mean Time Between Failures

90

Page 87: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Classical synchronous solution

EMicro 2013 Elastic circuits

[Figure: chain of flip-flops (D Q) crossing from clock ΦT to clock ΦR]

$$\mathrm{MTBF} = \frac{e^{t_r/\tau}}{W \cdot f_\Phi \cdot f_D}$$

MTBF: Mean Time Between Failures
fΦ: frequency of the clock
fD: frequency of the data
tr: resolve time available
W: metastability window
τ: resolve time constant

Example:

# FFs  |  MTBF
1 FF   |  15 min
2 FF   |  9 days
3 FF   |  23 years
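The effect of adding resolve time can be explored with the standard synchronizer estimate MTBF = e^(tr/τ) / (W · fΦ · fD), using the symbols defined on this slide. The sketch below is illustrative only: the parameter values are hypothetical and are not the ones behind the example table.

```python
import math


def mtbf_seconds(t_resolve, tau, window, f_clk, f_data):
    """MTBF = exp(tr / tau) / (W * f_clk * f_data), all quantities in SI units."""
    return math.exp(t_resolve / tau) / (window * f_clk * f_data)


# Hypothetical technology and clocking parameters.
TAU = 50e-12       # resolve time constant
WINDOW = 100e-12   # metastability window
F_CLK = 1e9        # receiver clock frequency
F_DATA = 1e8       # data toggle frequency

for t_r in (0.5e-9, 1.0e-9, 1.5e-9):
    print(t_r, mtbf_seconds(t_r, TAU, WINDOW, F_CLK, F_DATA))
```

Each extra flip-flop in the chain adds roughly one clock period of resolve time, so MTBF grows exponentially with the number of stages, which is the trend the example table shows.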

91

Page 88: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Handshake with synchronizers

EMicro 2013 Elastic circuits

[Figure: sender in clock domain CLK1 and receiver in clock domain CLK2, connected by Data, Valid and Ack, with synchronizers]

• Simple solution
• Throughput can be highly degraded: a long round trip for every transaction

92

Page 89: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Asynchronous FIFOs

EMicro 2013 Elastic circuits

Circular buffer

Valid Valid

Ack Ack

Data Data

Clk In Clk Out

FIFO control

• Ack is issued as soon as data has been delivered

• No impact on throughput (1 token/cycle)

• Min latency determined by the internal synchronizers

• Some tricky structures for the FIFO pointers (e.g. Gray encoding)
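Gray encoding is useful for FIFO pointers because consecutive values differ in exactly one bit, so a pointer sampled by the other clock domain is either the old or the new value. A minimal sketch of the conversion, with hypothetical function names:

```python
def to_gray(n):
    """Binary to reflected Gray code."""
    return n ^ (n >> 1)


def from_gray(g):
    """Gray code back to binary."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


DEPTH = 16  # hypothetical pointer range (power of two)
for i in range(DEPTH):
    print(i, format(to_gray(i), "04b"))
```

The single-bit property also holds on wrap-around when the pointer range is a power of two.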

93

Page 90: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

SoC design with GALS

EMicro 2013 Elastic circuits

[Figure: SoC with DSP, P and Mem blocks on a fast bus and a slow bus, connected through bridges with CDC, in three clock domains: CLK1, CLK2, CLK3]

• Bridges for Clock Domain Crossing usually contain asynchronous FIFOs

• Latency cost only when interfacing with synchronous domains

• No latency penalty between asynchronous domains

94

Page 91: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Conclusions

• Elasticity offers flexibility in time
  – Modularity
  – Dynamic adaptability
  – Tolerance to variability

• Better optimization of power/performance

• Why isn’t it an important trend in circuit design?
  – Lack of commercial EDA support (timing sign-off)
  – Designers do not feel comfortable with “unpredictable” timing
  – Other aspects: testing, verification, …

• De-synchronization might be a viable solution

Page 92: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

Bibliography

• Carmona, Cortadella, Kishinevsky and Taubin, Elastic Circuits, IEEE Trans. on CAD, Oct. 2009.

• Beerel, Ozdag and Ferretti, A Designer’s Guide to Asynchronous VLSI, Cambridge University Press, 2010.

• Sparsø and Furber, Principles of Asynchronous Circuit Design: A Systems Perspective, Kluwer, 2001.

• Myers, Asynchronous Circuit Design, John Wiley & Sons, 2001.

Page 93: Elastic circuits Jordi Cortadella Universitat Politècnica de Catalunya, Barcelona EMicro 2013

EMicro 2013 Elastic circuits 97