Elasticity and Petri nets

Jordi Cortadella, Universitat Politecnica de Catalunya, Barcelona Mike Kishinevsky, Intel Corp., Strategic CAD Labs, Hillsboro


Post on 25-Feb-2016


Page 1: Elasticity and  petri  nets

Jordi Cortadella, Universitat Politecnica de Catalunya, Barcelona Mike Kishinevsky, Intel Corp., Strategic CAD Labs, Hillsboro

Page 2: Elasticity and  petri  nets

Moore’s law

Source: Intel Corp.

Page 3: Elasticity and  petri  nets

Is the GHz race over?

Page 4: Elasticity and  petri  nets

Many-Core is here

Source: Intel Corp.

Page 5: Elasticity and  petri  nets
Page 6: Elasticity and  petri  nets

Why this tutorial?

Digital circuits are complex concurrent systems

Variability and power consumption are critical aspects in deep submicron technologies

Multi (many)-core systems will become a novel paradigm:
• System design
• Applications
• Concurrent programming

Theory of concurrency may play a relevant role in this new scenario

Page 7: Elasticity and  petri  nets

Elasticity

Tolerance to delay variability

Different forms of elasticity:
• Asynchronous: no clock
• Synchronous: variability synchronized with a clock

In all forms of elasticity, token-based computations are performed (req/ack or valid/stop signals are used)

Page 8: Elasticity and  petri  nets

Outline

Asynchronous elastic systems
• The basics: circuits and elasticity
• Synthesis of asynchronous circuits from Petri nets
• Modern methods for the synthesis of large controllers
• De-synchronization: from synchronous to asynchronous

Synchronous elastic systems
• Basics of synchronous elastic systems
• Early evaluation and performance analysis
• Optimization of elastic systems and their correctness

Page 9: Elasticity and  petri  nets
Page 10: Elasticity and  petri  nets

Outline

Gates, latches and flip-flops. Combinational and sequential circuits.

Basic concepts on asynchronous circuit design.

Petri net models for asynchronous controllers. Signal Transition Graphs.

Page 11: Elasticity and  petri  nets

Boolean functions

Composed from logic gates

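[Figure: network of logic gates with inputs a, b, c, d and outputs x, y, z; x and y are Boolean functions of a and b, and z is a Boolean function of a, b, c and d]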

Page 12: Elasticity and  petri  nets

Memory elements: latches

Active-high latch (H; inputs D and En, output Q):
• En = 0 (opaque): Q = prev(Q)
• En = 1 (transparent): Q = D

Active-low latch (L; inputs D and En, output Q):
• En = 1 (opaque): Q = prev(Q)
• En = 0 (transparent): Q = D
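The two latch rules above can be sketched as a small behavioral model. This is only an illustration: the class name `Latch` and its `update` method are hypothetical names chosen here, not part of the slides.

```python
class Latch:
    """Level-sensitive D latch.

    active_high=True models the H latch (transparent when En = 1),
    active_high=False models the L latch (transparent when En = 0).
    """

    def __init__(self, active_high=True, q=0):
        self.active_high = active_high
        self.q = q

    def update(self, d, en):
        transparent = (en == 1) if self.active_high else (en == 0)
        if transparent:
            self.q = d      # transparent: Q = D
        return self.q       # opaque: Q = prev(Q)


h = Latch(active_high=True)
# (D, En) pairs applied one after another
trace = [h.update(d, en) for d, en in [(1, 0), (1, 1), (0, 0), (0, 1)]]
print(trace)  # [0, 1, 1, 0]
```

The third step shows the opaque phase: D has already dropped to 0, but Q keeps the value captured while En was 1.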

Page 13: Elasticity and  petri  nets

Memory elements: flip-flop

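[Figure: a flip-flop (FF) with input D, output Q and clock CLK, built from an active-low latch (L) followed by an active-high latch (H), both controlled by CLK; timing diagram of CLK, D and Q]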
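As a rough illustration of how two latches form a flip-flop, the sketch below chains an active-low latch and an active-high latch on the same clock. It assumes the usual master-slave arrangement (L latch first, H latch second) and a zero-delay, step-by-step simulation; the function names are hypothetical.

```python
def latch(q, d, en, active_high):
    """Next latch output: follow d when transparent, otherwise hold q."""
    transparent = (en == 1) if active_high else (en == 0)
    return d if transparent else q


def simulate_ff(clk_wave, d_wave):
    """Simulate D -> L latch -> H latch -> Q, one sample per time step."""
    master = slave = 0
    q_wave = []
    for clk, d in zip(clk_wave, d_wave):
        master = latch(master, d, clk, active_high=False)     # L: open while CLK = 0
        slave = latch(slave, master, clk, active_high=True)   # H: open while CLK = 1
        q_wave.append(slave)
    return q_wave


clk = [0, 1, 1, 0, 0, 1, 0]
d = [1, 1, 0, 0, 0, 1, 0]
q = simulate_ff(clk, d)
print(q)  # [0, 1, 1, 1, 1, 0, 0]
```

Q only changes at the steps where CLK rises, and it takes the value D had just before the edge; changes of D while CLK is high are ignored.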

Page 14: Elasticity and  petri  nets

Finite-state automata

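[Figure: combinational logic (CL) with Inputs and Outputs, connected to a STATE register clocked by CLK]

The combinational logic (CL) computes:
• Output function
• Next-state function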

Page 15: Elasticity and  petri  nets

Network of Computing Units

[Figure: network of computing units B1, B2 and B3 between In and Out]

No combinational cycles

Page 16: Elasticity and  petri  nets

Marked Graph Model

[Figure: a circuit and the marked graph that models it; legend: combinational logic, register]

Page 17: Elasticity and  petri  nets
Page 18: Elasticity and  petri  nets

Outline

• What is an asynchronous circuit?
• Asynchronous communication
• Asynchronous design styles (Micropipelines)
• Asynchronous logic building blocks
• Control specification and implementation
• Delay models and classes of async circuits
• Channel-based design
• Why asynchronous circuits?

Page 19: Elasticity and  petri  nets

Synchronous circuit

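[Figure: pipeline of four registers (R) separated by three combinational logic blocks (CL); all registers are driven by CLK]

Implicit (global) synchronization between blocks

Clock period > Max Delay (CL + R)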
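The clock period constraint can be worked through with a few numbers. The delays below are hypothetical values chosen only for illustration, and `clock_period_bound` is a name introduced here.

```python
def clock_period_bound(cl_delays, register_delay):
    """Lower bound on the clock period: the slowest CL block plus the register delay."""
    return max(cl_delays) + register_delay


# Hypothetical delays in ns for the three CL blocks and for one register
cl_delays = [4.0, 7.5, 5.0]
register_delay = 1.0

bound = clock_period_bound(cl_delays, register_delay)
print(bound)  # 8.5
```

Every stage is clocked at the period imposed by the slowest block (7.5 ns here), even though the other two blocks finish earlier.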

Page 20: Elasticity and  petri  nets

Asynchronous circuit

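[Figure: pipeline of four registers (R) separated by three combinational logic blocks (CL), with Req and Ack signals between stages]

Explicit (local) synchronization: Req / Ack handshakes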

Page 21: Elasticity and  petri  nets

Motivation for asynchronous

Asynchronous design is often unavoidable:
• Asynchronous interfaces, arbiters, etc.

Modern clocking is multi-phase and distributed, and virtually ‘asynchronous’ (cf. GALS, next slide):
• Mesochronous (clock travels together with data)
• Local (possibly stretchable) clock generation

A robust asynchronous design flow is coming (e.g. VLSI programming from Philips, Balsa from Univ. of Manchester, NCL from Theseus Logic, ...)

Page 22: Elasticity and  petri  nets

Globally Async Locally Sync (GALS)

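[Figure: a clocked domain (registers R and combinational logic CL driven by a local CLK) enclosed in an async-to-sync wrapper; the wrapper talks to the asynchronous world through four handshake channels, Req1/Ack1 to Req4/Ack4]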

Page 23: Elasticity and  petri  nets

Key Design Differences

Synchronous logic design:

• Proceeds without taking timing correctness (hazards, signal ack–ing, etc.) into account
• Combinational logic and memory latches (registers) are built separately
• Static timing analysis of CL is sufficient to determine the Max Delay (clock period)
• Fixed set–up and hold conditions for latches

Page 24: Elasticity and  petri  nets

Key Design Differences

Asynchronous logic design:

• Must ensure hazard–freedom, signal ack–ing, local timing constraints
• Combinational logic and memory latches (registers) are often mixed in “complex gates”
• Dynamic timing analysis of logic is needed to determine relative delays between paths

To avoid complex issues, circuits may be built as delay-insensitive and/or speed-independent (as discussed later)

Page 25: Elasticity and  petri  nets

Synchronous communication

Clock edges determine the time instants where data must be sampled

Data wires may glitch between clock edges (set–up/hold times must be satisfied)

Data are transmitted at a fixed rate (clock frequency)

[Figure: clocked data waveform carrying the bit sequence 1 1 0 0 1 0]

Page 26: Elasticity and  petri  nets

Dual rail

• Two wires with L (low) and H (high) per bit: “LL” = “spacer”, “LH” = “0”, “HL” = “1”
• n–bit data communication requires 2n wires
• Each bit is self-timed
• Other delay-insensitive codes exist (e.g. k-of-n), as does event–based signalling (choice criteria: pin and power efficiency)

[Figure: dual-rail waveforms carrying the bit sequence 1 1 0 0 1 0]
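The dual-rail code above can be illustrated with a short encoder and decoder. This is a sketch only: the helper names are hypothetical, and the pair “HH”, which the slide does not list, is treated here as invalid (an assumption).

```python
SPACER = "LL"
ENCODE = {0: "LH", 1: "HL"}      # code words taken from the slide
DECODE = {"LH": 0, "HL": 1}      # "HH" is not listed and is rejected here (assumption)


def encode_word(bits):
    """Encode n bits as n wire pairs (2n wires in total)."""
    return [ENCODE[b] for b in bits]


def is_valid(word):
    """A word is valid once every bit carries a data code word."""
    return all(pair in DECODE for pair in word)


def is_spacer(word):
    """A word is a spacer when every pair is 'LL'."""
    return all(pair == SPACER for pair in word)


def decode_word(word):
    if not is_valid(word):
        raise ValueError("word is not (yet) valid")
    return [DECODE[pair] for pair in word]


word = encode_word([1, 1, 0, 0, 1, 0])
print(word)               # ['HL', 'HL', 'LH', 'LH', 'HL', 'LH']
print(2 * len(word))      # 12 wires for 6 bits
print(decode_word(word))  # [1, 1, 0, 0, 1, 0]
```

Because each pair announces its own validity, a receiver can tell when all bits have arrived without any separate timing reference.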

Page 27: Elasticity and  petri  nets

Bundled data

• Validity signal: similar to an aperiodic local clock
• n–bit data communication requires n+1 wires
• Data wires may glitch when not valid
• Signaling protocols:
  - level sensitive (latch)
  - transition sensitive (register): 2–phase / 4–phase

[Figure: bundled-data waveforms carrying the bit sequence 1 1 0 0 1 0]

Page 28: Elasticity and  petri  nets

Example: memory read cycle

Transition signaling, 4-phase

[Figure: timing diagram of the Address and Data buses with their validity signals A (valid address) and D (valid data)]

Page 29: Elasticity and  petri  nets

Example: memory read cycle

Transition signaling, 2-phase

[Figure: timing diagram of the Address and Data buses with their validity signals A (valid address) and D (valid data)]
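The difference between the two protocols can be sketched by listing the events on the validity wires for a number of read cycles. This assumes A (valid address) plays the role of the request and D (valid data) the role of the acknowledge; the function names are hypothetical.

```python
def four_phase(n_reads):
    """4-phase (return-to-zero): every read needs A+ D+ A- D-."""
    events = []
    for _ in range(n_reads):
        events += ["A+", "D+", "A-", "D-"]
    return events


def two_phase(n_reads):
    """2-phase: every transition, rising or falling, signals one event."""
    events, a, d = [], 0, 0
    for _ in range(n_reads):
        a ^= 1
        events.append("A+" if a else "A-")
        d ^= 1
        events.append("D+" if d else "D-")
    return events


print(four_phase(2))  # ['A+', 'D+', 'A-', 'D-', 'A+', 'D+', 'A-', 'D-']
print(two_phase(2))   # ['A+', 'D+', 'A-', 'D-']
```

Two reads take eight wire transitions in the 4-phase protocol and four in the 2-phase protocol, at the cost of logic that must react to both edges.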

Page 30: Elasticity and  petri  nets

Asynchronous modules

Signaling protocol:

reqin+ start+ [computation] done+ reqout+ ackout+ ackin+
reqin- start- [reset] done- reqout- ackout- ackin-

(more concurrency is also possible)

[Figure: a DATAPATH block between Data IN and Data OUT and a CONTROL block with handshake signals req in / ack in and req out / ack out; the two blocks are connected by start and done]

Page 31: Elasticity and  petri  nets

Asynchronous latches: C element

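[Figure: C-element symbol (C) with inputs A and B and output Z]

A B | Z+
0 0 | 0
0 1 | Z
1 0 | Z
1 1 | 1

[Figure: static logic implementation of the C-element at transistor level, between Vdd and Gnd [van Berkel 91]]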
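The truth table of the C-element translates directly into a next-state function. The sketch below is a behavioral model only (it says nothing about the transistor-level implementations), and `c_element` is a name chosen here.

```python
def c_element(a, b, z):
    """Next output Z+ of a C-element: copy the inputs when they agree, else hold Z."""
    return a if a == b else z


z = 0
trace = []
for a, b in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]:
    z = c_element(a, b, z)
    trace.append(z)
print(trace)  # [0, 0, 1, 1, 0]

# The same behavior as a Boolean equation: Z+ = A·B + Z·(A + B)
for a in (0, 1):
    for b in (0, 1):
        for z0 in (0, 1):
            assert c_element(a, b, z0) == (a & b) | (z0 & (a | b))
```

The output rises only after both inputs have risen and falls only after both have fallen, which is why the C-element is used to join handshake signals.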

Page 32: Elasticity and  petri  nets

C-element: Other implementations

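[Figures: two further transistor-level implementations of the C-element (inputs A and B, output Z, between Vdd and Gnd), labeled Quasi-Static and Dynamic; one annotation marks a weak inverter]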

Page 33: Elasticity and  petri  nets

Dual-rail logic

[Figure: dual-rail AND gate with inputs A.t, A.f, B.t, B.f and outputs C.t, C.f]

Valid behavior for monotonic environment
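One common way to build a dual-rail AND gate is C.t = A.t AND B.t and C.f = A.f OR B.f. The sketch below models that structure; it is an assumption made here for illustration, since the gate-level content of the figure is not available in this transcript.

```python
# Each dual-rail signal is a pair of rails (t, f)
SPACER, FALSE, TRUE = (0, 0), (0, 1), (1, 0)


def dual_rail_and(a, b):
    """Assumed structure: C.t = A.t AND B.t, C.f = A.f OR B.f."""
    a_t, a_f = a
    b_t, b_f = b
    return (a_t & b_t, a_f | b_f)


print(dual_rail_and(TRUE, TRUE))      # (1, 0)  -> true
print(dual_rail_and(TRUE, FALSE))     # (0, 1)  -> false
print(dual_rail_and(SPACER, SPACER))  # (0, 0)  -> spacer
print(dual_rail_and(FALSE, SPACER))   # (0, 1)  -> false, before B is valid
```

The output never shows (1, 1) as long as the inputs move monotonically between spacer and valid code words, which is the environment assumption stated on the slide. In this structure a false input decides the output before the other input has arrived.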

Page 34: Elasticity and  petri  nets

Completion detection

[Figure: the outputs of a dual-rail logic block feed a completion detection tree that ends in a C-element (C) generating done]
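Completion detection can be sketched behaviorally: each bit is valid when one of its two rails is high, and done behaves like a C-element over all per-bit validity signals. The per-bit OR gates are an assumption made here, and `completion` is a hypothetical name.

```python
def completion(word, done):
    """Next value of done for a dual-rail word given as (t, f) pairs.

    done rises when every bit is valid, falls when every bit is a spacer,
    and otherwise keeps its previous value (C-element behavior).
    """
    valid = [t | f for (t, f) in word]   # assumed per-bit OR gate
    if all(valid):
        return 1
    if not any(valid):
        return 0
    return done


done = 0
history = []
for word in [
    [(0, 0), (0, 0)],   # all spacer
    [(1, 0), (0, 0)],   # first bit has arrived
    [(1, 0), (0, 1)],   # both bits valid
    [(0, 0), (0, 1)],   # returning to spacer
    [(0, 0), (0, 0)],   # all spacer again
]:
    done = completion(word, done)
    history.append(done)
print(history)  # [0, 0, 1, 1, 0]
```

The hold behavior in the intermediate steps is what makes done safe to use as a handshake signal: it does not react to a partially arrived word.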

Page 35: Elasticity and  petri  nets

Differential cascode voltage switch logic

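[Figure: 3–input AND/NAND gate in differential cascode voltage switch logic: an N-type transistor network with inputs A.t, B.t, C.t and A.f, B.f, C.f, controlled by start, produces Z.t and Z.f, from which done is generated]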

Page 36: Elasticity and  petri  nets

Example of dual-rail design

• Asynchronous dual-rail ripple-carry adder (A. Martin, 1991)
• Critical delay is proportional to log N (N = number of bits)
• 32–bit adder delay (1.6 µm MOSIS CMOS): 11 ns versus 40 ns for synchronous
• Async cell transistor count = 34 versus synchronous = 28

Page 37: Elasticity and  petri  nets

Bundled-data logic blocks

[Figure: a single-rail logic block in parallel with a delay element that takes start and produces done]

Conventional logic + matched delay

Page 38: Elasticity and  petri  nets

Micropipelines (Sutherland 89)

Micropipeline (2-phase) control blocks:
• C (Join)
• Merge
• Toggle (in; out0, out1)
• Request-Grant-Done (RGD) Arbiter (r1, g1, d1; r2, g2, d2)
• Call (r1, a1; r2, a2; r, a)
• Select (in, sel; outt, outf)

Page 39: Elasticity and  petri  nets

Micropipelines (Sutherland 89)

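[Figure: micropipeline with four latch stages (L) separated by three logic blocks; the control is a chain of C-elements (C) and delay elements, with handshake signals Rin, Aout, Rout and Ain]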

Page 40: Elasticity and  petri  nets

Data-path / Control

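[Figure: data path of four latch stages (L) separated by three logic blocks, driven by a separate CONTROL block with handshake signals Rin, Aout, Rout and Ain]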

Page 41: Elasticity and  petri  nets

Control specification

[Figure: signal transition graph with transitions A+, B+, A–, B–, together with signals A and B]

A: input, B: output
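A specification of this kind can be executed as a token game on a marked graph. The arcs of the original figure are not available in this transcript, so the sketch below assumes the simplest possibility: a single cycle A+ → B+ → A– → B– → A+ with one token on the arc into A+. All names are hypothetical.

```python
# Assumed arcs and initial marking (tokens per arc); not taken from the figure.
ARCS = {
    ("A+", "B+"): 0,
    ("B+", "A-"): 0,
    ("A-", "B-"): 0,
    ("B-", "A+"): 1,
}


def enabled(marking):
    """Transitions whose input arcs all hold at least one token."""
    transitions = {t for arc in marking for t in arc}
    return sorted(
        t for t in transitions
        if all(tokens > 0 for (_, dst), tokens in marking.items() if dst == t)
    )


def fire(marking, t):
    """Fire t: remove one token from each input arc, add one to each output arc."""
    if t not in enabled(marking):
        raise ValueError(f"{t} is not enabled")
    new = dict(marking)
    for (src, dst) in marking:
        if dst == t:
            new[(src, dst)] -= 1
        if src == t:
            new[(src, dst)] += 1
    return new


def run(marking, steps):
    trace = []
    for _ in range(steps):
        t = enabled(marking)[0]
        marking = fire(marking, t)
        trace.append(t)
    return trace


print(run(ARCS, 5))  # ['A+', 'B+', 'A-', 'B-', 'A+']
```

Under this assumed cycle the output B simply follows the input A, and the environment may only change A again after B has responded.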

Page 42: Elasticity and  petri  nets

Control specification

[Figure: signal transition graph with transitions A+, B–, A–, B+, together with signals A and B]

Page 43: Elasticity and  petri  nets

Control specification

[Figure: signal transition graph with transitions A+, B+, C+, A–, B–, C–, together with signals A, B and C]

Page 44: Elasticity and  petri  nets

Control specification

[Figure: signal transition graph with transitions A+, B+, C+, A–, B–, C–, together with signals A, B and C]

Page 45: Elasticity and  petri  nets

Control specification

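[Figure: FIFO controller (FIFO cntrl) with handshake signals Ri, Ao, Ro and Ai, a C-element-based implementation, and its signal transition graph with transitions Ri+, Ao+, Ri-, Ao-, Ro+, Ai+, Ro-, Ai-]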

Page 46: Elasticity and  petri  nets

A simple filter: specification

y := 0;
loop
  x := READ (IN);
  WRITE (OUT, (x+y)/2);
  y := x;
end loop

[Figure: filter block with input IN (handshake Rin, Ain) and output OUT (handshake Rout, Aout)]
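The behavior of the specification can be tried out in software. In the sketch below READ and WRITE are replaced by a Python generator (an illustration choice, not the slides' notation), and the division is assumed to be a real-valued division.

```python
def filter_stream(samples):
    """Yield (x + y) / 2 for each input x, where y is the previous input (initially 0)."""
    y = 0
    for x in samples:          # x := READ(IN)
        yield (x + y) / 2      # WRITE(OUT, (x + y) / 2)
        y = x                  # y := x


out = list(filter_stream([4, 8, 2, 2]))
print(out)  # [2.0, 6.0, 5.0, 2.0]
```

Each output is the average of the current and the previous sample, so the filter smooths the step from 8 down to 2 over two outputs.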

Page 47: Elasticity and  petri  nets

A simple filter: block diagram

[Figure: latches x and y and adder + between IN and OUT; a control block with handshakes (Rin, Ain) and (Rout, Aout) drives (Rx, Ax), (Ry, Ay) and (Ra, Aa)]

• x and y are level-sensitive latches (transparent when R=1)
• + is a bundled-data adder (matched delay between Ra and Aa)
• Rin indicates the validity of IN
• After Ain+ the environment is allowed to change IN
• (Rout,Aout) control a level-sensitive latch at the output

Page 48: Elasticity and  petri  nets

A simple filter: control spec.

[Figure: the filter block diagram (x, y, +, control) and the signal transition graph of its control, with transitions Rin+, Ain+, Rin–, Ain–, Rx+, Ax+, Rx–, Ax–, Ry+, Ay+, Ry–, Ay–, Ra+, Aa+, Ra–, Aa–, Rout+, Aout+, Rout–, Aout–]

Page 49: Elasticity and  petri  nets

A simple filter: control impl.

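[Figure: control implementation using a C-element (C) that connects Rin, Ain, Rx, Ax, Ry, Ay, Ra, Aa, Rout and Aout, shown next to the signal transition graph of the control (rising and falling transitions of those ten signals)]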

Page 50: Elasticity and  petri  nets

Taking delays into account

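[Figure: signal transition graph with transitions x+, x–, y+, y–, z+, z–, and a gate-level circuit with signals x, y, z and the complements x’ and z’]

Delay assumptions:
• Environment: 3 time units
• Gates: 1 time unit

event | time
x+    | 3
x’–   | 4
y+    | 5
z+    | 6
z’–   | 7
x–    | 9
x’+   | 10
z–    | 12
z’+   | 13
y–    | 14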

Page 51: Elasticity and  petri  nets

Taking delays into account

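[Figure: gate-level circuit with signals x, y, z, x’ and z’, where one element is marked “very slow”, and the signal transition graph with transitions x+, x–, y+, y–, z+, z–]

Delay assumptions: unbounded delays

event | time
x+    | 3
x’–   | 4
y+    | 5
z+    | 6
x–    | 9
x’+   | 10
y–    | 11

failure!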

Page 52: Elasticity and  petri  nets
Page 53: Elasticity and  petri  nets

Motivation (designer’s view)

• Modularity for system-on-chip design
  - Plug-and-play interconnectivity
• Average-case performance
  - No worst-case delay synchronization
• Many interfaces are asynchronous
  - Buses, networks, ...

Page 54: Elasticity and  petri  nets

Motivation (technology aspects)

• Low power
  - Automatic clock gating
• Electromagnetic compatibility
  - No peak currents around clock edges
• Security
  - No ‘electro–magnetic difference’ between logical ‘0’ and ‘1’ in dual rail code
• Robustness
  - High immunity to technology and environment variations (temperature, power supply, ...)

Page 55: Elasticity and  petri  nets

Dissuasion

• Concurrent models for specification
  - CSP, Petri nets, ...: no more FSMs
• Difficult to design
  - Hazards, synchronization
• Complex timing analysis
  - Difficult to estimate performance
• Difficult to test
  - No way to stop the clock