Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines Arash Saifhashemi Peter A. Beerel University of Southern California USC Asynchronous CAD/VLSI Group (async.usc.edu) (Thanks to a grant from Intel and NSF) Patmos 2012, Sep 2012, Newcastle upon Tyne

Page 1: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Arash Saifhashemi
Peter A. Beerel

University of Southern California
USC Asynchronous CAD/VLSI Group (async.usc.edu)
(Thanks to a grant from Intel and NSF)

Patmos 2012, Sep 2012, Newcastle upon Tyne

Page 2: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Asynchronous Circuit Design - Today

Applications
• 3D networks-on-chip (STMicroelectronics)
• Ethernet switches (Intel SRD)
• Ultra-high-speed FPGAs (Achronix)
• Process variation
• Low-power chip design (encryption: Tiempo, …)

Basic challenges: Automation

Proteus design flow (USC)
• Uses commercial synchronous CAD tools
• Starts from a high-level specification written in SVC (SystemVerilogCSP)

[Figure: example chips]
• Fulcrum Microsystems Ethernet switch chip (up to 72 10G ports, 40G): 1.2 B transistors, 90% Asynchronous 13% Proteus
• Tiempo TAM16: clockless 16-bit microcontroller
• STMicroelectronics WIOMING 3D-IC (July 2012)
• Achronix FPGA: 1.7 M LUTs, 2.1 Gbps IO

Page 3: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

The Proteus Flow

[Figure: Proteus flow diagram. Block and arrow labels: ClockFree SystemVerilog, SVC2RTL, Design Goals, Synth. RTL, Verilog, Constraints, Synthesis, Sync Library, Image Netlist, Clock Gating, Clock Tree Synthesis, Proteus/Sync Library, ClockFree, Netlist, Async Netlist, Physical Design, Final Layout]

Key Features
• Re-uses synchronous EDA tools
• Seamless integration into existing flows
• Up to 2X higher performance

Tool Status
• Started at USC Async CAD/VLSI
• Commercialized by TimeLess (2008)
• Acquired by Fulcrum (2010)
• Intel acquired Fulcrum (2011)
• Used in Intel Ethernet Alta FM6000 chip

The Problem
• Limited and manual power optimization

Page 4: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Conditional Communication in Proteus

[Figure: conditional RECEIVE and SEND cells with control values 0 and 1. Labels: Not received, Dummy value, Not sent]

Page 5: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Example: ALU

SVC Description

No conditionality in high-level description

Page 6: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Reconverging fanouts

[Figure: datapath with an adder (+)]

Unnecessary calculation

Page 7: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Adding Isolation Cells

• All inputs/outputs are unconditional
• Operand isolation
• AND-based isolation cells
• Generated by synchronous RTL synthesizer
• Does not prevent switching in asynchronous circuits

Isolation cells are not effective in asynchronous circuits

Page 8: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Three-valued logic

• Formal justification of conditioning
• Three-valued logic image model
• Each iteration is modeled by a clock cycle
• Each variable can be 0, 1, or N (no token)

Status of each channel

One iteration

Page 9: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

3VL Unconditional Functions

Unconditional functions

• Can be represented using only the ∧, ∨, and ¬ operators

• Example: functions represented by combinational gates in a typical cell library: NAND, NOR, AOI, XOR, …

Lemma 1: the output is N iff at least one of the inputs is N.
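Lemma 1 can be illustrated with a minimal Python sketch. The helper `lift` and the names `nand3` and `xor3` are hypothetical and introduced only for illustration; the sketch assumes an unconditional gate produces an output token exactly when every input carries a token.

```python
from itertools import product

N = "N"            # the "no token" value of the three-valued model
VALUES = (0, 1, N)

def lift(gate):
    """Lift a Boolean gate to an unconditional three-valued function."""
    def lifted(*inputs):
        if any(v == N for v in inputs):
            return N   # a missing input token means no output token
        return gate(*inputs)
    return lifted

# Two gates from a typical cell library, lifted to three-valued logic.
nand3 = lift(lambda a, b: 1 - (a & b))
xor3 = lift(lambda a, b: a ^ b)

# Exhaustive check of Lemma 1: the output is N iff at least one input is N.
for gate in (nand3, xor3):
    for a, b in product(VALUES, repeat=2):
        assert (gate(a, b) == N) == (N in (a, b))
```

The check is exhaustive over all nine input pairs, which is feasible here only because the gates have two inputs.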

Page 10: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

SEND/RECEIVE Operators

• Conditional communication
• RECEIVE and SEND are modeled as the Ⓡ and Ⓢ operators

Behave like buffers when E=1
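The two operators can be sketched in Python as follows. Only the E=1 behavior (a buffer) is stated on this slide; the E=0 behavior is inferred from the "Not sent", "Not received", and "Dummy value" labels of the conditional-communication slide, and the dummy value 0 and the handling of an absent enable token are assumptions.

```python
N = "N"      # "no token"
DUMMY = 0    # assumed dummy value produced by a skipped RECEIVE

def send(x, e):
    """x (S) e: forward x when e == 1, otherwise emit no token."""
    if e == 1:
        return x
    return N   # e == 0: not sent; e == N: no enable token (assumed to yield N)

def receive(x, e):
    """x (R) e: forward x when e == 1, otherwise emit a dummy value."""
    if e == N:
        return N        # no enable token (assumption)
    if e == 1:
        return x
    return DUMMY        # e == 0: input not received, dummy value produced

# Both operators behave like buffers when e == 1.
assert all(send(x, 1) == x and receive(x, 1) == x for x in (0, 1, N))
```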

Page 11: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

SEND Reconditioning

Assuming y=f(x) is unconditional and e ∉ TFO(y)

Lemma 2:

Application: SEND cells can be moved through logic

• Similar to retiming in synchronous circuits

Less switching when e=0

Fewer SENDs
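One way to read "SEND cells can be moved through logic" is that a SEND at the output of an unconditional gate is interchangeable with SENDs on all of the gate's inputs. The Python sketch below checks that assumed identity exhaustively for a two-input AND; the operator semantics (a SEND emits N unless its enable is 1) and all function names are assumptions made for illustration.

```python
from itertools import product

N = "N"
VALUES = (0, 1, N)

def send(x, e):
    # Assumed semantics: forward x when e == 1, otherwise no token.
    return x if e == 1 else N

def lift(gate):
    # Unconditional gate: output is N if any input is N.
    def lifted(*inputs):
        return N if any(v == N for v in inputs) else gate(*inputs)
    return lifted

and3 = lift(lambda a, b: a & b)

def send_at_output(a, b, e):
    return send(and3(a, b), e)            # one SEND, gate always switches

def send_at_inputs(a, b, e):
    return and3(send(a, e), send(b, e))   # two SENDs, gate idle when e == 0

mismatches = [
    (a, b, e)
    for a, b, e in product(VALUES, repeat=3)
    if send_at_output(a, b, e) != send_at_inputs(a, b, e)
]
assert mismatches == []
```

The two placements trade off as the slide notes: SENDs on the inputs keep the gate from switching when e=0, while a single SEND on the output needs fewer SEND cells.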

Page 12: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Observability in 3V Networks

Local Observability Partial Care (LOPC)
• OPC(f,C,xj) of input xj of a node representing a function f is the condition under which f’s output is not affected as xj changes in C ⊆ {0,1,N}

Global Observability Partial Care (GOPC)
• GOPC(C,x) of a variable x is the condition under which the value of no primary output is affected as the value of x changes in C ⊆ {0,1,N}

• Example: $\mathrm{OPC}(\mathrm{Mux}, \{0,1\}, i_1) = s^{\{1\}}\, i_2^{\{0,1\}}$
• i1 changes in {0,1} are not observable when s = 1 and (i2 = 0 or i2 = 1)

$\mathrm{OPC}(f, C, x) \rightarrow \mathrm{GOPC}(C, x)$ (OPC implies GOPC)
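The Mux example can be reproduced by brute force. This Python sketch assumes that s=1 selects i2 and that the other inputs (s and i2) carry tokens, so they range over {0,1} only; the function names are hypothetical.

```python
from itertools import product

def mux(s, i1, i2):
    # Assumption: s == 1 selects i2, s == 0 selects i1.
    return i2 if s == 1 else i1

def opc_mux_i1(change_set=(0, 1)):
    """Return the (s, i2) assignments under which the mux output
    does not change as i1 ranges over change_set."""
    care = []
    for s, i2 in product((0, 1), repeat=2):
        outputs = {mux(s, v, i2) for v in change_set}
        if len(outputs) == 1:      # output insensitive to i1
            care.append((s, i2))
    return care

print(opc_mux_i1())   # [(1, 0), (1, 1)]
```

The result lists s=1 with i2=0 or i2=1, matching the condition stated in the example.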

Page 13: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

GOPC Conditioning

When xj is not observable…
• Add a SEND followed by a RECEIVE
• Move the SENDs using SEND reconditioning

Lemma 3: If $e^{\{0\}} \rightarrow \mathrm{GOPC}(\{0,1\}, x_1)$, then $f(\boldsymbol{x}) = (f(\boldsymbol{x})\ Ⓢ\ e)\ Ⓡ\ e$

[Figure: SEND reconditioning example. Annotated channel values: 0, "0 or 1", N, 1]
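A small Python sketch of the conditioning step, applied to the Mux example: a SEND followed by a RECEIVE is placed on operand i1, enabled by e = NOT s, so that e=0 exactly when i1 is not observable. The choice of enable, the dummy value 0, and the function names are assumptions made for illustration.

```python
from itertools import product

N = "N"
DUMMY = 0   # assumed dummy value produced by a skipped RECEIVE

def send(x, e):
    return x if e == 1 else N        # e == 0: operand is not sent

def receive(x, e):
    return x if e == 1 else DUMMY    # e == 0: nothing received, dummy produced

def mux(s, i1, i2):
    return i2 if s == 1 else i1      # assumption: s == 1 selects i2

def i1_channel(s, i1):
    """Value on the channel between the SEND and the RECEIVE."""
    return send(i1, 1 - s)

def mux_isolated(s, i1, i2):
    e = 1 - s                        # e == 0 exactly when i1 is unobservable
    return mux(s, receive(i1_channel(s, i1), e), i2)

# The conditioned mux matches the original for every binary input,
# while the i1 channel carries no token whenever s == 1.
for s, i1, i2 in product((0, 1), repeat=3):
    assert mux_isolated(s, i1, i2) == mux(s, i1, i2)
    assert (i1_channel(s, i1) == N) == (s == 1)
```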

Page 14: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Conditioning

[Figure: conditioned datapath with AND (&) and adder (+) nodes. Annotations: 0, 0, No Activity]

Page 15: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Inserting Isolating Nodes and Recognizing Enable Domains

Synchronous synthesis tools can insert isolating nodes
• Constrained to insert isolating nodes only on non-critical paths

Node u is in e’s Enable Domain OIED(e) if

• All paths starting from a primary input and ending at u include an isolating node controlled by e

• Detected using a DFS search
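The enable-domain test can be sketched as a depth-first search in Python. The sketch walks forward from the primary inputs without passing through isolating nodes controlled by e; any node that is never reached this way has an isolating node on every path from a primary input. The netlist, node names, and the `enable_domain` helper are hypothetical.

```python
from typing import Dict, List, Set

def enable_domain(fanout: Dict[str, List[str]],
                  primary_inputs: Set[str],
                  isolating: Set[str]) -> Set[str]:
    """Nodes for which every path from a primary input crosses an
    isolating node controlled by the enable under consideration."""
    reached: Set[str] = set()           # reachable without crossing isolation
    stack = list(primary_inputs)
    while stack:
        node = stack.pop()
        if node in reached or node in isolating:
            continue                    # do not traverse through isolating nodes
        reached.add(node)
        stack.extend(fanout.get(node, []))
    all_nodes = set(fanout) | {v for outs in fanout.values() for v in outs}
    return all_nodes - reached

# Hypothetical netlist: a multiplier fed only through isolating nodes
# controlled by e, and an adder fed directly by primary inputs.
fanout = {
    "a": ["iso_a", "add"], "b": ["iso_b"], "c": ["add"],
    "e": ["iso_a", "iso_b"],
    "iso_a": ["mul"], "iso_b": ["mul"],
    "mul": ["out"], "add": ["out"], "out": [],
}
domain = enable_domain(fanout, {"a", "b", "c", "e"}, {"iso_a", "iso_b"})
assert domain == {"iso_a", "iso_b", "mul"}
```

Under this literal reading of the definition, the isolating nodes themselves belong to the domain, and `out` does not because it is also reachable through the unisolated adder.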

Page 16: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Pre-layout Analysis

• Wu : power of receiving data on all inputs and sending the output (unconditional nodes)

• K: power of conditional nodes

• rf: activity factor

[Equations: total power; power of each domain; domain power after isolation (n inputs); benefit of isolating each domain]

Page 17: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Post-layout Experimental Results

• Case study: 32-bit ALU, placed and routed
• Back-annotated switching activity using a VCD file
• Results:
• Isolating ADD and SUB is detrimental for rADD and rSUB > 0.2

• 53% power reduction when only isolating MUL (rf=0.25)

• Area cost of isolating MUL is about 4% and no performance penalty

Page 18: Observability Conditions and Automatic Operand-Isolation in High-Throughput Asynchronous Pipelines

Conclusions and Future Work

Conditional communication in async. circuits is not free

• Creates area and performance overheads
• Requires manual or automatic optimization

Asynchronous circuits can/should leverage sync. tools

• This paper is the first to use 3-valued logic and observability don’t cares for power optimization of asynchronous circuits

Our future work
• Evaluate the proposed method on bigger designs
• Adopt other sync power optimization techniques such as clock gating
• Optimize the location of SEND/RECEIVE nodes (Reconditioning)