
1

Automatically Generating Custom Instruction Set Extensions

Nathan Clark, Wilkin Tang, Scott Mahlke
Workshop on Application Specific Processors

2

Problem Statement

There is demand for high-performance, low-power special-purpose systems, e.g. cell phones, network routers, and PDAs. One way to achieve these goals is to augment a general purpose processor with Custom Function Units (CFUs), which combine several primitive operations. We propose an automated method for CFU generation.

3

System Overview

4

Example

[Dataflow graph of eight primitive operations, numbered 1-8]

Potential CFUs: {1,3}, {2,4}, {2,6}, {3,4}, {4,5}, {5,8}, {6,7}, {7,8}

5

Example

[Same dataflow graph; candidates grown to three operations]

Potential CFUs: {1,3}, {2,4}, {2,6}, …, {1,3,4}, {2,4,5}, {2,6,7}, …

6

Example

[Same dataflow graph; candidates grown to four and five operations]

Potential CFUs: {1,3}, {2,4}, {2,6}, …, {1,3,4,5}, {2,4,5,8}, {2,6,7,8}, …, {1,3,4,5,8}
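For illustration, here is a minimal sketch of how such candidate patterns can be enumerated: seed with single operations, then repeatedly grow each pattern by one connected operation, keeping every connected subgraph as a potential CFU. The connectivity table below is inferred from the size-2 patterns above; everything else (the bitmask representation, C as the language) is an assumption, and the paper's actual discovery algorithm may differ, for example by accounting for dataflow direction and operand counts.

/* Sketch of candidate-CFU enumeration over the example dataflow graph.
 * Each pattern is a bitmask over operations 1..8; connectivity is taken
 * from the size-2 patterns listed above.  Illustrative only. */
#include <stdio.h>

#define N 8

static const unsigned adj[N] = {        /* neighbours of op i+1          */
    (1u << 2),                          /* op 1: {3}                     */
    (1u << 3) | (1u << 5),              /* op 2: {4, 6}                  */
    (1u << 0) | (1u << 3),              /* op 3: {1, 4}                  */
    (1u << 1) | (1u << 2) | (1u << 4),  /* op 4: {2, 3, 5}               */
    (1u << 3) | (1u << 7),              /* op 5: {4, 8}                  */
    (1u << 1) | (1u << 6),              /* op 6: {2, 7}                  */
    (1u << 5) | (1u << 7),              /* op 7: {6, 8}                  */
    (1u << 4) | (1u << 6),              /* op 8: {5, 7}                  */
};

static void print_pattern(unsigned p) {
    int first = 1;
    printf("{");
    for (int i = 0; i < N; i++)
        if (p & (1u << i)) { printf(first ? "%d" : ",%d", i + 1); first = 0; }
    printf("} ");
}

int main(void) {
    unsigned patterns[1u << N];
    int count = 0;

    for (int i = 0; i < N; i++)          /* seed with single operations   */
        patterns[count++] = 1u << i;

    for (int i = 0; i < count; i++) {    /* grow each pattern by one op   */
        unsigned p = patterns[i], frontier = 0;
        for (int v = 0; v < N; v++)
            if (p & (1u << v)) frontier |= adj[v];
        frontier &= ~p;                  /* adjacent ops not yet included */
        for (int v = 0; v < N; v++) {
            if (!(frontier & (1u << v))) continue;
            unsigned grown = p | (1u << v);
            int seen = 0;
            for (int j = 0; j < count; j++)
                if (patterns[j] == grown) { seen = 1; break; }
            if (!seen) patterns[count++] = grown;
        }
    }

    for (int i = 0; i < count; i++)      /* report multi-op patterns only */
        if (patterns[i] & (patterns[i] - 1)) print_pattern(patterns[i]);
    printf("\n");
    return 0;
}

Run as written, this first reproduces the eight two-operation patterns from the first example slide and then grows them into the larger candidates shown here, along with the others elided by the ellipses.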

7

Characterization

Use the macro library to get information on each potential CFU: latency is the sum of each primitive's latency, and area is the sum of each primitive's macrocell area.
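As a rough sketch of that characterization step (the opcodes, latencies, areas, and the candidate below are illustrative placeholders, not data from the paper's macro library), the latency and area of a candidate CFU reduce to sums over its primitives:

/* Sketch: characterize a candidate CFU by summing per-primitive
 * latency and macrocell area taken from a macro library table.
 * All entries are illustrative placeholders. */
#include <stdio.h>

typedef struct {
    const char *opcode;
    double latency_ns;  /* primitive latency                 */
    double area;        /* macrocell area (arbitrary units)  */
} MacroEntry;

static const MacroEntry macro_lib[] = {
    { "ADD", 0.60, 1.0 },
    { "AND", 0.10, 0.2 },
    { "XOR", 0.10, 0.3 },
    { "LSL", 0.10, 0.4 },
};

int main(void) {
    /* A hypothetical candidate CFU: ADD -> AND -> LSL chained together. */
    const MacroEntry *cfu[] = { &macro_lib[0], &macro_lib[1], &macro_lib[3] };
    double latency = 0.0, area = 0.0;

    for (unsigned i = 0; i < sizeof cfu / sizeof cfu[0]; i++) {
        latency += cfu[i]->latency_ns;  /* CFU latency = sum of primitive latencies  */
        area    += cfu[i]->area;        /* CFU area    = sum of primitive macrocells */
    }
    printf("CFU latency = %.2f ns, area = %.2f\n", latency, area);
    return 0;
}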

8

Issues we consider

Performance: whether the pattern lies on the critical path, and how many cycles it saves (a rough estimate of the latter is sketched after the figure below).
Cost: CFU area; control logic (difficult to measure); decode logic (difficult to measure); register file area (can be amortized).

[Figure: example dataflow graph (LD, ADD, ADD, AND, ASL, XOR, BR) annotated with latency values (1, 1, 1, 1, 1, 0.1, 0.1, 0.1, 0.6, 0.6)]
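To make the cycles-saved estimate concrete, a minimal model (an assumption, not the paper's exact cost function) is: the baseline spends one cycle per primitive along a dependent chain, while the collapsed CFU occupies ceil(summed primitive latency / clock period) cycles. The 3.33 ns period matches the 300 MHz clock used in the experiments; the 10-operation, ~5 ns chain is a made-up example chosen to echo the Blowfish case study below.

/* Sketch: estimated cycles saved by collapsing a dependent chain into a CFU.
 * Model (an assumption): baseline takes one cycle per primitive; the CFU
 * takes ceil(summed primitive latency / clock period) cycles. */
#include <math.h>
#include <stdio.h>

static int cycles_saved(int ops_in_pattern, double summed_latency_ns,
                        double clock_period_ns) {
    int cfu_cycles = (int)ceil(summed_latency_ns / clock_period_ns);
    return ops_in_pattern - cfu_cycles;
}

int main(void) {
    /* Hypothetical 10-op chain summing to ~5 ns on a 300 MHz (3.33 ns) clock:
     * 10 baseline cycles collapse to 2, saving 8. */
    printf("cycles saved: %d\n", cycles_saved(10, 5.0, 3.33));
    return 0;
}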

9

More Issues to Consider

IO: the number of input and output operands.
Usability: how well the compiler can use the pattern.

[Figure: example pattern of OR, LSL, AND, and CMPP operations]

10

Selection

We currently use a greedy algorithm: pick the candidate with the best ratio of performance gain to area first. This can yield bad selections.

[Figure: candidate patterns over OR, LSL, AND, and CMPP operations]
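A minimal sketch of that greedy pass, under assumed inputs (the candidate names, gain and area numbers, and the area budget are all hypothetical): sort the candidates by gain per unit area and accept them until the budget runs out.

/* Sketch: greedy CFU selection by performance-gain-per-area.
 * Candidate values and the area budget are hypothetical placeholders. */
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    const char *name;
    double gain;  /* estimated cycles saved, weighted by execution frequency */
    double area;  /* estimated CFU area */
} Candidate;

static int by_gain_per_area(const void *a, const void *b) {
    const Candidate *ca = a, *cb = b;
    double ra = ca->gain / ca->area, rb = cb->gain / cb->area;
    return (ra < rb) - (ra > rb);   /* sort in descending order of ratio */
}

int main(void) {
    Candidate cands[] = {
        { "OR+LSL",       2.0, 0.6 },
        { "AND+CMPP",     1.5, 0.4 },
        { "OR+LSL+AND",   3.0, 0.8 },
        { "LSL+AND+CMPP", 2.5, 0.9 },
    };
    const int n = sizeof cands / sizeof cands[0];
    double budget = 1.5;  /* total CFU area we are willing to spend */

    qsort(cands, n, sizeof cands[0], by_gain_per_area);
    for (int i = 0; i < n; i++) {
        if (cands[i].area > budget) continue;   /* skip what no longer fits */
        budget -= cands[i].area;
        printf("selected %s (gain %.1f, area %.1f)\n",
               cands[i].name, cands[i].gain, cands[i].area);
    }
    return 0;
}

Because this sketch never checks whether accepted candidates overlap on the same operations, it can spend its budget on redundant patterns, which is one way a greedy pass can yield the bad selections noted above.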

11

Case study 1: Blowfish

Speedup: 1.24 (10 cycles can be compressed down to 2!)
Cost: ~6 adders; 6 inputs, 2 outputs
C code this DFG came from:

r ^= (((s[(t>>24)] + s[0x0100+((t>>16)&0xff)]) ^ s[0x0200+((t>>8)&0xff)])
      + s[0x0300+(t&0xff)]) & 0xffffffff;

[Figure: Blowfish dataflow graph built from ADD, XOR, AND, LSR, and LSL operations over register and immediate operands]

12

Case study 2: ADPCM Decode

Speedup: 1.20 (3 cycles can be compressed down to 1)
Cost: ~1.5 adders; 2 inputs, 2 outputs
C code this DFG came from:

d = d & 7;
if ( d & 4 ) { … }

[Figure: ADPCM dataflow graph (AND, AND, CMPP over r16 and the immediates #7, #4, #0)]

13

Experimental Setup

CFU recognition is implemented in the Trimaran research infrastructure. Speedups shown are with CFUs relative to a baseline machine: a four-wide VLIW with predication that can issue at most one integer, one floating-point, one memory, and one branch instruction per cycle, with a 300 MHz clock. CFU latency is estimated using standard cells from Synopsys' design library.

14

Varying the Number of CFUs

More CFUs yield more performance; a weakness in our selection algorithm causes plateaus.

[Chart for adpcm-decode: speedup (1.0 to 2.2) and additional cost (0 to 80) versus number of function units (0 to 20)]

15

Varying the Number of Ops

Bigger CFUs yield better performance. If they're too big, they can't be used as often and they expose alternate critical paths.

[Chart for blowfish: speedup (1.0 to 2.0) and additional cost (0 to 80) versus max number of ops per CFU (0 to 20)]

16

Related Work

Much prior work has done this for code size (Bose et al., Liao et al.), typically working from traces (Arnold et al.). A previous paper used a more enumerative discovery algorithm. We are unique because of our compiler-based approach and our novel analysis of CFUs.

17

Conclusion and Future Work

CFUs have the potential to offer big performance gains for small cost. Future work: recognize more complex subgraphs (generalized acyclic and cyclic subgraphs) and develop our system to automatically synthesize application-tailored coprocessors.
