RCIM 2008 - Modello Generale (General Model)

POLITECNICO DI MILANO. Core Identification for Reconfigurable Systems driven by Specification Self-Similarity. Roberto Cordone: [email protected], Massimo Redaelli: [email protected]. Reconfigurable Computing Italian Meeting, 19 December 2008, Room S01, Politecnico di Milano - Milan (Italy)


Page 1: RCIM 2008 - Modello Generale

POLITECNICO DI MILANO

Core Identification for Reconfigurable Systems driven by Specification Self-Similarity

Roberto Cordone: [email protected]

Massimo Redaelli: [email protected]

Reconfigurable Computing Italian Meeting, 19 December 2008

Room S01, Politecnico di Milano - Milan (Italy)

Page 2: RCIM 2008 - Modello Generale

Outline

Introduction

General Problem

Rationale

Core Identification solutions

Results

Concluding Remarks

Page 3: RCIM 2008 - Modello Generale

The problem

1. Partition a specification into subsets of operations (tasks)

2. Map each task onto a compatible circuit design (mode)

3. Assign a portion of the device to each task, compatibly with its mode (size, shape, heterogeneity)

4. Assign a reconfiguration time to each task

5. Assign an execution time to each task

Page 4: RCIM 2008 - Modello Generale

The data (1)

A specification DFG = (O,P)

  operations O, including o_s and o_e for start and end

  precedences P: (o, o’) means that o ends before o’ starts

A set M of modes, characterized by

  size c_m (number of CLBs, possibly shape)

  reconfiguration time d_m

A compatibility relation between modes and tasks

  a task S can be implemented in different modes (M_S)

a mode can implement different tasks

Page 5: RCIM 2008 - Modello Generale

The data (2)

A latency l_{S,m} associated with each task S and compatible mode m

A set U of reconfigurable units (RUs)

  size γ_u is the number of CLBs in unit u

A scheduling time horizon T (provided by a heuristic)

Page 6: RCIM 2008 - Modello Generale

Decision variables

Partition O into tasks (set x_S = 1 or 0 for each S ⊆ O)

Map each used task S onto a compatible mode m_S ∈ M_S

Assign to each used task S a portion U_S ⊆ U compatible with m_S

Assign to each used task S a reconfiguration start time τ_S

Assign to each used task S an execution start time t_S

Page 7: RCIM 2008 - Modello Generale

A general model (1)

Minimize the completion time, subject to:

x_S defines a partition of O, with singletons for o_s and o_e and no induced cyclic precedence

mode m_S is compatible with task S

mode m_S fits into portion U_S

portion U_S is connected (to minimize communication overhead)

further shape constraints on portion U_S

further compatibility constraints between mode m_S and portion U_S (e.g., heterogeneous RUs)
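The first condition of the model (the used tasks form a partition of O and induce no cyclic precedence) can be checked mechanically. The sketch below is only an illustration under assumptions: the function names, the tiny example DFG and the encoding of tasks as Python sets are hypothetical, and the singleton requirement on o_s and o_e is not checked.

```python
from collections import defaultdict, deque

def induced_arcs(precedences, task_of):
    """Collapse DFG precedences (o, o') into arcs between distinct tasks."""
    arcs = set()
    for o, o2 in precedences:
        a, b = task_of[o], task_of[o2]
        if a != b:
            arcs.add((a, b))
    return arcs

def is_acyclic(nodes, arcs):
    """Kahn's algorithm: acyclic iff every node can be removed in topological order."""
    indeg = {n: 0 for n in nodes}
    succ = defaultdict(list)
    for a, b in arcs:
        succ[a].append(b)
        indeg[b] += 1
    queue = deque(n for n in indeg if indeg[n] == 0)
    removed = 0
    while queue:
        n = queue.popleft()
        removed += 1
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    return removed == len(indeg)

def is_valid_partition(operations, precedences, tasks):
    """tasks: list of sets of operations (the subsets S with x_S = 1)."""
    covered = [o for S in tasks for o in S]
    if sorted(covered) != sorted(operations):   # each operation in exactly one task
        return False
    task_of = {o: i for i, S in enumerate(tasks) for o in S}
    return is_acyclic(range(len(tasks)), induced_arcs(precedences, task_of))

# Hypothetical chain os -> a -> b -> c -> oe
operations = ["os", "a", "b", "c", "oe"]
precedences = [("os", "a"), ("a", "b"), ("b", "c"), ("c", "oe")]
good = [{"os"}, {"a", "b"}, {"c"}, {"oe"}]
bad = [{"os"}, {"a", "c"}, {"b"}, {"oe"}]       # {a, c} and {b} wait on each other
```

Grouping a and c while leaving b alone makes the two tasks depend on each other, which is exactly the induced cyclic precedence that the model forbids.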

Page 8: RCIM 2008 - Modello Generale

A general model (2)

the execution follows the reconfiguration

the precedences are respected: for all S and S’ such that x_S = x_S’ = 1 and a precedence in P links an operation of S to an operation of S’

two tasks cannot run together on the same RU: for all S and S’ such that x_S = x_S’ = 1

when a task is in execution, its RUs cannot be reconfigured: for all S and S’ such that x_S = x_S’ = 1

when a task is in reconfiguration, another task can share the reconfiguration, but only using the same RUs and mode: for all S and S’ such that x_S = x_S’ = 1
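One plausible way to write the constraints of this slide, using the symbols from the data and decision-variable slides. This is a hedged reconstruction, not the authors' formulation: it assumes that the completion time is the execution start time of the end task {o_e}, that time intervals are half-open, and that the device reconfigures one thing at a time (so two reconfigurations may overlap only when they are the same reconfiguration).

```latex
\begin{align*}
\min\ & t_{\{o_e\}} \\
\text{s.t. } & t_S \ge \tau_S + d_{m_S}
  && \forall S:\ x_S = 1 \\
& t_{S'} \ge t_S + l_{S,m_S}
  && \forall S, S':\ x_S = x_{S'} = 1,\ \exists (o,o') \in P,\ o \in S,\ o' \in S' \\
& U_S \cap U_{S'} = \emptyset \ \lor\ t_S + l_{S,m_S} \le t_{S'} \ \lor\ t_{S'} + l_{S',m_{S'}} \le t_S
  && \forall S \ne S':\ x_S = x_{S'} = 1 \\
& U_S \cap U_{S'} = \emptyset \ \lor\ \tau_{S'} + d_{m_{S'}} \le t_S \ \lor\ t_S + l_{S,m_S} \le \tau_{S'}
  && \forall S \ne S':\ x_S = x_{S'} = 1 \\
& (U_S = U_{S'} \land m_S = m_{S'}) \ \lor\ \tau_S + d_{m_S} \le \tau_{S'} \ \lor\ \tau_{S'} + d_{m_{S'}} \le \tau_S
  && \forall S \ne S':\ x_S = x_{S'} = 1
\end{align*}
```

The disjunctions would have to be linearized (for example with big-M binary variables) before the model could be handed to a MILP solver.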

Page 9: RCIM 2008 - Modello Generale

Some remarks

The partition of O turns the DFG (O,P) into a Task Dependency Graph TDG = (N,A)

The TDG is also acyclic (precedence constraints)

Partitioning, mapping, placing and scheduling are not independent

The size of the search space is overwhelming: for each subset of operations, one must define

  a mode, out of |M| available ones

  a subset of RUs, out of |U| available ones

  a reconfiguration start time out of |T| available ones

  an execution start time out of |T| available ones

Decomposition approach: build a partition x_S independent from the scheduling, but good enough for scheduling purposes
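To give a feeling for the numbers involved, the following sketch counts the choices listed above. It is a rough upper bound under assumptions: |T| is read as the number of discrete time steps in the horizon, only non-empty RU subsets are counted, all feasibility constraints are ignored, and the instance sizes are made up.

```python
def assignments_per_task(n_modes, n_units, horizon):
    """Upper bound on (mode, RU subset, reconfiguration start, execution start) tuples."""
    ru_subsets = 2 ** n_units - 1          # non-empty subsets of U
    return n_modes * ru_subsets * horizon ** 2

def candidate_tasks(n_operations):
    """Number of non-empty subsets S of O that could become a task."""
    return 2 ** n_operations - 1

# Hypothetical small instance: 4 modes, 8 RUs, 100 time steps, 20 operations
per_task = assignments_per_task(n_modes=4, n_units=8, horizon=100)
subsets = candidate_tasks(20)
print(per_task)   # 10200000
print(subsets)    # 1048575
```

Even this toy instance has about ten million assignments for each of about one million candidate subsets, which motivates decoupling the partitioning from the scheduling.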

Page 10: RCIM 2008 - Modello Generale

The Proposed Approach - Rationale

Reconfiguration times heavily impact the latency of the final solution

Reuse the configurable modules!

Our approach: automatically identify recurrent structures in the specification

Page 11: RCIM 2008 - Modello Generale

The Proposed Approach

[Figure: flow from source code (fragment shown: int test_code(int io, int *o1) { int a = 2, b = 10; …) to Specification DFG, Partitioned DFG and Reconfigurable Implementation]

Page 12: RCIM 2008 - Modello Generale

The Proposed Approach: DFG Partitioning

Our approach: two phases

Template Identification

  Produce a collection of isomorphism equivalence classes, each containing some isomorphic subgraphs of the original specification

Graph covering (template choice)

  Choose which of the identified templates are best suited for implementation as (re)configurable modules

Page 13: RCIM 2008 - Modello Generale

Template identification

Problem: finding operations that are performed repeatedly in the specification.

In the available literature (Software Engineering): extracting procedures from flat (maybe legacy) code

Text-based matching approach (Ducasse et al. 1999, Baker 1995)

AST approach (Baxter et al. 1998)

Source-based metrics approach (Higo et al. 2002, 2004)

Page 14: RCIM 2008 - Modello Generale

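Isomorphic graphs

Two graphs G = (V, E) and G’ = (V’, E’) are isomorphic iff there exists a bijection f: V → V’ such that {u, v} ∈ E ⟺ {f(u), f(v)} ∈ E’

or, if directed, (u, v) ∈ E ⟺ (f(u), f(v)) ∈ E’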
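The standard definition of graph isomorphism translates directly into a brute-force test that tries every bijection between the node sets. The sketch below is only meant to make the definition concrete; the function name and the example graphs are hypothetical, and the factorial running time makes it usable on tiny graphs only.

```python
from itertools import permutations

def are_isomorphic(nodes_a, edges_a, nodes_b, edges_b, directed=True):
    """Try every bijection f: nodes_a -> nodes_b (only viable for tiny graphs)."""
    nodes_a, nodes_b = list(nodes_a), list(nodes_b)

    def norm(u, v):
        # Directed edges keep their orientation, undirected ones do not.
        return (u, v) if directed else frozenset((u, v))

    source = {norm(u, v) for u, v in edges_a}
    target = {norm(u, v) for u, v in edges_b}
    if len(nodes_a) != len(nodes_b) or len(source) != len(target):
        return False
    for image in permutations(nodes_b):
        f = dict(zip(nodes_a, image))
        if {norm(f[u], f[v]) for u, v in edges_a} == target:
            return True
    return False

chain = [("a", "b"), ("b", "c")]          # a -> b -> c
fan_in = [("a", "b"), ("c", "b")]         # a -> b <- c
other_chain = [("x", "y"), ("y", "z")]    # x -> y -> z
```

With these graphs, chain and other_chain are isomorphic as directed graphs, while fan_in matches other_chain only when edge directions are ignored.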

Page 15: RCIM 2008 - Modello Generale

Problems with Isomorphism

• Several problems have been investigated:

1. Graph Isomorphism

2. Subgraph Isomorphism (GT48)

3. Largest Common Subgraph (GT49)

• However, we are concerned with only one graph:

• Isomorphic Subgraphs

• Find two isomorphic subgraphs S1 and S2 of a given graph G

Page 16: RCIM 2008 - Modello Generale

Our problem peculiarities

The input graph is a Data Flow Graph. Then:

Each operation/node has an associated action;

The inputs of every operation performing a non-commutative action must be distinguished

Page 17: RCIM 2008 - Modello Generale

The Algorithm

1. Build a collection V of pairs of basic isomorphic subgraphs;

2. Extract one pair (S, S’) from V;

  a) build the non-overlapping neighborhoods N(S) and N(S’), which include the nodes adjacent, respectively, to S and S’. If either of them is empty, go to 3;

  b) perform a maximum cardinality bipartite matching between N(S) and N(S’);

  c) for each matched pair, if adding the two nodes to S and S’ preserves the isomorphism, add them to S and S’. Go to 2(a).

3. Save the maximal isomorphic non-overlapping subgraphs S and S’. Go to 2.
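Steps 2(a) to 2(c) can be sketched as follows for a single seed pair. This is an illustration under assumptions rather than the authors' implementation: the DFG is encoded as a dict of operation labels plus a set of (source, sink, input position) triples, all function names are hypothetical, nodes adjacent to both subgraphs are simply dropped from both neighborhoods, two nodes may be matched only if they perform the same operation, and growth also stops when a round adds no pair (a termination rule the slide does not state).

```python
def neighbours(edges, S):
    """Nodes outside S adjacent (in either direction) to a node of S."""
    out = set()
    for u, v, _ in edges:
        if u in S and v not in S:
            out.add(v)
        elif v in S and u not in S:
            out.add(u)
    return out

def max_matching(left, right, compatible):
    """Maximum cardinality bipartite matching via augmenting paths (Kuhn)."""
    match = {}                                   # right node -> left node

    def augment(a, seen):
        for b in right:
            if b not in seen and compatible(a, b):
                seen.add(b)
                if b not in match or augment(match[b], seen):
                    match[b] = a
                    return True
        return False

    for a in left:
        augment(a, set())
    return [(a, b) for b, a in match.items()]

def preserves_isomorphism(edges, phi, a, b):
    """True if extending the mapping phi with a -> b keeps S and S' isomorphic."""
    ext = {**phi, a: b}
    image_nodes = set(ext.values())
    mapped = {(ext[u], ext[v], p) for u, v, p in edges
              if u in ext and v in ext and a in (u, v)}
    expected = {(u, v, p) for u, v, p in edges
                if u in image_nodes and v in image_nodes and b in (u, v)}
    return mapped == expected

def grow_pair(ops, edges, seed_a, seed_b):
    """Expand one seed pair into a maximal pair of non-overlapping isomorphic subgraphs."""
    phi = {seed_a: seed_b}                       # isomorphism S -> S'
    while True:
        S, S2 = set(phi), set(phi.values())
        Na = neighbours(edges, S) - S2
        Nb = neighbours(edges, S2) - S
        Na, Nb = Na - Nb, Nb - Na                # simplification: drop shared neighbours
        if not Na or not Nb:
            return S, S2, phi
        added = False
        same_op = lambda a, b: ops[a] == ops[b]
        for a, b in max_matching(sorted(Na), sorted(Nb), same_op):
            if preserves_isomorphism(edges, phi, a, b):
                phi[a] = b
                added = True
        if not added:
            return set(phi), set(phi.values()), phi

# Hypothetical DFG: two identical mul -> add -> shl chains feeding one xor
ops = {1: "mul", 2: "add", 3: "shl", 4: "mul", 5: "add", 6: "shl", 7: "xor"}
edges = {(1, 2, 0), (2, 3, 0), (4, 5, 0), (5, 6, 0), (3, 7, 0), (6, 7, 1)}
S, S2, phi = grow_pair(ops, edges, 1, 4)
```

Starting from the two mul nodes, the pair grows to S = {1, 2, 3} and S’ = {4, 5, 6}; the xor node is adjacent to both subgraphs and is therefore left out.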

Page 18: RCIM 2008 - Modello Generale

Sample run

Page 19: RCIM 2008 - Modello Generale

The initialization?

Choose good starting points…

Iterate through all the edges, and group together those with the same

  Source operation o1

  Sink operation o2

  Input order

They induce pairs of nodes which are good starting points for the algorithm
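The grouping of edges described above might look like the sketch below. It assumes the same hypothetical DFG encoding as before (operation labels per node, edges as (source, sink, input position) triples); the function name and the rule that two seed edges must not share a node are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

def seed_pairs(ops, edges):
    """Group edges by (source op, sink op, input position), then pair them up."""
    groups = defaultdict(list)
    for u, v, port in edges:
        groups[(ops[u], ops[v], port)].append((u, v))
    seeds = []
    for same in groups.values():
        for (u1, v1), (u2, v2) in combinations(same, 2):
            if len({u1, v1, u2, v2}) == 4:       # the two edges must not share a node
                seeds.append(((u1, v1), (u2, v2)))
    return seeds

# Hypothetical DFG: two identical mul -> add -> shl chains feeding one xor
ops = {1: "mul", 2: "add", 3: "shl", 4: "mul", 5: "add", 6: "shl", 7: "xor"}
edges = [(1, 2, 0), (2, 3, 0), (4, 5, 0), (5, 6, 0), (3, 7, 0), (6, 7, 1)]
seeds = seed_pairs(ops, edges)
```

Here the two shl -> xor edges enter the xor at different input positions, so they fall into different groups and produce no seed, which is how the input order of a non-commutative operation is taken into account.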

Page 20: RCIM 2008 - Modello Generale

Structuring the output

The algorithm returns a list of pairs:

{ (S1, S2), (S3, S4), (S5, S6), …}

Suppose S1 and S3 are isomorphic. Then so are S2 and S4!

Suppose S3 is isomorphic to a subgraph of S1. Then S2 has a subgraph isomorphic to S4!
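Because isomorphism is transitive, the list of pairs can be merged into equivalence classes, for instance with a union-find structure. The sketch below covers only the first observation (merging whole classes), not the subgraph hierarchy. The function name is hypothetical, and the isomorphism test is passed in as a parameter; the toy test used in the example (same multiset of operation labels) is a stand-in and not a real isomorphism check.

```python
def template_classes(pairs, isomorphic):
    """Merge subgraph pairs into isomorphism classes with a union-find structure."""
    subgraphs = [s for pair in pairs for s in pair]
    parent = list(range(len(subgraphs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]        # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    for k in range(0, len(subgraphs), 2):
        union(k, k + 1)                          # each pair is isomorphic by construction
    for i in range(len(subgraphs)):
        for j in range(i + 1, len(subgraphs)):
            # Transitivity: one positive test merges the two whole classes.
            if find(i) != find(j) and isomorphic(subgraphs[i], subgraphs[j]):
                union(i, j)
    classes = {}
    for i, s in enumerate(subgraphs):
        classes.setdefault(find(i), []).append(s)
    return list(classes.values())

# Hypothetical data: subgraphs as node sets, compared by their operation labels only
ops = {1: "mul", 2: "add", 4: "mul", 5: "add", 8: "mul", 9: "add",
       11: "mul", 12: "add", 3: "shl", 6: "shl"}
pairs = [(frozenset({1, 2}), frozenset({4, 5})),
         (frozenset({8, 9}), frozenset({11, 12})),
         (frozenset({3}), frozenset({6}))]
toy_test = lambda a, b: sorted(ops[n] for n in a) == sorted(ops[n] for n in b)
classes = template_classes(pairs, toy_test)
```

The first two pairs collapse into one class with four instances, while the shl pair stays a class of its own.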

Page 21: RCIM 2008 - Modello Generale

Hierarchical Template Graph

Size matters, but so does frequency…

Page 22: RCIM 2008 - Modello Generale

Template choice: metrics

Largest Fit First

  Largest templates are best

Most Frequent Fit First

  Templates with the largest number of instances are best

Communication Weight metrics

  E.g., ratio of #internal edges to #boundary edges
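The three metrics can be plugged into a simple greedy covering loop. The sketch below is a guess at how such a loop could work, not the procedure used for the reported results: the function names, the rule that a template is kept only if at least two non-overlapping instances survive, and the example data are all assumptions.

```python
def internal_boundary_ratio(edges, instance):
    """Communication metric: edges inside the instance vs. edges crossing its border."""
    internal = sum(1 for u, v in edges if u in instance and v in instance)
    boundary = sum(1 for u, v in edges if (u in instance) != (v in instance))
    return internal / boundary if boundary else float("inf")

def greedy_cover(n_nodes, templates, score):
    """Visit templates by decreasing score; keep instances that overlap nothing chosen so far."""
    covered, chosen = set(), []
    ranked = sorted(templates.items(), key=lambda kv: score(kv[1]), reverse=True)
    for name, instances in ranked:
        kept, used = [], set()
        for inst in instances:
            if not (inst & (covered | used)):
                kept.append(inst)
                used |= inst
        if len(kept) >= 2:                       # a template pays off only if it is reused
            covered |= used
            chosen.append((name, kept))
    return chosen, 100.0 * len(covered) / n_nodes

largest_fit_first = lambda instances: len(instances[0])    # template size in nodes
most_frequent_first = lambda instances: len(instances)     # number of instances

# Hypothetical 10-node specification with two templates
templates = {
    "A": [{1, 2, 3}, {4, 5, 6}],
    "B": [{1, 2}, {4, 5}, {7, 8}, {9, 10}],
}
_, cover_lff = greedy_cover(10, templates, largest_fit_first)
_, cover_mff = greedy_cover(10, templates, most_frequent_first)
print(cover_lff, cover_mff)   # 100.0 80.0
```

In this toy case Largest Fit First covers more nodes than Most Frequent Fit First, because choosing the small, frequent template first blocks both instances of the large one.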

Page 23: RCIM 2008 - Modello Generale

Experimental Results: Reversed-tree templates

Benchmark             Largest Template   Largest #Instances   #Templates
AES - encryptblock    16                 3                    151
AES - decryptblock    19                 3                    162
DES - des_encrypt     38                 4                    57
FDCT                  6                  6                    40

Page 24: RCIM 2008 - Modello Generale

Experimental Results: Free-shape templates

Benchmark             Largest Template   Largest #Instances   #Templates
AES - encryptblock    132                2                    6790
AES - decryptblock    147                2                    11006
DES - des_encrypt     100                2                    1802
FDCT                  62                 2                    1470

Page 25: RCIM 2008 - Modello Generale

Experimental Results: Graph covering

Benchmark             Cover % (LFF)   Cover % (MFF)   Cover % (Comm)   CPU Time
AES - encryptblock    74.3            32.7            74.1             32.5 sec
AES - decryptblock    85.31           51.7            70.8             61 sec
DES - des_encrypt     90.5            59.6            87.8             8.3 sec
FDCT                  76.7            53.8            73.3             6.4 sec

Page 26: RCIM 2008 - Modello Generale

Experimental results

Page 27: RCIM 2008 - Modello Generale

Questions