gclp ppt

GCLP ALGORITHM FOR HARDWARE SOFTWARE

PARTITIONING PROBLEM

OVERVIEW

INTRODUCTIONHARDWARE SOFTWARE PARTITIONINGHARDWARE SOFTWARE PARTITIONING PROBLEMSGCLP ALGORITHM ALGORITHM FOUNDATIONGC/LPMODIFIED GCLP ALGORITHM FOR MULTIFUNCTION

SYSTEMSCONCLUSIONREFERENCES

INTRODUCTION

The application specific hardware is much faster than software, but it is more expensive.

Software is cheap and provides the great flexibility to modify, append the features and maintain. But software is slow with respect to hardware due to its sequential execution nature.

Hence in ideal system performance critical components should be realized in hardware and non critical components with variable values to be in software.

It is very challenging to create an ideal system. Traditionally, partitioning was carried out manually, however, as

the system to design have become more and more complex so need of portioning automation arises.

Partitioning Algorithm

GCLP:

Global Criticality/Local Phase algorithm

Introduced by Kalavade and Lee in 1994.

Goal:

To generate high quality solutions for a partitioning/scheduling problem.

Partitioning

In embedded system design the term partitioning combines two tasks:

Allocation, i.e. the selection of architectural components

Mapping, i.e. the binding of system functions to these components.

Hardware Software Partitioning

Goal: Map each node of a directed acyclic graph (DAG) to hardware or software (binary choice) and to determine the schedule for each node.

DAG: The task level description of an application is specified as SDF (synchronous data flow) graph, then SDF is translated to DAG representing precedence relationship among the nodes. A DAG is input to partitioning tool.

Hardware Software PartitioningProblem

Given a DAG, area and time estimates for hw and sw mapping of all nodes, and communication cost, subject to resource capacity constraints and deadline D, determine for each node i, the hw or sw mapping(Mi) and the start time for the execution of the node (schedule ti), such that the total area occupied by the nodes mapped to hardware is minimum.

Partitioning Algorithm(with various notations)

Graph parameters{G,D, ahi , asi,thi, tsi sizei, }

GCLP Algorithm

Architecture constraints{AH,AS, ahcomm , ascomm ,tcomm}

OutputsMi, ti

Graph Parameters

NOTATION INTERPRETATION

G DAG G=(N,A)where N is set of nodes representing computations and A is the set of arcs representing precedences.

D Dead line constraint. The execution time of the graph shouldn’t exceed D cycles.

ahiHardware area estimate for node i.

asiSoftware size estimate for node i.

thiHardware execution time estimate for node i.

tsiSoftware execution time estimate for node i.

sizeiSize of node i(number of atomic operations).

Architectural Constraints


AH Hardware capacity constraint(total area of nodes mapped to hardware cannot exceed AH)

AS Software capacity constraint(total area of nodes mapped to software cannot exceed AS)

ahcomm Hardware required for the transfer of one data sample between hardware and software.

ascomm Software required for the transfer of one data sample between hardware and software.

tcomm Number of cycles required to transfer one sample of data between hardware and software.

Algorithm output


Mi Mapping for node i(0 for software and 1 for hardware)

ti Start time for the execution of node i(schedule)

Algorithm Foundation

Uses list scheduling: serial traverse the node list to select a mapping that minimizes objective function

Objective functions: Minimize finish time of the node or minimize area of the node

The GCLP algorithm adaptively selects an appropriate mapping objective at each step to determine the mapping and the schedule.

Algorithm Foundation(contd…)

Fig: Selection of the mapping objective at each step of GCLP.

GC-LP

GC: Global Criticality is a look-ahead method that estimates time criticality of the algorithm. If time is critical, the objective function that minimizes finish time is selected, else the one that minimizes area.

LP: LP is a classification of nodes based on their heterogeneity and intrinsic properties. Each node is classified as an extremity (local phase 1), repeller (local phase 2), or normal (local phase 3) node. A measure called local phase delta quantifies the local mapping preferences of the node.

GCLP Algorithm

GCLP Algorithm(Contd…)

N represents the set of nodes in the graphNU(NM) is the set of unmapped(mapped) nodes at the current step.NU is initialized to N.The algorithm maps one node per step.At the beginning of each step, the global time criticality measure

GC is computed.GC is a global measure of time criticality at each step of the

algorithm, based on the currently mapped and unmapped nodes and the deadline requirements.

Unmapped nodes whose predecessors have already been mapped and scheduled are called ready nodes.

GCLP Algorithm(Contd…)

A node is selected for mapping from the set of ready nodes using an urgency criterion, i.e., a ready node that lies on the critical path is selected for mapping.

The local phase of the selected node is identified and the corresponding local phase delta is computed.

GC and the local phase delta are then used to select the mapping objective. Using this objective, the selected node is assigned a mapping (Mi).

The mapping is also used to determine the start time for the node (ti).

The process is repeated |N| times until no nodes are left unmapped.

GC is a global lookahead measure that estimates time criticality at each step of the algorithm.

Steps: The hardware/software mapping and schedule for the already mapped

nodes is known.Using this schedule and the required deadline D, the remaining time

Trem is first determined.

All the unmapped nodes are mapped to software and the corresponding finish time TS

is computed.

Suppose that TS exceeds the allowed deadline D. Some of the unmapped nodes have to be moved from software to hardware to meet the deadline.

Global Criticality(GC)

Example Used For Computing GC

Local Phase(LP)

GC is an averaged measure over all the unmapped nodes at each step.

This desensitizes GC to the local properties of the node being mapped.

To emphasize the local characteristics of nodes, classify nodes as

Extremities (or local phase 1 nodes),Repellers (or local phase 2 nodes), Normal nodes (or local phase 3 nodes).

Extremities(Local Phase 1)

The bottleneck resource in hardware is area, while the bottleneck resource in software is time.

Extremities are nodes that consume a disproportionately large amount of the bottleneck resource on a particular mapping (relative to the other mapping).

A hardware extremity node is defined as a node that consumes a large area in hardware, but a relatively small amount of time in software.

A software extremity node is defined as a node that takes up a large amount of time in software, but a relatively small amount of area when mapped to hardware.

Local Phase 2 or Repeller Nodes

GCLP uses the concept of repellers or local phase 2 nodes to perform on-line swaps of similar nodes between hardware and software.

To do this, identify certain intrinsic nodal properties (called repeller properties) that reflect the inherent suitability of a node to either a hardware or a software mapping.

Bit operations are handled better in hardware, while memory operations are better suited to software.

As a result, a node with many bit manipulations, relative to other nodes, is a software repeller, while a node with a lot of memory operations, relative to other nodes, is a hardware repeller.

Local Phase 3 or Normal Nodes

A node that is neither an extremity nor a repeller is defined to be a local phase 3 node or a normal node.

The threshold is set to its default value (0.5) when a normal node is mapped. Thus the mapping objective is governed by GC alone.

Modified GCLP algorithm for Multifunction Systems

Optimizing the design of multifunction embedded systems such as multistandard audio/video codecs and multisystem phones.

Such systems run a prespecified set of applications, and any

“one” of the applications is selected at a run time, depending on system parameters.

Modified algorithm considers global preferences across the application set as well as the preference of each individual application to generate an efficient overall solution while ensuring that timing constraints of each application are met.

Partitioning Multiple Applications

Two methods for partitioning multiple applications. These methods use modified versions of the GCLP algorithm.

HOP(Hardware Bias for Common Node)

The commonality measures are used to bias the mapping decisions made by GCLP.

CHOP(Consistency in Mapping Common Nodes)

The applications are ordered according to their relative criticality.

Conclusion

The hardware-software codesign problem has received a lot of attention recently.

The GCLP algorithm find the best hardware-software implementation for a particular application.

The algorithm maps the selected node to an implementation (hardware or software) in each iteration.

The key features of the algorithm are hardware area minimization and meeting latency constraints.

REFERENCES

1. A. Kalavade, E. A. Lee, “A Global Criticality/ Local Phase Driven Algorithm for the Constrained Hardware/Software Partitioning Problem”, Proc. of CODESIC ASHE, Third Intl. Workshop on HardwarelSojhare Codesign, Grenoble,France, Sept. 22-24,1994. pp 42-48.

2. Kalavade and E.A. Lee, “The Extended Partitioning Problem: Hardware/Software Mapping, Scheduling, and Implementation-bin Selection.”, Journal of Design Automation for Embedded systems, Vol. 2,no. 2,March1997,pp. 126 -163.

3.Bastin,Martin Holzer Markus Rupp “Improvements of the GCLP algorithm for HW/SW partitioning of task graphs” Proc.of fourth Intl/conference Circuit Signals and Systems,Nov,2006,pp.107-113.

4.A. Kalavade, “System-level codesign of mixed hardware-software systems”, Ph.D. dissertation, University of California, Berkeley, CA, Sept.1995.

5. Kalavade and P.A.Subramanyam “Hardware/Software Partitioning for Multifunction systems“ IEEE Trans. computer aided design of Integrated Circuits and systems, September 1998,pp.819-837.

6. B. Knerr, M. Holzer, and M. Rupp“Extending The GCLP Algorithm For Hw/Sw Partitioning: A detailed Platform Model And Performance Improvements” IEEE Austrochip ‘06,Oct 2006, Vienna Austria,pp 89-95.

THANK YOU!!!!!!!!

gclp ppt

Documents