high-level synthesis scheduling, allocation, assignment,
![Page 1: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/1.jpg)
Mani Srivastava
UCLA - EE Department
Room: 6731-H Boelter Hall
Email: [email protected]: 310-267-2098
WWW: http://www.ee.ucla.edu/~mbs
Copyright 2003 Mani Srivastava
High-level Synthesis Scheduling, Allocation, Assignment,
Note: Several slides in this Lecture are from
Prof. Miodrag Potkonjak, UCLA CS
![Page 2: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/2.jpg)
Overview
High Level Synthesis
Scheduling, Allocation and Assignment
Estimations
Transformations
![Page 3: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/3.jpg)
Allocation, Assignment, and Scheduling
[Figure: data-flow graph with nodes labeled D, +, - and >>]
Allocation: How much? 2 adders, 1 shifter, 24 registers
Assignment: Where? Shifter 1
Schedule: When? Time Slot 4
Techniques well understood and mature
![Page 4: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/4.jpg)
Scheduling and Assignment
[Figure: data-flow graph with additions +1, +2, +3 and multiplications *1, *2, *3, scheduled in 4 control steps]
Schedule 1 (control step: operations)
1: +1
2: +2
3: +3, *1
4: *2, *3
Schedule 2 (control step: operations)
1: +3
2: +1, *2
3: +2, *3
4: *1
![Page 5: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/5.jpg)
ASAP Scheduling Algorithm
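The algorithm on this slide survives only as an image. As a rough sketch of the usual idea (not the slide's exact pseudocode), ASAP places every operation in the earliest control step allowed by its predecessors. The function name `asap_schedule`, the `preds` dictionary layout, and the example graph are assumptions made for illustration.

```python
from collections import deque

def asap_schedule(preds, delay=None):
    """Earliest start step (1-based) for every operation of a DAG.

    preds: dict mapping each operation to the operations it depends on.
    delay: dict of execution times in control steps (assumed 1 if omitted).
    """
    delay = delay or {op: 1 for op in preds}
    succs = {op: [] for op in preds}
    for op, ps in preds.items():
        for p in ps:
            succs[p].append(op)

    pending = {op: len(ps) for op, ps in preds.items()}  # unscheduled predecessors
    start = {op: 1 for op in preds}
    ready = deque(op for op, n in pending.items() if n == 0)
    while ready:
        op = ready.popleft()
        for s in succs[op]:
            # A successor cannot start before this operation has finished.
            start[s] = max(start[s], start[op] + delay[op])
            pending[s] -= 1
            if pending[s] == 0:
                ready.append(s)
    return start

# Hypothetical sequence graph, not the one drawn on the slides.
graph = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"], "e": []}
asap = asap_schedule(graph)
print(asap)
```

Operations without predecessors land in step 1, and the largest finish time gives the unconstrained minimum latency.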
![Page 6: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/6.jpg)
ASAP Scheduling Example
![Page 7: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/7.jpg)
ASAP: Another Example
Sequence Graph ASAP Schedule
![Page 8: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/8.jpg)
ALAP Scheduling Algorithm
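This slide's algorithm is also only available as an image. A minimal sketch of the usual idea, under the same caveats: ALAP works backwards from a latency bound and places every operation in the latest control step that still lets its successors start on time. The name `alap_schedule`, the `preds` layout, and the example graph are illustrative assumptions.

```python
def alap_schedule(preds, latency, delay=None):
    """Latest start step (1-based) for every operation under a latency bound.

    preds: dict mapping each operation to the operations it depends on.
    latency: number of control steps available.
    delay: dict of execution times in control steps (assumed 1 if omitted).
    """
    delay = delay or {op: 1 for op in preds}
    succs = {op: [] for op in preds}
    for op, ps in preds.items():
        for p in ps:
            succs[p].append(op)

    pending = {op: len(succs[op]) for op in preds}  # unscheduled successors
    start = {}
    ready = [op for op, n in pending.items() if n == 0]
    while ready:
        op = ready.pop()
        # Latest start that still finishes within the latency bound ...
        latest = latency - delay[op] + 1
        # ... and before every successor has to start.
        for s in succs[op]:
            latest = min(latest, start[s] - delay[op])
        start[op] = latest
        for p in preds[op]:
            pending[p] -= 1
            if pending[p] == 0:
                ready.append(p)
    return start

# Hypothetical sequence graph, not the one drawn on the slides.
graph = {"a": [], "b": [], "c": ["a", "b"], "d": ["c"], "e": []}
alap = alap_schedule(graph, latency=4)
print(alap)
```

The difference between an operation's ALAP and ASAP steps is its mobility, which is what the time frames in force-directed scheduling are built from.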
![Page 9: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/9.jpg)
ALAP Scheduling Example
![Page 10: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/10.jpg)
ALAP: Another Example
Sequence Graph ALAP Schedule(latency constraint = 4)
![Page 11: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/11.jpg)
Observation about ALAP & ASAP
No priority is given to nodes on the critical path
As a result, less critical nodes may be scheduled ahead of critical nodes
No problem if hardware is unlimited
However, if the resources are limited, the less critical nodes may block the critical nodes and thus produce inferior schedules
List scheduling techniques overcome this problem by using a more global node selection criterion
![Page 12: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/12.jpg)
List Scheduling and Assignment
List_Scheduling() {
    Create_Candidate_List();
    while (Candidate_List != NULL) {
        Select_Candidate();
        Schedule_Candidate();
    }
}
[Figure: example data-flow graph (+1, +2, +3, *1, *2, *3) list-scheduled into 4 control steps (Schedule 1)]
![Page 13: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/13.jpg)
List Scheduling Algorithm using Decreasing Criticalness Criterion
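The slide's algorithm is an image, so the following is only a sketch of one common variant: resource-constrained list scheduling with unit-delay operations, where "criticalness" is taken to be the length of the longest path from an operation to a sink. The function name, the `op_type` and `limits` arguments, and the example graph are assumptions for illustration.

```python
def list_schedule(preds, op_type, limits):
    """Assign a control step to each unit-delay operation.

    preds: dict op -> list of predecessor ops (a DAG).
    op_type: dict op -> resource type name.
    limits: dict resource type -> number of available units (>= 1).
    """
    succs = {op: [] for op in preds}
    for op, ps in preds.items():
        for p in ps:
            succs[p].append(op)

    # Criticalness: longest path (in operations) from op to any sink.
    priority = {}
    def path_len(op):
        if op not in priority:
            priority[op] = 1 + max((path_len(s) for s in succs[op]), default=0)
        return priority[op]
    for op in preds:
        path_len(op)

    step_of, step = {}, 0
    while len(step_of) < len(preds):
        step += 1
        # Candidates: unscheduled ops whose predecessors ran in earlier steps.
        candidates = [op for op in preds if op not in step_of
                      and all(step_of.get(p, step) < step for p in preds[op])]
        candidates.sort(key=lambda op: -priority[op])  # most critical first
        used = {}
        for op in candidates:
            kind = op_type[op]
            if used.get(kind, 0) < limits[kind]:
                step_of[op] = step
                used[kind] = used.get(kind, 0) + 1
    return step_of

# Hypothetical graph: (m1 * m2) feeds a1, and a1 plus m3 feed a2.
preds = {"m1": [], "m2": [], "m3": [], "a1": ["m1", "m2"], "a2": ["a1", "m3"]}
op_type = {"m1": "mul", "m2": "mul", "m3": "mul", "a1": "add", "a2": "add"}
schedule = list_schedule(preds, op_type, {"mul": 1, "add": 1})
print(schedule)
```

With one multiplier, the less critical `m3` is deferred behind `m1` and `m2`, so the addition chain is not delayed and the schedule finishes in 4 steps.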
![Page 14: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/14.jpg)
Scheduling
NP-complete problem
Optimal
Heuristics - Iterative Improvements
Heuristics - Constructive
Various versions of the problem
  Unconstrained minimum latency
  Resource-constrained minimum latency
  Timing-constrained
If all resources are identical, the problem reduces to multiprocessor scheduling
  Minimum-latency multiprocessor scheduling is intractable
![Page 15: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/15.jpg)
Scheduling - Optimal Techniques
Integer Linear Programming
Branch and Bound
![Page 16: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/16.jpg)
Integer Linear Programming
Given: integer-valued matrix $A_{m \times n}$,
vectors $B = (b_1, b_2, \ldots, b_m)$, $C = (c_1, c_2, \ldots, c_n)$
Minimize: $C^T X$
Subject to:
$AX \le B$
$X = (x_1, x_2, \ldots, x_n)$ is an integer-valued vector
![Page 17: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/17.jpg)
Integer Linear Programming
Problem: For a set of (dependent) computations $\{t_1, t_2, \ldots, t_n\}$, find the minimum number of units needed to complete the execution by $k$ control steps.
Integer linear programming:
Let $y_0$ be an integer variable.
For each control step $i$ ($1 \le i \le k$):
  define variable $x_{ij}$ as $x_{ij} = 1$ if computation $t_j$ is executed in the $i$th control step, $x_{ij} = 0$ otherwise.
  define variable $y_i = x_{i1} + x_{i2} + \cdots + x_{in}$.
![Page 18: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/18.jpg)
Integer Linear Programming
Integer linear programming:
For each computation dependency "$t_i$ has to be done before $t_j$", introduce a constraint:
$k\,x_{1i} + (k-1)\,x_{2i} + \cdots + x_{ki} \ge k\,x_{1j} + (k-1)\,x_{2j} + \cdots + x_{kj} + 1$   (*)
Minimize: $y_0$
Subject to:
$x_{1i} + x_{2i} + \cdots + x_{ki} = 1$ for all $1 \le i \le n$
$y_i \le y_0$ for all $1 \le i \le k$
all computation dependencies of type (*)
![Page 19: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/19.jpg)
An Example
[Figure: dependency graph over computations c1, c2, c3, c4, c5, c6]
6 computations, 3 control steps
![Page 20: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/20.jpg)
An Example
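Introduce variables:
  $x_{ij}$ for $1 \le i \le 3$, $1 \le j \le 6$
  $y_i = x_{i1} + x_{i2} + x_{i3} + x_{i4} + x_{i5} + x_{i6}$ for $1 \le i \le 3$
  $y_0$
Dependency constraints: e.g. execute c1 before c4
  $3x_{11} + 2x_{21} + x_{31} \ge 3x_{14} + 2x_{24} + x_{34} + 1$
Execution constraints:
  $x_{1i} + x_{2i} + x_{3i} = 1$ for $1 \le i \le 6$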
![Page 21: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/21.jpg)
An Example
Minimize: $y_0$
Subject to:
  $y_i \le y_0$ for all $1 \le i \le 3$
  dependency constraints
  execution constraints
One solution: $y_0 = 2$
  $x_{11} = 1$, $x_{12} = 1$,
  $x_{23} = 1$, $x_{24} = 1$,
  $x_{35} = 1$, $x_{36} = 1$.
  All other $x_{ij} = 0$
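As a sanity check on the example, the sketch below encodes the execution constraints, the one dependency the slides spell out (c1 before c4), and the objective, then verifies the stated solution and finds the minimum by brute force. The remaining edges of the graph are in the figure and are not modeled here, so this is an assumption-laden illustration rather than the full problem; all function names are hypothetical.

```python
from itertools import product

K, N = 3, 6        # 3 control steps, computations c1..c6
DEPS = [(1, 4)]    # (i, j) means c_i before c_j; only the pair stated in the text

def to_x(steps):
    """One-hot variables x[i][j] = 1 iff c_j runs in control step i (1-based)."""
    return {i: {j: int(steps[j - 1] == i) for j in range(1, N + 1)}
            for i in range(1, K + 1)}

def feasible(x):
    # Execution constraints: every computation runs in exactly one step.
    executed_once = all(sum(x[i][j] for i in range(1, K + 1)) == 1
                        for j in range(1, N + 1))
    def weight(j):  # k*x_1j + (k-1)*x_2j + ... + x_kj
        return sum((K - i + 1) * x[i][j] for i in range(1, K + 1))
    # Dependency constraints of type (*).
    ordered = all(weight(a) >= weight(b) + 1 for a, b in DEPS)
    return executed_once and ordered

def y0(x):
    """Smallest y0 with y_i <= y0 for all i, i.e. the busiest control step."""
    return max(sum(x[i].values()) for i in range(1, K + 1))

slide_x = to_x((1, 1, 2, 2, 3, 3))   # x11 = x12 = x23 = x24 = x35 = x36 = 1
best = min(y0(to_x(s)) for s in product(range(1, K + 1), repeat=N)
           if feasible(to_x(s)))
print(feasible(slide_x), y0(slide_x), best)
```

Six computations in three steps need at least two units in some step, so a feasible solution with $y_0 = 2$ is optimal regardless of the unmodeled edges.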
![Page 22: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/22.jpg)
ILP Model of Scheduling
Binary decision variables $x_{il}$
  $i = 0, 1, \ldots, n$
  $l = 1, 2, \ldots, \bar{\lambda} + 1$ (with $\bar{\lambda}$ an upper bound on the latency)
Start time is unique
![Page 23: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/23.jpg)
ILP Model of Scheduling (contd.)
Sequencing relationships must be satisfied
Resource bounds must be met
  let the upper bound on the # of resources of type $k$ be $a_k$
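The constraint formulas on these two slides were images and did not survive extraction. As a hedged reconstruction, the standard textbook ILP formulation of resource-constrained scheduling (as presented in De Micheli's "Synthesis and Optimization of Digital Circuits"), which the slide wording appears to follow, is sketched below. The symbols $d_i$ (delay of operation $v_i$), $E$ (edges of the sequencing graph), $\mathcal{T}(v_i)$ (resource type of $v_i$), $n_{res}$ (number of resource types) and $\bar{\lambda}$ (an upper bound on the latency) are taken from that formulation and are assumptions with respect to these slides.

```latex
\begin{align}
% Start time is unique
\sum_{l} x_{il} &= 1, & i &= 0, 1, \ldots, n \\
% Start time of operation v_i expressed through the binary variables
t_i &= \sum_{l} l \cdot x_{il} \\
% Sequencing: v_i starts only after its predecessor v_j has finished
\sum_{l} l \cdot x_{il} &\ge \sum_{l} l \cdot x_{jl} + d_j, & (v_j, v_i) &\in E \\
% Resource bounds: at most a_k operations of type k execute in any step l
\sum_{i : \mathcal{T}(v_i) = k} \; \sum_{m = l - d_i + 1}^{l} x_{im} &\le a_k,
  & k &= 1, \ldots, n_{res}, \quad l = 1, \ldots, \bar{\lambda} + 1
\end{align}
```

The inner sum in the resource bound counts an operation as occupying its unit in every step from its start until it finishes, which is why it ranges over the $d_i$ most recent start times.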
![Page 24: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/24.jpg)
Minimum-latency Scheduling Under Resource-constraints
Let t be the vector whose entries are start times
Formal ILP model
![Page 25: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/25.jpg)
Example
Two types of resources
  Multiplier
  ALU
    Adder
    Subtraction
    Comparison
Both take 1 cycle execution time
![Page 26: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/26.jpg)
Example (contd.)
Heuristic (list scheduling) gives latency = 4 steps
Use ALAP and ASAP (with no resource constraints) to get bounds on start times
  ASAP matches latency of heuristic
  so heuristic is optimum, but let us ignore it!
Constraints?
![Page 27: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/27.jpg)
Example (contd.)
Start time is unique
![Page 28: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/28.jpg)
Example (contd.)
Sequencing constraints
  note: only non-trivial ones listed
  those with more than one possible start time for at least one operation
![Page 29: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/29.jpg)
Example (contd.)
Resource constraints
![Page 30: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/30.jpg)
Example (contd.)
Consider $c = [0, 0, \ldots, 1]^T$
  Minimum latency schedule
  since sink has no mobility ($x_{n,5} = 1$), any feasible schedule is optimum
Consider $c = [1, 1, \ldots, 1]^T$
  finds earliest start times for all operations
  equivalently,
![Page 31: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/31.jpg)
Example Solution: Optimum Schedule Under Resource Constraint
![Page 32: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/32.jpg)
Example (contd.)
Assume multiplier costs 5 units of area, and ALU costs 1 unit of area
Same uniqueness and sequencing constraints as before
Resource constraints are in terms of unknown variables a1 and a2
a1 = # of multipliers
a2 = # of ALUs
![Page 33: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/33.jpg)
Example (contd.)
Resource constraints
![Page 34: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/34.jpg)
Example Solution
Minimize $c^T a = 5 \cdot a_1 + 1 \cdot a_2$
Solution with cost 12
![Page 35: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/35.jpg)
Precedence-constrained Multiprocessor Scheduling
All operations done by the same type of resource
  intractable problem
  intractable even if all operations have unit delay
![Page 36: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/36.jpg)
Scheduling - Iterative Improvement
Kernighan-Lin (deterministic)
Simulated Annealing
Lottery Iterative Improvement
Neural Networks
Genetic Algorithms
Taboo Search
![Page 37: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/37.jpg)
Scheduling - Constructive Techniques
Most Constrained
Least Constraining
![Page 38: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/38.jpg)
Force Directed Scheduling
Goal is to reduce hardware by balancing concurrency
Iterative algorithm, one operation scheduled per iteration
Information (i.e. speed & area) fed back into scheduler
![Page 39: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/39.jpg)
The Force Directed Scheduling Algorithm
![Page 40: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/40.jpg)
Step 1
Determine ASAP and ALAP schedules
[Figure: ASAP and ALAP schedules of the example graph (operations *, +, -, <)]
![Page 41: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/41.jpg)
Step 2
Determine Time Frame of each op
  Length of box ~ possible execution cycles
  Width of box ~ probability of assignment
  Uniform distribution, area assigned = 1
[Figure: time frames of the operations over C-steps 1 to 4, with box widths such as 1/2 and 1/3]
![Page 42: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/42.jpg)
Step 3
Create Distribution Graphs
  Sum of probabilities of each Op type
  Indicates concurrency of similar Ops
$\mathrm{DG}(i) = \sum_{\mathrm{Op}} \mathrm{Prob}(\mathrm{Op}, i)$
[Figure: DG for Multiply; DG for Add, Sub, Comp]
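Steps 2 and 3 can be sketched in a few lines: each operation spreads a probability of 1 uniformly over its time frame, and the distribution graph sums those probabilities per C-step. The function name and the list of multiply time frames are assumptions; the frames were chosen so that DG(1) and DG(2) come out as the 2.833 and 2.333 used in the self-force example.

```python
from fractions import Fraction

def distribution_graph(frames, n_steps):
    """DG(i) for one operation type: sum over ops of Prob(op, i).

    frames: list of (asap, alap) time frames, 1-based and inclusive.
    Each op is assumed uniformly distributed over its frame.
    """
    dg = {i: Fraction(0) for i in range(1, n_steps + 1)}
    for asap, alap in frames:
        prob = Fraction(1, alap - asap + 1)   # box width; box area stays 1
        for i in range(asap, alap + 1):
            dg[i] += prob
    return dg

# Assumed multiply time frames (not read off the slide figure).
multiply_frames = [(1, 1), (1, 1), (1, 2), (1, 3), (2, 2), (2, 3)]
dg_mul = distribution_graph(multiply_frames, n_steps=4)
for step, value in dg_mul.items():
    print(step, round(float(value), 3))
```

Exact fractions are used so that the per-step values can be compared without rounding noise; the values always sum to the number of operations.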
![Page 43: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/43.jpg)
Diff Eq Example: Precedence Graph Recalled
![Page 44: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/44.jpg)
Diff Eq Example: Time Frame & Probability Calculation
![Page 45: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/45.jpg)
Diff Eq Example: DG Calculation
![Page 46: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/46.jpg)
Conditional Statements
Operations in different branches are mutually exclusive
Operations of same type can be overlapped onto DG
Probability of most likely operation is added to DG
[Figure: fork/join graph with + and - operations in each branch; DG for Add]
![Page 47: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/47.jpg)
Self Forces
Scheduling an operation will affect overall concurrency
Every operation has a 'self force' for every C-step of its time frame
Analogous to the effect of a spring: f = Kx
Desirable scheduling will have negative self force
  Will achieve better concurrency (lower potential energy)
Force(i) = DG(i) * x(i)
  DG(i) ~ current Distribution Graph value
  x(i) ~ change in operation's probability
$\text{Self Force}(j) = \sum_{i=t}^{b} \text{Force}(i)$, where $t$ and $b$ are the first and last C-steps of the operation's time frame
![Page 48: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/48.jpg)
Example
Attempt to schedule multiply in C-step 1
Self Force(1) = Force(1) + Force(2)
  = ( DG(1) * x(1) ) + ( DG(2) * x(2) )
  = [2.833 * (0.5) + 2.333 * (-0.5)] = +0.25
This is positive, so scheduling the multiply in the first C-step would be bad
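The arithmetic of this example can be reproduced with a short sketch. It assumes, as the numbers imply, a multiply whose time frame spans C-steps 1 and 2 (probability 1/2 in each), and it uses the DG values quoted above; the function name and argument layout are hypothetical.

```python
def self_force(dg, frame, target):
    """Self force of fixing an operation at C-step `target`.

    dg: dict C-step -> current distribution graph value.
    frame: (first, last) C-steps of the operation's time frame, inclusive.
    """
    first, last = frame
    old_prob = 1.0 / (last - first + 1)       # uniform over the time frame
    total = 0.0
    for i in range(first, last + 1):
        new_prob = 1.0 if i == target else 0.0
        x = new_prob - old_prob               # change in probability at step i
        total += dg[i] * x                    # Force(i) = DG(i) * x(i)
    return total

dg_multiply = {1: 2.833, 2: 2.333}
force_step1 = self_force(dg_multiply, frame=(1, 2), target=1)
force_step2 = self_force(dg_multiply, frame=(1, 2), target=2)
print(round(force_step1, 3), round(force_step2, 3))
```

Placing the multiply in C-step 2 instead gives the mirror-image force of about -0.25, which is why that assignment is preferred.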
[Figure: DG for Multiply alongside the time frames over C-steps 1 to 4]
![Page 49: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/49.jpg)
Diff Eq Example: Self Force for Node 4
![Page 50: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/50.jpg)
Predecessor & Successor Forces
Scheduling an operation may affect the time frames of other linked operations
This may negate the benefits of the desired assignment
Predecessor/Successor Forces = Sum of Self Forces of any implicitly scheduled operations
[Figure: example graph (operations *, +, -, <) showing linked operations]
![Page 51: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/51.jpg)
Diff Eq Example: Successor Force on Node 4
If node 4 scheduled in step 1
  no effect on time frame for successor node 8
  Total force = Force4(1) = +0.25
If node 4 scheduled in step 2
  causes node 8 to be scheduled into step 3
  must calculate successor force
![Page 52: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/52.jpg)
Diff Eq Example: Final Time Frame and Schedule
![Page 53: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/53.jpg)
Diff Eq Example: Final DG
![Page 54: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/54.jpg)
Lookahead
Temporarily modify the constant DG(i) to include the effect of the iteration being considered
Force(i) = temp_DG(i) * x(i)
temp_DG(i) = DG(i) + x(i)/3
Consider previous example:
Self Force(1) = (DG(1) + x(1)/3) x(1) + (DG(2) + x(2)/3) x(2)
  = .5 (2.833 + .5/3) - .5 (2.333 - .5/3) = +.41667
This is even worse than before
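A small sketch of the lookahead variant, under the same assumptions as the earlier example (a multiply with time frame C-steps 1 to 2 and the quoted DG values); the function name is hypothetical.

```python
def lookahead_self_force(dg, frame, target):
    """Self force using temp_DG(i) = DG(i) + x(i)/3 in place of DG(i)."""
    first, last = frame
    old_prob = 1.0 / (last - first + 1)
    total = 0.0
    for i in range(first, last + 1):
        x = (1.0 if i == target else 0.0) - old_prob  # change in probability
        temp_dg = dg[i] + x / 3.0                     # lookahead correction
        total += temp_dg * x
    return total

dg_multiply = {1: 2.833, 2: 2.333}
lookahead_force = lookahead_self_force(dg_multiply, frame=(1, 2), target=1)
print(round(lookahead_force, 5))
```

The correction term always adds x(i)^2 / 3 to each step's contribution, so lookahead penalizes assignments that pile probability onto an already busy C-step.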
![Page 55: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/55.jpg)
Minimization of Bus Costs
Basic algorithm suitable for narrow class of problems
Algorithm can be refined to consider "cost" factors
Number of buses ~ number of concurrent data transfers
Number of buses = maximum transfers in any C-step
Create modified DG to include transfers: Transfer DG
$\text{Trans DG}(i) = \sum_{op} \left[ \mathrm{Prob}(op, i) \times \text{Opn\_No\_InOuts} \right]$
Opn_No_InOuts ~ combined distinct in/outputs for Op
Calculate Force with this DG and add to Self Force
![Page 56: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/56.jpg)
Minimization of Register Costs
Minimum registers required is given by the largest number of data arcs crossing a C-step boundary
Create Storage Operations, at output of any operation that transfers a value to a destination in a later C-step
Generate Storage DG for these "operations"
Length of storage operation depends on final schedule
[Figure: storage distribution for S between source s and destination d, showing ASAP lifetime, MAX lifetime and ALAP lifetime]
![Page 57: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/57.jpg)
Minimization of Register Costs (contd.)
$[\text{avg life}] = \dfrac{[\text{ASAP life}] + [\text{ALAP life}] + [\text{MAX life}]}{3}$
$\text{storage DG}(i) = \dfrac{[\text{avg life}]}{[\text{max life}]}$ (no overlap between ASAP & ALAP)
$\text{storage DG}(i) = \dfrac{[\text{avg life}] - [\text{overlap}]}{[\text{max life}] - [\text{overlap}]}$ (if overlap)
Calculate and add "Storage" Force to Self Force
[Figure: ASAP schedule, 7 registers minimum; Force Directed schedule, 5 registers minimum]
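To make the storage DG formulas concrete, here is a minimal numeric sketch. The function name and the lifetime values are made up for illustration, and the overlap case is applied exactly as written on the slide, without modeling which C-steps fall inside the overlap.

```python
def storage_dg(asap_life, alap_life, max_life, overlap=0):
    """Storage DG value from the ASAP, ALAP and MAX lifetimes (in C-steps)."""
    avg_life = (asap_life + alap_life + max_life) / 3
    if overlap == 0:
        # No overlap between the ASAP and ALAP lifetimes.
        return avg_life / max_life
    return (avg_life - overlap) / (max_life - overlap)

# Hypothetical lifetimes, not taken from the slide figure.
no_overlap = storage_dg(asap_life=1, alap_life=1, max_life=4)
with_overlap = storage_dg(asap_life=3, alap_life=3, max_life=4, overlap=2)
print(round(no_overlap, 3), round(with_overlap, 3))
```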
![Page 58: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/58.jpg)
Pipelining
[Figure: functional pipelining with two overlapped instances (Instance, Instance') and the DG for Multiply; structural pipelining of a multiply]
Functional Pipelining
  Pipelining across multiple operations
  Must balance distribution across groups of concurrent C-steps
  Cut DG horizontally and superimpose
  Finally perform regular Force Directed Scheduling
Structural Pipelining
  Pipelining within an operation
  For non data-dependent operations, only the first C-step need be considered
![Page 59: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/59.jpg)
Other Optimizations
Local timing constraints
  Insert dummy timing operations -> Restricted time frames
Multiclass FU's
  Create multiclass DG by summing probabilities of relevant ops
Multistep/Chained operations
  Carry propagation delay information with operation
  Extend time frames into other C-steps as required
Hardware constraints
  Use Force as priority function in list scheduling algorithms
![Page 60: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/60.jpg)
Scheduling using Simulated Annealing
Reference:
Devadas, S.; Newton, A.R.
Algorithms for hardware allocation in data path synthesis.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, July 1989, Vol.8, (no.7):768-81.
![Page 61: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/61.jpg)
Simulated Annealing
Local Search
[Figure: cost function plotted over the solution space]
![Page 62: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/62.jpg)
Statistical Mechanics
Combinatorial Optimization
State $\{r_i\}$ (configuration: a set of atomic positions)
weight $e^{-E(\{r_i\})/k_B T}$ (Boltzmann distribution)
$E(\{r_i\})$: energy of configuration
$k_B$: Boltzmann constant
$T$: temperature
Low temperature limit ??
![Page 63: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/63.jpg)
Analogy
Physical System <-> Optimization Problem
State (configuration) <-> Solution
Energy <-> Cost Function
Ground State <-> Optimal Solution
Rapid Quenching <-> Iterative Improvement
Careful Annealing <-> Simulated Annealing
![Page 64: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/64.jpg)
Generic Simulated Annealing Algorithm
1. Get an initial solution S
2. Get an initial temperature T > 0
3. While not yet 'frozen' do the following:
   3.1 For $1 \le i \le L$, do the following:
       3.1.1 Pick a random neighbor S' of S
       3.1.2 Let $\Delta$ = cost(S') - cost(S)
       3.1.3 If $\Delta \le 0$ (downhill move) set S = S'
       3.1.4 If $\Delta > 0$ (uphill move) set S = S' with probability $e^{-\Delta/T}$
   3.2 Set T = rT (reduce temperature)
4. Return S
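The generic algorithm translates almost line for line into code. The sketch below is one possible rendering: the 'frozen' test (a temperature threshold), the default parameter values, and the toy cost function are assumptions chosen for illustration, not part of the slides.

```python
import math
import random

def simulated_annealing(initial, neighbor, cost,
                        t0=10.0, r=0.9, L=50, t_frozen=1e-3, rng=None):
    """Generic simulated annealing following steps 1 to 4 above.

    neighbor(s, rng) returns a random neighbor of s; cost(s) returns a number.
    'Frozen' is assumed to mean the temperature dropped below t_frozen.
    """
    rng = rng or random.Random(0)
    s, t = initial, t0                       # steps 1 and 2
    while t > t_frozen:                      # step 3
        for _ in range(L):                   # step 3.1
            s_new = neighbor(s, rng)         # 3.1.1
            delta = cost(s_new) - cost(s)    # 3.1.2
            # 3.1.3 downhill moves always accepted; 3.1.4 uphill moves
            # accepted with probability exp(-delta / T)
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                s = s_new
        t *= r                               # step 3.2
    return s                                 # step 4

# Toy problem (hypothetical): minimize (x - 7)^2 over the integers.
result = simulated_annealing(
    initial=0,
    neighbor=lambda x, rng: x + rng.choice((-1, 1)),
    cost=lambda x: (x - 7) ** 2,
)
print(result)
```

For a scheduling problem, the solution would be an assignment of operations to control steps, the neighbor move would shift one operation within its time frame, and the cost would combine latency and resource usage.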
![Page 65: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/65.jpg)
Basic Ingredients for S.A.
Solution Space
Neighborhood Structure
Cost Function
Annealing Schedule
![Page 66: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/66.jpg)
Observation
All scheduling algorithms we have discussed so far are critical path schedulers
They can only generate schedules for iteration period larger than or equal to the critical path
They only exploit concurrency within a single iteration, and only utilize the intra-iteration precedence constraints
![Page 67: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/67.jpg)
Example
Can one do better than iteration period of 4?
  Pipelining + retiming can reduce critical path to 3, and also the # of functional units
Approaches
  Transformations followed by scheduling
  Transformations integrated with scheduling
![Page 68: High-level Synthesis Scheduling, Allocation, Assignment,](https://reader035.vdocuments.us/reader035/viewer/2022062217/56815041550346895dbe4050/html5/thumbnails/68.jpg)
Conclusions
High Level Synthesis
Connects Behavioral Description and Structural Description
Scheduling, Estimations, Transformations
High Level of Abstraction, High Impact on the Final Design