more scheduling resource sharing - cornell university · 2018-10-09 · resource sharing and...
TRANSCRIPT
![Page 1: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/1.jpg)
More SchedulingResource Sharing
ECE 5775High-Level Digital Design Automation
Fall 2018
![Page 2: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/2.jpg)
▸ Lab 3 is released (due Friday 10/5)– 10 FPGA boards available – Go through the CORDIC tutorial asap
1
Announcements
![Page 3: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/3.jpg)
▸ More SDC scheduling
▸ Resource sharing overview– Sub-problems: functional unit, register, and connectivity binding
problems– Key concepts: compatibility and conflict graphs
2
Outline
![Page 4: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/4.jpg)
▸ ILP for time-constrained scheduling minimize cTyx1,1 + x2,1 +x6,1 + x8,1 – y1 £ 0x6,2 + x7,2 + x8,2 – y1 £ 0 x7,3 + x8,3 – y1 £ 0x5,4 + x9,4 + x11,4 – y2 £ 0 …
3
Review: ILP Formulation for TCS
What is the y vector?
+´´´´
´ ´ + <
-
-
1
2
3
4
v2v1
v3
v4
v5
v6
v7
v8
v9
v10
v11
+´
´
´´
´
´
+ <
-
-
1
2
3
4
v2v1
v3
v4
v5
v6
v7 v8
v9
v10
v11
![Page 5: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/5.jpg)
4
Review: SDC-Based Scheduling
• Target cycle time: 5ns• Delay estimates
– Add (+) 1ns– Load (ld) 3ns– Store (st) 1ns
si : schedule variable for operation ild
+
ldld
+
v1
v3
v4
v2v0
1ns
3ns
1ns
§ Dependence constraints<v0 ,v4>:s0 – s4 ≤0<v1 ,v3>:s1 – s3 ≤0<v2 ,v3>:s2 – s3 ≤0<v3 ,v4>:s3 – s4 ≤0<v4 ,v5>:s4 – s5 ≤0
§ Cycle time constraintsv2à v5 :s2 – s5 ≤-1v1à v5 :s1 – s5 ≤-1
stv51ns
▸ A linear programming formulation based on system of integer difference constraints (SDC)
![Page 6: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/6.jpg)
▸ Difference constraints can be conveniently represented using constraint graph– Each vertex represents a variable and each weighted edge
corresponds to a different constraint – Detect infeasibility by the presence of negative cycle (by solving
single-source shortest path)
5
SDC Constraint Graph
s0
s1
s2
s3s4
00
0
0
-1
s0 – s4 ≤0s1 – s3 ≤0s2 – s3 ≤0s3 – s4 ≤0s4 – s5 ≤0s2 – s4 ≤-1s1 – s4 ≤-1s4 – s2 ≤0
0-1
s2 – s4 ≤-1s5
0
-1
s2 – s4 ≤-1s4 – s2 ≤0
0≤-1
![Page 7: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/7.jpg)
▸ Resource constraints cannot be represented exactly in integer difference form
6
Handling Resource Constraints (NP-Hard in General)
§ Resource constraintsè Heuristic partial orderings
v0à v2 :s0 – s2 ≤-1
OR
v1à v0 :s1 – s0 ≤-1v2à v0 :s2 – s0 ≤-1
3 cycle latency
2 cycle latency
ld
+
ldld
+
v1
v3
v4
v2v0
1ns
3ns
1ns
stv51ns
• Resource constraint– Two read ports
[J. Cong & Z. Zhang, DAC, 2006] [Z. Zhang & B. Liu, ICCAD, 2013]
![Page 8: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/8.jpg)
7
Exact and Practically Scalable Scheduling with SDC and SAT (SDS)
Conflict based search
Graph basedfeasibility checkingPolynomial time~1M variables
>1M clauses
SATResource
Constraints
Partial orderings
Difference constraints
s0 – s4 ≤0s1 – s3 ≤0s2 – s3 ≤0s3 – s4 ≤0s4 – s5 ≤0s2 – s5 ≤-1s1 – s5 ≤-1
SDCTiming
Constraints
InfeasibilityConflict clauses
Conflict-driven learning
[S. Dai, G. Liu, and Z. Zhang, FPGA 2018]
![Page 9: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/9.jpg)
▸ Given a Boolean function F(x1, x2, … xn), find an assignment to xi’s to make F evaluate to 1– If such assignment exists, F is satisifiable– Otherwise, F is unsatisfiable
▸ Example: (x + y + z) (x’ + y’ + z) (x’ + y’ + z’) – A satisfying assignment: x=1, y=0, z=1
▸ First NP-complete problem (Cook-Levin theorem)
▸ Numerous practical applications– Hardware/software verification (e.g., equivalence checking, model checking)– Artificial intelligence (e.g., planning, automated reasoning)– Automated theorem proving– Combinatorial design
…
8
Boolean Satisfiability Problem (SAT)
![Page 10: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/10.jpg)
▸ SAT solvers have made significant progress in scalability– From toy problems with 100-200 variables (early 90s) – To industrial applications with 1M+ variables, 5M+ constraints
(2010s)
▸ Modern SAT solvers typically employ a backtracking-based search algorithm where conflict-driven clause learning is a key to efficiency
9
Scalability of SAT Solvers
[source: A. Sabharwal, Modern SAT Solvers:Key Advances and Applications, 2011]
![Page 11: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/11.jpg)
Ruv : whether operation u is sharing the same resource with operation v
Ouàv : denotes whether operation u is scheduled earlier than v
Ordering constraints: Operations sharing the same resources must be scheduled apart
R01à (O0à1∨O1à0 )~(O0à1∧ O1à0 )R02à (O0à2∨O2à0 )~(O0à2∧ O2à0 )R12à (O1à2∨O2à1 )~(O1à2∧ O2à1 )
Note:R01à (O0à1∨O1à0 )meansR01implies(O0à1∨O1à0 )
10
Encoding Resource Constraints in SAT
SAT clauses for ordering three load operations
ld
+
ldld
+
v1
v3
v4
v2v0
stv5
Two read ports available
![Page 12: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/12.jpg)
Partial orderings
Difference constraints
InfeasibilityConflict clauses
▸ Is 2-cycle schedule feasible?
Conflict-Driven Learning
1 -1 -1
-1propose
conflict
Negative cyclesum = -1
11
s0
s1
s2
s3s4
-1
s5
-1
What SAT learns from SDC:Any ordering involving operation 0 before 2 should no longer be attempted
![Page 13: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/13.jpg)
▸ Is a 2-cycle schedule feasible?
Conflict-Driven Learning
1propose
conflict
Negative cyclesum = -2
12
s0
s1
s2
s3s4
-1
s5
-1
-1-1
-1
![Page 14: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/14.jpg)
▸ Is a 2-cycle schedule feasible?
Conflict-Driven Learning
1propose
No conflict
No negative cycle
13
s0
s1
s2
s3s4
-1
s5
-1
-1-1
Feasible! Returns schedule.
![Page 15: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/15.jpg)
▸ Generate short conflicts– Shorter conflict è more pruning è faster
convergence
– Negative cycle = irreducibly inconsistent set of constraints• Keeps conflicts short• Becomes consistent if any constraint is removed from the set
14
Fast Conflict-Driven Learning
![Page 16: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/16.jpg)
▸ Combining SDC and SAT with conflict-driven learning enables fast yet exact resource-constrained scheduling– Up to 1000X faster than ILP
▸ Broader applications– Not just specific to HLS– Applies to constrained scheduling problems in
other fields
15
Take-Away Points on SDS Scheduling
![Page 17: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/17.jpg)
High-level Programming Languages
(C/C++, SystemCMatlab, ...)
Compilation
Transformations
RTLgeneration
Recap: High-Level Synthesis Flow
S0
S1
S2
S0
S1
S2
ab
z
d
3 cycles
*–
Control data flow graph (CDFG)
Finite state machines with datapath
BB3
BB1
BB2
BB4
T F
+
-
*+
*
if (condition) {…
} else {t1 = a + b;t2 = c * d;t3 = e + f;t4 = t1 * t2;z = t4 – t3;
}
16
Scheduling Binding
Allocation
![Page 18: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/18.jpg)
Resource Sharing and Binding
▸ Resource sharing: shares resources to minimize cost, in resource usage/area/power– Typically carried out by binding in high-level synthesis– Other subtasks such allocation and scheduling greatly impact
the resource sharing opportunities
▸ Binding: maps operations, variables, and/or data transfers to the available resources– After scheduling: decide resource usage and detailed
architecture (focus of this lecture)– Before scheduling: affect both area and delay – Simultaneous scheduling and binding: better result but more
expensive
17
![Page 19: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/19.jpg)
▸ Functional unit (FU) binding– Primary objective is to minimize the number of FUs– Considers connection cost
▸ Register binding– Primary objective is to minimize the number of registers– Considers connection cost
▸ Connectivity binding– Minimize connections by exploiting the commutative property of
some operations / FUs– NP-hard
18
Binding Sub-problems
![Page 20: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/20.jpg)
Sharing Conditions
▸ Functional units (registers) are shared by operations (variables) of same type whose lifetimes do not overlap – Lifetime: [birth-time, death-time)
• Operation: The whole execution time (if unpipelined)• Variable: From the time this variable is defined to the time it
is last used
19
![Page 21: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/21.jpg)
20
Operation Binding
Functional Unit Operations
Mul1 op1, op3
AddSub1 op2, op4
AddSub2 op5, op6
clock edge
×
×
+
+ +−
2 31
ab
cdefg
op1op2
op3
op4
op5
op6
Functional Unit Operations
Mul1 op1, op3
AddSub1 op2, op4, op6
AddSub2 op5
Binding 1 Binding 2
![Page 22: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/22.jpg)
21
Register Binding
clock edge
×
×
+
+ +
−
2 31
a
b
c
d
e
f
g
Lifetime crossing clock edge;Register Implied
![Page 23: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/23.jpg)
22
Variable Lifetime Analysis
v1 [1, 2)v2 [2, 3)v3 [3, 4)
Variables v1, v2, and v3 can share the same register
Variable lifetimes
clock edge
×
×
+
+ +−
2 3 41
a
b
cdefg
v1v2
v3
![Page 24: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/24.jpg)
▸ Operation/variables compatibility:– Same type, non-overlapping lifetimes
▸ Compatibility graph: – Vertices: operations/variables – Edges: compatibility relation
▸ Conflict graph: Complement of compatibility graph
23
Compatibility and Conflict Graphs
a b
c
d
a b
c
d
A scheduled DFG(operations have the same type) Compatibility graph
a b
c
d
Conflict graph
Note: The graphs for variables/registers can be constructed in a similar way
![Page 25: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/25.jpg)
Clique Cover Number and Chromatic Number
▸ Compatibility graph:– Partition the graph into a minimum number of cliques
• Clique in an undirected graph is a subset of its vertices such that every two vertices in the subset are connected by an edge
▸ Conflict graph:– Color the vertices by a minimum number of colors (chromatic
number), where adjacent vertices cannot use the same color
24
a b
c
d
a b
c
d
A scheduled DFG Clique partitioning on compatibility graph
a b
c
d
Coloring on conflict graph
Operations have same type
![Page 26: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/26.jpg)
▸ Next lecture: More Binding, Pipelining
25
Before Next Class
![Page 27: More Scheduling Resource Sharing - Cornell University · 2018-10-09 · Resource Sharing and Binding Resource sharing: shares resources to minimize cost, in resource usage/area/power](https://reader033.vdocuments.us/reader033/viewer/2022041706/5e4512646ff68e55e000f546/html5/thumbnails/27.jpg)
▸ These slides contain/adapt materials developed by– Prof. Deming Chen (UIUC)– Prof. Jason Cong (UCLA)
26
Acknowledgements