Automatic Instruction Scheduler Retargeting by Reverse-Engineering
Matthew J. Bridges, Neil Vachharajani, Guilherme Ottoni, David I. August
Liberty Research Group, Department of Computer Science
Princeton University
Structural Hazards
[Figure: code sequence Add, Shl, Sub, Xor; static schedule (Add Shl Sub, then Xor); resulting dynamic schedule (Add, then Shl Sub, then Xor); resource-usage table over ALU0 and ALU1 (axes: Resources, Time)]
Structural Hazards
[Figure: the same code sequence (Add, Shl, Sub, Xor) with its resource usage on ALU0 and ALU1 (axes: Resources, Time), shown next to its static schedule and its dynamic schedule]
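A minimal sketch of how a structural hazard turns one static schedule into a longer dynamic schedule, assuming a toy in-order machine. The resource map (Add and Shl on ALU0, Sub and Xor on ALU1) and the name `dynamic_schedule` are assumptions made for illustration, not taken from the slides.

```python
# Assumed resource map (hypothetical): the ALU each opcode occupies for one cycle.
RESOURCE = {"Add": "ALU0", "Shl": "ALU0", "Sub": "ALU1", "Xor": "ALU1"}


def dynamic_schedule(static_schedule, resource=RESOURCE):
    """Replay a static schedule on a toy in-order machine.

    Each group in the static schedule starts on a fresh cycle. Inside a
    group, instructions issue in order; an instruction whose unit is
    already taken in the current cycle ends that cycle and issues in
    the next one, together with everything behind it.
    """
    cycles = []
    for group in static_schedule:
        current, busy = [], set()
        for insn in group:
            unit = resource[insn]
            if unit in busy:  # structural hazard: split the issue group
                cycles.append(current)
                current, busy = [], set()
            current.append(insn)
            busy.add(unit)
        if current:
            cycles.append(current)
    return cycles


# A schedule built without hazard information: planned 2 cycles, takes 3.
unaware = dynamic_schedule([["Add", "Shl", "Sub"], ["Xor"]])

# A schedule that respects the resource map: planned 2 cycles, takes 2.
aware = dynamic_schedule([["Add", "Sub"], ["Shl", "Xor"]])
```

Under these assumptions, the hazard-unaware schedule executes as Add, then Shl Sub, then Xor, while the hazard-aware one executes exactly as planned.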
Encoding Structural Hazards
[Figure: offline, the processor manuals are translated into a machine description; inside the compiler, the scheduler's hazard detector uses that description]
Processor Manuals
Pros:
Easily Available
Standard Methodology
Cons:
Processor manual does not describe the actual machine
Takes several man-weeks to translate processor manual into machine description
Machine description written incorrectly
Processor Manuals
IA-64 Instruction Set Reference Vol. 3: ~370 pages of ISA description
Itanium 2 Processor Reference Manual: ~200 pages of microarchitectural details
Humans aren’t Perfect
OP_SHIFT (alt(ALT_SHIFT_REG));
OP_SHIFT (alt||(ALT_SHIFT_IMM));
ALT_SHIFT_REG (format (OF_IREG_IREG resv(RL_Itype));
ALT_SHIFT_IMM (format (OF_IREG_IMM6 resv(RL_Itype));
Itanium Instruction Set Reference Vol. 3: Page 3:212
Itanium 2 Processor Reference Manual: Page 27
IMPACT Itanium 2 Machine Description
Encoding Structural Hazards
Architecture Description Language [LISA] [nML]
Pros:
Machine description accurately describes processor
Cons:
ADLs are limited
Not used for general purpose processors
Often unavailable
[Figure: offline, an ADL specification yields the machine description; inside the compiler, the scheduler's hazard detector uses that description]
Encoding Structural Hazards
Query machine while scheduling [Baker '91] [Dupre '04]
Pros:
Avoids human errors
Easily available
Cons:
Increase in compile time
Unable to cross-compile
[Figure: inside the compiler, the scheduler's hazard detector queries the machine directly]
Encoding Structural Hazards
Query machine a priori
Pros:
Avoids human errors
Easily available
Avoid scheduling overhead
Cons:
?
[Figure: offline, a structural hazard determination algorithm produces the machine description; inside the compiler, the scheduler's hazard detector uses that description]
Limits of Querying
[Figure: Machine A and Machine B, each with regions labeled Infinite and Finite]
Initial goal: Perfectly characterize the machine
Revised goal: Characterize the machine well enough to produce good schedules
Identifying a Subspace to Explore
[Figures: schedule-space diagrams with axes Width (w), Depth (p), Instructions (I), and Categories (C)]

Itanium 2 Time assumes 100 Schedules/Second.

Assumption        # Possible Instruction Schedules                    Itanium 2 Time
General           $\infty$                                            $\infty$
Myopia            $I^{w \cdot p} = 300^{6 \cdot 8}$                   $10^{107}$ Millennia
Pipelined         $I^{w} = 300^{6}$                                   230 Millennia
No Order          $\binom{I + w - 1}{w} = \binom{300 + 6 - 1}{6}$     3 Millennia
Categorization    $\binom{C + w - 1}{w} = \binom{24 + 6 - 1}{6}$      1.3 Hours
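The schedule counts on this slide can be recomputed from its formulas. This is a minimal sketch, assuming the slide's Itanium 2 values (I = 300, w = 6, p = 8, C = 24) and its rate of 100 schedules per second. The helper names `schedule_counts`, `hours`, and `millennia` are illustrative, and the recomputed times are unrounded, so they may differ from the rounded figures on the slide.

```python
from math import comb

I, w, p, C = 300, 6, 8, 24   # instructions, width, depth, categories (slide values)
RATE = 100                   # schedules measured per second (slide value)
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def schedule_counts(I, w, p, C):
    """Number of schedules to explore under each simplifying assumption."""
    return {
        "Myopia": I ** (w * p),                 # ordered, w wide, p deep
        "Pipelined": I ** w,                    # ordered, one cycle of w slots
        "No Order": comb(I + w - 1, w),         # multisets of w instructions
        "Categorization": comb(C + w - 1, w),   # multisets of w categories
    }


def hours(n, rate=RATE):
    """Time to measure n schedules, in hours."""
    return n / rate / 3600


def millennia(n, rate=RATE):
    """Time to measure n schedules, in millennia."""
    return n / rate / SECONDS_PER_YEAR / 1000


counts = schedule_counts(I, w, p, C)
for name, n in counts.items():
    print(f"{name:15s} {float(n):10.3g} schedules  {hours(n):10.3g} hours")
```

The No Order and Categorization rows use the multiset coefficient because, once issue order is ignored, a schedule is just an unordered selection of w items with repetition allowed.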
Cost of Assumptions
[Figure: chart of Speedup Over No Hazard Detection (y-axis, 1.00 to 1.20) for TI TMS320C3x, Sparc V8, Itanium, and Itanium 2; series: Ignore Order, Pipelined, Manual Resource Maps]
Reverse-Engineering Algorithm
1. All Instructions in a Single Category
2. Split Categories
   • Perform a Random Walk
   • Longer walk = more accurate categories
3. Extract Canonical Instructions
4. Exhaustive Exploration of Canonical Instructions …
5. Update machine description
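The five steps above can be sketched as follows. This is one plausible reading, not the authors' implementation: the slide does not say how the random walk splits categories, so the sketch assumes two instructions belong in different categories as soon as swapping one for the other changes a measured timing. The toy machine (`UNIT_OF`, `CAPACITY`, `WIDTH`, `measure`) is a hypothetical stand-in for timing runs on real hardware.

```python
import random
from itertools import combinations_with_replacement
from math import ceil

# Hypothetical toy machine: functional unit per opcode, units per cycle.
UNIT_OF = {"add": "alu", "sub": "alu", "xor": "alu",
           "shl": "shift", "shr": "shift", "ld": "mem", "st": "mem"}
CAPACITY = {"alu": 2, "shift": 1, "mem": 1}
WIDTH = 3


def measure(group):
    """Cycles the toy machine needs to issue an unordered group."""
    demand = {}
    for insn in group:
        unit = UNIT_OF[insn]
        demand[unit] = demand.get(unit, 0) + 1
    return max(ceil(n / CAPACITY[u]) for u, n in demand.items())


def split_categories(insns, walk_length, rng):
    categories = [list(insns)]                  # step 1: a single category
    for _ in range(walk_length):                # step 2: random walk
        context = [rng.choice(insns) for _ in range(WIDTH - 1)]
        refined = []
        for cat in categories:
            by_timing = {}                      # same timing -> same category
            for insn in cat:
                cycles = measure(context + [insn])
                by_timing.setdefault(cycles, []).append(insn)
            refined.extend(by_timing.values())
        categories = refined
    return categories


def reverse_engineer(insns, walk_length=200, seed=0):
    rng = random.Random(seed)
    categories = split_categories(insns, walk_length, rng)
    canon = [cat[0] for cat in categories]      # step 3: canonical instructions
    table = {tuple(sorted(group)): measure(list(group))   # step 4: exhaustive
             for group in combinations_with_replacement(canon, WIDTH)}
    category_of = {insn: cat[0] for cat in categories for insn in cat}
    return category_of, table                   # step 5: machine description


def predicted_cycles(description, group):
    """Look up a group's timing through its canonical instructions."""
    category_of, table = description
    return table[tuple(sorted(category_of[insn] for insn in group))]


description = reverse_engineer(list(UNIT_OF))
```

A longer walk gives more chances to tell two instructions apart, which matches the slide's note that a longer walk yields more accurate categories. The exhaustive step then measures only multisets of canonical instructions rather than of all instructions.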
Cost of Finding Categories
[Figure: chart of % of No Order Speedup (y-axis, 0 to 100) against Time (Minutes) (x-axis, 1 to 1000) for Itanium 2, Itanium, Sparc V8, and TI TMS320C3x]
Conclusion
Obtaining structural hazards for use in instruction scheduling is error prone
   • Processor manuals are wrong or confusing
   • Manual translation is error prone
   • ADLs aren’t available
We can automatically extract structural hazards useful for scheduling
   • It is impossible to determine all structural hazards a priori
   • It is possible to find sufficient hazard information to produce good schedules
   • Algorithm achieves 81-100% of manual resource map performance
Greatly reduces the time needed to retarget the instruction scheduler of a compiler
Facilitates Design Space Exploration
Thank You