Enabling Rapid Design Space Exploration and Prototyping of DNN Accelerators
Tushar Krishna, Georgia Tech
http://synergy.ece.gatech.edu
http://synergy.ece.gatech.edu/tools/maeri/maeri-tutorial-hpca-2019
Tutorial @ HPCA 2019, Feb 16, 2019
Deep Learning Landscape
February 16, 2019 | MAERI Tutorial @ HPCA 2019 | Tushar Krishna | Georgia Institute of Technology
[Figure: the deep learning landscape. Labels on the slide: Model Creation, Training, Inference, Design Tools, This Tutorial; MLSL, TensorRT; On-chip Buffer, 168 PE Array; ShiDianNao, Eyeriss, NVDLA, ARM Trillium, Apple Neural Engine, Cambricon-X]
Spatial (or Dataflow) Accelerators
• Millions of parameters (i.e., weights)
• Billions of computations
• Heavy data movement
Spread computations across hundreds of ALUs
Reuse data within the array via local memories and direct communication
Examples: MIT Eyeriss, Google TPU, …
[Figure*: a 4×4 array of ALUs connected to the Memory Hierarchy; a Processing Element (PE) is labeled with Control and Register/FIFO/SRAM]
*Y. Chen et al., “Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks,” ISCA, 2016.
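To make "millions of parameters" and "billions of computations" concrete, the sketch below counts weights and multiply-accumulates (MACs) for a single convolution layer. The class name and field names are illustrative, not part of any tool in this tutorial, and the example shape (64 filters of 3×3 over 3 input channels, 224×224 outputs) is an assumed configuration for the first VGG16 convolution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayer:
    """Hypothetical description of one convolution layer (names are illustrative)."""
    K: int  # number of filters (output channels)
    C: int  # number of input channels
    R: int  # filter height
    S: int  # filter width
    Y: int  # output height
    X: int  # output width

    def weights(self) -> int:
        # One weight per (filter, input channel, filter row, filter column).
        return self.K * self.C * self.R * self.S

    def macs(self) -> int:
        # Every output pixel of every filter needs C * R * S multiply-accumulates.
        return self.weights() * self.Y * self.X


# Assumed shape for the first VGG16 convolution layer.
conv1 = ConvLayer(K=64, C=3, R=3, S=3, Y=224, X=224)
print(f"weights: {conv1.weights():,}")
print(f"MACs:    {conv1.macs():,}")
```

Even this small first layer needs tens of millions of MACs for one image; summing over all layers of a network is what pushes the total into the billions.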
Two Key HW Design Challenges
• How do we map billions of computations over limited compute and memory resources (aka Dataflow)?
• How do we design the accelerator to efficiently map arbitrary layer types and dataflows?
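As a minimal illustration of what "dataflow" means here, the sketch below computes the same 1-D convolution with two loop orders. It is not MAESTRO's notation or any accelerator's actual schedule; the function names are hypothetical, and the only point is that reordering the loops changes which operand stays in place and gets reused while the result stays the same.

```python
def conv1d_output_stationary(inputs, weights):
    """Outer loop over outputs: each partial sum stays put while it accumulates."""
    n_out = len(inputs) - len(weights) + 1
    out = [0] * n_out
    for x in range(n_out):
        for s in range(len(weights)):
            out[x] += weights[s] * inputs[x + s]
    return out


def conv1d_weight_stationary(inputs, weights):
    """Outer loop over weights: each weight is fetched once and reused for all outputs."""
    n_out = len(inputs) - len(weights) + 1
    out = [0] * n_out
    for s in range(len(weights)):
        for x in range(n_out):
            out[x] += weights[s] * inputs[x + s]
    return out


inputs = [1, 2, 3, 4, 5]
weights = [1, 0, -1]
print(conv1d_output_stationary(inputs, weights))
print(conv1d_weight_stationary(inputs, weights))
```

Both functions perform the same multiply-accumulates; they differ only in the order, which on real hardware determines buffer sizes and traffic between memories and PEs.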
MAESTRO: Analytical cost model for DNN dataflows
[Figure: MAESTRO overview]
MAESTRO Inputs
• DNN Layer Description, e.g. Layer VGG1: K = 64; … S = 3;
• Dataflow Description, e.g. Layer VGG1: Spatial[5] K … Unroll R
• Physical Resources Description: Buffer Size, Connectivity, NoC Bandwidth, …
Infrastructure and Target DNN
• Accelerator Architecture: Shared Buffer, NoC, PEs; IFMap, Weight, PSum/OFMap
• Neural Network Structure: IN, CONV, POOL, CONV, …, FC, FC, OUT
• Dataflow: Loop Ordering; Loop Tiling + Tile Mapping onto PE0, PE1, …, PE N
MAESTRO Outputs (plots for VGG16-CONV1 and VGG16-CONV11, by Dataflow Style: NLR, WS, Shi, DLA, RS)
• NoC Bandwidth: Peak Bandwidth Requirement (Gbps)
• L1 Memory Requirement (KB)
• Roofline Throughput (GFLOPS)
• Energy Analysis: MAC, L1 Read, L1 Write, L2 Read, L2 Write
https://arxiv.org/abs/1805.02566
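One of the outputs listed on this slide is a roofline throughput. The sketch below shows the generic roofline bound only, that is, attainable throughput as the minimum of peak compute and bandwidth times operational intensity. It is not MAESTRO's actual model; the function name, parameters and all numbers are hypothetical values chosen for illustration.

```python
def roofline_throughput(peak_gflops: float,
                        bandwidth_gbps: float,
                        flops_per_bit: float) -> float:
    """Generic roofline bound (an assumption here, not MAESTRO's implementation).

    peak_gflops:    compute ceiling of the PE array
    bandwidth_gbps: data delivered to the array, in Gbit/s
    flops_per_bit:  operational intensity, which depends on how much
                    reuse the chosen dataflow achieves
    """
    return min(peak_gflops, bandwidth_gbps * flops_per_bit)


# Hypothetical accelerator: 64 PEs, 1 MAC (2 FLOPs) per PE per cycle, 1 GHz.
peak = 64 * 2 * 1.0          # 128 GFLOPS
bandwidth = 32.0             # Gbit/s, assumed

low_reuse = roofline_throughput(peak, bandwidth, flops_per_bit=0.5)
high_reuse = roofline_throughput(peak, bandwidth, flops_per_bit=8.0)
print(low_reuse, high_reuse)
```

With low reuse the design is bandwidth-bound, and with high reuse it reaches the compute ceiling. This is why the same hardware can show different throughput under different dataflow styles.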
Schedule: Morning
| Time | Topic | Presenter |
| --- | --- | --- |
| 8:30 – 9:00 | Introduction and Background | Tushar |
| 9:00 – 10:00 | A primer on DNN Dataflows | Michael |
| 10:00 – 10:30 | MAESTRO Data Directives | Prasanth |
| 10:30 – 10:50 | Coffee Break | |
| 10:50 – 11:10 | MAESTRO Data Directives [contd] | Prasanth |
| 11:10 – 11:45 | MAESTRO Analytical Model | Hyoukjun |
| 11:45 – 12:30 | MAESTRO Hands-on Exercises | Hyoukjun & Prasanth |
| 12:30 – 2:00 | Lunch | |
MAERI: DNN Accelerator for Flexible Dataflows
[Figure: MAERI tool flow and architecture]
• Tool flow labels: Deep Neural Network (Neurons), MAERI Mapper (Find Optimal Dataflow), Dataflow Configs, MAERI RTL (Verilog), Cycle-Accurate Sims
• Architecture labels: Weight, Input, Output SRAM (To/From DRAM); multipliers (X) feeding adders (+); Weights/Inputs; Output Activation
• Virtual Neurons: groups of multipliers and adders labeled VN0, VN1, VN2, each with its own Output Activation
Kwon et al., ASPLOS 2018, Zhao et al, ISPASS 2019
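The figure groups multipliers and adders into virtual neurons (VN0, VN1, VN2). The sketch below illustrates that idea in the simplest possible form: contiguous multipliers are packed into equally sized groups and each group's products are summed into one output. This is a simplified assumption for illustration, not the MAERI Mapper or the RTL; the function names are hypothetical, and folding a neuron that is larger than the array is not modeled.

```python
def partition_into_virtual_neurons(num_multipliers: int, vn_size: int):
    """Pack contiguous multipliers into virtual neurons of vn_size each (simplified)."""
    if vn_size <= 0:
        raise ValueError("vn_size must be positive")
    if vn_size > num_multipliers:
        raise ValueError("neuron larger than the array; folding is not modeled here")
    num_vns = num_multipliers // vn_size
    return [range(i * vn_size, (i + 1) * vn_size) for i in range(num_vns)]


def reduce_virtual_neurons(products, vns):
    """Sum the products inside each virtual neuron, as an adder tree would."""
    return [sum(products[i] for i in vn) for vn in vns]


# Hypothetical array: 16 multipliers, neurons that each need 5 multipliers.
vns = partition_into_virtual_neurons(num_multipliers=16, vn_size=5)
products = list(range(16))            # stand-in for weight * input products
outputs = reduce_virtual_neurons(products, vns)
utilization = sum(len(vn) for vn in vns) / 16

print(len(vns), outputs, utilization)
```

In this toy case three virtual neurons fit and one multiplier stays idle, which hints at why the choice of neuron size per layer matters for utilization.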
Schedule: Afternoon
| Time | Topic | Presenter |
| --- | --- | --- |
| 2:00 – 2:20 | Overview of MAERI | Tushar |
| 2:20 – 3:00 | MAERI Mapper | Zhongyuan |
| 3:00 – 3:20 | MAERI RTL | Hyoukjun |
| 3:20 – 3:40 | MAERI Demo | Hyoukjun |
| 3:40 – 4:00 | Coffee Break | |
| 4:00 – 4:30 | Hands-on Exercises | Hyoukjun & Zhongyuan |
| 4:30 – 5:00 | Extensions | Michael |
| 5:00 – 5:10 | Wrap-Up | Tushar |
Tool Release and Resources
• Slides and video will be posted on the tutorial page
  • http://synergy.ece.gatech.edu/tools/maeri/maeri-tutorial-hpca2019/
• All codebases will be added to GitHub by tomorrow evening
  • Link will be added on the tutorial website
• Feedback
  • Please add your name to the sign-up if you have not already
    • For statistics
  • We will send out a feedback form
• This is all work in progress
  • Please reach out to us if you find a bug
  • Better still, fix it and contribute back on GitHub!
Future Extensions
• MAESTRO
  • Validation
  • Support for sparsity
  • Support for other layer types
• MAERI
  • Testbenches for other layer types and networks
  • Mapper-to-testbench auto-generator
  • Code optimization for FPGAs
Presenters
• Michael Pellauer, Sr. Research Scientist, NVIDIA
  • Intel VSSAD (2010-2015)
  • PhD (MIT) in 2010
• Tushar Krishna, Assistant Professor, School of ECE, Georgia Tech
  • PhD (MIT) in 2014
• Hyoukjun Kwon, PhD Candidate, School of CS, Georgia Tech
• Prasanth Chatarasi, PhD Candidate, School of CS, Georgia Tech
• Zhongyuan Zhao, PhD Candidate, School of CS, Shanghai Jiaotong University