![Page 1: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/1.jpg)
Summarizer: Trading Communication with Computing Near Storage
Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. Krishna Giri Nara*, Jing Li‡, Hung-Wei Tseng†, Steven Swanson‡, Murali Annavaram*
*University of Southern California
†North Carolina State University
‡University of California, San Diego
![Page 2: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/2.jpg)
Motivation – High Data Movement Cost
[Figure: host CPU connected to storage through the storage interface. Timeline segments: data computation @ host; data transfer from storage, external (host – storage) and internal]
• Limited data bandwidth
• High access latency
![Page 3: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/3.jpg)
Near Data Processing (NDP)
[Figure: host CPU, storage interface, and Storage Processor (SP). Timeline segments: data computation @ host; data transfer from storage, internal and external (host – storage)]
![Page 4: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/4.jpg)
Near Data Processing (NDP)
[Figure: host CPU, storage interface, and Storage Processor (SP). Timelines compared: W/O NDP and With NDP. Segments: data computation @ host; data transfer from storage, internal and external (host – storage); data computation @ storage (With NDP)]
![Page 5: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/5.jpg)
Near Data Processing (NDP) on SSDs
[Figure: host CPU, storage interface, and SP. Timelines compared: W/O NDP and With NDP. Segments: data computation @ host; data transfer from storage, internal and external (host – storage); data computation @ storage (With NDP)]
SP workload:
• Garbage collection
• Wear-leveling
• Data computation @ storage
![Page 6: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/6.jpg)
Near Data Processing (NDP) on SSDs
[Figure: host CPU, storage interface, and SP. Timelines compared: W/O NDP and With NDP. SP workload: garbage collection, wear-leveling, data computation @ storage]
Obstacles to in-SSD processing
• Less powerful embedded processor
• Dynamic computation resource availability
• Manual workload partitioning is difficult
Summarizer: Dynamic NDP framework for SSD
![Page 7: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/7.jpg)
Summarizer – Basic Concept
[Figure: host CPU, storage interface, and AP. Callout: monitoring resources]
![Page 9: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/9.jpg)
Summarizer – Detailed Firmware Architecture
[Figure: firmware architecture block diagram with the components below]
• Host: User Applications / Operating Systems, NVMe Host Driver, Host CPU, Host Memory (SQ, CQ)
• Storage Interface (PCIe / NVMe)
• I/O Controller (NVMe command decoder), Request queue, Response queue
• SSD Firmware: Flash Translation Layer (FTL), Summarizer, Task Controller, TQ, User Functions
• SSD Embedded Processors, SSD SoC Interconnection
• DRAM Controller, SSD DRAM
• Flash Controller, NAND Flash
![Page 10: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/10.jpg)
Normal Page Read Request
[Figure: firmware architecture diagram (host, PCIe / NVMe storage interface, SSD firmware, DRAM and flash controllers). Annotations: RD(LBA); (RD) PPA]
![Page 11: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/11.jpg)
Normal Page Read Request
[Figure: firmware architecture diagram (host, PCIe / NVMe storage interface, SSD firmware, DRAM and flash controllers). Annotations: RD(PPA 1); RD(PPA 2); Page data]
![Page 12: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/12.jpg)
Normal Page Read Request
[Figure: firmware architecture diagram (host, PCIe / NVMe storage interface, SSD firmware, DRAM and flash controllers). Annotation: Page data]
![Page 13: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/13.jpg)
Summarizer – Initialization (Function Offloading)
[Figure: firmware architecture diagram (host, PCIe / NVMe storage interface, SSD firmware, DRAM and flash controllers). Annotations: New NVMe command; INIT(foo); Function offloading of foo(); Function registration. User Functions: foo() f#1]
![Page 14: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/14.jpg)
Summarizer – Computation (Dynamic mode)
[Figure: firmware architecture diagram (host, PCIe / NVMe storage interface, SSD firmware, DRAM and flash controllers). Annotations: New NVMe command; RD&PROC(LBA, foo); New NVMe command decode; RD&PROC(PPA, foo). User Functions: foo() f#1, goo() f#2]
![Page 15: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/15.jpg)
Summarizer – Computation (Dynamic mode)
[Figure: firmware architecture diagram (host, PCIe / NVMe storage interface, SSD firmware, DRAM and flash controllers). Annotations: RD&PROC(PPA, foo); RD&P(PPA1, foo); RD&P(PPA2, foo); Page data; RD&P(PPA1, foo). User Functions: foo() f#1, goo() f#2]
![Page 16: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/16.jpg)
Summarizer – Computation (Dynamic mode)
[Figure: firmware architecture diagram (host, PCIe / NVMe storage interface, SSD firmware, DRAM and flash controllers). Annotations: RD&PROC(PPA, foo); Page data; RD&P(PPA1, foo); buf1, foo; CC/Proc; Register in TQ. User Functions: foo() f#1, goo() f#2]
![Page 17: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/17.jpg)
Summarizer – Computation (Dynamic mode)
[Figure: firmware architecture diagram (host, PCIe / NVMe storage interface, SSD firmware, DRAM and flash controllers). Annotations: RD&PROC(PPA, foo); Page data; RD&P(PPA1, foo); CC; TQ is full. User Functions: foo() f#1, goo() f#2]
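The per-page decision in dynamic mode can be sketched as follows. This is a minimal illustration, not the Summarizer firmware: the class name, method names, queue capacity, and return tags are all hypothetical, and the only behavior taken from the slides is that a page is registered in the TQ when there is room and is otherwise returned to the host as page data.

```python
from collections import deque

# Hypothetical stand-in for the "special code" the SSD returns
# when a page is handled inside the SSD.
PROCESSED_IN_SSD = "PROCESSED_IN_SSD"
PAGE_DATA = "PAGE_DATA"


class TaskController:
    """Toy model of a task controller with a bounded task queue (TQ)."""

    def __init__(self, tq_capacity):
        self.tq = deque()
        self.tq_capacity = tq_capacity  # assumed fixed capacity

    def handle_read_proc(self, page, func_id):
        """Decide, for one fetched page, where it will be processed."""
        if len(self.tq) < self.tq_capacity:
            # "Register in TQ": the page will be processed in the SSD.
            self.tq.append((page, func_id))
            return PROCESSED_IN_SSD, None
        # "TQ is full": hand the raw page back so the host processes it.
        return PAGE_DATA, page

    def run_one(self, functions, state):
        """Pop one queued task and apply the registered user function."""
        if self.tq:
            page, func_id = self.tq.popleft()
            functions[func_id](page, state)


# Usage: a one-entry TQ accepts the first page and bounces the second.
controller = TaskController(tq_capacity=1)
first = controller.handle_read_proc(b"page-1", func_id=1)
second = controller.handle_read_proc(b"page-2", func_id=1)
```

Because the decision is made page by page, the share of work done in the SSD follows how quickly the embedded processor drains the queue, without the host choosing a fixed split in advance.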
![Page 18: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/18.jpg)
Summarizer – Finalization
[Figure: firmware architecture diagram (host, PCIe / NVMe storage interface, SSD firmware, DRAM and flash controllers). Annotations: New NVMe command; FINAL(foo); Results. User Functions: foo() f#1, goo() f#2]
![Page 19: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/19.jpg)
Summarizer API and NVMe commands
Initialization
• NVMe command: INIT_TSKn
• Transfer an in-SSD procedure to SSD memory
• Initialize data structures and temporary variables for in-SSD computation
Computation
• NVMe command: READ_PROC_TSKn
• Page read command is issued with a flag indicating the user procedure embedded in SSD memory
• Return a special code if the requested page is processed in SSD
• Page data is transferred to the host if the requested page is NOT computed in SSD
Finalization
• NVMe command: FINAL_TSKn
• Gather final in-SSD computation results and transfer them to the host
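From the host's point of view, the three commands might be used as in the sketch below. Everything here is an assumption made for illustration: `FakeSSD` simulates a device in plain Python, the method names only mirror the command names, the `ssd_slots` counter is a crude stand-in for resource availability, and partial results are assumed to combine by addition.

```python
# Hypothetical stand-in for the "special code" returned by the SSD.
PROCESSED_IN_SSD = object()


class FakeSSD:
    """Simulated device; not a real NVMe driver interface."""

    def __init__(self, pages, ssd_slots):
        self.pages = pages          # maps LBA -> list of values
        self.ssd_slots = ssd_slots  # pages the SSD is willing to process
        self.func = None
        self.partial = 0

    def init_tsk(self, func):
        """Models INIT_TSKn: register the procedure, reset its state."""
        self.func = func
        self.partial = 0

    def read_proc_tsk(self, lba):
        """Models READ_PROC_TSKn: process in SSD or return the page."""
        page = self.pages[lba]
        if self.ssd_slots > 0:
            self.ssd_slots -= 1
            self.partial += self.func(page)
            return PROCESSED_IN_SSD
        return page

    def final_tsk(self):
        """Models FINAL_TSKn: return the accumulated in-SSD result."""
        return self.partial


def run_query(ssd, lbas, func):
    """Host loop: process only the pages the SSD did not handle."""
    ssd.init_tsk(func)
    host_partial = 0
    for lba in lbas:
        reply = ssd.read_proc_tsk(lba)
        if reply is not PROCESSED_IN_SSD:
            host_partial += func(reply)
    return host_partial + ssd.final_tsk()


# Usage: the SSD handles two pages, the host handles the third.
pages = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
total = run_query(FakeSSD(pages, ssd_slots=2), [0, 1, 2], sum)
```

The host code is the same whether the SSD processes every page, some pages, or none; only the reply to each read decides where the work happens.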
![Page 20: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/20.jpg)
Evaluation Platform
• LS2085a intelligent SSD development platform
• ARM cores running FTL and Summarizer firmware
• FPGA implementing NAND flash controller
• PCIe Gen. 3, 4 lanes, for host communication
[Figure: LS2085a block diagram. Blocks shown: four CPUs, each with L1D (32KB) and L1I (48KB); two L2 (1MB); Interconnection; DDR4 Memory Controller with DRAM; PCIe (host – LS2085a); PCIe (LS2085a – FPGA); FPGA (Altera Stratix V); NAND flash DIMMs]
![Page 21: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/21.jpg)
Evaluation - Performance
[Figure: TPC-H Query6 execution-time chart. Legend: Static, Dynamic; SSD time, Host time. Axis ticks: 0 to 4 and 0 to 1]
Callout: Static workload offloading
![Page 22: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/22.jpg)
Evaluation - Performance
[Figure: TPC-H Query6 execution-time chart. Legend: Static, Dynamic; SSD time, Host time. Axis ticks: 0 to 4 and 0 to 1]
Callouts: CPU only processing (baseline); SSD only processing
![Page 23: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/23.jpg)
Evaluation - Performance
[Figure: TPC-H Query6 execution-time chart. Legend: Static, Dynamic; SSD time, Host time. Axis ticks: 0 to 4 and 0 to 1]
Callout: Summarizer Dynamic Offloading
![Page 24: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/24.jpg)
Evaluation - Performance
[Figure: TPC-H Query6 execution-time chart. Legend: Static, Dynamic; SSD time, Host time. Axis ticks: 0 to 4 and 0 to 1]
• SSD time: SSD processing + transfer time (internal + external + in-SSD processing)
• Host time: host CPU processing time
![Page 25: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/25.jpg)
Evaluation - Performance
[Figure: TPC-H Query6 execution-time chart. Legend: Static, Dynamic; SSD time, Host time. Axis ticks: 0 to 4 and 0 to 1]
Callout: Execution time normalized to baseline (CPU only)
![Page 26: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/26.jpg)
Evaluation - Performance
[Figure: TPC-H Query6 execution-time chart. Legend: Static, Dynamic; SSD time, Host time. Axis ticks: 0 to 4 and 0 to 1. Y-axis: Execution time (normalized to baseline)]
![Page 27: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/27.jpg)
Evaluation - Performance
[Figure: TPC-H Query6 execution-time chart. Legend: Static, Dynamic; SSD time, Host time. Axis ticks: 0 to 4 and 0 to 1]
[Figure: bar chart, CPU only vs. Dynamic. Legend: SSD time, Host time. Bar labels: 0.70, 0.60, 0.30, 0.24. Y-axis: Execution time (normalized to baseline), ticks 0.0 to 1.2]
![Page 28: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/28.jpg)
Evaluation - Performance
[Figure: TPC-H Query6 execution-time chart. Legend: Static, Dynamic; SSD time, Host time. Axis ticks: 0 to 4 and 0 to 1]
[Figure: bar chart, CPU only vs. Dynamic. Legend: SSD time, Host time. Bar labels: 0.70, 0.62, 0.30, 0.24. Y-axis ticks 0.0 to 1.2]
Performance improved by 14%
[Figure: timelines W/O NDP and With NDP. Segments: data computation @ host; data transfer from storage, internal and external (host – storage); data computation @ storage (With NDP)]
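The 14% figure is consistent with the bar labels on this slide if 0.70 and 0.30 are read as the two segments of the CPU-only bar and 0.62 and 0.24 as the two segments of the Dynamic bar. That pairing is an assumption inferred from the totals (the CPU-only baseline should sum to 1.0), not something the extracted text states. A quick check:

```python
# Bar labels from the slide; the pairing into bars is an assumption.
cpu_only_segments = (0.70, 0.30)  # assumed CPU-only (baseline) bar
dynamic_segments = (0.62, 0.24)   # assumed Summarizer dynamic bar

cpu_only_total = sum(cpu_only_segments)
dynamic_total = sum(dynamic_segments)

# Relative reduction in normalized execution time versus the baseline.
improvement = (cpu_only_total - dynamic_total) / cpu_only_total
print(f"{improvement:.0%}")
```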
![Page 29: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/29.jpg)
Evaluation - Performance
[Figure: TPC-H Query6 execution-time chart. Legend: Static, Dynamic; SSD time, Host time. Axis ticks: 0 to 4 and 0 to 1]
Callout: Performance degraded by static NDP
![Page 30: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/30.jpg)
Evaluation - Performance
[Figure: four charts, each with y-axis "Execution time (normalized to baseline)". Callouts: 16%, 10%, 20%, 7%]
![Page 31: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/31.jpg)
Design Exploration – Higher Internal Bandwidth
[Figure: host CPU and storage interface. Callout: data transfer bottleneck]
Commercial SSDs maintain internal bandwidth ≈ external bandwidth
![Page 32: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/32.jpg)
Design Exploration – Higher Internal Bandwidth
[Figure: host CPU, storage interface, and SP. Callout: data transfer bottleneck]
Higher internal bandwidth without increasing external bandwidth
![Page 33: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/33.jpg)
Design Exploration – Higher Internal Bandwidth
[Figure: speedup (0% to 100%) versus external : internal bandwidth ratio (1:1, 1:2, 1:3, 1:4) for TPC-H Query 6, TPC-H Query 1, TPC-H Query 14, String Similarity Join, and Average]
![Page 34: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/34.jpg)
Design Exploration – Higher Internal Bandwidth
[Figure: speedup (0% to 100%) at ratios 1:1, 1:2, 1:3, 1:4 for TPC-H Query 6, TPC-H Query 1, TPC-H Query 14, String Similarity Join, and Average]
Summarizer is effective if an SSD platform has higher internal bandwidth
![Page 35: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/35.jpg)
Design Exploration – Better SSD Processor
[Figure: host CPU, storage interface, and AP]
A better embedded processor is cost-effective
![Page 36: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/36.jpg)
Design Exploration – Better SSD Processor
[Figure: speedup (0% to 120%) versus embedded processor performance (X1, X2, X4, X8, X16) for TPC-H Query6, TPC-H Query1, TPC-H Query14, String Similarity Join, and Average]
![Page 37: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/37.jpg)
Design Exploration – Better SSD Processor
[Figure: speedup (0% to 120%) at X1, X2, X4, X8, X16 for TPC-H Query6, TPC-H Query1, TPC-H Query14, String Similarity Join, and Average]
Summarizer is a cost-effective NDP solution with powerful storage processors
![Page 38: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/38.jpg)
Conclusion
▪ Dynamic computation offloading framework
• Opportunistic in-SSD computation
• Page-level task control
• Optimal performance improvement
▪ Summarizer programming model
✓ Dynamic NDP framework for SSDs
• Opportunistically enables in-SSD processing
• Page-level NDP control
• Automatic workload partitioning
✓ Summarizer programming model
• Evaluation on the real development platform
• Explored design space for future SSDs
![Page 39: Trading Communication with Computing Near Storage...Summarizer: Trading Communication with Computing Near Storage Gunjae Koo*, Kiran Kumar Matam*, Te I†, H.V. KrishinaGiri Nara*,](https://reader034.vdocuments.us/reader034/viewer/2022051912/600285bdb0458246b37ede35/html5/thumbnails/39.jpg)
Thank you
Summarizer: Trading Communication with Computing Near Storage
Gunjae Koo, Kiran Kumar Matam, Te I, H. V. Krishna Giri Nara, Jing Li, Hung-Wei Tseng, Steven Swanson, Murali Annavaram
(We thank Dell EMC for supporting the SSD development board.)