![Page 1: Smallest RISC-V Device for Next-Generation Edge Computing… · Title: NGS Joint Program Proposals Author: Dinesh Verma Created Date: 5/8/2018 4:44:50 PM](https://reader033.vdocuments.us/reader033/viewer/2022042918/5f5ee2bafcc813260d66ac00/html5/thumbnails/1.jpg)
Copyright (c) 2018 IBM All Right Reserved. May 8, 2018
Smallest RISC-V Device for Next-Generation Edge Computing
Seiji Munetoh¹, Chitra K Subramanian², Arun Paidimarri², Yasuteru Kohda¹
¹IBM Research – Tokyo, ²T.J. Watson Research Center
Processor chip size, Transistor count and Technology node
[Figure: processor chip size, transistor count and technology node for device classes Server, Desktop, Mobile, Embedded and RFID; annotations: "Our target", "Next-gen edge?"]
• A simple microprocessor core uses 100K-1M transistors, and can fit in an area as small as 100 × 100 µm² using advanced technology nodes.
• Running at 10 MHz, such a microprocessor will consume 1-10 mW (and much less, if it runs slower).
• The creation of compute elements that are ultra-compact and low cost will enable a dramatic expansion of applications in areas from security to IoT to health care and beyond.
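The 10 MHz power figure can be restated as energy per clock cycle and scaled to slower clocks. The sketch below is a first-order estimate only: it assumes power scales linearly with clock frequency at a fixed supply voltage and ignores leakage, neither of which is stated on the slide. The function names are illustrative.

```python
# First-order estimate: at fixed supply voltage, dynamic power ~ C * V^2 * f,
# so power scales linearly with clock frequency (assumption; leakage ignored).

REF_FREQ_HZ = 10e6             # 10 MHz operating point quoted on the slide
REF_POWER_W = (1e-3, 10e-3)    # 1-10 mW range quoted on the slide


def scaled_power_w(freq_hz):
    """Scale the quoted power range linearly to another clock frequency."""
    ratio = freq_hz / REF_FREQ_HZ
    return (REF_POWER_W[0] * ratio, REF_POWER_W[1] * ratio)


def energy_per_cycle_j():
    """Energy per clock cycle implied by the quoted operating point."""
    return (REF_POWER_W[0] / REF_FREQ_HZ, REF_POWER_W[1] / REF_FREQ_HZ)


if __name__ == "__main__":
    low, high = energy_per_cycle_j()
    print(f"energy per cycle: {low * 1e9:.1f}-{high * 1e9:.1f} nJ")
    low, high = scaled_power_w(1e6)
    print(f"power at 1 MHz:   {low * 1e3:.1f}-{high * 1e3:.1f} mW")
```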
Our 1st target application – Authentication
• Hash-based authentication
  – HMAC-SHA256 and variants
• Host/device communication: optical
  – With micro-LED and micro-PV/PD cells
  – Protocol: UART, HDLC frame, custom payloads
• Bootloader ROM (synthesized, embedded in the processor chip)
  – Basic device authentication
  – Upload new application to the device
• Storage memory (as external chip)
  – Use SRAM to emulate an NV memory chip
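Hash-based device authentication is often built as a challenge-response exchange. The sketch below is an assumption about how such a flow could look, not the protocol used in this work: the host sends a random nonce, the device returns an HMAC-SHA256 tag computed with a shared secret, and the host recomputes and compares the tag. The key, the nonce size, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def device_response(key: bytes, challenge: bytes) -> bytes:
    """Device side: tag the host's challenge with HMAC-SHA256."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def host_verify(key: bytes, challenge: bytes, response: bytes) -> bool:
    """Host side: recompute the tag and compare in constant time."""
    expected = hmac.new(key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    shared_key = secrets.token_bytes(32)   # hypothetical pre-shared device key
    nonce = secrets.token_bytes(16)        # fresh challenge for each attempt
    tag = device_response(shared_key, nonce)
    print("genuine device accepted:", host_verify(shared_key, nonce, tag))
    print("wrong key accepted:", host_verify(secrets.token_bytes(32), nonce, tag))
```

A fresh nonce per attempt keeps a recorded response from being replayed later.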
Our 1st gen. processor and 2.5D integrated device
[Figure: photo of the integrated device; labels: MicroLED, RISC-V processor (DD1), 32KB SRAM]
Processor (DD2)
• ASIC: 300 µm × 250 µm, GF14LPP
• SoC: based on PULPino (RV32IMC)
• Memory: 2KB data SRAM
+ Authentication engine
+ Analog custom circuits (LDO, Clock, ...)
2.5D integration device
• Si interposer < 1 mm², 20 µm bump
+ Processor
+ Memory (32KB SRAM)
+ Optical I/O: PD, MicroLED
+ Power: PV cells (1V, 3V)
Original PULPino Architecture (32KB SRAM x2)
32KB Inst SRAM
32KB Data SRAM
Boot ROM (xKB)
SPI-Flash (xKB)
[Block diagram. Labels: RV32IMC core, 32KB inst SRAM, 32KB data SRAM, boot ROM, MUX (×3), AXI, bridges, APB, APB SPI master, UART, I2C, SPI-Flash (Application: copy to inst. SRAM & execution)]
[Memory map. Regions: inst SRAM 32KB, data SRAM 32KB, boot ROM 8KB, IO space. Addresses shown: 0x0000 0000, 0x0000 8000, 0x0000 4000, 0x0010 0000]
Architecture evaluation using COTS FPGA boards
• Reduce memory (SRAM/Flash) size
• Reduce # of I/O pins
• Confirm performance
• Develop BootROM w/ uploader
• Emulate & test ASIC design
  – Custom analog circuits (LDO, clock OSC, PD, LED)
[Figure: FPGA boards: ZedBoard (original PULPino), ZYBO (modified PULPino), ARTY (ASIC emulation); add-on PCB (Comm, I2C, SPI); ASIC]
Modifications to PULPino to Reduce Size
• Reduce memory footprint
  – Original PULPino: 32KB (I) + 32KB (D) + ext. Flash
  – Reduced memory: 2KB (D) + 32KB ext. SRAM
• PULPino constraint: SRAM size > app size
• Remove inst. SRAM from the processor die, and use external memory to store and execute the code (XIP)
  – SRAM area (in 14nm): 60K µm² / 64KB => 4K µm² / 2KB (1/14)
• Support XIP
  – Expand bus widths from 4 to 8, 16, 32
  – Evaluate the performance of 4-, 8-, 16- and 32-bit data bus widths between the processor and ext. SRAM
[Figure: Original (32KB inst SRAM + 32KB data SRAM) vs. four candidates (2KB data SRAM + 32KB inst SRAM connected over a 4-bit, 8-bit, 16-bit or 32-bit bus)]
Reduced memory footprint: Application Execution Performance Trade Off
• Clocks per instruction
  – 1 (original), 76 (4-bit), 40 (8-bit), 24 (16-bit), 16 (32-bit)
• Application execution time from ext. SRAM
  – 8 to 36 times slower
• Choose the 8-bit bus configuration
  – I/O footprint fits within the SRAM memory chip size
  – Adequate for most applications
• But a hash calculation in SW (XIP on ext. SRAM) is too slow, since it requires greater computational power
[Figure: Original (32KB inst SRAM + 32KB data SRAM) vs. candidates (2KB data SRAM + 32KB inst SRAM). Slowdown relative to the original: 4-bit ×35.6, 8-bit ×18.9, 16-bit ×11.7, 32-bit ×8.1]
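The bus-width trade-off can be read directly off the reported figures. The sketch below only tabulates the numbers from this slide and derives how much application-level speed each widening of the bus buys; the dictionary and function names are illustrative.

```python
# Figures reported on the slide, keyed by external-SRAM bus width in bits.
CLOCKS_PER_INSTRUCTION = {4: 76, 8: 40, 16: 24, 32: 16}   # original design: 1
SLOWDOWN_VS_ORIGINAL = {4: 35.6, 8: 18.9, 16: 11.7, 32: 8.1}


def speedup_from_widening(narrow_bits: int, wide_bits: int) -> float:
    """Application-level gain from widening the bus, from reported slowdowns."""
    return SLOWDOWN_VS_ORIGINAL[narrow_bits] / SLOWDOWN_VS_ORIGINAL[wide_bits]


if __name__ == "__main__":
    for width in sorted(SLOWDOWN_VS_ORIGINAL):
        print(f"{width:2d}-bit bus: {CLOCKS_PER_INSTRUCTION[width]:2d} clk/instr, "
              f"x{SLOWDOWN_VS_ORIGINAL[width]} slower than original")
    # Every extra data pin costs I/O area, so compare what each step buys.
    print(f"4 -> 8 bits:  x{speedup_from_widening(4, 8):.2f} faster")
    print(f"8 -> 32 bits: x{speedup_from_widening(8, 32):.2f} faster")
```

Each doubling of the bus width gives less than a 2× gain, which is consistent with settling on a narrow bus when I/O footprint is the tighter constraint.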
With or without Authentication Engine HW
• Authentication engine
  – SHA256, HMAC, Benes network
• SHA256 performance
  – Original: 7760 clk/blk
  – 8-bit bus: 242966 clk/blk
  – 8-bit bus + HW assist: 12474 clk/blk
  – Original + HW assist: 350 clk/blk
• Area (FPGA)
  – LUT: 18K -> 21K, +33%
  – FF: 15K -> 17K, +21%
[Figure: Original (32KB inst SRAM + 32KB data SRAM) vs. candidates with 2KB data SRAM + 32KB inst SRAM over an 8-bit bus: without engine ×31.3 slower, with Auth. Engine ×1.6 slower]
Hash algorithm performance is adequately restored by adding the authentication engine HW, and the BootROM can support hash calculation with the authentication engine.
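The slowdown factors follow from the reported cycle counts. The sketch below recomputes them and converts one configuration into time per block; the 10 MHz clock used in the example is an assumed value for illustration, not a measured operating point, and the function names are illustrative.

```python
# SHA256 cost in clock cycles per block, as reported on the slide.
SHA256_CLK_PER_BLOCK = {
    "original": 7760,
    "8-bit bus": 242966,
    "8-bit bus + HW assist": 12474,
    "original + HW assist": 350,
}


def slowdown_vs_original(config: str) -> float:
    """Cycle count of a configuration relative to the original PULPino."""
    return SHA256_CLK_PER_BLOCK[config] / SHA256_CLK_PER_BLOCK["original"]


def block_time_ms(config: str, clock_hz: float) -> float:
    """Time to hash one block at a given clock frequency, in milliseconds."""
    return SHA256_CLK_PER_BLOCK[config] / clock_hz * 1e3


if __name__ == "__main__":
    for name in SHA256_CLK_PER_BLOCK:
        print(f"{name:24s} x{slowdown_vs_original(name):.2f} vs original")
    # 10 MHz is an assumed example clock, not a figure from the slide.
    t = block_time_ms("8-bit bus + HW assist", 10e6)
    print(f"8-bit bus + HW assist: {t:.2f} ms per block at 10 MHz")
```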
Our Modified Processor Architecture (2KB SRAM + external 32KB MPI-SRAM)
2KB Data SRAM
Boot ROM (xKB)
32KB MPI-SRAM (XIP)
Analog
[Block diagram. Labels: RV32IMC core, 2KB data SRAM, boot ROM, MUXes, AXI, bridges, APB, AXI MPI master bridge, 32 KB MPI-SRAM (Application; XIP, 8-bit width), Authentication Engine, UART, SoC ctrl, Analog (clock gen, reset gen, PD in, LED out), MicroLED, PV, PD]
[Memory map. Regions: data SRAM 2KB, boot ROM 8KB, MPI-SRAM XIP, IO space 0x0010 0000. Other addresses shown: 0x0000 0000, 0x0000 8000, 0x0000 4000, 0x0020 0000]
ASIC Implementation: Processor, MPI-SRAM, Debug Chip
GlobalFoundries 14LPP
• Processor die: X = 295 µm, Y = 256 µm (0.076 mm²)
• 32KB MPI-SRAM die: X = 164 µm, Y = 256 µm
• Debug die (Proc + SRAM + PADRING): X = ? µm, Y = ? µm
[Figure: die layouts; label: 2.5D integration]
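The quoted die area can be checked from the X/Y dimensions. The sketch below recomputes it for the two dies whose dimensions are given; the debug die is left out because its dimensions are not stated. The names are illustrative.

```python
# Die dimensions in micrometres, as given on the slide.
DIES_UM = {
    "processor": (295, 256),
    "32KB MPI-SRAM": (164, 256),
}


def area_mm2(x_um: float, y_um: float) -> float:
    """Rectangle area in mm^2 from side lengths in um (1 mm^2 = 1e6 um^2)."""
    return x_um * y_um / 1e6


if __name__ == "__main__":
    total = 0.0
    for name, (x, y) in DIES_UM.items():
        area = area_mm2(x, y)
        total += area
        print(f"{name}: {area:.3f} mm^2")
    print(f"combined die area (interposer not included): {total:.3f} mm^2")
```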
Testing the Debug Chip
• Singulation: dicing
• Packaging: QFP64, wire bonding
• Testing the operation: on testbed PCB
[Figure: die photo with Processors and SRAM regions labeled]
[Figure: waveform with Power ON, RSTn and CLK; the processor sends out 0x00 to the host]
• Boot: OK
• Application upload & run: OK
Testing the 2.5D integrated Device
• Singulation: etching
• Packaging: on Si interposer
• Testing the operation: on probe station
• Boot: OK
Summary of processor specs
| | PULPino | Our processor |
|---|---|---|
| ASIC node | 65nm | 14nm |
| ASIC size | 1 mm² | 0.076 mm² |
| Memory | I-SRAM 32KB, D-SRAM 32KB | I-SRAM N/A, D-SRAM 2KB |
| Ext. memory | 128MB Flash | MPI-SRAM 32KB |
| Clock | XX MHz | 1-100 MHz |
| I/O | UART, I2C, SPI | UART |
| Debug | SPI slave | (SPI slave) |
| Analog | - | LDO, Clock/Reset, LED driver, PD input |
Conclusions and future plans
• Our 1st generation device is under full evaluation
  – Preliminary tests showed functionality
  – Target application: authentication
• Our 2nd generation device was taped out in Feb. 2018
  – New SoC design with I-cache, radio interface, sensors
  – Target applications: blockchain and IoT applications
• Our 3rd generation device is under consideration
• It's all RISC-V
Thank you
[Figure: device photo; labels: MicroLED, RISC-V processor, 32KB SRAM; scale bar: 100 µm]