Scalable and Energy-efficient Architecture Lab (SEAL)
http://seal.ece.ucsb.edu/ SEAL@UCSB
DRISA: A DRAM-based Reconfigurable In-Situ Accelerator
Shuangchen Li, Dimin Niu, Krishna T. Malladi,
Hongzhong Zheng, Bob Brennan, Yuan Xie
University of California, Santa Barbara
Memory Solutions Lab, Samsung Semiconductor Inc.
Motivation and Observation
• Merging the computing resources and memory fabrics
– Memory-rich processor: low memory capacity
– Compute-capable memory: low performance
To have BOTH:
(1) Use DRAM technology
(2) Remove “sys-memory” constraints
Building an accelerator with DRAM technology
[Figure: normalized on-chip memory capacity per area vs. normalized peak performance per area (log-log axes). Memory-rich processors: Shidiannao (ASICs), Dadiannao, TITAN X (GPU). Compute-capable memory (PIM): BufferedComp, NeuroCube. "This Work" is plotted as a separate point.]
Key Ideas and Approaches
• DRAM technology is logic-incompatible: use simple Boolean logic operations
• General purpose: reconfigurable
• High perf.: improve parallelism, unblock data movement, optimize activation
– Multi-subarray active
– Multi-bank active
[Figure: cells, bitline, and SA, annotated with NOR and SHIFT]
Architecture Overview
• DRAM modifications:
– Change decoders to controllers
– Change SA to support logic operations
– Add shifters
– Others: group/bank buffers help internal data transfer, bank/subarray reorganization, split cell array regions
[Figure: (a) Chip: groups of banks. (b) Bank: bCtrl, subarrays, mats. (c) Subarray and mat: sCtrl, DRAM cells, SA that supports Boolean logic operations, shifter.]
Making the BL Able To Compute (1/2)
• Three solutions:
– 3T1C: natural NOR on the BL
– 1T1C: add logic gates, or adopt AMBIT’s method
– 1T1C-adder: add full adders to the BL
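[Figure: three designs, each with rows Rs, Rt, Rr and an SA. 3T1C-NOR: separate read/write wordlines (rWL, wWL) and bitlines (rBL, wBL). 1T1C-NOR/MIX: a latch and logic gate at the SA, or a pre-loaded row that selects AND or OR, with example bitline levels of 0.3 and 0.6 sensed as <0.5 or >0.5. 1T1C-ADDER: latches and an n-bit adder at the SA.]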
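The pre-load idea in the 1T1C option can be pictured as a bitwise majority vote, which is how AMBIT-style triple-row activation is usually described: three cells share charge on one bitline and the SA resolves to whichever value at least two of them hold. The sketch below is only an illustration under that assumption, with an ideal SA threshold; the function names and the 8-bit row width are hypothetical and do not come from the slides.

```python
WIDTH = 8                      # hypothetical row width, for illustration only
ALL_ONES = (1 << WIDTH) - 1


def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority of three rows: the value an ideal SA would resolve
    after the three cells on each bitline share charge."""
    return (a & b) | (b & c) | (a & c)


def row_and(a: int, b: int) -> int:
    # Pre-loading the third row with all zeros turns majority into AND.
    return majority(a, b, 0)


def row_or(a: int, b: int) -> int:
    # Pre-loading the third row with all ones turns majority into OR.
    return majority(a, b, ALL_ONES)


def row_nor(a: int, b: int) -> int:
    # The inversion is modelled as a plain bitwise NOT (an assumption here).
    return ~row_or(a, b) & ALL_ONES


if __name__ == "__main__":
    a, b = 0b1100, 0b1010
    print(f"AND={row_and(a, b):08b} OR={row_or(a, b):08b} NOR={row_nor(a, b):08b}")
```

With two of three cells holding 1, the shared bitline level lands above the midpoint and is sensed as 1, which matches the 0.3 / 0.6 levels in the figure residue.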
Making the BL Able To Compute (2/2)
• Example: selector
$R = (S == 1)\ ?\ X : Y$
$R = S \cdot X + \tilde{S} \cdot Y$
NOR-only logic:
$\tilde{R} = \mathrm{NOR}(\mathrm{NOR}(\tilde{S}, \tilde{X}),\ \mathrm{NOR}(S, \tilde{Y}))$
Step-1: $\tilde{X} = \mathrm{NOR}(0, X)$
Step-2: $\tilde{Y} = \mathrm{NOR}(0, Y)$
Step-3: $\tilde{S} = \mathrm{NOR}(0, S)$
Step-4: $\mathrm{tmp1} = \mathrm{NOR}(\tilde{S}, \tilde{X})$
Step-5: $\mathrm{tmp2} = \mathrm{NOR}(S, \tilde{Y})$
Step-6: $\tilde{R} = \mathrm{NOR}(\mathrm{tmp1}, \mathrm{tmp2})$
Step-7: $R = \mathrm{NOR}(0, \tilde{R})$
[Figure: rows on the bitline as the steps proceed: X, Y, S, !X, !Y, !S, !(!X+!S), !(!Y+S), !R, R.]
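The seven steps can be replayed in software to check that the NOR-only sequence really implements the selector. This is a minimal sketch that treats each row as an 8-bit integer, one bit per bitline; the width and the helper names are assumptions for illustration, not part of the slides.

```python
WIDTH = 8                      # hypothetical number of bitlines per row
MASK = (1 << WIDTH) - 1


def nor(a: int, b: int) -> int:
    """Bitwise NOR of two rows, the only logic primitive assumed here."""
    return ~(a | b) & MASK


def select(s: int, x: int, y: int) -> int:
    """Per-bit selector R = S ? X : Y using only NOR, following Step-1..7."""
    not_x = nor(0, x)           # Step-1
    not_y = nor(0, y)           # Step-2
    not_s = nor(0, s)           # Step-3
    tmp1 = nor(not_s, not_x)    # Step-4: equals S AND X
    tmp2 = nor(s, not_y)        # Step-5: equals (NOT S) AND Y
    not_r = nor(tmp1, tmp2)     # Step-6
    return nor(0, not_r)        # Step-7


if __name__ == "__main__":
    s, x, y = 0b11110000, 0b10101010, 0b01010101
    print(f"{select(s, x, y):08b}")
```

Each call to `nor` stands in for one row-wide operation, so the whole selector costs seven such operations regardless of how many bitlines the row has.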
Shifters (1/2)
• Why include shifters:
– E.g., carry-in propagation
[Figure: adjacent bit positions 0 and 1. Bit 0 holds X0, Y0, Cin0 and produces S0 and Cout0; Cout0 becomes Cin1 at bit 1, next to X1 and Y1.]
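A small software model shows why the carry forces a shift: every bitwise operation stays inside its own bitline, so the carry produced at bit 0 can only reach bit 1 if the row is moved by one position. The sketch below assumes one operand bit per bitline and builds AND and XOR from NOR; the names, the 8-bit width, and the iterative sum/carry formulation are illustrative choices, not the authors' exact sequence.

```python
WIDTH = 8                      # hypothetical operand width (one bit per bitline)
MASK = (1 << WIDTH) - 1


def nor(a: int, b: int) -> int:
    return ~(a | b) & MASK


def and_(a: int, b: int) -> int:
    return nor(nor(0, a), nor(0, b))


def xor_(a: int, b: int) -> int:
    # NOR(a AND b, a NOR b) is the complement of XNOR, i.e. XOR.
    return nor(and_(a, b), nor(a, b))


def shift_left(a: int) -> int:
    """Move every bit to the neighbouring, more significant bitline."""
    return (a << 1) & MASK


def add(x: int, y: int) -> int:
    """Add two rows modulo 2**WIDTH using only NOR-derived logic and shifts."""
    total, carry = x, y
    while carry:
        # Carries are generated locally, then shifted to the next bit position.
        total, carry = xor_(total, carry), shift_left(and_(total, carry))
    return total


if __name__ == "__main__":
    print(add(0b0011, 0b0001))  # 3 + 1
```

Without `shift_left`, `carry` would stay aligned with the bit that produced it and the loop could never deliver Cout0 to Cin1.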
Shifters (2/2)
• Multiple hierarchies:
– Intra-lane: bit shift inside an 8-bit lane
– Inter-lane: array element shift
– Forwarding: access any element in the array
[Figure: a row divided into virtual lanes (INT8).]
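The difference between the first two shift levels can be sketched by packing several INT8 elements into one wide row. In this illustration an intra-lane shift must not leak bits across lane boundaries, while an inter-lane shift moves whole elements. The lane count, shift direction, and function names are assumptions made for the example; forwarding is not modelled.

```python
LANE_BITS = 8                  # INT8 virtual lane, as on the slide
LANES = 4                      # hypothetical number of lanes in a row
ROW_MASK = (1 << (LANE_BITS * LANES)) - 1


def pack(values):
    """Pack a list of 8-bit elements into one row, element 0 in the lowest lane."""
    row = 0
    for i, v in enumerate(values):
        row |= (v & 0xFF) << (i * LANE_BITS)
    return row


def unpack(row):
    return [(row >> (i * LANE_BITS)) & 0xFF for i in range(LANES)]


def intra_lane_shift_left(row, n=1):
    """Shift each element left by n bits; bits leaving a lane are dropped."""
    lane_keep = (0xFF << n) & 0xFF          # bit positions valid after the shift
    keep = pack([lane_keep] * LANES)
    return (row << n) & keep


def inter_lane_shift(row, lanes=1):
    """Move whole elements up by `lanes` lane positions; the top ones fall off."""
    return (row << (lanes * LANE_BITS)) & ROW_MASK


if __name__ == "__main__":
    row = pack([0x81, 0x01, 0xFF, 0x40])
    print([hex(v) for v in unpack(intra_lane_shift_left(row))])
    print(unpack(inter_lane_shift(pack([1, 2, 3, 4]))))
```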
Putting Compute-capable BLs and Shifters Together
• Observations:
– CSA is preferred: reduction works fine
– Affordable MUL: one operand needs to be within 2 bits
[Figure: cycles vs. operand bit length (2, 4, 8, 16) for CSA and FA.]
[Figure: cycles (log scale) vs. operand-1 bit length (1, 2, 4, 8, 16), one curve per operand-2 bit length (2, 4, 8, 16).]
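Reading CSA as a carry-save adder (the slide does not expand the acronym, so this is an assumption), the reason it suits reduction is that it compresses three operands into two without rippling a carry through every bit. Only the last addition needs full carry propagation. The sketch below illustrates that idea with ordinary Python integers; the function names are hypothetical.

```python
def carry_save(a: int, b: int, c: int):
    """3:2 compression: return (partial_sum, carry) with a + b + c == partial_sum + carry.
    Every output bit depends only on the same-position input bits, plus one shift."""
    partial_sum = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1
    return partial_sum, carry


def reduce_sum(operands):
    """Sum many operands by repeated 3:2 compression, then one ordinary add."""
    ops = list(operands)
    while len(ops) > 2:
        a, b, c = ops.pop(), ops.pop(), ops.pop()
        ops.extend(carry_save(a, b, c))
    # Only this final step needs a carry-propagating addition.
    return sum(ops)


if __name__ == "__main__":
    print(reduce_sum([3, 5, 7, 9, 11]))
```

Each `carry_save` call uses only bitwise logic and a one-position shift, which are the two primitives the compute-capable BLs and shifters provide.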
Optimizations for high performance
• Adopting commodity DRAM:
– 13 cycles for an 8-bit CSA
– tRC (46 ns)
• DRAM technology is logic-incompatible: simple Boolean logic, run serially
• General purpose: reconfigurable
• High perf.: improve parallelism, unblock data movement, optimize activation
[Figure: normalized on-chip memory capacity per area vs. normalized peak performance per area, showing the compute-capable memory (PIM) and memory-rich processor regions, the un-optimized design point, and the target.]
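As a back-of-envelope reading of the two commodity-DRAM numbers, and assuming (the slide does not say so explicitly) that each of the 13 cycles costs one full row cycle tRC, a single 8-bit CSA would take several hundred nanoseconds. The constant and function names below are illustrative only.

```python
T_RC_NS = 46                   # row cycle time from the slide, in nanoseconds
CYCLES_PER_8BIT_CSA = 13       # cycle count from the slide


def csa_latency_ns(cycles: int = CYCLES_PER_8BIT_CSA, t_rc_ns: int = T_RC_NS) -> int:
    """Latency of one operation if every cycle takes one tRC (an assumption)."""
    return cycles * t_rc_ns


if __name__ == "__main__":
    print(csa_latency_ns(), "ns per 8-bit CSA")
```

Under that assumption the un-optimized latency is 13 × 46 ns = 598 ns, which is consistent with the slide's emphasis on parallelism and activation optimizations.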
Experiment Setup
• DRISA circuit simulator:
– Heavily modified CACTI
– Digital circuit (controller, logic gates)
• From Design Compiler synthesis
• Scaled to DRAM process with 20% perf. overhead and 80% area overhead (ISCAS’99)
• DRISA performance simulator:
– A behavior-level simulator
– Including a mapping optimization framework
[Figure: simulation flow. Circuit Simulator [DesignCompiler + CACTI-3DD]: device parameters, design options, and circuits, with latency/cycles, power/ops, area, and leakage. Performance Simulator [in-house]: NN topology, mapping scheme, design options, and # mat/subarray/bank, with speed and power.]
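The process-scaling step for the synthesized digital logic can be sketched as a pair of multipliers. This assumes that "20% perf. overhead" means the logic delay grows by a factor of 1.2 and that "80% area overhead" means area grows by a factor of 1.8; both readings, and all names below, are assumptions rather than details given on the slide.

```python
PERF_OVERHEAD = 0.20           # from the slide; interpreted as +20% delay
AREA_OVERHEAD = 0.80           # from the slide; interpreted as +80% area


def scale_to_dram_process(delay_ns: float, area_um2: float,
                          perf_overhead: float = PERF_OVERHEAD,
                          area_overhead: float = AREA_OVERHEAD):
    """Scale logic-process synthesis results to a DRAM process (sketch only)."""
    return delay_ns * (1 + perf_overhead), area_um2 * (1 + area_overhead)


if __name__ == "__main__":
    # Hypothetical synthesis result: 1.0 ns critical path, 100 um^2 area.
    print(scale_to_dram_process(1.0, 100.0))
```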
Binary weight, 8-bit activation CNN inference case study
[Figure: Perf/Area (fr./s/mm2, log scale from 1E-02 to 1E+02) for AlexNet, vgg-16, vgg-19, resnet-152, and GM, with x-axis groups 1, 8, and 64 per network. Series: 3T1C, 1T1C-nor, 1T1C-mixed, 1T1C-adder, GPU-INT.]
GPU-INT
![Page 58: DRISA: A DRAM-based Reconfigurable In-Situ Acceleratorshuangchenli/TR/DRISA v1.0.pdf · Scalable and Energy-efficient Architecture Lab (SEAL) Key Ideas and Approaches 3 To have BOTH:](https://reader033.vdocuments.us/reader033/viewer/2022050410/5f879755adc2cc12c2142234/html5/thumbnails/58.jpg)
Scalable and Energy-efficient Architecture Lab (SEAL)
Binary weight, 8-bit activation CNN inference
case study
12
1E-02
1E-01
1E+00
1E+01
1E+02
1 8 64 1 8 64 1 8 64 1 8 64
AlexNet vgg-16 vgg-19 resnet-152 GM
Perf
/Are
a (
fr./
s/m
m2)
3T1C 1T1C-nor
1T1C-mixed 1T1C-adder
GPU-INT
![Page 60: DRISA: A DRAM-based Reconfigurable In-Situ Acceleratorshuangchenli/TR/DRISA v1.0.pdf · Scalable and Energy-efficient Architecture Lab (SEAL) Key Ideas and Approaches 3 To have BOTH:](https://reader033.vdocuments.us/reader033/viewer/2022050410/5f879755adc2cc12c2142234/html5/thumbnails/60.jpg)
Binary weight, 8-bit activation CNN inference case study
• 3T1C is not good
– The lowest area overhead
– Large memory cells
• 1T1C-adder is not the best
– The best peak performance
– Low effective performance
• 1T1C-mixed is the best solution
[Figure: Perf/Area (fr./s/mm²), log scale from 1E-02 to 1E+02, for AlexNet, vgg-16, vgg-19, resnet-152, and GM, with x-axis groups at 1, 8, and 64. Series: 3T1C, 1T1C-nor, 1T1C-mixed, 1T1C-adder, GPU-INT.]
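The plotted metric and its summary column can be sketched as follows. This assumes "GM" denotes the geometric mean across the four networks, which the slide does not spell out. The function names are hypothetical and no measured values from the talk are reproduced.

```python
import math
from typing import Iterable


def perf_per_area(frames_per_s: float, area_mm2: float) -> float:
    """Throughput normalized by chip area, in frames/s/mm^2."""
    if area_mm2 <= 0:
        raise ValueError("area must be positive")
    return frames_per_s / area_mm2


def geometric_mean(values: Iterable[float]) -> float:
    """Geometric mean, computed in log space; inputs must be positive."""
    vals = list(values)
    if not vals or any(v <= 0 for v in vals):
        raise ValueError("need at least one value, all positive")
    return math.exp(sum(math.log(v) for v in vals) / len(vals))
```

A geometric mean suits data plotted on a log axis, because a single large ratio does not dominate the summary the way it would in an arithmetic mean.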
![Page 61: DRISA: A DRAM-based Reconfigurable In-Situ Acceleratorshuangchenli/TR/DRISA v1.0.pdf · Scalable and Energy-efficient Architecture Lab (SEAL) Key Ideas and Approaches 3 To have BOTH:](https://reader033.vdocuments.us/reader033/viewer/2022050410/5f879755adc2cc12c2142234/html5/thumbnails/61.jpg)
More in the paper
• Microarchitectures of BL-logic operations and shifter
• Interface design
• Optimizations for high performance
• Impact of variation
• CNN mapping and optimizations
• Detailed experiment setup and more results
![Page 62: DRISA: A DRAM-based Reconfigurable In-Situ Acceleratorshuangchenli/TR/DRISA v1.0.pdf · Scalable and Energy-efficient Architecture Lab (SEAL) Key Ideas and Approaches 3 To have BOTH:](https://reader033.vdocuments.us/reader033/viewer/2022050410/5f879755adc2cc12c2142234/html5/thumbnails/62.jpg)
Summary
• In-situ computing: building an accelerator with DRAM technology
– DRAM for large memory capacity
– BL-computing logic design + Shifter for general purpose instructions
– Optimized for high computing performance
• Experiments on binary CNN acceleration:
– perf. per area: 8.8x vs. ASIC, 7.7x vs. GPU
– energy efficiency per area: 1.2x vs. ASIC, 15x vs. GPU
[Figure: two bitline / SA / cells diagrams, one with NOR and one with NOR and SHIFT; labels: multi-subarray active, multi-bank active.]
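Why a bitline NOR plus a shifter is enough for general-purpose instructions can be illustrated with a small software model. This is a sketch, not the DRISA microarchitecture: each "row" is modeled as a Python integer of an assumed width `WIDTH`, only `nor` and `shift_left` are treated as primitives, and all function names are hypothetical.

```python
WIDTH = 8                 # assumed row width for this toy model
MASK = (1 << WIDTH) - 1


def nor(a: int, b: int) -> int:
    """Primitive 1: bitwise NOR across the whole row."""
    return ~(a | b) & MASK


def shift_left(a: int) -> int:
    """Primitive 2: shift the row by one bit position."""
    return (a << 1) & MASK


# Everything below is composed only from the two primitives above.
def not_(a: int) -> int:
    return nor(a, a)


def or_(a: int, b: int) -> int:
    return not_(nor(a, b))


def and_(a: int, b: int) -> int:
    return nor(not_(a), not_(b))


def xor(a: int, b: int) -> int:
    return and_(or_(a, b), not_(and_(a, b)))


def add(a: int, b: int) -> int:
    """Addition modulo 2**WIDTH by repeated carry propagation."""
    for _ in range(WIDTH):  # the carry is fully shifted out after WIDTH steps
        carry = shift_left(and_(a, b))
        a = xor(a, b)
        b = carry
    return a
```

NOR is functionally complete, so any bitwise operation can be composed from it. The shift moves carries to the next bit position, which is the step that turns bitwise logic into arithmetic.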
![Page 63: DRISA: A DRAM-based Reconfigurable In-Situ Acceleratorshuangchenli/TR/DRISA v1.0.pdf · Scalable and Energy-efficient Architecture Lab (SEAL) Key Ideas and Approaches 3 To have BOTH:](https://reader033.vdocuments.us/reader033/viewer/2022050410/5f879755adc2cc12c2142234/html5/thumbnails/63.jpg)
http://seal.ece.ucsb.edu/ SEAL@UCSB
DRISA: A DRAM-based
Reconfigurable In-Situ Accelerator
Shuangchen Li, Dimin Niu, Krishna T. Malladi,
Hongzhong Zheng, Bob Brennan, Yuan Xie
University of California, Santa Barbara
Memory Solutions Lab, Samsung Semiconductor Inc.
Questions?