Architecture and Circuits for
Dependable 3D-VLSI
Mitsumasa Koyanagi
New Industry Creation Hatchery Center
Tohoku University
Sendai, Japan
JST CREST/DVLSI
Ivo Bolsens (Xilinx CTO), 3D-Architectures for Semiconductor Integration
and Packaging, San Francisco, December 12-14, 2011
Why 3D ?
3D-eDRAM for L3 Cache (IBM)
Hybrid Memory Cube (HMC)
(Micron+Samsung+IBM)
2.5D and 3D FPGA (Xilinx)
3D System Integration (IBM)
3D DRAM
(Samsung)
3D DRAM (Elpida)
Dependability Related Concerns in 3D VLSI
Heat accumulation and heat removal
Influences of mechanical stress
Metal impurity contamination
Reliabilities of TSV’s and metal microbumps
Design methodology and design tools
Testing and test design
Dependability is Key Issue in 3D VLSI !
Architecture and Circuits for Dependable 3D-VLSI
100FIT
Random
Hardware
Failures
Management of
Design Process
etc.
Dangerous Failure
ASIL=C
Failure Rate Allocation to
Components Considering
System Structure
Dependability
Requirement
80FIT Test Coverage for Single-
Point Failures:97% Test Coverage for Combined
Failures:80%
1TFPLOS / 5W
Performance=
Marketability・Safety
Requirements by Application
Device Scaling
3D Integration
Limitation in increasing Test Coverage
Increase of Dynamic Failures
Fault Detection by Multi- Modular Redundant
Random Test
Self-Repair
Redundancy
Demand
Utilization
Realization
Necessity
Cons. Demand
3D VLSI
Failure
<Relation between Requirements by Application and Various Technologies>
Target : ISO 26262 ASIL=C
Application of 3D VLSI to Image Processing VLSI for Automobile
Performance
Requirement
Hazard-causing
Systematic Failure
After T. Kamada (Denso)
Architecture of 3D DVLSI System
Interconnect
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Ch
eck
po
inti
ng
System Level SVP Hardware Level SVP
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Error R
ecovery
System Level SVP
Interconnect
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Ch
eck
po
inti
ng
System Level SVP Hardware Level SVP
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Error R
ecovery
System Level SVP
Interconnect
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
System Level SVP Hardware Level SVP
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Ch
eckp
oin
ting
System Level SVP
Interconnect
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
System Level SVP Hardware Level SVP
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Wide SIMD (Vector)
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
System Level SVP
Interconnect
Wide SIMD
CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Wide SIMD CORE
GPP CORE
Many-Banked LM
Cache
Global Memory Ch
eck
- p
oin
tin
g
Erro
r R
eco
very
System Level SVP Hardware Level SVP
Wide SIMD CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
Wide SIMD CORE
GPP CORE
Many-Banked LM
Cache
Global Memory
System Level SVP Hardware Level SVP
Ch
eck
- p
oin
tin
g
Erro
r R
eco
very
After Prof. H. Kobayashi (Tohoku Univ.)
The Measure for A Reliability Target
MBIST
LBIST
JTAG
Memory
Logic
TSV
Static
failure
Dynamic
failure
Insufficient in a
verification condition
Soft error
Migration
Hot spot
Inspection vector lack
The classification of the error factor
The classification of the
inspection technique
Multi-
plexing
Majority
(New) SVP
The dynamic error case in a car
-Since a power supply variation goes into the place
where connection deteriorated ,then a circuit malfunctions.
- A timing error occurs in a hot spot.
The control cycle of a display system is usually 16mS.
If an operation support system also applies to this,
Then it recovers within 16mS is no problem.
It corresponds by the multiplexing majority.
- Throughput in 16mS
- Reliability of a majority circuit
Majority
It is assisted
by run time execution.
・Fault converge 97%
・80FIT
The case
where failure is
overlooked.
The case
which failure
cannot relieve.
insufficient throughput
insufficient spare circuit
The double failure
which induces the same result occurs.
A majority circuit
breaks down to the danger side.
Quantitive analysis is needed.
↓
It comes back to the reliability
of a majority circuit,
and throughput/parallelism.
After T. Kamada (Denso)
Test Architecture for 3D VLSI with Redundant Tiers
System-Level SVP
Computing Core 1
Computing Core m
Redundant Core 1
Redundant Core (n-m-1)
Hardware-Level SVP (FPGA)
System Configuration, Test System-Level SVP
Task Allocation, Online Self-Test and Repair Control
Test: BIST controlled through TAP (JTAG) I/F Repair: Replacement by another Redundant Core
Ver
tica
l Co
mm
on
Bu
s
Health Info., Logging
Vertically Stacked and Electrically Connected
by Through-Silicon Vias (TSVs) using 3D Integration Technology.
Repair by replacing the failed core with redundant core
Block Diagram of 3D DVLSI (e.g. 4tiers)
System Bus
PB Bridge
Processor Core
On-Line Self-Test
Controller
Stacked Shared
Memory
Vertical Bus
Bridge
Memory Controller
System Bus
PB Bridge
Processor Core
On-Line Self-Test
Controller
Stacked Shared
Memory
Vertical Bus
Bridge
Memory Controller
disabled
System Bus
PB Bridge
Processor Core
On-Line Self-Test
Controller
Stacked Shared
Memory
Vertical Bus
Bridge
Memory Controller
disabled
Tier 3
Tier 2
Tier 1
Tier 0 Exte
rnal
M
em
ory
Vertical Bus using TSVs
System Bus
PB Bridge
Processor Core
On-Line Self-Test
Controller
Stacked Shared
Memory
Vertical Bus
Bridge
Memory Controller
disabled
3D Test Access Port (3D TAP)
Self-Test Control by System-Level SVP in 3D Dependable VLSI System
• SVP (Supervisor Processor) controls TAP and Chain in the stacked 3D dependable LSI – Drives TCK, TMS, TRST, TDI – Read TDO to get test data
registers in the stacked dice
Assuming TAP signals are connected by quadruple TSV that has much higher reliability than single TSV
Tier Core
Tier
TA
P
intra-tier scan chain
chai
n r
etu
rn p
ath
Tier Core
Tier
TA
P
intra-tier scan chain
chai
n r
etu
rn p
ath
Tier Core
Tier
TA
P
intra-tier scan chain
chai
n r
etu
rn p
ath
Tier Core
Tier
TA
P
intra-tier scan chain
chai
n r
etu
rn p
ath
TDI L
TDO
L
TCK
L
TMS L
TRST
L
Top Tier
Tier i+1
Tier i
Bottom Tier
TAP Control Line JTAG Scan Chain
3D DfT Architecture
Functional Design
• Stacked Dies, Core-Based • Inter-Connect: TSVs • Extra-Connect: Pins
Existing Design-for-Test
• Core: Internal Scan, TDC, LBIST, MBIST; IEEE 1149.1 wrappers, TAPC • Stack Product: IEEE Std 1149.1
3D-DfT Architecture - Test Wrapper per Die
• Based on IEEE 1149.1 • Two Entry/Exit Points per Die: - Pre-Bond : Extra Probe Pads - Post-Bond: Extra TSVs
Ps
eu
do
-Ran
do
m
Pa
tte
rn G
en
era
tor
MIS
R/ C
om
pa
rato
r
Die 1
(Bottom)
Ps
eu
do
-Ran
do
m
Pa
tte
rn G
en
era
tor
MIS
R/ C
om
pa
rato
r
Ps
eu
do
-Ran
do
m
Pa
tte
rn G
en
era
tor
MIS
R/ C
om
pa
rato
r
Sys. SVP
(Top)
Sys. SVP
Die n
TDI TDO TDI TDO
HW SVP
New 3D VLSI DfT Architecture for Online Self-Test of Dies Based on IEEE 1149.1.
Tier BIST Dynamically Controlled by System-Level SVP
Memoryグループ
collar
controller
TAPC
Soft-IP(core)
hph05shxe100(core)
hph05shxeva2n0
scan_chain
Soft-IP(ext)
CPG/PLL
STC-DIV
PR
PG
MIS
R
IR DR
CO
MP
AR
E
hph05shxeva2n0(SVP)
TDI,TMS,TCK
TSV-scan
各moduleへクロックを供給
STC-OCC
scan_chain
scan_chain
scan_chain
scan_chain
scan_chain
scan_chain
scan_chain
最大F/F段数 1500
chai
n数
32本
Memoryグループ
collar
controller
Memoryグループ
collar
controller
Memoryグループ
collar
controller
Memoryグループ
collar
controller
Memoryグループ
collar
controller
LBIST/PRPG LBIST/MISR
LBIST/Comparator
Memory+MBIST Controller
TAP Controller
Scan Chains
TSV Test/Repair Circuit Tier
Max chain length = 1500
32
sca
n c
hai
ns
General Purpose Processor
Clock Controller
from/to Sys-SVP
Peripherals
Redundancy for Through Si Vias (TSVs)
No Repair Multiplexed TSV With TSV Repair
m: multiplicity n signals : r redundant TSVs
2 4 4:2 16:4
Area +0% +100% +300% +50% +25%
Capacity +0% +100% +300% +0% +0%
Switches/Sig 0 0 0 3 5
TSV Group Yield (n TSVs)
RTSVn 1 − 1 − RTSV
m n n+ r
iRTSV
i 1 − RTSVn+r−i
n+r
i=n
2,000 1.9 × 10−7% 81.87% 99.99% 99.03% 99.98%
5,000 1.5 × 10−20% 60.65% 99.99% 97.59% 99.96%
10,000 2.2 × 10−42% 36.79% 99.99% 95.23% 99.91%
20,000 5.1 × 10−86% 13.53% 99.98% 90.69% 99.83%
Ass
em
bly
Y
ield
*
*assumed RTSV = 0.99
Samsung (ISSCC 2009) This Work
Approach for HW-SVP
Establishing a self-repair scheme for HW-SVP
Soft-error recovery using dynamic reconfiguration
Designing recovery controller and Scrubbing controller
Designing fault-tolerant system using TMR scheme
Hard-error avoidance using partial reconfiguration
Relocating partial reconfiguration bitstream (PRB)
Designing TMR scheme with Spare
Self-repair scheme and Evaluation system
Developing Evaluation system to evaluate soft-error tolerability
TMR : Triple Modular Redundancy
After Prof. T. Sueyoshi (Kumamoto Univ.)
System Configuration of HW-SVP
Triplicating processor core and peripheral modules
Implementing RM, RC and Spare
RM and RC control recovery sequence
Spare is used for hard-error avoidance
RC : Recovery Controller
Plasma Plasma
Plasma
Spare
Selector + Voter + Detector
Memory (ECC protected)
メモリコントローラ
UART
メモリコントローラ
UART
Memory controller
UART RC ICAP
FrameECC
Memory
ICAP : Internal Configuration Access Port
RM
RM : Recovery Module
Implemented on:Xilinx Virtex-6 XC6VLX240T
After Prof. T. Sueyoshi (Kumamoto Univ.)
Soft-Error Recovery in HW-SVP
Readback and Overwrite reconfiguration (Scrubbing)
ICAP : Internal Configuration Access Port
* Frame : Minimum unit of reconfiguration (1 frame = 2,592bit on Virtex-6)
Readback and error detect
ICAP
Frame ECC ・・・
FPGA
・・・ ・・・ ・・・
(3) Repair readback data
(4) Overwrite same frame
Reconfigure to correct error
Error detected
Apply these sequence for all frame
ICAP
Frame ECC
Frame*
・・・
FPGA
・・・ ・・・ ・・・
(2) Create syndrome
(1) Readback configuration data
After Prof. T. Sueyoshi (Kumamoto Univ.)
ICAP
Hard-Error Recovery in HW-SVP Relocate PRB and separate a broken module
・・・ ・・・ ・・・
FPGA
・・・ ・・・ ・・・
Module_0
Module_1
Module_2
Spare
Hard error
Selector
Voter
Implementing a copy of Module on Spare to reconstruct TMR configuration
* This is realized by uniforming inner configuration of PR region (reported on Dec. 2011)
Readback
Reconfiguration
PRB relocation *
After Prof. T. Sueyoshi (Kumamoto Univ.)