Monolithic 3D IC: The Time is Now
Brian Cronquist and Zvi Or-Bach MonolithIC 3D Inc.
2014 Intl. Workshop on Data-Abundant System Technology, April 2014
Agenda:
There is a Bright Future for the Semiconductor Industry, yet, we are reaching an Inflection Point
Monolithic 3D IC – The next generation technology driver What is it?
Monolithic 3D provides more than just scaling
Four paths proposed to enable monolithic 3D ICs Path 3
A few near–term applications 3D FPGA to prototype 2D SoC
Logic Redundancy
Memory on Logic
The Bright Future
The semiconductor opportunity is growing
$15-34 trillion, annual =>~$5T Semi /year
Source: McKinsey Global Institute Analysis 2013
Semiconductor Industry is Facing an Inflection Point
Dimensional Scaling has reached Diminishing Returns
However…
The Current 2D-IC is Facing Escalating Challenges
Lithography is:
• Dominating fab cost
• Dominating device cost and diminishing scaling’s benefits
• Dominating device yield
• Dominating IC development costs
On-chip interconnect is:
• Dominating device power consumption
• Dominating device performance
• Penalizing device size and cost
Cost per transistor is no longer scaling
Moore’s Law has stopped at 28nm
Dinesh Maheshwari, CTO, Memory Products Division at Cypress Semiconductors, ISSCC2014
Embedded SRAM isn’t Scaling Beyond 28nm (1.1x instead of 4x)
eSRAM > 60% of Die Area => End of Dimensional Scaling !
Monolithic 3D – The next generation technology driver
“CEA-Leti Signs Agreement with Qualcomm to Assess Sequential (monolithic)3D Technology” Business Wire December 08, 2013
“Monolithic 3D (M3D) is an emerging integration technology poised to reduce the gap significantly between transistors and interconnect delays to extend the semiconductor roadmap way beyond the 2D scaling trajectory predicted by Moore’s Law.”
Geoffrey Yeap, VP of Technology at Qualcomm, Invited paper, IEDM 2013
Parameter                         TSV        Monolithic
Layer Thickness                   ~50 µm     ~50 nm
Via Diameter                      ~5 µm      ~50 nm
Via Pitch                         ~10 µm     ~100 nm
Wafer (Die) to Wafer Alignment    ~1 µm      ~1 nm
Overall Scale                     microns    nanometers
Monolithic enables 10,000x the vertical connectivity of TSV
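The 10,000x figure is consistent with the via pitches listed above if vias are assumed to sit on a square grid, so that density scales with the inverse square of pitch. The sketch below is only that arithmetic; the square-grid assumption and the function name are illustrative and not from the slides.

```python
# Via pitches from the TSV vs. monolithic comparison, in nanometers.
TSV_PITCH_NM = 10_000        # ~10 µm
MONOLITHIC_PITCH_NM = 100    # ~100 nm


def vias_per_mm2(pitch_nm: float) -> float:
    """Vias per square millimeter, assuming a square grid at the given pitch."""
    vias_per_mm = 1_000_000 / pitch_nm  # 1 mm = 1,000,000 nm
    return vias_per_mm ** 2


# Ratio of areal via density: monolithic over TSV.
ratio = vias_per_mm2(MONOLITHIC_PITCH_NM) / vias_per_mm2(TSV_PITCH_NM)
print(f"vertical connectivity ratio: {ratio:,.0f}x")
```

A 100x finer pitch in each lateral direction gives 100 x 100 = 10,000x more vertical connections per unit area.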
Monolithic 3D version of the same 2D logic core is ½ silicon area and ¼ the footprint
MonolithIC 3D Inc. Patents Pending
22nm node 600MHz logic core
Parameter                        2D-IC      3D-IC (2 Device Layers)   Comments
Metal Levels                     10         10
Average Wire Length              6 µm       3.1 µm
Av. Gate Size                    6 W/L      3 W/L                     Since less wire cap. to drive
Die Size (active silicon area)   50 mm²     24 mm²                    3D-IC: shorter wires, smaller gates, lower die area, wires even shorter. 3D-IC footprint = 12 mm²
Power: Logic                     0.21 W     0.1 W                     Due to smaller gate size
Power: Reps.                     0.17 W     0.04 W                    Due to shorter wires
Power: Wires                     0.87 W     0.44 W                    Due to shorter wires
Power: Clock                     0.33 W     0.19 W                    Due to less wire cap. to drive
Power: Total                     1.6 W      0.8 W
Source: IntSim3D
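As a quick cross-check of the comparison above, the following sketch sums the per-component power figures and computes the area and footprint ratios. All numbers are copied from the slide; the dictionary layout and variable names are illustrative only.

```python
# Power components in watts, as listed for the 22nm 600MHz logic core.
POWER_W = {
    "2D-IC": {"logic": 0.21, "repeaters": 0.17, "wires": 0.87, "clock": 0.33},
    "3D-IC": {"logic": 0.10, "repeaters": 0.04, "wires": 0.44, "clock": 0.19},
}

# Active silicon area and footprint in mm^2, as listed.
AREA_2D_MM2 = 50.0
AREA_3D_MM2 = 24.0
FOOTPRINT_3D_MM2 = 12.0  # two device layers stacked

totals = {name: sum(parts.values()) for name, parts in POWER_W.items()}
power_saving = 1 - totals["3D-IC"] / totals["2D-IC"]
area_ratio = AREA_3D_MM2 / AREA_2D_MM2
footprint_ratio = FOOTPRINT_3D_MM2 / AREA_2D_MM2

for name, total in totals.items():
    print(f"{name} total power: {total:.2f} W")
print(f"power saving: {power_saving:.0%}")
print(f"silicon area ratio: {area_ratio:.2f}, footprint ratio: {footprint_ratio:.2f}")
```

The component sums come to 1.58 W and 0.77 W, which round to the 1.6 W and 0.8 W totals shown, and the area ratios of 0.48 and 0.24 match the "½ silicon area and ¼ the footprint" claim.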
The Monolithic 3D Challenge: Why is it not already in wide use?
Processing on top of copper interconnects should not make the copper interconnect exceed 400°C
• How to bring mono-crystallized silicon on top at less than 400°C
• How to fabricate state-of-the-art transistors on top of copper interconnect and keep the interconnect below at less than 400°C
Misalignment of pre-processed wafer to wafer bonding step is ~1 µm
• How to achieve 100 nm or better connection pitch
• How to fabricate thin enough layer for inter-layer vias of ~50 nm
Layer Transfer (“Ion-Cut”/“Smart-Cut”) The Technology Behind SOI
[Figure: ion-cut layer transfer. Top layer: oxide on p- Si, with hydrogen (H) implant plane. Bottom layer: oxide on p- Si.]
Hydrogen implant of top layer
Flip top layer and bond to bottom layer
Cleave using 400°C anneal or sideways mechanical force. CMP.
Similar process (bulk-to-bulk) used for manufacturing all SOI wafers today
MonolithIC 3D - 3 Classes of Solutions
RCAT – Process the high temperature on generic structures prior to ‘smart-cut’, and finish with cold processes – Etch & Depositions
Gate Replacement (=Gate Last, HKMG) - Process the high temperature on repeating structures prior to ‘smart-cut’, and finish with ‘gate replacement’, cold processes – Etch & Depositions
Laser Annealing – Use short laser pulse to locally heat and anneal the top layer while protecting the interconnection layers below from the top heat
Two Major Semiconductor Trends help make Monolithic 3D Practical NOW
As we have pushed dimensional scaling: The volume of the transistor has scaled
Bulk um-sized transistors FDSOI & FinFet nm transistors
Processing times have trended lower Shallower & sharper junctions, tighter pitches, etc.
=> Much less to heat and for much shorter time
The Semi Industry Annealing Trend with Scaling
The Top Layer has a High Temperature (>1000°C) without Heating the Bottom Layers (<400°C)
Process Window Set to Avoid Damage
Temperature variation at the 20 nm thick Si source/drain region in the upper active layer during laser annealing. Note that the shield layers are very effective in preventing any large thermal excursions in the lower layers
Pulsed Laser Anneal could be used to repair ion-cut lattice damage
[Figure labels: N+ / P- layer (~10-100 nm); oxide; top portion of base wafer (1-4 µm); base wafer (~775 µm)]
Single-crystal Si on quartz (color-coded orientation)
Silicon finger and neck-down region patterned using FIB
Capped with oxide and melted with a 2 ms laser pulse
<001> single crystals achieved due to strong texturing and neck region
[Figure panels: 1 x 10 micron finger; 2 x 10 micron finger]
F. Pease – 2011 CMOS-ET
Monolithic 3D A few near-term applications
3D Configurable Fabric Competitive volume cost with Standard Cell
Ultra Scale Integration Full redundancy at the logic cone level
3D RAM over Logic Low cost 3D scalable 1T SRAM
Mobile is now. Power is key to mobile. What architecture provides the best power efficiency?
[Figure: energy (power) efficiency in MOPS/mW, log scale from 0.01 to 100,000, versus chip number (1 to 20), grouped as Microprocessors, Programmable DSP, and Dedicated. Labeled chips: Computational Photography, Video Decoder, SDR Processor, Recognition Processor, Application Processor, 24 Core Processor, Fujitsu 4 Core FR550, Intel Sandy Bridge, Mobile Processors (Atom, Snapdragon, Exynos, OMAP), DSP Processor (7.5 – 2011), SVD (90nm). Numbers in parentheses (9.4, 9.5, 9.6, 9.8, 3.6) are 2013 ISSCC paper numbers. The efficiency range spans 100,000X.]
Berkeley Wireless Research Center Brodersen, Chandrakasan, Markovic 2013 S3S
Trend and Opportunity
The market action has been clear – The market chose flexibility (more embedded SW) and paid with efficiency
Since companies can't risk missing the market window, flexibility is too important
Existing FPGA architectures are expensive and power inefficient
⇒ There is a huge opportunity for a new programmable fabric providing ~1:1000 reduction in power, if it could be compatible to Std Cell (~1:2)
The Via Configurable Fabric (ex. eASIC, Triad)
It has been validated that via configurable vs. Std Cell provides:
• Power: x 1-2
• Cost: x 2
• Speed: x 0.65-0.85
FPGA Achilles’ Heel – PIC (Programmable Interconnect) >30x Area vs. Antifuse/Masked Via
Average area ratio of connectivity element >30
Current FPGAs use primarily pass transistors with a driver.
Via connectivity element @ 45 nm: via pitch 0.2 µm, area 0.2 x 0.2 = 0.04 µm²
SRAM FPGA connectivity elements @ 45 nm:
Element        Area         Ratio to AF
Bidi buffer    ≈ 4 µm²      ≈ 100
TS buffer      ≈ 2 µm²      ≈ 50
Pass gate      ≈ 0.5 µm²    ≈ 12
[Figure labels: SRAM bit (x4); 4X, 10X, 4X, 4X; Via/AF at 0.2 µm]
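The ratios quoted for the SRAM-based connectivity elements follow directly from dividing each element area by the 0.04 µm² via area. A minimal sketch of that arithmetic, using only the areas given on the slide (the variable names are illustrative):

```python
# Via / antifuse connectivity element at 45 nm: 0.2 µm pitch on each side.
VIA_PITCH_UM = 0.2
VIA_AREA_UM2 = VIA_PITCH_UM * VIA_PITCH_UM  # 0.04 µm^2

# Approximate areas of SRAM-controlled FPGA connectivity elements at 45 nm.
SRAM_ELEMENT_AREA_UM2 = {
    "bidi buffer": 4.0,
    "TS buffer": 2.0,
    "pass gate": 0.5,
}

# Area of each element relative to a single via.
ratios = {
    name: area / VIA_AREA_UM2
    for name, area in SRAM_ELEMENT_AREA_UM2.items()
}

for name, ratio in ratios.items():
    print(f"{name}: about {ratio:.1f}x the via area")
```

The pass gate works out to roughly 12.5x, which the slide rounds to 12; the buffers come out at about 100x and 50x.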
The Twin - Field-programmable & via-configurable fabric
Prototype Phase:
• Prototype volumes
• Prototype costs
• Std. logic w/mono-3D
• OTP till design & functions stabilized
• Specific foundry
Production Phase:
• Production volumes
• Production costs
• Standard logic process
• 1-Mask customization
• Backup-foundry capable
Anti-fuses
HV Programming Transistors Vp
Logic Redundancy 100x Integration Made Possible
It is well known that the more we can integrate on one chip with reasonable yield, the better the cost & performance – Moore’s Law
Yield is the dominating criterion for deciding when to use PCB rather than on-chip integration
‘Wafer Scale Integration’ – 99.99% Yield with 3D Redundancy
Swap at logic cone granularity
Negligible design and power penalty
Redundant 1µ above, no performance penalty
Gene Amdahl -“Wafer scale integration will only work with 99.99% yield, which won’t happen for 100 years” (Source: Wikipedia)
Server-Farm in a Box
Watson in a Smart Phone
…
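To illustrate why swapping at logic cone granularity can raise yield so sharply, here is a minimal sketch under a simple independent-defect model. The model, the cone count, and the defect probability are all assumptions chosen for illustration; none of them come from the slides, and the result is not a reproduction of the 99.99% figure.

```python
import math


def chip_yield(num_cones: int, cone_fail_prob: float, redundant: bool) -> float:
    """Probability that every logic cone position has at least one good copy.

    Assumes defects are independent between cones and between the two
    copies of a cone (a hypothetical model, not taken from the source).
    """
    # With a redundant copy stacked above, a position fails only if both fail.
    position_fail = cone_fail_prob ** 2 if redundant else cone_fail_prob
    # (1 - p)^N computed via log1p for accuracy when p is tiny.
    return math.exp(num_cones * math.log1p(-position_fail))


# Hypothetical numbers: one million cones, one-in-a-million defect rate each.
NUM_CONES = 1_000_000
CONE_FAIL_PROB = 1e-6

y_single = chip_yield(NUM_CONES, CONE_FAIL_PROB, redundant=False)
y_redundant = chip_yield(NUM_CONES, CONE_FAIL_PROB, redundant=True)
print(f"yield without redundancy: {y_single:.3f}")
print(f"yield with redundancy:    {y_redundant:.6f}")
```

Under these assumed numbers the non-redundant chip yields well under half, while the redundant one yields above 99.99%, because the per-position failure probability is squared.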
Memory on Logic Multiple Layers Processed Simultaneously Leveraged Cost Reduction (Nx) –”BICS”
Multiple thin layers can be processed simultaneously, forming transistors on multiple layers
Cost reduction correlates with number of layers being simultaneously processed (2, 4, 8, 16, 32, 64, 128, ...)
Monocrystalline silicon enables MLC, thereby fewer layers needed
3D DRAM 3.3x Cost Advantage vs. 2D DRAM
Parameter                Conventional stacked capacitor DRAM   Monolithic 3D DRAM with 4 memory layers
Cell size                6F²                                   Since non self-aligned, 7.2F²
Density                  x                                     3.3x
Number of litho steps    26 (with 3 stacked cap. masks)        ~26 (extra masks for memory layers, but no stacked cap. masks)
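The 3.3x density figure can be reproduced from the cell sizes and layer count above. The sketch below does that arithmetic; reading the density gain as a cost advantage additionally assumes that cost tracks the roughly equal litho step counts (26 vs. ~26), which is an interpretation rather than something the slide states outright.

```python
# Cell sizes in units of F^2 (F = minimum feature size), from the comparison.
CELL_2D_F2 = 6.0      # conventional stacked capacitor DRAM
CELL_3D_F2 = 7.2      # monolithic 3D DRAM cell (non self-aligned)
MEMORY_LAYERS = 4     # stacked memory layers in the 3D case

# Bits per unit footprint relative to 2D: more layers, slightly larger cell.
density_gain = MEMORY_LAYERS * CELL_2D_F2 / CELL_3D_F2
print(f"density gain vs. 2D DRAM: {density_gain:.1f}x")
```

Four layers of a cell that is 20% larger than the 2D cell give 4 x 6 / 7.2, or about 3.3x the bits per unit footprint.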
(2D) INTEGRATED CIRCUIT
Kilby version: 2D Interconnects not integrated, big sizes
Noyce version (Monolithic 2D): 2D Interconnects integrated, small sizes
Will history repeat itself? Will the monolithic idea become prevalent for 3D too?
3D INTEGRATED CIRCUIT
3D-TSV: 3D Interconnects not integrated, big sizes
Monolithic 3D: 3D Interconnects integrated, small sizes
Summary
Scaling Up is where the industry must go
• Economics driven, more important than technology
Interconnect is the Key
• Vertical interconnect density = horizontal density
Monolithic 3D is the Future
• Do it in Silicon now
• Use the existing infrastructure
• Follow with the ‘right’ next material and device set
Design with monolithic 3D in mind
Back Ups
EDA at the End of Moore’s Law* Bob Colwell, Director MTO, DARPA
*CRA/CCC & ACM SIGDA, Pittsburgh, March 2013
“The End of Roadmap” – Globalfoundries July 2013
Moore’s Law has stopped at 28nm
Embedded SRAM isn’t Scaling Beyond 28nm eSRAM > 60% of Die Area => End of Dimension Scaling !
*imec’s 2013 International Technology Forum
Cisco sees $19 Trillion opportunity in IoT
“CES LIVE: Cisco's Chambers Says Internet of Everything, $19 Trillion Opportunity, Is Next Big Thing” 1/7/14
<http://www.forbes.com/sites/connieguglielmo/2014/01/07/ces-live-cisco-ceo-chambers-to-deliver-keynote/>
$19 trillion: that’s the opportunity he says for the Internet of Everything in the private and public sector combined. Breakout is $14.4 trillion in private sector and $4.6 trillion in public sector of new revenue generation or new savings. That’s a conservative number he says for public sector.
“This will be bigger than anything done in high tech in a decade.”
“As many as 50 billion devices will be connected to the Internet by 2020, creating a $14.4 trillion business opportunity” said Rob Lloyd, president of sales and development at Cisco, <http://www.eetimes.com/electronics-news/4409928/Cisco-sees--14-trillion-opportunity-in-Internet-of-Things>
Designers Elect Older Nodes (for New Designs)
FPGAs Trending Down – EE Times embedded systems annual survey http://www.eetimes.com/document.asp?doc_id=1322014
Monolithic 3D – The next generation technology driver
Monolithic 3D provides more than just scaling Advantages and Applications
II. Reduction of die size and power – doubling transistor count - Extending Moore’s law
Monolithic 3D is far more than just an alternative to 0.7x scaling !!!
III. Significant advantages from using the same fab, design tools
IV. Heterogeneous Integration
V. Multiple layers Processed Simultaneously - Huge cost reduction (Nx)
VI. Logic redundancy => 100x integration made possible
VII. Enables Modular Design
VIII. Naturally upper layers are SOI
IX. Local Interconnect above and below transistor layer
X. Re-Buffering global interconnect by upper strata
XI. Others
A. Image sensor with pixel electronics
B. Micro-display
The Monolithic 3D Advantage
Reduction of Die Size & Power – Doubling Transistor Count Extending Moore’s law
Reduction of Die Size & Power IntSim v2.0 free open source >600 downloads
Repeater count increases exponentially with scaling
At 45nm, repeaters >50% of total leakage power of chip [IBM].
Future chip power, area could be dominated by interconnect repeaters [Saxena P., et al. (Intel), TCAD, 2004]
IV. Heterogeneous Integration
Logic, Memories, I/O on different strata
• Optimized process and transistors for the function
• Optimizes the number of metal layers
• Optimizes the litho. (spacers, older node)
Low power, high speed (sequential, combinatorial)
Different crystals – E/O
V. Multiple Layers Processed Simultaneously - Huge Cost Reduction (Nx) –”BICS”
VI. Logic Redundancy => 100x Integration Made Possible
VII. Enables Modular Design
Platform-based design could evolve to:
• Few layers of generic functions like compute, radios, and one layer of custom design
• Few layers of logic and memories and one layer of FPGA
• ...
VIII. Upper Layers are naturally SOI
SOI wafers provide many benefits with one major drawback: cost of the blank wafer.
In monolithic 3D, all the upper strata are naturally SOI, at no cost
IX. Local Interconnect - Above and Below Transistor Layer
Increased complexity requires increased connectivity. Adding more metal layers increases the challenge of connecting upper layers to the transistor layer below.
Intel March, 2013
X. Re-Buffering Global Interconnect by Upper Strata
Global interconnect is done at the upper and thicker metal layers. It would increase efficiency if these layers could re-buffer instead of connecting to base layer using multiple vias and blocking multiple metal tracks. ⇒Use the layers above for re-buffering.
XI. Others A. Image Sensor with Pixel Electronics
With rich vertical connectivity, every pixel of an image sensor could have its own pixel electronics underneath
XI. Others B. Micro-display
Use of three crystal layers to form RGB LED arrays with drive electronics underneath
Step 1. Donor Layer Processing
Step 1 - Implant and activate unpatterned N+ and P- layer regions in standard donor wafer at high temp. (~900°C) before layer transfer. Oxidize (or CVD oxide) top surface.
[Figure labels: N+ / P- / P- stack; SiO2 oxide layer (~100 nm) for oxide-to-oxide bonding with device wafer]
Step 2 - Implant H+ to form cleave plane for the ion cut
[Figure labels: N+ / P- / P- stack; H+ implant cleave line in N+ or below]
Step 3 - Bond and Cleave: Flip Donor Wafer and Bond to Processed Device Wafer
Cleave along H+ implant line using 400°C anneal or sideways mechanical force. Polish with CMP.
[Figure labels: N+ / P- silicon (<200 nm); SiO2 bond layers on base and donor wafers (alignment not an issue with blanket wafers); Processed Base IC]
Step 4 - Etch and Form Isolation and RCAT Gate
• Litho patterning with features aligned to bottom layer
• Etch shallow trench isolation (STI) and gate structures
• Deposit SiO2 in STI
• Grow gate with ALD, etc. at low temp (<350°C oxide or high-K metal gate)
Advantage: Thinned donor wafer is transparent to litho, enabling direct alignment to device wafer alignment marks: no indirect alignment (common for TSV 3DIC).
[Figure labels: +N / P-; gate oxide; gate; isolation (Ox); Processed Base IC]
Step 5 – Etch Contacts/Vias to Contact the RCAT
Complete transistors, interconnect wires on ‘donor’ wafer layers
Etch and fill connecting contacts and vias from top layer aligned to bottom layer
[Figure labels: +N / P-; Processed Base IC]
One path to solving the dopant activation problem: Recessed Channel Transistors with Activation before Layer Transfer
Idea 1: Activate dopants before layer transfer
Idea 2: Recessed channel transistors @ sub-400°C
Idea 3: Thin-film Si, perfect alignment. TSVs minimum feature size.
• Minimum feature size TSVs
• All steps after layer transfer to Cu/low k @ < 400°C!
Layer transfer of un-patterned film. No alignment issues.
[Figure labels: p- Si wafer; oxide; H; n+; p; n+ Si; p Si]
Fabricate Standard Dummy Gates with Oxide and Poly-Si; >900°C, on Donor Wafer
[Figure labels: ~700 µm donor wafer; PMOS, NMOS; silicon; poly; oxide]
Implant Hydrogen for Cleave Plane
[Figure labels: ~700 µm donor wafer; PMOS, NMOS; silicon; H+]
Bond Donor Wafer to Carrier Wafer
[Figure labels: ~700 µm donor wafer; silicon; H+; ~700 µm carrier wafer]
Deposit Oxide, ox-ox Bond Carrier to Base Wafer
[Figure labels: ~700 µm carrier wafer; STI; oxide-oxide bond; PMOS, NMOS; transferred donor layer; base wafer]
Remove Carrier Wafer
[Figure labels: oxide-oxide bond; transferred donor layer; PMOS, NMOS; base wafer]
Replace Dummy Gate with Hafnium Oxide & HK Metal Gate (at low temp.)
Note: Replacing the oxide and gate results in an oxide and gate that were not damaged by the H+ implant
Add Interconnect
[Figure labels: oxide-oxide bond; transferred donor layer; PMOS, NMOS; base wafer; ILV]
Novel Alignment Scheme using Repeating Layouts
Even if misalignment occurs during bonding, repeating layouts allow correct connections.
[Figure labels: bottom layer layout; top layer layout; landing pad; through-layer connection; oxide]
Above representation simplistic (high area penalty).
Smart Alignment Scheme
[Figure labels: bottom layer layout; top layer layout; landing pad; through-layer connection; oxide]
Two Types of 3D Technology
3D-TSV: Transistors made on separate wafer @ high temperature, then thin + align + bond. TSV pitch > 1 µm*. [Figure dimension: 10 µm - 50 µm]
Monolithic 3D: Transistors made monolithically atop wiring (@ sub-400°C for logic). TSV pitch ~ 50-100 nm. [Figure dimension: 100 nm]
* [Reference: P. Franzon: Tutorial at IEEE 3D-IC Conference 2011]
Ion-cut is great, but will it be affordable? Aren’t ion-cut SOI wafers much costlier than bulk Si today?
• Until 2012: Single supplier SOITEC. Owned basic patent on ion-cut
• Our industry sources + calculations: $60 ion-cut cost per $1500-$5000 wafer in a free market scenario (ion cut = implant, bond, anneal).
• Free market scenario: after 2012, when SOITEC’s basic patent expires
• SiGen and Twin Creeks Technologies using ion-cut for solar
Contents: Hydrogen implant, Cleave with anneal
SOITEC basic patent expired Sep 2012
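To put the quoted $60 ion-cut cost in proportion, the sketch below expresses it as a fraction of the $1500-$5000 wafer cost range given above. Only the three dollar figures come from the slide; the variable names are illustrative.

```python
# Figures quoted on the slide (US dollars).
ION_CUT_COST = 60.0
WAFER_COST_LOW = 1500.0
WAFER_COST_HIGH = 5000.0

# Ion-cut cost as a fraction of total wafer cost at both ends of the range.
share_at_low = ION_CUT_COST / WAFER_COST_LOW
share_at_high = ION_CUT_COST / WAFER_COST_HIGH

print(f"ion-cut share of a $1500 wafer: {share_at_low:.1%}")
print(f"ion-cut share of a $5000 wafer: {share_at_high:.1%}")
```

On these numbers the ion-cut step adds between roughly 1% and 4% to the wafer cost.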
Laser Spike Annealing
Types
Major Thermal Process Steps in a Modern IC Process
Step                       Purpose            Thermal Budget          Replaceable?   With Laser?
Si3N4 LPCVD                STI and well       30 min – 700 °C         Yes            No
Liner Ox                   STI                10 min – 800 °C         Yes            No
TEOS Densification         STI                20 min – 1000 °C        Yes            Yes
Implant Activation         Well               20 sec – 1000 °C        Yes            Yes
Dummy Ox                   Gate               2 min – 800 °C          Yes            No
Dummy a-Si deposition      Gate               20 min – 600 °C         Yes            No
Selective epi deposition   SiGe and SiC S/D   20 min – 700 °C         Yes            No
Silicide formation         S/D contact        5 min – 400 °C          Yes            Yes
Implant Activation         S/D, halo, Vt      5 sec – 950 °C, +LSA    No             Yes
BIL oxide+N                Gate               5 min – 825 °C          Yes            No
HfO2 post ALD              Gate               30 sec – 700 °C         Yes            Yes
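One way to read the table above is as a check against the 400°C limit on processing over copper interconnect: every step above that limit needs either a replacement process or laser annealing. The sketch below encodes the table and runs that check. Treating "Replaceable?" and "With Laser?" as independent yes/no options is an assumed interpretation, and the type and variable names are illustrative.

```python
from typing import NamedTuple


class ThermalStep(NamedTuple):
    step: str
    purpose: str
    peak_c: int        # peak temperature in °C, from the Thermal Budget column
    replaceable: bool  # "Replaceable?" column
    with_laser: bool   # "With Laser?" column


STEPS = [
    ThermalStep("Si3N4 LPCVD", "STI and well", 700, True, False),
    ThermalStep("Liner Ox", "STI", 800, True, False),
    ThermalStep("TEOS Densification", "STI", 1000, True, True),
    ThermalStep("Implant Activation", "Well", 1000, True, True),
    ThermalStep("Dummy Ox", "Gate", 800, True, False),
    ThermalStep("Dummy a-Si deposition", "Gate", 600, True, False),
    ThermalStep("Selective epi deposition", "SiGe and SiC S/D", 700, True, False),
    ThermalStep("Silicide formation", "S/D contact", 400, True, True),
    ThermalStep("Implant Activation", "S/D, halo, Vt", 950, False, True),
    ThermalStep("BIL oxide+N", "Gate", 825, True, False),
    ThermalStep("HfO2 post ALD", "Gate", 700, True, True),
]

LIMIT_C = 400  # copper interconnect below should stay at or under this

# Steps whose peak temperature exceeds the limit.
hot = [s for s in STEPS if s.peak_c > LIMIT_C]
# Hot steps with no replacement process that can be done with laser annealing.
laser_only = [s for s in hot if not s.replaceable and s.with_laser]
# Hot steps with neither option; these would block a sub-400°C flow.
blocked = [s for s in hot if not s.replaceable and not s.with_laser]

print(f"{len(hot)} of {len(STEPS)} steps exceed {LIMIT_C} °C")
print("laser only:", [f"{s.step} ({s.purpose})" for s in laser_only])
print("blocked:", [s.step for s in blocked])
```

Under this reading, the S/D implant activation is the only step that depends on laser annealing alone, and no step is left without an option.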
IntSim: The CAD tool used for our simulation study [D. C. Sekar, J. D. Meindl, et al., ICCAD 2007]
IntSim v1.0: Built at Georgia Tech in Prof. James Meindl’s group
IntSim v2.0: Extended IntSim v1.0 to monolithic 3D using 3D wire length distribution models in the literature
Open-source tool, available for use at www.monolithic3d.com