Low-Power Design Techniques in Digital Systems
Prof. Vojin G. Oklobdzija
University of California
November 19, 2003
2
Outline of the Talk
• Power trends in VLSI
• Scaling theory and predictions
• Research efforts in power reduction
• Efficiency measures and design guidelines
• Latches and Flip-Flops for Low-Power
– Dual-Edge FFs
– SOI
• Conclusion: Low-Power perspective
3
Power trends in VLSI
4
“CMOS Circuits dissipate little power by nature. So believed circuit designers”(Kuroda-Sakurai, 95)
“By the year 2000 power dissipation of high-end ICs will exceed the practical limits of ceramic packages, even if the supply voltage can be feasibly reduced.”
(* Taken from Sakurai’s ISSCC 2001 presentation)
[Figure: Power (W), log scale from 0.01 to 100, versus year ('80 to '95); trend: x4 / 3 years]
5
Gloom and Doom predictions
Source: Shekhar Borkar, Intel
6
Source: Shekhar Borkar, Intel
7
[Figure: Power (Watts), 0 to 80, versus Year, 1995.5 to 2000.5, for RISC, x86, Consumer and Dec Alpha processors, each with an exponential trend line]
Exponential fits: y = 3E-97 e^(0.1131x); y = 2E-124 e^(0.1442x); y = 6E-109 e^(0.126x); y = 2E-222 e^(0.2574x)
• High-end growing at 25% / year
• Consumer (low-end) at 13% / year
• x86 @ 15% / yr
• RISC @ 12% / yr
Power versus Year: taken from ISSCC, uP Report, Hot-Chips
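A fitted trend of the form y = A e^(kx) corresponds to an annual growth rate of e^k - 1. The sketch below applies that conversion to the four exponents on this slide. It assumes the fits are listed in the same order as the chart legend (RISC, x86, Consumer, Dec Alpha), which the transcript does not state explicitly.

```python
import math

# Exponents k from the fitted trend lines y = A * exp(k * year).
# Assumption: the fits appear in legend order (RISC, x86, Consumer, Dec Alpha).
FIT_EXPONENTS = {
    "RISC": 0.1131,
    "x86": 0.1442,
    "Consumer": 0.126,
    "Dec Alpha": 0.2574,
}


def annual_growth(k):
    """Fractional growth per year implied by an exponent k."""
    return math.exp(k) - 1.0


growth = {name: annual_growth(k) for name, k in FIT_EXPONENTS.items()}

for name, rate in growth.items():
    print(f"{name}: {rate * 100:.1f}% per year")
```

Under that ordering assumption the rates come out at roughly 12%, 15.5%, 13% and 29% per year. The first three are close to the annotated figures; the steepest fit is somewhat above the 25% quoted for high-end parts.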
8
VDD, Power and Current Trend
[Figure: Voltage [V] (0 to 2.5), Power per chip [W] (0 to 200) and VDD current [A] (0 to 500) versus Year, 1998 to 2014]
International Technology Roadmap for Semiconductors 1999 update sponsored by the Semiconductor Industry Association in cooperation with European Electronic Component Association (EECA) , Electronic Industries Association of Japan (EIAJ), Korea Semiconductor Industry Association (KSIA), and Taiwan Semiconductor Industry Association (TSIA)
(* Taken from Sakurai’s ISSCC 2001 presentation)
9
Power Delivery Problem (not just California)
Source: Shekhar Borkar, Intel
Your car starter!
10
Trend in L di/dt
• di/dt is roughly proportional to I * f, where I is the chip’s current and f is the clock frequency, or I * Vdd * f / Vdd = P * f / Vdd, where P is the chip’s power.
• The trend is: P increases, f increases, Vdd decreases; on-chip L and package L slightly decrease.
• Therefore, L di/dt fluctuation increases significantly.
(* Taken from Norman Chang, HP)
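The proportionality di/dt ~ P * f / Vdd can be turned into a quick relative estimate between two designs. The sketch below is only an illustration: the function name is hypothetical, and the example inputs are the 21164 and 21264 figures quoted later in this talk (50W/300MHz/3.3V and 72W/600MHz/2.0V).

```python
def didt_scale(p_old, p_new, f_old, f_new, vdd_old, vdd_new):
    """Relative change in di/dt between two designs, using di/dt ~ P * f / Vdd."""
    return (p_new / p_old) * (f_new / f_old) / (vdd_new / vdd_old)


# Example inputs: 21164 (50 W, 300 MHz, 3.3 V) to 21264 (72 W, 600 MHz, 2.0 V).
factor = didt_scale(50.0, 72.0, 300e6, 600e6, 3.3, 2.0)
print(f"di/dt grows by about {factor:.2f}x")
```

With those inputs the estimate is a bit under 5x, before any change in L is taken into account.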
11
ISPEC^2/Watt vs Feature Size (microns)
[Figure: ISPEC^2/Watt (log scale, 1 to 100) versus feature size (0.00 to 0.60 microns), with power-law fit y = 0.3733 x^(-2.5778)]
Energy-Delay product is improving more than 2x / generation
Saving Grace!
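The "more than 2x / generation" claim can be checked against the fitted power law y = 0.3733 x^(-2.5778). The sketch below assumes that one generation is a 0.7x linear shrink; that factor is an assumption here, not something this slide states.

```python
FIT_COEFF = 0.3733   # coefficient of the fitted power law
FIT_EXP = -2.5778    # exponent of the fitted power law


def ispec2_per_watt(feature_um):
    """ISPEC^2/Watt predicted by the fit for a feature size in microns."""
    return FIT_COEFF * feature_um ** FIT_EXP


def gain_per_generation(shrink=0.7):
    """Improvement factor for one shrink step (assumed 0.7x per generation)."""
    return shrink ** FIT_EXP


gain = gain_per_generation()
print(f"Improvement per 0.7x shrink: {gain:.2f}x")
```

Under the 0.7x assumption the fit gives roughly 2.5x per generation, consistent with the slide's annotation.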
12
ISPEC^2/Watt vs Year
[Figure: ISPEC^2/Watt (0 to 100) versus Year, 1995 to 2001, for Consumer, x86 and Server processors]
• x86 efficiency improving dramatically: 4X / generation
• Average improving 3X / generation
• High-end processors: efficiency not improving
13
Scaling theory and predictions
14
The power dissipation has increased 1000 times over the 15 years and is exceeding 70 Watts
Scaling principles:
1. A “constant field scaling” theory [Dennard] assumes that device voltages as well as device dimensions are scaled by a scaling factor x (>1), resulting in a constant electric field in a device. Power density remains constant, and circuit performance can be improved in terms of:
• density: x^2
• speed: x
• power: 1/x^2
• power-delay product: 1/x^3
Limitless progress in CMOS is promised with this scaling scenario
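The constant-field factors follow from a few first-order relations: capacitance, voltage and current each scale by 1/x, delay is C V / I, and power per gate is V I. The sketch below is a minimal illustration under those idealized assumptions; the function name is hypothetical.

```python
def constant_field_scaling(x):
    """First-order constant-field scaling factors for a scaling factor x > 1."""
    cap = 1.0 / x        # gate capacitance shrinks with dimensions
    voltage = 1.0 / x    # voltages scale with dimensions (constant field)
    current = 1.0 / x    # drive current scales the same way
    delay = cap * voltage / current   # = 1/x
    power = voltage * current         # per gate, = 1/x^2
    density = x ** 2                  # gates per unit area
    return {
        "density": density,
        "speed": 1.0 / delay,
        "power": power,
        "power_density": power * density,
        "power_delay": power * delay,
    }


factors = constant_field_scaling(2.0)
print(factors)
```

For x = 2 this reproduces the list above: density 4x, speed 2x, power 1/4, power-delay product 1/8, with power density unchanged.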
15
In practice neither the supply voltage nor the threshold voltage had been scaled until 1990, leading to the theory of “constant voltage scaling”, which assumes a constant voltage.
This assumption yields:
• speed improvement by x^2
• power density increases rapidly by x^3
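The x^2 and x^3 factors for constant-voltage scaling can be reproduced with a first-order sketch. It assumes a long-channel device whose drive current grows by x as the oxide thins while the voltage stays fixed; that device model and the function name are assumptions made for illustration.

```python
def constant_voltage_scaling(x):
    """First-order constant-voltage scaling factors for a scaling factor x > 1."""
    cap = 1.0 / x     # gate capacitance shrinks with dimensions
    voltage = 1.0     # supply and threshold voltages are not scaled
    current = x       # assumption: long-channel current, Cox grows by x
    delay = cap * voltage / current   # = 1/x^2
    power = voltage * current         # per gate, = x
    density = x ** 2                  # gates per unit area
    return {
        "speed": 1.0 / delay,
        "power_per_gate": power,
        "power_density": power * density,
    }


factors = constant_voltage_scaling(2.0)
print(factors)
```

For x = 2 the speed improves 4x while the power density grows 8x, which is why this scenario cannot be sustained.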
16
Constant field scaling is not realistic; x^0.5 is satisfactory. However, even with that, the power dissipation would exceed ECL by 2001: a new philosophy is required!
(* Taken from Sakurai and Kuroda, IEICE 95 paper)
17
High-Performance View Point on Power
*taken from Ron Preston, DEC Alpha
P = k C V^2 f:
• Shrinking to the new technology (30% reduction in feature size)
– C decreases by 30%
– f increases by 1/0.7 = 43%
– Pnew = 0.7 x (1/0.7) x Pold = Pold (No Change in Power!)
• New design:
– Double the No. of devices
– Pnew = 2 x 0.7 x (1/0.7) x Pold = 2 x Pold (Power Doubles!)
• Scale Vdd by 30% in the new design:
– Pnew = 2 x 0.7 x (1/0.7) x (0.7)^2 x Pold = 0.98 x Pold, approximately Pold (Power stays constant!)
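The three cases on this slide are the same product P = k C V^2 f evaluated with different scale factors. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def relative_power(cap_scale, vdd_scale, freq_scale, device_scale=1.0):
    """Pnew / Pold for P = k * C * V^2 * f, with an optional device-count factor."""
    return device_scale * cap_scale * vdd_scale ** 2 * freq_scale


# Shrink only: C down 30%, f up by 1/0.7, Vdd unchanged.
shrink_only = relative_power(0.7, 1.0, 1 / 0.7)

# New design on the shrunk process: twice the devices.
new_design = relative_power(0.7, 1.0, 1 / 0.7, device_scale=2.0)

# New design with Vdd also scaled by 30%.
scaled_vdd = relative_power(0.7, 0.7, 1 / 0.7, device_scale=2.0)

print(shrink_only, new_design, scaled_vdd)
```

The last case evaluates to 0.98, which the slide rounds to "power stays constant".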
18
High-Performance View Point on Power
*taken from Ron Preston, DEC Alpha
Reality:
Paradigm Changes: More Aggressive Circuits, Toggle rate increasing, Out of Order, Speculative Execution
What to Expect:
• Power will be limited by the package and cooling techniques
• Frequency will be determined by the power: as high as the package can take!
Chip      Process   Vdd     Freq.    Power
21164     0.5u      3.3V    300MHz   50W
21264     0.35u     2.0V    600MHz   72W
Change    -30%      -39%    +100%    +44%
19
Research Efforts in Low-Power Design
Psw = k · CL · Vcc^2 · fCLK
Reduce Switching Activity:
• Conditional clock
• Conditional precharge
• Switching-off inactive blocks
• Conditional execution
Run it slower:
• Use parallelism
• Fewer pipeline stages
• Use double-edge flip-flop
Technology scaling:
• The highest win
• Thresholds should scale
• Leakage starts to bite
• Dynamic voltage scaling
Reduce the active load:
• Minimize the circuits
• Use more efficient design
• Charge recycling
• More efficient layout
20
Reducing the Power Dissipation
• The power dissipation can be minimized by reducing:
• supply voltage
• load capacitance
• switching activity
– Reducing the supply voltage brings a quadratic improvement
– Reducing the load capacitance contributes to the improvement of both power dissipation and circuit speed.
21
Voltage Scaling
There are three means to maintain the throughput:
• Reduce Vth to improve circuit speed
• Introduce parallel and pipelined architecture while using slower device speeds (assumes a limitless no. of transistors; in reality the transistor density is only increasing by 60% per year)
• Prepare multiple supply voltages and for each cluster of circuits choose the lowest supply voltage that satisfies the speed. (A good level converter is necessary, one which exhibits small delay, consumes little power and takes small area)
22
23
Is there an optimal design point ?
24
Power Dissipation and Circuit Delay
Power: P = pt · fCLK · CL · VDD^2 + I0 · 10^(-Vth/S) · VDD
Delay = k · Q / I = k · CL · VDD / (VDD - Vth)^α   (α = 1.3)
[Figure: two surface plots over Vth (V), -0.4 to 0.8, and VDD (V), 1 to 4: Power (W), 0 to 1 x 10^-4, and Delay (s), 0 to 5 x 10^-10; points A and B are marked on both]
(* Taken from T. Sakurai)
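The power and delay expressions on this slide can be evaluated numerically to see the trade-off: lowering Vth speeds the circuit up but raises the leakage term. The sketch below is only illustrative. Every default parameter value (pt, f_clk, c_load, i0, s, k) is a hypothetical placeholder and is not taken from the slide; only the exponent 1.3 is.

```python
def power(vdd, vth, pt=0.5, f_clk=100e6, c_load=1e-12, i0=1e-6, s=0.1):
    """P = pt*fCLK*CL*VDD^2 + I0*10^(-Vth/S)*VDD (all defaults are hypothetical)."""
    dynamic = pt * f_clk * c_load * vdd ** 2
    leakage = i0 * 10 ** (-vth / s) * vdd
    return dynamic + leakage


def delay(vdd, vth, k=1.0e3, c_load=1e-12, alpha=1.3):
    """Delay = k*CL*VDD / (VDD - Vth)^alpha, with alpha = 1.3 as on the slide."""
    if vdd <= vth:
        raise ValueError("the model needs VDD > Vth")
    return k * c_load * vdd / (vdd - vth) ** alpha


for vth in (0.0, 0.2, 0.4):
    print(f"Vth={vth:.1f} V: P={power(1.5, vth):.3e} W, delay={delay(1.5, vth):.3e} s")
```

The printed numbers have no significance beyond the trend, since the constants are placeholders.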
25
Power-Delay Product, Energy-Delay Product
Lowest Voltage – Highest Threshold – no optimum
• Power-Delay Product is a misleading measure; it will always favor a processor that operates at lower frequency
• Energy-Delay is more adequate, but Energy-Delay**2 should be used
(*from Sakurai, Kuroda, IEICE 95 paper)
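A small numerical sweep shows why the choice of metric matters. The sketch below assumes energy per operation proportional to VDD^2 and delay proportional to VDD / (VDD - Vth)^1.3, with a hypothetical fixed Vth of 0.5 V and a hypothetical sweep range; the power-delay product is taken as the energy per operation.

```python
VTH = 0.5     # hypothetical threshold voltage (V)
ALPHA = 1.3   # velocity-saturation exponent used in the delay model


def energy(vdd):
    """Normalized switching energy per operation, proportional to VDD^2."""
    return vdd ** 2


def delay(vdd):
    """Normalized delay, proportional to VDD / (VDD - Vth)^alpha."""
    return vdd / (vdd - VTH) ** ALPHA


def metrics(vdd):
    e, t = energy(vdd), delay(vdd)
    return {"PDP": e, "EDP": e * t, "ED2P": e * t * t}


# Sweep VDD from 0.60 V to 3.30 V in 10 mV steps (hypothetical range).
VOLTAGES = [round(0.60 + 0.01 * i, 2) for i in range(271)]


def best_voltage(metric):
    """Supply voltage in the sweep that minimizes the given metric."""
    return min(VOLTAGES, key=lambda v: metrics(v)[metric])


for name in ("PDP", "EDP", "ED2P"):
    print(name, best_voltage(name))
```

In this model the power-delay product keeps improving all the way down to the lowest voltage in the sweep, so it has no optimum, while Energy-Delay and Energy-Delay**2 each pick an interior operating point, with Energy-Delay**2 settling at the higher voltage.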
26
Power-Delay Product, Energy-Delay Product
Horowitz, Indermaur, Gonzalez argue against Power-Delay, SLPE’94
27
Energy-Delay**2
(*courtesy of Prof. T. Sakurai)
28
Energy-Delay Product vs. Energy-Delay**2
Nowka, Hofstee, Carpenter of IBM argue against Energy-Delay as a design efficiency measure (private communication)
29
Energy-Delay Product vs. Energy-Delay**2
Nowka, Hofstee, Carpenter of IBM argue against Energy-Delay as a design efficiency measure (private communication)
The same design should have relatively the same efficiency
Optimal point: (due to Vth being fixed?)
30
Example: PowerPC
Feature              601+           604              620            Diff.
Frequency (MHz)      100            100              133 (100)      same
CMOS Process         .5u 5-metal    .5u 4-metal      .5u 4-metal    ~same
Cache Total          32KB Cache     16K+16K Cache    64K            ~same
Load/Store Unit      No             Yes              Yes
Dual Integer Unit    No             Yes              Yes
Register Renaming    No             Yes              Yes
Peak Issue           2 + Br         4 Insts          4 Insts        ~double
Transistors          2.8 Million    3.6 Million      6.9 Million    +30% / +146%
SPECint92            105            160              225 (169)      +50% / +61%
SPECfp92             125            165              300 (225)      +30% / +80%
Power                4W             13W              30W (22.5W)    +225% / +463%
Spec/Watt            26.5/31.2      12.3/12.7        7.5/10         -115% / -252%
PF=Watt/Freq**3      4.0E-6         13.0E-6          12.8E-6
(PF/Trans)*E12       1.43           3.61             1.86
IPC                  1.05           1.6              1.69
PE*IPC**3 (*E6)      4.01           12.98            12.69
PE=Watt/Spec**3      3.46E-6        3.17E-6          2.63E-6
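The two efficiency figures in the PowerPC table, PF = Watt/Freq**3 and PE = Watt/Spec**3, can be recomputed from the frequency, SPECint92 and power rows. The sketch below uses only values from the table; the dictionary layout and function names are illustrative.

```python
# Frequency (MHz), SPECint92 and power (W) as listed in the table.
CHIPS = {
    "601+": {"freq_mhz": 100, "specint": 105, "watts": 4.0},
    "604": {"freq_mhz": 100, "specint": 160, "watts": 13.0},
    "620": {"freq_mhz": 133, "specint": 225, "watts": 30.0},
}


def pf(chip):
    """PF = Watt / Freq**3, with frequency in MHz."""
    return chip["watts"] / chip["freq_mhz"] ** 3


def pe(chip):
    """PE = Watt / Spec**3, using SPECint92."""
    return chip["watts"] / chip["specint"] ** 3


for name, chip in CHIPS.items():
    print(f"{name}: PF={pf(chip) * 1e6:.1f}E-6, PE={pe(chip) * 1e6:.2f}E-6")
```

Lower is better for both figures. By PE the 620 is the most efficient of the three even though it draws the most power.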
31
Feature                         Digital 21164   MIPS 10000   PowerPC 620   HP 8000       Sun Ultra-Sparc
Freq                            500 MHz         200 MHz      200 MHz       180 MHz       250 MHz
Pipeline Stages                 7               5-7          5             7-9           6-9
Issue Rate                      4               4            4             4             4
Out-of-Order Exec.              6 lds           32           16            56            none
Register Renam. (int/FP)        none/8          32/32        8/8           56            none
Transistors/Logic transistors   9.3M/1.8M       5.9M/2.3M    6.9M/2.2M     3.9M*/3.9M    3.8M/2.0M
SPEC95 (Intg/FlPt)              12.6/18.3       8.9/17.2     9/9           10.8/18.3     8.5/15
Power                           25W             30W          30W           40W           20W
SpecInt/Watt                    0.5             0.3          0.3           0.27          0.43
1/Energy*Delay                  6.4             2.6          2.7           2.9           3.6
Watt/Freq**3                    0.2E-6          3.75E-6      3.75E-6       6.86E-6       1.28E-6
(PF/Trans)*E12                  0.022           0.64         0.54          1.76          0.34
(PF/LTrans)*E12                 0.11            1.63         1.7           1.76          0.64
Watt/Spec**3                    12.5E-3         42.5E-3      41.5E-3       31.7E-3       32.5E-3
32
Sensitivity to Vth fluctuation
[Figure: Normalized delay (0.6 to 1.8) versus VTH (V), 0 to 1, for VDD = 1.0 V, 1.5 V, 3.0 V and 5.0 V, with ΔVTH = ±0.05 V and ±0.15 V]
(* Taken from T. Sakurai)
33
Use of Different Circuits Families
34
Capacitance Reduction
The load capacitance is the sum of:
• gate capacitance
• diffusion capacitance
• routing capacitance
Using a small number of transistors, or small transistor sizes, contributes to the reduction in the gate capacitance and the diffusion capacitance.
Pass-transistor logic may have an advantage because it comprises fewer transistors and exhibits smaller stray capacitance than conventional static CMOS logic.
35
Pass-Transistor Logic
36
Pass-Transistor Logic: CVSL, CPL, SRPL, DSL, DPL, DCVSPG
37
SAPL: Sense-Amplifying Pass-transistor Logic
All nodes are first discharged and then evaluated by inputs. Outputs are 100mV above GND
38
Where does the power go ?
39
Power use is different from chip to chip:
• MPU1 is a low-end microprocessor
• MPU2 is a high-end CPU with large cache
• ASSP1 is an MPEG-2 decoder
• ASSP2 is an ATM switch
(*from Sakurai, Kuroda, IEICE 95 paper)
40
Design Example: Strong Arm 110
Two power modes: idle and sleep
Power:
• 0.5W using 1.1V internal PS: 184 Dhrystone MIPS @ 162MHz
• 1.1W using 2V internal PS: 245 Dhrystone MIPS @ 215MHz
Power Breakdown:
I-Cache     27%
D-Cache     16%
I-Unit      18%
Exec-Unit   8%
I-MMU       9%
D-MMU       8%
Clock       10%
Others      4% (PLL < 1%)
*from D. Dobberpuhl
41
Design Example: Strong Arm 110
*from D. Dobberpuhl
42
Design Example: Strong Arm 110
However, leakage currents start to affect stand-by power
*from D. Dobberpuhl
43
Controlling both: VDD and VTH for low power
44
Controlling VDD and VTH for low power
Low power -> Low VDD -> Low speed -> Low VTH -> High leakage -> VDD-VTH control
                Active         Stand-by
Multiple VTH    Dual-VTH       MTCMOS
Variable VTH    VTH hopping    VTCMOS
Multiple VDD    Dual-VDD       Boosted gate MOS
Variable VDD    VDD hopping
*) MTCMOS: Multi-Threshold CMOS
*) VTCMOS: Variable Threshold CMOS
• Multiple: spatial assignment
• Variable: temporal assignment
Software-hardware cooperation
Technology-circuit cooperation
(* from Prof. T. Sakurai)
45
Dual-VTH concept
Low-VTH circuit (High leakage)
High-VTH circuit (Low leakage)
Critical paths
Non-critical paths
(* from Prof. T. Sakurai)
46
Clustered Voltage Scaling for Multiple VDD’s
(* from Prof. T. Sakurai)
[Figure: Conventional Design versus CVS Structure, showing the critical path, the flip-flops (FF) and the level-shifting F/Fs; the lower VDD portion is shown as shaded]
M.Takahashi et al., “A 60mW MPEG4 Video Codec Using Clustered Voltage Scaling with Variable Supply-Voltage Scheme,” ISSCC, pp.36-37, Feb.1998.
Once VL is applied to a logic gate, VL is applied to subsequent logic gates until F/F’s to eliminate DC current paths. F/F’s restore VH.
47
Energy consumption is proportional to the square of VDD.
VDD should be lowered to the minimum level which ensures the real-time operation.
[Figure: Normalized power (0.0 to 1.0) versus normalized workload (0.0 to 1.0), for variable Vdd and fixed Vdd]
If you don’t need to hustle, VDD should be as low as possible
(* from Prof. T. Sakurai)
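The gap between the fixed-Vdd and variable-Vdd curves can be illustrated with an idealized model. The sketch assumes that at fixed Vdd the processor runs at full speed and idles for the rest of the time, so power is proportional to workload, and that with variable Vdd the clock is slowed to match the workload with Vdd scaled linearly with frequency. The linear voltage-frequency relation is a simplifying assumption; real circuits have a threshold and a minimum operating voltage.

```python
def fixed_vdd_power(workload):
    """Normalized power at fixed Vdd: full speed for a fraction of the time."""
    if not 0.0 <= workload <= 1.0:
        raise ValueError("workload must be between 0 and 1")
    return workload


def variable_vdd_power(workload):
    """Normalized power with Vdd lowered to just meet the workload.

    Assumption: the required frequency and Vdd both scale linearly with
    workload, so power ~ f * Vdd^2 ~ workload^3.
    """
    if not 0.0 <= workload <= 1.0:
        raise ValueError("workload must be between 0 and 1")
    vdd = workload  # idealized: Vdd proportional to the required frequency
    return workload * vdd ** 2


for w in (0.2, 0.5, 0.8, 1.0):
    print(f"workload={w:.1f}: fixed={fixed_vdd_power(w):.3f}, variable={variable_vdd_power(w):.3f}")
```

At half workload the idealized variable-Vdd power is one quarter of the fixed-Vdd value; the two meet only at full workload.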
48
Measured voltage waveforms
[Figure: measured VDD and sleep-signal waveforms over 1 sync frame (200ms); VDD switches between VDDmax and VDDmin; VDDmax = 8% on average, Sleep = 6% on average]
(* from Prof. T. Sakurai)
49
Measured power characteristics
Total power = 0.8W x 0.08 + 0.16W x 0.86 + 0.07W x 0.06 = 0.2W
VDD hopping can cut down power consumption to 1/4
[Figure: Power P [W] (0 to 1) versus supply voltage VDD [V] (0 to 2) at f = 100MHz and f = 200MHz; marked points: 0.8W (time for VDDmax: 8%), 0.16W (time for VDDmin: 86%, down to 1/5) and 0.07W (time for sleep: 6%)]
(* from Prof. T. Sakurai)
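The total on this slide is a time-weighted average of the power in each mode. A minimal sketch of the arithmetic, using only the figures from the slide:

```python
# (mode, power in watts, fraction of time) as given on the slide.
MODES = [
    ("VDDmax", 0.8, 0.08),
    ("VDDmin", 0.16, 0.86),
    ("sleep", 0.07, 0.06),
]

total_power = sum(power * fraction for _, power, fraction in MODES)
ratio_to_max = total_power / 0.8   # compared with always running at VDDmax

print(f"average power = {total_power:.4f} W")
print(f"relative to VDDmax operation = {ratio_to_max:.3f}")
```

The average works out to about 0.206W, which the slide rounds to 0.2W, or roughly one quarter of the 0.8W drawn at VDDmax.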
50
Simulation results
[Figure: Normalized power P/PFIX versus transition delay TTD (ms), 0.0 to 1.0, in two panels: MPEG-2 video decoding (0.00 to 0.40) and VSELP speech encoding (0.00 to 0.32). Curves: RPC with 2 levels (f, f/2), 3 levels (f, f/2, f/3), 4 levels (f, f/2, f/3, f/4) and infinite levels, plus post-simulation analysis]
(* from Prof. T. Sakurai)
51
Aggressive Voltage Scaling
If we can dynamically scale Vdd and Vth the advantage is obvious
*Taken from Kuroda
52
Example
53
TransMeta Example
*Taken from Doug Laird’s presentation, January 19th, 2000
54
TransMeta Example
*Taken from Doug Laird’s presentation, January 19th, 2000
55
TransMeta Example
• “Code Morphing” is another contributor to power reduction since it eliminates unnecessary external memory access
*Taken from Doug Laird’s presentation, January 19th, 2000
56
TransMeta Example
57
Latches and Flip-Flops for Low-Power
58
Simulation Condition and Testbench
Timing:
• Total FF overhead is setup + clock-to-output time
• Circuit optimization towards td-q
• Clock skew robustness obtained from observing DQ curve
• td = min(tD-Q)
Power-Delay Product:
• Overall performance parameter at fixed frequency
• At fixed f, PDP = Pdiss · td is proportional to EDP
[Figure: testbench; Data In and Clock drive the flip-flop under test (D, Clk, Q), with 14X minimum-inverter loads]
59
Flip-Flop Performance Comparison
• Total power consumed
– internal power
– data power
– clock power
• Measured for four cases
– no activity (0000… and 1111…)
– maximum activity (0101010..)
– average activity (random sequence)
Test bench
Delay is (minimum D-Q): Clk-Q + Setup time
[Figure: test bench; Data and Clock drive the flip-flop (D, Clk), with capacitive loads of 50 fF, 200 fF and 200 fF]
60
OLD TEST BENCH:
• Total Power = Drivers Power + Test Unit Power
• PDP- Optimized = Equal Trade-off on Power and Delay
• Improper Load on Drivers
NEW TEST BENCH:
• Drivers: Fixed Gain and Driving Test Unit Only
• Data-to-Output Delay
• PD^2P Optimized = Best for Constant-Field Scaling
[Figure: schematics of the old test bench and the new test bench]
61
Comparison in terms of speed and EDPtot
Technology: 0.2u, Vdd=2V, T=20°C, measured @ 100MHz
• Delay
  – below 200ps
    • SDFF 187ps
    • HLFF 199ps
    • K-6 ETL 200ps
  – 200-300ps
    • PowerPC latch 266ps
    • 21264 Alpha FF 272ps
    • Strong Arm FF 275ps
    • mC2MOS latch 292ps
  – above 500ps
    • SSTC latch 592ps
    • DSTC latch 629ps
    • SSTC* latch 898ps
    • DSTC* latch 1060ps
• PDPtot @100MHz
  – below 30fJ
    • PowerPC latch 28fJ
  – 30 - 50fJ
    • HLFF 29fJ
    • SDFF 39fJ
    • mC2MOS latch 40fJ
    • 21264 Alpha FF 43fJ
    • Strong Arm FF 45fJ
  – 50 - 70fJ
    • K-6 ETL 70fJ
  – above 70fJ
    • SSTC latch 95fJ
    • DSTC latch 125fJ
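The delay and PDPtot figures above can be combined to rank the structures and to back out the total power each one implies. The sketch below assumes that PDPtot is simply total power multiplied by delay, in line with the power-delay product defined on the testbench slide; the data layout and function names are illustrative.

```python
# (delay in ps, PDPtot in fJ) for the structures that appear in both lists.
FLIP_FLOPS = {
    "SDFF": (187, 39),
    "HLFF": (199, 29),
    "K-6 ETL": (200, 70),
    "PowerPC latch": (266, 28),
    "21264 Alpha FF": (272, 43),
    "Strong Arm FF": (275, 45),
    "mC2MOS latch": (292, 40),
    "SSTC latch": (592, 95),
    "DSTC latch": (629, 125),
}


def implied_power_uw(name):
    """Total power in microwatts, assuming PDPtot = power * delay."""
    delay_ps, pdp_fj = FLIP_FLOPS[name]
    return pdp_fj / delay_ps * 1000.0   # fJ / ps = mW, so scale to uW


fastest = min(FLIP_FLOPS, key=lambda n: FLIP_FLOPS[n][0])
lowest_pdp = min(FLIP_FLOPS, key=lambda n: FLIP_FLOPS[n][1])

print("fastest:", fastest)
print("lowest PDPtot:", lowest_pdp)
print("HLFF implied power [uW]:", round(implied_power_uw("HLFF")))
```

The fastest structure (SDFF) is not the one with the lowest PDPtot (the PowerPC latch), which is the trade-off the ranking slides go on to show.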
62
Delay comparison
• F-F design brings the fastest structures
[Figure: bar chart of Delay [ps] (0 to 350) for SDFF, HLFF, K6, PowerPC, Alpha 21264 FF, Strong Arm FF and mC2MOS]
63
Delay comparison
• F-F design brings the fastest structures
[Figure: two bar charts of Delay [ps]: 0 to 350 for SDFF, HLFF, PowerPC and mC2MOS; 0 to 700 for K6, SA-F/F, StrongArm, SSTC and DSTC]
64
Overall ranking: PDPtot ranges
[Figure: bar chart of PDPtotal [fJ] (0 to 350) @100MHz for HLFF, SDFF, PowerPC, mC2MOS, StrongArm, Alpha 21264, K6, SSTC, DSTC, SSTC* and DSTC*; activity = 0.25, equal transition probability]
• EDPtot accepted as the overall cost function
• Proposed “low-power” latches from Yuan & Svensson, compared with other presented structures, do not show advantage (the optimization was not properly done; optimization is yet to be repeated under a different setup)
65
Overall ranking, zoomed
[Figure: bar chart of PDPtot [fJ] (0 to 80) for HLFF, SDFF, PowerPC, mC2MOS, Strong Arm, Alpha 21264 and K6; activity = 0.25, equal transition probability]
• Real signals have the activity between 0 and 1.0
• Precharged hybrid structures are the fastest but their power consumption strongly depends on the probability of “ones”
• More “ones” above the point
66
Overall performance
• Real signals have the activity between 0 and 1.0
• Precharged hybrid structures are the fastest but their power consumption strongly depends on the probability of “ones”
• More “ones” above the point
[Figure: two bar charts of PDPtot [fJ] at activity = 0.5, equal transition probability: 0 to 60 for HLFF, SDFF, PowerPC and mC2MOS; 0 to 160 for SA-F/F, StrongArm110, K6, SSTC and DSTC]
67
Conventional Clk-Q vs. minimum D-Q
• Hidden positive setup time
• Degradation of Clk-Q
[Figure: two scatter plots of Total power [uW] (0 to 400) for HLFF, PowerPC, Strong Arm FF, Alpha 21264 FF, mC2MOS latch, K6 ETL, SSTC, DSTC and SDFF: one versus Delay [ps], 150 to 650, and one versus Clk-Q delay [ps], 100 to 350]
68
Internal Power distribution
• Four sequences characterize the boundaries for internal power consumption
– …010101… maximum
– random, equal transition probability, average
– …111111… precharge activity
– …000000… leakage + internal clock processing
[Figure: bar chart of Internal Power [uW] (0 to 400) by data pattern (random, activity = 0.5; …01010101…, activity = 1; …11111111…, activity = 0; …00000000…, activity = 0) for HLFF, SDFF, PowerPC 603 latch, mC2MOS latch, StrongARM FF, Alpha 21264 FF and K6 ETL]
69
Comparison of Clock power consumption
[Figure: bar chart of local clock power consumption [uW] (0 to 50) for DSTC MS latch, SSTC MS latch, K6 ETL, StrongArm FF, SA-F/F, mC2MOS, PowerPC MS latch, SDFF and HLFF]
70
Using Dual-Edge Flip-Flop (run at ½ of the frequency, save on the power consumed in clock distribution tree)
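The saving comes from the clock tree switching at half the rate while data is still captured at the same rate. The sketch below uses the standard C V^2 f estimate for clock power; the clock-tree capacitance is a hypothetical placeholder, and any extra clock load presented by the dual-edge flip-flops themselves is ignored.

```python
def clock_power(c_clk, vdd, f_clk):
    """Clock distribution power estimate: C * V^2 * f."""
    return c_clk * vdd ** 2 * f_clk


def samples_per_second(f_clk, edges_per_cycle):
    """Data capture rate for a flip-flop sampling on 1 or 2 clock edges."""
    return f_clk * edges_per_cycle


C_CLK = 100e-12   # hypothetical clock-tree capacitance (F)
VDD = 1.8         # supply voltage (V)

single_edge = clock_power(C_CLK, VDD, 500e6)   # single-edge FFs at 500 MHz
dual_edge = clock_power(C_CLK, VDD, 250e6)     # dual-edge FFs at 250 MHz

print(f"clock power ratio (dual/single) = {dual_edge / single_edge:.2f}")
```

Both configurations capture 500 million samples per second, but the dual-edge clock tree dissipates half the power in this estimate.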
71
Dual-Edge vs. Single-Edge Flip-Flops Comparison
[Figure: two bar charts for DETFF1, DETFF2, DETFF3, SDFF, HLFF and PowerPC: Delay [ps] (0 to 400) and Total Power [uW] (0 to 350)]
• Fujitsu 0.18u process; Clock frequency 500MHz (250MHz for Dual Edge FFs)
• Data activity ratio = 0.5
• VDD = 1.8V
• Temp = 25°C
72
Dual-Edge vs. Single-Edge Flip-Flops Comparison
[Figure: three bar charts for DETFF1, DETFF2, DETFF3, SDFF, HLFF and PowerPC: Internal Power [uW] (0 to 300), Clock Power [uW] (0 to 60) and Data Power [uW] (0 to 25)]
• Fujitsu 0.18u process; Clock frequency 500MHz (250MHz for Dual Edge FFs)
• Data activity ratio = 0.5
• VDD = 1.8V
• Temp = 25°C
73
Silicon on Insulator (SOI) Technology
74
SOI Comparison
[Figure: bar charts of Delay [ps] (0 to 70), Internal Power [uW] (0 to 140), Clock Power [uW] (0 to 50), Total Power [uW] (0 to 160) and EDP [fJ] @ 500MHz (0 to 6) for HAL, PowPC, HLFF, SDFF, SAFF and SA 110]
f = 1GHz, activity = 0.5, Le = 0.08 um, VDD = 1.3V, T = 25C
75
In conclusion….
What can we expect that low power will bring to us ?
76
Wearable Computer
77
Wearable Computer
78
Wearable Computer
79
Digital Ink
80
Implantable Computer
81
Bluetooth
82
Mosquito
• Brain: ultra small volume, small number of neuron cells, extremely low power
• Real time image processing, (Artificial) Intelligence, 3D flight control
• Sensor: Infrared, Humidity, CO2
Year 2010
• Extrapolation of the trend with some saturation
• Many important interesting applications: Home, Entertainment, Office, Translation, Health care
Year 2020???
• More assembly technique: 3D
• Combination of bio and semiconductor
Year 2110
• Long lifetime by DNA manipulation
• Bio-computer