Combination of Multiple Mechanism for Post-Silicon Reliability Prediction
April 30, 2014 1
Combination of Multiple Mechanism for Post-Silicon Reliability Prediction
April 30, 2014
Joseph B. Bernstein, Ofir Delly, Moti Gabbay
Ariel University
Yizhak Bot (BQR) [email protected]
April 30, 2014 2
We always try to learn from the past in order to improve the future.
One problem: everyone sees the past differently!
April 30, 2014 3
“It is possible to fail in many ways... while to succeed is possible only in one way…” Aristotle
“If we don’t learn from the past, we are condemned to repeat it…” George Santayana (1863-1952)
April 30, 2014 4
SO, WHAT’S THE BIG PROBLEM?
WHY IS LIFETIME PREDICTION SO DIFFICULT?
April 30, 2014 5
The Semiconductor Test Industry Today
We test the parts “blindly” and then “see how they run”…
April 30, 2014 6
Field Data Analysis Results
β = 1 ± 0.2 for all systems.
Field failures are generally constant-rate occurrences; β = 1 is Poisson.
[Figure: cumulative data for over 10,000,000 military electronic systems; labeled regions: Physics of Failure, MTBF Region]
So, we should keep MTBF and FIT
April 30, 2014 7
Some Observations:
• Modern electronics have a nearly constant failure rate
• Few (very rare) exceptions
• Keep the idea of constant rate and work within the framework of Failure-In-Time (FIT)
April 30, 2014 8
So what are the problems with FIT?
Handbooks are pretty outdated:
o MIL 217 is OLD and USELESS.
o FIDES is updated but applies only a single-mechanism approach.
o The Physics of Failure (PoF) approach looks at time to failure (TTF), not FIT.
o Probabilistic DfR requires unique distributions for each mechanism.
o HALT/HASS cannot predict λ.
April 30, 2014 9
JEDEC Publication JEP 122G, Rev. Oct. 2011
I bet you didn’t know JEDEC says this:
2 Terms and definitions (cont’d)
quoted failure rate: The predicted failure rate for typical operating conditions. (This is the FIT.)
NOTE: The quoted failure rate is calculated from the observed failure rate under accelerated stress conditions multiplied by an accelerated factor; e.g. …
“When multiple failure mechanisms and thus multiple acceleration factors are involved, then a proper summation technique, e.g., sum-of-the-failure rates method, is required.”
April 30, 2014 10
Semiconductor Industry ‘Joke’: The Magical Mysterious Decreasing FIT
[Figure panels: Intel, Maxim, PLX]
1 FIT = 1 failure per 10,000 parts in 12 years. If ONLY this were true!
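The rule of thumb above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the usual definition of FIT as failures per 10^9 device-hours and a year of 8,760 operating hours; neither is spelled out on the slide, and the function name is only illustrative.

```python
HOURS_PER_YEAR = 8760  # assumption: 365 days of continuous operation

def fit_from_counts(failures, parts, hours_per_part):
    """Failures per 1e9 device-hours (FIT), assuming a constant failure rate."""
    device_hours = parts * hours_per_part
    return failures * 1e9 / device_hours

# One failure among 10,000 parts, each running for 12 years
fit = fit_from_counts(failures=1, parts=10_000, hours_per_part=12 * HOURS_PER_YEAR)
print(f"{fit:.3f} FIT")  # about 0.95, i.e. roughly 1 FIT
```

Under these assumptions 10,000 parts over 12 years accumulate about 1.05 billion device-hours, which is why the slide's shorthand works.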
April 30, 2014 11
Measured Component FIT (λ) vs. Year Produced
• Compared to previous avionic system data, the trend continues at a much greater than expected rate.
• Bernstein’s Law: ~10x increase in FIT every 10 years
Field return data (ACTUAL failures per billion part-hours):
0.25 µm: ~20-50 FIT
130 nm: ~90-120 FIT
90 nm: ~150-300 FIT
65 nm: ~300-450 FIT
45-22 nm: ???
[Figure labels: Avionic and Military; Expectation!]
April 30, 2014 12
Benefits to Accurate Prediction!!
Performance is designed for a required reliability specification.
A small reduction in performance can bring a huge gain in reliability (illustrative).
[Figure: Reliability vs. Performance, with two operating points, 1 and 2, marked X]
Suggestion: two products, one design.
More customers for the same design.
More applications means more Sale$.
April 30, 2014 13
Performance vs. Reliability
• I could double the speed for free.
• If I KNOW the reliability, maybe I CAN improve performance?!
[Figure: 21-inverter RO; frequency (Hz, 0 to 1.00E+08) vs. core voltage (V, 0.5 to 3.5); annotations: “Nominal Voltage”, “Why not operate here?”]
April 30, 2014 14
Qualification TODAY
Industry ‘Standard’ FIT (failures in time) model:
Acceleration Factor (AF) is the product of Voltage and Temperature acceleration factors.
3 KILLER problems:
1. This does NOT fit with KNOWN failure models.
2. When ZERO failures are reported, there is NO statistical meaning to the acceleration factor.
3. Uncertainty is assumed for 0/1 fails, while AF has ZERO uncertainty; no accounting for error in AF!!
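The formula itself did not survive in this transcript. A commonly used form of the zero-failure estimate, assumed here rather than taken from the slide, bounds the failure rate at a confidence level CL with a chi-square quantile: λ ≤ χ²(CL; 2r+2) / (2·N·t·AF). For r = 0 failures the quantile has the closed form −2·ln(1 − CL), so the sketch below needs only the standard library. The function name and the 60% confidence level are illustrative choices; the 77 devices, 1000 hours, and AF of 13000 are the numbers used in the worked example later in the talk.

```python
import math

def zero_failure_fit_upper_bound(n_devices, test_hours, af, confidence=0.60):
    """Upper bound on FIT after a stress test that produced zero failures.

    Uses chi2(CL; 2 dof) = -2*ln(1 - CL), the r = 0 case of the chi-square
    bound. AF enters as an exact number with no uncertainty of its own,
    which is the weakness the third point above describes.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    equivalent_device_hours = n_devices * test_hours * af
    chi2_quantile = -2.0 * math.log(1.0 - confidence)
    rate_per_hour = chi2_quantile / (2.0 * equivalent_device_hours)
    return rate_per_hour * 1e9

bound = zero_failure_fit_upper_bound(77, 1000, 13000)
print(f"upper bound at 60% confidence: {bound:.2f} FIT")
```

The confidence level only scales the numerator; any error in AF passes straight through to the result without being reflected in the stated confidence.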
April 30, 2014 15
Multiple Mechanisms Are Here to Stay
• The traditional reliability approach fails to predict field failures.
• Modeling, simulation and acceleration alone will NOT yield true results without accurate failure analysis.
• HOWEVER: We CAN model and PREDICT failure rate under known conditions with a more complete picture of the mechanisms!
April 30, 2014 16
Multiple Mechanisms Don’t Add Up!!!
Single Mechanism Model:
– $AF_{system} = AF_{Thermal} \times AF_{Electrical}$
– So, $1/MTTF_{use} = 1/(MTTF_{test} \times AF_{MM})$
Multiple Mechanism Model:
– $1/MTTF_{use} = P_1/(MTTF_{test} \times AF_{mech1}) + P_2/(MTTF_{test} \times AF_{mech2})$
– Therefore, the effective AF for multiple mechanisms is:
$$AF_{MM} = \frac{1}{\dfrac{P_1}{AF_{mech1}} + \dfrac{P_2}{AF_{mech2}}}$$
• The true acceleration factor is the SMALLER one, not the one which exposes a failure at accelerated test.
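The effective acceleration factor for multiple mechanisms can be sketched as a small function. The function name and the equal weights P1 = P2 = 0.5 are assumptions for illustration; the AF values 100 and 130 are the ones used in the example later in the talk.

```python
def effective_af(proportions, afs):
    """Effective acceleration factor AF_MM = 1 / sum(P_i / AF_i)."""
    if len(proportions) != len(afs):
        raise ValueError("need one proportion per acceleration factor")
    return 1.0 / sum(p / af for p, af in zip(proportions, afs))

# Two mechanisms contributing equally at the test condition (assumed weights)
balanced = effective_af([0.5, 0.5], [100.0, 130.0])

# Making one AF enormous barely raises the result: it stays just below
# AF_small / P_small = 100 / 0.5 = 200, so the smaller AF sets the outcome.
lopsided = effective_af([0.5, 0.5], [100.0, 1e6])

print(round(balanced, 1), round(lopsided, 1))
```

The second call illustrates the last bullet: the mechanism with the smaller acceleration factor controls the effective value, however strongly the other one is accelerated.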
April 30, 2014 17
Traditional Methodology
• Single Mechanism Model (old JEDEC Standard):
– 77 devices tested for 1000 hours with 0 failures…
• For example: $AF_T = 100$ and $AF_V = 130$
$AF_S = 100 \times 130 = 13000$ !!
Zero failures at high V and high T
Assume 1 failure after 1000 hours:
Thus FIT: $10^9 / (77 \times 1000 \times 13000) = 1$ FIT!!
• NICE! Now we have done a great job and can go home and celebrate our success!!! NOT!!!
April 30, 2014 18
The Reality of Multiple Mechanisms
• BUT… multiple mechanisms compete!
• Same example: $AF_V$ from HCI and $AF_T$ from EM
– EM has Ea = 1 eV and voltage γ ~ 1.
– HCI has Ea ~ 0 eV and voltage γ ~ 14
• NOW, $AF_S = 2/(1/100 + 1/130) \approx 113$
• So our correct calculation for the same data:
FIT: $10^9 / (77 \times 1000 \times 113) \approx 115$ FIT!!
This is compared to 1 FIT based on HTOL.
Traditional FIT is ALWAYS too low as compared to considering multiple mechanisms
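Evaluated numerically, the expression 2/(1/100 + 1/130) comes to about 113, which puts the FIT for the same test data near 115, roughly a hundred times the single-mechanism figure. A minimal sketch of both calculations follows; the helper name `fit` is hypothetical, and the single assumed failure follows the slide's own convention.

```python
DEVICES = 77
TEST_HOURS = 1000
FAILURES = 1  # one failure is assumed for the estimate, as on the slide

def fit(failures, devices, hours, af):
    """Failures per 1e9 equivalent device-hours at use conditions."""
    return failures * 1e9 / (devices * hours * af)

af_t, af_v = 100.0, 130.0

# Single-mechanism view: the two acceleration factors multiply
af_product = af_t * af_v
fit_single = fit(FAILURES, DEVICES, TEST_HOURS, af_product)

# Competing-mechanism view with equal weights: 2 / (1/AF_T + 1/AF_V)
af_competing = 2.0 / (1.0 / af_t + 1.0 / af_v)
fit_competing = fit(FAILURES, DEVICES, TEST_HOURS, af_competing)

print(f"product AF   = {af_product:.0f}, FIT = {fit_single:.2f}")
print(f"competing AF = {af_competing:.0f}, FIT = {fit_competing:.0f}")
```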
April 30, 2014 19
Failure Rate Estimation at System Level
New System Reliability Model Replacement Program (collaboration)
[Diagram: Nth component with failure mechanisms FM1, FM2, FM3]
Each component is composed of several sub-components in proportion to their function and relative reliability stress.
The base failure rate can be determined at various accelerated conditions in order to normalize the matrix and make a physics-based reliability assessment from test data combined with knowledge of the application.
April 30, 2014 20
FIXtress™: A MORE ACCURATE FIT
[Figure: calculated PDF vs. time to fail (years, 2 to 12), with failure-rate (FIT) contributions λTDDB, λHCI, λNBTI, λEM, λPackage]
$\lambda \sim \Sigma\,(1/MTTF_1 + 1/MTTF_2 + \dots + 1/MTTF_n)$
The manufacturers have the data; we can make the prediction (BQR Software Tool)!
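The sum of per-mechanism failure rates can be sketched as follows. The MTTF values are hypothetical placeholders, not data from the talk or from the BQR tool, and the conversion assumes a constant failure rate and 8,760 hours per year.

```python
HOURS_PER_YEAR = 8760  # assumption: continuous operation

# Hypothetical per-mechanism MTTF values in years (illustrative only)
mttf_years = {
    "TDDB": 2000.0,
    "HCI": 5000.0,
    "NBTI": 1000.0,
    "EM": 500.0,
    "Package": 10000.0,
}

def fit_from_mttf_years(years):
    """Constant-rate FIT contribution of one mechanism: 1e9 / MTTF in hours."""
    return 1e9 / (years * HOURS_PER_YEAR)

fit_by_mechanism = {name: fit_from_mttf_years(y) for name, y in mttf_years.items()}
total_fit = sum(fit_by_mechanism.values())

for name, value in fit_by_mechanism.items():
    print(f"{name:8s} {value:8.1f} FIT ({100 * value / total_fit:.0f}% of total)")
print(f"{'total':8s} {total_fit:8.1f} FIT")
```

Because the rates add, the mechanism with the shortest MTTF contributes the largest share of the total.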
April 30, 2014 21
Our Guiding Principle:
“It is better to be roughly right than precisely wrong.” (John Maynard Keynes)
April 30, 2014 22
Post-Silicon Test Strategy
How can we match data from reliability models with experimentally obtained AF from HTOL?
PROPOSAL: Run multiple tests at different conditions while monitoring degradation.
[Diagram: Physics of Failure models (JEDEC) and AF from burn-in at different T, V; a matrix solution can match them]
April 30, 2014 23
Our New Approach (ARIEL)
Inputs:
• JEDEC or TSMC physics models (MTBF / FIT): 24 failure mechanisms over 4 categories (TDDB, HCI, BTI, EM)
• DOE burn-in (relative AF, relative MTBF/FIT)
• System (TEST) measurements: DPPM per Fmax limit (real FIT at V, T test)
Matrix solution:
• Relative AF matrix with rows (T1,V1), (T2,V2), (T3,V3), (T4,V4) and columns λTDDB, λHCI, λBTI, λEM
• Proportionality parameter X
Output:
• Reliability solution: FIT, DPPM
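One way to read the matrix step, offered as an assumption rather than as the authors' actual procedure, is a linear system: the relative-AF matrix times a vector of per-mechanism proportionality parameters equals the failure rates measured at each test condition. The sketch below solves such a system with plain Gaussian elimination. Every number in it is synthetic, and the "measurements" are generated from known parameters so that the solve can be checked.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: conditions do not separate the mechanisms")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

# Hypothetical relative AFs: rows are test conditions (T1,V1)..(T4,V4),
# columns are TDDB, HCI, BTI, EM. Each row is dominated by a different mechanism.
rel_af = [
    [50.0, 1.0, 5.0, 2.0],
    [2.0, 80.0, 1.0, 0.5],
    [4.0, 0.5, 60.0, 3.0],
    [1.0, 0.2, 2.0, 40.0],
]

# Synthetic "measured" FIT values built from known parameters
x_true = [0.2, 0.05, 0.5, 1.5]
measured_fit = [sum(a * x for a, x in zip(row, x_true)) for row in rel_af]

x_solved = solve_linear(rel_af, measured_fit)

# Predicted FIT at a hypothetical use condition where every relative AF is 1
use_af = [1.0, 1.0, 1.0, 1.0]
fit_use = sum(a * x for a, x in zip(use_af, x_solved))
print([round(x, 3) for x in x_solved], round(fit_use, 3))
```

The system is only well conditioned when each test condition emphasizes a different mechanism, which is the reason for choosing test conditions that separate them.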
April 30, 2014 24
Contributions from JEDEC Models (45nm)
Different dominant mechanism at each test condition

Temp (°C)  Volt (V)  TDDB        HCI       BTI       EM        FIT
200        1.2       2.93E+03    8.35E+00  4.26E+04  2.40E+05  242750
140        1.2       3.71E+02    1.59E-01  4.55E+02  9.71E+03  9710
-35        2.4       3.19E+08    2.12E+13  9.08E+07  8.16E-05  9710000
140        2.4       5.10E+13    5.13E+11  2.20E+13  9.71E+03  703975
30         1.2       1.00        1.00      1.00      1.00      1
85         1.2       30          0.67      34        399.00    Use
120        1.8       5305442428  739966    42398594  5362      HTOL
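The table's point can be made explicit by picking out the largest contribution in each row. This sketch copies the values from the table and assumes the flattened rows read 85 °C, 1.2 V for Use and 120 °C, 1.8 V for HTOL; the function and variable names are illustrative.

```python
MECHANISMS = ("TDDB", "HCI", "BTI", "EM")

# (temperature in deg C, voltage in V) -> relative contributions from the table
contributions = {
    (200, 1.2): (2.93e3, 8.35e0, 4.26e4, 2.40e5),
    (140, 1.2): (3.71e2, 1.59e-1, 4.55e2, 9.71e3),
    (-35, 2.4): (3.19e8, 2.12e13, 9.08e7, 8.16e-5),
    (140, 2.4): (5.10e13, 5.13e11, 2.20e13, 9.71e3),
    (85, 1.2): (30.0, 0.67, 34.0, 399.0),  # row labeled "Use"
    (120, 1.8): (5305442428.0, 739966.0, 42398594.0, 5362.0),  # row labeled "HTOL"
}

def dominant(values):
    """Return (mechanism name, share of the row total) for the largest entry."""
    total = sum(values)
    index = max(range(len(values)), key=lambda i: values[i])
    return MECHANISMS[index], values[index] / total

summary = {cond: dominant(vals) for cond, vals in contributions.items()}
for (temp, volt), (name, share) in summary.items():
    print(f"{temp:>4} C, {volt} V: {name:4s} {100 * share:5.1f}%")
```

With these values the HTOL row is more than 99% TDDB while the Use row is led by EM, so the stress test and the field are governed by different mechanisms.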
April 30, 2014 25
HTOL is OVERWHELMINGLY measuring only TDDB
• This is very convenient when zero failures arise during the 1000-hour HTOL test.
• Foundries design the gate oxides very well, so there WILL be NO TDDB failures during HTOL testing.
• The 3 other mechanisms are just ignored during final test and qualification.
April 30, 2014 26
Separation of Mechanisms
• Failure mechanisms can be separated by properly selecting test conditions.
• High Temperature and Low Voltage tests for EM
• High Temperature and High Voltage tests for NBTI and for TDDB
• Low Temperature and High Voltage tests for HCI
April 30, 2014 27
Two Distinct Mechanisms!
• HCI: frequency dependence
– Seen at LOW T and high V
• NBTI: no frequency dependence
– Seen at high T and high V
[Figure: two panels plotted against F (MHz), labeled -35°C, 2.4 V and 140°C, 2.4 V]
Note: at -35°C the failure rate is >2.5X that at 140°C for the same voltage!!
April 30, 2014 28
TDDB from NBTI
[Figure: 21-stage RO frequency vs. voltage; x-axis: core voltage (0.5 to 3.5), y-axis: performance (freq., 0 to 700000000); annotations: soft breakdown, Time-Dependent Dielectric Breakdown (TDDB), Neg. Bias-Temperature Instability (NBTI)]
April 30, 2014 29
Prediction for 28nm
[Figure, first panel: FIT for f = 1 GHz; FIT per billion gates (1 to 1000) vs. temperature (30 to 120 °C); curves for voltage = 0.8, 0.9, 1.0, 1.1, 1.2]
[Figure, second panel: FIT for V = 1.0 V; FIT per billion gates (1 to 1000) vs. temperature (30 to 120 °C); curves for 0.1, 0.5, 1.0, 1.5, 2 GHz]
Dominant mechanisms are EM and BTI, so there is strong T and frequency dependence but weak V dependence.
April 30, 2014 30
Observation
• Increase voltage by 20%
• Increase performance by 20%
• Increases FIT by only a factor of 2
• Increased customer satisfaction
• Increased sales for FREE!!!
April 30, 2014 31
Main Observations
1. The dominant mechanism at HTOL test is NEVER the dominant mechanism at USE conditions.
2. An acceleration factor based on a 1-mechanism model significantly overestimates reliability.
3. Foundry models today are quite sophisticated and consider N- and P-MOS based on their own data, AND companies trust these models.
4. The chip companies WANT to consider the true contributions of EACH mechanism.
April 30, 2014 32
Conclusions
•We have developed a prediction model that is based on 4 failure mechanisms
•Our model is more accurate than the single failure model currently in use
•Collaboration with Industry is Necessary to Verify our Models and to keep pace with advancing technology
April 30, 2014 33
Thank You