January 4-8, 2008 VLSI Design 2008 1
Single Event Upset
An Embedded Tutorial
Fan Wang, Vishwani D. Agrawal
Department of Electrical and Computer Engineering
Auburn University, AL 36849 USA
21st International Conf. on VLSI Design, Hyderabad, India, January 4-8, 2008
Motivation for This Work
With the continuous downscaling of CMOS technologies, device reliability has become a major bottleneck.
The sensitivity of electronic systems to radiation can potentially become a major cause of soft (non-permanent) failures.
Both circuit designers and test engineers need a basic knowledge of the radiation mechanisms that cause soft errors and of soft error mitigation techniques.
Outline
Introduction to Soft Errors
  What is Soft Error?
  Historical notes
Basic radiation mechanisms in silicon
Soft error resilience techniques
A case study
Conclusion
Introduction to SEU
Certain behaviors in state-of-the-art electronic circuits are caused by random factors.
A single event upset (SEU) is a non-permanent, non-functional error.
Definition from NASA Thesaurus: “Single Event Upset (SEU): Radiation-induced errors in microelectronic circuits caused when charged particles (usually from the radiation belts or from cosmic rays) lose energy by ionizing the medium through which they pass, leaving behind a wake of electron-hole pairs”.
What is Soft Error?
A “fault” is the cause of errors.
A non-permanent fault is a non-destructive fault and falls into two categories:
Transient faults, caused by environmental conditions such as temperature, humidity, pressure, voltage, power supply, vibrations, fluctuations, electromagnetic interference, ground loops, cosmic rays and alpha particles.
Intermittent faults, caused by non-environmental conditions such as loose connections, aging components, critical timing, resistive or capacitive variations and noise in the system.
With advances in manufacturing, “soft errors” caused by cosmic rays and alpha particles have become a dominant cause of failures in electronic systems.
Historical Notes
In the period 1954 through 1957, failures in digital electronics were reported during the above-ground nuclear bomb tests.
In 1962, Wallmark and Marcus predicted that cosmic rays would start upsetting microcircuits through heavy ionized particle strikes once feature sizes became small enough.
In the 1970s and early 1980s, the effects of radiation received attention and more researchers examined the physics of these phenomena. The theory of fault-tolerant computing received similar attention.
In 1978, May and Woods of Intel Corporation determined that these errors were caused by the alpha particles emitted in the radioactive decay of uranium and thorium, present at levels of just a few parts per million in package materials.
In 1979, Guenzer and Wolicki reported that the error-causing particles came not only from uranium and thorium but also from nuclear reactions involving high-energy neutrons and protons. The term “SEU” has been in use since this paper.
In 1979, Ziegler and Lanford from IBM predicted that cosmic rays could result in the same upset phenomenon in electronics (not only memories) even at sea level.
Soft Error Rate of Specific Applications
Figures of merit:
1. Failures In Time (FIT): the number of failures per 10^9 device hours.
2. MTTF (Mean Time To Failure)
An MTTF of 1 year corresponds to 10^9/(24 × 365) FIT ≈ 114,155 FIT.
SER of contemporary commercial chips is controlled to within 100~1000 FIT.
Most hard failure mechanisms produce error rates on the order of 1~100 FIT.
Programmable logic SER is almost 100 times larger than that of combinational logic.
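The FIT and MTTF figures are reciprocals of each other up to the 10^9-hour scale factor. A minimal sketch of the conversion follows; the function names are illustrative and not part of the tutorial, and the 24 × 365 hour year matches the one used in the slide's arithmetic.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 h, the year length used in the slide's arithmetic


def mttf_hours_to_fit(mttf_hours: float) -> float:
    """Convert an MTTF in hours to FIT (failures per 1e9 device hours)."""
    if mttf_hours <= 0:
        raise ValueError("MTTF must be positive")
    return 1e9 / mttf_hours


def fit_to_mttf_hours(fit: float) -> float:
    """Convert a FIT rate back to an MTTF in hours."""
    if fit <= 0:
        raise ValueError("FIT must be positive")
    return 1e9 / fit


if __name__ == "__main__":
    # An MTTF of one year expressed in FIT.
    print(round(mttf_hours_to_fit(HOURS_PER_YEAR)))
    # The upper end of the quoted 100~1000 FIT range expressed as MTTF in years.
    print(fit_to_mttf_hours(1000) / HOURS_PER_YEAR)
```

The same helper turns a "1 SEU every N hours" figure into FIT by passing N as the MTTF.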
Soft Error Rate for SRAM-Based FPGAs:
Smaller design rules and lower supply voltages
Used a radiation chamber to calculate SEU frequency at an altitude of 10 km at 60°N (Sweden)

| FPGA | XC4010E | XC4010XL |
|---|---|---|
| Process | 0.60 um | 0.35 um |
| Vcc | 5 V | 3.3 V |
| 1 SEU every | 1×10^6 hours | 2.8×10^5 hours |

Projecting this for 3 design rule shrinks and 2 voltage reductions we get ≈1 SEU every 28.2 hrs
M. Ohlsson, P. Dyreklev, K. Johansson and P. Alfke, “Neutron Single Event Upsets in SRAM-Based FPGAs”, Proc. 1998 IEEE Nuclear & Space Radiation Effects Conference
Chuck Stroud, “FPGA Architectures and Operation for Tolerating SEUs”, Electrical Engineering VLSI design and test seminar, Spring 2007, Auburn University.
Example: SRAM-Based FPGA System*
Table (cont.)
*1. Example (1) is tested at Denver, using SpaceRad 4.5 (a radiation effects prediction software program). Source: Actel.
2. All systems are without any protection.
Radiation Mechanisms for Silicon (1)
1. Alpha particles are emitted when the nucleus of an unstable isotope decays to a lower energy state (the dominant soft error cause for DRAM in the 1970s).
Uranium and thorium have the highest activity among naturally occurring radioactive materials.
In the terrestrial environment, major sources of radioactive impurities are lead-based isotopes in the solder bumps of flip-chip technology, gold used for bond wires and lid plating, aluminum in ceramic packages, lead-frame alloys and interconnect metallization.
**With carefully selected materials, the effect of this mechanism can be greatly reduced.
Radiation Mechanisms for Silicon (2)
2. High-energy (> 1 MeV*) neutrons from cosmic radiation induce soft errors in semiconductor devices via secondary ions produced by neutron reactions with silicon nuclei.
Cosmic rays of galactic origin react with the Earth’s atmosphere to produce complex cascades of secondary particles.
Neutrons are the cosmic radiation source most likely to cause SEU in deep-submicron semiconductors at terrestrial altitudes. The neutron flux depends on the altitude above sea level: its density increases with altitude.
**Nowadays, neutrons are the major cause among all failure mechanisms.
*MeV: million electron volts
Radiation Mechanisms for Silicon (3)
3. The secondary radiation induced by the interaction of cosmic ray neutrons with boron is the third significant source of ionizing particles in electronic systems.
This source arises from low-energy cosmic neutron interactions with the isotope boron-10 (10B), which is present in the boron commonly used as a p-type dopant for junction formation in ICs.
**This mechanism can be greatly reduced or eliminated by removing the source of 10B.
Baumann et al., IEEE Trans. Device and Materials Reliability, vol. 1, no. 1, pp. 17–22, 2001.
Single Event Transient (SET)
SET is caused by the generation of charge due to a high-energy particle passing through a sensitive node.
Each SET has its own characteristics, such as polarity, waveform, amplitude and duration, which depend on particle impact location, particle energy, device technology, device supply voltage and output load.
Off transistors struck in the junction area by a heavy ion with high enough LET* are the most sensitive to SEU, specifically the channel region of the off-NMOS transistor and the drain region of the off-PMOS transistor.
*Linear Energy Transfer (LET) is a measure of the energy transferred to the device per unit length as an ionizing particle travels through the material.
More Details of SET Generation
(a) Along the path it traverses, the particle produces a dense radial distribution of electron-hole pairs.
(b) Outside the depletion region, the non-equilibrium charge distribution induces a temporary funnel-shaped potential distortion along the trajectory of the event (drift component).
(c) The funnel collapses; the diffusion component then dominates the collection process until all excess carriers have been collected, recombined, or diffused away from the junction area.
(d) Current vs. time, illustrating the charge collection and SET generation.
Analytical Model of SET
The time constants depend strongly on the type of ion, its initial energy and the properties of the specific technology.
An approximate analytical model for ion track charge collection has a double-exponential form. It gives an induced current with a rapid rise time but a more gradual fall time.
*Typical values of the two time constants are approximately 1.64 × 10^-10 s and 5.10 × 10^-11 s.
*Experimental results from NASA JPL
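The slide's equation is not reproduced in this transcript. A commonly used double-exponential form is I(t) = Q/(τf − τr) · (exp(−t/τf) − exp(−t/τr)), and the sketch below evaluates it. It assumes that the longer constant (1.64 × 10^-10 s) governs the fall and the shorter one (5.10 × 10^-11 s) governs the rise, which is consistent with a rapid rise and gradual fall but is not stated explicitly in the source. The names `TAU_FALL`, `TAU_RISE`, `set_current`, `peak_time` and `collected_charge`, and the 0.65 pC charge in the demo, are illustrative.

```python
import math

TAU_FALL = 1.64e-10  # s, assumed to be the slower (charge collection) constant
TAU_RISE = 5.10e-11  # s, assumed to be the faster (track establishment) constant


def set_current(t, q_coll, tau_fall=TAU_FALL, tau_rise=TAU_RISE):
    """Double-exponential SET current in amperes at time t (seconds)."""
    if t < 0:
        return 0.0
    scale = q_coll / (tau_fall - tau_rise)
    return scale * (math.exp(-t / tau_fall) - math.exp(-t / tau_rise))


def peak_time(tau_fall=TAU_FALL, tau_rise=TAU_RISE):
    """Time at which the current peaks, found by setting dI/dt = 0."""
    return tau_fall * tau_rise / (tau_fall - tau_rise) * math.log(tau_fall / tau_rise)


def collected_charge(q_coll, t_end, steps=20000):
    """Trapezoidal integral of the current from 0 to t_end (coulombs)."""
    dt = t_end / steps
    total = 0.0
    for i in range(steps):
        t0 = i * dt
        total += 0.5 * (set_current(t0, q_coll) + set_current(t0 + dt, q_coll)) * dt
    return total


if __name__ == "__main__":
    q = 0.65e-12  # C, illustrative collected charge
    tp = peak_time()
    print(f"peak at {tp:.3e} s, I_peak = {set_current(tp, q):.3e} A")
    print(f"integrated charge = {collected_charge(q, 5e-9):.3e} C")
```

With this normalization the integral of the current over all time equals the collected charge Q, which is why the prefactor carries 1/(τf − τr).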
SET in CMOS Inverter
*For example, in ami12 technology, when the output load capacitance is 100 fF and the cumulative collected charge is 0.65 pC, the amplitude of the voltage pulse is 0.65 pC/100 fF = 0.65 × 10^-12 C/(100 × 10^-15 F) = 6.5 V.
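The amplitude estimate is simply the collected charge divided by the load capacitance, V = Q/C. A minimal sketch, with an illustrative function name, makes the unit handling explicit. With the quoted 0.65 pC and 100 fF the ratio evaluates to 6.5 V. This is an idealized figure that takes no account of any restoring current from the driving transistor.

```python
def pulse_amplitude_volts(q_coll_coulombs: float, c_load_farads: float) -> float:
    """Idealized SET voltage amplitude, V = Q / C."""
    if c_load_farads <= 0:
        raise ValueError("load capacitance must be positive")
    return q_coll_coulombs / c_load_farads


if __name__ == "__main__":
    # 0.65 pC collected on a 100 fF output load, as quoted in the example.
    print(pulse_amplitude_volts(0.65e-12, 100e-15))
```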
Soft Error Mitigation Techniques
Soft error tolerance techniques can be classified into two types: recovery and prevention.
Recovery: recover from an error after it occurs. Includes on-line recovery mechanisms, fault-tolerant computing, ECC/parity checks, redundancy, etc.
Prevention: methods that protect microchips from soft errors before they occur.
The need for a recovery mechanism stems from the fact that prevention techniques may not be enough for contemporary microchips.
Soft errors are not the only reason why computer systems need to resort to a recovery procedure. Random errors due to noise, unreliable components, and coupling effects may also require a recovery mechanism.
Some Mitigation Techniques
Prevention Techniques
1. Purify the fabrication material: uranium and thorium impurities have been reduced below one hundred parts per trillion for high reliability. To eliminate 10B, alternative insulators that don’t contain boron are used.
2. Radiation-hardened process technologies: SER performance can be greatly improved by adapting the process technology either to reduce the collected charge or to increase the critical charge.
Specific methods: use additional well isolation; replace bulk silicon with SOI.
A 10x reduction in SER over conventional bulk devices is achieved when a fully depleted SOI substrate is used. However, SOI is more expensive, and parasitic bipolar action limits further reduction of SER.
Selected Mitigation Techniques
Recovery Techniques
1. Redundancy: gain higher system reliability by sacrificing minimality of time, space, or both.
Classic design: Triple Modular Redundancy (TMR) with a majority voter.
New design: time redundancy based on a C-element gate that compares two samples of the combinational primary outputs at t0 and t0+d.
2. Error Detection and Correction (EDAC) codes: a simple solution for memory is to add a parity bit to each memory word. In most situations, it must be combined with a system-level approach for error recovery.
*S. Mitra, Z. Ming, S. Waqas, N. Seifert, B. Gill, and K. S. Kim, “Combinational Logic Soft Error Correction,” in Proc. International Test Conference, 2006, pp. 1–9.
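Two of the recovery techniques above, the TMR majority voter and the per-word parity bit, can be sketched in a few lines. This is a software illustration under simple assumptions (non-negative integer words, even parity), not a description of any particular hardware implementation. The names `majority_vote`, `parity_bit` and `parity_ok` are illustrative.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant module outputs (TMR voter)."""
    return (a & b) | (a & c) | (b & c)


def parity_bit(word: int) -> int:
    """Even-parity bit for a non-negative word: 1 if the word has an odd number of 1s."""
    return bin(word).count("1") & 1


def parity_ok(word: int, stored_parity: int) -> bool:
    """True if the word still matches the parity bit stored alongside it."""
    return parity_bit(word) == stored_parity


if __name__ == "__main__":
    good = 0b10110010
    upset = good ^ 0b00001000  # one bit flipped by a single event upset

    # TMR: one corrupted copy out of three is outvoted by the other two.
    print(majority_vote(good, upset, good) == good)

    # Parity: a single-bit upset is detected but cannot be located or corrected.
    p = parity_bit(good)
    print(parity_ok(good, p), parity_ok(upset, p))
```

Parity only detects an odd number of flipped bits, which is one reason it has to be paired with a system-level recovery step such as a retry.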
A Case Study: IBM eServer z990 System
z990 configuration:
1. The z990 contains 4 pluggable nodes connected through a planar board.
2. Each node contains up to 64 GB of physical memory and 32 MB of L2 cache, for a system capacity of 256 GB of memory and 128 MB of L2 cache.
Error tolerance techniques used:
1. Extensive use of ECC and parity with retry on data and controls
2. Full SRAM ECC and parity protection
3. Microprocessor mirroring
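To illustrate what ECC adds over plain parity, the sketch below implements the textbook Hamming(7,4) single-error-correcting code. It is a generic example only. The source does not describe the actual codes used in the z990, and the function names are illustrative.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.

    Codeword positions 1..7 hold p1, p2, d1, p4, d2, d3, d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data bits, error position or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based position of the flipped bit
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = [1, 0, 1, 1]
    stored = hamming74_encode(word)
    stored[4] ^= 1  # single event upset at codeword position 5
    recovered, position = hamming74_decode(stored)
    print(recovered == word, position)
```

Unlike a single parity bit, the syndrome identifies which bit flipped, so the word can be corrected in place rather than only flagged for retry.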
Conclusion
SER in logic and memory chips will continue to increase as devices become more sensitive to soft errors at sea level.
Open soft error issues:
1. How do EDA tools handle soft error hardening?
2. Analysis of radiation mechanisms (too complex to be comprehensive)
3. Soft error rate analysis for logic
4. Error mitigation methods
Useful References and Further Readings
1. “Single Event Phenomena” (Messenger and Ash, 1993)
2. “Ionizing Radiation Effects in MOS Devices and Circuits” (Ma and Dressendorfer, 1989)
3. “Handbook of Radiation Effects” (A. Holmes-Siedle and L. Adams, 1993)
4. “Fault-Tolerance Techniques for SRAM-Based FPGAs” (Fernanda Lima Kastensmidt, Luigi Carro, Ricardo Reis, 2006)
5. Test methods and standards: JEDEC JESD89, JESD89A, JESD89-2
6. Journals: IEEE Trans. on Nuclear Science, IEEE Trans. on Reliability
7. NASA Goddard’s test group: http://radhome.gsfc.nasa.gov/radhome/papers/seeca5.htm
8. NASA Space Environment and Effects Program: http://see.msfc.nasa.gov/… …
Thank You . . .