![Page 1: Reliability Issues in Flash Memory Storage Devicescse.snu.ac.kr/sites/default/files/node--seminar/20110801CIC_민상렬.pdf17 Technology Transfer Oct. 30, 2007, NotebookReview.com](https://reader034.vdocuments.us/reader034/viewer/2022042018/5e75c14508d8070610786783/html5/thumbnails/1.jpg)
Reliability Issues in Flash Memory Storage Devices
2011. 08. 01
Sang Lyul Min
Seoul National University
http://archi.snu.ac.kr/symin
Outline
- Flash Memory Basics
- Our 10-year Research and Technology Transfer
- Lessons Learned
- Conclusions
Conventional MOS Transistor
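[Figure: cross-section of a MOS transistor with a gate (G) over a p-substrate, an n+ source (S), and an n+ drain (D), next to its schematic symbol with terminals G, S, and D.]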
Conventional MOS Transistor: A Constant-Threshold Transistor
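[Figure: Id versus Vgs curve with a single threshold Vth; equivalent circuit in which, for Vgs > Vth, S and D are connected through an on-resistance Ron.]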
Flash Memory
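[Figure: cross-section of a flash memory cell with a control gate, a floating gate, and a thin tunneling oxide over a p-substrate with n+ source and n+ drain; arrows mark programming and erasure. Schematic symbol with terminals G, S, and D.]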
Flash Memory
[Figure: two cell cross-sections (control gate, p-substrate, n+ source, n+ drain) shown side by side: an erased cell and a programmed cell.]
Flash Memory: A “Programmable-Threshold” Transistor
[Figure: Id versus Vgs curves with two thresholds, Vth-0 and Vth-1, labeled “1” state and “0” state respectively; below them, cell cross-sections for the erased state and the programmed state.]
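The programmable-threshold idea can be sketched in a few lines. This is a minimal illustration, not the author's model: the voltage values and the names `FlashCell`, `VTH_ERASED`, `VTH_PROGRAMMED`, and `V_READ` are assumptions, since the slide gives no numbers. A read applies a gate voltage between the two thresholds and checks whether the cell conducts.

```python
from dataclasses import dataclass

# Illustrative voltages in volts (assumed; the slides give no numeric values).
VTH_ERASED = 1.0      # lower threshold, erased cell ("1" state)
VTH_PROGRAMMED = 4.0  # higher threshold, programmed cell ("0" state)
V_READ = 2.5          # read voltage chosen between the two thresholds


@dataclass
class FlashCell:
    """Single-bit cell modeled as a transistor whose threshold can be shifted."""
    vth: float = VTH_ERASED

    def program(self) -> None:
        # Charge stored on the floating gate raises the threshold.
        self.vth = VTH_PROGRAMMED

    def erase(self) -> None:
        # Removing the charge restores the lower threshold.
        self.vth = VTH_ERASED

    def conducts(self, vgs: float) -> bool:
        # Same switch rule as a conventional MOS transistor: on when Vgs > Vth.
        return vgs > self.vth

    def read(self) -> int:
        # A cell that conducts at V_READ is sensed as 1, otherwise as 0.
        return 1 if self.conducts(V_READ) else 0
```

Storing more bits per transistor would correspond to more than two threshold levels and more than one read voltage, which this sketch does not model.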
More Bits Per Transistor
Source: Eli Harari (SanDisk), “NAND at Center Stage,” Flash Memory Summit 2007.
(NAND) Flash Memory Interface
Each read / (in-place) write takes 5~35 ms
vs.
[Figure: NAND flash organized as 2^j blocks of 2^i pages each; every page has a data area and a spare area.]
- Read physical page (chip #, block #, page #): 20~80 us
- Write physical page (chip #, block #, page #): 200~800 us
- Erase block (chip #, block #): 2~3 ms
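The three operations above can be sketched as a toy model of one chip. The class and method names are hypothetical, latencies are not modeled, and the rule that a page cannot be rewritten until its block is erased is the well-known NAND constraint rather than something stated on the slide.

```python
class NandChip:
    """Toy model of one NAND chip: blocks of pages, each page holding data and spare bytes."""

    def __init__(self, num_blocks: int, pages_per_block: int):
        self.pages_per_block = pages_per_block
        # None marks a page that has not been programmed since the last erase.
        self.blocks = [[None] * pages_per_block for _ in range(num_blocks)]

    def read_page(self, block: int, page: int):
        # Returns (data, spare) for a programmed page, or None for an erased page.
        return self.blocks[block][page]

    def write_page(self, block: int, page: int, data: bytes, spare: bytes = b"") -> None:
        # No in-place update: a programmed page must be erased before it is written again.
        if self.blocks[block][page] is not None:
            raise ValueError("page already programmed; erase the block first")
        self.blocks[block][page] = (data, spare)

    def erase_block(self, block: int) -> None:
        # Erase works on a whole block, never on a single page.
        self.blocks[block] = [None] * self.pages_per_block
```

Because erase is block-granular while writes are page-granular, updating one page in place is not possible, which is why flash management software has to remap writes.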
NAND Flash Memory Characteristics
The Good
- Low latency
- Low power consumption
- High shock/vibration resistance
- Small form factor
- Massive parallelism
….
From the dark night
NAND Flash Memory Market Trends
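| Year | DRAM ($/MB) | NAND Flash ($/MB) |
| --- | --- | --- |
| 2000 | 0.97 | 1.35 |
| 2001 | 0.22 | 0.43 |
| 2002 | 0.22 | 0.25 |
| 2003 | 0.17 | 0.21 |
| 2004 | 0.17 | 0.10 |
| 2005 | 0.11 | 0.05 |
| 2006 | 0.096 | 0.021 |
| 2007 | 0.057 | 0.012 |
| 2008 | ~0.025 | <0.005 |
| CAGR | -32.1%/yr | -50.0%/yr |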
Source: Lane Mason (Denali Software), “NAND FlashPoint Platform”
NAND Flash Memory Market Trends
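| Year | DRAM (millions of GB) | NAND Flash (millions of GB) |
| --- | --- | --- |
| 2000 | 30 | 1.1 |
| 2001 | 50 | 1.6 |
| 2002 | 71 | 4.6 |
| 2003 | 98 | 14.6 |
| 2004 | 158 | 68 |
| 2005 | 240 | 200 |
| 2006 | 340 | 600 |
| 2007 | 645 | 1600 |
| 2008 | 1000 | 4000 |
| CAGR | +60.0%/yr | +150%/yr |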
Source: Lane Mason (Denali Software), “NAND FlashPoint Platform”
Outline
- Flash Memory Basics
- Our 10-year Research and Technology Transfer
- Lessons Learned
- Conclusions
Our 10-year Research on Flash Memory
[Figure: timeline with year markers at 2000, 2002, 2004, 2005, 2006, and 2011.]
- SSFTL (for commercial CF cards) (2000.05 ~ 2002.01)
- SSFTL - Seoul, SSFTL - Hong Kong, SSFTL - Vancouver
- USB 2.0-based SSD (Flash-only) (2004.06)
- Hybrid HDD
- Chameleon SSD (Flash/FRAM Hybrid) (2005.12)
- Hydra SSD (Flash-only) (2006.02)
- Hydra FPGA version Technology Transfer
- Hydra ASIC version Technology Transfer
Hydra SSD Platform
NV-RAM Modules
Samsung SLC NAND
Samsung MLC NAND
Hynix MLC NAND
FREESCALE MRAM (parallel)
RAMTRON FRAM (parallel)
RAMTRON FRAM (serial)
Performance Results
[Figure: PCMark05 results in MB/s (axis from 0 to 80) for XP Startup, Application Loading, General Usage, Virus Scan, and File Write.]

| Device | PCMark05 HDD Score |
| --- | --- |
| Seagate 2.5 in HDD | 3431 |
| Seagate 3.5 in HDD | 5169 |
| Adtron 3.5 in SSD | 2255 |
| M-Systems 2.5 in SSD | 4494 |
| Samsung 2.5 in SSD | 6080 |
| Hydra SSD | 11045 |
Seong, Y.J., Nam, E.H., Yoon, J.H., Kim, H., Choi, J.-Y., Lee, S., Bae, Y.H., Lee, J., Cho, Y., and Min, S.L. “Hydra: A Block-Mapped Parallel Flash Memory Solid-State Disk Architecture,” IEEE Transactions on Computers, 59 (7), pp. 905-921, 2010.
Nam, E.H., Kim, S.J., Eom, H., and Min, S.L. “Ozone (O3): An Out-of-order Flash Memory Controller Architecture,” To appear in the IEEE Transactions on Computers.
Technology Transfer
Oct. 30, 2007, NotebookReview.com
“… In fact, it may well be the single fastest storage medium available to the customer today….”
Outline
- Flash Memory Basics
- Our 10-year Research and Technology Transfer
- Lessons Learned
- Conclusions
NAND Flash Memory Characteristics
The Good
- Low latency
- Low power consumption
- Small form factor
- Massive parallelism
….
The Bad
- Power failures
- Bad blocks (Program/Erase Errors)
- Program disturbance
- Read disturbance
From the dark night
On A Fine Spring Day in 2007
………..
Tragic Remains
Around That Time…
http://en.wikipedia.org/wiki/Edison_Chen_photo_scandal
A Few of the Recovered Photos
Happy Ending….
Reliability Analysis of NAND Flash Memory-based Storage Device (Ideal)
[Figure: one timeline per real user (user 1, user 2, …, user n) on a time axis of 1 day, 1 month, 1 year, 10 years. Power failures and flash operation failures are marked along each timeline until a system failure occurs. The cumulative failure ratio over each interval from t_k to t_k+1 builds up a cumulative distribution that approaches 1.]
Reliability Analysis of NAND Flash Memory-based Storage Device (Practical)
[Figure: the same analysis with emulated users. Power failures and flash operation failures are marked along each timeline until a fail state is reached. A physical-time axis (1 min, 30 min, 6 hours, 3 days, …) is shown against a virtual-time axis (1 day, 1 month, 1 year, 10 years). The cumulative failure ratio over each interval from t_k to t_k+1 builds up a cumulative distribution that approaches 1.]
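The practical approach, emulated users advancing in virtual time with injected faults, can be sketched as a small simulation. This is a rough sketch under stated assumptions: the function names and all probabilities are hypothetical, and a random draw stands in for "the device software failed to handle the fault", where a real platform would run the actual flash management software against the injected fault.

```python
import random


def simulate_user(rng, horizon, p_power_fail, p_flash_fail, p_unrecovered):
    """Return the virtual time step at which one emulated user fails, or None."""
    for t in range(horizon):
        # A fault is either a power failure or a flash operation failure.
        fault = rng.random() < p_power_fail or rng.random() < p_flash_fail
        # Assumed stand-in: a fault turns into a fail state with fixed probability.
        if fault and rng.random() < p_unrecovered:
            return t
    return None


def cumulative_failure_ratio(num_users, horizon, seed=0,
                             p_power_fail=0.01, p_flash_fail=0.01,
                             p_unrecovered=0.05):
    """Fraction of emulated users that have failed by each virtual time step."""
    rng = random.Random(seed)  # seeded, so a run can be repeated exactly
    failures_at = [0] * horizon
    for _ in range(num_users):
        t = simulate_user(rng, horizon, p_power_fail, p_flash_fail, p_unrecovered)
        if t is not None:
            failures_at[t] += 1
    ratios, total = [], 0
    for count in failures_at:
        total += count
        ratios.append(total / num_users)
    return ratios
```

Calling `cumulative_failure_ratio(num_users=1000, horizon=365)` returns one ratio per virtual time step, which is the kind of curve the figure plots.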
Failure Analysis
[Figure: failure analysis flow. Workloads W1, W2, …, Wn drive system states s1, s2, …, sn along virtual time, each ending as fail or good. Reliability assessment: cumulative failure ratio over time, approaching 1. Debugging: fail-state diagnostics and regression tests, using a snapshot repository to restore the system state and then rollback & replay. Observed failures are grouped into symptom 1, symptom 2, symptom 3, symptom 4, …, symptom k.]
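The snapshot repository with rollback & replay can be sketched as follows, assuming the emulated system is deterministic so that replaying the same operations from a saved state reproduces the same result. The names `SnapshotRepository`, `apply_op`, `run`, and `rollback_and_replay`, and the dictionary used as system state, are all hypothetical.

```python
import copy


class SnapshotRepository:
    """Keeps a deep copy of the system state every `interval` virtual time steps."""

    def __init__(self, interval):
        self.interval = interval
        self.snapshots = {}  # virtual time -> state just before that step

    def maybe_save(self, t, state):
        if t % self.interval == 0:
            self.snapshots[t] = copy.deepcopy(state)

    def latest_before(self, t):
        # Most recent snapshot taken at or before virtual time t.
        base = max(s for s in self.snapshots if s <= t)
        return base, copy.deepcopy(self.snapshots[base])


def apply_op(state, op):
    """Hypothetical deterministic step: each op is a (key, value) write."""
    key, value = op
    state[key] = value


def run(workload, repo):
    """Run the whole workload, saving snapshots along the way."""
    state = {}
    for t, op in enumerate(workload):
        repo.maybe_save(t, state)
        apply_op(state, op)
    return state


def rollback_and_replay(workload, repo, target_t):
    """Rebuild the state just before operation target_t is applied."""
    base, state = repo.latest_before(target_t)
    for op in workload[base:target_t]:
        apply_op(state, op)
    return state
```

Restoring from the nearest snapshot keeps the replay short, so a fail state found late in virtual time can be revisited without rerunning the workload from the start.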
Post-mortem Analysis of Customer Returns
[Figure: causal relationship between bugs (bug 1, bug 2, bug 3, bug 4, …, bug n) and symptoms (symptom 1, symptom 2, symptom 3, symptom 4, …, symptom k). Post-mortem analysis matches the symptoms seen in failure instances in the real world against these known symptoms.]
Two Key Findings
“Many” customer returns (cellular phones) are due to bugs in flash memory management software
“Most” bugs in flash memory management software are due to inadequate/incorrect handling of nested power failures and flash memory errors
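A nested power failure is one that strikes while the software is still recovering from an earlier one. The sketch below is an illustrative test harness, not the software the findings refer to: it uses a hypothetical two-phase journal (`update`, `recover`) and cuts power after every possible number of steps in both the update and the recovery, then checks that a final uninterrupted recovery leaves a consistent state.

```python
class PowerFailure(Exception):
    pass


class Device:
    """Persistent state plus a countdown that cuts power after a set number of steps."""

    def __init__(self):
        self.store = {"value": "old"}    # survives power failures
        self.journal = None              # survives power failures
        self.steps_until_failure = None  # None means power stays on

    def step(self):
        # Called before each persistent write; a failure means that write never happens.
        if self.steps_until_failure is not None:
            if self.steps_until_failure == 0:
                raise PowerFailure()
            self.steps_until_failure -= 1


def update(dev, new):
    dev.step()
    dev.journal = ("pending", new)
    dev.step()
    dev.journal = ("committed", new)  # commit point
    dev.step()
    dev.store["value"] = new
    dev.step()
    dev.journal = None


def recover(dev):
    # Idempotent: safe to rerun after being interrupted at any step.
    if dev.journal is None:
        return
    status, new = dev.journal
    if status == "committed":
        dev.step()
        dev.store["value"] = new  # redo the committed write
    dev.step()
    dev.journal = None


def run_case(i, j):
    """Power fails after i steps of update, then after j steps of the first recovery."""
    dev = Device()
    dev.steps_until_failure = i
    try:
        update(dev, "new")
    except PowerFailure:
        pass
    dev.steps_until_failure = j  # nested failure, during recovery
    try:
        recover(dev)
    except PowerFailure:
        pass
    dev.steps_until_failure = None  # final recovery runs to completion
    recover(dev)
    return dev.store["value"], dev.journal


def check_all_failure_points():
    results = []
    for i in range(5):      # update has 4 steps; 4 means no failure
        for j in range(3):  # recovery has at most 2 steps; 2 means no failure
            value, journal = run_case(i, j)
            assert journal is None
            assert value == ("new" if i >= 2 else "old")
            results.append((i, j, value))
    return results
```

The check passes only because `recover` can be rerun from any interruption point; a recovery routine that assumes it always runs to completion is the kind of handling the second finding describes as inadequate.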
NAND Flash Memory Characteristics
The Good
- Low latency
- Low power consumption
- High Reliability
- Small form factor
- Massive parallelism
….
The Bad
- Power failures
- Bad blocks (Program/Erase Errors)
- Program disturbance
- Read disturbance
From the dark night
The Ugly
- Limited Endurance
- Retention errors
The State of Affairs – Flash Memory
Scanning tunneling microscope image of a silicon surface showing 10 nm is ~20 atoms across
Source: B. Shirley, “The Many Flavors of NAND … and More to Come,” Flash Memory Summit 2009
A “Deadly” Combination
Source: R. E. Kaufman, “Vaccine Role – History, Prevention and Future Projects,” Swine Flu: What Employers Need To Know, Sept. 24, 2009
Outline
- Flash Memory Basics
- Our 10-year Research and Technology Transfer
- Lessons Learned
- Conclusions
Conclusions (Call for Actions)
- “Provably-Correct” Flash Memory Software
  - Bad Block Management Scheme
  - Crash Recovery Scheme
  - etc.
- “Open” Reliability Evaluation Platform
  - High-fidelity Flash Memory Modeling
  - Configurable Fault Injection
- Prepare for the future “when everything fails”
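To make two of these items concrete, here is a minimal sketch that pairs a configurable fault injector with a simple bad block manager. It is an illustration only and makes no claim to being provably correct: `FaultyFlash` and `BadBlockManager` are hypothetical names, writes are simplified to whole blocks, and the remap table lives in memory rather than in flash.

```python
class ProgramError(Exception):
    pass


class FaultyFlash:
    """Flash model whose program operation fails on blocks chosen by the test."""

    def __init__(self, num_blocks, failing_blocks=()):
        self.data = [None] * num_blocks
        self.failing_blocks = set(failing_blocks)  # configurable fault injection

    def program(self, block, payload):
        if block in self.failing_blocks:
            raise ProgramError(block)
        self.data[block] = payload


class BadBlockManager:
    """Remaps a logical block to a spare physical block when programming fails."""

    def __init__(self, flash, num_spares):
        total = len(flash.data)
        self.flash = flash
        self.num_logical = total - num_spares
        self.spares = list(range(self.num_logical, total))
        self.remap = {}   # logical block -> replacement physical block
        self.bad = set()  # physical blocks retired after a program error

    def write(self, logical, payload):
        physical = self.remap.get(logical, logical)
        while True:
            try:
                self.flash.program(physical, payload)
                return physical
            except ProgramError:
                # Retire the failing block and retry on the next spare.
                self.bad.add(physical)
                if not self.spares:
                    raise RuntimeError("out of spare blocks")
                physical = self.spares.pop(0)
                self.remap[logical] = physical
```

Choosing `failing_blocks` per test run is what makes the fault injection configurable: the same manager code can be exercised against a failure in a data block, in a spare block, or in both.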