safely overclocking flash i/o in ssds
Post on 21-Aug-2015
14 Views
Preview:
TRANSCRIPT
Safely Overclocking Flash I/O in SSDs
Kai Zhao, SanDisk
Rino Micheloni, PMC-Sierra
Tong Zhang, Rensselaer Polytechnic Institute
Flash Memory Summit 2015Santa Clara, CA 1
Outline
• Motivation – ECC capability may be under utilized
• Benefit – improve system performance, especially for read operations
• Reliability – leverage LDPC codes to mitigate the impact of overlocking
Flash Memory Summit 2015Santa Clara, CA 2
Page Reliability Variation
• Larger reliability variations for chips even from the same batch
• Raw BER varies dramatically under different P/E cycles and with different retention time
Flash Memory Summit 2015Santa Clara, CA 3
0 5 10 15 20 25 3010
-5
10-4
10-3
10-2
Retention Time (day)
Ra
w B
ER
1000 P/E cycles5000 P/E cycles
C
D
B
A
0 5 10 150.00
0.05
0.10
0.15
Pa
ge
Co
un
t F
req
ue
ncy
Number of page errors
1k P/E cycles, 2 days retention
0 200 4000.00
0.05
0.10
0.15
Point DPoint CPoint B
5k P/E cycles, 30 days retention
Point A
Data Link Overclocking
• Worst-case based ECC is under utilization in most of the time• We could trade reliability (when raw BER is low) for performance• One possible way – Overclocking on data link
Flash Memory Summit 2015Santa Clara, CA 4
Potential Benefits
• Total latency = Operation (read/program) latency +
Data transfer latency + ECC decoding latency
• ECC decoding latency is relatively small
• Transfer latency is comparable to read latency
• Smaller transfer latency could significantly improve read performance
Flash Memory Summit 2015Santa Clara, CA 5
𝜏 h𝑝 𝑦−𝑎𝑐𝑐=𝜏𝑜𝑝+𝜏 𝑥𝑓𝑒𝑟+𝜏 𝐸𝐶𝐶
Practical Considerations
• Characterize errors caused by overclocking
• Errors in the overclocked data link
• Errors in the NAND flash cell array
• Overclocking for read and/or write path
• Evaluate read/write path separately
• Permanent errors on write path
• Adaptation of overclocking in late lifetime of the drive
• Read retry
• Multiple data transfer rate
Flash Memory Summit 2015Santa Clara, CA 6
Feasibility of Overclocking
• ONFI 2.2 compliant chips supporting data transfer rate up to 166 MBPS according to the datasheet
• Data link overclocked up to 275 MBPS
Flash Memory Summit 2015Santa Clara, CA 7
200 212.5 225 237.5 250 262.5 27510
-6
10-5
10-4
10-3
10-2
Transfer Data Rate (MHz)
RB
ER
Read Overclocking: Plane 0Read Overclocking: Plane 1Write Overclocking: Plane 0Write Overclocking: Plane 1
<
Different overclocking capability for the 2 planes
System Performance for Read Overclocking
Flash Memory Summit 2015Santa Clara, CA 8
WebSearch prj mds hm Financial0.0
0.5
1.0
1.5
Nor
mal
ized
Rea
d R
espo
nse
Tim
e 166 MBps 200 MBps 250 MBps 275 MBps
WebSearch prj mds hm Financial0.00
0.05
0.10
0.15
0.20
0.25
Rea
d R
espo
nse
Tim
e R
educ
tion 4 kB 8 kB 16 kB
•Read response time, page size is 8 kB
•Data link overclocked up to 275 MBPS
•Comparison of average read response time reduction with respect to page size
•Data transfer rate is 275 MBPS
Simulator: DiskSim with SSD addonTraces: MSR Cambridge trace, UMASS trace repository
WebSearch prj mds hm Financial0.85
0.90
0.95
1.00
1.05
Nor
mal
ized
Rea
d R
espo
nse
Tim
e
250 MBps Read-Retry, 10% retry probability Read-Retry, 5% retry probability) Multi-Transfer-Rate
Late Lifetime Performance
• Average read response time under different adaption strategies• Read retry vs. multi-transfer-rate for different planes
• Read retry doesn’t sense the flash array, just re-transfer data• Direct lower transfer speed for the worse plane
Flash Memory Summit 2015Santa Clara, CA 9
2500 3000 3500 4000 4500 5000
0.9
0.95
1
1.05
1.1
Number of P/E CyclesN
orm
aliz
ed
Re
ad
Re
spo
nse
Tim
e
WebSearchFinancialhmprj
Read retry only
Late Lifetime Performance
• Average read response time when combining read retry and multi-transfer-rate
• After 4000 P/E cycles, only lower the transfer rate for Plane 0
Flash Memory Summit 2015Santa Clara, CA 10
2500 3000 3500 4000 4500 5000
0.9
0.95
1
1.05
Number of P/E Cycles
No
rma
lize
d R
ea
d R
esp
on
se T
ime
WebSearchFinancialhmprj
Link Error Mitigation using LDPC
• Effectiveness of the proposed overclocking design strategy depends on how well the ECC can handle the extra errors
• Leveraging LDPC codes and data link error characteristics to• Improving the tolerance to overclocking induced errors, which can lead to
lower data re-transfer rate
• Reducing the average number of LDPC decoding iterations, which can lead to lower power consumption and higher decoding throughput
• Two design techniques• Overclocking-aware LLR Calculation
• Inter-plane interleaving
Flash Memory Summit 2015Santa Clara, CA 11
Inter-plane Interleaving
• Each plane has its own page register and I/O circuitry
• Performance after overclocking may be varying
• In our measurement, Plane 0 is much more vulnerable to overclocking-induced I/O errors
Flash Memory Summit 2015Santa Clara, CA 12
0 100 2000.0
0.1
0.2
0.3
Plane 0Plane 1
Pag
e C
ount
Fre
quen
cy
Number of Errors in One Page
Read Overclocking 275 MBps
• The input of LDPC code decoding is the estimated log-likelihood ratio (LLR)
• Binary Asymmetric Channel (BAC). Different error characteristics for each wire and plane
• In the overclocked SSD read channel, LLR can be expressed as
Overclocking-aware LLR Calculation
Flash Memory Summit 2015Santa Clara, CA 13
LLR (𝑥𝑘)=𝑝 (𝑥𝑘=0∨𝑦 𝑘)𝑝 (𝑥𝑘=1∨𝑦𝑘 )
LLR (𝑥𝑘)=𝛴𝑖=0
𝑙 𝑝𝑚(𝑙 ) ( 𝑦𝑘∨𝑣𝑘=𝑖 )𝑝 (𝑣𝑘=𝑖∨𝑥𝑘=0 , 𝑐 )
𝛴 𝑖=0𝑙 𝑝𝑚
( 𝑙 ) (𝑦 𝑘∨𝑣𝑘=𝑖 )𝑝 (𝑣𝑘=𝑖∨𝑥𝑘=1 , 𝑐 )
Error Pattern Characterization
• Plane 0 suffers from more overclocking induced errors, moreover, these errors non-uniformly distributed across the 8-bit bus.
• The pattern ‘0’=>’1’ dominates the errors from Plane 0, while on Plane 1, patterns ‘0’=>’1’ and ‘1’=>’0’ evenly distributed.
Flash Memory Summit 2015Santa Clara, CA 14
0
0.2
0.4
0.6
0.8
1
1.2
Rel
ativ
e Fr
eque
ncy Plane 0
'0' => '1'
Plane 0'1' => '0'
Plane 1'1' => '0'
Plane 1'0' => '1'
Error Pattern
Simulation Results
• LDPC decoding power reduction• The decoding power is determined by the number of iterations• Data re-transfer rate reduction• Data link overclocked to 275 MBPS
Flash Memory Summit 2015Santa Clara, CA 15
0 1000 2000 3000 40000
20
40
60
80
100
120
LDP
C d
ecod
ing
pow
er r
educ
tion
(%)
P/E cycles
Overclocking-aware LDPC decodingInter-plane interleavingCombined
0 1000 2000 3000 4000 500010
-6
10-4
10-2
100
Re-
tran
sfer
pro
babi
lity
P/E cycles
Conventional LDPC decoding
Overclocking-aware LDPC decoding
Inter-plane interleaving
Combined
Conclusion
• Overclock data link to fully utilize the ECC capability
• Read performance can be significantly improved
• Mitigate the impact of overclocking by leveraging LDPC codes and the error characterization.
Flash Memory Summit 2015Santa Clara, CA 16
Flash Memory Summit 2015Santa Clara, CA 17
Thank you
top related