Post on 21-Aug-2015

Safely Overclocking Flash I/O in SSDs

Kai Zhao, SanDisk

Rino Micheloni, PMC-Sierra

Tong Zhang, Rensselaer Polytechnic Institute

Flash Memory Summit 2015, Santa Clara, CA

Outline

• Motivation – ECC capability may be underutilized

• Benefit – improve system performance, especially for read operations

• Reliability – leverage LDPC codes to mitigate the impact of overclocking

Page Reliability Variation

• Large reliability variations across chips, even from the same batch

• Raw BER varies dramatically with the number of P/E cycles and with retention time

[Figure: raw BER (log scale, 10^-5 to 10^-2) versus retention time (0 to 30 days) at 1000 and 5000 P/E cycles, with points A, B, C, and D marked on the curves.]

[Figure: histograms of page count frequency versus number of page errors, one at 1k P/E cycles with 2 days retention and one at 5k P/E cycles with 30 days retention, annotated with points A to D.]

Data Link Overclocking

• ECC provisioned for the worst case is underutilized most of the time

• We could trade reliability (when raw BER is low) for performance

• One possible way: overclocking the data link

Potential Benefits

• Total latency = operation (read/program) latency + data transfer latency + ECC decoding latency

• ECC decoding latency is relatively small

• Transfer latency is comparable to read latency

• Smaller transfer latency could significantly improve read performance

$$\tau_{\text{phy-acc}} = \tau_{op} + \tau_{xfer} + \tau_{ECC}$$
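As a rough worked example of the latency breakdown above, the sketch below computes the transfer time of one 8 kB page at the datasheet rate (166 MBps) and at the overclocked rate (275 MBps), both taken from this deck. The array read latency and ECC decoding latency (50 µs and 5 µs) are placeholder assumptions, not measurements from the talk, and the function names are hypothetical.

```python
# Latency breakdown: tau_total = tau_op + tau_xfer + tau_ecc.
# TAU_OP_US and TAU_ECC_US are illustrative assumptions, not values from the talk.

PAGE_BYTES = 8 * 1024   # 8 kB page, as in the evaluation
TAU_OP_US = 50.0        # assumed flash array read latency (microseconds)
TAU_ECC_US = 5.0        # assumed ECC decoding latency (microseconds)


def transfer_latency_us(page_bytes: int, rate_mbps: float) -> float:
    """Time to move one page over the data link.

    1 MBps is 10^6 bytes per second, i.e. one byte per microsecond.
    """
    return page_bytes / rate_mbps


def total_latency_us(page_bytes: int, rate_mbps: float) -> float:
    """Sum of array read, data transfer, and ECC decoding latency."""
    return TAU_OP_US + transfer_latency_us(page_bytes, rate_mbps) + TAU_ECC_US


if __name__ == "__main__":
    for rate in (166.0, 275.0):
        xfer = transfer_latency_us(PAGE_BYTES, rate)
        total = total_latency_us(PAGE_BYTES, rate)
        print(f"{rate:.0f} MBps: transfer {xfer:.1f} us, total {total:.1f} us")
```

Under these assumed numbers the transfer time drops from about 49 µs to about 30 µs per page, which illustrates why the transfer term matters when it is comparable to the array read latency.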

Practical Considerations

• Characterize errors caused by overclocking

    • Errors in the overclocked data link

    • Errors in the NAND flash cell array

• Overclocking for read and/or write path

    • Evaluate read/write path separately

    • Permanent errors on write path

• Adaptation of overclocking in late lifetime of the drive

    • Read retry

    • Multiple data transfer rates

Feasibility of Overclocking

• ONFI 2.2-compliant chips support data transfer rates up to 166 MBps according to the datasheet

• Data link overclocked up to 275 MBps

[Figure: raw BER (RBER, log scale, 10^-6 to 10^-2) versus transfer data rate (200 to 275, axis labelled MHz) for read overclocking and write overclocking on Plane 0 and Plane 1.]

• Different overclocking capability for the two planes

System Performance for Read Overclocking

[Figure: normalized read response time for the workloads WebSearch, prj, mds, hm, and Financial at 166, 200, 250, and 275 MBps.]

[Figure: read response time reduction for the workloads WebSearch, prj, mds, hm, and Financial with 4 kB, 8 kB, and 16 kB pages.]

• Read response time, page size is 8 kB

• Data link overclocked up to 275 MBps

• Comparison of average read response time reduction with respect to page size

• Data transfer rate is 275 MBps

• Simulator: DiskSim with SSD add-on

• Traces: MSR Cambridge trace, UMass Trace Repository

[Figure: normalized read response time (0.85 to 1.05) for the workloads WebSearch, prj, mds, hm, and Financial under four configurations: 250 MBps; read retry with 10% retry probability; read retry with 5% retry probability; multi-transfer-rate.]

Late Lifetime Performance

• Average read response time under different adaptation strategies

• Read retry vs. multi-transfer-rate for different planes

    • Read retry doesn't sense the flash array again, it just re-transfers the data

    • Multi-transfer-rate directly lowers the transfer speed for the worse plane
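The trade-off between the two strategies can be sketched with a simple expected-latency model. This is an illustration under stated assumptions, not the simulation used in the talk: the latency constants are placeholders, each re-transfer is assumed to fail independently with the same probability, and the lowered rate is assumed to be error-free. The function names are hypothetical; the 5% and 10% retry probabilities and the 250 and 275 MBps rates appear in the deck.

```python
# Expected read latency under two late-lifetime adaptation strategies.
# All latency constants and the independence assumption are illustrative only.

PAGE_BYTES = 8 * 1024
TAU_OP_US = 50.0    # assumed array sensing latency (microseconds)
TAU_ECC_US = 5.0    # assumed ECC decoding latency (microseconds)


def xfer_us(rate_mbps: float) -> float:
    """Page transfer time; 1 MBps equals one byte per microsecond."""
    return PAGE_BYTES / rate_mbps


def read_retry_latency_us(rate_mbps: float, retry_prob: float) -> float:
    """Read retry re-transfers the page without re-sensing the array.

    Assumes each transfer fails independently with probability retry_prob,
    so the expected number of transfers is 1 / (1 - retry_prob).
    """
    expected_transfers = 1.0 / (1.0 - retry_prob)
    return TAU_OP_US + expected_transfers * (xfer_us(rate_mbps) + TAU_ECC_US)


def lowered_rate_latency_us(rate_mbps: float) -> float:
    """Multi-transfer-rate: run the weaker plane at a lower rate, assumed error-free."""
    return TAU_OP_US + xfer_us(rate_mbps) + TAU_ECC_US


if __name__ == "__main__":
    for p in (0.05, 0.10):
        print(f"retry at 275 MBps, p={p:.2f}: {read_retry_latency_us(275.0, p):.2f} us")
    print(f"lowered to 250 MBps: {lowered_rate_latency_us(250.0):.2f} us")
```

With these assumed constants, retrying at 275 MBps is slightly faster than dropping to 250 MBps when the retry probability is 5%, and slightly slower when it is 10%.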

[Figure: normalized read response time (0.9 to 1.1) versus number of P/E cycles (2500 to 5000) for the workloads WebSearch, Financial, hm, and prj, using read retry only.]

Late Lifetime Performance

• Average read response time when combining read retry and multi-transfer-rate

• After 4000 P/E cycles, only lower the transfer rate for Plane 0

[Figure: normalized read response time (0.9 to 1.05) versus number of P/E cycles (2500 to 5000) for the workloads WebSearch, Financial, hm, and prj.]

Link Error Mitigation using LDPC

• Effectiveness of the proposed overclocking design strategy depends on how well the ECC can handle the extra errors

• Leverage LDPC codes and data link error characteristics to

    • Improve the tolerance to overclocking-induced errors, which can lead to a lower data re-transfer rate

    • Reduce the average number of LDPC decoding iterations, which can lead to lower power consumption and higher decoding throughput

• Two design techniques

    • Overclocking-aware LLR calculation

    • Inter-plane interleaving

Inter-plane Interleaving

• Each plane has its own page register and I/O circuitry

• Behavior after overclocking may vary from plane to plane

• In our measurement, Plane 0 is much more vulnerable to overclocking-induced I/O errors
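The deck does not spell out the interleaving layout, so the following is only one plausible arrangement, offered as a sketch: two equal-length codewords are split byte-wise so that each codeword has half of its bytes on each plane. The function names and the even/odd byte split are assumptions made for illustration.

```python
from __future__ import annotations

# Hypothetical byte-level inter-plane interleaving of two codewords.
# The even/odd split is an assumption; the talk does not specify the layout.


def interleave(cw_a: bytes, cw_b: bytes) -> tuple[bytes, bytes]:
    """Place even-indexed bytes of both codewords on plane 0, odd-indexed on plane 1."""
    if len(cw_a) != len(cw_b) or len(cw_a) % 2:
        raise ValueError("codewords must have the same, even length")
    plane0 = cw_a[0::2] + cw_b[0::2]
    plane1 = cw_a[1::2] + cw_b[1::2]
    return plane0, plane1


def deinterleave(plane0: bytes, plane1: bytes) -> tuple[bytes, bytes]:
    """Inverse of interleave(): rebuild both codewords from the two plane pages."""
    if len(plane0) != len(plane1) or len(plane0) % 2:
        raise ValueError("plane pages must have the same, even length")
    half = len(plane0) // 2
    cw_a = bytearray(2 * half)
    cw_b = bytearray(2 * half)
    cw_a[0::2] = plane0[:half]
    cw_a[1::2] = plane1[:half]
    cw_b[0::2] = plane0[half:]
    cw_b[1::2] = plane1[half:]
    return bytes(cw_a), bytes(cw_b)


if __name__ == "__main__":
    a, b = bytes(range(8)), bytes(range(8, 16))
    p0, p1 = interleave(a, b)
    assert deinterleave(p0, p1) == (a, b)
    print("plane 0:", list(p0))
    print("plane 1:", list(p1))
```

With a layout like this, errors concentrated on the weaker plane are shared between the two codewords instead of all landing in one, which is the general motivation for interleaving across planes.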

[Figure: page count frequency (0 to 0.3) versus number of errors in one page (0 to 200) for Plane 0 and Plane 1, read overclocking at 275 MBps.]

Overclocking-aware LLR Calculation

• The input to LDPC decoding is the estimated log-likelihood ratio (LLR)

$$\mathrm{LLR}(x_k) = \log \frac{p(x_k = 0 \mid y_k)}{p(x_k = 1 \mid y_k)}$$

• Binary asymmetric channel (BAC): different error characteristics for each wire and plane

• In the overclocked SSD read channel, the LLR can be expressed as

$$\mathrm{LLR}(x_k) = \log \frac{\sum_{i=0}^{l} p_m^{(l)}(y_k \mid v_k = i)\, p(v_k = i \mid x_k = 0, c)}{\sum_{i=0}^{l} p_m^{(l)}(y_k \mid v_k = i)\, p(v_k = i \mid x_k = 1, c)}$$
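To illustrate how link asymmetry changes the decoder input, here is a minimal sketch of the LLR for a plain binary asymmetric channel. It is a deliberate simplification: it models only the data link and ignores the flash cell noise terms that appear in the expression above. The function name, the parameter names `p01` and `p10`, and the example crossover probabilities are assumptions for illustration.

```python
import math

# LLR of a transmitted bit x given the received bit y on a binary
# asymmetric channel (BAC). Simplified: link errors only, no cell noise.


def bac_llr(y: int, p01: float, p10: float, prior0: float = 0.5) -> float:
    """Return log(P(x=0 | y) / P(x=1 | y)).

    p01 = P(y=1 | x=0), the '0' => '1' error probability.
    p10 = P(y=0 | x=1), the '1' => '0' error probability.
    Both must lie strictly between 0 and 1.
    """
    if y not in (0, 1):
        raise ValueError("y must be 0 or 1")
    if y == 0:
        like0, like1 = 1.0 - p01, p10
    else:
        like0, like1 = p01, 1.0 - p10
    return math.log((like0 * prior0) / (like1 * (1.0 - prior0)))


if __name__ == "__main__":
    # Hypothetical per-wire probabilities where '0' => '1' errors dominate.
    print("received 0:", bac_llr(0, p01=0.02, p10=0.001))
    print("received 1:", bac_llr(1, p01=0.02, p10=0.001))
```

When '0' => '1' errors dominate, a received 1 gets a smaller LLR magnitude than a received 0, so the decoder trusts it less. In a per-wire, per-plane scheme, `p01` and `p10` would come from measured error statistics for each wire and plane.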

Error Pattern Characterization

• Plane 0 suffers from more overclocking-induced errors; moreover, these errors are non-uniformly distributed across the 8-bit bus.

• The pattern '0' => '1' dominates the errors on Plane 0, while on Plane 1 the patterns '0' => '1' and '1' => '0' are evenly distributed.
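A characterization like this could be gathered by comparing known written data against what comes back over the overclocked link. The sketch below counts '0' => '1' and '1' => '0' flips per bus wire; it assumes that bit i of each byte travels on wire i of the 8-bit bus, and the function name is hypothetical rather than taken from the talk.

```python
# Count bit-flip directions per wire of an 8-bit data bus.
# Assumption: bit i of every byte is carried on wire i.


def count_flips(sent: bytes, received: bytes) -> tuple:
    """Return (zero_to_one, one_to_zero), each a list of 8 per-wire counts."""
    if len(sent) != len(received):
        raise ValueError("sent and received must have the same length")
    zero_to_one = [0] * 8
    one_to_zero = [0] * 8
    for s, r in zip(sent, received):
        diff = s ^ r  # bits that changed on the link
        for wire in range(8):
            mask = 1 << wire
            if diff & mask:
                if r & mask:
                    zero_to_one[wire] += 1  # sent 0, received 1
                else:
                    one_to_zero[wire] += 1  # sent 1, received 0
    return zero_to_one, one_to_zero


if __name__ == "__main__":
    sent = bytes([0b00000000, 0b11111111])
    received = bytes([0b00000101, 0b11111110])
    z2o, o2z = count_flips(sent, received)
    print("'0' => '1' per wire:", z2o)
    print("'1' => '0' per wire:", o2z)
```

Dividing each count by the number of zeros or ones sent on that wire would give per-wire error probabilities for each direction.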

[Figure: relative frequency (0 to 1.2) of each error pattern: Plane 0 '0' => '1', Plane 0 '1' => '0', Plane 1 '1' => '0', and Plane 1 '0' => '1'.]

Simulation Results

• LDPC decoding power reduction

    • The decoding power is determined by the number of iterations

• Data re-transfer rate reduction

• Data link overclocked to 275 MBps

[Figure: LDPC decoding power reduction (%, 0 to 120) versus P/E cycles (0 to 4000) for overclocking-aware LDPC decoding, inter-plane interleaving, and the two combined.]

[Figure: re-transfer probability (log scale, 10^-6 to 10^0) versus P/E cycles (0 to 5000) for conventional LDPC decoding, overclocking-aware LDPC decoding, inter-plane interleaving, and the two techniques combined.]

Conclusion

• Overclock data link to fully utilize the ECC capability

• Read performance can be significantly improved

• Mitigate the impact of overclocking by leveraging LDPC codes and error characterization

Thank you
