solid-state drives (ssds)csl.skku.edu/uploads/ice3028s11/4-ssd.pdf · feature ssd (samsung) hdd...
TRANSCRIPT
Jin-Soo Kim ([email protected])Computer Systems Laboratory
Sungkyunkwan Universityhttp://csl.skku.edu
Solid-StateDrives (SSDs)
Solid-State Drives (SSDs)
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 3
HDDs vs. SSDs (1)
2.5” HDD Flash SSD(101x70x9.3mm)
1.8” HDD Flash SSD(78.5x54x4.15mm)
Top Bottom
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 4
HDDs vs. SSDs (2)
1 Source: http://forums.macrumors.com/showthread.php?t=658571 2 Source: http://www.danawa.com (As of Mar. 17, 2011)
Feature SSD (Samsung) HDD (Seagate)
Model MMDOE56G5MXP (PM800) ST9500420AS (Momentus 7200.4)
Capacity 256GB(16Gb MLC x 128, 8 channels)
500GB(2 Discs, 4 Heads, 7200RPM)
Form factor 2.5”Weight: 84g
2.5”Weight: 110g
Host interface Serial ATA-2 (3.0 Gbps)Host transfer rate: 300MB
Serial ATA-2 (3.0 Gbps)Host transfer rate: 300MB
Power consumption Active: 0.26WIdle/Standby/Sleep: 0.15W
Active: 2.1W (Read), 2.2W (Write)Idle: 0.69W, Standby/Sleep: 0.2W
Performance Sequential read: Up to 220 MB/sSequential write: Up to 185 MB/s
Power-on to ready: 4.5 secAverage latency: 4.17 msec
Measured performance1
(On MacBook Pro,256KB for sequential, 4KB for random)
Sequential read: 176.73 MB/sSequential write: 159.98 MB/sRandom read: 10.56 MB/sRandom write: 2.93 MB/s
Sequential read: 86.07 MB/sSequential write: 84.64 MB/sRandom read: 0.61 MB/sRandom write: 1.28 MB/s
Price2 539,190 won 80,400 won
NAND Constraints
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 6
No In-Place Update
▪ Can’t overwrite data ▪ Previous data should be erased▪ The erase unit is much larger than the
read/write unit• Read/write unit: page (4KB, 8KB)• Erase unit: block (64 ~ 128 pages)
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 7
Bit Errors▪ Error correction codes (ECC) in spare area▪ Hardware vs. software
Source: Micron Technology, Inc.
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 8
Bad Blocks
▪ Factory-marked bad blocks▪ Run-time bad blocks
▪ Bad block table▪ Bad block remapping
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 9
Limited Lifetime
▪ Limited program/erase cycles▪ 100K for SLCs▪ 5K ~ 10K for MLCs▪ Wear-leveling
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 10
Page Programming▪ NOP
• The number of partial-page programming is limited
• 1 / sector for most SLCs (4 for 2KB page)• 1 / page for most MLCs
▪ Sequential page programming• Pages should be programmed sequentially
inside a block• For large block SLCs and MLCs
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 11
Paired Pages in MLCs▪ Performance difference▪ Interference
Vth0V
11 10 01 00 Page 0
11 10 01 00
MSB Page LSB Page
Page 0
Page 1
I/O-split MLCs
Page-split MLCs
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 12
Beauty and the Beast▪ NAND Flash memory is beauty
• Small, light-weight, robust, low-cost, low-power non-volatile device
▪ NAND Flash memory is a beast• Much slower program/erase operations• No in-place-update• Erase unit > write unit• Limited lifetime (5K ~ 100K P/E cycles)• Bad blocks, …
Introduction toFlash Translation Layers
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 14
Storage Abstraction▪ Abstraction given by block device drivers:
▪ Operations• Identify(): returns N• Read (start sector #, # of sectors)• Write (start sector #, # of sectors, data)
512B 512B 512B
0 1 N-1
Source: Sang Lyul Min (Seoul National Univ.)
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 15
What is FTL?▪ A software layer to make NAND flash fully
emulate traditional block devices (or disks)
+Device Driver
Read Write Erase
File System
Read Sectors Write Sectors
Flash Memory
Mismatch!
+Device Driver
Flash Memory
FTL
+
Read Sectors Write Sectors
File System
Read Sectors Write Sectors
Source: Zeen Info. Tech.
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 16
Flash Cards Internals
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 17
SSD Internals
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 18
Implementing FTLs
Applications
Operating System
File Systems
Block Device Driver
Flash Translation Layer
NAND Controller
NAND Flash Memory
Applications
Operating System
File Systems
Block Device Driver
Flash Translation Layer
NAND Controller
NAND Flash Memory
Flash Cards, SSDs Embedded Flash Storage
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 19
For Performance▪ Indirect mapping (address translation)▪ Garbage collection▪ Over-provisioning▪ Hot/cold data identification/separation▪ Interleaving over multiple channels/flash
chips/planes▪ Request scheduling▪ Buffer management, …
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 20
For Reliability
▪ Bad block management▪ Wear-leveling▪ Power-off recovery▪ Error correction code (ECC)▪ ...
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 21
Other Features
▪ Encryption▪ Compression▪ Deduplication▪ ...
Basic Mapping Techniques
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 23
Naïve Block Mapping▪ Each table entry maps one block▪ Small RAM usage▪ Inefficient handling of small writes
00112233
44556677
889910101111
1212131314141515
LBN 0LBN 0LBN 1LBN 1LBN 2LBN 2LBN 3LBN 3
Data blocks
00000010 11
Logicalblock number
Page offset within a block
Logical page #11
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 24
Example (1)▪ W = <{4,5,6,7}, {1}>
• write ({4,5,6,7}, ABCD)
00112233
889910101111
1212131314141515
LBN 0LBN 0LBN 1LBN 1LBN 2LBN 2LBN 3LBN 3
Data blocks
Spareblocks
AABBCCDD
4567
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 25
Example (2)▪ W = <{4,5,6,7}, {1}>
• write ({4,5,6,7}, ABCD)• write ({1}, E)
0123
889910101111
1212131314141515
LBN 0LBN 0LBN 1LBN 1LBN 2LBN 2LBN 3LBN 3
Data blocks
Spareblocks
AABBCCDD
4567
00EE2233
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 26
ANAND Scheme▪ M-Systems ▪ US Patent 5,937,425▪ Uses one or more
replacement blocks▪ Data is written to
an unwritten physical page with the desired physical page offset
LBN 0LBN 0LBN 1LBN 1LBN 2LBN 2LBN 3LBN 3
Data blocks
0A
4567
891011
1213
BC
D
E
F
Replacementblocks
W = <{1}, {4,5}, {8}, {10}, {8}>
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 27
Naïve Page Mapping▪ Most flexible▪ Efficient handling of small
writes▪ Large memory footprint
• 32MB for 32GB MLC (4KB page)
▪ Sensitive to the amount of reserved blocks
▪ Performance affected as the system ages
0011441414
1010556677
228811111313
151599331212
Data blocks
LPN 0LPN 0LPN 1LPN 1LPN 2LPN 2LPN 3LPN 3LPN 4LPN 4LPN 5LPN 5LPN 6LPN 6LPN 7LPN 7LPN 8LPN 8LPN 9LPN 9LPN 10LPN 10LPN 11LPN 11LPN 12LPN 12LPN 13LPN 13LPN 14LPN 14LPN 15LPN 15
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 28
Example (1)▪ W = <{1}, {2}, {8}, {1}, {2}, {12,13}, {9}>
• write ({1}, A)
001441414
1010556677
228811111313
151599331212
Data blocks
LPN 0LPN 0LPN 1LPN 1LPN 2LPN 2LPN 3LPN 3LPN 4LPN 4LPN 5LPN 5LPN 6LPN 6LPN 7LPN 7LPN 8LPN 8LPN 9LPN 9LPN 10LPN 10LPN 11LPN 11LPN 12LPN 12LPN 13LPN 13LPN 14LPN 14LPN 15LPN 15
AA
Spareblocks
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 29
Example (2)▪ W = <{1}, {2}, {8}, {1}, {2}, {12,13}, {9}>
• write ({1}, A)• write ({2}, B)
001441414
1010556677
28811111313
151599331212
Data blocks
LPN 0LPN 0LPN 1LPN 1LPN 2LPN 2LPN 3LPN 3LPN 4LPN 4LPN 5LPN 5LPN 6LPN 6LPN 7LPN 7LPN 8LPN 8LPN 9LPN 9LPN 10LPN 10LPN 11LPN 11LPN 12LPN 12LPN 13LPN 13LPN 14LPN 14LPN 15LPN 15
AABB
Spareblocks
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 30
Example (3)▪ W = <{1}, {2}, {8}, {1}, {2}, {12,13}, {9}>
• write ({1}, A)• write ({2}, B)• write ({8}, C)
001441414
1010556677
2811111313
151599331212
Data blocks
LPN 0LPN 0LPN 1LPN 1LPN 2LPN 2LPN 3LPN 3LPN 4LPN 4LPN 5LPN 5LPN 6LPN 6LPN 7LPN 7LPN 8LPN 8LPN 9LPN 9LPN 10LPN 10LPN 11LPN 11LPN 12LPN 12LPN 13LPN 13LPN 14LPN 14LPN 15LPN 15
AABBCC
Spareblocks
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 31
Example (4)▪ W = <{1}, {2}, {8}, {1}, {2}, {12,13}, {9}>
• write ({1}, A)• write ({2}, B)• write ({8}, C)• write ({1}, D)
001441414
1010556677
2811111313
151599331212
Data blocks
LPN 0LPN 0LPN 1LPN 1LPN 2LPN 2LPN 3LPN 3LPN 4LPN 4LPN 5LPN 5LPN 6LPN 6LPN 7LPN 7LPN 8LPN 8LPN 9LPN 9LPN 10LPN 10LPN 11LPN 11LPN 12LPN 12LPN 13LPN 13LPN 14LPN 14LPN 15LPN 15
ABBCCDD
Spareblocks
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 32
Example (5)▪ W = <{1}, {2}, {8}, {1}, {2}, {12,13}, {9}>
• write ({1}, A)• write ({2}, B)• write ({8}, C)• write ({1}, D)• write ({2}, E)
001441414
1010556677
2811111313
151599331212
Data blocks
LPN 0LPN 0LPN 1LPN 1LPN 2LPN 2LPN 3LPN 3LPN 4LPN 4LPN 5LPN 5LPN 6LPN 6LPN 7LPN 7LPN 8LPN 8LPN 9LPN 9LPN 10LPN 10LPN 11LPN 11LPN 12LPN 12LPN 13LPN 13LPN 14LPN 14LPN 15LPN 15
ABCCDD
Spareblocks
EE
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 33
Example (6)▪ W = <{1}, {2}, {8}, {1}, {2}, {12,13}, {9}>
• write ({1}, A)• write ({2}, B)• write ({8}, C)• write ({1}, D)• write ({2}, E)• write ({12,13}, FG)
001441414
1010556677
28111113
1515993312
Data blocks
LPN 0LPN 0LPN 1LPN 1LPN 2LPN 2LPN 3LPN 3LPN 4LPN 4LPN 5LPN 5LPN 6LPN 6LPN 7LPN 7LPN 8LPN 8LPN 9LPN 9LPN 10LPN 10LPN 11LPN 11LPN 12LPN 12LPN 13LPN 13LPN 14LPN 14LPN 15LPN 15
ABCCDD
Spareblocks
EEFFGG
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 34
Example (7)▪ W = <{1}, {2}, {8}, {1}, {2}, {12,13}, {9}>
• write ({1}, A)• write ({2}, B)• write ({8}, C)• write ({1}, D)• write ({2}, E)• write ({12,13}, FG)• write ({9}, H)
001441414
1010556677
28111113
151593312
Data blocks
LPN 0LPN 0LPN 1LPN 1LPN 2LPN 2LPN 3LPN 3LPN 4LPN 4LPN 5LPN 5LPN 6LPN 6LPN 7LPN 7LPN 8LPN 8LPN 9LPN 9LPN 10LPN 10LPN 11LPN 11LPN 12LPN 12LPN 13LPN 13LPN 14LPN 14LPN 15LPN 15
ABCCDD
Spareblocks
EEFFGGHH
ICE3028: Embedded Systems Design (Spring 2011) – Jin-Soo Kim ([email protected]) 35
Victim Selection Policies▪ Greedy
• A block with the largest amount of invalid data
▪ Cost-benefit [Kawaguchi 1995]• A block with the maximum • u: utilization• age: the time sine the most recent modification
▪ Cost Age Times (CAT) [Chiang 1999]• A block with the minimum• count: erase count
ageuu
costbenefit
2
)1(
countageu
u
)1(