Flashing in the Memory Hierarchy: An Overview of Flash Memory Internals
Jalil Boukhobza, Stéphane Rubini
Lab-STICC, Université de Bretagne Occidentale
15/11/2012 {boukhobza, rubini}@univ-brest.fr
NAND Flash in the hierarchy
Where is the NAND flash memory?
Presentation outline
1. Flash memory basics
2. Flash memory characteristics
3. Flash memory support
   1. Mapping schemes
   2. Wear leveling
   3. Garbage collection
   4. Flash-specific cache systems
   5. Some contributions
4. Performance & energy considerations
5. Flash memory interfacing
6. Conclusions & perspectives
Flash memory cells
Invented by F. Masuoka at Toshiba in 1980
Introduced by Intel in 1988
A type of EEPROM (Electrically Erasable & Programmable Read-Only Memory)
Uses floating-gate transistors: electrons pushed into the floating gate stay trapped there
3 operations: program (write), erase, and read
[Diagram: floating-gate transistor: control gate, floating gate, oxide layer, source and drain (N+ regions) in a P substrate]
Flash memory operations
Erase operation
  FN (Fowler-Nordheim) tunneling: a high voltage is applied to the substrate (high compared to the operating voltage of the chip, usually between 7 and 20V)
  Electrons are pulled off the floating gate
  Logic '1' in SLC
Program / write operation
  A high voltage is applied to the control gate
  Electrons get trapped in the floating gate
  Logic '0'
Read operation
  A reference voltage is applied to the control gate:
  If the floating gate is charged: no current flows
  If it is not charged: current flows
[Diagrams: bias conditions for the three operations, e.g. 20V on the substrate for erase, 20V on the control gate for program, a reference voltage on the control gate for read]
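These three operations imply the asymmetry everything below builds on: erase resets all the bits of a block to 1, while program can only clear bits to 0. A toy model (illustrative only, not real device code) makes the rule visible:

```python
# Toy model of a NAND block: erase sets all bits to 1,
# program can only clear bits (1 -> 0), never set them.

PAGE_BITS = 8  # tiny pages for readability; real pages are 2-8 KB

class NandBlock:
    def __init__(self, pages_per_block=4):
        self.pages = [[1] * PAGE_BITS for _ in range(pages_per_block)]

    def erase(self):
        """Erase the whole block: every bit goes back to 1."""
        for page in self.pages:
            page[:] = [1] * PAGE_BITS

    def program(self, page_no, data):
        """Program one page: a bit can go 1 -> 0, but a 0 cannot
        be turned back into a 1 without erasing the whole block."""
        page = self.pages[page_no]
        for i, bit in enumerate(data):
            if bit == 0:
                page[i] = 0            # charge trapped: logic '0'
            elif page[i] == 0:
                raise ValueError("cannot set a 0 bit back to 1: erase first")

block = NandBlock()
block.program(0, [1, 0, 1, 0, 1, 1, 1, 1])  # fine on an erased page
block.erase()                                # the only way to restore 1s
```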
NOR vs NAND
NOR:
  Byte random access
  Low density
  Higher cost (per bit)
  Fast read
  Slow write
  Slow erase
  Used for code storage (XIP: eXecute In Place)
NAND:
  Page access
  High density
  Slower read
  Faster write
  Faster erase (block granularity)
  Used for data storage
Other types: DiNOR, AND, ...
Source, EETimes: http://www.eetimes.com/design/memory-design/4009410/Flash-memory-101-An-Introduction-to-NAND-flash
NAND flash memory architecture
Read/write granularity: the page (2-8 KB)
Erase granularity: the block (128-1024 KB)
Source: http://www.electroiq.com/articles/sst/2011/05/solid-state-drives.html
Different densities: SLC, MLC, TLC
SLC (Single Level Cell): 1 bit/cell; performance +++; density +; lifetime ~100,000 P/E cycles; ECC complexity +; applications: embedded and industrial (high-end SSDs, ...)
MLC (Multi Level Cell): 2 bits/cell; performance ++; density ++; lifetime ~10,000 P/E cycles; ECC complexity ++; applications: most consumer products (e.g. memory cards)
TLC (Triple Level Cell): 3 bits/cell; performance +; density +++; lifetime ~5,000 P/E cycles; ECC complexity +++; applications: low-end consumer products not needing data updates (e.g. mobile GPS)
Compound Annual Growth Rate (CAGR)
Source: G. Wong, 'Inflection Points', Flash Summit 2011
Flash memory constraints
Write/erase granularity asymmetry (Cons1)
Erase-before-write rule (Cons2)
Limited cell lifetime (Cons3):
  Electrons get trapped in the oxide layer, deteriorating its characteristics
  Eventually electrons can no longer move from the oxide layer into the floating gate
Consequence for data updates: invalidate the old page and update out-of-place (sketched below)
  → requires logical-to-physical mapping, plus garbage collection and wear leveling
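To see how Cons1-Cons3 force out-of-place updates plus a mapping table, here is a minimal sketch; the Flash class and its fields are hypothetical, for illustration only:

```python
# Out-of-place update: flash pages are never overwritten in place.
# A logical-to-physical map tracks where the latest version of each
# logical page lives; old versions are merely marked invalid.

FREE, VALID, INVALID = "free", "valid", "invalid"

class Flash:
    def __init__(self, num_pages=8):
        self.state = [FREE] * num_pages
        self.data = [None] * num_pages
        self.l2p = {}            # logical page number -> physical page number
        self.next_free = 0       # append-style allocation

    def write(self, lpn, value):
        old = self.l2p.get(lpn)
        if old is not None:
            self.state[old] = INVALID   # Cons2: no in-place rewrite
        ppn = self.next_free
        self.data[ppn] = value
        self.state[ppn] = VALID
        self.l2p[lpn] = ppn
        self.next_free += 1
        # Invalid pages pile up until garbage collection erases their
        # blocks (Cons1: erase only works at block granularity).

f = Flash()
f.write(0, "v1")
f.write(0, "v2")                 # the first physical copy becomes invalid
print(f.l2p[0], f.state[:3])     # -> 1 ['invalid', 'valid', 'free']
```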
Wear leveling
You already do that with your tyres ...
Keep a balanced distribution of erasures over the flash memory blocks.
[Photos: unevenly vs. more evenly worn tyres, courtesy of Pirelli]
http://www.presence-pc.com/tests/ssd-flash-disques-22675/5/
Garbage Collection
Move the valid pages out of blocks containing invalid data, then erase and recycle those blocks.
Analogy: moving people to a new city and 'erasing' the old one to reuse the space!
Flash memory structure
Flash management (logical-to-physical mapping, garbage collection, wear leveling, ...) can be placed at two levels:
1. Operating system layer, with an FFS (Flash File System):
   Application → FFS → Memory Technology Device (MTD) layer and specific technology drivers → raw flash memory
2. Flash memory device, with an FTL (Flash Translation Layer):
   Application → standard file system → FTL embedded in the device → flash memory array
Basic mapping schemes
a- Page mapping, the ideal scheme
Each page is mapped independently
High flexibility
Ideal performance
But high RAM usage makes it unfeasible:
  32GB of flash memory, 2KB pages and 8 bytes per table entry → a 128MB table!
Serves as the optimal-performance reference (the figure is checked below)
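The 128MB figure follows directly from the geometry; a quick back-of-the-envelope check with the slide's numbers:

```python
# Page-mapping table size for the slide's example configuration.
flash_size = 32 * 2**30      # 32 GB
page_size  = 2 * 2**10       # 2 KB pages
entry_size = 8               # bytes per mapping-table entry

num_pages  = flash_size // page_size          # 16M pages
table_size = num_pages * entry_size
print(table_size // 2**20, "MB")              # -> 128 MB
```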
Basic mapping schemes
b- Block mapping scheme
Only block numbers are stored in the mapping table; page offsets remain unchanged
Small mapping table (low memory footprint)
Very bad performance for write updates
Same configuration with 64 pages per block → a 2MB mapping table (translation sketched below)
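Translation under block mapping only remaps the block number and keeps the page offset; a minimal sketch (the block_map content is made up):

```python
# Block-mapping address translation: only the block number is remapped;
# the page offset inside the block is preserved.
PAGES_PER_BLOCK = 64
block_map = {0: 7, 1: 3}     # logical block -> physical block (example)

def translate(logical_page):
    lblock, offset = divmod(logical_page, PAGES_PER_BLOCK)
    return block_map[lblock] * PAGES_PER_BLOCK + offset

print(translate(70))   # logical block 1, offset 6 -> physical page 3*64+6 = 198
```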
Basic mapping schemes
c- Hybrid mapping scheme
Uses both block and page mappings
Example: block mapping (BM) overall, with some data blocks page mapped (PM)
Current designs use either:
  One global block map plus log blocks
  A flash memory partitioned into one BM region and one PM region
Goal: the performance of PM with a memory footprint approaching BM
FTL complex mapping schemes
FTL mapping:
  Block mapping
  Page mapping
  Hybrid mapping:
    Global block map / log blocks
    Region partitioning
Global block map / log block based FTL
Problem with block mapping: each page update costs one block erase plus a copy of the block's other pages
[Shinohara99]: log pages in each block
  The OOB (out-of-band) area saves the address of the data page written to the spare area
[Ban99]: log blocks, i.e. blocks dedicated to absorbing data updates
  ANAND: page offsets are respected in log blocks (con: no more than one update of the same page before data and log blocks are merged)
  FMAX: allows associativity in log blocks (uses the OOB area)
  1-to-1 data/log block correspondence
[Diagram: successive updates of pages 1, 2 and 3 landing in the log pages of a data block]
Log block based FTL -2-
Merge operations:
[RNFTL10] Reuse-Aware NAND FTL: only 46% of data blocks are full when a merge operation happens
  → erase only the log block, and use the free rest of the data block as log pages for other blocks
[BAST02] Block Associative Sector Translation FTL:
  FMAX with log blocks
  Log blocks are managed by a page mapping table
  One log block for a given data block
[Diagram: a data block holding A, B, C and its log block holding updates A', A'', B'; data and log blocks are merged into a free block, as sketched below]
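A schematic, BAST-style sketch of the log-block mechanism described above (one dedicated log block per data block, full merge into a free block; the structures are hypothetical):

```python
# BAST-style log block: updates to a data block go to its dedicated
# log block; when the log block fills up, data + log merge into a free block.

PAGES = 4

class Block(list):
    def __init__(self):
        super().__init__([None] * PAGES)

def update(data_blk, log_blk, log_map, page_no, value):
    """Write the new version to the next free log page."""
    slot = log_blk.index(None)          # next free page in the log block
    log_blk[slot] = value
    log_map[page_no] = slot             # page-mapping table for the log block
    if None not in log_blk:             # log block full -> merge
        return merge(data_blk, log_blk, log_map)
    return data_blk, log_blk, log_map

def merge(data_blk, log_blk, log_map):
    """Copy the freshest version of each page into a free block,
    then erase (recycle) the old data and log blocks."""
    new_blk = Block()
    for p in range(PAGES):
        new_blk[p] = log_blk[log_map[p]] if p in log_map else data_blk[p]
    return new_blk, Block(), {}         # erased blocks come back empty

data, log, lmap = Block(), Block(), {}
data[:] = ["A", "B", "C", "D"]
for pg, val in [(0, "A'"), (0, "A''"), (1, "B'"), (2, "C'")]:
    data, log, lmap = update(data, log, lmap, pg, val)
print(data)   # -> ["A''", "B'", "C'", 'D'] after the merge
```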
Log block based FTL -3-
[FAST07] Fully Associative Sector Translation FTL
  Previously, log blocks were poorly filled because of the limited associativity
  Divides the log blocks into 2 regions (spatial locality):
    One sequentially written log block
    Randomly accessed log blocks, fully associative and page mapped
[LAST08] Locality Aware Sector Translation FTL
  FAST does fewer merge operations, but they are very costly because the pages come from many different blocks
  As in FAST, 2 regions:
    Big writes → sequential log blocks
    Small writes → random log blocks
  Within the random region, hot and cold regions avoid costly merge operations
Log block based FTL -4-
[EAST08] Efficient and Advanced Space management Technique
  In FAST, log blocks are badly used ... yet not full
  No sequential and random regions, but:
    In-place data updates in log blocks at first
    Out-of-place data updates if further updates arrive
  The number of log blocks that can be dedicated to one data block is fixed according to the flash characteristics
[KAST09] K-Associative Sector Translation FTL
  In FAST, merge operations on the random region are costly
  Limits the associativity of log blocks to K
  Many sequential log blocks; migration between the random and sequential regions
[STAFF07] State Transition Applied Fast FTL
  A block is in one of five states: (F)ree, (M)odified in-place, complete in-place (S)tate, modified out-of-place (N), or (O)bsolete (no valid data)
  A page mapping table is kept for blocks in the N state
  RAM usage is unpredictable because of the N state
[HFTL09] Hybrid FTL
  Uses a hot data identifier:
    Hot data are page mapped
    Cold data are handled by FAST
Hybrid FTLs, region partitioning
[Flow: logical sector number → HFTL hot data identifier → page map (hot data) or FAST (cold data) → flash memory]
Hybrid FTLs, region partitioning -2-
[WAFTL11] Workload Adaptive FTL
  Page mapped region: random data and partial updates
  Block mapped region: sequential data and mapping tables
Source: http://storageconference.org/2011/Presentations/Research/6.Wei.pdf
Page mapping FTL
[DFTL09] Demand-based FTL
  Idea: use page mapping but keep only a part of the mapping table in RAM
  The rest of the mapping table is stored in the flash itself
[SFTL11] Spatial locality FTL
  Reduces the size of the mapping table by keeping track of sequential accesses and using only one table entry per sequential run
Source: [DFTL09]
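The DFTL idea reduces to a bounded, LRU-managed cache of map entries backed by map pages stored on flash; a sketch under that reading (names and structures are assumptions):

```python
from collections import OrderedDict

# DFTL-style cached mapping table: a bounded LRU cache of
# logical->physical entries backed by map pages stored on flash.

CACHE_SIZE = 4
cached_map = OrderedDict()          # in-RAM slice of the page map
flash_map = {i: 100 + i for i in range(64)}  # stand-in for on-flash map pages

def lookup(lpn):
    if lpn in cached_map:                # hit: refresh the LRU position
        cached_map.move_to_end(lpn)
        return cached_map[lpn]
    ppn = flash_map[lpn]                 # miss: read the map page from flash
    cached_map[lpn] = ppn
    if len(cached_map) > CACHE_SIZE:     # evict the coldest entry
        victim, v_ppn = cached_map.popitem(last=False)
        flash_map[victim] = v_ppn        # write back (if it was dirty)
    return ppn

print(lookup(5))                         # -> 105
```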
The Apple iPad tablet
Virtual Flash Layer (VFL): remaps bad blocks and presents an error-free NAND to the FTL layer
FTL: YaFTL (Yet another FTL)
http://esec-lab.sogeti.com/post/Low-level-iOS-forensics
YaFTL
Page mapping (implemented as a walk table)
DFTL principles: a part of the map is stored on the flash
Splits the virtual address space into superblocks
3 types of superblocks:
  User data pages
  Index pages (page map)
  Context (index of the index pages + erase counts)
FTL (very partial) taxonomy
FTL mapping:
  Block mapping
  Page mapping: DFTL, CDFTL, SFTL
  Hybrid mapping:
    Global block map: Mitsubishi [Shinohara99], ANAND & FMAX, RNFTL, BAST, FAST, LAST, EAST
    Region partitioning: STAFF, WAFTL, HFTL
See [Boukhobza13] for more details
Wear leveling
Objective: keep the whole flash memory space usable as long as possible
Based on the number of erasures or writes performed on each block:
  below the mean value: cold block
  above the mean value: hot block
Keep the gap between hot and cold blocks as small as possible
Swap data from hot blocks to cold blocks (costly)
Which blocks are concerned: only the free ones? All of them?
[DualPool95] adds 2 additional block mapping tables (sketched below):
  A hot table and a cold table: free blocks are taken from the cold blocks
  Periodically, the mean erase count is recalculated and the hot and cold tables are updated
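A rough sketch of the dual-pool classification around the mean erase count (simplified; the actual patented logic is more involved):

```python
# Dual-pool-style wear leveling sketch: blocks are split into hot and
# cold pools around the mean erase count; free blocks are allocated
# from the cold pool so little-erased blocks get used more.

erase_count = {b: 0 for b in range(8)}   # per-block erase counters

def rebalance():
    mean = sum(erase_count.values()) / len(erase_count)
    hot  = {b for b, c in erase_count.items() if c > mean}
    cold = {b for b, c in erase_count.items() if c <= mean}
    return hot, cold

def allocate(free_blocks):
    """Pick the free block with the fewest erasures (from the cold pool)."""
    _, cold = rebalance()
    candidates = [b for b in free_blocks if b in cold] or list(free_blocks)
    return min(candidates, key=lambda b: erase_count[b])

erase_count[2] = 7           # block 2 has been erased a lot
print(allocate({2, 5}))      # -> 5 (a colder free block)
```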
Wear leveling -2-
Write-count-based wear levelers:
[Achiwa99]: erase counts maintained in RAM in addition to write counts; puts the most-written data into the least-erased blocks
[Chang07]: like the dual pool, but based on the number of writes (at page level)
[Kwon11]: considers groups of blocks, reducing the RAM usage of the wear leveler
Garbage Collection / cleaning policy
A process that recycles free space from previously invalidated pages in different blocks.
It answers the questions:
1. When should it be launched?
2. Which blocks should be chosen, and how many?
3. How should the valid data be rewritten?
4. (Where should new data be written? → the wear leveler's job)
Example answers (illustrated on the slide):
1. When the number of free blocks drops below 3
2. The blocks containing the most invalid pages to free; here 3 blocks (1, 3, 5)
3. Respecting the pages' placement
Garbage Collection -2-
Minimize the cleaning cost while maximizing the cleaned space
Cleaning cost:
  Number of erase operations
  Number of valid-page copies
Main metric considered: the ratio of dirty pages in each block
  Counters are maintained in RAM or in the metadata area
Separate cold and hot data when copying valid pages (question 3 on the previous slide)
  Otherwise the GC is launched frequently (due to hot data updates)
[DAC08] Dynamic dAta Clustering: partitions the flash memory into many regions depending on update frequency
Garbage Collection -3-
Which blocks to choose (question 2):
Greedy policy: the blocks with the most dirty pages
  Efficient if the flash memory is accessed uniformly
[Kawaguchi95]:
  With u the utilization of a block, the space recycled is (1-u) and the cost of reading and writing the valid data is 2u
  age: elapsed time since the last modification
  score = age × (1-u) / 2u; the block with the highest score is chosen
[CAT99] Cost Age Times:
  Hot blocks are given more time to accumulate invalid data → direct erase without copies
  score = CleaningCost × (1/age) × NumberOfCleaning
    CleaningCost = u/(1-u), with u the percentage of valid data in the block
    NumberOfCleaning: the number of erases the block has already undergone
  The lower the score, the more likely the block is to be cleaned (all three policies are sketched below)
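The three victim-selection policies boil down to scoring functions over per-block statistics; a sketch with made-up block records:

```python
# Victim-selection scores for the three GC policies above.
# u = fraction of valid data in the block, age = time since last write.

def greedy_score(block):
    return 1.0 - block["u"]                      # most invalid pages first

def kawaguchi_score(block):
    # Benefit/cost: reclaimed space (1-u) vs. copying valid data (2u),
    # weighted by age; the HIGHEST score is cleaned first.
    return block["age"] * (1.0 - block["u"]) / (2.0 * block["u"])

def cat_score(block):
    # Cost Age Times: the LOWEST score is cleaned first.
    cleaning_cost = block["u"] / (1.0 - block["u"])
    return cleaning_cost * (1.0 / block["age"]) * block["cleanings"]

blocks = [
    {"id": 1, "u": 0.25, "age": 10.0, "cleanings": 3},
    {"id": 2, "u": 0.75, "age": 50.0, "cleanings": 1},
]
victim = max(blocks, key=kawaguchi_score)        # CAT would use min(...)
print(victim["id"])                              # -> 1
```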
Flash specific cache systems
Flash-specific cache systems are mostly write-specific and try to:
  absorb most page/block write operations at the cache level
  reveal sequentiality by buffering write operations and reorganizing them
[CFLRU06] Clean First LRU (caches reads & writes):
  The LRU list is divided into two regions:
    A working region: recently accessed pages
    A clean-first region: candidates for eviction
  Clean pages are evicted first since they do not generate any write (e.g. P7, P5, P8, then P6; sketched below)
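A compact sketch of CFLRU's clean-first eviction (the window size and structures are assumptions):

```python
from collections import OrderedDict

# CFLRU-style eviction sketch: within a clean-first window at the LRU
# end, evict clean pages first; fall back to the plain LRU page.

WINDOW = 3   # size of the clean-first region

def pick_victim(lru):          # lru: OrderedDict page -> is_dirty (LRU first)
    tail = list(lru.items())[:WINDOW]      # the clean-first region
    for page, dirty in tail:
        if not dirty:
            return page                    # clean eviction: no flash write
    return next(iter(lru))                 # all dirty: plain LRU victim

cache = OrderedDict([("P6", True), ("P7", False), ("P5", False), ("P8", False)])
print(pick_victim(cache))                  # -> 'P7' (oldest clean page)
```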
Flash specific cache systems -2-
[FAB06] Flash Aware Buffer
  Flushes the largest group of pages belonging to the same block, to minimize merges (sketched below)
  If several groups have the same number of pages: uses LRU
[BPLRU08] Block Padding LRU
  Block-level LRU scheme
  Page padding: read the missing pages and flush only full blocks
  LRU compensation: sequentially written data are moved to the end of the LRU queue
In short: LRU-like algorithms, at page/block granularity, with a double objective: caching and reducing erasures
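FAB's flush choice, sketched (the buffered groups and LRU order are made-up examples):

```python
# FAB-style flush selection: evict the block with the most buffered
# pages (largest flush, fewest merges); ties broken by LRU order.

buffered = {           # block id -> list of buffered page numbers
    10: [0, 1, 2, 3],
    11: [5],
    12: [8, 9],
}
lru_order = [11, 10, 12]   # least recently used block first

def pick_flush_victim():
    most = max(len(pages) for pages in buffered.values())
    ties = [b for b, pages in buffered.items() if len(pages) == most]
    return min(ties, key=lru_order.index)   # LRU among the largest groups

print(pick_flush_victim())   # -> 10 (4 buffered pages)
```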
C-lash: a cache for flash [C-lash11]
A hierarchical cache, with no wear leveling or garbage collection
2 regions:
  A page region (P-space)
  A block region (B-space)
Read operations:
  Hit: read from the cache
  Miss: no copy into the cache
Write operations:
  Hit: update in the cache
  Miss: write into the P-space
C-lash: a cache for flash -2-
2 eviction policies (sketched below):
  P-space → B-space: the largest set of pages from the same block (spatial locality)
  B-space → flash: LRU (temporal locality)
Early and late cache merges in the B-space
A switch mechanism between P-space and B-space
Showed good performance for mostly sequential workloads
Bad for very random ones
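Our reading of the two-level eviction path, sketched with hypothetical structures:

```python
# C-lash-style two-level eviction sketch: pages are grouped by block
# when leaving the P-space; whole blocks leave the B-space in LRU order.

p_space = {("B1", 0), ("B1", 1), ("B2", 5)}   # cached (block, page) pairs
b_space = ["B7", "B4"]                        # cached blocks, LRU first

def evict_from_p_space():
    """Move the largest same-block group of pages to the B-space."""
    blocks = {b for b, _ in p_space}
    victim = max(blocks, key=lambda b: sum(1 for bb, _ in p_space if bb == b))
    moved = {(b, p) for b, p in p_space if b == victim}
    p_space.difference_update(moved)
    b_space.append(victim)                    # most recently inserted
    return victim

def evict_from_b_space():
    """LRU: the oldest block in the B-space is flushed to flash."""
    return b_space.pop(0)

print(evict_from_p_space())   # -> 'B1' (2 pages beat 1)
print(evict_from_b_space())   # -> 'B7'
```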
CACH-FTL: Cache-Aware Configurable Hybrid FTL [CACH-FTL13]
Most flash systems have cache mechanisms on top of the FTL
Most flash-specific cache systems flush groups of pages
[Diagram: a flash-specific cache system flushing groups of pages down to the FTL]
CACH-FTL -2-
PMR (Page Mapped Region) garbage collection: launched when the number of free blocks in the PMR drops below a predefined threshold
  Greedy reclamation algorithm (least number of valid pages)
BMR (Block Mapped Region) garbage collection: launched when the PMR-GC cannot find any physical block with enough invalid pages to recycle a block
  Greedy reclamation algorithm selecting the largest group of PMR pages belonging to the same data block
CACH-FTL -3-
Evaluation setup:
  FlashSim + DiskSim simulator
  CACH-FTL configuration: threshold = 8 pages, over-provisioning of 10%
  Real OLTP traces + synthetic traces
CACH-FTL -4-
[Results charts]
Performance and energy considerations
2006-2012: nearly exponential growth of the published work on flash memory
'Tape is dead, disk is tape, flash is disk, RAM locality is king', Jim Gray, 2006
Flash disks outperform hard disk drives (HDD) on:
  Sequential reads and writes
  Random reads (no mechanical parts)
Random writes are their Achilles' heel
  This depends on the flash intricacies
Flash disks are generally more energy efficient:
  More than 5x less energy in some cases [Park11]
Performance and energy -2-
Flash disk performance is heterogeneous
  It depends on the internal structure and on the workload
  Performance disparities between SSDs from the same manufacturer, and between different technologies, are significant [Park11]
Performance and energy -3-
There is a wide performance and energy asymmetry between reads and writes [Park11]
The more free space, the better the write performance
Performance and energy -4-
Flash performance needs time to reach a steady state.
Source: http://snia.org/sites/default/files/SSS%20PTS%20Client%20-%20v1.1.pdf
Performance and energy -5-
The flash memory design space is large!
  Different FTLs (mapping, WL, GC)
  Degree of concurrency [Agrawal08]:
    Parallel requests: parallel requests to each element of the flash array, with one queue per element
    Ganging: using a gang of flash elements in synchrony to optimize multi-page requests
    Interleaving: within a die
  Background cleaning
Capacity > 2TB, cost 0.7€/GB
Source: Bonnet, Bouganim, Koltsidas, Viglas, VLDB 2011
Performance and energy -6-
Intra-SSD parallelism:
  SSD system level: among channels
  Chip level: among the chips of a channel
  Die level: among the dies of a chip
  Plane level: among the planes of a die
[Diagram: channel → package → chip → die → plane → block → page hierarchy]
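One common way to exploit those levels is to stripe consecutive page numbers across them by decomposing the physical page number; a sketch with an illustrative geometry (real layouts are vendor-specific):

```python
# Decompose a flat page number into the parallelism levels of an SSD.
# The geometry below is illustrative only; real devices differ.
LEVELS = [("channel", 4), ("chip", 2), ("die", 2), ("plane", 2),
          ("block", 1024), ("page", 64)]

def decompose(n):
    """Channel is the fastest-varying coordinate, so consecutive page
    numbers stripe across channels, then chips, dies, and planes."""
    coords = {}
    for name, size in LEVELS:
        coords[name] = n % size
        n //= size
    return coords

print(decompose(0)["channel"], decompose(1)["channel"])   # -> 0 1
```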
Contributions
Power consumption & performance modeling of embedded systems with an embedded OS (Pierre Olivier's PhD)
Performance and energy consumption at different layers
Microbenchmarking different FFS under different initial states
Simple, atomic accesses using the legacy NAND commands:
  Read and write (page)
  Erase (block)
Linux: MTD Device versus Block Device
Block device vs. MTD device:
  A block device consists of sectors; an MTD device consists of eraseblocks
  Sectors are small (512 or 1024 bytes); eraseblocks are larger (typically 128 KiB)
  A block device has 2 main operations (read sector, write sector); an MTD device has 3 (read from eraseblock, write to eraseblock, erase eraseblock)
  Bad sectors are remapped and hidden by the hardware; bad eraseblocks are not hidden and must be dealt with in software
  Sectors are devoid of the wear-out property; eraseblocks wear out and become bad and unusable
Source: http://www.linux-mtd.infradead.org/
Performance
Test programs:
  Kernel level: modules (low-level MTD calls)
  MTD userspace: shell scripts
Example: writes (MTD kernel level)
Power consumption
Same test programs, maximum access rate on the test partition
Example: erases (MTD kernel level)
https://www.open-people.fr
Used tool: Flashmon [Flashmon11]
A flash access profiler (kernel module) for raw-flash-based Linux embedded systems
Monitors and logs events (read, write, erase)
Stores an access counter for each block of the monitored flash memory
Linux programs are profiled with Flashmon, and the measured flash access counts are injected into the models to estimate flash I/O time and power consumption
Flashmon -2-
Flash based subsystems
Flash-based subsystem = flash memory + controller (FTL, wear leveler) + (buffer) + hardware interface and software API
Examples:
  Solid State Drive: ATA, SATA, PCIe
  USB mass storage
  ...
[Diagram: application → standard file system → USB/SCSI driver → flash memory device embedding the FTL (logical-to-physical mapping, garbage collection, wear leveling) → flash memory array]
Memory Card & Drive
Card type: interface (command set)
  CompactFlash: N + PCMCIA, PC Card (PATA)
  MMC/SD: N + SPI (MMC)
  XQD: N + PCI Express
  USB flash drive: USB (SCSI)
  UFS: UniPro (SCSI)
  CFast: N + SATA (SCSI)
  SSD drives: SATA, SAS, NVM Express
(N = native)
Flash memory integration
Source: J. Cooke, Micron, Flash Summit 2012
Integrated with RAM:
  Storage cache
  Hybrid system
  SSD storage
Universal Flash Storage (UFS)
JEDEC standard v1.1, 2012
Features:
  Multiple command queues, multiple partitions, parallel functions, boot partition
  Maximum interface speed: 5.8 Gbps
  Command set: SCSI + UFS-specific commands
  Compatible with eMMC
  Serial interconnection (UniPro), chain topology
http://www.toshiba-components.com/ufs/index.html
Real-Time Constraints
Full-duplex host interface
High Priority Interrupt
[Timing diagram: MP3 playback (reads) competing with an application download (writes and merges); without prioritization a read misses its deadline, with the high-priority interrupt it does not]
Multiple NAND channels & partitions
Full utilization of interleaving across NAND channels
Read-while-write (full-duplex) and dual write across multiple channels
Multiple partitions, with a possible technology mix: partition 1 in SLC, partition 2 in TLC (TLC optional)
Asynchronous commands
[Diagram: UFS host driving a UFS device with two NAND channels and multiple partitions]
Interface / product
Conclusions & perspectives
Source: E. Grochowski, 'Future Technology Challenges For NAND Flash And HDD Products', Flash Summit 2012
Just one of a team!
Conclusions & perspectives -2-
Integration of non-volatile memory into the operating system stack (not only the storage stack)
Coexistence of different non-volatile memories
Toward predictability of flash memory access times: real-time systems
Database cost models
Tackling the data centers' energy/performance bottleneck: storage represents 20% to 40% of the total energy consumption [Carter10]
  EMC forecasts that the amount of digital information created annually will grow by a factor of 44 from 2009 to 2020 [Farmer10]
  Microsoft, Google, and Yahoo are showing the way ...
Energy proportionality
Architectural design space exploration tools
...
References
[Shinohara99] Shinohara, T. (1999), Flash Memory Card with Block Memory Address Arrangement, United States Patent, No
5,905,993.
[Ban99] Ban, A. (1999), Flash File System Optimized for Page-mode Flash Technologies, United States Patent, No 5,937,425.
[RNFTL10] Wang, Y., Liu, D., Wang, M., Qin, Z., Shao, Z., & Guan, Y. (2010), RNFTL: a Reuse-aware NAND Flash Translation Layer
for Flash Memory, In Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, compilers, and tools for embedded systems
(LCTES).
[BAST02] Kim, J., Kim, J. M., Noh, S. H., Min, S. L., & Cho, Y. (2002), A Space-Efficient Flash Translation Layer for Compact Flash Systems, IEEE Transactions on Consumer Electronics, 48(2), 366-375.
[FAST07] Lee, S., Park, D., Chung, T., Lee, D., Park, S., & Song, H. (2007), A Log Buffer Based Flash Translation Layer Using Fully
Associative Sector Translation, ACM Transactions on Embedded Computing Systems, 6(3), 1-27.
[LAST08] Lee, S., Shin, D., Kim, Y., & Kim, J. (2008), LAST: Locality Aware Sector Translation for NAND Flash Memory Based
Storage Systems, ACM SIGOPS Operating Systems Review, 42(6), 36-42.
[EAST08] Kwon, S. J., & Chung, T. (2008), An Efficient and Advanced Space-management Technique for Flash Memory Using
Reallocation Blocks, IEEE Transactions on Consumer Electronics, 54(2), 631-638.
[KAST09] Cho, H., Shin, D., & Eom, Y. I. (2009), KAST: K-associative sector translation for NAND flash memory in real-time
systems. In Proceedings of Design, Automation, and Test in Europe (DATE), 507-512.
[STAFF07] Chung, T. S., & Park, H. S. (2007), STAFF: a Flash Driver Algorithm Minimizing Block Erasures, Journal of Systems
Architectures, 53(12), 889-901.
[HFTL09] Lee, H., Yun, H., & Lee, D. (2009), HFTL: Hybrid Flash Translation Layer Based on Hot Data Identification for Flash
Memory, IEEE Transactions on Consumer Electronics, 55(4), 2005-2011.
[WAFTL11] Wei, Q., Gong, B., Pathak, S., Veeravalli, B., Zeng, L., & Okada, K. (2011), WAFTL: A Workload Adaptive Flash
Translation Layer with Data Partition, In Proceedings of 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST).
[DFTL09] Gupta, A., Kim, Y., & Urgaonkar, B. (2009), DFTL: a Flash Translation Layer Employing Demand-based Selective Caching of
Page-level Address mappings, In Proceedings of the 14th international conference on Architectural support for programming languages and
operating systems (ASPLOS).
[SFTL11] Jiang, S., Zhang, L., Yuan, X., Hu, H., & Chen, Y. (2011), SFTL: An Efficient Address Translation for Flash Memory by
Exploiting Spatial Locality, In Proceedings of 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST).
[Boukhobza13] Boukhobza, J. (2013), Flashing in the Cloud: Shedding some Light on NAND Flash Memory Storage System, chapter
to appear in Data Intensive Storage Services for Cloud Environment, IGI Global.
References -2-
[DualPool95] Assar, M., Namazie, S., & Estakhri, P. (1995), Flash Memory Mass Storage Architecture Incorporation Wear Leveling
Technique, United States Patent, No 5,479,638.
[Achiwa99] Achiwa, K., Yamamoto, A., & Yamagata, O. (1999), Memory Systems Using a Flash Memory and Method for Controlling the Memory System, United States Patent, No 5,930,193.
[Chang07] Chang, L. (2007), On Efficient Wear Leveling for Large-scale Flash-Memory Storage Systems, In Proceedings of the 2007
ACM Symposium on Applied Computing (SAC).
[Kwon11] Kwon S. J., Ranjitkar, A., Ko, Y., & Chung, T. (2011), FTL Algorithms for NAND-type Flash Memories, Design Automation
for Embedded Systems, 15(3-4), 191-224.
[DAC08] Chiang, M., & Chang, R. C. (1999), Cleaning Policies in Mobile Computers Using Flash Memories, Journal of Systems and
Software, 48(3), 213-231.
[Kawaguchi95] Kawaguchi, A., Nishioka, S., & Motoda, H. (1995), A Flash Memory Based File System, In Proceedings of the USENIX
1995 Annual Technical Conference (ATC).
[CAT99] Chiang, M., & Chang, R. C. (1999), Cleaning Policies in Mobile Computers Using Flash Memories, Journal of Systems and
Software, 48(3), 213-231.
[CFLRU06] Park, S., Jung, D., Kang, J., Kim, J., & Lee, J. (2006), CFLRU: a Replacement Algorithm for Flash Memory, In Proceedings of
the 2006 International conference on Compilers, architecture and synthesis for embedded systems (CASES).
[FAB06] Jo, H., Kang, J., Park, S., Kim, J., & Lee. J. (2006), FAB: a Flash-aware Buffer Management Policy for Portable Media Players,
IEEE Transactions on Consumer Electronics, 52(2), 485-493.
[BPLRU08] Kim, H., & Ahn, S. (2008), BPLRU: A Buffer Management Scheme for Improving Random Writes in Flash Storage, In
Proceedings of the 6th USENIX Conference on File and Storage Technologies (FAST).
[C-lash11] Boukhobza, J., Olivier, P., & Rubini, S. (2011), A Cache Management Strategy To Replace Wear Leveling Techniques for
Embedded Flash Memory, 2011 International Symposium on Performance Evaluation of Computer & Telecommunication Systems (SPECTS).
[CACH-FTL13] Boukhobza, J., Olivier, P., & Rubini, S. (2013), A Cache-Aware Configurable Hybrid Flash Translation Layer, To appear in the 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP).
[Park11] Park, S., Kim, Y., Urgaonkar, B., Lee, J., & Seo, E. (2011), A Comprehensive Study on Energy Efficiency of Flash Memory
Storages, Journal of Systems Architecture (JSA), 57(4), 354-365.
[Agrawal08] Agrawal, N., Prabhakaran, V., Wobber, T., Davis, J. D., Manasse, M., & Panigrahy, R. (2008), Design Tradeoffs for SSD
Performance, In Proceedings of the USENIX Annual Technical Conference (ATC).
[Flashmon11] Boukhobza, J., Khetib, I., & Olivier, P. (2011), Flashmon: un outil de trace pour les accès à la mémoire flash NAND, In Proceedings of the Embed With Linux Workshop, France.
Price of storage
Price of NAND storage
SoC: TI OMAP4 Mobile Applications Platform
Development of planned features for smartphones and mobile Internet devices
OMAP Boot System
Flash in the boot process
NOR flash:
  Boot code, executables, configuration data
  XIP: eXecute In Place capability
NAND flash:
  eMMC, USB mass storage
  The boot code is copied into RAM before being executed
Pre-flashing: the code provided by a peripheral is automatically stored into a flash memory for the next boots.
NAND Flash with NOR interface
Samsung OneNAND
The simplest interface (a NOR-style front end to a NAND array), XIP, prefetch
[Diagram: host interface, internal registers, error correction, state machine + boot loader, boot code and data RAM buffers, address/data buses, NAND array]