Student Number: 100747819
Gaetan Cardinal
The dark and the bright side of Solid State Drives for computer forensics in
the enterprise
Supervisor: Lorenzo Cavallaro
Submitted as part of the requirements for the award of the
MSc in Information Security
at Royal Holloway, University of London.
I declare that this assignment is all my own work and that I have acknowledged all quotations
from published or unpublished work of other people. I also declare that I have read the
statements on plagiarism in Section 1 of the Regulations Governing Examination and
Assessment Offences, and in accordance with these regulations I submit this project report as
my own work.
Signature:
Date:
Acknowledgements
My sincere thanks to my wife Marie-Afining for her patience, encouragement and continuous
assistance during the course of my MSc adventure.
I would like to express my gratitude to my colleagues who helped me during the MSc and the
writing of this thesis, and especially to Chris and Vicente for their uninterrupted support, Jean-
François and Emmanuel for their very valuable technical comments, John for having been kind
enough to correct my English prose, and Jacek for his logistical support.
Many thanks to my project supervisor Lorenzo Cavallaro and the Royal Holloway lecturers for
sharing their knowledge and for their real commitment to assisting their students.
Special thanks to Chez Ciechanowicz for his availability and kindness and to people from the GCSEC
who made us feel very welcome in their facilities.
Table of Contents
Acknowledgements .......................................................................................................................... 2
List of figures .................................................................................................................................... 5
List of tables ..................................................................................................................................... 5
Glossary of terms.............................................................................................................................. 6
Executive Summary .......................................................................................................................... 7
1 Introduction ................................................................................................................................... 8
1.1 Objectives ............................................................................................................................... 8
1.2 Structure ................................................................................................................................. 9
2 SSD technology concepts ............................................................................................................ 10
2.1 SSD components ................................................................................................................... 10
2.1.1 NAND Flash .................................................................................................................... 10
2.1.2 SSD Controller ................................................................................................................ 12
2.2 Overview of Read and Write operations .............................................................................. 13
2.3 Advantages of SSD compared to HDD .................................................................................. 15
2.4 Drawbacks of SSD compared to HDD ................................................................................... 16
2.4.1 Performance degradation .............................................................................................. 17
2.4.2 NAND flash lifetime ....................................................................................................... 18
3 SSD lifetime and performance optimization ............................................................................... 20
3.1 Overprovisioning .................................................................................................................. 20
3.2 Wear Leveling ....................................................................................................................... 21
3.3 Garbage collection ................................................................................................................ 23
3.4 TRIM ...................................................................................................................................... 24
3.5 Writeless technologies ......................................................................................................... 25
3.5.1 Write combining ............................................................................................................ 26
3.5.2 Data deduplication ........................................................................................................ 26
3.5.3 Compression .................................................................................................................. 28
3.6 Conclusion ............................................................................................................................ 29
4 SSD impact on computer forensics in the enterprise .................................................................. 30
4.1 Logical Data Recovery ........................................................................................................... 30
4.1.1 Quick formatting ............................................................................................................ 30
4.1.2 Deleted data .................................................................................................................. 33
4.1.3 Slack space ..................................................................................................................... 34
4.2 Physical data recovery .......................................................................................................... 34
4.2.1 Techniques overview ..................................................................................................... 35
4.2.2 Benefits compared to data recovery at logical layer ..................................................... 36
4.3 Forensic Imaging ................................................................................................................... 38
4.3.1 Write blocker efficiency ................................................................................................. 38
4.3.2 Forensic imaging challenge ............................................................................................ 39
4.4 Data Sanitization ................................................................................................................... 42
4.4.1 File secure wiping .......................................................................................................... 42
4.4.2 Drive sanitization ........................................................................................................... 43
4.4.3 Data remanence ............................................................................................................ 44
4.4.4 Physical destruction ....................................................................................................... 45
4.5 Chapter conclusions ............................................................................................................. 46
5 Analysis of potential mitigation techniques in the Enterprise .................................................... 47
5.1 SSD encryption ...................................................................................................................... 47
5.1.1 Software based encryption ............................................................................................ 47
5.1.2 Self-Encrypting Drives ................................................................................................... 49
5.2 Software optimization proposal ........................................................................................... 51
5.2.1 Drive imaging ................................................................................................................. 51
5.2.2 Logical data recovery ..................................................................................................... 52
5.2.3 File sanitization .............................................................................................................. 55
5.3 Chapter conclusions ............................................................................................................. 56
6 Conclusion ................................................................................................................................... 57
6.1 Assessment of objectives ..................................................................................................... 57
6.2 Conclusion ............................................................................................................................ 59
Bibliography .................................................................................................................................... 60
Annex A: Tests details .................................................................................................................... 65
List of figures
Figure 1: SSD physical architecture ................................................................................................ 10
Figure 2: Flash chip architecture .................................................................................................... 10
Figure 3: SLC and MLC voltage ....................................................................................................... 11
Figure 4: SLC vs MLC performance ................................................................................................. 11
Figure 5: Flash Translation Layer .................................................................................................... 12
Figure 6: write operation example (1) ........................................................................................... 14
Figure 7: write operation example (2) ........................................................................................... 14
Figure 8: write operation example (3) ........................................................................................... 14
Figure 9: write operation example (4) ........................................................................................... 14
Figure 10: write operation example (5) ......................................................................................... 15
Figure 11: The 3 overprovisioning levels ........................................................................................ 20
Figure 12: Wear leveling impact on cell usage ............................................................................... 22
Figure 13: Logical view of the Garbage Collection process ............................................................ 23
Figure 14: Write Amplification during Garbage Collection ............................................................ 24
Figure 15: Small write combining ................................................................................................... 26
Figure 16: Data deduplication example ......................................................................................... 27
Figure 17: Illustration of Slack Space on HDD ................................................................................ 34
Figure 18: XOR effect ...................................................................................................................... 35
Figure 19: Examples of inefficient destructive techniques ............................................................ 45
Figure 20: SED security implementation ........................................................................................ 49
List of tables
Table 1: I/O Operations Timings of SLC and MLC ........................................................................... 17
Table 2: Write cycles comparison between SLC, MLC and TLC ...................................................... 18
Table 3: Percent blocks of test file recovered after SSD formatting .............................................. 32
Table 4: Percent blocks of test file recovered after file deletion ................................................... 33
Table 5: Drive Sanitization with overwriting techniques ............................................................... 44
Table 6: Drive encryption performance impact ............................................................................. 48
Glossary of terms
AK Authentication Key
ATA Advanced Technology Attachment
DEK Data Encryption Key
ETRIM Enforceable TRIM
FTL Flash Translation Layer
GC Garbage Collection
HDD Hard Disk Drive
IOPS Input/Output Operations Per Second
LBA Logical Block Address
MD5 Message Digest 5 algorithm
MLC Multi Level Cell
NOR Not OR
NAND Negated AND
NIST National Institute of Standards and Technology
NTFS New Technology File System
OP Over-Provisioning
OS Operating System
PBA Physical Block Address
PCI Peripheral Component Interconnect
P/E Program Erase
PGP Pretty Good Privacy
RAM Random Access Memory
SATA Serial Advanced Technology Attachment
SCSI Small Computer System Interface
SDRAM Synchronous Dynamic Random Access Memory
SED Self-Encrypting Drive
SHA-1 Secure Hash Algorithm v1
SLC Single Level Cell
SRAM Static Random Access Memory
SSD Solid State Drive
USB Universal Serial Bus
TCG Trusted Computing Group
TLC Tri Level Cell
TOD TRIM On Demand
VHS Video Home System
WA Write Amplification
WAF Write Amplification Factor
Executive Summary
Solid State Drives are flash-memory-based drives that provide many benefits compared to
traditional Hard Disk Drives, which explains why they are very likely to continue to spread
significantly in the near future. These drives handle data storage in a completely different way,
which can cause issues when conducting computer forensic analysis with tools designed to work
on Hard Disk Drives. This document aims to explain how this new technology works and how
it impacts computer forensic analysis in the enterprise, especially with regard to the recovery of
deleted data, forensic drive imaging and data sanitization. At the end of the impact assessment,
we will analyze existing solutions which can be used to circumvent these issues and, when no
suitable solution exists, propose our own.
This study demonstrates that deleted data recovery can be affected by the use of Solid State
Drives, but to what extent is very difficult to predict and can vary greatly according to multiple
factors. It also appears that the amount of recoverable deleted data can be very different
depending on whether the analysis is conducted at the logical or at the physical layer. This paper
proposes a solution called TRIM On Demand which allows customizing the retention of deleted
data on the drive at the logical layer, in order to make Solid State Drives more effective than Hard
Disk Drives for the recovery of deleted data.
The document shows that taking a forensic image can be challenging on Solid State Drives as the
internal controller can continue to optimize data storage in the background even after the drive
has been disconnected from the host computer. This study demonstrates that this behavior does
not necessarily prevent taking multiple forensically identical images of a drive and also proposes
a solution to limit the effects of such behavior.
The paper demonstrates that data sanitization can be effective at the drive level when the drive
supports and correctly implements some standard ATA commands. The study also explains how
Self-Encrypting Drives can be used to increase the reliability of drive sanitization and shows that
file sanitization cannot currently be achieved either at the logical or at the physical layer. However,
a solution called ETRIM is proposed in order to make file sanitization possible at the logical layer.
The study also explains why we believe file sanitization at the physical layer is not achievable at
this moment in time.
The conclusion of this study is that computer forensic analysis of Solid State Drives in the
enterprise is currently challenging, mainly because the technology is relatively new, not very well
understood and proprietary, and because of the lack of dedicated tools and standards. However,
this document demonstrates that solutions exist and that there is no technical reason preventing
Solid State Drives from behaving at least as well as, if not better than, Hard Disk Drives in terms
of computer forensic analysis.
1 Introduction
Modern enterprises rely heavily on IT infrastructure to conduct their business, and computers
have proven to be so effective in work optimization that they are spreading everywhere in the
enterprise. With the advent of the Internet, enterprise networks have extended their tentacles to
the whole world and are now interconnected not only with their own entities but also with their
customers, shareholders and partners. These new technologies have provided many benefits in
terms of speed, efficiency and revenue, but they did not come without drawbacks. Among those,
computer crime is one of the main threats faced by enterprises today. This threat can take a
large variety of forms, such as malware infection, inappropriate user behavior or data theft, each
of them potentially leading to disastrous consequences. A common way for enterprises to
confront these problems is to use computer forensic investigation to inquire into suspicious
behavior on their systems. A very useful source of information for such an investigation is the
computer's permanent storage, most of the time in the form of Hard Disk Drives (HDD).
Because this technology did not evolve much during the last decades and was by far the most
common on the market, computer forensic science has been focusing on it for years and most of
the commercial solutions have been developed around HDD. Nowadays things are changing. Hard
Disk Drives are reaching their limits in terms of speed and have become a bottleneck that the
industry wants to get rid of. In the computer speed race, the savior is called the Solid State Drive
(SSD), a new type of drive based on flash memory which dwarfs traditional hard drives in terms of
performance. This promising technology has already started to spread in many different places
but, because it is intrinsically different from HDDs, the traditional computer forensics techniques
in use for years are not adapted to it. Consequently, multiple alarming articles have flourished on
the Web warning about the risks introduced by SSDs with regard to computer forensics. While
some articles contain pertinent information about the impact of this new technology on computer
forensics, we found that several others draw hasty or biased conclusions based on a partial
analysis of the multiple factors involved. This leads to a pessimistic and inaccurate picture of the
situation.
We chose this topic in order to understand where these problems come from, what their exact
nature is and whether something can be done about them. This document focuses mainly
on the computer forensic aspect in the Enterprise, which means we assume we are working in a
fully controlled environment where the storage can be setup according to the Enterprise policy.
1.1 Objectives
This document aims to cover the following subjects:
Explain the technical differences between HDD and SSD technologies and how these
differences interact with data organization on SSD.
Explain the impact of SSD in terms of computer forensics in the following areas:
o Deleted data recovery at the logical layer
o Deleted data recovery at the physical layer
o Forensic drive imaging
o File sanitization
o Drive sanitization
Assess existing solutions with a view to circumventing the drawbacks related to SSD use in
terms of computer forensics.
When no acceptable existing solution can be found, propose innovative solutions in order
to provide at least the same level of functionality on SSD as on HDD in terms of
computer forensics capability.
1.2 Structure
Chapter 2 will explain in detail how SSD technology works, especially with regard to the NAND
flash and the drive controller, and why SSDs are likely to be widely deployed in the future. Based
on these explanations, we will discuss the two main drawbacks of this technology, namely cell
lifetime and performance degradation, which play an important role in the way data is organized
and treated on SSD.
Chapter 3 will explain in detail most of the mechanisms used by SSD controllers in order to
minimize the two problems related to this technology explained in Chapter 2. Among those, we
will discuss Garbage Collection, the TRIM command and Wear Leveling. We will show how these
mechanisms interfere with data organization and optimization and should be considered the root
cause of most of the issues we will describe later on.
Chapter 4 will discuss the practical impact of SSD technology on computer forensics. Based on
several studies and our own tests, we will analyze how SSDs behave in terms of deleted data
recovery at the logical and physical layers, forensic drive imaging, and file and drive sanitization. Some
other aspects such as data recovery in the slack space or data remanence on flash memory will
also be discussed.
Chapter 5 will analyze the effect of software and hardware encryption on data recovery and drive
sanitization. Several innovative solutions will also be proposed in order to improve computer
forensics capability of SSD in regards to file sanitization, forensic drive imaging and data recovery
at the logical layer.
2 SSD technology concepts
Hard Disk Drives (HDD) have existed for decades and the technology they use did not evolve much
during that time, which explains why they are usually well understood. On the other hand, Solid
State Drive (SSD) technology is not only relatively recent but also far more complex and
heterogeneous. This chapter intends to introduce the reader to the technological concepts
required for a good understanding of the matters which will be addressed later on in this
document.
2.1 SSD components
Simply put, an SSD is a device able to store data permanently on flash memory. As shown in Figure 1, the physical architecture of an SSD is composed of four main parts. In addition to the host interface (SATA, SCSI, PCI, ...) used to interconnect the drive with the host computer, the SSD contains the flash memory used to store information persistently and the controller, which can be seen as the interface between the memory chips and the host input/output. Some additional fast, non-persistent RAM is also used by the controller as a caching mechanism.
2.1.1 NAND Flash
2.1.1.1 Internal architecture
According to [LH12], SSDs use non-volatile flash memory based on floating-gate transistors, which
can be seen as very tiny cages able to trap the charge of electrons without the need for any
external power source. Each transistor (also called a cell) can store a single bit, having a value of
1 if the cell is uncharged or 0 if it is charged. The cells are organized into a grid and wired up. The
way the transistors are wired together defines the flash memory type, the two most common
types being NOR (Not OR) and NAND (Negated AND) flash memory. SSDs use NAND flash, which
has been designed to be simpler and smaller than NOR flash.
As you can see in Figure 2, cells are physically grouped into pages, which are assembled into blocks usually made of 64 to 256 pages. Blocks are grouped into planes and multiple planes are found on a single NAND flash die. A single SSD is commonly made of several flash dies which can be accessed in parallel to increase the data throughput. Of these different physical entities, the smallest addressable unit is a page, which means that an SSD can read or write a page but not a cell. A page is quite similar to a sector on an HDD. The size of a page can vary depending on the type of cells used and the vendor's implementation; as an order of magnitude, a page is usually 4 or 8 KB in size.
Figure 1: SSD physical architecture Source [TR10]
Figure 2: Flash chip architecture Source [CK11]
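To put some rough numbers on this hierarchy, the short Python sketch below walks up from page to die. The page and block sizes follow the typical figures given above, while the plane and die counts are purely illustrative assumptions rather than values from any specific datasheet:

```python
# Hypothetical NAND geometry, using values in the ranges quoted above
# (8 KB pages, 128-page blocks); plane and die counts are assumptions.
PAGE_SIZE_KB = 8          # smallest readable/writable unit
PAGES_PER_BLOCK = 128     # smallest erasable unit (64-256 pages is typical)
BLOCKS_PER_PLANE = 1024   # assumed
PLANES_PER_DIE = 2        # assumed

block_kb = PAGE_SIZE_KB * PAGES_PER_BLOCK            # 1024 KB per block
plane_gb = block_kb * BLOCKS_PER_PLANE / (1024**2)   # 1 GB per plane
die_gb = plane_gb * PLANES_PER_DIE                   # 2 GB per die

print(f"block: {block_kb} KB, plane: {plane_gb:g} GB, die: {die_gb:g} GB")
```

Even with these modest assumed figures, the erase granularity (a whole 1 MB block) is two orders of magnitude larger than the write granularity (an 8 KB page), a mismatch that drives much of the controller behavior discussed in this chapter.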
2.1.1.2 Cell types: SLC, MLC and TLC
So far we have seen that a cell is able to store two different states, i.e. one bit of data. This type
of cell is known as SLC, standing for Single Level Cell. Due to the production costs of SLC memory,
the industry developed a new approach called MLC (Multi Level Cell) allowing the storage of four
states instead of two. MLC uses a very similar technology but, for an identical maximum voltage,
the amount of charge is stored more precisely in the floating gate, as shown in Figure 3. This
allows MLC to double the density while keeping the same wafer size, which helps to reduce the
production cost. Nowadays, the concept is pushed further with TLC (Tri Level Cell), which can
store three bits of data per cell. While these new architectures might at first look like a great
improvement, they do not come without drawbacks: increasing the number of bits per cell has
several negative impacts.
Figure 3: SLC and MLC voltage
Source [AK11]
First of all, as shown in Figure 4, write and read performance are severely impacted when
increasing the number of bits per cell. This problem comes from the more precise voltage which
must be applied to the cell to store more distinct states. On SLC, a voltage between 1 and 4V
means the cell is uncharged (value = 1) while a voltage between 4 and 6V means the cell is charged
(value = 0), which leaves a wide acceptable voltage interval (2 to 3V) for each state. For MLC, the
acceptable interval for each state goes down to 1.5V, which implies that more time is needed to
generate this more precise voltage. The principle is similar when measuring the amount of charge
stored in a cell, which impacts the read performance.
Figure 4: SLC vs MLC performance
Source [TA06]
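The narrowing of the per-state voltage windows can be illustrated with a small sketch. The functions below are a toy model, not vendor data: the window boundaries and the MLC bit mapping are assumptions chosen only to show why more states per cell require finer (and therefore slower) programming and sensing:

```python
# Illustrative state discrimination: SLC divides the voltage range into
# two wide windows, MLC into four narrow (~1.5 V) ones. All boundaries
# and bit assignments here are assumptions for illustration only.
def slc_read(voltage):
    """SLC: uncharged (below 4V) reads as 1, charged (above 4V) as 0."""
    return 1 if voltage < 4.0 else 0

def mlc_read(voltage):
    """MLC: four states in ~1.5V windows, each encoding two bits."""
    thresholds = [1.5, 3.0, 4.5]          # assumed window boundaries
    states = ["11", "10", "01", "00"]     # assumed bit mapping per window
    for boundary, bits in zip(thresholds, states):
        if voltage < boundary:
            return bits
    return states[-1]                     # top window: fully charged
```

A voltage drift of 1V is harmless for the SLC reader here but can flip an MLC read into the neighboring state, which hints at the higher error rate discussed below.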
The second issue is related to the error rate, which is higher for cells able to hold more distinct
states because of the increased level of precision required. Finally, the architecture choice also
strongly impacts cell longevity [2.4.2]. This performance and longevity degradation explains why
SLC is still widely used in high-end professional SSDs, where reliability and performance are
essential. MLC is commonly used in consumer products because it offers a good compromise
between price and performance compared to traditional hard disk drives. The first mass-market
products using TLC have recently been released, but we will need to wait a while to see if they
are reliable enough to be widely used.
2.1.2 SSD Controller
According to [TR10], “An SSD controller, also referred to as a processor, includes the electronics
that bridge the Flash memory components to the SSD input/output interfaces. The controller is an
embedded processor that executes firmware-level software. The SSD firmware is device specific,
and in most cases can be updated”. The controller is a critical and powerful component which
performs multiple functions such as erasing, reading from and writing to the flash memory as well
as many other optimization functions such as wear leveling [3.2] and garbage collection [3.3].
Another fundamental role of the controller is to handle flash memory addressing. On a
traditional hard drive, the OS uses LBAs (Logical Block Addresses) to map the file system to the
physical sectors. Because an SSD is seen as a standard hard drive by the OS, and because most
file systems currently in use are sector based, the OS uses the same mapping mechanism.
However, as the SSD flash memory architecture is completely different, an additional mechanism
is required to translate this information. This is achieved through the Flash Translation Layer (FTL).
Figure 5: Flash Translation Layer
Source [KH08]
The FTL is the software interface used to hide the complexity of the flash memory from the
file system. As shown in Figure 5, it does so by translating the LBA to the physical location in
flash memory. In short, the OS believes it knows where the data is physically located, but it does
not. This is a fundamental difference between HDD and SSD. A very interesting example of this is
the fact that with a standard HDD, it is possible to use only a part of the physical drive by creating
a partition of a given size. Doing the same thing with an SSD would result in not knowing which
part of the storage is used. In fact, it is not even possible to limit the amount of storage used by an
SSD, because the controller will use all the storage available. For instance, a partition of 32GB
on a 64GB SSD could result in data being spread across the whole 64GB of storage available. In
order to optimize cell lifetime, the controller could end up having used the full 64GB of memory
if enough data has been deleted and recreated. Of course, from the OS point of view, there is
only 32GB available.
In order to be able to retrieve the information which has been stored on the flash memory, the
FTL must also keep the logical to physical mapping information in an address mapping table. This
table is stored in fast cache (SRAM) but, because this memory is volatile, this information is also
stored in NAND flash to ensure long term retention.
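The remapping performed by the address mapping table can be illustrated with a minimal sketch. The class below is a hypothetical toy model (a real FTL also manages blocks, caching and persistence), but it shows why successive writes to the "same" LBA land on different physical pages, and why a stale copy of the old data lingers on flash until a block erasure:

```python
# A minimal, heavily simplified FTL sketch (illustrative assumption, not
# a real controller design): the OS keeps addressing the same LBAs while
# the FTL redirects every write to a fresh physical page.
class SimpleFTL:
    def __init__(self, num_physical_pages):
        self.mapping = {}    # address mapping table: LBA -> physical page
        self.free = list(range(num_physical_pages))  # empty physical pages
        self.stale = set()   # superseded pages awaiting block erasure

    def write(self, lba):
        if lba in self.mapping:            # rewrite: old page becomes stale
            self.stale.add(self.mapping[lba])
        self.mapping[lba] = self.free.pop(0)

    def physical_page(self, lba):
        return self.mapping[lba]

ftl = SimpleFTL(num_physical_pages=8)
ftl.write(0)   # first write of LBA 0 lands on physical page 0
ftl.write(0)   # same LBA again: redirected to a different physical page
```

After the second write, `physical_page(0)` no longer points at page 0, yet the old copy still sits in the stale set; scaled up, this is why even a 32GB partition's data can end up scattered across the whole 64GB of physical flash.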
2.2 Overview of Read and Write operations
At first glance, HDD and SSD work quite similarly when reading and writing data. The main
difference is that on an HDD, the OS decides where to write data and knows exactly where the
data is physically located on the drive. On an SSD, the OS keeps the same behavior, but the
controller uses the FTL to map the data to a physical location of its choice. This gives the SSD
more flexibility to store data in the most appropriate place in order to optimize the storage. It
becomes a completely different story when data has to be overwritten.
On HDD, overwriting data is similar to a normal write. It works like a VHS or audio tape: there is
no need to erase a used tape before recording over its content. Unfortunately, while an SSD is
able to read from or write to a page, it cannot overwrite data that has already been written. To
be able to overwrite data, the SSD must first erase the data previously stored. Even worse, it is
not possible to erase a single page: the smallest entity which can be erased is a block (commonly
128 pages). This limitation comes from the high voltage needed to set a cell back to the empty
state and, according
to [LH12] “It's difficult to confine the effect only to the cells that need to be altered; the high
voltages can cause changes to adjacent cells. … Consequently, in order to avoid corruption and
damage, SSDs are only erased in increments of entire blocks, since energy-intensive tunneling
inhibition isn't necessary when you're whacking a whole bunch of cells at the same time.”
Practically speaking, this is how it works. Let's imagine we have a very small SSD which contains
only one block of four pages, each page having a size of 8KB. The SSD is brand new and all pages
are empty. For simplicity, let's assume that an LBA (Logical Block Address) has the same size as
a page.
1. We want to write 2 files to the SSD. File1 is 8KB and file2 is 16KB. The OS sends the
command to the SSD controller and data are written as shown in Figure 6
Figure 6: write operation example (1)
2. Now we delete the first file. The OS receives the request and removes the entry from the
file system allocation table. From the OS point of view, file1 no longer exists, but the OS
does not send this information to the SSD controller, and the SSD still considers that
3 pages are in use, as shown in Figure 7
Figure 7: write operation example (2)
3. We want to write another 16KB file (file3) and the OS sends a request to the SSD controller
asking to overwrite the space which was allocated to file1 and to use the remaining
available space. The controller faces a problem: there are not enough empty pages to
write the data immediately, so it will first have to erase the block to clean up the pages.
Obviously the controller also needs to keep the valid data currently stored in the pages.
It knows that file1 is no longer needed, because it has received the overwrite request
from the OS, and it is aware that the last page is empty. In the end, page 2 and page 3
are the only pages containing valid data, so the controller will read those and store them
in cache. The controller will also store file3 in cache as shown in Figure 8.
Figure 8: write operation example (3)
4. Now that the valid data is stored in the cache, the controller can erase the whole block
as shown in Figure 9.
Figure 9: write operation example (4)
5. At this stage, there are enough empty pages and the controller can copy the cached data
to them as shown in Figure 10.
Figure 10: write operation example (5)
According to [JL10] “This is sometimes termed the ‘read-modify-erase-write’ process, where the
data is read, modified within the cache, erased from the block, and finally written to the pages.”
As we have seen in this example, such a situation incurs some overhead when an overwrite
operation occurs. Indeed, instead of just copying the data to the pages, the controller first had to
read the data from the pages, store it in cache, erase the block and rewrite the content of the
cache to the pages. This obviously slows down performance; the issue will be addressed in more
detail in [2.4.1].
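The five steps above can be sketched with a toy model; the 1-block, 4-page geometry mirrors the example, and everything else (the class names, the cache being a simple list) is illustrative rather than a real controller’s API:

```python
# Toy model of the read-modify-erase-write process described above.
# Assumption: one block of 4 pages, each page holding one LBA of data.

EMPTY = None

class ToyBlock:
    def __init__(self, pages=4):
        self.pages = [EMPTY] * pages

    def erase(self):
        """NAND can only be erased at block granularity."""
        self.pages = [EMPTY] * len(self.pages)

    def write_page(self, index, data):
        # A NAND page can only be programmed when it is empty.
        assert self.pages[index] is EMPTY, "must erase block before rewrite"
        self.pages[index] = data

def overwrite(block, new_pages, valid_indexes):
    """Read-modify-erase-write: cache valid pages, erase, rewrite all."""
    cache = [block.pages[i] for i in valid_indexes]  # read valid data
    block.erase()                                    # erase whole block
    for i, data in enumerate(cache + new_pages):     # write back + new
        block.write_page(i, data)

block = ToyBlock()
# Step 1: write file1 (1 page) and file2 (2 pages).
for i, data in enumerate(["file1", "file2a", "file2b"]):
    block.write_page(i, data)
# Steps 3-5: file1 was deleted by the OS; writing file3 (2 pages) forces
# a read-modify-erase-write because page 0 cannot be overwritten in place.
overwrite(block, ["file3a", "file3b"], valid_indexes=[1, 2])
print(block.pages)   # ['file2a', 'file2b', 'file3a', 'file3b']
```

The assertion in `write_page` captures the core constraint: a direct overwrite of page 0 would fail, which is exactly why the erase step is unavoidable.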
2.3 Advantages of SSD compared to HDD
Assessing the impact of SSD on computer forensics wouldn’t have been such an interesting topic if
we hadn’t been convinced this technology was going to grow considerably in the upcoming years.
Here are the main reasons why we believe SSD will become more and more popular in the near
future.
Low latency: because they contain mechanical parts, HDD have an access time significantly
higher than SSD (around 10ms vs 1ms). The access time seriously impacts performance,
especially for random access operations.
Random Access: HDD are quite good at sequential read and write access, for instance when
accessing big files on a disk with low fragmentation. However, most accessed files are
quite small and stored randomly on the drive, which requires jumping frequently from
one part of the disk to another, and HDD are not good at that because of their high access
time. SSD are far better at random access operations.
Shock Resistance: because of the high rotation speed of the platters and the proximity of
the heads to these platters, HDD are very sensitive to shock and vibration, while SSD
don’t contain any moving parts. Typical HDD are shock resistant up to 300G while SSD can
resist up to 1500G.
Energy consumption: the consumption of SSD (around 1W) is far lower than HDD (around
10W) because there are no mechanical parts in movement.
Heating: lower consumption and no mechanical parts in movement keep the
temperature of SSD at a lower level.
Resistance to magnetic field: studies like [MW10] have demonstrated that all SSD
components are extremely resistant to magnetic fields and degaussing techniques, which
helps protect data stored on the drives; this is not the case for HDD.
No impact of data location: on HDD, data stored on the inner side cannot be read or
written as fast as data stored on the outer side. This is because an outer cylinder is
bigger (and thus contains more data) than an inner cylinder, so for a single platter
rotation, more data can be read on the outer cylinder. The location of NAND cells does
not impact the time it takes to read or write them.
Ability to evolve: HDD are limited by the physical constraints of their moving parts. While
HDD makers are still quite innovative in increasing the capacity of their drives, it becomes
more and more difficult to increase overall performance. SSD are quite recent and
there are still plenty of areas which can be explored to improve the technology. On top
of that, they can rely on their controller to optimize the way they work, and we see huge
improvements from one controller generation to another in terms of reliability and
performance.
Throughput: the latest SSD generation can reach more than 500 MB/s in read or write
access while the best HDD are usually limited to around 120MB/s. These are obviously
peak values, only valid for sequential access, but they give an order of magnitude of the
throughput which can be expected from each technology.
IOPS: the number of IOPS (Input/Output Operations Per Second) of the fastest HDD is
around 400 while some SSD can go higher than 1 million IOPS. While an IOPS value is not
absolute and depends on the operation parameters (read or write, sequential or random,
data transfer size), it gives an indication of the difference in performance which can be
achieved by SSD compared to HDD.
2.4 Drawbacks of SSD compared to HDD
SSD present a lot of benefits compared to HDD but they are not better in all areas, and we believe
HDD and SSD will continue to co-exist for a long time. Among the drawbacks of SSD, we can note:
Storage capacity: so far, SSD have a smaller capacity than HDD. Things are however quickly
evolving, and the first 2.5-inch SSD with a 1TB storage capacity has recently been
released, which is more or less the same capacity as the biggest 2.5-inch HDD. This tends
to prove that SSD can reach the same storage density as their competitors. So far, the
techniques used to increase storage capacity have been to reduce the size of the cells
and to store more bits per cell (MLC and TLC), but these solutions create issues in terms
of performance, reliability and cell lifetime, and it is not certain the same techniques can
be used to further extend SSD capacity. More details about this problem can be found in
[JL10A, HN10].
Price: the price per GB is still noticeably higher for SSD than for HDD. Even if prices are
dropping very quickly, it’s still not clear whether the price per GB will be able to reach the
same level as HDD without sacrificing too many of the benefits brought by SSD by further
decreasing cell size and increasing the number of bits per cell.
These two drawbacks are related to the future of the technology and mainly concern the
proliferation of SSD on the market. Two more issues need to be assessed in depth
because of the impact they have on forensic investigations.
2.4.1 Performance degradation
When the first generation of SSD came out on the market, some users started to realize that
performance was dropping over time. According to tests conducted in [ID12], there is a very
significant drop in write bandwidth which happens just after an SSD has been completely filled
with data. After some time, this drop can even reach 80% of the original write performance. This
is due to the way data is written to NAND flash cells.
Let’s take an example to better understand what’s happening. Imagine we have a brand new
SSD using SLC, completely clean, which means all the flash cells are empty, and let’s assume a page
is 4KB and a block contains 128 pages for a total of 512KB. At the beginning, each time we want
to write to the disk, the OS sends the write command to the SSD controller and, because all cells
are empty, it’s easy and straightforward for the controller to store the data in the cells. At this
point, the controller is writing at its maximum capacity.
Now let’s say that we have completely filled the drive and there is no space left. After having
deleted some files, we want to write some new files to the SSD. Based on the “read-modify-erase-
write” process described in [2.2], we must read the block from the cells, write it to cache, erase
the NAND flash block and write the data from the cache back to the NAND flash cells. This
drastically increases the number of operations which must be conducted compared to a write
operation on an empty SSD. Moreover, as shown in Table 1, erase operations are significantly
slower (8x for SLC) than a write, which makes things even worse. All of this explains why
performance degrades when the disk is full. Fortunately, some mechanisms exist to help minimize
the performance drop; they will be discussed in the next chapter.
                      SLC NAND flash    MLC NAND flash
Random Read           25 μs             50 μs
Erase                 2 ms per block    2 ms per block
Programming (Write)   250 μs            900 μs
Table 1: I/O Operations Timings of SLC and MLC
Source [JL10]
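Using the SLC column of Table 1 and the 128-page block size from this section’s examples, a quick calculation shows the scale of this overhead (a worst-case sketch in which every page of the block must be read back and reprogrammed):

```python
# Rough arithmetic on the SLC timings from Table 1, illustrating why a
# write that first requires a block erase is so much slower than a
# direct page write. The 128-page block is the example value used here.

READ_US, ERASE_US, WRITE_US = 25, 2000, 250   # SLC timings, microseconds
PAGES_PER_BLOCK = 128

direct_write = WRITE_US   # best case: the target page is already empty
# Worst case: read every page of the block, erase it, rewrite everything.
rmew = PAGES_PER_BLOCK * READ_US + ERASE_US + PAGES_PER_BLOCK * WRITE_US
print(direct_write, rmew)   # 250 vs 37200 microseconds
```

Under these assumptions the read-modify-erase-write path is roughly 150 times slower than a plain page write, which matches the scale of the degradation reported above.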
2.4.2 NAND flash lifetime
Because of the advantage in term of production cost and storage density, the industry is clearly
looking at increasing the number of bits which can be stored by a single cell and this is why most
of the modern SSD are based on MLC. The first SSD based on TLC have also been recently released
but increasing the number of bits per cell has a serious impact on the NAND flash lifetime.
Table 2: Write cycles comparison between SLC, MLC and TLC
Source [SM12]
As we can see in Table 2, the number of P/E (program/erase) cycles is roughly divided by 10 from
SLC to MLC and by 10 again from MLC to TLC. There are multiple factors influencing the maximum
number of P/E cycles of NAND flash cells. According to [LH12], each time a program/erase cycle
occurs on a cell, some charge is trapped in the dielectric layer of the floating gate, which modifies
the resistance of the cell. This increases the amount of current required to change a gate’s state
and the time a gate needs to flip. Over time, this resistance becomes significant enough that the
voltage required and the time needed for a cell to change its state make it useless as a fast
data storage device. This degradation doesn’t affect the ability to read data, reading being mostly
a passive operation. Cell longevity is negatively impacted by the number of states which can
be stored per cell. The more states a cell has, the more sensitive it is to changes in residual
charge and the more difficult it becomes to add a new charge. Finally, because of the cell size
reduction, cells can absorb less residual charge before they become too unresponsive to be useful.
Another phenomenon caused by the “read-modify-erase-write” process has an undesirable effect
on cell longevity and is known as write amplification. According to [LX12] “Write amplification
(WA) is defined as the average number of physical writes per logical write:

WA = (average number of physical writes) / (average number of logical writes)

This measure is also sometimes referred to as the write amplification factor (WAF).”
To better understand the write amplification issue, let’s go back to the example where we have a
brand new, completely clean SSD, on which a page is 4KB and a block contains 128 pages for a total
of 512KB. Now let’s see what happens when we want to write a 4KB file. In the best case
scenario, the controller can find an unallocated page and just needs to write the data to the free
cells. In this scenario, the write amplification factor has a value of 1. Now, in the worst
case scenario, all pages are already allocated. When the SSD gets the write request from the OS,
the controller reads the block containing the page which must be overwritten and stores the block
in cache. Then it erases the block and re-writes the whole block after having replaced the
overwritten page. In this case, for a write request of 4KB, the SSD wrote a full block of 128
pages for a total of 512KB. The write amplification factor is 128, which is not only extremely
inefficient but also has a very negative impact on cell lifetime.
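The two scenarios can be checked with a few lines of arithmetic, a minimal sketch using the example geometry above (4KB pages, 128-page blocks):

```python
# Numeric sketch of the best- and worst-case write amplification from
# the example above. Page and block sizes are the text's example values.

def write_amplification(physical_kb_written, logical_kb_requested):
    """WA = average physical writes / average logical writes."""
    return physical_kb_written / logical_kb_requested

PAGE_KB = 4
PAGES_PER_BLOCK = 128
BLOCK_KB = PAGE_KB * PAGES_PER_BLOCK       # 512 KB

best = write_amplification(PAGE_KB, PAGE_KB)    # a free page was available
worst = write_amplification(BLOCK_KB, PAGE_KB)  # whole block rewritten
print(best, worst)   # 1.0 128.0
```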
As we can see, there are many factors which negatively impact flash cell longevity. Even
worse, the values shown in Table 2 can appear very low, even for SLC memory, and one can
wonder whether SSD can be considered a reliable long-term storage solution. These concerns are
legitimate but must be put into perspective. Unlike HDD, which store the data where the OS
asks them to, SSD use a controller which can decide where to physically store data. This means the
controller can optimize the location of the data to avoid some cells being used too much.
This considerably increases the lifetime of the SSD as a whole, and several studies like [SN08] show
that SSD based on MLC are perfectly suitable for normal usage, while SLC can sustain very
intensive usage and remain reliable for a long time, which makes them perfectly suitable
for professional applications. It’s also worth noting that it is possible to easily extend the lifetime
of an SSD by increasing its size. Indeed, according to [WC10], “Lifetime writes can be estimated as:
Lifetime writes = (Drive Size × Program/Erase Cycles) / Write Amplification”
This clearly indicates the proportionality between the drive size and the lifetime, assuming the
writing pattern is identical in both situations.
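As a sketch of this estimate, with purely illustrative input values (a 128GB drive rated for 3,000 P/E cycles and a write amplification of 2; none of these numbers come from the text above):

```python
# Lifetime writes = drive size * P/E cycles / write amplification.
# All input values below are illustrative assumptions, not vendor data.

def lifetime_writes_gb(drive_size_gb, pe_cycles, write_amplification):
    return drive_size_gb * pe_cycles / write_amplification

life = lifetime_writes_gb(128, 3000, 2)
print(life)   # 192000.0 GB can be written before wear-out
```

Note that doubling the drive size doubles the result, which is the proportionality between size and lifetime pointed out above.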
Recent research like [YC12] is quite promising regarding the lifetime extension of NAND cells,
which could reach 100 million P/E cycles before wearing out by using some relatively simple
production techniques. While this still needs to be fully validated, it’s in our opinion very likely
that the industry will at some point find a solution to increase the lifetime of SSD cells.
Nevertheless, the issue does currently exist and SSD vendors put a lot of effort into optimizing
the lifetime of their products. As we will see in more detail in the next chapter, this optimization
considerably impacts the way SSD store their data and the computer forensic investigations
conducted on these drives.
3 SSD lifetime and performance optimization
We have seen in the previous chapter that the SSD technology has two main drawbacks:
performance degradation over time and limited flash cell lifetime. In order to circumvent
these limitations, controller vendors have developed multiple optimization solutions. These
impact the way data is physically stored on flash memory and, consequently, the computer
forensic analysis conducted on this data. This chapter intends to give the reader an overview of
the main techniques used nowadays to address these issues.
3.1 Overprovisioning
On SSD, what is called over-provisioning (OP) is the amount of physical storage which is not seen
by or allocated to the operating system.
Figure 11: The 3 overprovisioning levels
Source [KS12]
As shown in Figure 11 and according to [KS12], there are 3 levels of OP. The first level is related to
the difference between the binary and decimal calculation of the physical storage. Both HDD and
SSD vendors measure the size of a GB (gigabyte) in decimal (10^9 bytes), but because flash
memory is assembled in powers of 2, the real number of bytes in a flash gigabyte is 2^30, which
makes a difference of 7.37%. From an OS point of view, an SSD and an HDD of 128GB will be seen
as having the same usable size, but in reality the SSD is hiding the extra capacity from the OS for
overprovisioning. The second level is related to the vendor’s design choice. On standard SSD,
vendors usually reserve 0, 7 or 28 percent of space for OP. Depending on the value chosen, a
128GB drive will be marketed by the vendor as having a capacity of 128, 120 or 100GB respectively.
This OP is added on top of the first level of OP. The third level of OP is defined by the user. On
some drives the user can use a dedicated tool to reduce the amount of physical storage visible
to the OS, which is equivalent to increasing the OP size. The user can also undersize the
partition used on the SSD. For instance, creating a partition of 100GB on a drive where 120GB can
be used by the OS provides an extra 20GB of overprovisioning. It is worth noting that this last
technique is not considered real OP and can be less efficient, especially when the TRIM
function [3.4] is not used.
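The arithmetic behind these three levels can be sketched as follows; the 128GB/120GB/100GB figures mirror the examples in this section:

```python
# The three overprovisioning levels from this section as arithmetic.

GIB = 2**30          # binary gigabyte: flash is built in powers of 2
GB = 10**9           # decimal gigabyte: what vendors advertise

# Level 1: binary vs decimal gap, natively present on every SSD.
level1 = (GIB - GB) / GB
print(round(level1 * 100, 2))    # 7.37 percent

# Level 2: the vendor reserves part of a 128GB drive, selling it as 120GB.
physical_gb, marketed_gb = 128, 120
vendor_op_gb = physical_gb - marketed_gb
print(vendor_op_gb)              # 8 GB reserved by the vendor

# Level 3: the user partitions only 100GB of the visible 120GB.
partition_gb = 100
user_op_gb = marketed_gb - partition_gb
print(user_op_gb)                # 20 GB of extra overprovisioning
```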
The extra storage provided via OP is used by the SSD for multiple purposes: firmware storage, a
pool of spare blocks used to replace defective cells, and write access optimization via a process
called Garbage Collection [3.3]. One of the benefits of OP is to increase cell lifetime. This
happens because the wear leveling [3.2] mechanism can spread the data across more cells. For
instance, if 50% of the drive is used for overprovisioning, the SSD controller has twice as much
physical storage available as the logical storage exposed to the user. Based on this consideration
alone, the SSD lifetime should double in this scenario. Even more interesting, the impact on
longevity is not linear with OP size, because OP also allows optimizing write operations by
reducing the number of read-modify-erase-write cycles and the write amplification. According to
[KS12], an OP of 20% can reduce the write amplification by 50% on standard SSD. OP also
positively impacts performance. As explained in [MR12]: “it can also improve performance by
giving the flash controller additional buffer space for managing program/erase (P/E) cycles and
improving the probability that a write operation will have immediate access to a pre-erased
block.”. To provide an order of magnitude of the impact on the IO rate for random write
operations, a test conducted in [IT12] on an Intel 160GB 320 Series shows that “Performance
improves by as much as 3x by allocating 20% of the total space to over-provisioning, and 4x by
allocating 40%.”
As we can see, the overprovisioning technique is quite simple but can be extremely effective at
increasing flash SSD lifetime and minimizing performance degradation when used in conjunction
with a good controller. The drawback is obviously the reduction of usable space, but it must be
put into perspective given the first OP level natively present on all SSD and the significant
improvement in performance and longevity obtained with a relatively low percentage of OP. This
explains why the technique is so common in SSD, but it also illustrates how different the physical
data storage on flash memory can be compared to traditional HDD.
3.2 Wear Leveling
As shown in Figure 12, a drive not using any wear leveling mechanism doesn’t use its cells equally
in terms of P/E cycles. Given the limited number of P/E cycles supported by a flash cell, this results
in a quick degradation of some cells, which become unusable. When this happens, the controller
can use the spare cells reserved by overprovisioning to replace them, so the user continues to
see the same amount of available storage. The side effect of this behavior is to reduce the
overprovisioning capacity, which increases the write amplification and tends to degrade the
remaining cells faster, creating a snowball effect. At some point there are no more spare cells
available to replace the defective ones and the drive reaches its end of life, while many cells may
still have a very low number of P/E cycles. To circumvent the problem, the controller can use a
wear leveling mechanism in order to increase drive longevity.
22
Figure 12: Wear leveling impact on cell usage
Source [PD11]
According to [BR09], “Wear leveling is a set of algorithms that attempt to maximize the lifetime of
flash memory by evening out the use of individual cells”. While each controller manufacturer uses
its own algorithm, wear leveling techniques can be placed in two main categories: static and
dynamic.
The kind of data stored on a drive varies with usage, but commonly most data is static. Static
data can for instance be executables, documents, movies and so on, which usually stay in place
for a relatively long time without being modified or removed. On the contrary, dynamic data
such as logs or temporary files is very frequently modified by the OS or the applications.
According to [MT08], “Dynamic wear leveling is a method of pooling the available blocks that are
free of data and selecting the block with the lowest erase count for the next write. This method is
most efficient for dynamic data because only the non-static portion of the NAND Flash array is
wear-leveled”. While this system has a clear positive impact on drive longevity, it only works
on dynamic data, which represents a fraction of the full data set.
Static wear leveling uses a more efficient but more complex algorithm. According to [PD11], the
controller records the number of P/E cycles for each block as well as the last time the block was
written to. When data has to be written, the controller first looks for the block with the lowest
number of P/E cycles. If that block is free, the controller uses it. If the block contains data, the
controller checks when it was last written to in order to determine whether the data on this
block is static or dynamic. If the data is static, the controller moves it to another free but highly
used block and then writes the new data to the freed block. If the data is dynamic, the controller
looks for another block in its block pool and follows the same logic again. This technique allows
moving static data onto blocks which already have a high number of P/E cycles. As shown in
Figure 12, it keeps the number of P/E cycles per cell at a nearly identical level on all cells, which
considerably extends the SSD lifetime.
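The decision logic described by [PD11] can be sketched as follows; the block metadata and the age threshold used to call data “static” are illustrative assumptions, not a real controller’s data structures:

```python
# Sketch of the static wear-leveling selection logic described above.

class Block:
    def __init__(self, name, erase_count, last_write, free):
        self.name = name
        self.erase_count = erase_count   # P/E cycles so far
        self.last_write = last_write     # timestamp of the last write
        self.free = free

def pick_target(blocks, now, static_age=86400):
    """Pick a block for the next write, preferring the least-worn one."""
    for blk in sorted(blocks, key=lambda b: b.erase_count):
        if blk.free:
            return blk                   # least-worn free block wins
        if now - blk.last_write > static_age:
            # Data here is static: a real controller would relocate it
            # to a highly-used free block and then reuse this one.
            return blk
        # Otherwise the data is dynamic: try the next candidate.
    return None

pool = [Block("A", 900, 0, False),   # worn block holding dynamic data
        Block("B", 40, 0, False),    # lightly worn, written to recently
        Block("C", 120, 0, True)]    # free block
print(pick_target(pool, now=10).name)   # C: B holds recent, dynamic data
```

With a much later `now`, the same pool yields block B instead, because its data has aged into the “static” category and is worth relocating.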
Ironically enough, the static wear leveling algorithm used to extend cell life also increases the
write amplification phenomenon by moving static data from block to block, but the approach
remains very effective and drastically increases the SSD lifetime. It’s also worth noting that this
improvement has a cost in terms of write performance because of the additional read-modify-
erase-write cycles.
3.3 Garbage collection
According to [KS11], “In flash memory, GC (Garbage Collection) is the process of relocating existing
data, deleting stale data, and creating empty blocks for new data”. Stale data is data which has
been marked by the controller as no longer needed, which happens for instance when data is
overwritten.
Figure 13: Logical view of the Garbage Collection process
Source [AS09A]
As shown in Figure 13 and according to [AS09A], when the controller needs to write a page of data
to a block that is full of both valid and invalid pages, a new free block coming from a pool of free
blocks is allocated (1). The write request and the valid data are copied to that new block (2). The
old block is sent to a pool of blocks which must be cleaned (3). The GC process will select the block
with the least number of valid pages and wipe it before moving it to the free blocks pool.
The first benefit of GC is to consolidate valid data into free blocks, which allows cleaning up the
source blocks and making them ready to receive new data. The second benefit is to keep a pool
of free blocks available for write operations. It means that even if the block we need to write to
doesn’t have enough free empty pages because of invalid data, another free block can be used
immediately, which avoids having to wait for a block erase, a very time-consuming process. This
considerably helps in reducing the performance degradation issue which occurs when there are
not enough unallocated pages.
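The victim-selection step described above (pick the block with the fewest valid pages, relocate them, erase the block back into the free pool) can be sketched as follows; the page and block structures are illustrative, not a real FTL:

```python
# Sketch of the garbage-collection step: clean the dirty block with the
# fewest valid pages. None marks an invalid (stale) page.

def collect(dirty_blocks, free_pool):
    """Pick the cheapest victim, save its valid pages, erase it."""
    victim = min(dirty_blocks, key=lambda b: sum(p is not None for p in b))
    dirty_blocks.remove(victim)
    relocated = [p for p in victim if p is not None]  # copy valid data out
    free_pool.append([None] * len(victim))            # erased, reusable block
    return relocated

dirty = [["a", None, "b", None],    # two valid pages: expensive to clean
         [None, None, "c", None]]   # one valid page: cheapest victim
free = []
moved = collect(dirty, free)
print(moved)   # ['c']: the second block was selected and erased
```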
Figure 14: Write Amplification during Garbage Collection
Source [AS09A]
The GC process we just described is referred to as standard, foreground or real-time garbage
collection, which means the blocks are reorganized when there is a need for it. As shown in Figure
14, it implies a certain level of write amplification because, on top of the new data received by
the controller, the valid data located in the block being optimized must also be rewritten. This
requires extra read and write operations in real time which can slow down write operations, so
this GC flavor is reserved for SSD with good write performance. Another GC process called
background or idle GC can also be used. In this case, the drive performs the GC operations when
it is idle, moving data and erasing blocks in advance. While this solves the performance issues, it
also implies that some data is moved needlessly, which decreases cell lifetime. This is especially
true when used in conjunction with TRIM [3.4], because some pages may become invalid after a
deletion on a block which has already been optimized. In this case, another optimization cycle
will happen on this block even though no data has been written to it; a block on which no data
needs to be written can thus be optimized by the GC multiple times. With standard GC, the
optimization only happens when a write request is received, and even if the number of invalid
pages changes multiple times on the block, it will only be optimized once. In the end, it is up to
the vendor to choose the level of GC aggressiveness that suits its drive best.
3.4 TRIM
The Garbage Collection process we just described is an interesting technique to clean up invalid
data and reorganize data more efficiently across the blocks. Unfortunately, when a file is deleted,
the OS doesn’t send this information to the drive, which therefore doesn’t know the data has
become invalid. Because of that, the GC and wear leveling mechanisms will continue to move this
data even though it is not needed anymore, which results in write amplification. Some vendors
have tried to circumvent this issue by analyzing the file system used on their drives, but it’s a
rather difficult job and can only work for well documented file systems. This issue has recently
been addressed by the TRIM command. TRIM is an ATA command which provides the SSD
controller with information about data which is no longer in use. This can for instance happen
when a file is deleted or when a partition is formatted. When this happens, the SSD controller
can mark this data as invalid so it knows there is no need to keep moving it during the GC or wear
leveling optimization. To work, the TRIM command must be supported by the whole chain (OS,
SSD controller, driver).
According to [SS12], “To illustrate, some database systems and RAID solutions are not yet TRIM-
aware, and subsequently do not know how to pass that information onto the SSD. In environments
like that, the SSD will continue saving and garbage collecting those blocks until the OS utilizes the
LBA's for new write requests”. According to [MP10], the TRIM command can also be used on OS
not supporting it by using third-party software which reads the file allocation table of the file
system and informs the SSD controller about the LBA which are no longer in use.
Because of its benefits, there is a lot of interest in TRIM, but there also seems to be some confusion
about this command. One example is the belief that having a TRIM-enabled system means the
data is physically removed from the drive immediately after it has been logically deleted from
the file system. This is not true: the TRIM command just sends information about the data which
has become invalid, and it is the SSD controller which decides which actions must be taken based
on this information. This is clearly explained in [NC09], “TRIM differs from ‘Secure
Delete’ command. ‘Secure Delete’ means it must be done. TRIM is just an advisory”. Another
confusion comes from a relatively common analogy made between TRIM and defrag, as found in
[MR12B], “TRIM, which conceptually can be compared to a defrag utility on a spinning hard drive,
…”. We understand why the author wants to use this analogy but we believe it is very confusing,
because defrag is an active command modifying the physical data arrangement on the drive while
TRIM is just a command used to provide information about invalid data to the SSD controller.
Finally, it’s also worth noting that TRIM is often considered a generic command used to
optimize physical storage on all compatible SSD. This is only partially true because TRIM only
works for SSD using a SATA interface. According to [MR12B], “The SAS committee has added
UNMAP, which is similar to TRIM, to the SAS/SCSI specification”. So, while TRIM is often used as
a generic term for this optimization mechanism, it is not compatible with all SSD but can
sometimes be replaced by a similar command.
The advantage of TRIM is to provide more accurate information to the controller to help it
optimize data storage. It helps reduce write amplification, mitigates the performance
degradation problem and increases the free space available, which can be seen as an increased
overprovisioning size. It’s also worth noting that TRIM is complementary to the GC process but
not a replacement for it.
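What a TRIM notification actually changes inside the controller can be sketched as follows; the FTL structures here are illustrative assumptions, and the key point is that TRIM only marks data invalid rather than erasing anything at that moment:

```python
# Sketch of TRIM as an advisory: it records which physical pages have
# become invalid so GC can skip them, but it erases nothing itself.

class ToyFTL:
    def __init__(self):
        self.mapping = {}    # LBA -> physical page
        self.invalid = set() # pages GC no longer needs to relocate

    def write(self, lba, page):
        self.mapping[lba] = page

    def trim(self, lbas):
        """Advisory only: mark pages invalid, perform no erase now."""
        for lba in lbas:
            if lba in self.mapping:
                self.invalid.add(self.mapping.pop(lba))

ftl = ToyFTL()
ftl.write(10, 100)
ftl.write(11, 101)
ftl.trim([10])               # the OS deleted the file behind LBA 10
print(sorted(ftl.invalid))   # [100]: data is still physically in flash
```

This is also why, forensically, trimmed data may still be recoverable from the flash chips until garbage collection actually erases the block.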
3.5 Writeless technologies
In order to increase flash memory longevity, some controllers’ firmware implements write-related
algorithms designed to minimize write amplification. This is sometimes referred to as
“writeless technologies”. According to [KJ11], “Writeless is a set of techniques including write
buffer, compression, and deduplication for reducing WAF (Write Amplification Factor) in flash
memory”.
3.5.1 Write combining
Write combining is the process of combining multiple small write requests into larger write
operations in order to improve performance and traffic throughput. Write combining relies on
write caching to store multiple write requests in order to optimize them. According to [PH10],
“write caching adds the very relevant benefit of reduced small block writes. If write caching is off,
every write command is sent by the OS one at a time to the SSD, including any TRIM commands,
waiting for responses for each. With write caching enabled, they're combined in RAM before being
written to the disk”. Each vendor has its own secret sauce but here are some techniques which
can be used to improve performance and longevity with write combining.
Figure 15: Small write combining
Source [AS09A]
One example of write combining is shown in Figure 15 where multiple small writes are assembled
in order to take advantage of the parallelization by writing simultaneously on each flash die.
According to [AS09A], “Random writes rarely happen in a separated manner, they come in bursts
with many at a time. A write combining controller will take a group of 4KB writes, arrange them in
parallel, and then write them together at the same time. This does wonders for improving random
small file write performance, as everything completes as fast as a larger sequential write would”.
Write combining can also be used to reduce write amplification. According to [DH11], “multiple
write requests are collected by the controller prior to being written to the drive. The goal is to
bundle several small write requests into a single, larger write operation while hoping that the
neighboring pages are likely be changed at the same time, and that these pages actually belong
to the same file. This technique may significantly reduce the write amplification factor”
3.5.2 Data deduplication
An interesting technique to reduce the number of write operations is known as data
deduplication. The concept is fairly simple. When a drive receives a write request, it splits
the data content into chunks and checks whether the same chunk has already been stored on the
drive. If this is the case, the data is not copied but a pointer is added in a mapping table. Thanks
to that, the controller can reduce the amount of data being written to the flash memory. This
technique is often used for backup or in virtualized environments in order to optimize the storage
capacity and improve the IOPS (data which is not copied does not consume write IOs, which can
be used for other operations). This mechanism can also be applied to SSD in order to mitigate the
limited P/E cycles issue.
Figure 16: Data deduplication example
Source [KJ12]
Figure 16 shows an example of a data deduplication implementation. When a write request is
received by the controller, it contains the LBA (e.g. 10, 11, 12) and the data (e.g. A, B, C).
The fingerprint generator calculates a hash value for each piece of data (e.g. A’, B’, C’). The
fingerprint manager then looks in its table for a matching hash value. If none is found, the data
does not yet exist in flash, and the mapping table is updated with the LBA and the Physical Block
Address (e.g. 100, 103). If the hash value is found in the fingerprint manager, the data already
exists in flash memory. In this case the mapping table is still updated with the LBA and PBA, but
the PBA points to the same physical location as the existing data, and nothing is physically
written to flash.
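The walkthrough above can be sketched in a few lines. The structure mirrors Figure 16, but the names, the hash choice and the PBA numbering are our own simplifications, not details of a real controller.

```python
import hashlib

fingerprints = {}   # hash value -> PBA   (fingerprint manager)
mapping = {}        # LBA -> PBA          (mapping table)
next_pba = 100      # next free Physical Block Address (assumed numbering)

def write(lba, data):
    global next_pba
    fp = hashlib.sha256(data).digest()    # fingerprint generator
    if fp in fingerprints:
        mapping[lba] = fingerprints[fp]   # duplicate: reuse existing PBA
        return False                      # nothing is written to flash
    fingerprints[fp] = next_pba           # new data: record fingerprint,
    mapping[lba] = next_pba               # map the LBA and write the page
    next_pba += 1
    return True

# LBA 12 carries the same data as LBA 10, so only two pages hit the flash:
written = [write(10, b"A"), write(11, b"B"), write(12, b"A")]
```

After the three requests, LBA 10 and LBA 12 point to the same PBA, and only two physical writes have taken place.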
Multiple factors need to be considered when implementing such a mechanism. First of all, the
choice of hash algorithm is crucial because SSD are usually rather limited in CPU power and
SDRAM capacity. Using a strong hashing algorithm reduces the likelihood of a hash collision,
which could result in an unacceptable loss of data integrity. However, it takes longer to
generate the hash, and a stronger algorithm often means longer hash values. This can impact SSD
performance without completely eliminating hash collisions, which remain possible, just less
likely with a better algorithm. Some implementations therefore prefer weaker and faster hash
algorithms, but compare the data to be written with the data stored in flash whenever the hashes
match. This solution requires extra overhead for reading the value from the flash but guarantees
data integrity; the extra overhead is also somewhat compensated by the faster hash computation
and shorter hash values. Another important parameter to consider is the chunk size used as input
for the hashing function. As demonstrated in [KJ12], the deduplication rate decreases when using
larger data chunks. It is thus a matter of finding the right balance between the reduction in
write operations and the additional overhead caused by smaller chunks, in terms of the higher
number of hashes to compute and store.
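The weaker-but-verified design described above can be sketched as follows. The CRC-32 fingerprint and the in-memory stand-in for flash are assumptions for the example; the point is that the cheap fingerprint only short-lists a candidate, and a byte-for-byte comparison makes the final decision, so a collision can never corrupt data.

```python
import zlib

flash = {}          # PBA -> stored bytes (stands in for reading flash)
fingerprints = {}   # CRC-32 checksum -> PBA
mapping = {}        # LBA -> PBA
next_pba = 0

def write(lba, data):
    global next_pba
    fp = zlib.crc32(data)                        # fast, weak fingerprint
    pba = fingerprints.get(fp)
    if pba is not None and flash[pba] == data:   # verify: read back and
        mapping[lba] = pba                       # compare byte for byte
        return False                             # true duplicate, dedup it
    flash[next_pba] = data          # new (or merely colliding) data
    fingerprints[fp] = next_pba
    mapping[lba] = next_pba
    next_pba += 1
    return True

results = [write(0, b"X"), write(1, b"X"), write(2, b"Y")]
```

The read-back in the duplicate branch is exactly the extra overhead discussed in the text; it only occurs on a fingerprint match, which is why the faster hash still pays off overall.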
The deduplication factor will obviously depend greatly on the data, but it is interesting to
get an order of magnitude of what can be expected in real-life situations. According to [KJ12],
“Experimental results have shown that our proposal can identify 4–51% of duplicate data with
an average of 17%, for the nine workloads which are carefully chosen from Linux and Windows
environments”. At first sight this does not look that impressive, but the nice thing about
deduplication is that it triggers a chain of benefits. The first benefit is obviously to reduce
the number of write operations. The second is to improve the garbage collection process by
reducing the number of pages which have to be re-written during that process. The third benefit
is the impact on the wear
leveling and FTL performance. According to [KJ11], “the duplication information can be utilized to
differentiate hot/cold pages for enhancing garbage collection and wear-leveling performance. In
other words, incorporating the deduplication software into FTL seamlessly might not only provide
the reduction of WAF but also enhance the performance of FTL”. These benefits help considerably
in reducing the write amplification, which improves the drive longevity. According to [KJ12],
“deduplication can expand the lifespan up to 4.1 times with an average of 2.4 times, compared
with the no deduplication results”. With regard to the overhead required to implement data
deduplication, the tests conducted in [KJ12] indicate that write latency also decreases, by 15%
on average. However, these tests are based on a system using direct hash value comparison and do
not take into account the additional latency required on systems that read data back from the
flash memory to compare it with the data to be written. The main drawback of data deduplication
is the increased risk of data loss, but some vendors implement internal RAID-like mechanisms to
improve data protection.
3.4.3 Compression
Data compression is the process of encoding information using fewer bits than the original
representation, and this feature is natively supported by some SSD controllers. The compression
happens on small chunks of data, not at the file level. Contrary to file system compression,
SSD compression is not designed to extend storage capacity but to increase performance and drive
longevity. In practice the SSD stores the pre-compression data size and presents this value to
the OS, which means that even if the drive is able to compress 60GB of data to 30GB, the
controller will continue to report 60GB of used storage to the OS. The extra storage is used by
the controller to optimize wear leveling and garbage collection, which considerably improves the
drive longevity by reducing the number of write operations and the write amplification factor.
Some SSD vendors claim compression will also drastically increase write throughput, but they tend
to base their calculations on fully compressible data. Tests such as [JL11] demonstrate that the
gain exists but quickly diminishes as the percentage of compressible data decreases. Finally, it
should be noted that enabling file system compression on an SSD supporting compression will
considerably reduce the controller's ability to compress data; moreover, as the data has already
been compressed at the file system level, this reduces the benefit of using SSD compression.
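The reporting behaviour described above can be illustrated with a toy model. The chunk size and the use of zlib are our assumptions, not what a real controller does:

```python
import os
import zlib

logical_used = 0    # what the controller reports to the OS
physical_used = 0   # what the flash actually holds

def write_chunk(data):
    global logical_used, physical_used
    compressed = zlib.compress(data)
    stored = min(compressed, data, key=len)   # keep whichever is smaller
    logical_used += len(data)                 # OS-visible accounting
    physical_used += len(stored)              # real flash consumption

write_chunk(b"A" * 4096)       # highly compressible chunk
write_chunk(os.urandom(4096))  # incompressible chunk: little or no gain
```

After one compressible and one incompressible 4KB chunk, the logical counter reports 8192 bytes while the flash holds considerably less; the difference is the headroom the controller can spend on wear leveling and garbage collection.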
3.5 Conclusion
In this chapter, we have provided an overview of the most common mechanisms used to optimize
performance and increase lifetime on SSD. Thanks to their programmable controller, SSD are very
flexible and, as we have seen, a wide variety of techniques is in use. We have also shown that
all these elements interact with one another, which implies that controller vendors must optimize
their firmware based on their product range and the expected SSD behavior. The reader should now
appreciate that the SSD technologies and optimization algorithms implemented in the controller
have a large impact on the way data is stored and moved around in the background. This
theoretical approach was necessary to assimilate these concepts, but it is now time to start
analyzing the impact of these technologies on computer forensics in real-world situations.
4 SSD impact on computer forensics in the enterprise
According to [LD12], “computer forensics is the collection, preservation, analysis, and
presentation of electronic evidence for use in a legal matter using forensically sound and generally
accepted processes, tools, and practices”. This is obviously a quite broad topic and this document
does not intend to cover it fully. Our analysis will focus on three of the most important
challenges faced by enterprises in this area. We will first discuss data recovery, which is very
helpful in situations such as post-mortem analysis of a compromised system or investigating
malicious activities conducted by an insider. Data recovery will be tackled from two different
angles, the logical and the physical approach, which lead to fundamentally different conclusions.
The second area we will address is forensic imaging, a fundamental step when forensic evidence
needs to be presented in court. The last point we will analyze relates to anti-forensic
techniques used to sanitize data on SSD. While this is not a computer forensics domain stricto
sensu, this aspect needs to be seriously considered by enterprises in order to ensure the
protection of sensitive information in situations such as drive warranty replacement or storage
decommissioning.
4.1 Logical Data Recovery
Before starting the analysis, a forensic investigator must first try to gather as much data as
possible, including data inaccessible without dedicated tools, in order to maximize the chances
of finding useful information. In this section we will discuss how data can be recovered
logically, meaning the controller is still in charge of interpreting and filtering the data
physically stored on the drive. We will analyze what happens when trying to recover data after a
quick format or after data has been deleted via the OS. We will also briefly discuss data
recovery from the slack space.
4.1.1 Quick formatting
Multiple techniques exist to format a drive, and while what is commonly referred to as a full
format does not necessarily mean all data is zeroed out on the disk, we will use the term quick
formatting to describe a format which resets the file system but does not delete the file
content. On a HDD, quick formatting a drive will leave most, if not all, of the data recoverable
via dedicated recovery tools. On SSD we could expect a different behavior because of the garbage
collection process, which tries to optimize performance by cleaning up blocks containing pages
no longer in use, making blocks ready for the next write. This section provides some concrete
examples of what exactly happens when a quick format is applied to an SSD.
In the experiment conducted in [GB10], an SSD is filled with multiple copies of the same file.
The SSD is then quickly formatted and the system is immediately rebooted. Just after the reboot,
a script is run every 5 seconds to measure the percentage of bytes which are zeroed out in a
chosen sample. The results show that the GC process started 3 minutes after power-on; three
minutes later the selected samples were almost completely wiped out. 15 minutes afterwards a full
image of the SSD was taken and a full forensic investigation was conducted on the copy. This
second analysis happened somewhat later because some time was needed to take the image, but its
results are more accurate because they cover the full drive and not only some byte samples. Some
of the results are summarized hereunder:
- Zero files were recovered fully intact
- The maximum level of recovery for a file was approximately 50% of the data
- The amount of data which survived the GC process was approximately 0.03%
Before going further, let us put these results into perspective:
- The test has only been conducted on a single drive type
- It has been run with the NTFS file system only, which could influence the result
- The system was restarted just before starting the test. This power-off/power-on cycle could
have triggered the GC process. In other words, the time it took the drive to erase data might
have been significantly longer if the system had not been restarted after the quick format.
The obtained results are very enlightening as they confirm what we suspected, namely that SSD can
modify data stored on disk without being asked to do so by the OS. This phenomenon is sometimes
referred to as self-corrosion. That being said, we must be extremely careful when drawing
conclusions based on experiments involving SSD. As we will see many times in this document,
predicting or generalizing SSD behavior is extremely risky, and we should not assume it is safe
to extrapolate experimentally obtained results to other situations if some parameters change.
What seems reasonable to conclude is that a quick format can have a big impact on data recovery
and that the aggressiveness of the GC algorithm will probably have a big impact on the percentage
of recoverable data. Finally, it should be noted that some SSD may start their GC process later
than 3 minutes after power-on, which would potentially open a window of opportunity for data
recovery.
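The byte-sampling measurement used in [GB10] can be approximated with a short script. The device path, the sample offsets and the polling interval are our assumptions; the paper does not publish its script.

```python
SAMPLE_SIZE = 4096

def zeroed_fraction(path, offsets, sample_size=SAMPLE_SIZE):
    """Fraction of the sampled bytes that currently read back as zero."""
    zero = total = 0
    with open(path, "rb") as dev:
        for offset in offsets:
            dev.seek(offset)
            chunk = dev.read(sample_size)
            zero += chunk.count(0)
            total += len(chunk)
    return zero / total

# Against a real drive this would target the raw device and poll it:
#     while True:
#         print(f"{zeroed_fraction('/dev/sdb', OFFSETS):.1%} zeroed")
#         time.sleep(5)
# Here we demonstrate on a small stand-in file, half wiped and half intact:
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "drive.bin")
    with open(path, "wb") as f:
        f.write(b"\x00" * SAMPLE_SIZE + b"\xff" * SAMPLE_SIZE)
    fraction = zeroed_fraction(path, [0, SAMPLE_SIZE])
```

A rising zeroed fraction over successive polls is the signature of garbage collection running in the background.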
In another study [CK11], a similar experiment was conducted on 15 SSD from different vendors, on
multiple OS. Two reference files (a big binary file of 650MB and a small text file of 1MB) were
written to the drive before launching the default format command of the installed OS. Some drives
had the TRIM function, but Windows 7 was the only tested OS supporting TRIM. The test measures
the percentage of recovery of each file (the large and the small one); the percentages shown are
thus not related to the whole drive.
Table 3: Percent blocks of test file recovered after SSD formatting
Source [CK11]
The results shown in Table 3 demonstrate how different the behavior can be from one drive to
another. As we can see, the recovery percentage can vary from 0 to 100% depending on the drive.
Interestingly, the same kind of variation can be observed on the same drive depending on which
file, the small or the big one, the recovery is attempted on. The variations seem to be
influenced by the OS, the drive's TRIM support and the size of the file the tester tries to
recover.
It is worth noting that some of the results are very difficult to explain logically. For
instance, if we look at the result for the PQI1 drive, we see that for the large file setup the
recovery is 100% on Windows 7 and 0% on Linux. This could be explained by the GC being more
effective on Linux. Similarly, we could explain the 0% recovery for the small file test on
Windows 7 by the GC on this controller being more effective with small files. The logical
conclusion would be that the GC process is more effective on Linux and on small files. But how
can we then explain that for the small file test on Linux the recovery level is at 100%? This is
a pretty good example of the unpredictability of the results we discussed above.
While it is impossible to know whether the apparent inconsistencies are related to the
controller's GC algorithm or to the testing methodology used, it is a bit unfortunate that the
study does not provide more detail about the way the tests were conducted. For instance, it would
have been interesting to know how long the researchers waited before starting the recovery
procedure. As we saw in the test conducted in [GB10], it is difficult to know when the GC process
starts, and we would have liked some certainty that enough time was given to the drive to
complete its GC process. We would also have preferred having more than 2 reference files on the
drive, as the results could potentially be very different if the drives had been completely
filled with reference files.
Despite the limitations of the tests we mentioned in these two examples, the results obtained are
still very interesting. We have seen that SSD are able to modify their content by themselves
without instruction from the OS. In some circumstances the amount of data unrecoverable after a
quick format can be quite significant, which obviously negatively impacts the data recovery
process. However, it should be noted that full data recovery after a quick format on SSD is also
sometimes possible. Finally, these tests demonstrated the difficulty of predicting the rate of
data recoverability after a quick format.
4.1.2 Deleted data
On traditional HDD, when a file is deleted by the OS, the file is not physically removed from the
disk, which makes file recovery possible until the physical space is overwritten. As seen
previously, on SSD the TRIM command and the GC process ensure a correct level of performance by
freeing up pages no longer in use. We have also seen that cleaning up and rearranging data too
aggressively can sometimes be counterproductive and increase the write amplification, which in
turn decreases cell lifetime. It is now time to see how these concepts apply in real-life
situations.
In the [CK11] study, 15 SSD drives are tested with multiple OS and two different disk-usage
scenarios. The low usage scenario simulates a freshly installed OS plus a small reference text
file (1MB) and a big reference binary file (650MB), leaving most of the disk empty, which should
place the GC process in an optimal situation. The high usage scenario simulates a freshly
installed OS with the rest of the space filled with binary files plus the small and big reference
files. The test consists of deleting both the small and big reference files and trying to recover
them afterwards. It is worth recalling that Windows 7 is the only OS supporting TRIM in this
test.
Table 4: Percent blocks of test file recovered after file deletion
Source [CK11]
Some trends can be found in the test results shown in Table 4. According to [CK11], “All TRIM
enabled drives showed close to 0% data recovery for the large file. For small files, the
difference between manufacturers is pronounced.” On the other hand, when TRIM is not supported by
the drive or the OS, the recovery rate is significantly better and often higher than 90%. While
these trends are quite clear, they should not be considered an absolute truth, and there are some
notable exceptions. For instance, the Intel2 drive shows a recoverability of 0% for the large
file, high usage scenario on Linux, while other tests run on Linux for the same drive show a
recoverability close to 100%. It is also interesting to note that the two other Intel controllers
do not show the same behavior. Here again it is hard to say whether this kind of exception comes
from the controller behavior or from the testing methodology used. Nevertheless, these results
clearly demonstrate that in general, when the TRIM command is supported, the amount of data which
can be recovered significantly decreases.
4.1.3 Slack space
According to [AH08], “Slack space refers to portions of a hard drive that are not fully used by
the current allocated file and which may contain data from a previously deleted file”. On a HDD,
a cluster is the smallest allocation unit an OS can address and thus the smallest writable
portion of the disk. As shown in Figure 17, a cluster can contain multiple sectors, and when data
located on a cluster is overwritten, the new data may be smaller than the data previously written
to the cluster. When this happens, the remaining part of the last written sector can continue to
store previously written data, or this data can be deleted, depending on the OS. The unused
sectors can also contain residual data. When working on HDD, forensic investigators use dedicated
tools to retrieve residual information from the slack space. On SSD the overwriting process is
completely different and there is no residual data to recover in overwritten pages, as explained
in [ZS11]: “There is no slack space as entire block are erased before write”.
Figure 17: Illustration of Slack Space on HDD Source [AH08]
4.2 Physical data recovery
So far we have seen how TRIM and GC impact data recovery on SSD. While the conclusions we reached
through the different experiments are true from a logical point of view, this is not necessarily
the case from a physical point of view. This is a major distinction between HDD and SSD. On HDD,
all physical sectors can be accessed directly via software-based methods, and physical recovery
is only needed in some very specific cases such as defective HDD or advanced techniques based on
data remanence analysis. With SSD, however, all accesses to flash are managed by the controller.
The consequence is that some parts of the flash memory simply cannot be accessed and that some
data shown as erased by the controller can still be present at the physical level. While the
techniques used to recover data from the physical flash chips are very specific and go far beyond
the scope of this document, we believe it is worth providing some insight into them in order to
allow the reader to estimate the feasibility of such techniques.
4.2.1 Techniques overview
The first step is to de-solder the flash memory chips in order to read their content in a memory
chip reader. This step is very tricky, and special equipment is required to avoid any physical
damage to the chip, which would result in data loss. It allows bypassing the flash controller in
order to access the memory content directly. At this stage, data can be extracted from the
memory, which is a relatively simple process.
Once raw data has been acquired, the most difficult part begins. Indeed, without the assistance of
the controller to present data in a logical way, the forensic examiner has to re-organize raw data
to make it usable. According to [MR13], data can be organized via multiple mechanisms by the
controller, each of which has to be reverse-engineered:
- Separation by byte: the controller writes one byte to one component and the next byte to a
second component. This is designed to speed up read/write operations, and the concept is similar
to what is used in RAID (Redundant Array of Inexpensive Disks) systems. The separation can also
happen across more than 2 components.
- Inversion, XOR, ciphering: these techniques are mutually exclusive, so only one needs to be
reversed.
  o Inversion: because a flash cell value is 1 by default and because data usually contains more
  0s, it is more efficient to invert the bit values at the physical layer.
  o XOR: too many similar bit values located in the same area can potentially lead to errors
  during operations on flash cells. As shown in Figure 18, XORing allows the bits to be
  distributed better when needed. Reverse-engineering the XOR function requires finding out which
  value was used for XORing.
  o Ciphering: these functions use a concept similar to encryption, with the notable difference
  that the manipulation does not intend to protect user data but to protect the management
  algorithms used by the controller. They usually cannot be reverse-engineered.
At the end of this phase, data is visible and files smaller than one page can be read. However,
bigger files are corrupted, and we first need to deal with interlacing and block organization to
make them accessible.
- Interlacing: the goal of interlacing is to distribute data pages across multiple memory areas
in order to improve read/write performance. Multiple interlacing techniques exist, such as
historical internal, internal and external interlacing, which distribute data pages across banks,
blocks or components respectively. The forensic investigator tests the different options to find
the one matching the controller being worked on. At this stage, data could already be accessible
if the controller were not also optimizing used blocks via multiple processes such as wear
leveling and garbage collection. The difficulty is now to re-order the pages and blocks properly.
Figure 18: XOR effect
Source [MR13]
- Block organization: the next step is to find all the blocks in use and to reorder them in order
to rebuild the logical volume. The first difficulty is finding the block number. The controller
usually stores the block number in the service zone of each page. As this number is the same for
each page belonging to a block, the number can be identified and a list of blocks can be created.
Once this is done, the blocks can be reordered. However, it is worth noting that multiple blocks
can have the same number. This is due to the duplication phenomenon: when the controller needs to
write data to a block which does not have enough free pages, it can decide to use another block
already available, either to speed up the write or because of the wear leveling algorithm. The
original block will be cleaned up later by the GC, but until then the data is duplicated.
According to [MW10], the FTL can duplicate a file up to 16 times, which is not negligible.
Unfortunately, the version number of each block is hard to identify, so it is difficult to know
which one is the latest, and special algorithms need to be used to reconstruct data for all
blocks sharing a number. Once all the blocks are re-ordered, the file system is accessible.
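To make the reconstruction steps more concrete, here is a hypothetical sketch of the two simplest reversals, the XOR mask and a two-way byte separation. The mask value and the striping layout are assumptions standing in for what would normally be recovered by reverse engineering.

```python
MASK = 0x5A   # XOR value recovered by reverse engineering (assumed)

def un_xor(raw):
    """Undo the controller's XOR mask."""
    return bytes(b ^ MASK for b in raw)

def de_interleave(comp0, comp1):
    """Re-join bytes striped across two components
    (byte 0 on comp0, byte 1 on comp1, and so on)."""
    out = bytearray()
    for a, b in zip(comp0, comp1):
        out += bytes((a, b))
    return bytes(out)

# Simulate what the two chips would hold for the logical data b"FORENSIC":
logical = b"FORENSIC"
comp0 = bytes(b ^ MASK for b in logical[0::2])   # even bytes, masked
comp1 = bytes(b ^ MASK for b in logical[1::2])   # odd bytes, masked

recovered = de_interleave(un_xor(comp0), un_xor(comp1))
```

On a real drive these two reversals would be followed by the interlacing and block-reordering steps described above, which are considerably harder.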
Needless to say, such a manipulation requires special hardware, software and experienced
engineers, but given the relative novelty of this science, the lack of standardization of SSD
controllers and the secrecy around the algorithms used, the results achieved in terms of physical
data recovery can be considered impressive. However, this kind of operation requires time and
money and cannot be used in every circumstance. According to [SH09], “As a result, the average
SSD recovery at Gillware Inc. costs $2850 and takes approximately three weeks to perform”, which
is, according to [SH09], approximately 4 times more expensive and 3 times more time consuming
than recovering data from HDD. Things are improving over time, and software vendors are adapting
their products by providing information related to the various mechanisms used in each specific
drive like XOR values, interlacing type used, … which helps speed up the recovery process. It is
also worth noting that physical data recovery is not always possible, for instance when ciphering
or encryption is used, but some research is being conducted in this area. Special optimization
algorithms such as de-duplication or compression will also add an extra layer of complexity
during the reverse engineering phase, but should not be insurmountable. Finally, according to
[MB07], there are other techniques, such as accessing the memory through the JTAG (Joint Test
Action Group) access port, which can allow accessing data at the physical layer without having to
de-solder the flash memory chips. However, this technique is not a panacea and still requires a
certain level of expertise. Another issue is that the port does not exist on all systems and is
sometimes disabled at the factory.
4.2.2 Benefits compared to data recovery at logical layer
In the techniques explained above, the forensic investigator has been able to rebuild the file
system from the physical memory chip. We now need to analyze what the extra benefits are in terms
of data recovery compared with traditional software forensics methods working at the logical
layer.
- Overprovisioning: as we have seen, a significant amount of flash memory may be used for
overprovisioning and not be accessible via the OS. These blocks can contain virtually any type of
data because of data duplication. Overprovisioned space is an area where SSD are better than HDD
in terms of data recovery because it is possible to get multiple instances of the same block.
According to [MR13], “By reading the component at the physical level, it is possible to find a
block chronicle. It is interesting from a forensic point of view in order to find files
modifications” (translated from French by the author). The amount of recoverable data will depend
on multiple factors such as the aggressiveness of the GC process and the disk usage. It should
also be noted that because the recovered blocks are not organized in a file system, special
investigation techniques such as data carving will have to be used for files bigger than one
page.
- Bad blocks: according to [FY11], “Since the smallest erasable area unit is the block, for any
unrecoverable error arising in any page, the whole block to which the page belongs will be
invalidated requiring the replacement of such block, so it will not accessed again”. This means
that in some cases, for a single defective page, hundreds of pages could still contain accessible
data. Moreover, when a block has been written to too many times, the writing speed becomes so
slow that the controller may decide not to use it anymore, yet the data stored on this block is
still readable. This is less likely to happen on modern SSD with a proper wear leveling
mechanism, but there is still an interesting potential for recovering some data.
- TRIM: in a very interesting test conducted in [MR13], some data is deleted on an SSD supporting
TRIM. Data recovery is then conducted via a software forensic tool which tries to retrieve data
from the logical location of the deleted files. As expected, the software is only able to
retrieve zeroed-out bytes. The forensic investigators then conduct a physical-access recovery of
the same data and are able to retrieve 93.44% of the deleted data. This is a very important point
about TRIM, which is very often misunderstood in the literature. First of all, as we already
pointed out, the TRIM command does not delete data; it merely informs the controller that this
data is no longer needed, and it is up to the controller to decide what to do with the blocks
containing the data. The tricky part is that after you delete your files on a TRIM-enabled
system, you could see a part of the data still being accessible and a part being unrecoverable.
The logical conclusion would be to think that the unrecoverable data has been physically deleted
from the drive, but this is not true. Let us explain this behavior with a quotation from [NC09]:
“If a read comes down before the first write after a Trim, the data returned can be the last data
written to the range (or) a fixed pattern (zeros or ones)”. Said differently, for deleted data
the controller can decide to return the real data stored in the pages or a list of bytes set to
1 or 0. Actually, it sounds logical that the controller does not systematically and immediately
remove all the data marked for deletion. Indeed, it would not make much sense to clean up this
data immediately after having received the TRIM command, as this could lead to write
amplification in a similar way to what was described in [3.3]. That being said, the reason why
the TRIM command has been specified in such a way is not clear, and it would have been better
from a forensic point of view to return the data physically stored on the flash. Finally, it
would be interesting to conduct a similar test after a quick format on a TRIM-enabled SSD. We
would not be surprised to discover that the data we believed to be gone is still at least
partially present at the physical layer.
As we can see, the potential for data recovery is significantly higher when accessing the
physical layer directly. It should also be noted that because this kind of analysis bypasses the
controller, it prevents any further modification by the GC process. While this is not a
determining advantage in most scenarios, it could still be useful in some specific cases.
4.3 Forensic Imaging
A forensic image is created by imaging the entire drive content to a file. One or more checksum
values are then calculated to verify the integrity of the image file. A forensic image allows the
investigator to work on a copy of the original drive in order to reduce the risk of drive
alteration. The checksum is used as evidence that the original data has not been tampered with,
which is required to ensure the validity of the analysis during a criminal investigation. In this
section we will see how the forensic imaging process is impacted when conducted on SSD.
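The acquire-and-verify workflow just described can be sketched as follows. The paths and the choice of SHA-256 are placeholders; in practice more than one hash type is often recorded.

```python
import hashlib
import os
import tempfile

def acquire(source_path, image_path, chunk_size=1 << 20):
    """Stream the source drive into an image file, hashing as we go."""
    h = hashlib.sha256()
    with open(source_path, "rb") as src, open(image_path, "wb") as img:
        for chunk in iter(lambda: src.read(chunk_size), b""):
            img.write(chunk)
            h.update(chunk)
    return h.hexdigest()          # checksum recorded as evidence

def verify(image_path, expected, chunk_size=1 << 20):
    """Re-hash the image and compare with the recorded checksum."""
    h = hashlib.sha256()
    with open(image_path, "rb") as img:
        for chunk in iter(lambda: img.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == expected

# Demonstrate with a small stand-in for a raw device:
with tempfile.TemporaryDirectory() as tmp:
    drive = os.path.join(tmp, "drive.bin")
    image = os.path.join(tmp, "drive.dd")
    with open(drive, "wb") as f:
        f.write(b"\x00" * 4096 + b"evidence" * 512)
    checksum = acquire(drive, image)
    intact = verify(image, checksum)
```

The workflow assumes the source does not change between two reads, which is precisely the assumption that SSD self-corrosion undermines, as the next sections discuss.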
4.3.1 Write blocker efficiency
According to [RB13], “A write blocker is a special-purpose hardware device or software application
that protects drives from being altered, which is to say that they enforce a look-but-don’t-touch
policy”. Write blockers are used to ensure the drive being investigated is not modified during
the forensic image acquisition process. While they are very effective at protecting HDD, they
unfortunately do not work as well with SSD. In the experiment conducted in [GB10], Windows XP (no
TRIM support) is used to quick format an SSD and the system is immediately shut down afterwards.
The drive is then connected to a hardware write blocker, the system is restarted and the amount
of zeroed-out data is measured. Without a write blocker, the amount of recoverable data is close
to 0% after only 3 minutes. With the write blocker, the results are quite surprising, as they
vary depending on how long one waits at the login screen before starting the test. If one logs in
just after the login screen appears, the amount of recoverable data is close to 100%, but after
waiting 20 minutes before logging in and running the test, 18.74% of the data is zeroed out.
These results demonstrate that data self-corrosion can happen even when a write blocker is
connected to the system. However, they also demonstrate that a write blocker is not necessarily
completely ineffective. In the first test the write blocker prevents nearly all the
self-corrosion; in the second test, only about a sixth of the data is self-corroded, which is
still significantly lower than the result obtained without a write blocker. An explanation for
this behavior would be that the GC process is triggered and/or influenced by some ATA commands
received from the system. This experiment demonstrates once more how difficult it is to predict
the behavior of an SSD, and it is likely that the results would be significantly different if the
same tests were run with different drives, write blockers or testing parameters.
4.2.2 Forensic imaging challenge
The literature on forensic imaging of SSDs is frequently partial and does not provide a
complete overview of the problem. In this section we will try to get a better understanding of
what is and is not possible in terms of forensic imaging of SSDs, and what the consequences
are for the forensic investigation.
The first question to raise is the feasibility of taking multiple identical forensic images of an SSD.
Some articles argue that this is not achievable. For instance,
according to [MS12C], “with no reliable method of repeatedly obtaining the same hash twice,
SSD’s will have to be treated just the same as any other volatile evidence source. Investigators will
have to rely on both documentation and demonstration skills to show exactly what steps have
been taken while working on the evidence, and hope for an understanding jury.” This conclusion
is based on the fact that the tester was able to generate two different images of an SSD connected
to a write blocker. We do not contest the test itself, but we are skeptical
about the conclusion. In some of the tests discussed earlier in this document, we have seen that
data can be modified by the GC process even after the SSD is disconnected from the computer.
Therefore, if one takes an image just after the modification occurs and another image later on, it
is likely that the two images, and hence their checksums, will differ. However, it also seems logical
to think that after a while the GC will have completed all its optimization tasks. From that point
on, no more modifications should occur on the drive, and we believe it should be possible to take
multiple images with the same checksum. In order to verify our assumption we set up the following
test:
Test 1
1. Connect a TRIM-capable SSD to a computer running Windows 7. The SSD contains
data, but its exact content is not important (approximately 50% of 64GB used).
We chose a system supporting TRIM to make the test more difficult, as TRIM
generates more background operations for the GC process when data is deleted. The
drive is connected as a secondary drive via the SATA port.
2. Delete a number of files from the drive and from the recycle bin and, 30 seconds
later, take an image using FTK Imager, which also calculates MD5 and SHA-1 hashes of
the image.
3. When the first image is completed, repeat the imaging procedure twice for a total of 3
images. We expect the checksums of images 1 and 2 to differ and the checksums of
images 2 and 3 to be identical.
Each forensic image took approximately 15 minutes to complete and the results were conclusive.
The first and second images were different, but the second and third images were identical. In order to
see if further modifications were happening at regular intervals on the drive due to the GC process,
we took 2 more forensic images, which had checksums identical to images 2 and 3. This seems to
indicate that at some point the drive reaches a stable state and no further GC activity
occurs. However, this should be tested over a longer period of time to be validated, and it
certainly cannot be extrapolated to all SSDs. What this test demonstrates is that it is possible to take
multiple identical images from this type of drive, which is good news from a forensic investigation
point of view.
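The stability check we performed by hand can be automated: image the source repeatedly and stop when two consecutive acquisitions produce the same hash. The sketch below is a hypothetical illustration (the round count and delay are assumptions); on a real drive each round would read the block device, ideally through a write blocker.

```python
import hashlib
import time

def hash_device(path, chunk_size=1024 * 1024):
    """Return the SHA-1 digest of the full content at `path`."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
    return sha1.hexdigest()

def wait_for_gc_stability(device, max_rounds=10, delay_seconds=0):
    """Hash repeatedly; report the round at which two consecutive hashes match."""
    previous = None
    for round_number in range(1, max_rounds + 1):
        current = hash_device(device)
        if current == previous:
            # Two identical consecutive acquisitions: the drive looks stable.
            return round_number, current
        previous = current
        time.sleep(delay_seconds)
    return None, previous  # never stabilized within max_rounds
```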
Test 2
At this stage we decided to test whether a power off/power on cycle of the SSD would trigger a new
GC cycle. When Windows shuts down or boots, it can modify some file content and/or
metadata. To circumvent that problem we unplugged the SSD while the computer was still
running, plugged the SSD into a write blocker and took an image. The image obtained was identical,
so we decided to take another image after a full reboot. Here again the image did not change.
These tests demonstrate that once this drive has reached what we call “GC stability”, the
logical drive content is no longer modified, even after a reboot or a power off/power on cycle,
assuming the drive is connected to a write blocker. While we have previously seen that the write
blocker cannot be considered fully reliable, these tests demonstrate it is still needed in order
to prevent standard OS modifications to the file system. They also confirm that it is possible to keep
the drive stable after the initial GC optimization.
Test 3
The remaining problem at this stage is that the GC is still modifying the logical content of the drive
while our first image is being created. As we have seen, the drive becomes stable very soon after the
modifications happen (less than 15 minutes after the recycle bin has been emptied).
In our final test we will see whether further modifications can be prevented by the write blocker.
In this test we fill the drive close to its full capacity and delete nearly all the data.
Immediately after emptying the recycle bin we wait for 30 seconds, as in the previous test,
to ensure that the TRIM command has enough time to reach the drive.
Then we unplug the SSD while the computer is still running and connect it to the hardware
write blocker. As soon as possible after the drive is reconnected we take two consecutive images.
We ran the test multiple times and always got the same result: all the images taken after the write
blocker was connected to the drive were identical. This should certainly not lead us to conclude
that the write blocker effectively prevents the GC from running. Multiple factors differ
between the tests with and without a write blocker. For instance, because the write
blocker uses USB, the TRIM function is disabled. In theory this should not impact the test because
the OS had 30 seconds to send the TRIM command to the drive, but we assumed, maybe
incorrectly, that the TRIM command was sent immediately after file deletion. Another factor is
that the imaging process takes significantly longer over USB (1 hour instead of 15 minutes).
This could result in an untouched part of the drive being imaged first and the
modified part being imaged after the GC completed. Finally, we have seen in the tests
conducted in [GB10] that the write blocker could behave differently based on apparently
unrelated parameters. None of these explanations is really satisfactory, and the behavior could
in any case be completely different on a different SSD. There is however a logical conclusion which
can be drawn from this test. Even if it may seem counter-intuitive, this test and the tests conducted in [GB10]
demonstrate that, in some situations, a hardware write blocker can, at least partially, preserve
some of the data on the drive.
Test findings summary:
- For the tested SSD, and once the ongoing GC tasks are completed, it seems possible to keep
the drive in a stable state when connected to a computer, even after multiple reboots (in this
last scenario a write blocker is needed).
- If a hardware write blocker is connected to the drive immediately after a data-erase
command is sent, the write blocker may prevent some further modifications to the
drive.
- It will probably never be possible to guarantee that no modification occurred on an SSD
between the time the drive is seized and the time the image acquisition starts, because some
modifications can happen simply because the drive is powered on.
- A hardware write blocker is still a very useful tool, at least to prevent file system modifications
during a system reboot or a manipulation error.
We must keep in mind that these results have only been validated for a specific SSD with a specific
hardware write blocker and are not necessarily true if the test conditions change.
To conclude, it seems possible to take multiple identical images of the same SSD, but we cannot
be sure that no modifications occurred between the time the drive was seized by the
authorities and the time the first image was taken. This opens the way to legal debate.
Indeed, it seems reasonable to think that even if the data has been modified on the drive, the GC
process can only remove data, not add to or modify existing data. However, given the opacity and
the diversity of GC technologies, this is an assumption which might be very difficult to prove
in court.
4.3 Data Sanitization
In this section we will first analyze the effectiveness of the most common techniques used to
securely wipe files and drives. We will then briefly cover the problem of data remanence
after deletion. Finally, we will address the effectiveness of several techniques used to physically
destroy flash memory.
4.3.1 File secure wiping
There are multiple situations in which a user may wish to sanitize a file to make it unrecoverable.
On an HDD this can usually be accomplished easily with a dedicated tool that overwrites
the file content one or more times with zeros, ones or random values. On SSDs things are
different because of the additional abstraction layer provided by the FTL. We will discuss the
differences on the basis of two distinct studies.
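For reference, a classic HDD-style overwrite wipe looks like the sketch below (the pattern sequence and block size are illustrative choices). On an HDD the logical overwrite also hits the physical sectors; on an SSD the FTL typically redirects each pass to fresh flash pages, which is precisely why this approach fails at the physical layer.

```python
import os

def overwrite_wipe(path, passes=3, block_size=65536):
    """Classic HDD-style wipe: overwrite the file's logical content in place,
    then unlink it. On an SSD the FTL may redirect each write to fresh pages,
    so the old physical pages can survive until the GC erases them."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for pass_number in range(passes):
            f.seek(0)
            written = 0
            while written < size:
                n = min(block_size, size - written)
                # Classic pattern sequence: zeros, then ones, then random data.
                if pass_number % 3 == 0:
                    block = b"\x00" * n
                elif pass_number % 3 == 1:
                    block = b"\xff" * n
                else:
                    block = os.urandom(n)
                f.write(block)
                written += n
            f.flush()
            os.fsync(f.fileno())  # force each pass to reach the device
    os.remove(path)
```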
In the first study [MF09], multiple tests are conducted to sanitize files with well-known tools and
commands called “eraser”, “sdelete”, “wipe” and “dd”. This is achieved by copying various files
of different sizes and types onto the drive and deleting them with one of the aforementioned tools.
A bit-for-bit image is then taken with dd and analyzed with “Scalpel”, a carving tool
used to recover as much data as possible. The test is repeated for each of the tools and the
percentage of recoverable data is measured.
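Signature-based carving, as performed by “Scalpel”, essentially scans the raw image for known header and footer byte sequences. A minimal, simplified sketch for JPEG signatures (real carvers handle many formats, fragmentation and validation far more carefully):

```python
JPEG_HEADER = b"\xff\xd8\xff"  # JPEG start-of-image signature
JPEG_FOOTER = b"\xff\xd9"      # JPEG end-of-image signature

def carve_jpegs(image_bytes, max_size=10 * 1024 * 1024):
    """Scan a raw image for JPEG signatures and return candidate files."""
    carved = []
    start = image_bytes.find(JPEG_HEADER)
    while start != -1:
        # Naive approach: pair each header with the first footer after it.
        end = image_bytes.find(JPEG_FOOTER, start + len(JPEG_HEADER))
        if end == -1:
            break
        end += len(JPEG_FOOTER)
        if end - start <= max_size:
            carved.append(image_bytes[start:end])
        start = image_bytes.find(JPEG_HEADER, start + 1)
    return carved
```

As the study illustrates, a candidate carved this way may contain stale or mixed flash pages and thus fail to load as a valid file, even though its bytes were recovered.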
According to [MF09], “It appears that files cannot be recovered after a SSD has been securely
wiped, even from the cache function used by SSDs to delete data to create free space”. This
sentence is an excellent example of a conclusion which can easily be misinterpreted when taken
out of its original context. In order to better understand it, we will shed some light on certain
aspects of this study to put the conclusion in perspective. The study demonstrates that among all
files analyzed by “Scalpel”, none of the carved files was loadable, meaning
they could not be opened or viewed, which the authors seem to consider proof of secure wiping.
However, it is very important to note that “Scalpel” found between 17 and more than 3000 files
depending on the sanitization command used. One notable exception is “dd”, for which no file
could be found by “Scalpel”. The fact that a file cannot be loaded, or that carved data cannot be
linked to a file, does not mean that no data can be recovered. This seems to be confirmed in [MF09]: “While no
carved files were totally recovered, a large amount of data was carved from each image”. Coming
back to the original conclusion drawn in this article, we would only consider the deleted files
securely wiped if not a single byte of them was recoverable, which is apparently not the case.
In fact, based on the presented test results, our conclusion would be that files cannot be
securely wiped on SSDs with traditional file sanitization tools. Finally, it is again very important to
note that all the conclusions of this study are based on logical data recovery and that results
could be completely different at the physical layer. This is what we will discuss in the
next paragraph.
In the study conducted in [MW10], a series of 1GB files filled with fingerprints was created on a
drive. A different sanitization technique was used on each file, for a total of 13 different protocols
published as a variety of government standards as well as commercial software. After
sanitization the drive was dismantled and searched for fingerprints. In order to minimize
interactions between the multiple techniques used on the same drives, the test was conducted 3
times with the sanitization techniques applied in a different order. According to [MW10], “All
single-file overwrite sanitization protocols failed … between 4% and 75% of the files’ contents
remained on the SATA SSDs”. The researchers also tried overwriting the free space 100 times and
defragmenting the drive in order to encourage the FTL to reuse more physical storage, but this
was also ineffective, with the percentage of recoverable free space ranging from 79% to 87%. While
this test was conducted on a single drive type, the results obtained are significant enough to
demonstrate that file secure wiping at the physical layer is ineffective. This makes perfect sense if
you think about the cell lifetime limitation: the controller must preserve the cells and should only
erase them when necessary for performance.
While some drives may behave better than others in this respect, it seems pretty clear
that file sanitization is not fully possible on SSDs using the standard solutions designed for HDDs. At
the logical layer the results look better and would probably improve further with TRIM
enabled. However, to be called secure, a wiping technique must sanitize 100% of the data,
which does not seem to be possible with current standard tools.
4.3.2 Drive sanitization
In the study conducted in [MW10], two categories of drive sanitization techniques are tested: the
ATA security commands and the overwriting techniques. After the sanitization is completed, the
drive is dismantled and the remaining data are physically read from the flash. The ERASE UNIT ATA
command erases all user-accessible areas by writing binary zeros or ones, while the ERASE
UNIT ENH command is similar but uses a vendor-defined pattern. The test was conducted on
multiple drives after filling them multiple times with fingerprints in order to use as much
overprovisioned space as possible. Among the drives supporting the ERASE UNIT command, 4
passed the test successfully, 2 others reported the command as having failed, and the last one
reported the command as successful while all the data was still accessible on the drive. All 4
drives supporting the ERASE UNIT ENH command passed the test successfully, but it is worth noting
that these were also the drives which passed the ERASE UNIT test. The conclusion is
that one should not rely on a drive’s support for these ATA commands as evidence that it
can be properly sanitized. Actually, it seems a bit strange that some of the drives passed the test,
because these two ATA commands are only supposed to sanitize the user-accessible areas, which
means that, even if the command is properly implemented, there should be some residual data
in the spare and overprovisioning areas. We have no explanation for this and it would
be interesting to know the reason for this behavior. Finally, the study also refers to the new ACS-2
specification, which defines a BLOCK ERASE command supposed to erase all memory
blocks containing data even if they are not user-accessible. This specification was still in draft at
the time of the study and none of the tested drives supported it.
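On Linux, these ATA security commands are typically issued with hdparm. The sketch below only builds and runs the command lines; it is destructive by design, the device path and password are assumptions, and, as the study shows, a successful return code is no guarantee that the drive was actually sanitized.

```python
import subprocess

def build_secure_erase_commands(device, password="p", enhanced=False):
    """Build the hdparm invocations for ATA SECURITY ERASE (UNIT / UNIT ENH)."""
    erase_flag = "--security-erase-enhanced" if enhanced else "--security-erase"
    return [
        # A user password must be set before the drive accepts an erase command.
        ["hdparm", "--user-master", "u", "--security-set-pass", password, device],
        ["hdparm", "--user-master", "u", erase_flag, password, device],
    ]

def ata_secure_erase(device, password="p", enhanced=False):
    """DESTRUCTIVE: run both hdparm steps against e.g. /dev/sdb (needs root,
    and the drive must not be in the 'security frozen' state)."""
    for command in build_secure_erase_commands(device, password, enhanced):
        subprocess.run(command, check=True)
```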
The second category of sanitization tests conducted in [MW10] was based on “sequentially
overwriting the drive with between 1 and 35 bit patterns”. To measure the effectiveness of the
sanitization, the drives were filled with fingerprints and then completely overwritten with
new fingerprints. The initialization and overwriting phases were performed sequentially or
randomly, and the researchers measured how many passes were required to ensure that none of the
original fingerprints remained. As shown in Table 5, the best results are achieved when both the
initialization and the overwriting are sequential. Under these specific circumstances most of the drives
are sanitized after 2 passes, with the notable exception of 1 drive not being sanitized even after 20
passes. The rest of the tests are more difficult to interpret because of the limited number of
SSDs available (each test is destructive) and the time required when random initialization
or overwriting was involved. These results led the authors to conclude: “Overall, the results for
overwriting are poor: while overwriting appears to be effective in some cases across a wide range
of drives, it is clearly not universally reliable. It seems unlikely that an individual or organization
expending the effort to sanitize a device would be satisfied with this level of performance”.
Table 5: Drive Sanitization with overwriting techniques
Source [MW10]
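The fingerprinting methodology of [MW10] can be mimicked at small scale: fill the storage with uniquely marked blocks, sanitize, image the flash, and count how many markers survive. A toy sketch (the marker format and block size are our own choices):

```python
import os
import struct

def make_fingerprints(count, size=512):
    """Generate 'fingerprinted' blocks: each carries a unique, searchable marker."""
    blocks = []
    for i in range(count):
        marker = b"FPRINT" + struct.pack(">I", i)  # 10-byte unique marker
        blocks.append(marker + os.urandom(size - len(marker)))
    return blocks

def surviving_fingerprints(image_bytes, count):
    """Count how many of the original fingerprints still appear in the image."""
    survivors = 0
    for i in range(count):
        if b"FPRINT" + struct.pack(">I", i) in image_bytes:
            survivors += 1
    return survivors
```

The ratio `surviving_fingerprints / count` corresponds to the "percentage of files' contents remaining" metric reported in the study.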
4.3.3 Data remanence
In his 1996 article [PG96], Peter Gutmann suggested that it was possible to retrieve overwritten
information from HDDs with a technique called magnetic force microscopy. While the feasibility of
such a recovery technique is questioned in articles such as [DF03], it is still worth asking whether
it is, at least theoretically, possible to retrieve information from flash cells after they have been
overwritten.
According to [ES08], “Data remanence in NAND flash is mainly caused by a so-called hot-carrier
effect, where electrons get trapped in the gate oxide layer and can stay there as excess charge.
The amount of trapped charge can be determined by measuring the gate-induced drain leakage
current of the cell, or more indirectly by measuring the threshold voltage of the cell”. It is thus
theoretically feasible to retrieve information from NAND cells even after the data has been overwritten.
However, according to [PG01] and [SH89], “the changes are particularly apparent in fresh cells but
tend to become less noticeable after around 10 program/erase cycles”. Data remanence
should thus mainly be considered a possibility on barely used drives or on drives where wear
leveling is not effectively implemented. It should also be noted that, because of the smaller
voltage differences involved in multi-level cells, data are more difficult to recover on MLC and
TLC than on SLC. Finally, at the time of writing, recovering data from overwritten NAND cells should
be considered an extremely difficult and uncertain task.
4.3.4 Physical destruction
When data sensitivity requires it, it may be necessary to physically destroy a drive. For HDDs, an
effective technique called degaussing can be used to eliminate the magnetic fields
stored on the disk. According to [MW10], “Degaussing is a fast, effective means of destroying hard
drives, since it removes the disks low-level formatting (along with all the data) and damages the
drive motor”. A test described in [MW10] applies the same degaussing technique to 7 flash chips.
While flash does not rely on magnetism to store data, the idea was to see if the current induced by
the strong alternating magnetic fields in the chip’s metal layer could damage the chip and make it
unreadable. The results showed that the data remained intact on all chips. This demonstrates once
more that techniques applicable to HDDs should be cautiously considered and tested before
being used on SSDs.
In a very enlightening and entertaining document [BP07], multiple destructive tests are conducted
against flash chips stored in USB sticks. The stick is then either connected to a PC when possible, or
dismantled to access the flash directly and attempt data recovery. While shooting the stick with a
pistol, smashing it with a hammer or cooking it in a microwave oven definitely made the stick
unreadable, some other very aggressive techniques proved to be ineffective.
Figure 19: Examples of inefficient destructive techniques
Source [BP07]
As shown in Figure 19, some USB sticks which may look completely unrecoverable after having
been incinerated in a petrol can for several minutes are actually fully recoverable. Other tests,
such as stomping on the stick multiple times, soaking it in water or overvolting it with a car
battery, gave the same results and the data proved to be recoverable. Beyond the entertaining aspect
of these experiments, the important fact to remember is that a destructive technique should not
be assumed effective on SSDs unless it has been scientifically validated.
4.5 Chapter conclusions
In this chapter we have seen that data recovery on SSDs can be pretty challenging compared
to recovery on HDDs. At the logical layer, the GC process and the TRIM command can eliminate a
lot of data from the drive very quickly, making the life of forensic investigators more difficult. On
the other hand, we have also seen that SSD behavior varies greatly depending on the conditions
and that in some circumstances the amount of data which can be recovered is very similar to what
is recoverable from an HDD. At the physical layer, the amount of recoverable data is quite high,
and data apparently deleted by TRIM/GC may actually still be present on the drive. Moreover,
the spare and overprovisioning areas increase the amount of data potentially recoverable, and
some data can be duplicated a significant number of times. This leads to the conclusion that SSDs
are not necessarily worse than HDDs in terms of data recovery; they are just different and their
behavior is less predictable. However, the techniques used to recover data at the physical layer are
more complex, time consuming and costly, which means they are not suitable in all
circumstances.
We have seen that the forensic imaging process is more difficult on SSDs because of the GC process,
which can modify data during the acquisition phase and lead to legal issues. However, in the
tests we carried out, forensic imaging was still achievable under good conditions, though further
research should be conducted in this area in order to draw more definitive conclusions.
We have also demonstrated that secure file wiping is apparently not achievable with standard
techniques and that the reliability of the most common drive sanitization techniques varies greatly
depending on the drive and the method used. While the presented sanitization techniques cannot
be considered perfectly secure, they could still be suitable for some users depending on
their expectations and the value of the data to protect. We also saw that, even if theoretically
feasible, recovery of overwritten data based on flash cell remanence should not be of much
concern in real-life situations. We finally showed that physical destruction methods must be
carefully evaluated before being used on flash cells.
5 Analysis of potential mitigation techniques in the Enterprise
Based on the outcomes of chapter 4, this chapter will assess some of the existing techniques and
propose innovative solutions which could be used to mitigate the negative impact of SSDs on
forensic investigation. We will first assess whether software and hardware drive encryption can help
improve deleted data recovery and drive sanitization. After that we will propose solutions
to improve recovery of deleted data at the logical layer, file sanitization and drive imaging.
The proposed solutions are Enterprise oriented, which means we assume full control
of the whole storage setup.
5.1 SSD encryption
In this section we will analyse the effect of encryption on SSDs from a forensic point of view. The
idea behind this approach is that encrypting drives could paradoxically help data recovery by
minimizing the effect of data removal on the drive. This is seen from an enterprise perspective,
which means the encryption key is known to the forensic investigator. We will also evaluate whether
destroying the encryption key can be considered an effective means of sanitizing the drive.
5.1.1 Software based encryption
Let us consider a partition or a drive encrypted with a tool such as “TrueCrypt”. In this situation the
whole volume is encrypted, which means that even an advanced GC process, aware of the file
system, will not be able to determine which files have been deleted or modified. So, theoretically,
the GC should not be able to clean up the unused space and should thus preserve the deleted data.
This assumption is true if we consider the GC alone, but two situations should be taken into
consideration.
If TRIM is not enabled, according to [ES12], “As TRIM is a major let-down in forensic data recovery
of any deleted information, by acquiring the PC with an SSD drive holding an encrypted volume
opens the way to access all information stored in the container including deleted files (provided
that a decryption key is known)”. In this scenario, the SSD behaves similarly to an HDD in terms of
deleted data management, because there is no TRIM and the GC process cannot identify whether
any part of the volume contains deleted data due to the full volume encryption.
If TRIM is enabled, the OS will be able to inform the drive about the pages which are no longer
in use. Because the drive can then decide to zero out this space, data stored at that location is not
logically recoverable. It is worth noting that TRIM can only work effectively if it is not blocked by the
software encryption tool. TrueCrypt, for instance, allows TRIM to function properly even on an
encrypted partition.
Table 6: Drive encryption performance impact
Source [MW12]
Software encryption can assist forensic investigation of deleted data at the logical level when
TRIM is disabled. The real question is whether going down that road makes sense for an
enterprise. Indeed, there are some drawbacks associated with software encryption. The biggest
impact comes from the drop in performance, which can be quite significant and reduce the benefit
of using SSDs, as shown in Table 6. Having the TRIM function disabled will aggravate that problem,
especially if there is not enough free space available to let the GC process optimize the write
operations. This solution would also add an extra layer of complexity, as all drives would have to
be encrypted and a key management system implemented. Even so, this could still be a viable
approach for enterprises which intend to use software encryption anyway. However, and even if
it is beyond the scope of this discussion, it would then be very important for these enterprises to
seriously evaluate the drawbacks in terms of data protection when using software
encryption on SSDs. Among these problems, the fact that software encryption is too slow to cope with
SSD performance has led some software developers to propose options to improve the
the performance. For instance, in PGP, according to [BG11],”AES-128 support was introduced to
speed up encryption/decryption operations in order to keep up with the much increased
throughput of SSD drives. Weaker 128-bit encryption is less secure compared to traditionally used
AES-256”. Similarly, according to [BG11], “…To combat this, we introduced a command line option:
--fast. If you encrypt using this option, it doesn't encrypt blank sectors. Due to security
considerations, this is an advanced option only available on the command line”. It should also be
noted that because of the data duplication, it would be a lot better to create the encrypted
partition before starting to store any data on the drive as explained in [TC13], “Due to security
reasons, we recommend that TrueCrypt volumes are not created/stored on devices (or in file
systems) that utilize a wear-leveling mechanism (and that TrueCrypt is not used to encrypt any
portions of such devices or file systems). If you decide not to follow this recommendation and you
intend to use in-place encryption on a drive that utilizes wear-leveling mechanisms, make sure the
partition/drive does not contain any sensitive data before you fully encrypt it (TrueCrypt cannot
reliably perform secure in-place encryption of existing data on such a drive; however, after the
partition/drive has been fully encrypted, any new data that will be saved to it will be reliably
encrypted on the fly)”.
5.1.2 Self-Encrypting Drives
On a SED (Self-Encrypting Drive), the whole encryption process is handled by the drive on a single
chip called the hardware encryptor. This technology is available for HDDs and SSDs, and many SED
vendors have adopted the Opal SSC specifications of the Trusted Computing Group (TCG), a
not-for-profit and vendor-neutral organization, which provide a wide set of functionalities
including authentication, access control and cryptographic commands. There are multiple ways to
secure the data on a SED. Let us quickly review one of them, proposed by the TCG. As shown in
Figure 20, the BIOS attempts to read the MBR, which redirects towards a pre-boot area where a
pre-boot OS is loaded. This is usually a tiny hardened Linux containing the minimal set of
instructions needed to continue the process. The user is then asked to authenticate via
a password called the authentication key (AK). The AK is hashed and the hash is
compared with the value stored on the drive. If they match, the user is authenticated and the AK
is used to decrypt the DEK (Data Encryption Key) stored on the drive. The drive is now able to use
the DEK to decrypt all data stored on the drive and can load the original MBR and start normal
operations.
Figure 20: SED security implementation
Source [MW12]
From a practical point of view, SEDs provide many benefits. For instance, after the initial
authentication, the process is completely transparent to the user and, as shown in Table 6, the
performance of a SED is very similar to that of an unencrypted drive. Key management is also a lot
easier, as the DEK is managed by the drive and never leaves it. Changing the authentication key is
also an easy task which does not require re-encrypting the drive: the only thing to do is to re-encrypt
the DEK with the new AK.
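The AK/DEK relationship can be modelled in a few lines. The sketch below is a toy: it uses PBKDF2 and a simple XOR wrap purely for illustration, whereas a real SED performs proper key wrapping (e.g. AES) in hardware. Note how changing the AK only re-wraps the DEK and never touches the bulk data.

```python
import hashlib
import hmac
import os

def _kek(ak, salt):
    # Derive a key-encryption key from the authentication key (user password).
    return hashlib.pbkdf2_hmac("sha256", ak.encode(), salt, 100_000, dklen=32)

def provision(ak):
    """Simulate drive provisioning: generate a DEK and store it wrapped by the AK."""
    salt, dek = os.urandom(16), os.urandom(32)
    kek = _kek(ak, salt)
    wrapped = bytes(a ^ b for a, b in zip(dek, kek))  # toy XOR wrap, NOT real key wrap
    ak_hash = hashlib.sha256(salt + ak.encode()).digest()
    return {"salt": salt, "wrapped_dek": wrapped, "ak_hash": ak_hash}

def unlock(drive, ak):
    """Verify the AK against its stored hash, then unwrap and return the DEK."""
    candidate = hashlib.sha256(drive["salt"] + ak.encode()).digest()
    if not hmac.compare_digest(candidate, drive["ak_hash"]):
        return None  # authentication failed; DEK stays locked
    kek = _kek(ak, drive["salt"])
    return bytes(a ^ b for a, b in zip(drive["wrapped_dek"], kek))

def change_ak(drive, old_ak, new_ak):
    """Re-keying only re-wraps the DEK; the data itself is never re-encrypted."""
    dek = unlock(drive, old_ak)
    assert dek is not None, "old AK rejected"
    kek = _kek(new_ak, drive["salt"])
    drive["wrapped_dek"] = bytes(a ^ b for a, b in zip(dek, kek))
    drive["ak_hash"] = hashlib.sha256(drive["salt"] + new_ak.encode()).digest()
```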
From a security point of view, SEDs are better than software encryption. For instance, user data is
always encrypted and encryption cannot be turned off. As the DEK is generated on the drive and
never leaves it, attacks against the key cannot be conducted at the host level. Finally, the DEK is
stored in a safe area of the drive, which makes it hard to retrieve, and even then, because the DEK is
encrypted, the key would still have to be decrypted before being usable.
Now that we have some level of confidence that the entire data set is properly protected by the
encryption mechanism, let us see if encryption can be considered a proper form of sanitization.
It could be argued that SEDs are theoretically vulnerable to some types of attack. For instance, the
authentication key, which is basically the user password, is probably the most vulnerable part of
the cryptographic system we described. The password is likely to be easier to break than
the DEK, which would defeat the protection. Even if some drives implement additional
mechanisms to slow down or seriously obstruct password brute forcing, this threat should still be
taken seriously. This is why the destruction of the DEK is a compulsory requirement when
using encryption as a sanitization mechanism. The theory is that if a drive is encrypted properly,
destroying the encryption key should make the data irrecoverable, which can be considered a
form of sanitization, usually referred to as Cryptographic Disk Erasure.
With software encryption, we have seen there is a risk that unencrypted data remains available
on the drive if encryption was set up while unencrypted data was already present on the
drive. This cannot happen with a SED, as all data is encrypted from the beginning. Also, with
software encryption and according to [ES12], “In particular, encrypted volume headers that are
typically overwritten when the user changes the password may have multiple replicas in accessible
and non-accessible areas of the flash memory. Non-accessible areas, in turn, can be read with
inexpensive FPGA or controller-based hardware, providing an attacker the ability to try one of the
previously used passwords to derive decryption keys”. Given these risks, cryptographic disk
erasure should be viewed with skepticism when software encryption is used.
SEDs, by contrast, use a far more secure approach, and it seems reasonable to consider
cryptographic disk erasure an appropriate means of sanitizing data on them. However, there
are some residual risks which shouldn't be underestimated. For instance, we have seen that some
drives don’t properly implement the standard ATA commands used to erase the drive. The same
kind of issue could happen during the key erasure process. According to [SS10], “the
implementation of key destruction may be faulty. The long history of incorrectly or insecurely
implemented cryptographic systems, makes it likely that these weakness will exist in at least some
SSDs”. It should also be noted that the implementation we described in Figure 20 is one way to
implement security mechanisms in SED but other potentially less effective approaches could be
adopted by some SED vendors. As we can see, the effectiveness of the sanitization will depend on
the proper implementation of the encryption mechanism and on the proper encryption key
destruction. Consequently, it is very important for an enterprise to gain some assurance about
the proper functioning of their SEDs if they plan to use cryptographic disk erasure. Some device
vendors have understood this problem and propose SEDs like the SAMSUNG SSD PM810 SED FIPS
140, compliant with the FIPS 140-2 level 2 standard which can provide a sufficient level of
confidence to their customers. Unfortunately, the choice on the market is currently far too limited,
which makes this sanitization technique somewhat risky. To circumvent this problem, more drives
should be evaluated or accredited to ensure they provide a proper level of sanitization, which could
be challenging given the diversity of drives and controllers available on the market.
In conclusion, and as agreed by NIST (National Institute of Standards and Technology) in [RK12],
“Crypto Erase is an acceptable means to purge data on SSD”. This recommendation is however
only valid if the drive properly implements cryptographic disk erasure, and we would strongly
recommend following the NIST advice in [RK12], “Optionally: After Cryptographic Erase is
successfully applied to a device, use the block erase command (if supported) to block erase the
media. If the block erase command is not supported, SecureErase or the Clear procedure could
alternatively be applied”. Using both valid drive sanitization techniques (crypto disk erasure and
standard ATA erase commands) together should mitigate the risk resulting from improper
implementation of these commands on the drive. That being said, the risk of having both sets of
commands implemented incorrectly cannot be completely eliminated unless the drive has been
carefully tested or accredited.
5.2 Software optimization proposal
So far we have seen that drive sanitization can successfully be achieved through cryptographic
disk erasure or standard ATA commands assuming they are correctly implemented in the
controller. While encryption can also assist in improving deleted data recovery at the logical layer,
this solution can be quite constraining relative to the benefit it provides. We couldn't find any
fully satisfactory solution proposed by the industry to overcome the drawbacks of the SSD
technology with regards to logical data recovery, file sanitization and drive imaging. In this section
we will propose a few avenues which could help not only to tackle the aforementioned issues but also
to improve the forensic capability of SSD compared to traditional HDD.
5.2.1 Drive imaging
The drive imaging issue comes from the background actions taken by the GC process, sometimes
assisted by TRIM. Because the controller acts autonomously, it seems impossible to control the
SSD behavior when we need to image the drive. However, we believe that this
could be solved by adding a new standard command (ATA for instance) supported by the drive to
force it to immediately stop any further destructive action. The drive would then enter into a kind
of frozen state, which would prevent any further modification. The only requests which would be
accepted by the drive in this state would be the read commands but all the background processes
such as wear leveling and GC would stop and all write requests would be refused. The benefits
would be:
- Drive imaging becomes easier and more reliable
- The amount of recoverable data increases
- The command is relatively simple for the drive maker to implement
- It can be implemented directly at the hardware level in a disk duplicator: when the drive is
seized, the forensic investigator can unplug it immediately and plug it into the disk duplicator,
which will immediately send the freezing command to the drive, minimizing the chance of data
modification
- The state can be reverted back to normal operations
- Legal issues would be easier to tackle
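To make the proposal concrete, the frozen state could be modeled as follows. This is a minimal sketch of the intended semantics only: no such ATA command exists today, and the class and method names are our own.

```python
class ForensicSSD:
    """Toy model of the proposed 'freeze' command (hypothetical)."""

    def __init__(self):
        self.pages = {}        # logical address -> data
        self.stale = set()     # addresses the GC is allowed to erase
        self.frozen = False

    def freeze(self):
        # Proposed command: suspend GC, wear leveling and all writes.
        self.frozen = True

    def thaw(self):
        # The state can be reverted back to normal operations.
        self.frozen = False

    def read(self, addr):
        return self.pages.get(addr)      # reads are always accepted

    def write(self, addr, data):
        if self.frozen:
            raise PermissionError("drive frozen: write refused")
        self.pages[addr] = data

    def garbage_collect(self):
        if self.frozen:
            return                       # background clean-up suspended
        for addr in self.stale:
            self.pages.pop(addr, None)   # normal destructive clean-up
        self.stale.clear()
```

In this model, a disk duplicator sending `freeze()` immediately after seizure guarantees that repeated reads return identical content, which is exactly the property needed for forensically identical images.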
5.2.2 Logical data recovery
In order to improve logical data recovery we have to cope with two problems: the GC process and
the TRIM command. Let's discuss the TRIM command first.
5.2.2.1 TRIM specification and TRIM On Demand
As we have seen the TRIM command is implemented at the OS level and informs the SSD when a
file is deleted. After having received the command the controller knows this logical space is no
longer in use and can thus decide to clean-up the physical space associated with it. We have also
seen that when a request is made at the OS level to retrieve data stored at that logical location,
the controller can decide to show the data really stored at the physical layer or to display a list of
zeros or ones. This is one of the most annoying behaviors of the drive because we have seen that
a significant amount of the “TRIMmed” data can still be present at the physical layer but is not
retrievable at the logical layer. We suggest that the specification of the TRIM command should be
changed to force the controller to return the data physically stored until it has actually been
deleted. We believe this solution would significantly increase the amount of logically deleted data
recoverable at the logical layer, because the controller has no reason to delete all the available
pages quickly, given the write amplification this would imply and the negative effect it would
have on cell lifetime.
We also believe the TRIM command, currently embedded at the OS level, could be handled more
efficiently at the application layer via third-party software. This is what we call “TRIM On
Demand”. Let's imagine the following scenario: a freshly installed OS on a 100GB partition. After
the OS installation, the storage used is 5GB. The user now copies 20GB of data to the drive and a
bit later deletes 10GB of data. Normally the TRIM command should be sent with the list of files
deleted but this means these 10GB of data, or at least a part of those, will probably become
irrecoverable at the logical layer soon after the TRIM command is issued. Actually there is no
reason for this to happen. There is plenty of space available on the drive, not even taking into
account the overprovisioned space, to optimize write operations and it would be more beneficial
to be able to retrieve this deleted data if needed. Now let’s imagine a third-party software is used
to handle this situation. The software knows that 25GB of data has been written to the disk. It also
knows that 10GB of data has been deleted and it will keep the list of deleted files. However this
software won’t send the TRIM command now because there is no need for that and it will wait
until a predefined threshold is reached. The software could for instance decide that 10GB of
physical space must always be free. For instance, in our example, if the user adds 70GB of data to
the drive, the logical amount of data will be 85GB but at the physical layer 95GB will be used.
When the amount of physical data goes above the threshold (90GB in our example), the third-
party software will issue the TRIM command for 5GB of previously deleted files to allow the GC
process to free-up some space for future writing. Practically, once the threshold is reached, the
amount of data physically stored on the drive will never go under that threshold anymore. For
instance, if the user deletes 20GB of data, no TRIM command will be sent and we will stay at a
90GB drive space usage. On the other hand if the user adds 5GB, a TRIM command will be issued
again with a list of files having a total size of around 5GB. The drive will thus behave as if it was
close to its maximum capacity all the time, which is a situation a good SSD should be able to handle
easily.
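The deferred-TRIM policy described above can be sketched in a few lines. This is an illustrative model only, not an implementation of any existing tool; the class name, method names and the file-granular accounting are our own simplifications.

```python
class TrimOnDemand:
    """Sketch of the proposed TRIM On Demand policy (hypothetical).

    TRIM is deferred until physical usage crosses a threshold; only then
    are the oldest deleted files handed to the drive for clean-up.
    """

    def __init__(self, capacity_gb: int, threshold_gb: int):
        self.capacity = capacity_gb
        self.threshold = threshold_gb
        self.live = {}        # file name -> size in GB, valid logical data
        self.deleted = []     # (name, size) still on flash, TRIM deferred

    def physical_usage(self) -> int:
        # Physical usage counts both live data and untrimmed deleted data.
        return sum(self.live.values()) + sum(s for _, s in self.deleted)

    def write(self, name: str, size_gb: int) -> list:
        self.live[name] = size_gb
        # Issue a deferred TRIM only once the threshold is exceeded.
        trimmed = []
        while self.physical_usage() > self.threshold and self.deleted:
            trimmed.append(self.deleted.pop(0))   # oldest deletions first
        return trimmed    # file list that would go into the TRIM command

    def delete(self, name: str) -> None:
        # Logical deletion only: no TRIM yet, data remains recoverable.
        self.deleted.append((name, self.live.pop(name)))
```

Replaying the scenario above: after the OS (5GB), 20GB of writes and 10GB of deletions, physical usage is 25GB and nothing is TRIMmed; adding 70GB more pushes physical usage to 95GB, so 5GB of the oldest deleted files are TRIMmed, bringing usage back down to the 90GB threshold.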
If this approach is used, the amount of recoverable data is not only bigger, it is also customizable
and it is even possible to secure the erased data. Let’s come back to our example. When you delete
data, it is possible that you overwrite your deleted data the next time you write new data. In
theory there is not much you can do about that because it is the OS which is in charge of choosing
where to write the new data. In our example, it would mean that the third-party software must
take that fact into account when making calculations about the used and unused space at the
physical layer. However, this software could also interact with the OS to decide where to write the
new data. By doing this it should in theory be possible to prevent the overwriting of deleted data
at least as long as there is enough space available to write the new data on the drive. Another
benefit of such a mechanism is that the user could be in full control of the TRIM and write
operations behavior. It would thus be possible to customize the behavior of the drive according
to user needs. For instance:
- Like in our example, a threshold can be defined as the maximum amount of physical
space which can be used on the drive. Part of this physical space will hold valid
data at the logical layer, while the rest of the storage will be used for deleted data
retention. This threshold could be defined in terms of capacity (in GB), percentage,
or even in terms of performance degradation of write operations.
- It should be possible to prioritize the order in which deleted files are “TRIMmed”. For instance,
temporary files could be “TRIMmed” first, while files contained in some specific
directories are kept as long as possible.
- A minimum time before “TRIMming” could be defined. For instance, it could be desirable
to keep deleted files for at least one day before issuing a TRIM command for them. This could
help prevent a malicious user from sanitizing sensitive files if he suspects he will soon be
caught.
We believe this solution is quite interesting because it is purely software-based and requires no
modification of the controller algorithm. It should be relatively easy to implement and very
flexible in terms of the functionality offered by the software vendor. The cost should be relatively
low, and the solution should be manageable at large scale, making it suitable for an enterprise.
Finally this solution would not only bring the SSD to a level similar to HDD in terms of data
recovery, it would even improve the level of protection according to user needs.
5.2.2.2 Garbage Collection Customization
The second problem we face in order to optimize our chances of recovering deleted data is the GC
process. We have seen that the GC is usually not aware of the file system and doesn't know when
files are deleted, which is why the TRIM function is required. However, this is not always true:
some controllers have some understanding of the file system and/or of the drive partitions, as
shown in the test conducted in [GB10] that we discussed in [4.1.1]. Thanks to that knowledge, the
GC process is able to re-use space which has been quickly formatted, re-partitioned and even in
some cases, space used by deleted files. We believe the controllers will continue to become
smarter in the future and will be able to detect more and more situations where the logical space
is no longer in use, meaning that the allocated physical space can be cleaned-up for write
optimization and wear leveling by the GC. This is actually a good thing from a performance and
drive optimization point of view, but it interferes with the solution we proposed in the
previous section, because the controller could detect unused space and clean it up, irretrievably
deleting the data stored on it. This mechanism would, at least partially, bypass the approach
we described and cause the third-party software we discussed to make wrong decisions.
Indeed, if we come back to our previous example, let’s imagine the user deletes 5GB of data on
the drive. As we explained in the previous section, the TRIM command would not be sent at that
time by the third party software so the data would stay on the drive. Now if we have a controller
that is file system aware, this controller would know these files have been deleted and delete the
data stored at the physical layer. It would not only make this data irrecoverable but it would also
make the third-party software believe that these 5GB of data is still in use on the drive while it is
not. This would lead to software misbehavior and cause potential issues in terms of reliability and
performance. To prevent this, it would be desirable to be able to force the controller not to take
any destructive action on a part of the drive. This could be achieved during the third-party
software initial setup. For instance:
1. The third-party software is installed and records the list of files in use and their size. The
software only starts to protect the drive at installation time; data deleted before the
software was installed is not covered by the protection.
2. The software sends a command to the controller requesting the creation of a secure area
on the drive. This secure area can be defined through the software. In our example, this
area would be the 100GB partition.
3. The controller relies only on the TRIM command (handled by the third-party software) to
decide which pages are no longer in use. This means the controller does not try to detect
unused pages by itself in this secure area.
This solution looks relatively easy to implement. In theory the only thing the controller needs to
do is to keep a table of all the pages mapped to the logical secure zone and not remove any
information from those unless authorized by the TRIM command sent by the software. It doesn’t
prevent the controller from optimizing writes or conducting wear leveling when needed on the
secure area. For instance, if a TRIM command announces that 100 pages of a block are no longer
in use, the controller can transfer the valid data in this block to an empty block and replace
the original block with the new one in the mapping table. The old block can then be erased by the
GC process if needed.
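A toy model of the mapping behavior described above might look like this (the class and method names are ours; a real FTL works at block and page granularity with far more state). The key property is that, inside the secure area, only an explicit TRIM ever releases a mapped page; the controller never guesses from file-system knowledge.

```python
class SecureAreaFTL:
    """Toy flash translation layer for the proposed secure area."""

    def __init__(self):
        self.mapping = {}     # logical page -> physical page
        self.flash = {}       # physical page -> data (stale copies linger)
        self.next_phys = 0

    def write(self, lpage: int, data: bytes) -> None:
        # Out-of-place write, as on a real SSD: the old physical copy is
        # NOT erased here, which is why overwritten data can remain
        # recoverable at the physical layer.
        self.flash[self.next_phys] = data
        self.mapping[lpage] = self.next_phys
        self.next_phys += 1

    def trim(self, lpages) -> None:
        # Only an explicit TRIM authorizes reclaiming pages in the secure
        # area; the controller takes no destructive action on its own.
        for lp in lpages:
            phys = self.mapping.pop(lp, None)
            if phys is not None:
                del self.flash[phys]    # page handed over to the GC
```

This keeps the third-party software's accounting accurate: a page it has not TRIMmed is guaranteed to still hold its data.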
5.2.3 File sanitization
The last issue remaining is related to file sanitization. We have seen it is not possible to securely
sanitize a file on a SSD. This problem comes from the TRIM command specification, which is
purely informational and doesn't force the controller to take any specific action. Here
again, adding a new ATA command could solve the problem by not only sending the information
that the file is no longer in use but also requesting that the file be physically removed
from the flash pages. This command, which we call ETRIM (Enforceable TRIM), would
thus be similar to TRIM with the notable difference that an immediate erasure action is required.
This solution shouldn’t be very difficult to implement in the controller algorithm and could be sent
via a dedicated OS command or a dedicated application. While this would make file sanitization
effective at the logical layer it should be noted that it wouldn’t guarantee that data is completely
erased at the physical layer because of the data duplication we previously discussed. It should, in
theory, be possible to ask the controller to look for similar data on each and every page and to
delete those but it would present some severe drawbacks. First of all it would probably require a
certain amount of resources in order to compare each and every page with the pages which must
be deleted. Secondly it would probably require a significant change in the controller algorithm and
potentially lead to important write amplification if big files are deleted on a regular basis. Finally
it could present some risk, because two similar pages can exist without being part of the same
original file. Since the controller doesn't always know whether a page is still in use, a stale
page may be considered valid by the controller and thus not deleted. This would
result in imperfect file sanitization. Consequently, file sanitization at the physical layer
would be complex, resource consuming and not fully reliable.
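The difference between the two commands can be sketched as follows. ETRIM does not exist in any standard; the function and attribute names below are ours and the model is deliberately minimal.

```python
class Drive:
    """Minimal drive model: a mapping table, flash pages, a stale set."""
    page_size = 4096

    def __init__(self):
        self.mapping = {}     # logical page -> physical page
        self.flash = {}       # physical page -> data
        self.stale = set()    # pages TRIMmed but not yet erased

def trim(drive: Drive, lpages) -> None:
    # Standard TRIM: purely informational; the controller MAY erase the
    # pages later, whenever the GC finds it convenient.
    drive.stale.update(lpages)

def etrim(drive: Drive, lpages) -> None:
    # Proposed ETRIM: erasure is mandatory and immediate, so the data is
    # gone at the logical layer as soon as the command returns.
    for lp in lpages:
        phys = drive.mapping.pop(lp, None)
        if phys is not None:
            drive.flash[phys] = b"\x00" * Drive.page_size
```

Note that, exactly as argued above, ETRIM in this form only clears the mapped copy: any duplicate of the data sitting in other physical pages is untouched, which is why sanitization at the physical layer remains an open problem.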
5.3 Chapter conclusions
In this chapter we have seen that, when properly implemented, drive sanitization can be effective
when using crypto disk erasure on SED. We have also proposed a solution to drastically improve
the drive imaging process in order to increase the amount of recoverable data and cope with the
related legal issues. In regards to recovery of deleted data at the logical layer, we proposed a
software based solution called TRIM On Demand which provides a flexible and powerful solution
potentially allowing SSD to behave better than HDD in terms of deleted data recovery. Finally we
proposed a solution allowing effective file sanitization at the logical layer and explained why we
believe file sanitization at the physical layer is complex and can probably not be effectively
implemented in the current state of the technology.
6 Conclusion
6.1 Assessment of objectives
After having explained in detail how SSD technology works and why its advantages make it very
likely to be widely adopted in the near future, we have shown that this technology is suffering two
main drawbacks, namely the limited cell lifetime and the performance degradation over time. We
have seen that some promising new production techniques could solve the cell lifetime issue but,
at this moment in time, these two problems remain and are inherent to the SSD technology. It
explains why SSD vendors put so much effort into improving their optimization algorithms and we
have demonstrated that a lot of different mechanisms such as garbage collection, wear leveling,
TRIM or overprovisioning are used to circumvent these limitations. We showed that these
solutions have a deep impact on the data organization at the physical and logical layer which
results in an impact on computer forensics. These two drawbacks also explain why the controller
optimization is the real challenge for SSD vendors and why vendors keep all information related
to their optimization algorithms so secret which does not really help in understanding drive
behavior.
After this theoretical part, we came to a more practical approach where we analyzed in depth the
real impact of SSD on computer forensics. We first focused on deleted data recovery at the logical
layer and demonstrated that SSD can, but don't necessarily, have a big impact on it. Our
analysis showed that the impact was influenced by multiple factors such as the drive type, the
TRIM command or the Operating System used. We also demonstrated that in some circumstances
SSD allow the recovery of as much data as a traditional HDD would and showed that SSD
behavior was extremely difficult to predict.
Secondly we explained that it was possible to recover a significant amount of data at the physical
layer even when this data was apparently not available anymore at the logical layer. We
demonstrated one more time that SSD are inherently different from HDD because of the Flash
Translation Layer and the controller which don’t necessarily present the real data stored at the
physical layer to the user, notably when the TRIM command is involved. While data retrieval at
the physical layer presents some challenge on its own, we showed that this technique allowed the
recovery of a significantly higher amount of data compared to the data recovery at the logical
layer. In some circumstances, this analysis at the physical layer could even give to SSD an
advantage in terms of data recovery compared to HDD, notably because of the data duplication
and extra space used for overprovisioning.
Thirdly we analyzed SSD behavior during the forensic imaging process. We showed that the
background optimization tasks could continue to modify the drive content even after the drive is
disconnected from the host computer, which makes forensic drive imaging more challenging
compared to HDD. We demonstrated with our own tests that, at least in some cases, it was
possible to get the drive in a stable state pretty quickly after the drive seizure. When this state is
reached our tests demonstrated that the drive content was not modified anymore, making it
possible to take multiple forensically identical images from the drive and to, at least partially, solve
some potential legal issues. We also saw that a traditional write blocker was not able to reliably
prevent the garbage collection process from happening in the background. However we also
showed that a write blocker was still required during the investigation process and that it could,
in some circumstances, minimize modifications happening on the drive, allowing for the recovery
of more data at the logical layer.
Fourthly, in terms of drive sanitization, we showed that standard ATA erasure commands can be
used to effectively sanitize a drive if these commands are properly implemented at the controller
level. However we demonstrated that a significant percentage of drives don’t properly implement
these functions, the worst case being a drive not removing any data and reporting the sanitization
operation as being successful. This sanitization method should consequently only be considered
for drives that have been carefully tested.
Fifthly, we analyzed the effectiveness of file sanitization at the logical and physical layer based on
two studies. We showed that, contrary to what is claimed in the first study's conclusion, file
sanitization at the logical layer is not effective with standard sanitization tools. The analysis of the
second study clearly demonstrated that file sanitization was not effective at the physical layer
either.
After having analyzed the impact of SSD on computer forensics, we proposed multiple existing or
innovative solutions which could be used to improve the situation in the enterprise, meaning in
an environment where the storage setup can be fully controlled. We first analyzed the benefits of
using drive encryption and came to the conclusion that using Self-Encrypting Drives could be a
good solution in order to achieve drive sanitization via cryptographic disk erasure assuming this
function is properly implemented at the controller level. Secondly, in regards to forensic drive
imaging, we proposed the creation of a new ATA command which could be used to put the drive
into a frozen state which means that no data modification would occur on the drive after the
command had been received. This would make the imaging process a lot easier, optimize the
amount of recoverable data and reduce the risk of legal issues. Thirdly, in order to optimize
deleted data recovery at the logical layer, we proposed a mechanism called “TRIM on Demand”
where the TRIM command is no longer handled by the OS but by third-party software. We
demonstrated that this solution could not only help in increasing the amount of recoverable data
but also in customizing the deleted data retention parameters in order to better suit the user
needs. We showed that this solution should be used in conjunction with the creation of a secure
area on the drive on which the controller would only try to clean-up data if a TRIM command is
received. This method is very flexible, should be relatively easy to implement and would allow SSD
to behave better than HDD in terms of deleted data recovery. Fourthly, in order to solve the file
sanitization issue at the logical level, we proposed the creation of a new ATA command we called
ETRIM. It would be pretty similar to TRIM with the notable difference that this command would
not be informative but enforceable. Immediately after the ETRIM command has been received by
the drive for a given file, the drive will erase that file at the logical layer and return a predefined
pattern if one tries to access it. Finally, we explained why we believe file sanitization at the
physical layer is not easily achievable with the currently existing technology.
6.2 Conclusion
To conclude we have demonstrated that, at this moment in time, SSD present some drawbacks
which are hard to circumvent in terms of computer forensic capability when compared to HDD.
These problems are real and probably explain the fears of the forensic community in regards to
this technology. We believe these issues are the result of a combination of multiple factors such
as the relative novelty of the technology, its complexity, the use of forensic tools which have been
designed to work on HDD but haven’t been adapted to SSD yet and a lack of common standards
dedicated to SSD. However we don’t see any valid reason why these problems couldn’t be solved
in the future.
We believe forensic investigators should look at the long term and think about the potential
benefits of this new technology and especially the powerful storage management system
introduced via the drive controller. While it is currently seen as an enemy, taking control of data
organization on the drive and acting of its own will, the controller could in the end become a
valuable partner in helping enterprises optimize drive behavior according to their needs, as with
the TRIM On Demand method we proposed in this document. For this to happen, forensic
investigators and controller developers need to start working together. It might seem at first that
their goals are too different to converge but, in the end, the forensic investigator works for
the enterprise, which is the customer of the SSD makers. For the latter, proposing solutions to interact with
their drive controllers could in the future be an opportunity to stand out from the competition.
This cooperation between SSD makers and enterprises could start first with the development of
new standards, like the ones we proposed in the last chapter, and end-up later on with complete
custom solutions made to provide enterprises maximum control on their SSD, letting them decide
what is the best balance between maximum performance and optimized data management.
Setting up this new relationship between enterprises and drive makers will probably take some
time, but it is a vital step, not only with regards to SSD but also to be prepared for the
future. Indeed, even if mechanical drives are not dead yet, the new generation is definitely out
there and it seems pretty likely that SSD are the first of a long series of new storage technologies
to come. Computer forensic science will probably have a hard time adapting to these technological
changes but we want to believe it will emerge strengthened from this necessity.
Bibliography
[AH08] Andrew Hoog, Slack Space, oct 2008, ViaForensics blog,
https://viaforensics.com/computer-forensic-ediscovery-glossary/what-is-slack-
space.html
[AS09] Anand Lal Shimpi, The SSD Anthology : Understanding SSDs and New Drives
from OCZ, March 2009, Anandtech
[AS09A] Anand Lal Shimpi, The SSD Relapse: Understanding and Choosing the Best SSD,
Aug 2009, Anandtech
[AK11] Andrew Ku, OCZ’s Vertex 3 Pro : Second-Gen SandForce Perf Preview, Tom’s
hardware, Feb 2011, http://www.tomshardware.co.uk/vertex-3-pro-ssd-
sandforce,review-32120.html
[BG11] Bryan Gillson, http://www.symantec.com/connect/forums/pgp-and-ssd-wear-
leveling, Feb 2011, Symantec forum
[BP07] B. J. Phillips, C. D. Schmidt, D. R. Kelly, Recovering data from USB Flash memory
sticks that have been damaged or electronically erased, December 2007, The
University of Adelaide, Australia
[BR09] Bill Roman, Using the Appropriate Wear Leveling to Extend Product Lifespan,
Flash Memory Summit, Aug 2009
[CK11] Christopher King, Timothy Vidas, Empirical analysis of solid state disk data
retention when used with contemporary operating systems, Digital Investigation,
Vol 8, 2011
[DH11] Dominique A. Heger, SSD Write Performance – IOPS Confusion Due to Poor
Benchmarking Techniques, Aug 2011, DHTechnologies
[DF03] Daniel Feenberg, Can Intelligence Agencies Read Overwritten Data, July 2003, The
National Bureau of Economic Research
[ES08] Esther Spanjer, Security Features for Solid State Drives in Defense Applications,
March 2008, white paper, Adtron Corporation
[ES12] ElcomSoft, SSD Evidence Acquisition and Crypto Containers, Dec 2012,
ElcomSoft Co. Ltd
[FY11] Fkrezgy Yohannes, SOLID STATE DRIVE (SSD) DIGITAL FORENSICS
CONSTRUCTION, 2011, Politecnico di Milano, Master of Science in Computing
System Engineering Dipartimento di Elettronica e Informazione
[GB10] Graeme B. Bell, Richard Boddington, Solid State Drives: The Beginning of the End
for Current Practice in Digital Forensic Recovery? , JDFSL The Journal of Digital
Forensic Security and Law
[HN10] Henry Newman, Why Solid State Drives won’t replace spinning disks, Jul 2010,
http://www.enterprisestorageforum.com/storage-technology/Why-Solid-State-
Drives-Wont-Replace-Spinning-Disk.htm-3894671.htm
[ID12] IT Dog, Write performance speed drops on all SSD, Jan 2012,
http://www.velobit.com/storage-performance-blog/bid/112643/Warning-Bark-
Write-Performance-Speed-Drops-On-All-SSDs
[IT12] InforTrend, Enhancing the Write Performance of Intel SSDs through Over-
Provisioning, 2012, p11
[JL10] Jeffrey Layton, Fixing SSD Performance degradation, Part 1, Oct 2010,
http://www.enterprisestorageforum.com/technology/features/article.php/3910451/
Fixing-SSD-Performance-Degradation-Part-1.htm
[JL10A] Jeffrey Layton, Why flash drive density will stop growing next year, Sep 2010,
http://www.enterprisestorageforum.com/technology/features/article.php/3904146/
Why-Flash-Drive-Density-Will-Stop-Growing-Next-Year.htm
[JL11] Jeffrey Layton, Real-Time Data Compression's Impact on SSD Throughput
Capability, Apr 2012,
http://www.enterprisestorageforum.com/technology/features/article.php/11192_39
30601_3/Real-Time-Data-Compressions-Impact-on--SSD-Throughput-Capability-
.htm
[KH08] Kim Hyojun, Seongjun Ahn, BPLRU: A Buffer Management Scheme for Improving
Random Writes in Flash Storage, Software Laboratory of Samsung Electronics,
Korea, Feb 2008
[KJ11] Jonghwa Kim, Ikjoon Son, Jongmoo Choi, Sungroh Yoon, Sooyong Kang,
Youjip Won, Jaehyuk Cha, Deduplication in SSD for Reducing Write Amplification
Factor, Feb 2011, Dankook University and Hanyang University, Korea
[KJ12] Jonghwa Kim, Choonghyun Lee, Sangyup Lee, Ikjoon Son, Jongmoo Choi, Sungroh
Yoon, Hu-ung Lee, Sooyong Kang, Youjip Won, Jaehyuk Cha, Deduplication in
SSDs: Model and Quantitative Analysis, Mar 2012, Hanyang University and Dankook
University, Korea; Massachusetts Institute of Technology, USA
[KS11] Kent Smith, Garbage Collection: Understanding Foreground vs. Background GC
and Other Related Elements, Flash Memory Summit 2011, Aug 2011
[KS12] Kent Smith, Understanding SSD Over Provisioning, Flash Memory Summit 2012,
Aug 2012
[LD12] Larry Daniel, Lars Daniel, Digital forensics for legal professionals, Syngress, 2012,
p3
[LH12] Lee Hutchinson, The SSD Revolution: Solid-state revolution: in-depth on how
SSDs really work, Jun 2012, Ars Technica, http://arstechnica.com/information-
technology/2012/06/inside-the-ssd-revolution-how-solid-state-disks-really-work/1/
[LX12] Luojie Xiang, Brian M. Kurkoski, Eitan Yaakobi, WOM Codes Reduce Write
Amplification in NAND Flash Memory, Nov 2012, Purdue University West
Lafayette, Japan Advanced Institute of Science and Technology, California Institute
of Technology
[MB07] Marcel Breeuwsma, Martien de Jongh, Coert Klaver, Ronald van der Knijff and
Mark Roeloffs, Forensic data recovery from flash memory, Jun 2007, Small Scale
Digital Device Forensics Journal, Vol. 1, No. 1
[MF09] Michael Freeman, Andrew Woodward, Secure State Deletion: Testing the efficacy
and integrity of secure deletion tools on Solid State Drives, Dec 2009, 7th
Australian Digital Forensics Conference, Edith Cowan University
[MP10] Marc Prieur, SSD, TRIM et IOMeter, Apr 2010,
http://www.hardware.fr/articles/791-2/degradation-trim-theorie.html
[MR13] Matthieu Regnery, Thomas Souvignet, La récupération de données sur SSD : un
défi ? [SSD data recovery: a challenge?], Mar 2013, Multi-System & Internet
Security Cookbook (MISC), p15-24
[MR12] Margaret Rouse, Overprovisioning (SSD overprovisioning), Jan 2012, Search Solid
State Storage, http://searchsolidstatestorage.techtarget.com/definition/over-
provisioning-SSD-overprovisioning
[MR12A] Margaret Rouse, Solid State Storage (SSS) Garbage Collection, Jan 2012, Search
Solid State Storage, http://searchsolidstatestorage.techtarget.com/definition/solid-
state-storage-SSS-garbage-collection
[MR12B] Margaret Rouse, TRIM, Jan 2012, Search Solid State Storage,
http://searchsolidstatestorage.techtarget.com/definition/TRIM
[MS12C] Mike Sheward, Rock Solid: Will digital forensics crack SSDs?, Jan 2012, InfoSec
Institute
[MT08] Micron Technology, TN-29-42: Wear-Leveling Techniques in NAND Flash
Devices, Technical Note, Oct 2008, p3
[MW10] Michael Wei, Laura M. Grupp, Frederick E. Spada, Steven Swanson, Reliably
Erasing Data From Flash-Based Solid State Drives, Dec 2010, Department of
Computer Science and Engineering, University of California, San Diego
[MW12] Michael Willett, Implementing Stored-Data Encryption, Mar 2012, Storage
Networking Industry Association
[NC09] Neal Christiansen, ATA Trim/Delete Notification Support in Windows 7, Storage
Developer Conference 2009
[PD11] Pierre Dandumont, Les disques SSD, la fin des disques durs [SSDs, the end of
hard disk drives], Mar 2011, p5,
http://www.presence-pc.com/tests/ssd-flash-disques-22675/5/
[PG01] Peter Gutmann, Data Remanence in Semiconductor Devices, Aug 2001,
Proceedings of the 10th USENIX Security Symposium
[PG96] Peter Gutmann, Secure Deletion of Data from Magnetic and Solid-State Memory,
July 1996, Sixth USENIX Security Symposium Proceedings
[PH10] Philip, SSD Speed Tweaks, Feb 2010, http://www.speedguide.net/articles/ssd-
speed-tweaks-3319
[RB13] Reverend Bill Blunden, The Rootkit Arsenal, Ed. Jones and Bartlett Learning, 2013,
p286
[RK12] Richard Kissel, Matthew Scholl, Steven Skolochenko, Xing Li, Guidelines for
Media Sanitization, Sep 2012, Draft NIST Special Publication 800-88 Revision 1
[SH09] Scott Holewinski, Gregory Andrzejewski, Data Recovery from Solid State Drives,
Dec 2009, Gillware Inc.
[SH89] Sameer Haddad, Chi Chang, Balaji Swaminathan, Jih Lien, Degradations due to
Hole Trapping in Flash Memory Cells, Mar 1989, IEEE Electron Device Letters,
Vol. 10, No. 3, p117
[SM12] Scott Mattie, SSD – What's the difference between SLC and MLC, May 2012, Scott
Mattie's Blog, http://www.smattie.com/2012/05/25/ssd-whats-the-difference-
between-slc-and-mlc/
[SN08] Scott Nelson, MLC and SLC flash design tradeoffs, Sep 2008, Toshiba America
Electronic Components
[SS10] Steven Swanson, Michael Wei, SAFE: Fast, Verifiable Sanitization for SSDs, Oct
2010, Non-Volatile Systems Laboratory, Computer Science & Engineering,
University of California, San Diego
[SS12] Sean Stead, It Takes Guts to be Great, Flash Memory Summit 2012, Aug 2012
[TA06] Toshiba America Electronic Components INC, NAND vs NOR Flash Memory
Technology Overview, p3, Apr 2006
[TC13] TrueCrypt, http://www.truecrypt.org/docs/wear-leveling, 2013
[TR10] Thomas M. Rent, SSD Reference Guide, Apr 2010,
http://www.storagereview.com/ssd_controller
[WC10] W. Chavis, SSDs – Write Amplification, Trim and GC, Mar 2010, OCZ Technology
whitepapers
[YC12] Yu-Tzu Chiu, Flash Memory Survives 100 Million Cycles, Dec 2012, IEEE
Spectrum, http://spectrum.ieee.org/semiconductors/memory/flash-memory-survives-
100-million-cycles
[ZS11] Zubair Shah, Forensics Potentials of Solid State Drives (SSD), 2011, MS
Computing System Engineering Dipartimento di Elettronica e Informazione
Politecnico di Milano, Milan, Italy
Annex A: Tests details
This Annex provides details of the tests conducted in [4.2.2].
SSD drive used: OCZ Onyx 64GB, firmware 1.6, TRIM support
Hardware write blocker: Guidance Software FastBloc2 FE
Drive imaging software: AccessData® FTK® Imager v3.1.3.2
Operating system: Windows 7 Pro 64-bit, v6.1, build 7601, Service Pack 1, French edition
Processor: AMD Athlon II X4 645
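Because the behaviour of a TRIM-capable SSD depends on whether the operating system actually issues TRIM (delete notification) commands, the state of this setting on the Windows 7 test machine can be checked with the built-in fsutil utility. The commands below are a sketch of how such a test environment could be verified or adjusted (run from an elevated command prompt); they are standard Windows commands, not part of the original test procedure.

```
:: Query whether Windows issues TRIM (delete notifications) to the SSD.
:: DisableDeleteNotify = 0 means TRIM is enabled; 1 means it is disabled.
fsutil behavior query DisableDeleteNotify

:: Disable TRIM for a test run (administrator rights required):
fsutil behavior set DisableDeleteNotify 1

:: Re-enable TRIM afterwards:
fsutil behavior set DisableDeleteNotify 0
```

Note that this setting only controls whether the OS sends TRIM commands; garbage collection inside the drive's controller remains outside the host's control.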