![Page 1: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/1.jpg)
Fast and Reliable Stream Storage through Differential Data JournalingAndromachi Hatzieleftheriou
MSc ThesisSupervisor: Stergios Anastasiadis
![Page 2: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/2.jpg)
2 of 40
Thesis Motivation
• We study the real-time storage of massive stream data▫ real-time or retrospective processing▫ e.g. monitoring applications
continuous data received from sensors in real-time video and audio streams of high quality at high rates environmental measurements at much lower rates
• Traditional file and database systems insufficient ▫ excessive resource requirements in case of high-volume streaming traffic▫ need for system facilities for the storage of heterogeneous streams
different rate and content characteristics
• General-purpose file systems use journaling to synchronously move data or metadata from memory to disk with sequential throughput▫ data journaling high disk overhead
![Page 3: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/3.jpg)
3 of 40
• Data journaling should be enabled with random writes but disabled with large sequential writes
• Need to efficiently and reliably store multiple concurrent streams▫ individual stream appends perfectly sequential▫ aggregate workload random-access▫ unclear what is the most appropriate way to handle the incoming data
• We examine the possibility of employing data journaling techniques▫ combine sequential throughput with low latency during synchronous writes
• We introduce differential data journaling in order to minimize the cost of data journaling▫ only the actually modified bytes are logged, not the entire corresponding blocks
Thesis Motivation
![Page 4: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/4.jpg)
4 of 40
Outline
• Related Work
• Ext3
• Architectural Definition
• Prototype Implementation
• Performance Evaluation
• Conclusions & Future Work
![Page 5: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/5.jpg)
5 of 40
Fast and Reliable Storage
• File system operations can be:▫ data operations that update user data▫ metadata operations that modify the structure of the file system
• Several techniques have been proposed to achieve high performance during data and metadata updates
• Operating systems susceptible to hardware and power failures that damage their efficiency and reliability▫ special utility needed during reboot to recover the file system▫ the system remains offline while the disk is scanned and repaired
![Page 6: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/6.jpg)
6 of 40
Synchronous Writes & Soft Updates
• Synchronous writes ▫ pending writes must complete before the next ones can be submitted▫ significant performance loss
• Soft updates ▫ ordering between metadata writes▫ list of metadata dependencies per disk block▫ after a crash
system mounted and used immediately remaining inconsistencies corrected in the background
![Page 7: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/7.jpg)
7 of 40
Log-Structured File Systems
• Data and metadata updates ▫ initially buffered in the cache▫ then written sequentially to a continuous stream
• Main features ▫ disk treated as a segmented append-only log▫ indexing information needed for efficient read▫ costly seeks are avoided maximized disk
write throughput
• After a crash▫ the system reconstructs its state from the last
consistent point in the log
• Log space needs to be constantly reclaimed ▫ garbage collection
![Page 8: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/8.jpg)
8 of 40
Journaling File Systems
• Metadata updates written to a circular append-only journal before committed to the main file system▫ batching opportunities▫ synchronous writes complete faster
sequential throughput
• Logging of data modifications also supported▫ performance improvement for synchronous
writes▫ significant journal throughput
full blocks logged even with small writes instead of the modified parts
• After a crash ▫ replay the last updates from the journal
![Page 9: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/9.jpg)
9 of 40
Outline
• Related Work
• Ext3
• Architectural Definition
• Prototype Implementation
• Performance Evaluation
• Conclusions & Future Work
![Page 10: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/10.jpg)
10 of 40
General Features
• Each high-level change to the file system is performed in two steps:1. the modified blocks are copied into the
journal2. the modified blocks are sent to their final
disk location
• Journal features:▫ treated as a circular buffer▫ file within the same file system or separate
disk partition
![Page 11: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/11.jpg)
11 of 40
• Ordered Mode
▫ only metadata logged
▫ data writes forced to the fixed location right before metadata is written to the journal
▫ strong consistency semantics
• Data Mode
▫ both data and metadata logged
▫ data blocks written twice
▫ strong consistency semantics
Journal(Metadata)
Commit
Checkpoint
Final Location(Metadata)
Final Location(Data)
Final Location(Data)
Final Location(Data)
sync
time
• Writeback Mode
▫ only metadata logged
▫ data blocks written directly to final location
▫ no ordering between the journal and fixed-location data writes
▫ weak consistency guarantees
Commit
Checkpoint
Journal(Metadata)
Final Location(Metadata)
Final Location(Data)
sync
sync
time
Journaling Modes
Commit
Checkpoint
Journal(Data+Metadata)
Final Location(Data +Metadata)
sync
time
![Page 12: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/12.jpg)
12 of 40
Journal Structure
• Journal superblock ▫ tracks summary information for the
journal
• Journal descriptor block▫ marks the beginning of a transaction▫ describes the subsequent journaled
blocks
• Journal data and metadata blocks
• Journal commit block ▫ written at the end of a transaction▫ marks that data and metadata are safe
on disk
![Page 13: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/13.jpg)
13 of 40
Kernel Buffers
• Page cache ▫ keeps page copies from recently
accessed disk files in memory
• Block buffer ▫ in-memory buffer of each disk block▫ allocated in units called buffer pages
• Buffer head descriptor ▫ specifies all the handling information
required by the kernel to locate the corresponding block on disk
![Page 14: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/14.jpg)
14 of 40
Flushing Dirty Buffers to Disk
• Goal: Dirty pages that accumulate in memory need to be written to disk
• pdflush kernel threads▫ systematically scan the page cache for dirty pages to flush every writeback period
▫ ensure that no page remains dirty for too long more than expiration period
• kjournald kernel thread▫ commits the current state of the file system every commit interval period of time
▫ flushes the dirty buffers of the committed transactions to their final location checkpoint process
• fsync system call ▫ forces all data and metadata dirty buffers of a specified file descriptor to disk
![Page 15: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/15.jpg)
15 of 40
Commit Policy
• Process of writting to journal the dirty buffers modified by a transaction
• Commit is initiated when:▫ the commit interval expires▫ write updates need to be synchronously
written to disk
• For each journal block buffer:▫ a buffer head specifies the respective block number in the journal
points to the original copy of the block buffer
▫ a journal head points to the corresponding transaction
![Page 16: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/16.jpg)
16 of 40
Commit Process
• A journal descriptor block is allocated▫ contains tags that map block buffers to their
final location
• When it fills up1. it is written to the journal2. the corresponding block buffers follow3. a journal commit block is synchronously
written to the journal
• Additional journal descriptor blocks can be allocated
![Page 17: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/17.jpg)
17 of 40
Recovery Policy
• Recovery process ▫ automatically started after an unclean shutdown ▫ scans the log for complete transactions that need to be replayed
• Three phases needed:▫ PASS_SCAN scans the end of the journal
▫ PASS_REVOKE is used to prevent older journal records from being replayed on top of newer data using the same block
▫ PASS_REPLAY writes to their final disk location the newest versions of all the blocks that need to be replayed
• The system can crash before the recovery finishes ▫ the same journal can be reused
![Page 18: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/18.jpg)
18 of 40
Outline
• Related Work
• Ext3
• Architectural Definition
• Prototype Implementation
• Performance Evaluation
• Conclusions & Future Work
![Page 19: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/19.jpg)
19 of 40
Design Goals
• We investigate the performance characteristics of data journaling in the context of synchronous writes
• Data journaling features:▫ Synchronous writes complete faster
take advantage of the sequential journal throughput ▫ Significant amount of traffic sent to the journal
high journal device throughput▫ Traffic changes sublinearly as a function of the write rate▫ Substantial overhead even with small write requests
due to the full-block logging scheme
• Proposal: a new journaling mode ▫ accumulation of multiple write modifications in a
single journal block
0.12
5
0.50
0
1.00
0
2.00
0
4.00
0
8.00
0
16.0
00
32.0
00
64.0
00
128.
000
1
10
100
1000
10000
Requirements
Data Journaling
Writeback Journaling
Ordered Journaling
Request Size (KB)
Tot
al J
ourn
al T
raff
ic (
MB
)
![Page 20: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/20.jpg)
20 of 40
Design Goals
• Partial Block▫ new journal block type▫ accumulates the modifications from multiple writes
• Commit Policy▫ only the modified part of individual data blocks should be journaled▫ for fully modified data or metadata blocks entire blocks can be logged
• Recovery Policy▫ whole blocks read from the journal and written back to their final location▫ for partially modified blocks
the original disk block should be first read from the final location then written back updated with the difference retrieved from the journal
![Page 21: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/21.jpg)
21 of 40
Outline
• Related Work
• Ext3
• Architectural Definition
• Prototype Implementation
• Performance Evaluation
• Conclusions & Future Work
![Page 22: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/22.jpg)
22 of 40
Partial Blocks
• Used to gather the partial updates of data blocks
• Two different types of journal blocks used:▫ partial that store multiple data writes smaller than the default block size▫ non-partial that correspond to metadata or fully written data buffers
![Page 23: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/23.jpg)
23 of 40
Journal Heads & Tags
• Journal heads ▫ we use them to prepare the blocks that are actually sent to the
journal
▫ we added two new fields for partial modifications offset and length of the partially modified block
• Tags▫ allocated during commit per block buffer
▫ contain the following fields: final disk location of the modified block
four flags for journal-specific block properties
a flag indicating whether the corresponding block is partially
modified or not (added)
length of the new bytes (added)
starting offset in the data block of the final disk location (added)
![Page 24: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/24.jpg)
24 of 40
Commit Process
• A descriptor and a partial data block allocated
• Partially modified data blocks ▫ modifications copied consecutively in the partial data block
• Metadata or fully written data blocks▫ the corresponding full blocks are logged
• When the descriptor fills up1. it is written to the journal2. all the corresponding block buffers follow3. a journal commit block is written to the journal
![Page 25: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/25.jpg)
25 of 40
Recovery Policy
• Data modifications retrieved from the journal and applied to the final blocks
• From each retrieved journal descriptor block the included tags are extracted▫ describe partial or full write or metadata modifications
• Non-partial modification▫ next block retrieved from the journal and written to the final location
• Partial modification▫ next block retrieved from the journal partial data block
▫ the original disk block is read into a kernel buffer
▫ the modification is copied from the journal block buffer to the proper final buffer starting offset and length tag fields used
▫ when the end of the current partial block is exceeded the next one is retrieved from the journal
![Page 26: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/26.jpg)
26 of 40
Outline
• Related Work
• Ext3
• Architectural Definition
• Prototype Implementation
• Performance Evaluation
• Conclusions & Future Work
![Page 27: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/27.jpg)
27 of 40
Performance Measurements
• We examine the disk throughput requirements and the average latency of each write under streaming workloads
• We measure performance in an environment of temporary small files▫ investigate the benefit of data journaling in applications other than streaming
• We examine the possible overheads of our implementation▫ recovery time▫ CPU load
![Page 28: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/28.jpg)
28 of 40
Streaming Workloads
• Massive numbers of streams synchronously written on the same disk facility
• We examine the performance characteristics of streams with different rates, while varying the degree of concurrency▫ data rate: the amount of data that is stored per unit of time
• At each execution:▫ a sequence of write updates synchronously applied to the system for a specified
amount of time▫ according to the rate different record sizes are used
low rates small request sizes high rates large request sizes
![Page 29: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/29.jpg)
29 of 40
Flushing Policy
• Manual tuning of the dirty page flush timers according to the rate and the number of the streams
• Low-rate streams▫ accumulation of multiple write updates in memory for a long period
batching opportunities
▫ long expiration interval and frequent writeback period avoid filling up the journal and the memory
• High-rate streams▫ high volumes of data fill up the journal and the memory rather soon▫ default or slightly reduced expiration and writeback periods
according to the generated amount of data
![Page 30: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/30.jpg)
30 of 40
Journal Device Throughput
• Low-rate streams▫ low for the writeback and ordered modes ▫ much higher for the default data journaling▫ comparable to metadata-only modes for the differential data journaling
• High-rate streams▫ significant high for both data journaling modes
1001000
30005000
70000
5
10
15
20
25
30
1 Kbps/stream
Number of Streams
Jour
nal T
hrou
ghpu
t (M
B/s
)
50 500 1000 15000
5
10
15
20
25
30
10 Kbps/stream
Number of Streams
Jour
nal T
hrou
ghpu
t (M
B/s
)
10 25 50 75 1000
5
10
15
20
25
30
1 Mbps/stream
Data Journal-ing
Diff Data Journaling
Writeback Journaling
Ordered Journaling
Number of Streams
Jour
nal T
hrou
ghpu
t (M
B/s
)
![Page 31: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/31.jpg)
31 of 40
Final Location Throughput
• Low-rate streams▫ low for both data journaling modes ▫ several factors higher for metadata-only journaling modes
• High-rate streams▫ comparable to all the four modes
1001000
30005000
70000
2
4
6
8
10
12
14
1 Kbps/stream
Number of Streams
File
Sys
tem
Thr
ough
put
(MB
/s)
50 500 1000 15000
2
4
6
8
10
12
14
10 Kbps/stream
Number of Streams
File
Sys
tem
Thr
ough
put
(MB
/s)
10 25 50 75 1000
2
4
6
8
10
12
14
1 Mbps/stream
Data Journal-ing
Diff Data Journaling
Writeback Journaling
Ordered Journaling
Number of Streams
File
Sys
tem
Thr
ough
put
(MB
/s)
![Page 32: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/32.jpg)
32 of 40
• Much higher for the metadata-only journaling modes compared to the data journaling modes
• Data journaling benefits from the journal’s sequential throughput ▫ fast and reliable storage opportunities
Write Response Time
100
1000
3000
5000
7000
10
100
1000
10000
100000
1 Kbps/stream
Number of Streams
Wri
te L
aten
cy (
ms)
50 500 1000 15001
10
100
1000
10000
100000
10 Kbps/stream
Number of Streams
Wri
te L
aten
cy (
ms)
10 25 50 75 1001
10
100
1000
10000
100000
1 Mbps/stream
Data Journaling
Diff Data Journaling
Writeback Journaling
Ordered Journaling
Number of Streams
Wri
te L
aten
cy (
ms)
![Page 33: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/33.jpg)
33 of 40
CPU Utilization
• Expected CPU overhead for differential data journaling▫ memory copy of the modified parts to the
appropriate journal partial block
• For both high and low rate streams▫ CPU load less than 10%▫ mostly idle
• Insignificant extra CPU cost for differential data journaling
![Page 34: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/34.jpg)
34 of 40
Postmark Benchmark
• Postmark used to study the performance of small writes▫ typical for electronic mail, newsgroups and
web-based commerce
• Significant improvement of the supported transaction rate for data journaling modes▫ low write latency more transactions served
per second128 1024 4096 16384
0
50
100
150
200
250
300
350
400
450
500
Postmark
Data Journal-ingDiff Data Journaling
Writeback Journaling
Ordered Journaling
Request Size (Bytes)
Tra
nsac
tion
s/s
![Page 35: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/35.jpg)
35 of 40
Recovery Time
• Scan phase▫ high latency for default data journaling▫ low latency for metadata-only modes▫ latency of differential data journaling
comparable to metadata-only modes
• Revoke phase▫ equal to all the four modes
• Replay phase▫ comparable latency for the two data journaling
modes despite the extra block reads of differential data
journaling
▫ much lower latency for metadata-only modes
![Page 36: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/36.jpg)
36 of 40
Experimental Results
• Streaming workloads▫ differential data journaling reduces substantially the journal traffic of data
journaling especially for low-rate streams
▫ significant reduction of the write latency for the data journaling modes with respect to metadata-only journaling
• Typical small-write workload▫ substantial improvement in the supported transaction rate
![Page 37: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/37.jpg)
37 of 40
Outline
• Related Work
• Ext3
• Architectural Definition
• Prototype Implementation
• Performance Evaluation
• Conclusions & Future Work
![Page 38: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/38.jpg)
38 of 40
Conclusions
• Emerging need for the real-time storage of massive stream data▫ fresh look at file systems that support data journaling
• New journaling mode: differential data journaling▫ accumulation of multiple updates into a single journal block▫ fast and reliable storage at relatively low disk throughput requirements
![Page 39: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/39.jpg)
39 of 40
Future Work
• Many directions for future work, mainly regarding the performance evaluation of our implementation
▫ Investigation of the automatic tuning of system parameters related to the timing of dirty page flushes
▫ Direct comparison with a log-structured or other journaling file systems in order to demonstrate the benefits of our architecture
▫ Further examination of the performance of differential data journaling under heterogeneous workloads
▫ Examination of the behavior of differential data journaling under some database workload
▫ Experimentation in a real streaming environment
![Page 40: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/40.jpg)
40 of 40
Thank you..!
![Page 41: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/41.jpg)
41 of 40
References
[1] P. J. Desnoyers and P. Shenoy, Hyperion: High Volume Stream Archival for Retrospective Querying, USENIX Annual Technical Conference, June 2007.
[2] Prabhakaran et al., Analysis and Evolution of Journaling File Systems, USENIX Annual Technical Conference, April 2005, pp. 105-120.
[3] S. Tweedie, Journaling the Linux Ext2fs Filesystem, Fourth Annual Linux Expo, Durham, North Carolina, May 1998.
[4] D. Carney et al., Monitoring Streams – A New Class of Data Management Applications, VLDB Conference, August 2002, pp. 215-226.
![Page 42: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/42.jpg)
42 of 40
Journaling Objects
• Log record▫ corresponds to a low-level operation that updates a disk block ▫ represented as full blocks
• Atomic operation handle▫ corresponds to a high-level operation
multiple low-level operations
▫ during recovery either the whole high-level operation is applied or none of its low-level operations
• Transaction ▫ consists of multiple atomic operation handles
![Page 43: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/43.jpg)
43 of 40
Stream Archival Servers
• Design can be based on two possible architectures:
▫ a relational database not designed for rapid and continuous loading of individual data items ill-equipped to handle numerous continuous queries over data streams insufficient for real-time requirements
▫ a conventional file system mainly care to maintain their integrity across crashes without compromising
performance should not compromise the playback performance should exploit the particular I/O characteristics of individual streams e.g. StreamFS used for the storage of high-volume streams
![Page 44: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/44.jpg)
44 of 40
Checkpoint Policy
• Limited amount of journal space that needs to be reclaimed
• Process of ensuring that a section of the log is committed fully to disk, so that that portion of the log can be reused
• Checkpoint occurs when:▫ there is not enough journal space left
free space is between 1/4 and 1/2 of the journal size
▫ the journal is being flushed to disk
![Page 45: Fast and Reliable Stream Storage through Differential Data Journaling Andromachi Hatzieleftheriou MSc Thesis Supervisor: Stergios Anastasiadis](https://reader037.vdocuments.us/reader037/viewer/2022110207/56649d1f5503460f949f31e6/html5/thumbnails/45.jpg)
45 of 40
Enabling/ Disabling Disk Write Cache
• Synchronous write operations return as soon as the data reaches the on-disk write cache rather than the storage media
• Disabling the write cache scales down the performance of the different modes
• Significant advantage of data journaling with respect to the ordered mode▫ small writes