![Page 1: A PLFS Plugin for HDF5 for Improved I/O Performance and Analysis](https://reader036.vdocuments.us/reader036/viewer/2022081520/56816034550346895dcf5458/html5/thumbnails/1.jpg)
A PLFS Plugin for HDF5 for Improved I/O Performance and Analysis
Kshitij Mehta¹, John Bent², Aaron Torres³, Gary Grider³, Edgar Gabriel¹
¹ University of Houston, Texas
² EMC Corp.
³ Los Alamos National Lab
DISCS 2012
Talk Outline
● Background
  – HDF5
  – PLFS
● Plugin
  – Goals and Design
● Semantic Analysis
● Experiments and Results
● Conclusion
HDF5 – An Overview
● Hierarchical Data Format
● Data model, file format, and API
● Tool for managing complex data
● Widely used in industry and academia
● User specifies data objects and the logical relationships between them
● HDF5 maintains data structures, memory management, metadata creation, file I/O
HDF5 – An Overview (II)
● Parallel HDF5 (PHDF5)
  – Built with an MPI library
  – File create, dataset create, group create, etc. are collective calls
● User can select POSIX I/O, or parallel I/O using MPI-IO (individual/collective)
● Files are portable between sequential HDF5 and PHDF5 access
HDF5 – An Overview (III)
[Figure: an HDF5 File containing two Groups that hold datasets D1, D2 and D3; the PEs write to a single .h5 file laid out as Metadata, D1, D2, D3]
HDF5 – An Overview (IV)
● File is a top level object, collection of objects
● Dataset is a multi-dimensional array
  – Dataspace
    ● Number of dimensions
    ● Size of each dimension
  – Datatype
    ● Native (int, float, etc.)
    ● Compound (~struct)
● Group is a collection of objects (groups, datasets, attributes)
● Attributes used to annotate user data
● Hyperslab selection
  – Specify offset, stride in the dataspace
  – e.g. write selected hyperslab from matrix in memory to selected hyperslab in dataset in file
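The hyperslab idea can be sketched in plain Python, without the HDF5 library. The helper names `hyperslab_indices` and `copy_hyperslab`, and the parameters `start`, `stride` and `count`, are illustrative assumptions that only mirror the offset/stride vocabulary of the slide; block size is assumed to be 1.

```python
from itertools import product

def hyperslab_indices(start, stride, count):
    """Coordinates selected by a regular hyperslab (block size assumed to be 1)."""
    axes = [
        [s + i * st for i in range(c)]
        for s, st, c in zip(start, stride, count)
    ]
    return list(product(*axes))

def copy_hyperslab(src, src_sel, dst, dst_sel):
    """Copy the selected 2-D elements of src into the selected elements of dst."""
    for (sr, sc), (dr, dc) in zip(src_sel, dst_sel):
        dst[dr][dc] = src[sr][sc]

# 4x4 matrix in memory and a 6x6 "dataset in file", both as nested lists.
memory = [[r * 10 + c for c in range(4)] for r in range(4)]
dataset = [[0] * 6 for _ in range(6)]

# Every second element of the memory matrix ...
mem_sel = hyperslab_indices(start=(0, 0), stride=(2, 2), count=(2, 2))
# ... written to a contiguous 2x2 region of the dataset starting at (1, 1).
file_sel = hyperslab_indices(start=(1, 1), stride=(1, 1), count=(2, 2))
copy_hyperslab(memory, mem_sel, dataset, file_sel)
```

Both selections must describe the same number of elements; only their shape and position differ between memory and file.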
HDF5 Virtual Object Layer (VOL)
● Recently introduced by the HDF group
● New abstraction layer, intercepts API calls
● Forwards calls to object plugin
● Allows third party plugin development
● Data can be stored in any format
  – netCDF, HDF4 etc.
[Figure: VOL stack with the labels Public API, Object Plugin, .h5, netCDF]
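The forwarding idea behind such a layer can be sketched as follows. This is not the HDF5 VOL interface, which is a C API; the classes `PublicAPI`, `FlatPlugin` and `NestedPlugin` and all their methods are hypothetical names chosen for illustration.

```python
class FlatPlugin:
    """Stand-in for a single-file format: one flat store keyed by full path."""
    def __init__(self):
        self.store = {}

    def dataset_create(self, path, data):
        self.store[path] = list(data)

    def dataset_read(self, path):
        return self.store[path]

class NestedPlugin:
    """Stand-in for an alternative format: nested dicts, one level per group."""
    def __init__(self):
        self.root = {}

    def dataset_create(self, path, data):
        *groups, name = path.strip("/").split("/")
        node = self.root
        for group in groups:
            node = node.setdefault(group, {})
        node[name] = list(data)

    def dataset_read(self, path):
        node = self.root
        for part in path.strip("/").split("/"):
            node = node[part]
        return node

class PublicAPI:
    """The application only sees this class; calls are forwarded to the plugin."""
    def __init__(self, plugin):
        self._plugin = plugin

    def create_dataset(self, path, data):
        self._plugin.dataset_create(path, data)

    def read_dataset(self, path):
        return self._plugin.dataset_read(path)

def application(api):
    # Application code is identical regardless of the storage format.
    api.create_dataset("/group1/D1", [1, 2, 3])
    return api.read_dataset("/group1/D1")

flat_result = application(PublicAPI(FlatPlugin()))
nested_result = application(PublicAPI(NestedPlugin()))
```

The application function never changes; only the plugin passed to the public layer decides how the data is laid out.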
Opportunities in HDF5
• Preserve semantic information about HDF5 objects
  • Single .h5 file is a black box
  • Allows performing post-processing on individual HDF5 objects
• Improve I/O performance on certain file systems
  • N-1 access often results in sub-optimal I/O performance on file systems like Lustre
PLFS
• Parallel Log Structured File System developed at LANL, CMU, EMC
• Middleware positioned between application and underlying file system
• Transforms N-1 access pattern into N-N
  • Processes write to separate files, sufficient metadata maintained to re-create the original shared file
• Demonstrated benefits on many parallel file systems
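The N-1 to N-N transformation can be illustrated with a small in-memory model. This is a conceptual sketch and not the PLFS implementation or API: the class `LogStructuredFile`, its methods and the index tuple layout are assumptions made for the example.

```python
class LogStructuredFile:
    """One logical shared file backed by a separate append-only log per writer."""

    def __init__(self):
        self.logs = {}    # writer id -> bytearray holding that writer's data
        self.index = []   # (logical_offset, length, writer, physical_offset)

    def write(self, writer, logical_offset, data):
        log = self.logs.setdefault(writer, bytearray())
        # Record where the bytes belong logically and where they land physically.
        self.index.append((logical_offset, len(data), writer, len(log)))
        log.extend(data)  # always an append, never a seek into shared space

    def read_all(self):
        """Re-create the logical shared file from the logs and the index."""
        size = max((off + n for off, n, _, _ in self.index), default=0)
        out = bytearray(size)
        for off, n, writer, phys in self.index:  # later writes win on overlap
            out[off:off + n] = self.logs[writer][phys:phys + n]
        return bytes(out)

f = LogStructuredFile()
# Two writers interleave 4-byte blocks in one logical file (an N-1 pattern).
f.write(0, 0, b"AAAA")
f.write(1, 4, b"BBBB")
f.write(0, 8, b"CCCC")
f.write(1, 12, b"DDDD")
shared = f.read_all()
```

Each writer's log stays contiguous (`AAAACCCC` and `BBBBDDDD`), while the index is the metadata that lets a reader rebuild the interleaved view.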
Goals of the new plugin
• Store data in a new format, different from the native single file format
  • Preserves semantic information
  • Perform additional analysis and optimizations
• Use PLFS to read/write data objects
  • Tackles performance problem due to N-1 access
Plugin Design
• Implementation for various object functions
• Provide a raw mapping of HDF5 objects to the underlying file system
  • HDF5 file, groups stored as directories
  • Datasets as PLFS files
  • Attributes as PLFS files stored as dataset_name.attr_name
• Use PLFS API calls in the plugin
• PLFS Xattrs store dataset metadata (datatype, dataspace, ...)
  • Xattrs provide key-value type access
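The object-to-path mapping described above might look roughly like this. The sketch uses ordinary files in a temporary directory instead of PLFS files, and a plain dictionary named `xattrs` as a stand-in for PLFS Xattrs; the function names, the file name `ocean.h5` and the metadata keys are all assumptions for illustration.

```python
import os
import tempfile

xattrs = {}  # stand-in for PLFS Xattrs: dataset file path -> {key: value}

def create_dataset(root, dataset_path, raw, datatype, shape):
    """Store a dataset as a file; its groups become directories under root."""
    *groups, name = dataset_path.strip("/").split("/")
    directory = os.path.join(root, *groups)
    os.makedirs(directory, exist_ok=True)
    file_path = os.path.join(directory, name)
    with open(file_path, "wb") as fh:
        fh.write(raw)
    # Dataset metadata goes into key-value storage, not into the data file.
    xattrs[file_path] = {"datatype": datatype, "dataspace": shape}
    return file_path

def create_attribute(dataset_file, attr_name, raw):
    """Store an attribute as a sibling file named dataset_name.attr_name."""
    attr_path = f"{dataset_file}.{attr_name}"
    with open(attr_path, "wb") as fh:
        fh.write(raw)
    return attr_path

with tempfile.TemporaryDirectory() as tmp:
    root = os.path.join(tmp, "ocean.h5")   # the HDF5 "file" is a directory
    os.makedirs(root)
    d1 = create_dataset(root, "/waves/D1", bytes(16), "float32", (2, 2))
    create_attribute(d1, "units", b"m")
    layout = sorted(
        os.path.relpath(os.path.join(dirpath, name), tmp)
        for dirpath, _, files in os.walk(root)
        for name in files
    )
    d1_meta = xattrs[d1]
```

After the run, `layout` lists the dataset file and its attribute file under `ocean.h5/waves/`, so a relative path alone identifies each object and its parent group.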
PLFS Plugin
● Relative path describes relationship between objects
● User still sees the same API
[Figure: the HDF5 hierarchy (a File with two Groups holding D1, D2 and D3) mapped to a directory tree]
File/
  Group/
    D1
    D2
  Group/
    D3
Semantic Analysis (I)
• Active Analysis
  • Application can provide a data parser function
  • PLFS applies function on the streaming data
  • Function outputs key-value pairs which can be embedded in extensible metadata
  • e.g. recording the height of the largest wave in ocean data within each physical file
  • Quick retrieval of the largest wave, since only need to search extensible metadata
  • Extensible metadata can be stored on burst buffers for faster retrieval
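A minimal sketch of active analysis, assuming the data is a stream of little-endian float64 wave heights. The `Shard` class, the `max_wave_parser` function and the `max_wave` key are hypothetical; the slides do not specify the parser interface.

```python
import struct

def max_wave_parser(chunk):
    """Example parser: returns key-value pairs describing one chunk of data."""
    heights = struct.unpack(f"<{len(chunk) // 8}d", chunk)
    return {"max_wave": max(heights)}

class Shard:
    """One physical file plus the extensible metadata derived from its data."""

    def __init__(self, parser):
        self.data = bytearray()
        self.metadata = {}
        self._parser = parser

    def write(self, chunk):
        self.data.extend(chunk)
        # The parser runs on the data as it streams past; for this key the
        # metadata keeps a running maximum across chunks.
        for key, value in self._parser(chunk).items():
            self.metadata[key] = max(value, self.metadata.get(key, value))

def pack(values):
    return struct.pack(f"<{len(values)}d", *values)

shards = [Shard(max_wave_parser) for _ in range(2)]
shards[0].write(pack([1.5, 3.0]))
shards[0].write(pack([2.0]))
shards[1].write(pack([4.5, 0.5]))

# The query reads only the small metadata records, never the data itself.
largest = max(shard.metadata["max_wave"] for shard in shards)
```

The lookup cost depends on the number of shards rather than on the volume of data, which is why keeping this metadata on a fast tier pays off.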
Active Analysis (II)
[Figure: active analysis data path, with the labels PE, data, PLFS, Parser, Parser Output, FS, Burst Buffer]
Semantic Analysis (II)
• Semantic Restructuring
  • Allows re-organizing data into a new set of PLFS shards
  • e.g. assume ocean model stored row-wise
    • Column-wise access expensive
    • Analysis routine can ask for “column-wise re-ordering”
    • PLFS knows what it means, since it knows the structure
  • Avoids application having to restructure data by calculating a huge list of logical offsets
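The cost of column-wise access on row-major data, and what a re-ordering buys, can be shown with a small worked example. The function names and the 3 x 4 array are illustrative assumptions, not part of PLFS.

```python
def column_offsets(rows, cols, col, itemsize):
    """Byte offsets an application must compute to read one column
    of a row-major rows x cols array: one scattered offset per row."""
    return [(r * cols + col) * itemsize for r in range(rows)]

def restructure_column_major(flat, rows, cols):
    """Re-order a row-major flat list so that each column is contiguous."""
    return [flat[r * cols + c] for c in range(cols) for r in range(rows)]

rows, cols = 3, 4
row_major = list(range(rows * cols))   # element value == row-major position

# Without restructuring: a strided offset list that grows with the row count.
offsets = column_offsets(rows, cols, col=1, itemsize=8)

# With restructuring: column 1 is a single contiguous slice.
col_major = restructure_column_major(row_major, rows, cols)
column_1 = col_major[1 * rows:2 * rows]
```

For a real model with millions of rows, the offset list in the first approach has millions of entries, while the restructured layout needs a single offset and length per column.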
Semantic Restructuring (II)
[Figure: HDF5 Datasets passed through a Restructure step, with the example request “Re-order wave lengths recorded in October 2012 in column-major (Hour x Day)”]
Experiments and Results
● Lustre FS, 12 OSTs, 1M stripe size
● HDF5 performance tool “h5perf”
● Multiple processes write data to multiple datasets in a file
● Bandwidth values presented are average of 3 runs
● 1,2,4,8,32,64 PEs
  – 4 PEs/node max
● 10 datasets, minimum total data size 64G
● Comparing MPI-IO Lustre, Plugin, AD_PLFS (PLFS MPI-IO driver)
● Individual I/O (non-collective) tests
Write Contiguous
• Aligned transfer size of 1M
• For almost all cases, plugin better than MPI-IO, AD_PLFS shows best performance
Write Interleaved
• Unaligned transfer size of (1M + 10 bytes)
• Plugin performance > MPI-IO
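Why alignment matters on a striped file system can be sketched with simple arithmetic. The example assumes round-robin striping of a single shared file across all 12 OSTs and reads "1M" as 1 MiB; the stripe count used in the experiments is not stated in the slides, and the function name `osts_touched` is hypothetical.

```python
MiB = 1024 * 1024

def osts_touched(offset, length, stripe_size=MiB, ost_count=12):
    """OST indices hit by one write, assuming round-robin striping
    of a single file across ost_count OSTs."""
    first_stripe = offset // stripe_size
    last_stripe = (offset + length - 1) // stripe_size
    return [stripe % ost_count for stripe in range(first_stripe, last_stripe + 1)]

# Aligned 1 MiB transfers: each write falls entirely inside one stripe.
aligned = [osts_touched(i * MiB, MiB) for i in range(4)]

# Unaligned (1 MiB + 10 bytes) transfers: every write straddles a stripe boundary.
unaligned_size = MiB + 10
unaligned = [osts_touched(i * unaligned_size, unaligned_size) for i in range(4)]
```

In the unaligned case every write involves two OSTs and neighbouring writers share a stripe, which is one plausible source of contention when many processes write to one shared file.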
Read Performance
• Contiguous reads ( 1M ) and Interleaved reads ( 1M+10 bytes )
• Similar trend as in writes
• MPI-IO < Plugin < AD_PLFS
Conclusion
● New plugin for HDF5 developed using PLFS API
● New output format allows for Semantic Analysis
● Using PLFS improves I/O performance
● Tests show plugin performs better than MPI-IO in most cases, AD_PLFS shows best performance
● Future Work: Use AD_PLFS API calls in the plugin instead of native PLFS API calls, provide collective I/O in the plugin
Thank You
Acknowledgements:
• Quincey Koziol, Mohamad Chaarawi – HDF group
• University of Dresden for access to Lustre FS
Questions
• Why not use AD_PLFS on default .h5 file?
  • Changing output format allows for semantic analysis
  • Provides a more object-based storage (DOE fast forward proposal – EMC, Intel, HDF working towards an object stack)