Update on HDF5 1.8

The HDF Group, HDF and HDF-EOS Workshop X, November 28, 2006

DESCRIPTION

This presentation targets HDF5 application developers and anyone who is interested in the new HDF5 Library features. The following new features available in 1.8.0 will be discussed:

HDF5 cache: Metadata working set size is highly variable depending on file structure and access pattern. If the cache is too small, performance will deteriorate. In 1.8 we introduce code to configure metadata cache size automatically and API calls to allow manual configuration of the metadata cache.

Text - data type conversion (10 minutes): The new high-level API function, H5LTtext_to_dtype, provides the ability to create a data type through the text description of the data type. The function H5LTdtype_to_text facilitates debugging by printing the text description of a data type. The currently supported text description is in DDL format.

External Links: This feature allows links in a group to refer to objects in another file, and for the library to access those objects as if they are in the current file. We will present the API functions and how external links are supported.

Group revisions: We will introduce new features of the HDF5 Group object that include compact group storage, new large group storage, intermediate group creation and support of Unicode for the HDF5 object's names and datatypes. We will also cover new APIs for copying HDF5 objects between HDF5 files.

Compact Groups – This feature allows groups containing only a few links to take up much less space in the file.

New Large Group Storage – The method of storing groups with many links has been updated to be faster and more scalable.

Intermediate Group Creation – This feature allows intermediate groups that don't exist yet to be created when creating an object in a file.

Support for Unicode Character Set – The UTF-8 Unicode encoding is now supported for strings in datasets, the names of links and the names of attributes.

TRANSCRIPT

Page 1: Update on HDF5 1.8

Update on HDF5 1.8

The HDF Group

HDF and HDF-EOS Workshop X

November 28, 2006

Page 2: Update on HDF5 1.8

Why HDF5 1.8?

Page 3: Update on HDF5 1.8

Nov. 28, 2006

HDF and HDF-EOS Workshop X, Landover MD

3

… as we know, there are known knowns; there are things we know we know.

We also know there are known unknowns; that is to say we know there are some things we do not know.

But there are also unknown unknowns -- the ones we don't know we don't know.

Donald Rumsfeld

Page 4: Update on HDF5 1.8

Some things we knew we knew

• Need high level APIs – image, etc.
• Need more datatypes – packed n-bit, etc.
• Need external and other links
• Tools needed – h5pack, etc.
• Caching embellishments
• Eventually, multithreading

Page 5: Update on HDF5 1.8

Things we knew we did not know

• New requirements from EOS and ASCI

• New applications that would use HDF5

• How HDF5 would really perform in parallel

• What new tools, features and options needed

• New APIs, API features

Page 6: Update on HDF5 1.8

Things we didn’t know we didn’t know

• Completely unanticipated applications
• New data types and structures
  • E.g. DNA sequences
• New operations
  • E.g. write many real-time streams simultaneously

Page 7: Update on HDF5 1.8

HDF5 1.8 topics

• Dataset and datatype improvements
• Group improvements
• Link revisions
• Shared object header messages
• Metadata cache improvements
• Other improvements
• Platform-specific changes
• High level APIs
• Parallel HDF5
• Tool improvements

Page 8: Update on HDF5 1.8

Dataset and Datatype Improvements

Page 9: Update on HDF5 1.8

Text-based data type descriptions

• Why:
  • Simplify datatype creation
  • Make datatype creation code more readable
  • Facilitate debugging by printing the text description of a data type
• What:
  • New routine to create a data type through the text description of the data type: H5LTtext_to_dtype
  • Companion routine to print the text description of a data type: H5LTdtype_to_text

Page 10: Update on HDF5 1.8

Text data type description – Example

• Create a datatype of compound type.

/* Create the data type with text description */
dtype = H5Ttext_to_type("typedef struct foo {int a; float b;} foo_t;");

/* Convert the data type back to text */
H5Ttype_to_text(dtype, NULL, H5T_C, &tsize);

Page 11: Update on HDF5 1.8

Serialized datatypes and dataspaces

• Why:
  • Allow datatype and dataspace info to be transmitted between processes
  • Allow datatype/dataspace to be stored in non-HDF5 files
• What:
  • A new set of routines to serialize/deserialize HDF5 datatypes and dataspaces.

Page 12: Update on HDF5 1.8

Int to float convert during I/O

• Why: Convert ints to floats during I/O

• What: Int to float conversion supported during I/O

Page 13: Update on HDF5 1.8

Revised conversion exception handling

• Why: Give apps greater control over exceptions (range errors, etc.) during datatype conversion.

• What: Revised conversion exception handling

Page 14: Update on HDF5 1.8

Revised conversion exception handling

• To handle exceptions during conversions, register a handling function through H5Pset_type_conv_cb().
• Cases of exception:
  • H5T_CONV_EXCEPT_RANGE_HI
  • H5T_CONV_EXCEPT_RANGE_LOW
  • H5T_CONV_EXCEPT_TRUNCATE
  • H5T_CONV_EXCEPT_PRECISION
  • H5T_CONV_EXCEPT_PINF
  • H5T_CONV_EXCEPT_NINF
  • H5T_CONV_EXCEPT_NAN
• Return values: H5T_CONV_ABORT, H5T_CONV_UNHANDLED, H5T_CONV_HANDLED
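The callback contract above can be illustrated with a small stand-alone model. This is only a sketch in plain C and does not call the HDF5 library: the type names (except_t, conv_ret_t, except_cb_t) and the functions clamp_high and convert are hypothetical stand-ins for the exception kinds and return values listed on the slide, and the "saturate when unhandled" fallback is an assumption made for the illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for the exception kinds and return values on the slide. */
typedef enum { EXCEPT_RANGE_HI, EXCEPT_RANGE_LOW } except_t;
typedef enum { CONV_ABORT, CONV_UNHANDLED, CONV_HANDLED } conv_ret_t;
typedef conv_ret_t (*except_cb_t)(except_t kind, long src, signed char *dst);

/* Example handler: clamp overflow to the largest value, abort on underflow. */
static conv_ret_t clamp_high(except_t kind, long src, signed char *dst)
{
    (void)src;
    if (kind == EXCEPT_RANGE_HI) {
        *dst = SCHAR_MAX;
        return CONV_HANDLED;
    }
    return CONV_ABORT;
}

/* Convert long -> signed char, consulting the callback on range errors.
 * Returns 0 on success, -1 if the callback aborted the conversion. */
static int convert(const long *src, signed char *dst, size_t n, except_cb_t cb)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        if (src[i] >= SCHAR_MIN && src[i] <= SCHAR_MAX) {
            dst[i] = (signed char)src[i];
            continue;
        }
        except_t kind = src[i] > SCHAR_MAX ? EXCEPT_RANGE_HI : EXCEPT_RANGE_LOW;
        conv_ret_t r = cb ? cb(kind, src[i], &dst[i]) : CONV_UNHANDLED;
        if (r == CONV_ABORT)
            return -1;
        if (r == CONV_UNHANDLED)   /* assumed default: saturate */
            dst[i] = (kind == EXCEPT_RANGE_HI) ? SCHAR_MAX : SCHAR_MIN;
    }
    return 0;
}
```

Calling convert with clamp_high on {5, 300, -7} stores 5, SCHAR_MAX and -7; on {-300} the handler aborts and convert returns -1.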

Page 15: Update on HDF5 1.8

Compression filter for n-bit data

• Why: Compact storage for user-defined datatypes
• What:
  • When data is stored on disk, padding bits are chopped off and only significant bits are stored
  • Supports most datatypes
  • Works with compound datatypes

Page 16: Update on HDF5 1.8

N-bit compression example

• In memory, one value of N-Bit datatype is stored like this:

| byte 3 | byte 2 | byte 1 | byte 0 |
|????????|????SPPP|PPPPPPPP|PPPP????|

S-sign bit P-significant bit ?-padding bit

• After passing through the N-Bit filter, all padding bits are chopped off, and the bits are stored on disk like this:

| 1st value | 2nd value |
|SPPPPPPP PPPPPPPP|SPPPPPPP PPPPPPPP|...

• Opposite (decompress) when going from disk to memory
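The bit layout above (16 significant bits sitting 4 bits above the bottom of a 32-bit word) can be mimicked with plain shifts and masks. The sketch below is a stand-alone illustration, not the library's filter code; the names nbit_pack, nbit_unpack and nbit_pack_array are hypothetical. In the library itself the layout comes from the datatype's precision and offset (H5Tset_precision, H5Tset_offset) with the filter enabled through H5Pset_nbit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout taken from the slide: 16 significant bits (sign + 15),
 * starting 4 bits above the least significant bit of a 32-bit word. */
enum { NBIT_OFFSET = 4, NBIT_PRECISION = 16 };

/* Drop the padding bits and keep only the significant field. */
static uint16_t nbit_pack(uint32_t word)
{
    return (uint16_t)((word >> NBIT_OFFSET) & ((1u << NBIT_PRECISION) - 1u));
}

/* Put the significant field back in place; padding bits come back as zero. */
static uint32_t nbit_unpack(uint16_t field)
{
    return (uint32_t)field << NBIT_OFFSET;
}

/* Pack n words into out[]; returns the number of bytes the packed data uses. */
static size_t nbit_pack_array(const uint32_t *in, size_t n, uint16_t *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = nbit_pack(in[i]);
    return n * sizeof out[0];
}
```

With this layout every value shrinks from 4 bytes in memory to 2 bytes in the packed stream, which is the saving the slide's diagram shows.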

Page 17: Update on HDF5 1.8

Offset+size storage filter

• Why: Use less storage when less precision needed
• What:
  • Performs scale/offset operation on each value
  • Truncates result to fewer bits before storing
  • Currently supports integers and floats
• Example

H5Pset_scaleoffset(dcr, H5Z_SO_INT, H5Z_SO_INT_MINBITS_DEFAULT);
H5Dcreate(……, dcr)
H5Dwrite (…);

Page 18: Update on HDF5 1.8

Example with floating-point type

• Data: {104.561, 99.459, 100.545, 105.644}
• Choose scaling factor: decimal precision to keep. E.g. scale factor D = 2

1. Find minimum value (offset): 99.459
2. Subtract minimum value from each element
   Result: {5.102, 0, 1.086, 6.185}
3. Scale data by multiplying by 10^D = 100
   Result: {510.2, 0, 108.6, 618.5}
4. Round the data to integer
   Result: {510, 0, 109, 619}
5. Pack and store using min number of bits
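The five steps above can be traced in a few lines of plain C. This is a sketch of the arithmetic only, not the filter's actual implementation, and the function names scale_offset_encode and min_bits are hypothetical. One caveat: the last value lands exactly on a rounding boundary (618.5), so binary floating point may yield 618 rather than the slide's 619; either way the packed width is the same.

```c
#include <assert.h>
#include <stddef.h>

/* Steps 1-4: find the minimum, subtract it, scale by 10^d, round.
 * Returns the minimum, which is the offset stored with the data. */
static double scale_offset_encode(const double *data, size_t n, int d, long *out)
{
    assert(data != NULL && out != NULL && n > 0 && d >= 0);
    double min = data[0];
    for (size_t i = 1; i < n; i++)
        if (data[i] < min)
            min = data[i];

    double scale = 1.0;
    for (int k = 0; k < d; k++)
        scale *= 10.0;

    /* Values are non-negative after subtracting the minimum,
     * so adding 0.5 and truncating rounds to nearest. */
    for (size_t i = 0; i < n; i++)
        out[i] = (long)((data[i] - min) * scale + 0.5);
    return min;
}

/* Step 5: smallest number of bits that can hold every encoded value. */
static unsigned min_bits(const long *vals, size_t n)
{
    long max = 0;
    for (size_t i = 0; i < n; i++)
        if (vals[i] > max)
            max = vals[i];
    unsigned bits = 0;
    while (max > 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}
```

For the slide's data with D = 2, the encoded values need 10 bits each, compared with 32 or 64 bits for the original floating-point values.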

Page 19: Update on HDF5 1.8

“NULL” Dataspace

• Why:
  • Allow datasets with no elements to be described
  • NetCDF 4 needed a “place holder” for attributes
• What:
  • A dataset with no dimensions, no data

Page 20: Update on HDF5 1.8

Group improvements

Page 21: Update on HDF5 1.8

Access links by creation-time order

• Why:
  • Allow iteration & lookup of group’s links (children) by creation order as well as by name order
  • Support netCDF access model for netCDF 4
• What: Option to access objects in group according to relative creation time

Page 22: Update on HDF5 1.8

“Compact groups”

• Why:
  • Save space and access time for small groups
  • If groups small, don’t need B-tree overhead
• What:
  • Alternate storage for groups with few links
• Example
  • File with 11,600 groups
  • With original group structure, file size ~ 20 MB
  • With compact groups, file size ~ 12 MB
  • Total savings: 8 MB (40%)
  • Average savings/group: ~700 bytes

Page 23: Update on HDF5 1.8

Better large group storage

• Why: Faster, more scalable storage and access for large groups

• What: New format and method for storing groups with many links

Page 24: Update on HDF5 1.8

Intermediate group creation

• Why:
  • Simplify creation of a series of connected groups
  • Avoid having to create each intermediate group separately, one by one
• What:
  • Intermediate groups can be created when creating an object in a file, with one function call

Page 25: Update on HDF5 1.8

Example: add intermediate groups

• Want to create “/A/B/C/dset1”
• “A” exists, but “B/C/dset1” do not

Figure: before the call the file contains only /A; after the call /A contains B, which contains C, which contains dset1

H5Dcreate(file_id, “/A/B/C/dset1”,..)

One call creates groups “B” & “C”, then creates “dset1”
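To make the "one call" concrete, the sketch below lists the ancestor groups that have to exist before an object at a given absolute path can be created. It is a stand-alone illustration in plain C and does not reflect how the library walks paths internally; ancestor_groups and MAX_PREFIX are hypothetical names, and the function assumes a clean path with no doubled or trailing slashes. In released 1.8 versions the behavior itself is requested through a link creation property (H5Pset_create_intermediate_group).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PREFIX 128

/* Writes each ancestor group of an absolute object path into out[],
 * shallowest first, and returns how many were written.
 * For "/A/B/C/dset1" the ancestors are "/A", "/A/B" and "/A/B/C". */
static int ancestor_groups(const char *path, char out[][MAX_PREFIX], int max_out)
{
    assert(path != NULL && path[0] == '/');
    int count = 0;
    size_t len = strlen(path);
    for (size_t i = 1; i < len && count < max_out; i++) {
        if (path[i] == '/' && i < MAX_PREFIX) {
            memcpy(out[count], path, i);   /* copy everything before this '/' */
            out[count][i] = '\0';
            count++;
        }
    }
    return count;
}
```

For the slide's path, "/A" already exists, so the two remaining entries, "/A/B" and "/A/B/C", are the groups the single call has to create before it creates "dset1".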

Page 26: Update on HDF5 1.8

Link Revisions

Page 27: Update on HDF5 1.8

What are links?

Links connect groups to their members

“Hard” links point to a target by address

“Soft” links store the path to a target

Figure: a root group with a hard link (<address>) to a dataset and a soft link that stores the path “/target dataset”

Page 28: Update on HDF5 1.8

New: external Links

• Why: Access objects by file & path within file
• What:
  • Store location of file and path within that file
  • Can link across files

Figure: in file1.h5, the root group holds an external link “dataset EL” that stores the file name “file2.h5” and the path “target dataset”; in file2.h5, the root group holds a link “target dataset” (<address>) to the dataset

Page 29: Update on HDF5 1.8

New: User-defined Links

• Why:
  • Allow applications to create their own kinds of links and link operations, such as
    • Create “hard” external link that finds an object by address
    • Create link that accesses a URL
    • Keep track of how often a link accessed, or other behavior
• What:
  • App can create new kinds of links by supplying custom callback functions
  • Can do anything HDF5 hard, soft, or external links do

Page 30: Update on HDF5 1.8

Shared Object Header Messages

Page 31: Update on HDF5 1.8

Shared object header messages

• Why: metadata duplicated many times, wasting space
• Example:
  • You create a file with 10,000 datasets
  • All use the same datatype and dataspace
  • HDF5 needs to write this information 10,000 times!

Figure: Dataset 1, Dataset 2 and Dataset 3 each store their own data together with a separate copy of the datatype and dataspace

Page 32: Update on HDF5 1.8

Shared object header messages

What:
• Enable messages to be shared automatically
• HDF5 shares duplicated messages on its own!

Figure: Dataset 1 and Dataset 2 each store their own data and refer to a single shared copy of the datatype and dataspace

Page 33: Update on HDF5 1.8

Shared Messages

• Happens automatically
• Works with datatypes, dataspaces, attributes, fill values, and filter pipelines
• Saves space if these objects are relatively large
• May be faster if HDF5 can cache shared messages
• Drawbacks
  • Usually slower than non-shared messages
  • Adds overhead to the file
    • Index for storing shared datatypes
    • 25 bytes per instance
  • Older library versions can’t read files with shared messages

Page 34: Update on HDF5 1.8

Two informal tests

• File with 24 datasets, all with same big datatype
  • 26,000 bytes normally
  • 17,000 bytes with shared messages enabled
  • Saves 375 bytes per dataset
• But, make a bad decision: invoke shared messages but only create one dataset…
  • 9,000 bytes normally
  • 12,000 bytes with shared messages enabled
  • Probably slower when reading and writing, too.
• Moral: shared messages can be a big help, but only in the right situation!
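The figures in the two tests can be checked with a couple of lines of arithmetic. The sketch below only restates the slide's numbers; the helper names net_saving and saving_per_dataset are hypothetical.

```c
#include <assert.h>

/* Bytes saved (positive) or lost (negative) by enabling shared messages. */
static long net_saving(long normal_bytes, long shared_bytes)
{
    return normal_bytes - shared_bytes;
}

/* Average saving per dataset, in bytes. */
static long saving_per_dataset(long normal_bytes, long shared_bytes, long ndatasets)
{
    assert(ndatasets > 0);
    return net_saving(normal_bytes, shared_bytes) / ndatasets;
}
```

With 24 datasets, (26,000 - 17,000) / 24 gives the 375 bytes per dataset quoted above; with a single dataset, 9,000 - 12,000 is a net loss of 3,000 bytes.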

Page 35: Update on HDF5 1.8

Metadata cache improvements

Page 36: Update on HDF5 1.8

Metadata Cache improvements

• Why:
  • Improve I/O performance and memory usage when accessing many objects
• What:
  • New metadata cache APIs
    • control cache size
    • monitor actual cache size and current hit rate
  • Under the hood: adaptive cache resizing
    • Automatically detects the current working set size
    • Sets max cache size to the working set size

Page 37: Update on HDF5 1.8

Metadata cache improvements

• Note: most applications do not need to worry about the cache
• See “Advanced topics” for details
• And if you do see unusual memory growth or poor performance, please contact us. We want to help you.

Page 38: Update on HDF5 1.8

Other improvements

Page 39: Update on HDF5 1.8

New extendible error-handling API

• Why: Enable app to integrate error reporting with HDF5 library error stack
• What: New error handling API
  • H5Epush – push major and minor error ID on specified error stack
  • H5Eprint – print specified stack
  • H5Ewalk – walk through specified stack
  • H5Eclear – clear specified stack
  • H5Eset_auto – turn error printing on/off for specified stack
  • H5Eget_auto – return settings for specified stack traversal

Page 40: Update on HDF5 1.8

Attribute improvements

• Why:
  • Use less storage when large numbers of attributes attached to a single object
  • Iterate over or look up attributes by creation order
• What:
  • Property to create index on the order in which the attributes are created
  • Improved attribute storage

Page 41: Update on HDF5 1.8

Support for Unicode Character Set

• Why:
  • So apps can create names using Unicode
  • netCDF 4 needed this
• What
  • UTF-8 Unicode encoding now supported
  • For string datatypes, names of links and attributes
• Example:

H5Pset_char_encoding(lcpl_id, H5T_CSET_UTF8);
H5Llink(file_id, "UTF-8 name", …, lcpl_id, …);

Page 42: Update on HDF5 1.8

Efficient copying of HDF5 objects

• Why:
  • Enable apps to copy objects efficiently
• What
  • New routines to copy an object in an HDF5 file within the current file or to another file
  • Done at a low-level in the HDF5 file, allowing
    • Entire group hierarchies to be copied quickly
    • Compressed datasets to be copied without going through a decompression/compression cycle

Page 43: Update on HDF5 1.8

Performance of object copy routines

Figure: bar chart of relative time for new h5repack using object copy routines vs. old h5repack (vertical axis 0% to 90%).
Bar values as extracted: 88.1%, 58.7%, 35.8%, 20.0%, 0.3%, 0.1%.
Category labels as extracted: 80M array, compound datatype; 16K x 16K int array; 10,000 groups; 16K x 16K float array, chunked; 10,000 attributes; 16K x 16K float array, chunked, compressed.

Page 44: Update on HDF5 1.8

Data transformation filter

• Why:
  • Apply arithmetic operations to data during I/O
• What:
  • Data transformation filter
  • Transform expressed by algebraic formula
  • Only +, -, *, and / supported
• Example:
  • Expression parameter set, such as x*(x-5)
  • When dataset read/written, x*(x-5) applied per element
  • When reading, values in file are unchanged
  • When writing, transformed data written to file
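The read and write behavior described above can be modeled in plain C. This is only a sketch of the semantics, with the slide's expression hard-coded as a function; it does not parse expressions or call the library, and the names transform, write_transformed and read_transformed are hypothetical. In the library the expression is supplied as a string on a dataset transfer property list (H5Pset_data_transform).

```c
#include <assert.h>
#include <stddef.h>

/* The slide's example expression, x*(x-5), written as a C function. */
static double transform(double x)
{
    return x * (x - 5.0);
}

/* Write path: the transformed values are what land in the "file" buffer. */
static void write_transformed(const double *mem, double *file, size_t n)
{
    assert(mem != NULL && file != NULL);
    for (size_t i = 0; i < n; i++)
        file[i] = transform(mem[i]);
}

/* Read path: the caller receives transformed values; the "file" buffer is not modified. */
static void read_transformed(const double *file, double *mem, size_t n)
{
    assert(mem != NULL && file != NULL);
    for (size_t i = 0; i < n; i++)
        mem[i] = transform(file[i]);
}
```

Reading a stored value of 7 through this transform yields 14 in memory while the stored 7 stays as it was; writing 7 stores 14.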

Page 45: Update on HDF5 1.8

Stackable Virtual File Drivers

• What is Virtual File Driver (VFD)?

Page 46: Update on HDF5 1.8

Structure of HDF5 Library

Object API (C, Fortran 90, Java, C++)
• Specify objects and transformation properties
• Invoke data movement operations and data transformations

Library internals
• Performs data transformations and other prep for I/O
• Configurable transformations (compression, etc.)

Virtual file I/O (C only)
• Perform byte-stream I/O operations (open/close, read/write, seek)
• User-implementable I/O (stdio, network, memory, etc.)

Page 47: Update on HDF5 1.8

Stackable VFD

• HDF5 VFD allows
  • Storing data using different physical file layout. E.g., Family VFD (writes file as “family of files”)
  • Doing different types of I/O. E.g., stdio (standard I/O); MPI-I/O (for parallel I/O)

Page 48: Update on HDF5 1.8

Stackable VFD

• Why “stackable:”
  • Before now, only one VFD could be used at a time
  • VFDs could not interoperate
• What is “stackable:”
  • A non-terminal VFD may stack on top of compatible non-terminal and eventually terminal VFDs
• Two kinds of VFD
  • Non-terminal (e.g. Family)
  • Terminal (e.g. stdio; MPI-I/O)

Page 49: Update on HDF5 1.8

Stackable VFD

Figure: I/O stack from the application through the HDF5 API down to the HDF5 files. Labels as extracted: non-terminal VFDs (Family, File split with metadata and raw data branches); terminal VFDs (sec2, stdio, mpiio); default I/O path.

Page 50: Update on HDF5 1.8

Platform-specific changes

Page 51: Update on HDF5 1.8

Platform-specific changes

• Why: Better UNIX/Linux Portability
• What:
  • 1.8 uses latest GNU “auto” tools (autoconf, automake, libtool)
    • improves portability between many machine and OS configurations
  • Build can now be done in parallel
    • with gmake “-j” flag
    • speeds up build, test and install processes
  • Build infrastructure includes many other improvements as well

Page 52: Update on HDF5 1.8

Platforms to be dropped

• Operating systems
  • HPUX 11.00
  • MAC OS 10.3
  • AIX 5.1 and 5.2
  • SGI IRIX64-6.5
  • Linux 2.4
  • Solaris 2.8 and 2.9
• Compilers
  • GNU C compilers older than 3.4 (Linux)
  • Intel 8.*
  • PGI V. 5.*, 6.0
  • MPICH 1.2.5

http://www.hdfgroup.org/HDF5/release/alpha/obtain518.html

Page 53: Update on HDF5 1.8

Platforms to be added

• Systems
  • Alpha Open VMS
  • MAC OSX 10.4 (Intel)
  • Solaris 2.* on Intel (?)
  • Cray XT3
  • Windows 64-bit (32-bit binaries)
  • Linux 2.6
  • BG/L
• Compilers
  • g95
  • PGI V. 6.1
  • Intel 9.*
  • MPICH 1.2.7
  • MPICH2

Page 54: Update on HDF5 1.8

High level APIs

Page 55: Update on HDF5 1.8

High-Level Fortran APIs

• Fortran APIs have been added for H5Lite, H5Image and H5Table.

Page 56: Update on HDF5 1.8

Dimension scales

• Similar to
  • Dimension scales in HDF4
  • Coordinate variables in netCDF
• What is a dimension scale?
  • An HDF5 dataset with additional metadata that identifies the dataset as a “Dimension Scale”
  • Associated with dimensions of HDF5 datasets
  • Meaning of the association is left to applications
  • A dimension scale can be shared by two or more dataset dimensions

Page 57: Update on HDF5 1.8

Dimension scales example

HDF Explorer image

Page 58: Update on HDF5 1.8

Dimension scales example

HDF Explorer image

Page 59: Update on HDF5 1.8

Sample dimension scale functions

• H5DSset_scale: convert dataset to a dimension scale
• H5DSattach_scale: attach scale to a dimension
• H5DSdetach_scale: detach scale from a dimension
• H5DSis_attached: verify if scale attached to dataset
• H5DSget_scale_name: read name of scale

Page 60: Update on HDF5 1.8

HDF5Packet

• Why:
  • High performance table writing
  • For data acquisition, when there are many sources of data
  • E.g. flight test
• What:
  • Each row is a “packet”: a collection of fields, fixed or variable length
  • Append only
  • Indexed retrieval

Page 61: Update on HDF5 1.8

Packets in HDF5

Figure: two packet tables with data rows appended along a time axis, one holding fixed-length data records and one holding variable-length records

Page 62: Update on HDF5 1.8

Parallel HDF5

Page 63: Update on HDF5 1.8

Collective I/O improvements

• Why
  • Collective I/O not available for chunked data
  • Collective I/O not available for complex selections
  • Collective I/O is key to improving performance for parallel HDF5
• What
  • Collective I/O works for chunked storage
  • Works for irregular selections for both chunked and contiguous storage

Page 64: Update on HDF5 1.8

Parallel h5diff (ph5diff)

• Compares two files in an MPI parallel environment.

• Compares multiple datasets simultaneously

Page 65: Update on HDF5 1.8

Windows MPICH support

• Windows MPICH support: prototype

Page 66: Update on HDF5 1.8

Tool improvements

Page 67: Update on HDF5 1.8

New features for old tools

• h5dump
  • Dump data in binary format
  • Faster for files with large numbers of objects
• h5diff
  • Can now compare dataset regions
  • Parallel ph5diff now available
• h5repack
  • Efficient data copy using H5Gcopy()
  • Able to handle big datasets

Page 68: Update on HDF5 1.8

New HDF5 Tools

• h5copy
  • Copies a group, dataset or named datatype from one location to another
  • Copies within a file or across files
• h5repart
  • Partition file into a family of files
• h5import
  • Import binary/ascii data into an HDF5 file
• h5check
  • Verifies an HDF5 file against the defined HDF5 File Format Specification
• h5stat
  • Reports statistics about a file and objects in a file

Page 69: Update on HDF5 1.8

Thank You

Page 70: Update on HDF5 1.8

Questions/comments?

Page 71: Update on HDF5 1.8

For more information

• Go to http://www.hdfgroup.org/HDF5/

• Click on “Obtain HDF5 1.8.0 Alpha”

• Look at table “Information”

Page 72: Update on HDF5 1.8

Acknowledgement

This report is based upon work supported in part by a Cooperative Agreement with NASA under NASA NNG05GC60A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration.