HDF5 Advanced Topics

HDF-EOS Workshop XI


Page 1: HDF5 Advanced Topics

HDF5 Advanced Topics

Page 2: HDF5 Advanced Topics

Outline

• Dataset selections and partial I/O
• Chunking and filters
• Datatypes
  • Overview
  • Variable length data
  • Compound datatype
  • Object and dataset region references

Page 3: HDF5 Advanced Topics

Selections and partial I/O

Page 4: HDF5 Advanced Topics

What is a selection?

• A selection describes the elements of a dataset that participate in partial I/O
  • Hyperslab selection
  • Point selection
  • Results of set operations on hyperslab selections or point selections (union, difference, …)
• Used by both sequential and parallel HDF5

Page 5: HDF5 Advanced Topics

Example of a hyperslab selection

Page 6: HDF5 Advanced Topics

Example of regular hyperslab selection

Page 7: HDF5 Advanced Topics

Example of irregular hyperslab selection

Page 8: HDF5 Advanced Topics

Another example of hyperslab selection

Page 9: HDF5 Advanced Topics

Example of point selection

Page 10: HDF5 Advanced Topics

Hyperslab description

• Offset - starting location of the hyperslab (1,1)
• Stride - number of elements that separate each block (3,2)
• Count - number of blocks (2,6)
• Block - block size (2,1)
• Everything is "measured" in number of elements

Page 11: HDF5 Advanced Topics

H5Sselect_hyperslab

space_id   Identifier of the dataspace
op         Selection operator (H5S_SELECT_SET or H5S_SELECT_OR)
offset     Array with the starting coordinates of the hyperslab
stride     Array specifying which positions along a dimension to select
count      Array specifying how many blocks to select from the dataspace, in each dimension
block      Array specifying the size of the element block (NULL indicates a block size of a single element in a dimension)
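
A minimal C sketch (not part of the original slides) of a call using the hyperslab parameters from the previous slide; the dataset handle dset is assumed to be already open:

hsize_t offset[2] = {1, 1};   /* start at element (1,1) */
hsize_t stride[2] = {3, 2};   /* blocks start every 3 rows and 2 columns */
hsize_t count[2]  = {2, 6};   /* 2 x 6 blocks */
hsize_t block[2]  = {2, 1};   /* each block is 2 x 1 elements */

hid_t  filespace = H5Dget_space(dset);
herr_t status    = H5Sselect_hyperslab(filespace, H5S_SELECT_SET,
                                       offset, stride, count, block);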

Page 12: HDF5 Advanced Topics

Reading/Writing Selections

1. Open the file
2. Open the dataset
3. Get the file dataspace
4. Create a memory dataspace (describing the data buffer)
5. Make the selection(s)
6. Read from or write to the dataset
7. Close the dataset, file dataspace, memory dataspace, and file

Page 13: HDF5 Advanced Topics

Example c-hyperslab.c: reading two rows

1 2 3 4 5 6

7 8 9 10 11 12

13 14 15 16 17 18

19 20 21 22 23 24

-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1

Data in the file: 4 x 6 matrix

Buffer in memory: 1-dim array of length 14

Page 14: HDF5 Advanced Topics

Example: reading two rows

1 2 3 4 5 6

7 8 9 10 11 12

13 14 15 16 17 18

19 20 21 22 23 24

offset = {1,0}
count  = {2,6}
block  = {1,1}
stride = {1,1}

filespace = H5Dget_space(dataset);
H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL, count, NULL);

Page 15: HDF5 Advanced Topics

Example: reading two rows

-1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1

offset = {1}
count  = {12}

dims[0] = 14;
memspace = H5Screate_simple(1, dims, NULL);
H5Sselect_hyperslab(memspace, H5S_SELECT_SET, offset, NULL, count, NULL);

Page 16: HDF5 Advanced Topics

Example: reading two rows

1 2 3 4 5 6

7 8 9 10 11 12

13 14 15 16 17 18

19 20 21 22 23 24

-1 7 8 9 10 11 12 13 14 15 16 17 18 -1

H5Dread (…, …, memspace, filespace, …, …);
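
For reference, a condensed sketch of the whole c-hyperslab.c sequence built from the three slides above; the file and dataset names are placeholders, error checking is omitted, and HDF5 1.8+ signatures are used:

int     i, data_out[14];
hsize_t mem_dims[1] = {14};
hsize_t foffset[2]  = {1, 0}, fcount[2] = {2, 6};   /* rows 1-2 of the 4 x 6 dataset */
hsize_t moffset[1]  = {1},    mcount[1] = {12};     /* elements 1..12 of the buffer  */

for (i = 0; i < 14; i++)
    data_out[i] = -1;                               /* fill the buffer with -1       */

hid_t file      = H5Fopen("file.h5", H5F_ACC_RDONLY, H5P_DEFAULT);   /* placeholder */
hid_t dataset   = H5Dopen(file, "IntArray", H5P_DEFAULT);            /* placeholder */
hid_t filespace = H5Dget_space(dataset);
hid_t memspace  = H5Screate_simple(1, mem_dims, NULL);

H5Sselect_hyperslab(filespace, H5S_SELECT_SET, foffset, NULL, fcount, NULL);
H5Sselect_hyperslab(memspace,  H5S_SELECT_SET, moffset, NULL, mcount, NULL);
H5Dread(dataset, H5T_NATIVE_INT, memspace, filespace, H5P_DEFAULT, data_out);

H5Sclose(memspace);
H5Sclose(filespace);
H5Dclose(dataset);
H5Fclose(file);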

Page 17: HDF5 Advanced Topics

Chunking in HDF5

Page 18: HDF5 Advanced Topics

HDF5 chunking

• Chunked layout is needed for
  • Extendible datasets
  • Compression and other filters
  • Improving partial I/O for big datasets

[Figure: a chunked, extendible dataset gives better subsetting access time; only the two chunks that intersect the selection are written/read]

Page 19: HDF5 Advanced Topics

Creating chunked dataset

1. Create a dataset creation property list
2. Set the property list to use chunked storage layout
3. Set the property list to use filters
4. Create the dataset with the above property list

plist = H5Pcreate(H5P_DATASET_CREATE);
H5Pset_chunk(plist, rank, ch_dims);
H5Pset_deflate(plist, 9);
dset_id = H5Dcreate(…, "Chunked", …, plist);
H5Pclose(plist);
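
A fuller sketch of the same sequence with the elided arguments filled in; the file name, dataset name, and sizes are hypothetical, and the HDF5 1.8+ H5Dcreate signature is used:

hsize_t dims[2]    = {1000, 20};
hsize_t ch_dims[2] = {200, 20};

hid_t file  = H5Fcreate("zip.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
hid_t space = H5Screate_simple(2, dims, NULL);

hid_t plist = H5Pcreate(H5P_DATASET_CREATE);
H5Pset_chunk(plist, 2, ch_dims);          /* chunked layout with 200 x 20 chunks */
H5Pset_deflate(plist, 6);                 /* GZIP (deflate) compression, level 6 */

hid_t dset_id = H5Dcreate(file, "Chunked", H5T_NATIVE_INT, space,
                          H5P_DEFAULT, plist, H5P_DEFAULT);
H5Pclose(plist);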

Page 20: HDF5 Advanced Topics

HDF5 filters

• HDF5 Filters modify data during I/O operations

• Available filters:
  1. Checksum (H5Pset_fletcher32)
  2. Shuffling filter (H5Pset_shuffle)
  3. Compression
     • Scale + offset (H5Pset_scaleoffset)
     • N-bit (H5Pset_nbit)
     • GZIP (deflate), SZIP (H5Pset_deflate, H5Pset_szip)
     • BZIP2 (example of a user-defined compression filter)
       http://www.hdfgroup.uiuc.edu/papers/papers/bzip2/

Page 21: HDF5 Advanced Topics

N-bit compression

• In memory, one value of N-Bit datatype is stored like this:

|  byte 3  |  byte 2  |  byte 1  |  byte 0  |
| ???????? | ????SPPP | PPPPPPPP | PPPP???? |

S - sign bit, P - significant bit, ? - padding bit

• After passing through the N-Bit filter, all padding bits are chopped off, and the bits are stored on disk like this:

| 1st value         | 2nd value         |
| SPPPPPPP PPPPPPPP | SPPPPPPP PPPPPPPP | ...

• Opposite (decompress) when going from disk to memory

Page 22: HDF5 Advanced Topics

N-bit compression

• Provides compact storage for user-defined datatypes

• How does it work?
  • When data is stored on disk, the padding bits are chopped off and only the significant bits are stored
• Supports most datatypes
• Works with compound datatypes

H5Pset_nbit(dcr);
H5Dcreate(……, dcr);
H5Dwrite (…);
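
A sketch (not from the slides) of storing a hypothetical 13-bit integer type with the N-bit filter; file, space, ch_dims, and buf are assumed to exist, and the 1.8+ H5Dcreate signature is used:

hid_t nbit_t = H5Tcopy(H5T_NATIVE_INT);
H5Tset_precision(nbit_t, 13);             /* keep 13 significant bits        */
H5Tset_offset(nbit_t, 0);                 /* significant bits start at bit 0 */

hid_t dcr = H5Pcreate(H5P_DATASET_CREATE);
H5Pset_chunk(dcr, 2, ch_dims);            /* filters require chunked storage */
H5Pset_nbit(dcr);

hid_t dset = H5Dcreate(file, "NBit_Data", nbit_t, space,
                       H5P_DEFAULT, dcr, H5P_DEFAULT);
H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);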

Page 23: HDF5 Advanced Topics

Scale+offset storage filter

• Uses less storage when less precision is needed
• Performs a scale/offset operation on each value
• Truncates the result to fewer bits before storing
• Currently supports integers and floats
• Example:

H5Pset_scaleoffset(dcr, H5Z_SO_INT, H5Z_SO_INT_MINBITS_DEFAULT);

H5Dcreate(……, dcr);

H5Dwrite (…);

Page 24: HDF5 Advanced Topics

Example with floating-point type

• Data: {104.561, 99.459, 100.545, 105.644}
• Choose a scaling factor: the decimal precision to keep, e.g. scale factor D = 2
  1. Find the minimum value (offset): 99.459
  2. Subtract the minimum value from each element
     Result: {5.102, 0, 1.086, 6.185}
  3. Scale the data by multiplying by 10^D = 100
     Result: {510.2, 0, 108.6, 618.5}
  4. Round the data to integers
     Result: {510, 0, 109, 619}

This is stored in the file.
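
A sketch (not from the slides) of applying the filter to floating-point data with scale factor D = 2, using the D-scaling variant; file, space, ch_dims, and data are assumed to exist, and the 1.8+ H5Dcreate signature is used:

hid_t dcr = H5Pcreate(H5P_DATASET_CREATE);
H5Pset_chunk(dcr, 1, ch_dims);                     /* filters require chunked storage */
H5Pset_scaleoffset(dcr, H5Z_SO_FLOAT_DSCALE, 2);   /* keep 2 decimal digits           */

hid_t dset = H5Dcreate(file, "ScaleOffset_Data", H5T_NATIVE_FLOAT, space,
                       H5P_DEFAULT, dcr, H5P_DEFAULT);
H5Dwrite(dset, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);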

Page 25: HDF5 Advanced Topics

Writing or reading to/from chunked dataset

1. Use the same set of operations as for a contiguous dataset
2. Selections do not need to coincide precisely with the chunks
3. The chunking mechanism is transparent to the application
4. Chunking and compression parameters can affect performance

H5Dopen(…);

…………

H5Sselect_hyperslab (…);

…………

H5Dread(…);

Page 26: HDF5 Advanced Topics

h5zip.c example

Creates a compressed integer dataset 1000x20 in the zip.h5 file

h5dump -p -H zip.h5

HDF5 "zip.h5" {
GROUP "/" {
   GROUP "Data" {
      DATASET "Compressed_Data" {
         DATATYPE  H5T_STD_I32BE
         DATASPACE  SIMPLE { ( 1000, 20 ) ……… }
         STORAGE_LAYOUT {
            CHUNKED ( 20, 20 )
            SIZE 5316
         }

Page 27: HDF5 Advanced Topics

h5zip.c example

         FILTERS {
            COMPRESSION DEFLATE { LEVEL 6 }
         }
         FILLVALUE {
            FILL_TIME H5D_FILL_TIME_IFSET
            VALUE  0
         }
         ALLOCATION_TIME {
            H5D_ALLOC_TIME_INCR
         }
      }
   }
}
}

Page 28: HDF5 Advanced Topics

h5zip.c example (bigger chunk)

Creates a compressed integer dataset 1000x20 in the zip.h5 file

h5dump -p -H zip.h5

HDF5 "zip.h5" {
GROUP "/" {
   GROUP "Data" {
      DATASET "Compressed_Data" {
         DATATYPE  H5T_STD_I32BE
         DATASPACE  SIMPLE { ( 1000, 20 ) ……… }
         STORAGE_LAYOUT {
            CHUNKED ( 200, 20 )
            SIZE 2936
         }

Page 29: HDF5 Advanced Topics

Chunking basics to remember

• Chunking creates storage overhead in the file
• Performance is affected by
  • Chunking and compression parameters
  • Chunk cache size (H5Pset_cache call; see the sketch below)
• Some hints for getting better performance
  • Use a chunk size no smaller than the block size (4 KB) on your system
  • Use a compression method appropriate for your data
  • Avoid selections that do not coincide with the chunk boundaries
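
A sketch of enlarging the chunk cache through the file access property list before opening a file; the cache sizes shown are illustrative only:

hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_cache(fapl,
             0,                /* mdc_nelmts: ignored in recent releases    */
             521,              /* rdcc_nslots: chunk-cache hash table slots */
             16*1024*1024,     /* rdcc_nbytes: 16 MB raw-data chunk cache   */
             0.75);            /* rdcc_w0: chunk preemption policy          */
hid_t file = H5Fopen("zip.h5", H5F_ACC_RDONLY, fapl);
H5Pclose(fapl);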

Page 30: HDF5 Advanced Topics

Selections and I/O Performance

Page 31: HDF5 Advanced Topics

Performance of serial I/O operations

• Next slides show the performance effects of using different access patterns and storage layouts.

• We use three test cases which consist of writing a selection to a rectangular array where the datatype of each element is a char.

• Data is stored in row-major order.
• Tests were executed on the THG machine smirom using h5perf_serial and HDF5 version 1.8.

Page 32: HDF5 Advanced Topics

Serial benchmarking tool

• A new benchmarking tool, h5perf_serial, is under development.

• Some features implemented at this time are:
  • Support for POSIX and HDF5 I/O calls
  • Support for datasets and buffers with multiple dimensions
  • Entire dataset access using a single or several I/O operations
  • Selection of contiguous or chunked storage for HDF5 operations

Page 33: HDF5 Advanced Topics

Contiguous storage (Case 1)

• Rectangular dataset of size 48K x 48K, with write selections of 512 x 48K.

• HDF5 storage layout is contiguous.

• Good I/O pattern for POSIX and HDF5 because each selection is contiguous.

• POSIX: 5.19 MB/s
• HDF5: 5.36 MB/s

[Figure: selections 1-4 are bands of whole rows, so each maps to a contiguous region 1-4 in the file]

Page 34: HDF5 Advanced Topics

Contiguous storage (Case 2)

• Rectangular dataset of 48K x 48K, with write selections of 48K x 512.

• HDF5 storage layout is contiguous.

• Bad I/O pattern for POSIX and HDF5 because each selection is noncontiguous.

• POSIX: 1.24 MB/s
• HDF5: 0.05 MB/s

[Figure: selections 1-4 are bands of columns, so each selection is scattered across the file in interleaved pieces (1 2 3 4 1 2 3 4 …)]

Page 35: HDF5 Advanced Topics

Chunked storage

• Rectangular dataset of 48K x 48K, with write selections of 48K x 512.

• HDF5 storage layout is chunked. Chunks and selections sizes are equal.

• Bad I/O case for POSIX because selections are noncontiguous.

• Good I/O case for HDF5 since selections are contiguous due to chunking layout settings.

• POSIX: 1.51 MB/s
• HDF5: 5.58 MB/s

[Figure: with POSIX the selections remain interleaved in the file (1 2 3 4 1 2 3 4 …); with HDF5 each selection corresponds to one chunk, which is stored contiguously]

Page 36: HDF5 Advanced Topics

Conclusions

• Access patterns that consist of many small I/O operations incur latency and overhead costs many times over.

• Chunked storage may improve I/O performance by affecting the contiguity of the data selection.

Page 37: HDF5 Advanced Topics

HDF5 Datatypes

Page 38: HDF5 Advanced Topics

Datatypes

• A datatype is a classification specifying the interpretation of a data element
• It specifies, for a given data element:
  • the set of possible values it can have
  • the operations that can be performed
  • how the values of that type are stored
• May be shared between different datasets in one file

Page 39: HDF5 Advanced Topics

Hierarchy of the HDF5 datatype classes

Page 40: HDF5 Advanced Topics

General Operations on HDF5 Datatypes

• Create
  • Derived and compound datatypes only
• Copy
  • All datatypes
• Commit (save in a file to share between different datasets)
  • All datatypes
• Open
  • Committed datatypes only

• Discover properties (size, number of members, base type)

• Close

Page 41: HDF5 Advanced Topics

Basic Atomic HDF5 Datatypes

Page 42: HDF5 Advanced Topics

Basic Atomic Datatypes

• Atomic type classes:
  • integers & floats
  • strings (fixed and variable size)
  • pointers - references to objects/dataset regions
  • opaque
  • bitfield
• An element of an atomic datatype is the smallest possible unit for an HDF5 I/O operation
  • Cannot write or read just the mantissa or exponent fields of a float, or the sign field of an integer

Page 43: HDF5 Advanced Topics

HDF5 Predefined Datatypes

• The HDF5 library provides predefined datatypes (symbols) for all basic atomic classes except opaque
  • Named H5T_<arch>_<base>
  • Examples:
    • H5T_IEEE_F64LE
    • H5T_STD_I32BE
    • H5T_C_S1
    • H5T_STD_B32LE
    • H5T_STD_REF_OBJ, H5T_STD_REF_DSETREG
    • H5T_NATIVE_INT
• Predefined datatypes do not have constant values; they are initialized when the library is initialized

Page 44: HDF5 Advanced Topics

When to use HDF5 Predefined Datatypes?

• In dataset and attribute creation operations
  • Argument to H5Dcreate or H5Acreate:
    H5Dcreate(file_id, "/dset", H5T_STD_I32BE, dataspace_id, H5P_DEFAULT);
    H5Dcreate(file_id, "/dset", H5T_NATIVE_INT, dataspace_id, H5P_DEFAULT);
• In dataset and attribute I/O operations
  • Argument to H5Dwrite/H5Dread, H5Awrite/H5Aread
  • Always use H5T_NATIVE_* types to describe data in memory
• To create user-defined types
  • Fixed and variable-length strings
  • User-defined integers and floats (e.g. a 13-bit integer or a non-standard floating-point type)
• In composite type definitions
• Do not use them for declaring variables

Page 45: HDF5 Advanced Topics

HDF5 Fixed and Variable length array storage

[Figure: fixed-length array storage writes a same-sized data record at each time step; variable-length array storage writes a differently-sized data record at each time step]

Page 46: HDF5 Advanced Topics

Storing strings in HDF5 (string.c)

• Array of characters
  • Access to each character
  • Extra work to access and interpret each string
• Fixed length
  string_id = H5Tcopy(H5T_C_S1);
  H5Tset_size(string_id, size);
  • Overhead for short strings
  • Can be compressed
• Variable length
  string_id = H5Tcopy(H5T_C_S1);
  H5Tset_size(string_id, H5T_VARIABLE);
  • Overhead as for all VL datatypes (see later)
  • Compression will not be applied to the actual data
• See string.c for how to allocate the read buffer for fixed- and variable-length strings
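
A sketch (dataset names and sizes are illustrative) writing the same four strings once with a fixed-length and once with a variable-length string type; file is assumed open and space is assumed to be a 1-D dataspace of 4 elements:

char        fixed_buf[4][16] = {"January", "February", "March", "April"};
const char *vl_buf[4]        = {"January", "February", "March", "April"};

hid_t fixed_t = H5Tcopy(H5T_C_S1);
H5Tset_size(fixed_t, 16);                 /* every element occupies 16 bytes */

hid_t vl_t = H5Tcopy(H5T_C_S1);
H5Tset_size(vl_t, H5T_VARIABLE);          /* elements are char* pointers     */

hid_t d1 = H5Dcreate(file, "Fixed_Strings", fixed_t, space,
                     H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
H5Dwrite(d1, fixed_t, H5S_ALL, H5S_ALL, H5P_DEFAULT, fixed_buf);

hid_t d2 = H5Dcreate(file, "VL_Strings", vl_t, space,
                     H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
H5Dwrite(d2, vl_t, H5S_ALL, H5S_ALL, H5P_DEFAULT, vl_buf);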

Page 47: HDF5 Advanced Topics

HDF5 variable length datatypes

• Each element is represented by the C struct hvl_t:

  typedef struct {
      size_t len;   /* Length of the VL data (in base-type elements) */
      void  *p;     /* Pointer to the VL data */
  } hvl_t;

• Base type can be any HDF5 type
• Created with H5Tvlen_create(base_type)

Page 48: HDF5 Advanced Topics

Creation of HDF5 variable length array

[Figure: a ragged array of rows, one hvl_t element per row; data[0].p points to the first row's buffer and data[4].len holds the number of elements in the fifth row]

hvl_t data[LENGTH];

for (i = 0; i < LENGTH; i++) {
    data[i].p   = HDmalloc((i+1) * sizeof(unsigned int));
    data[i].len = i + 1;
}

tvl = H5Tvlen_create(H5T_NATIVE_UINT);

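
A sketch completing the slide above: create a dataset with the VL type and write the ragged buffer (the dataset name is hypothetical; space is assumed to be a 1-D dataspace of LENGTH elements, and the 1.8+ H5Dcreate signature is used):

hid_t dset = H5Dcreate(file, "VL_Data", tvl, space,
                       H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
H5Dwrite(dset, tvl, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

/* The application allocated the per-row buffers itself, so it frees them itself */
for (i = 0; i < LENGTH; i++)
    free(data[i].p);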

Page 49: HDF5 Advanced Topics

HDF5 Variable Length Datatypes: Storage

[Figure: the dataset with a variable-length datatype stores descriptors of the elements; the raw data itself is kept in the file's global heap]

Page 50: HDF5 Advanced Topics

Reading HDF5 variable length array

When the size and base datatype are known:

hvl_t rdata[LENGTH];

/* Re-create the VL datatype used in the file */
tvl = H5Tvlen_create(H5T_NATIVE_UINT);

ret = H5Dread(dataset, tvl, H5S_ALL, H5S_ALL, H5P_DEFAULT, rdata);

/* Reclaim the VL buffers allocated by the library during the read */
ret = H5Dvlen_reclaim(tvl, H5S_ALL, H5P_DEFAULT, rdata);
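
For the general case, a sketch that discovers the datatype from the file instead of re-creating it; the number of elements is still assumed to be known:

hvl_t rdata[LENGTH];
hid_t ftype = H5Dget_type(dataset);                       /* datatype as stored in the file */
hid_t mtype = H5Tget_native_type(ftype, H5T_DIR_DEFAULT); /* matching in-memory type        */
hid_t space = H5Dget_space(dataset);

H5Dread(dataset, mtype, H5S_ALL, H5S_ALL, H5P_DEFAULT, rdata);
/* ... use rdata ... */
H5Dvlen_reclaim(mtype, space, H5P_DEFAULT, rdata);        /* free library-allocated VL buffers */

H5Sclose(space);
H5Tclose(mtype);
H5Tclose(ftype);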

Page 51: HDF5 Advanced Topics

Bitfield datatype

• Like a C bitfield: a sequence of bits packed into some integer type
• Examples of predefined datatypes:
  • H5T_NATIVE_B64 - native 8-byte bitfield
  • H5T_STD_B32LE - standard little-endian 4-byte bitfield

• Created by copying predefined bitfield type and setting precision, offset and padding

• Use n-bit filter to store significant bits only

Page 52: HDF5 Advanced Topics

Bitfield datatype

Example: little-endian, zero padding, offset 3, precision 11

bit:    0             7               15
value:  0 0 0 1 0 1 1 1 0 0 1 1 1 0 0 0
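
A sketch (not from the slides) building the bitfield type shown above from a predefined 16-bit bitfield:

hid_t bf_t = H5Tcopy(H5T_STD_B16LE);            /* 2-byte little-endian bitfield   */
H5Tset_precision(bf_t, 11);                     /* 11 significant bits             */
H5Tset_offset(bf_t, 3);                         /* significant bits start at bit 3 */
H5Tset_pad(bf_t, H5T_PAD_ZERO, H5T_PAD_ZERO);   /* zero padding below and above    */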

Page 53: HDF5 Advanced Topics

Storing Tables in HDF5 file

Page 54: HDF5 Advanced Topics

Example

a_name (integer)   b_name (float)   c_name (double)
       0                 0.               1.0000
       1                 1.               0.5000
       2                 4.               0.3333
       3                 9.               0.2500
       4                16.               0.2000
       5                25.               0.1667
       6                36.               0.1429
       7                49.               0.1250
       8                64.               0.1111
       9                81.               0.1000

Multiple ways to store a table:
• A dataset for each field
• A dataset with a compound datatype
• If all fields have the same type:
  • a 2-dim array
  • a 1-dim array of an array datatype
• (continued…)

Choose to achieve your goal! Ask yourself:
• How much overhead will each type of storage create?
• Do I always read all fields?
• Do I need to read some fields more often?
• Do I want to use compression?
• Do I want to access only some records?

Page 55: HDF5 Advanced Topics

HDF5 Compound Datatypes

• Compound types
  • Comparable to C structs
  • Members can be atomic or compound types
  • Members can be multidimensional
  • Can be written/read by a field or a set of fields
  • Not all data filters can be applied (shuffle, SZIP)

Page 56: HDF5 Advanced Topics

HDF5 Compound Datatypes

• Which APIs to use?
• H5TB APIs
  • Create, read, get info, and merge tables
  • Add, delete, and append records
  • Insert and delete fields
  • Limited control over the table's properties (i.e. only GZIP compression, level 6, default allocation time for the table, extendible, etc.)
• PyTables (http://www.pytables.org)
  • Based on H5TB
  • Python interface
  • Indexing capabilities
• HDF5 APIs
  • H5Tcreate(H5T_COMPOUND) and H5Tinsert calls to create a compound datatype
  • H5Dcreate, etc.
  • See the H5Tget_member* functions for discovering properties of an HDF5 compound datatype

Page 57: HDF5 Advanced Topics

Creating and writing compound dataset

h5_compound.c example

typedef struct s1_t {
    int    a;
    float  b;
    double c;
} s1_t;

s1_t s1[LENGTH];

Page 58: HDF5 Advanced Topics

Creating and writing compound dataset

/* Create datatype in memory. */

s1_tid = H5Tcreate(H5T_COMPOUND, sizeof(s1_t));
H5Tinsert(s1_tid, "a_name", HOFFSET(s1_t, a), H5T_NATIVE_INT);
H5Tinsert(s1_tid, "c_name", HOFFSET(s1_t, c), H5T_NATIVE_DOUBLE);
H5Tinsert(s1_tid, "b_name", HOFFSET(s1_t, b), H5T_NATIVE_FLOAT);

Note:
• Use the HOFFSET macro instead of calculating offsets by hand
• The order of the H5Tinsert calls is not important when HOFFSET is used

Page 59: HDF5 Advanced Topics

Creating and writing compound dataset

/* Create dataset and write data */

dataset = H5Dcreate(file, DATASETNAME, s1_tid, space, H5P_DEFAULT);
status  = H5Dwrite(dataset, s1_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s1);

Note:
• In this example the memory and file datatypes are the same
• The type is not packed; use H5Tpack to save space in the file

s2_tid = H5Tcopy(s1_tid);
H5Tpack(s2_tid);            /* H5Tpack removes padding from the type in place */
dataset = H5Dcreate(file, DATASETNAME, s2_tid, space, H5P_DEFAULT);

Page 60: HDF5 Advanced Topics

File content with h5dump

HDF5 "SDScompound.h5" {GROUP "/" { DATASET "ArrayOfStructures" { DATATYPE { H5T_STD_I32BE "a_name"; H5T_IEEE_F32BE "b_name"; H5T_IEEE_F64BE "c_name"; } DATASPACE { SIMPLE ( 10 ) / ( 10 ) } DATA { { [ 0 ], [ 0 ], [ 1 ] }, { [ 1 ], [ 1 ], [ 0.5 ] }, { [ 2 ], [ 4 ], [ 0.333333 ] }, ….

Page 61: HDF5 Advanced Topics

Reading compound dataset

/* Create datatype in memory and read data. */

dataset = H5Dopen(file, DATASETNAME);
s2_tid  = H5Dget_type(dataset);
mem_tid = H5Tget_native_type(s2_tid, H5T_DIR_DEFAULT);
s1      = malloc(H5Tget_size(mem_tid) * number_of_elements);
status  = H5Dread(dataset, mem_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s1);

Note:
• We could construct the memory type as we did in the writing example
• For general applications we need to discover the type in the file, find the corresponding memory type, allocate space, and then read

Page 62: HDF5 Advanced Topics

Reading compound dataset: subsetting by fields

typedef struct s2_t {
    double c;
    int    a;
} s2_t;

s2_t s2[LENGTH];
…
s2_tid = H5Tcreate(H5T_COMPOUND, sizeof(s2_t));
H5Tinsert(s2_tid, "c_name", HOFFSET(s2_t, c), H5T_NATIVE_DOUBLE);
H5Tinsert(s2_tid, "a_name", HOFFSET(s2_t, a), H5T_NATIVE_INT);
…
status = H5Dread(dataset, s2_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s2);

Page 63: HDF5 Advanced Topics

Reading compound dataset: subsetting by fields

Another way to create a compound datatype

#include "H5LTpublic.h"
…

s2_tid = H5LTtext_to_dtype("H5T_COMPOUND {H5T_NATIVE_DOUBLE \"c_name\"; H5T_NATIVE_INT \"a_name\"; }", H5LT_DDL);

Page 64: HDF5 Advanced Topics

Reference Datatype

• Reference to an HDF5 object
  • Pointer to a group, dataset, or named datatype in a file
  • Predefined datatype H5T_STD_REF_OBJ
  • Created with H5Rcreate, followed by H5Rdereference
• Reference to a dataset region (selection)
  • Pointer to a dataspace selection
  • Predefined datatype H5T_STD_REF_DSETREG
  • Created with H5Rcreate, followed by H5Rdereference

Page 65: HDF5 Advanced Topics

Reference to Object

h5-ref2obj.c

[Figure: layout of REF_OBJ.h5 - the root group contains GROUP1 (which contains GROUP2), the INTEGERS dataset, the named datatype MYTYPE, and a dataset of object references]

Page 66: HDF5 Advanced Topics

Reference to Object

h5dump REF_OBJ.h5

DATASET "OBJECT_REFERENCES" {

DATATYPE H5T_REFERENCE

DATASPACE SIMPLE { ( 4 ) / ( 4 ) }

DATA {

(0): GROUP 808 /GROUP1 , GROUP 1848 /GROUP1/GROUP2 ,

(2): DATASET 2808 /INTEGERS , DATATYPE 3352 /MYTYPE

}

Page 67: HDF5 Advanced Topics

Reference to object

• Create a reference to a group object
  H5Rcreate(&ref[1], fileid, "/GROUP1/GROUP2", H5R_OBJECT, -1);
• Write the references to a dataset
  H5Dwrite(dsetr_id, H5T_STD_REF_OBJ, H5S_ALL, H5S_ALL, H5P_DEFAULT, ref);
• Read the references back with H5Dread and find the object one of them points to
  type_id   = H5Rdereference(dsetr_id, H5R_OBJECT, &ref_out[3]);
  name_size = H5Rget_name(dsetr_id, H5R_OBJECT, &ref_out[3], (char*)buf, 10);
  • buf will contain "/MYTYPE", name_size will be 8 (accommodating "\0")
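
A condensed sketch of the round trip (1.8-style signatures; fileid and dsetr_id are assumed to exist, and only one reference is created explicitly - the others would be created the same way):

hobj_ref_t ref[4], ref_out[4];
char       buf[10];

/* Create the references and store them in a dataset */
H5Rcreate(&ref[1], fileid, "/GROUP1/GROUP2", H5R_OBJECT, -1);
/* ... ref[0], ref[2], ref[3] created the same way ... */
H5Dwrite(dsetr_id, H5T_STD_REF_OBJ, H5S_ALL, H5S_ALL, H5P_DEFAULT, ref);

/* Read them back, open the referenced object, and recover its name */
H5Dread(dsetr_id, H5T_STD_REF_OBJ, H5S_ALL, H5S_ALL, H5P_DEFAULT, ref_out);
hid_t   obj_id    = H5Rdereference(dsetr_id, H5R_OBJECT, &ref_out[3]);
ssize_t name_size = H5Rget_name(dsetr_id, H5R_OBJECT, &ref_out[3],
                                buf, sizeof(buf));        /* buf -> "/MYTYPE" */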

Page 68: HDF5 Advanced Topics

Reference to dataset region (h5-ref2reg.c)

[Figure: layout of REF_REG.h5 - the root group contains the MATRIX dataset and a dataset of region references]

MATRIX:
1 1 2 3 3 4 5 5 6
1 2 2 3 4 4 5 6 6

Page 69: HDF5 Advanced Topics

Reference to Dataset Region

%h5dump REF_REG.h5

HDF5 "REF_REG.h5" {

GROUP "/" {

DATASET "MATRIX" {

DATATYPE H5T_STD_I32LE

DATASPACE SIMPLE { ( 2, 9 ) / ( 2, 9 ) }

DATA {

(0,0): 1, 1, 2, 3, 3, 4, 5, 5, 6,

(1,0): 1, 2, 2, 3, 4, 4, 5, 6, 6

}

Page 70: HDF5 Advanced Topics

Reference to Dataset Region

%h5dump REF_REG.h5

DATASET "REGION_REFERENCES" {

DATATYPE H5T_REFERENCE

DATASPACE SIMPLE { ( 2 ) / ( 2 ) }

DATA {

(0): DATASET /MATRIX {(0,3)-(1,5)},

(1): DATASET /MATRIX {(0,0), (1,6), (0,8)}

}

}

}

}

Page 71: HDF5 Advanced Topics

Reference to Dataset Region

• Create a reference to a dataset region
  H5Sselect_hyperslab(space_id, H5S_SELECT_SET, start, NULL, count, NULL);
  H5Rcreate(&ref[0], file_id, "MATRIX", H5R_DATASET_REGION, space_id);
• Write the references to a dataset
  H5Dwrite(dsetr_id, H5T_STD_REF_DSETREG, H5S_ALL, H5S_ALL, H5P_DEFAULT, ref);

Page 72: HDF5 Advanced Topics

Reference to Dataset Region

• Read the references back with H5Dread and find the region one of them points to
  dsetv_id = H5Rdereference(dsetr_id, H5R_DATASET_REGION, &ref_out[0]);
  space_id = H5Rget_region(dsetr_id, H5R_DATASET_REGION, &ref_out[0]);
• Read the selection
  H5Dread(dsetv_id, H5T_NATIVE_INT, H5S_ALL, space_id, H5P_DEFAULT, data_out);

Page 73: HDF5 Advanced Topics

Questions? Comments?

?

Thank you!