Advanced Utilities
Extending ncgen to support the
netCDF-4 Data Model
Dr. Dennis Heimbigner
Unidata netCDF Workshop
August 3-4, 2009
Overview
  The NCGEN4 Utility
  NCGEN4 Command Synopsis
  netCDF4 in CDL: Types
  netCDF4 in CDL: Typed Attributes
  netCDF4 in CDL: Groups
  Scope Rules
  Specifying Data Constants
  Rules for Using Braces
  Special Attributes
  Debugging Note: The Cycle
  Building and Installing NCGEN4
  Extended Example
The NCGEN4 Utility
  NCGEN4 extends the CDL language
    Includes all of the netCDF-4 data model
    True inverse of ncdump
    Includes the special attributes (chunking, etc.)
  Supports binary and C output; not FORTRAN (yet)
    Experimental: Java and NcML
  Can produce 4 kinds of binary netCDF files:
    netcdf-3 (classic) 32 bit
    netcdf-3 (classic) 64 bit
    netcdf-3 (classic, but stored in netcdf-4 file format)
    netcdf-4 (supports full netcdf-4 model)
NCGEN4 Command Synopsis
Primary Command Line Options
  [-b]                Create a (binary) netCDF file (default)
  [-o <file>]         Name for the binary netCDF file created
  [-k <file_format>]  Format of the file to be created
                        1 => classic 32 bit
                        2 => classic 64 bit
                        3 => netcdf-4/CDM
                        4 => classic, but stored in an enhanced file format
  [-x]                Don't initialize data with fill values
  [-l <language>]     Specify output language to use when generating source code to create or define a netCDF file matching the CDL specification
  <filename>          Input CDL file
netCDF4 in CDL: Types
  New section called “types:”
    Consistent with the output of ncdump
  Supports the new primitive data types:
    ubyte, ushort, uint, string, int64 (LL), uint64 (ULL)
  Supports the new user-defined types:
    int enum enum_t {off=0,on=1,unknown=2};
    opaque(11) opaque_t;
    compound cmpd_t { vlen_t f1; enum_t f2;};
    int(*) vlen_t;
netCDF4 in CDL: Typed Attributes
  vlen_t v:attr = {17, 18, 19};
  Attribute typing is optional (=> type inferred)
  Warning! x:attr = "abc"; is inferred to be type char, not string
    Instead say string x:attr = "abc";
    Why? For backward compatibility with ncgen
  Good practice to add “_t” to the end of all type names
    Why? Because X :attr = … might be interpreted incorrectly; is X a type or variable?
netCDF4 in CDL: Groups
  group: g {…}
  A group can itself contain dimensions, types, variables, and groups
  Name prefixing allows references to types and dimensions that reside in other groups
    Example: /g/cmpd_t
    => Do not use ‘/’ in your names
  Pretty much like the Unix file system
    Or Windows, but using forward slashes
Scope Rules
  Scope rules determine how references to a dimension or type without a prefix are interpreted
  General rule:
    1. Look in the immediately enclosing group
    2. Look in the parent of the immediately enclosing group, and so on up the enclosing groups
  For dimensions, if not found => error
  For types, continue to search the whole group tree to find a unique match, then error if not found
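The lookup order above can be sketched in Python. This is a minimal model of the rules as stated on the slide, not ncgen4's implementation: the `Group` class and the `resolve_dimension` / `resolve_type` helpers are hypothetical names, and treating more than one tree-wide match as an error is one reading of "unique match".

```python
class Group:
    """Hypothetical in-memory stand-in for a CDL group."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.dims = set()      # dimension names declared in this group
        self.types = set()     # type names declared in this group
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        # The root group has an empty path so that names read as /g/cmpd_t.
        if self.parent is None:
            return ""
        return self.parent.path() + "/" + self.name


def enclosing_groups(group):
    """Yield the group itself, then its parent, and so on up to the root."""
    while group is not None:
        yield group
        group = group.parent


def all_groups(root):
    """Yield every group in the tree rooted at root."""
    yield root
    for child in root.children:
        yield from all_groups(child)


def resolve_dimension(name, group):
    """Dimensions: search enclosing groups only; not found is an error."""
    for g in enclosing_groups(group):
        if name in g.dims:
            return g.path() + "/" + name
    raise LookupError("dimension not found: " + name)


def resolve_type(name, group):
    """Types: search enclosing groups, then the whole tree for a unique match."""
    for g in enclosing_groups(group):
        if name in g.types:
            return g.path() + "/" + name
    root = group
    while root.parent is not None:
        root = root.parent
    matches = [g for g in all_groups(root) if name in g.types]
    if len(matches) == 1:
        return matches[0].path() + "/" + name
    # Assumption: zero matches or several matches are both reported as errors.
    raise LookupError("type not found or ambiguous: " + name)


# Small tree: root contains groups g and h; cmpd_t lives in g, lat in root.
root = Group("root")
g = Group("g", root)
h = Group("h", root)
root.dims.add("lat")
g.dims.add("d")
g.types.add("cmpd_t")

print(resolve_type("cmpd_t", h))     # /g/cmpd_t
print(resolve_dimension("lat", h))   # /lat
```

In this model an unprefixed `cmpd_t` referenced from group h is found by the tree-wide search, while a reference from h to the dimension `d` (declared only in g) fails, because dimensions are looked up only in enclosing groups.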
Specifying Data Constants
  Constants for user-defined types require the use of braces {…} in certain places.
  dimensions:
    d=2;
  types:
    int(*) vlen_t;
    compound cmpd_t { int64 f1; string f2;};
  variables:
    vlen_t v1(d);
    cmpd_t v2(d);
  data:
    v1 = {7, 8, 9}, {17,18,19};
    v2 = {107LL, "abc"}, {1234567LL, "xyz"};
Rules for Using Braces
  The top level is automatically assumed to be a list of items, so it should not be inside {...}
    Different from C constant lists
  Instances of UNLIMITED dimensions (other than as the first dimension) must be surrounded by {...} in order to specify the size.
  Instances of VLENs must be surrounded by {...} in order to specify the size.
  Compound instances must be embedded in {...}
  Compound fields may optionally be embedded in {...}.
  No other use of braces is allowed.
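As an illustration of the VLEN and compound rules, here is a small Python sketch that renders nested Python values as CDL data constants. It is not part of ncgen4; the `Int64` marker class and the `cdl_constant` / `cdl_data` helpers are hypothetical, and the mapping (list for a VLEN instance, tuple for a compound instance) is an assumption made only for this example.

```python
class Int64(int):
    """Hypothetical marker for values that should carry the LL suffix."""


def cdl_constant(value):
    """Render one value; VLEN and compound instances get braces."""
    if isinstance(value, (list, tuple)):
        # list = VLEN instance, tuple = compound instance (assumed mapping)
        return "{" + ", ".join(cdl_constant(v) for v in value) + "}"
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    if isinstance(value, Int64):
        return str(int(value)) + "LL"
    return str(value)


def cdl_data(name, values):
    """Render a data: entry; the top-level list is NOT wrapped in braces."""
    return name + " = " + ", ".join(cdl_constant(v) for v in values) + ";"


print(cdl_data("v1", [[7, 8, 9], [17, 18, 19]]))
# v1 = {7, 8, 9}, {17, 18, 19};
print(cdl_data("v2", [(Int64(107), "abc"), (Int64(1234567), "xyz")]))
# v2 = {107LL, "abc"}, {1234567LL, "xyz"};
```

The two printed lines reproduce the data: section of the previous slide, with each VLEN or compound instance braced and the outermost list left bare.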
Special Attributes
  Special attributes specified in an ncgen4 CDL file will be properly handled
    Consistent with ncdump -s
  Global special attributes
    “_Format” – specify the netCDF file format
      “classic”
      “64-bit offset”
      “netCDF-4”
      “netCDF-4 classic model”
    Overridden by the -k flag
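The precedence between the -k flag and the “_Format” attribute can be sketched as follows. This is an illustrative Python model, not ncgen4 code: `effective_format` is a hypothetical helper, and the pairing of each -k number with a “_Format” string is inferred by lining up the two lists on the slides. What ncgen4 does when neither is given is not stated here, so the sketch returns None in that case.

```python
# Assumed pairing of -k values with _Format strings (inferred from the slides).
KIND_FLAG_TO_FORMAT = {
    1: "classic",
    2: "64-bit offset",
    3: "netCDF-4",
    4: "netCDF-4 classic model",
}


def effective_format(format_attr=None, kind_flag=None):
    """Return the format that wins: the -k flag overrides the _Format attribute."""
    if kind_flag is not None:
        if kind_flag not in KIND_FLAG_TO_FORMAT:
            raise ValueError("unknown -k value: %r" % (kind_flag,))
        return KIND_FLAG_TO_FORMAT[kind_flag]
    if format_attr is not None:
        if format_attr not in KIND_FLAG_TO_FORMAT.values():
            raise ValueError("unknown _Format value: %r" % (format_attr,))
        return format_attr
    return None  # default behavior is not covered by this sketch


print(effective_format(format_attr="classic", kind_flag=3))  # netCDF-4
print(effective_format(format_attr="64-bit offset"))         # 64-bit offset
```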
Special Attributes (cont.)
  Per-variable special attributes
    “_ChunkSizes” – list of chunk sizes, 1 per dimension
    “_DeflateLevel” – compression level: integer (0-9)
    “_Endianness” – “big” or “little”
    “_Fletcher32” – “true” or “false” to set check summing
    “_NoFill” – “true” or “false” to set persistent NoFill property
    “_Shuffle” – “true” or “false” to set shuffle filter
    “_Storage” – “contiguous” or “chunked” to set storage mode
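The allowed values listed above lend themselves to a quick pre-check before running ncgen4. The following Python sketch is only an illustration built from the slide: `is_valid_special_attribute` is a hypothetical helper, and requiring chunk sizes to be positive integers is an added assumption.

```python
# Allowed string values, copied from the slide.
STRING_CHOICES = {
    "_Endianness": {"big", "little"},
    "_Fletcher32": {"true", "false"},
    "_NoFill": {"true", "false"},
    "_Shuffle": {"true", "false"},
    "_Storage": {"contiguous", "chunked"},
}


def _is_plain_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_valid_special_attribute(name, value, rank=None):
    """Check one per-variable special attribute against the slide's rules.

    rank is the number of dimensions of the variable; when given,
    _ChunkSizes must supply exactly one chunk size per dimension.
    """
    if name in STRING_CHOICES:
        return isinstance(value, str) and value in STRING_CHOICES[name]
    if name == "_DeflateLevel":
        return _is_plain_int(value) and 0 <= value <= 9
    if name == "_ChunkSizes":
        sizes = list(value)
        if rank is not None and len(sizes) != rank:
            return False
        # Assumption: every chunk size must be a positive integer.
        return all(_is_plain_int(s) and s > 0 for s in sizes)
    raise KeyError("not a per-variable special attribute on the slide: " + name)


print(is_valid_special_attribute("_DeflateLevel", 9))               # True
print(is_valid_special_attribute("_Storage", "compact"))            # False
print(is_valid_special_attribute("_ChunkSizes", [1, 10, 5], rank=3))  # True
```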
Debugging Note
  Use the “Cycle”, Luke
    Use ncgen/ncgen4 to convert your <file>.cdl to <file>.nc
    Then use ncdump to convert your <file>.nc to <file2>.cdl
    Compare <file>.cdl to <file2>.cdl
  Watch out for UNLIMITED!
    dimensions: u = unlimited;
    variables: v1(u); v2(u);
    data: v1 = {1,2,3,4}; v2 = {7,8};
    Ncdump produces v2 = {7,8,_,_};
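The extra `_` entries appear because every variable that shares an unlimited dimension has that dimension's current length, and positions that were never written hold the fill value, which ncdump prints as `_`. A minimal Python sketch of that padding, using a hypothetical helper name, shows why the round trip does not return the original text:

```python
FILL = "_"  # how ncdump displays a fill value


def pad_to_unlimited(variables):
    """Pad each variable to the longest one, as if they shared one unlimited dimension."""
    length = max((len(values) for values in variables.values()), default=0)
    return {
        name: list(values) + [FILL] * (length - len(values))
        for name, values in variables.items()
    }


written = {"v1": [1, 2, 3, 4], "v2": [7, 8]}
dumped = pad_to_unlimited(written)
print(dumped["v2"])  # [7, 8, '_', '_']
```

When comparing <file>.cdl with <file2>.cdl, differences of this kind are expected and do not indicate a conversion error.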
Building and Installing NCGEN4
  Easy: add --enable-ncgen4 to your list of ./configure flags
  Ncgen4 will be installed along with ncdump and the original ncgen
Extended Example
netcdf foo {
types:
  ubyte enum enum_t {Clear = 0, Cumulonimbus = 1, Stratus = 2};
  opaque(11) opaque_t;
  int(*) vlen_t;
dimensions:
  lat = 10; lon = 5; time = unlimited ;
variables:
  long lat(lat), lon(lon), time(time);
  float Z(time,lat,lon), t(time,lat,lon);
  double p(time,lat,lon);
  long rh(time,lat,lon);
  string country(time,lat,lon);
  ubyte tag;
Extended Example (cont.)
  // variable attributes
  lat:long_name = "latitude";
  lat:units = "degrees_north";
  lon:long_name = "longitude";
  lon:units = "degrees_east";
  time:units = "seconds since 1992-1-1 00:00:00";
  // typed variable attributes
  string Z:units = "geopotential meters";
  float Z:valid_range = 0., 5000.;
  double p:_FillValue = -9999.;
  long rh:_FillValue = -1;
  vlen_t :globalatt = {17, 18, 19};
Extended Example (cont.)
data:
  lat = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
  lon = -140, -118, -96, -84, -52;
group g {
  types:
    compound cmpd_t { vlen_t f1; enum_t f2;};
} // group g
group h {
  variables:
    /g/cmpd_t compoundvar;
  data:
    compoundvar = { {3,4,5}, Stratus } ;
} // group h
}
Questions?