NetCDF explainer
(this material was part of decision to use NetCDF3 for input files)
NetCDF data format
There are 5 variants of the format known as "netCDF": (See https://www.unidata.ucar.edu/software/netcdf/docs/faq.html#How-many-netCDF-formats-are-there-and-what-are-the-differences-among-them)
- the classic format (aka netCDF3) CDF-1
- the 64-bit offset format with 32-bit record addressing (aka netCDF3) CDF-2
  - Although the 64-bit offset format allows the creation of much larger netCDF files than was possible with the classic format, there are still some restrictions on the size of variables. It's important to note that without Large File Support (LFS) in the operating system, it's impossible to create any file larger than 2 GiB.
  - Assuming an operating system with LFS, the following restrictions apply to the netCDF 64-bit offset format.
    - No fixed-size variable can require more than 2^32 - 4 bytes (i.e. 4 GiB - 4 bytes, or 4,294,967,292 bytes) of storage for its data, unless it is the last fixed-size variable and there are no record variables.
    - When there are no record variables, the last fixed-size variable can be any size supported by the file system, e.g. terabytes.
    - A 64-bit offset format netCDF file can have up to 2^32 - 1 fixed-size variables, each under 4 GiB in size. If there are no record variables in the file, the last fixed-size variable can be any size.
    - No record variable can require more than 2^32 - 4 bytes of storage for each record's worth of data, unless it is the last record variable.
    - A 64-bit offset format netCDF file can have up to 2^32 - 1 records, of up to 2^32 - 1 variables, as long as the size of one record's data for each record variable except the last is less than 4 GiB - 4 bytes. Note also that all netCDF variables and records are padded to 4-byte boundaries.
- 64-bit offset + 64-bit record addressing (aka "64-bit data") CDF-5. This format has no practical limits on data size.
- the netCDF-4 format
- the netCDF-4 classic model format
The 2 netCDF-4 formats require a netCDF-4 library (with HDF5 underneath) to read. A netCDF-4 library can read all 5 variants.
PnetCDF, our preferred choice for high-performing parallel netCDF read/write, can only read/write the first 3 formats.
NetCDF documentation says the classic format is the most portable. Any netcdf variant library can read it.
If the file is in a netCDF-4 format but is opened with a PnetCDF library, an error message is the result.
To figure out the format:
- ncdump -k foo.nc (provided ncdump is from a build using netCDF4.0 or later).
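- od -An -c -N4 foo.nc, which prints the first 4 bytes of the file. The output depends on the format:
  - C D F 001 (classic, CDF-1)
  - C D F 002 (64-bit offset, CDF-2)
  - C D F 005 (64-bit data, CDF-5)
  - 211 H D F (either netCDF-4 format; the file is HDF5)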
- ncvalidator in PnetCDF. See Testing for NetCDF Compatibility/Validity
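The od check above can also be done from a script. The following is a minimal sketch, not part of any netCDF tool: the names classify_magic and netcdf_format are made up for illustration, and it assumes the signature sits at the very start of the file, as in the od output. Octal 211 in that output is the byte 0x89.

```python
# Magic numbers taken from the `od` output above (octal 211 == 0x89).
MAGIC_TO_FORMAT = {
    b"CDF\x01": "classic (CDF-1)",
    b"CDF\x02": "64-bit offset (CDF-2)",
    b"CDF\x05": "64-bit data (CDF-5)",
    b"\x89HDF": "netCDF-4 (HDF5)",
}


def classify_magic(first_bytes):
    """Map the first four bytes of a file to a netCDF format name."""
    return MAGIC_TO_FORMAT.get(bytes(first_bytes[:4]), "unknown")


def netcdf_format(path):
    """Read the first four bytes of `path` and classify them."""
    with open(path, "rb") as f:
        return classify_magic(f.read(4))
```

Calling netcdf_format("foo.nc") returns one of the names in the table, or "unknown". Like od, this cannot tell the netCDF-4 format from the netCDF-4 classic model format, since both start with the HDF5 signature; use ncdump -k for that.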
How much of our inputdata is in each format? See https://github.com/E3SM-Project/E3SM/issues/1970. Most of our files are in a NetCDF-3 format.
NetCDF File format
The format of the file is slightly different from the data format.
NetCDF-4 is really an HDF5 file. HDF5 utilities can read them.
NetCDF3 can be written in 3 versions: CDF-1 (classic data format), CDF-2 (64-bit offset format), and CDF-5 (64-bit data format). CDF-5 was added by PnetCDF and allows variables that have very large dimensions (greater than 2^32). See https://trac.mcs.anl.gov/projects/parallel-netcdf/wiki/FileLimits. NetCDF4.4 and above have added PnetCDF under their API to do parallel I/O with netCDF3 formats.
Size-dependent Issues
Each netCDF filetype imposes size limitations on the underlying data. Trying to exceed these limits leads to an NC_EVARSIZE error. The upshot is that only the CDF5 and netCDF4 formats are adequate for high-resolution coupled E3SM runs, since they (or at least the MPAS-O component) can produce multiple variables each exceeding 4 GB. For more details, read on.
The per-file limit of all netCDF formats is not less than 8 EiB on modern computers, so any NC_EVARSIZE error is almost certainly due to violating a per-variable limit.
Relevant limits:
- netCDF3 CDF1 format limits fixed variables to sizes smaller than 2^31 B = 2 GiB ~ 2.1 GB, and record variables to that size per record. A single variable may exceed this limit if and only if it is the last defined variable.
- netCDF3 CDF2 format limits fixed variables to sizes smaller than 2^32 B = 4 GiB ~ 4.3 GB, and record variables to that size per record. Any number of variables may reach, though not exceed, this size for fixed variables, or this size per record for record variables.
- The netCDF3 CDF5 and netCDF4 formats have no variable size limitations of real-world import.
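To see which formats a given variable fits in, the limits above can be turned into a quick calculation. This is a rough sketch under stated assumptions: the function names and the example shape are hypothetical, sizes are padded to 4-byte boundaries as netCDF does, and the exception for the last defined variable is ignored.

```python
from math import prod

# Ceilings in bytes per fixed variable, or per record for record variables,
# as stated in the list above. None means no practical limit.
LIMIT_BYTES = {
    "CDF1": 2**31,
    "CDF2": 2**32,
    "CDF5": None,
    "netCDF4": None,
}


def padded_size(shape, bytes_per_value):
    """Bytes needed for one variable (or one record), padded to 4 bytes."""
    raw = prod(shape) * bytes_per_value
    return (raw + 3) // 4 * 4


def fits(fmt, shape, bytes_per_value):
    """True if the padded size stays strictly below the format's ceiling."""
    limit = LIMIT_BYTES[fmt]
    return limit is None or padded_size(shape, bytes_per_value) < limit


# Hypothetical double-precision field: 3,600,000 columns x 80 levels per record.
shape = (3_600_000, 80)
for fmt in LIMIT_BYTES:
    print(fmt, fits(fmt, shape, 8))
```

For a record variable, pass the shape of a single record (leave out the record dimension). The example field needs 2,304,000,000 bytes per record, which is over the CDF1 ceiling but under the CDF2 one.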
If any input or output variable in an E3SM simulation exceeds these limits, choose a PIO format capacious enough, either netCDF3 CDF2, PnetCDF/netCDF3 CDF5, or netCDF4.
NetCDF library:
There are 4 ways to build a library that reads netCDF files. Our PIO library can interface to all of them at the same time if built properly. The PIO_TYPENAME is set at runtime to indicate which one is to be used for all I/O operations. The PIO_TYPENAME is in parentheses below.
- A netCDF-4 build without parallel support: reads/writes all variants in serial (netcdf)
- A netCDF-4 build with compression support: reads/writes with compression in serial (netcdf4c)
- A netCDF-4 build with parallel support: reads/writes all variants but only netCDF-4 in parallel (netcdf4p)
- A PnetCDF build: reads/writes only classic, 64-bit offset and 64-bit data in parallel. (pnetcdf)
One can do a netCDF-3-only build using latest netCDF4 source. This reads/writes classic and 64-bit offset in serial. PIO is not tested with this library version.
$MODEL_PIO_NETCDF_FORMAT in env_run.xml is used to set which of the CDF types is used by PIO when creating new files (so history and restart files written by E3SM). It is ignored for netCDF4.
Building the netCDF-4 library with parallel support requires corresponding parallel builds of HDF5 using MPI. And if one wants to use the F90 interface to NetCDF4p, then the fortran compiler used also matters.
To figure out which library options are available:
You first have to build an E3SM case. Then go to your case directory and run:
./xmlquery --valid-values PIO_TYPENAME
If the default PIO_TYPENAME specified in env_run.xml doesn't work for a file, PIO will quietly try all valid values of PIO_TYPENAME for that case to open the file.
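The fallback described above can be pictured with a small model. This is only an illustration of the idea, not PIO's actual code: the table restates which formats each PIO_TYPENAME reads according to the list of builds above, the function name first_working is made up, and netcdf4c is left out because the list does not say which formats it reads.

```python
NETCDF3_FORMATS = {"CDF-1", "CDF-2", "CDF-5"}
NETCDF4_FORMATS = {"netCDF-4", "netCDF-4 classic model"}

# Formats each PIO_TYPENAME can read, per the list of library builds above.
READABLE = {
    "netcdf": NETCDF3_FORMATS | NETCDF4_FORMATS,
    "netcdf4p": NETCDF3_FORMATS | NETCDF4_FORMATS,
    "pnetcdf": NETCDF3_FORMATS,
}


def first_working(file_format, default, valid_values):
    """Try the default type name first, then the other valid values in order.

    Returns the first type name able to read `file_format`, or None.
    """
    order = [default] + [t for t in valid_values if t != default]
    for typename in order:
        if file_format in READABLE.get(typename, set()):
            return typename
    return None


# A netCDF-4 input file with pnetcdf as the default falls through to netcdf.
print(first_working("netCDF-4", "pnetcdf", ["pnetcdf", "netcdf"]))
```

In this model a netCDF-4 file opened with a pnetcdf default ends up being read through netcdf, which is serial, so the fallback can change I/O performance without any visible message.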
Library Version and CDF5
CDF5 only became available by default to all (not just PnetCDF) users with netCDF library version 4.4.x. Unfortunately the CDF5 implementation in netCDF 4.4.x-4.6.0 was buggy, and netCDF >= 4.6.1 is required for bug-free analysis of CDF5 files. To be clear, the CDF5 written by PnetCDF is correct for all versions of PnetCDF; it is only that downstream applications using the netCDF implementation of CDF5 must use netCDF >= 4.6.1. To make matters more treacherous, netCDF versions 4.5.0-4.6.1 do not install CDF5 capability by default because Unidata wanted to be conservative until they were sure they had fixed the CDF5 bug (which the E3SM team first reported).
In light of this, E3SM should attempt to ensure that netCDF >= 4.6.1 is available and analysis tools are built with it. Otherwise, parallel PnetCDF-based solutions for model runs may lead to downstream problems with data analysis of the runs.
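An analysis script could guard against the older releases with a version check like the one below. This is a sketch under assumptions: the function names are hypothetical, and it expects a plain dotted numeric version, optionally preceded by a label such as "netCDF " (the form nc-config --version is assumed to print). Suffixes like "-rc1" are not handled.

```python
def parse_version(text):
    """Turn '4.6.1' or 'netCDF 4.6.1' into (4, 6, 1) for tuple comparison."""
    dotted = text.split()[-1]  # drop any leading label such as "netCDF"
    return tuple(int(part) for part in dotted.split("."))


def cdf5_safe(netcdf_version):
    """True if this netCDF release is 4.6.1 or later."""
    return parse_version(netcdf_version) >= (4, 6, 1)


for version in ("4.4.1.1", "4.6.0", "4.6.1"):
    print(version, cdf5_safe(version))
```

The check only covers the version number. A build from the 4.5.0-4.6.1 range may still have been configured without CDF5 support, so it does not prove that the library can actually open a CDF5 file.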