This page is devoted to instruction in ncremap. It describes steps necessary to create grids, and to regrid datasets between different grids with ncremap. Some of the simpler regridding options supported by ncclimo are also described at Generate, Regrid, and Split Climatologies (climo files) with ncclimo. This page describes those features in more detail, and other, more boutique features often useful for custom regridding solutions.

The Zen of Regridding

Most modern climate/weather-related research requires a regridding step in its workflow. The plethora of geometric and spectral grids on which model and observational data are stored ensures that regridding is usually necessary to scientific insight, especially the focused and variable resolution studies that E3SM models conduct. Why does such a common procedure seem so complex? Because a mind-boggling number of options are required to support advanced regridding features that many users never need. To defer that complexity, this HOWTO begins with solutions to the prototypical regridding problem, without mentioning any other options. It demonstrates how to solve that problem simply, including the minimal software installation required. Once the basic regridding vocabulary has been introduced, we solve the prototype problem when one or more inputs are "missing", or need to be created. The HOWTO ends with descriptions of different regridding modes and workflows that use features customized to particular models, observational datasets, and formats. The overall organization, including TBD sections (suggest others, or vote for prioritizing, below), is:

...

File-level parallelism accelerates throughput when regridding multiple files in one ncremap invocation, and has no effect when only one file is to be regridded. Note that the ncclimo and ncremap semantics for selecting file-level parallelism are identical, though their defaults differ (Background mode for ncclimo and Serial mode for ncremap). Select the desired mode with the argument to --par_typ=par_typ. Explicitly select Background mode with par_typ values of bck, background, or Background. The values mpi or MPI select MPI mode, and the values srl, serial, Serial, nil, and none all select Serial mode (which disables file-level parallelism, though it still allows intra-file OpenMP parallelism).
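The mapping from par_typ values to modes described above can be summarized in a short shell sketch. The function par_mode is a hypothetical helper written for illustration only; it is not part of NCO and does not reflect how ncremap parses its arguments internally.

```shell
#!/bin/sh
# Hypothetical helper (not part of NCO): report which file-level
# parallelism mode a given par_typ value selects, per the text above.
par_mode() {
  case "$1" in
    bck|background|Background)  echo "Background" ;;
    mpi|MPI)                    echo "MPI" ;;
    srl|serial|Serial|nil|none) echo "Serial" ;;
    *)                          echo "unknown" ;;
  esac
}

par_mode bck     # Background
par_mode MPI     # MPI
par_mode none    # Serial

# The corresponding invocations take the form (other arguments elided):
#   ncremap --par_typ=bck ...
#   ncremap --par_typ=mpi ...
#   ncremap --par_typ=srl ...
```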

The default file-level parallelism for ncremap is Serial mode (i.e., no file-level parallelism), in which ncremap processes one input file at a time. Background and MPI modes implement true file-level parallelism. Typically both of these parallel modes scale well with sufficient memory unless and until I/O contention becomes the bottleneck. In Background mode ncremap issues all commands to regrid the input file list as UNIX background processes on the local node. Nodes with multiple cores and sufficient RAM take advantage of this to simultaneously regrid multiple files. In MPI mode ncremap issues commands to regrid the input file list in round-robin fashion to all available compute nodes. Prior to NCO version 4.9.0 (released December 2019), Background and MPI parallelism modes both regridded all the input files at one time and there was no way to limit the number of files being simultaneously regridded. Subsequent versions allow finer-grained parallelism by introducing the ability to limit the number of discrete workflow elements or "jobs" (i.e., file regriddings) to perform simultaneously within an ncremap invocation or "workflow".

As of NCO version 4.9.0 (released December 2019), the --job_nbr=job_nbr option specifies the maximum number of files to regrid simultaneously on all nodes being harnessed by the workflow. Thus job_nbr is an additional parameter to fine-tune file-level parallelism (it has no effect in Serial mode). In both parallel modes ncremap spawns processes in batches of job_nbr jobs, then waits for those processes to complete. Once a batch finishes, ncremap spawns the next batch. In Background mode, all jobs are spawned to the local node. In MPI mode, all jobs are spawned in round-robin fashion to all available nodes until job_nbr jobs are running.
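The batching behavior described above (spawn job_nbr jobs, wait, spawn the next batch) can be illustrated with plain shell background processes. This is a minimal sketch of the pattern, not ncremap's actual implementation: regrid_one is a hypothetical stand-in that only prints a message, and the file names are placeholders.

```shell
#!/bin/sh
# Hypothetical stand-in for one regridding job; a real workflow
# would regrid the file named in "$1" here.
regrid_one() {
  echo "regridded $1"
}

job_nbr=2      # maximum number of simultaneous jobs
job_idx=0      # jobs spawned so far
batch_nbr=0    # batches completed so far

for fl in f1.nc f2.nc f3.nc f4.nc f5.nc; do
  regrid_one "$fl" &              # spawn the job in the background
  job_idx=$((job_idx + 1))
  if [ $((job_idx % job_nbr)) -eq 0 ]; then
    wait                          # batch is full: wait for it to finish
    batch_nbr=$((batch_nbr + 1))
  fi
done

# Wait for the final, partially filled batch (if any)
if [ $((job_idx % job_nbr)) -ne 0 ]; then
  wait
  batch_nbr=$((batch_nbr + 1))
fi

echo "files=$job_idx batches=$batch_nbr"
```

With five placeholder files and job_nbr=2, the loop runs three batches (2 + 2 + 1). The order of the "regridded" messages within a batch is not deterministic, since the jobs in a batch run concurrently.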

If regridding consumes so much RAM (e.g., because variables are large and/or the number of threads is large) that a single node can perform only one regridding job at a time, then a reasonable value for job_nbr is the number of nodes, node_nbr. Often, however, nodes can regrid multiple files simultaneously. It can be more efficient to spawn multiple jobs per node than to increase the threading per job because I/O contention for write access to a single file prevents threading from scaling indefinitely.

By default job_nbr = 2 in Background mode, and job_nbr = node_nbr in MPI mode. This helps prevent users from overloading nodes with too many jobs. Subject to the availability of adequate RAM, expand the number of jobs per node by increasing job_nbr until, ideally, each core on the node is used. Remember that processes and threading are multiplicative in core use: four jobs with four threads each consume sixteen cores.
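The multiplicative relationship between jobs and threads is simple enough to check with shell arithmetic before choosing job_nbr. The helper names cores_used and max_jobs below are hypothetical and exist only for this sketch; the calculation ignores RAM limits, which the text above notes must also be satisfied.

```shell
#!/bin/sh
# Hypothetical helpers for sizing a workflow on one node.

# cores_used JOBS THREADS: cores consumed by JOBS jobs of THREADS threads each
cores_used() {
  echo $(( $1 * $2 ))
}

# max_jobs CORES THREADS: most jobs of THREADS threads each that fit in CORES cores
max_jobs() {
  echo $(( $1 / $2 ))
}

cores_used 4 4    # four jobs of four threads: 16 cores
max_jobs 16 8     # a 16-core node holds 2 jobs of eight threads
```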

We have thus far demonstrated how to control file-level parallelism (with par_typ) and workflow-level parallelism (with job_nbr). The third level of parallelism mentioned above is that ncremap can use OpenMP shared-memory techniques to simultaneously regrid multiple variables within a single file. This shared-memory parallelism is efficient because it requires only a single copy of the regridding weights in physical memory to regrid multiple variables simultaneously. Even so, regridding multiple variables at high resolution may become memory-limited, meaning that insufficient RAM can limit the number of variables that the system can simultaneously regrid.

By convention all variables to be regridded share the same regridding weights stored in a map-file, so that only one copy of the weights needs to be in memory, just as in Serial mode. However, the per-thread (i.e., per-variable) OpenMP memory demands are considerable, with the memory required to regrid variables amounting to no less than about 5-7 times (for type NC_FLOAT) and 2.5-3.5 times (for type NC_DOUBLE) the size of the uncompressed variable. Memory requirements are so high because the regridder performs all arithmetic in double precision to retain the highest accuracy, and must allocate separate buffers to hold the input and output (regridded) variable, a tally array to count the number of missing values, and an array to sum the weights contributing to each output gridcell (the last two arrays are only necessary for variables with a _FillValue attribute). The input, output, and weight-sum arrays are always double precision, and the tally array is composed of four-byte integers. Given the high memory demands, one strategy to optimize thr_nbr for repetitious workflows is to keep doubling it (1, 2, 4, ...) until throughput stops improving. With sufficient memory, the NCO regridder scales well up to 8-16 threads.
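As a rough planning aid, the multipliers quoted above can be turned into an upper-end estimate of regridding memory for a given thread count. The sketch below is an illustration under stated assumptions, not an NCO tool: rgr_mem_mb is a hypothetical helper, it assumes every thread regrids a variable of the same uncompressed size, and it excludes the memory held by the shared regridding weights. Multipliers are passed in tenths (70 means 7.0) so that integer shell arithmetic can represent 2.5 and 3.5.

```shell
#!/bin/sh
# rgr_mem_mb SIZE_MB MULT_TENTHS THR_NBR
# Hypothetical estimate (in MB) of memory needed to regrid THR_NBR
# variables at once, each SIZE_MB uncompressed, using a multiplier
# given in tenths (e.g., 70 = 7.0x for NC_FLOAT, 35 = 3.5x for NC_DOUBLE).
# Assumption: all variables have the same size; weights are not counted.
rgr_mem_mb() {
  echo $(( $1 * $2 * $3 / 10 ))
}

# Upper-end estimates for eight threads:
rgr_mem_mb 200 70 8    # 200 MB NC_FLOAT variable, 7.0x multiplier
rgr_mem_mb 400 35 8    # same variable stored as NC_DOUBLE (400 MB), 3.5x
```

Both calls yield the same estimate (11200 MB), which is consistent with the explanation above: the working buffers are double precision regardless of the type stored on disk, so the NC_DOUBLE multiplier is half the NC_FLOAT multiplier applied to a variable twice as large.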

As an example, consider regridding 100 files with a single map. Say you have a five-node cluster, and each node has 16 cores and can simultaneously regrid two files using eight threads each. (One needs to test a bit to optimize these parameters.) Then an optimal (in terms of wallclock time) invocation would request five nodes with 10 simultaneous jobs of eight threads. On PBS or SLURM batch systems this would involve a scheduler command like qsub -l nodes=5 ... or sbatch --nodes=5 ..., respectively, followed by ncremap --par_typ=mpi --job_nbr=10 --thr_nbr=8 .... This job will likely complete between five and ten times faster than a Serial-mode invocation of ncremap to regrid the same files. The uncertainty range is due to unforeseeable, system-dependent load and I/O characteristics. Nodes that can simultaneously write to more than one file fare better with multiple jobs per node. Nodes with only one I/O channel to disk may be better exploited by utilizing more threads per process.
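The arithmetic behind this example can be written out in shell. The sketch below only computes the parameters and prints the resulting command line rather than executing it; the variable names (node_nbr, core_per_node, fl_nbr, job_per_node, batch_nbr) are chosen for illustration, and the remaining ncremap arguments are left elided as in the text.

```shell
#!/bin/sh
# Parameters from the example above
node_nbr=5          # nodes in the cluster
core_per_node=16    # cores on each node
thr_nbr=8           # OpenMP threads per regridding job
fl_nbr=100          # files to regrid

# Jobs that fit on one node, and across the whole cluster
job_per_node=$((core_per_node / thr_nbr))
job_nbr=$((node_nbr * job_per_node))

# Number of batches needed to process all files (rounded up)
batch_nbr=$(( (fl_nbr + job_nbr - 1) / job_nbr ))

echo "jobs per node: $job_per_node"
echo "job_nbr: $job_nbr"
echo "batches: $batch_nbr"

# Print (do not run) the invocation; other arguments are elided
echo "ncremap --par_typ=mpi --job_nbr=$job_nbr --thr_nbr=$thr_nbr ..."
```

With these inputs the sketch arrives at two jobs per node, job_nbr=10, and ten batches of ten files, matching the invocation quoted in the example.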

...