This page is devoted to instruction in ncremap. It describes steps necessary to create grids, and to regrid datasets between different grids with ncremap. Some of the simpler regridding options supported by ncclimo are also described at Generate, Regrid, and Split Climatologies (climo files) with ncclimo. This page describes those features in more detail, and other, more boutique features often useful for custom regridding solutions.

The Zen of Regridding

Most modern climate/weather-related research requires a regridding step in its workflow. The plethora of geometric and spectral grids on which model and observational data are stored ensures that regridding is usually necessary to scientific insight, especially the focused and variable resolution studies that E3SM models conduct. Why does such a common procedure seem so complex? Because a mind-boggling number of options are required to support advanced regridding features that many users never need. To defer that complexity, this HOWTO begins with solutions to the prototypical regridding problem, without mentioning any other options. It demonstrates how to solve that problem simply, including the minimal software installation required. Once the basic regridding vocabulary has been introduced, we solve the prototype problem when one or more inputs are "missing", or need to be created. The HOWTO ends with descriptions of different regridding modes and workflows that use features customized to particular models, observational datasets, and formats. The overall organization, including TBD sections (suggest others, or vote for prioritizing, below), is:

...

zender@firn:~$ ncremap
...

-a alg_typ Algorithm for weight generation (default nco_con) [alg_typ, algorithm, regrid_algorithm]
ESMF algorithms: bilinear|conserve|conserve2nd|nearestdtos|neareststod|patch
NCO algorithms: nco_con (1st order conservative algorithm similar to "conserve")
Tempest algorithms: fv2fv|fv2fv_flx|fv2fv_stt|fv2se_flx|fv2se_stt|fv2se_alt|se2fv_flx|se2fv_stt|se2fv_alt|se2se|tempest

...

One valid option argument for each supported interpolation type is shown, separated by vertical bars. The arguments shown have multiple equivalent synonyms. For example, "-a conserve" is equivalent to "-a aave" and to "--alg_typ=conservative". Use the longer option form for clarity and precision, and the shorter form for conciseness. The full list of synonyms, and the complete documentation, is at http://nco.sf.net/nco.html#alg_typ. The NCO algorithm "nco_conserve" is the default algorithm. "bilinear" and "conservative" are commonly-used algorithms that invoke ERWG. TR options are discussed below. Peruse the list of options now, but defer a thorough investigation until you reach the "Intermediate Regridding" section.
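As an illustration of the synonyms described above, the following sketch prints three spellings of the same request. The file names (grd_src.nc, grd_dst.nc, map_src_to_dst_aave.nc) are hypothetical, and the NCREMAP variable is only a dry-run convenience of this sketch, not an ncremap feature: by default the commands are echoed rather than executed.

```shell
# Dry-run by default: commands are printed, not executed.
# Set NCREMAP=ncremap in the environment to run them for real.
NCREMAP=${NCREMAP:-echo ncremap}

# Hypothetical file names, for illustration only
src_grd=grd_src.nc
dst_grd=grd_dst.nc
map_fl=map_src_to_dst_aave.nc

# Three equivalent spellings of the same algorithm choice
conserve_variants() {
  $NCREMAP -a conserve -s "$src_grd" -g "$dst_grd" -m "$map_fl"
  $NCREMAP -a aave -s "$src_grd" -g "$dst_grd" -m "$map_fl"
  $NCREMAP --alg_typ=conservative -s "$src_grd" -g "$dst_grd" -m "$map_fl"
}

conserve_variants
```

In scripts that others will read, the long form (--alg_typ=conservative) is probably the easiest to understand at a glance.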

...

Thus far we have explained how to apply a map-file to data, and how, if necessary, to generate a map-file from known grids. What if there is no map-file and the source or the destination grid-files (or both) are unavailable? Often, one knows the basic information about a grid (e.g., resolution) but lacks the grid-file that contains the complete information for that grid geometry. In such cases, one must create the grid-file via one of two methods. First, one can let ncremap attempt to infer the grid-file from a data file known to be on the desired grid. This procedure is called "inferral" and is fairly painless. Second, one can feed ncremap all the required parameters (for rectangular grids only) and it will generate a grid-file. This requires a precise specification of the grid geometry, and is covered in the sub-section on "Manual Grid-file Generation".

Before we describe what the inferral procedure does, here is an example that demonstrates how easy it is. You can regrid an SE dataset from our prototype example to the same grid as an evaluation dataset. Pick any 2D (i.e., latxlon) dataset to compare the SE data to. Inferral uses the grid information in the evaluation dataset, which is already on the desired destination grid, to create the (internally generated) destination grid-file. Supply ncremap with any dataset on the desired destination grid (dat_dst.nc) with "-d" (for "dataset") instead of "-g" (for "grid"):

ncremap -s ne30np4_pentagons.091226.nc -d dat_dst.nc -m map_ne30np4_to_1x1_bilin.YYYYMMDD.nc

This tells ncremap to infer the destination grid-file and to use it to generate the desired map-file, named with the supposed destination resolution (here, 1x1 degree). To archive the inferred destination grid for later use, supply ncremap with a name for the grid with "-g":

ncremap -s ne30np4_pentagons.091226.nc -d dat_dst.nc -g grd_dst.nc -m map_ne30np4_to_1x1_bilin.YYYYMMDD.nc # Requires NCO version >= 4.7.6

Of course one can infer a grid without having to regrid anything. Supply ncremap with a data-template file (dat_dst.nc) and a grid-file name (grd.nc). Since there are no input files to regrid, ncremap exits after inferring the grid-file:

...

Grid-inferral is easier to try than manual grid-generation, and will work if the data file contains the necessary information. The only data needed to construct a SCRIP grid-file are the vertices of each gridcell. The gridcell vertices define the gridcell edges, and these in turn define the gridcell area, which is equivalent to the all-important weight parameter necessary to regrid data. Of course, the gridcell vertices must be stored with recognizable names and/or metadata indicators. The Climate & Forecast (CF) metadata convention calls the gridcell vertices the "cell-bounds". Coordinates (like latitude and longitude) usually store cell-center values, and should, according to CF, have "bounds" attributes whose values point to variables (e.g., "lat_bounds" or "lon_vertices") that contain the actual vertices. Relatively few datasets "in the wild" contain gridcell vertices, though the practice is, happily, on the rise. Formally, SE models have nodal points with weights, without any fixed perimeter or vertices assigned to the nodes, so the lack of vertices in the SE model output is a feature, not an oversight. The dual-grid (referenced above) addresses this by defining "pretend" gridcell vertices for each nodal point so that an SE dataset can be treated like an FV dataset.

Inferral works well on important categories of grids for which ncremap can guess the missing grid information. In the absence of gridcell vertex information, ncremap examines the location of and spacing between gridcell centers and can often determine from these what type of grid a data-file (not a grid-file!) is stored on. A data-file simply means the filetype preferred for creation/distribution of modeled/observed data. Hence ncremap has the (original and unique, so far as we know) ability to infer all useful rectangular grid-types from data-files that employ the grid. The key constraint here is "rectangular", meaning the grid must be orthogonal (though not necessarily regularly spaced) in latitude and longitude. This includes all uniform angle grids, FV grids, and Gaussian grids. For curvilinear grids (including most swath data), ncremap infers the underlying grid to be the set of curves that bisect the curves created by joining the gridcell centers. This often works well for curvilinear grids that do not cross a pole. Inferral works for unstructured (i.e., 1D) grids only when the cell-bounds are stored in the datafile as described above. Hence inferral will not work on raw output from SE models.
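To make the idea of deriving gridcell edges from gridcell centers concrete, here is a minimal one-dimensional sketch. It places each interior edge midway between adjacent centers and extrapolates the two outermost edges by half the neighboring spacing. This only illustrates the bisection idea for a single monotonic coordinate; it is not ncremap's actual inference code, which must also recognize grid types and treat poles correctly. The function name centers_to_edges is hypothetical.

```shell
# Read one cell-center coordinate per line; print "lower upper" edges per cell.
# Illustrative only: interior edges bisect neighboring centers,
# outer edges are extrapolated by half the adjacent spacing.
centers_to_edges() {
  awk '
    { ctr[NR] = $1 }
    END {
      if (NR < 2) { print "need at least two centers" > "/dev/stderr"; exit 1 }
      for (i = 1; i <= NR; i++) {
        lo = (i > 1) ? (ctr[i-1] + ctr[i]) / 2 : ctr[1] - (ctr[2] - ctr[1]) / 2
        hi = (i < NR) ? (ctr[i] + ctr[i+1]) / 2 : ctr[NR] + (ctr[NR] - ctr[NR-1]) / 2
        print lo, hi
      }
    }'
}

# Four uniform 90-degree longitude cells centered at 0, 90, 180, 270
printf '%s\n' 0 90 180 270 | centers_to_edges
```

The same midpoint rule applies to irregularly spaced (but still orthogonal) coordinates, which is why regular spacing is not required for the rectangular case.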

A few more examples will flesh-out how inferral can be used. First, ncremap can infer both source and destination grids in one command:

...

Here the user provides only data files (no grid- or map-files) yet still obtains regridded output! The first positional (i.e., not immediately preceded by an option indicator) argument (dat_src.nc) is interpreted as the data to regrid, and the second positional argument (dat_rgr.nc) is the output name. The -d argument is the name of any dataset (dat_dst.nc) on the desired destination grid. ncremap infers the source grid from dat_src.nc, then infers the destination grid from dat_dst.nc, then generates weights (with the default algorithm since none is specified) and creates a map-file that it uses to regrid dat_src.nc to dat_rgr.nc. No grid-file or map-file names were specified (with -g or -m) so both grid-files and the map-file are generated internally in temporary locations and erased after use.

Second, this procedure, like most ncremap features, works for multiple input files:

...

Unless a map-file or source grid-file is explicitly provided (with -m or -s, respectively), ncremap infers a separate source grid-file (and computes a corresponding map-file) for each input file. This allows it to regrid lists of uniquely gridded data (e.g., satellite swaths each on its own grid) to a common destination grid. When all source files are on the same grid (as is typical with models), turn off the expensive multiple inferral and map-generation procedures with the -M switch to save time:

ncremap -M -d dat_dst.nc -I drc_in -O drc_rgr

...

If a desired grid-file is unavailable, and no dataset on that grid is available (so inferral cannot be used), then one must manually create a new grid. Users create new grids for many reasons including dataset intercomparisons, regional studies, and fine-tuned graphics. NCO and ncremap support manual generation of the most common rectangular grids as SCRIP-format grid-files. Create a grid by supplying ncremap with a grid-file name and a "grid-formula" (grd_sng) that contains, at a minimum, the grid-resolution. The grid-formula is a hash-separated string of name-value pairs, each representing a grid parameter. All parameters except grid resolution have reasonable defaults, so a grid-formula can be as simple as "latlon=180,360":

ncremap -g grd.nc -G latlon=180,360

Congratulations! The new grid-file grd.nc is a valid source or destination grid for ncremap commands.

The grid-file generation documentation in the NCO Users Guide at http://nco.sf.net/nco.html#grid describes all the grid parameters and contains many examples. Note that the examples in this section use the grid-generation API of ncremap version 4.7.6 (August, 2018) and later. Earlier versions can use the ncks API explained in the Users Guide.
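Since the API depends on the NCO version, a script may want to check the version before choosing between the two. The sketch below compares dot-separated version strings numerically. The helper name version_ge and the example version 4.9.3 are hypothetical; how you obtain the installed version string (for instance by reading the output of "ncks --version") is left to the reader, since its exact format is not assumed here.

```shell
# Return success if version $1 >= version $2.
# Compares up to three dot-separated numeric fields (major.minor.patch).
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, x, "."); split(b, y, ".")
    for (i = 1; i <= 3; i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}

nco_vrs=4.9.3 # Hypothetical value: substitute your installed NCO version
if version_ge "$nco_vrs" 4.7.6; then
  echo "NCO $nco_vrs: ncremap grid-generation API available"
else
  echo "NCO $nco_vrs: use the ncks API described in the Users Guide"
fi
```

The fields are compared as numbers, not as text, so 4.10.0 correctly counts as newer than 4.7.6.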

The most useful grid parameters (besides resolution) are latitude type (lat_typ), longitude type (lon_typ), title (ttl), and, for regional grids, the SNWE bounding box (snwe). The three supported varieties of global rectangular grids are Uniform/equiangular (lat_typ=uni), Cap/FV (lat_typ=cap), and Gaussian (lat_typ=gss). The four supported longitude types place the first (westernmost) gridcell either centered at Greenwich (lon_typ=grn_ctr), with its western edge at Greenwich (lon_typ=grn_wst), or likewise at the Dateline (lon_typ=180_ctr and lon_typ=180_wst, respectively). By default, grids are global, uniform, and have their first longitude centered at Greenwich. The grid-formula for this is 'lat_typ=uni#lon_typ=grn_ctr'. Some examples (remember, this API requires NCO 4.7.6+):

...
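Because a grid-formula is simply name-value pairs joined by hash marks, it can be assembled programmatically. The sketch below is an assumption-laden illustration: the helper grid_formula, the output name grd_1x1.nc, and the NCREMAP dry-run variable are all hypothetical. By default it prints the ncremap command rather than running it.

```shell
# Join name=value arguments with '#' to form a grid-formula string.
# Runs in a subshell so the IFS change does not leak into the caller.
grid_formula() (
  IFS='#'
  printf '%s\n' "$*"
)

# Dry-run by default; set NCREMAP=ncremap in the environment to execute
NCREMAP=${NCREMAP:-echo ncremap}

# Spell out the defaults explicitly: uniform latitudes, first longitude centered at Greenwich
grd_sng=$(grid_formula latlon=180,360 lat_typ=uni lon_typ=grn_ctr)
$NCREMAP -g grd_1x1.nc -G "$grd_sng"
```

Quoting the grid-formula, as done with "$grd_sng" here, is a safe habit because the hash mark and any spaces in a title would otherwise be open to interpretation by the shell.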