This page explains how to use ncremap. It describes the steps necessary to create grids and to regrid datasets between different grids. Some of the simpler regridding options supported by ncclimo are also described at Generate, Regrid, and Split Climatologies (climo files) with ncclimo. This page describes those features in more detail, along with other, more boutique features often useful for custom regridding solutions.

The Zen of Regridding

Most modern climate/weather-related research requires a regridding step in its workflow. The plethora of geometric and spectral grids on which model and observational data are stored ensures that regridding is usually necessary for scientific insight, especially in the focused and variable-resolution studies that E3SM models conduct. Why does such a common procedure seem so complex? Because a mind-boggling number of options are required to support advanced regridding features that many users never need. To defer that complexity, this HOWTO begins with solutions to the prototypical regridding problem, without mentioning any other options. It demonstrates how to solve that problem simply, including the minimal software installation required. Then we solve the same simple problem in more powerful (workflow-oriented, parallel) ways. Once the basic regridding vocabulary has been introduced, this HOWTO doubles back to explain how to solve the prototype problem when one or more inputs are "missing", or need to be created. The HOWTO ends with descriptions of different regridding modes and workflows that use features customized to particular models, observational datasets, and formats. The overall organization, including TBD sections (suggest others, or vote for prioritizing, below), is:

Software Requirements
Prototypical Regridding I: Use Existing Map-file
Prototypical Regridding II: Create Map-file from Known Grid-files
Prototypical Regridding III: Infer Grid-file from Data-file
Prototypical Regridding IV: Manual Grid-file Generation
Intermediate Regridding I: MPAS-mode (TBD)
Intermediate Regridding II: Renormalization (TBD)
Intermediate Regridding III: TempestRemap (TBD)
Intermediate Regridding IV: Parallelism (TBD)
Advanced Regridding I: Regional SE Output (RRG-mode) (Done!)
Advanced Regridding II: Sub-Gridscale Regridding (SGS-mode) (TBD)
Advanced Regridding III: Make All Weight Files (MWF-mode) (TBD)

Software Requirements:

At a minimum, install a recent version of NCO on your executable $PATH with the corresponding library on your $LD_LIBRARY_PATH. NCO installation instructions are here (fxm: link). We highly recommend installing NCO through the conda package. That will automatically install another important piece of the regridding toolchain, the ESMF_RegridWeightGen (aka ERWG) executable. Execute 'ncremap --config' to verify you have a working installation:

zender@aerosol:~$ ncremap --config
ncremap, the NCO regridder and map- and grid-generator, version 4.7.6-alpha03
...[Legal Stuff]...
Config: ncremap running from directory /Users/zender/bin
Config: Calling NCO binaries in directory /Users/zender/bin
Config: Binaries linked to netCDF library version 4.4.1.1
Config: No hardcoded path/module overrides
Config: ESMF weight-generation command ESMF_RegridWeightGen found as /opt/local/bin/ESMF_RegridWeightGen
Config: Tempest weight-generation command GenerateOfflineMap found as /usr/local/bin/GenerateOfflineMap

Only NCO is required for basic ncremap operation. To create new map-files you will want ERWG. TempestRemap is not yet available in a pre-packaged format and must be built from scratch. It is only required for power-users. Make sure ncremap reports a working configuration, as above, before proceeding further.
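As a complement to 'ncremap --config', the following sketch reports which of the executables named above are visible on the current $PATH. The conda command in the comment assumes the conda-forge channel and is not executed; adapt it to your own environment.

```shell
# Optional install step, assuming the conda-forge channel (not run here):
#   conda install -c conda-forge nco

n_chk=0   # number of tools checked
n_fnd=0   # number of tools found
for tool in ncremap ncks ESMF_RegridWeightGen GenerateOfflineMap; do
    n_chk=$((n_chk + 1))
    if command -v "${tool}" >/dev/null 2>&1; then
        n_fnd=$((n_fnd + 1))
        echo "found:   ${tool} at $(command -v "${tool}")"
    else
        echo "missing: ${tool}"
    fi
done
echo "${n_fnd} of ${n_chk} tools found"
```

A report that lists ncremap and ncks (both part of NCO) as found is enough for basic operation. A missing GenerateOfflineMap only matters for TempestRemap workflows.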

Prototypical Regridding I: Use Existing Map-file

The regridding problem most commonly faced is converting output from a standard resolution model simulation to equivalent data on a different grid for visualization or intercomparison with other data. The EAM v1 model low-resolution simulations are performed and output on the ne30np4 SE (spectral element) grid, aka the "source grid". Data on this source grid have only one horizontal dimension (i.e., they are 1D), which makes them difficult to visualize. The recommended (fxm: link) 2D grid for analysis of these simulations is the 129x256 FV (finite-volume) grid, aka the "destination grid". The single most important capability of a regridder is the intelligent application of weights that transform data on the input grid to the desired output grid. These weights are stored in a "map-file", a single file that contains all the necessary weights and grid information. Most regridding problems revolve around creating the appropriate map-file. This prototype problem is well-trod ground, so the appropriate map-file (map.nc) already exists (fxm: link) and ncremap can immediately transform the input dataset (dat_src.nc) to the output (regridded) dataset (dat_rgr.nc):

...

When an output directory is not specified, ncremap writes to the current working directory. When the output and input directories are the same, ncremap appends a string (based on the destination grid resolution) to each input filename to avoid name collisions. Finally, be aware that multiple-file invocations of ncremap execute serially by default. Power users will want to parallelize this as described in the section on "Intermediate Regridding".
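The single-file and multiple-file invocations described above might look like the following sketch. The names map.nc, dat_src.nc, and dat_rgr.nc come from the text; the directory names drc_in and drc_rgr are placeholders. The commands are printed first, and the single-file command is executed only when ncremap and its inputs are actually present.

```shell
map=map.nc          # existing map-file for the prototype problem
drc_in=drc_in       # placeholder input directory
drc_rgr=drc_rgr     # placeholder output directory

# Regrid one file: map-file, then input and output dataset names
cmd_one="ncremap -m ${map} dat_src.nc dat_rgr.nc"
# Regrid every file in drc_in, writing results to drc_rgr
cmd_all="ncremap -m ${map} -I ${drc_in} -O ${drc_rgr}"

printf '%s\n' "${cmd_one}" "${cmd_all}"

# Execute the single-file form only if ncremap and its inputs exist
if command -v ncremap >/dev/null 2>&1 && [ -f "${map}" ] && [ -f dat_src.nc ]; then
    ${cmd_one}
fi
```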

...

Prototypical Regridding II: Create Map-file from Known Grid-files

The simplest regridding procedure applies an existing map-file to your data, as in the above example. If the desired map-file cannot be found, then you must create it. Creating a map-file requires a complete specification of both source and destination grids. The files that contain these grid specifications are called "grid-files". E3SM grid-files are available here (fxm: link). At most DOE High Performance Computing (HPC) centers these can also be found in my (@czender's) directory as ~zender/data/grids. Take a minute now to look there for the prototype problem grid-files, i.e., for FV 129x256 and ne30np4 grid-files.
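Once both grid-files are located, map-file creation might look like the sketch below. The names grd_src.nc and grd_dst.nc are placeholders for the ne30np4 and FV 129x256 grid-files, and map.nc is the map-file to be created; the -s and -g options name the source and destination grid-files, respectively. Weight generation relies on ERWG being installed.

```shell
grd_src=grd_src.nc   # placeholder: source (ne30np4) grid-file
grd_dst=grd_dst.nc   # placeholder: destination (FV 129x256) grid-file
map=map.nc           # map-file to create

# Step 1: generate the map-file from the two grid-files
cmd_map="ncremap -s ${grd_src} -g ${grd_dst} -m ${map}"
# Step 2: apply the new map-file to the data
cmd_rgr="ncremap -m ${map} dat_src.nc dat_rgr.nc"

printf '%s\n' "${cmd_map}" "${cmd_rgr}"

# Execute only if ncremap and all inputs exist
if command -v ncremap >/dev/null 2>&1 \
   && [ -f "${grd_src}" ] && [ -f "${grd_dst}" ] && [ -f dat_src.nc ]; then
    ${cmd_map} && ${cmd_rgr}
fi
```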

...

One valid option argument for each supported interpolation type is shown separated by vertical bars. The arguments shown have multiple synonyms that are equivalent. For example, "-a conserve" is equivalent to "-a aave" and to "--alg_typ=conservative". Use the longer option form for clarity and precision, and the shorter form for conciseness. The full list of synonyms, and the complete documentation, is at http://nco.sf.net/nco.html#alg_typ. "bilinear" and "conservative" are the most-used algorithms. Peruse the list of options now, though defer a thorough investigation until you reach the "Intermediate Regridding" section.
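To compare the two most-used algorithms, one might generate a separate map-file for each, as in this sketch. The grid-file names and the map_<algorithm>.nc naming scheme are placeholders chosen for illustration.

```shell
grd_src=grd_src.nc   # placeholder source grid-file
grd_dst=grd_dst.nc   # placeholder destination grid-file
map_lst=''           # accumulates the names of the map-files requested

for alg in bilinear conserve; do
    map="map_${alg}.nc"
    cmd="ncremap -a ${alg} -s ${grd_src} -g ${grd_dst} -m ${map}"
    echo "${cmd}"
    map_lst="${map_lst:+${map_lst} }${map}"
    # Execute only if ncremap and both grid-files exist
    if command -v ncremap >/dev/null 2>&1 \
       && [ -f "${grd_src}" ] && [ -f "${grd_dst}" ]; then
        ${cmd}
    fi
done
echo "map-files requested: ${map_lst}"
```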

...

Prototypical Regridding III: Infer Grid-file from Data-file

Thus far we have explained how to apply a map-file to data, and how, if necessary, to generate a map-file from known grids. What if there is no map-file and the source or the destination grid-files (or both) are unavailable? Often, one knows the basic information about a grid (e.g., resolution) but lacks the grid-file that contains the complete information for that grid geometry. In such cases, one must create the grid-file via one of two methods. First, one can let ncremap attempt to infer the grid-file from a data file known to be on the desired grid. This procedure is called "inferral" and is fairly painless. Second, one can feed NCO all the required parameters and it will generate a grid-file. This requires a precise specification of the grid geometry, and will be covered in the sub-section on "Manual Grid-file Generation".
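Inferral of the destination grid might be requested as in this sketch, where the -d option names a data file (dat_dst.nc, a placeholder) that is already on the desired grid. The sketch assumes the source grid can likewise be inferred from dat_src.nc; when it cannot, supply the source grid-file explicitly with -s.

```shell
# Infer the destination grid from dat_dst.nc, then regrid dat_src.nc to it
cmd_nfr="ncremap -d dat_dst.nc dat_src.nc dat_rgr.nc"
echo "${cmd_nfr}"

# Execute only if ncremap and both data files exist
if command -v ncremap >/dev/null 2>&1 && [ -f dat_dst.nc ] && [ -f dat_src.nc ]; then
    ${cmd_nfr}
fi
```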

...

ncremap -M -d dat_dst.nc -I drc_in -O drc_rgr

...

Prototypical Regridding IV: Manual Grid-file Generation

If a desired grid-file is unavailable, and no dataset on that grid is available (so inferral cannot be used), then one must manually create a new grid. Users create new grids for many reasons including dataset intercomparisons, regional studies, and fine-tuned graphics. NCO and ncremap support manual generation of the most common rectangular grids as SCRIP-format grid-files. Create a grid by supplying ncremap with a grid-file name and a "grid-formula" (grd_sng) that consists, at a minimum, of the grid resolution. The grid-formula is a hash-separated string of name-value pairs, each representing a grid parameter. All parameters except grid resolution have reasonable defaults, so a grid-formula can be as simple as "latlon=180,360":
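A minimal sketch of manual grid generation follows. The -G option carries the grid-formula and -g names the grid-file to write. The output file names are placeholders, and the second formula, with its lat_typ and lon_typ values, is an assumed specification of the 129x256 FV grid rather than one taken from this page.

```shell
# Simplest case: only the resolution is given, all else defaults
grd_sng_min='latlon=180,360'
# Assumed formula for an FV 129x256 grid with Greenwich-centered longitudes
grd_sng_fv='latlon=129,256#lat_typ=fv#lon_typ=grn_ctr'

cmd_min="ncremap -G ${grd_sng_min} -g grd_180x360.nc"
cmd_fv="ncremap -G ${grd_sng_fv} -g grd_129x256.nc"

printf '%s\n' "${cmd_min}" "${cmd_fv}"

# Execute only if ncremap is installed; this writes both grid-files
# into the current directory
if command -v ncremap >/dev/null 2>&1; then
    ${cmd_min} && ${cmd_fv}
fi
```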

...

The "extra" quotation marks protect the spaces in the title string from being interpreted as option separators.

Intermediate Regridding:

The sections on Prototypical Regridding were intended to be read sequentially and introduced the most frequently required ncremap features. The Intermediate and Advanced regridding sections are an a la carte description of features most useful to particular component models, workflows, and data formats. Peruse these sections in any order.

Intermediate Regridding I: MPAS-mode

Intermediate Regridding II: Renormalization

Intermediate Regridding III: TempestRemap

Intermediate Regridding IV: Parallelism

Advanced Regridding I: Regional SE Output (RRG-mode)

EAM and CAM-SE will produce regional output if requested to with the finclNlonlat namelist parameter. Output for a single region can have higher temporal resolution than the host global simulation. This facilitates detailed yet economical regional process studies. Regional output files are in a special format that we call RRG (for "regional regridding"). An RRG file may contain any number of rectangular regions. The coordinates and variables for one region do not interfere with other (possibly overlapping) regions because all variables and dimensions are named with a per-region suffix string, e.g., lat_128e_to_134e_9s_to_16s. ncremap can easily regrid RRG output from an FV-dycore because ncremap can infer (as discussed above) the regional grid from any FV data file. Regridding regional SE data, however, is more complex because SE gridcells are essentially weights without vertices and SE weight-generators are not yet flexible enough to generate the regional weights. To summarize, regridding RRG data leads to three SE-specific difficulties (#1-3 below) and two difficulties (#4-5) shared with FV RRG files:

1. RRG files contain only regional gridcell center locations, not weights
2. Global SE grids have well-defined weights not vertices for each gridpoint
3. Grid generation software (ESMF and TempestRemap) only create global not regional SE grid files
4. Non-standard variable names and dimension names
5. Regional files can contain multiple regions

ncremap's RRG mode resolves these issues to allow trouble-free regridding of SE RRG files. The user must provide two additional input arguments, '--dat_glb=dat_glb' and '--grd_glb=grd_glb' to point to a global SE dataset and grid, respectively, of the same resolution as the model that generated the RRG datasets. Hence a typical RRG regridding invocation is:

ncremap --dat_glb=dat_glb.nc --grd_glb=grd_glb.nc -g grd_rgn.nc dat_rgn.nc dat_rgr.nc

Here grd_rgn is a regional destination grid-file, dat_rgn is the RRG file to regrid, and dat_rgr is the regridded output. Typically grd_rgn is a uniform rectangular grid covering the same region as the RRG file. Generate this as described in the last example in the section above on "Manual Grid-file Generation". grd_glb is the standard dual-grid grid-file for the SE resolution of the simulation, e.g., ne30np4_pentagons.091226.nc. ncremap regrids the global data file dat_glb to the global dual-grid in order to produce an intermediate global file annotated with gridcell vertices. Then it hyperslabs the lat/lon coordinates (and vertices) from the regional domain to use in regridding the RRG file. A dat_glb file with only one 2D field suffices (and is fastest) for producing the information needed by the RRG procedure. One can prepare an optimal dat_glb file by subsetting any 2D variable (e.g., ncks -v FSNT in.nc dat_glb.nc) from a full global SE output dataset.
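The two steps just described, preparing a small dat_glb file and then regridding the RRG file, can be sketched together. All file names are the placeholders used in the text, and FSNT stands in for any 2D variable present in the global dataset in.nc.

```shell
# Step 1: subset a single 2D variable from a full global SE dataset
cmd_sbs="ncks -v FSNT in.nc dat_glb.nc"
# Step 2: regrid the RRG file using the global dataset and dual-grid grid-file
cmd_rrg="ncremap --dat_glb=dat_glb.nc --grd_glb=grd_glb.nc -g grd_rgn.nc dat_rgn.nc dat_rgr.nc"

printf '%s\n' "${cmd_sbs}" "${cmd_rrg}"

# Execute only if the NCO tools and every input file exist
if command -v ncks >/dev/null 2>&1 && command -v ncremap >/dev/null 2>&1 \
   && [ -f in.nc ] && [ -f grd_glb.nc ] && [ -f grd_rgn.nc ] && [ -f dat_rgn.nc ]; then
    ${cmd_sbs} && ${cmd_rrg}
fi
```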

ncremap RRG mode supports two additional options to override parameters set internally. First, the per-region suffix string may be set with '--rnm_sng=rnm_sng'. RRG mode will, by default, regrid the first region it finds in an RRG file. Explicitly set the desired region with rnm_sng for files with multiple regions, e.g., "--rnm_sng='_128e_to_134e_9s_to_16s'". Second, the bounding-box of the region may be explicitly set with '--bb_wesn=lon_wst,lon_est,lat_sth,lat_nrt'. The normal parsing of the bounding-box string from the suffix string may fail in (as yet undiscovered) corner cases, and the "--bb_wesn" option provides a workaround. The bounding-box string must include the entire RRG region, specified in WESN order. The two override options may be used independently or together, as in:

ncremap --rnm_sng='_128e_to_134e_9s_to_16s' --bb_wesn='128.0,134.0,-16.0,-9.0' --dat_glb=dat_glb.nc --grd_glb=grd_glb.nc -g grd_rgn.nc dat_rgn.nc dat_rgr.nc
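To illustrate how the suffix string relates to the bounding-box, the sketch below derives a WESN string from a suffix. It is a hypothetical helper, not ncremap's internal parser, and it assumes integer degrees in a suffix of the form _<lon>e_to_<lon>e_<lat>_to_<lat>, with s and w hemispheres taken as negative.

```shell
rnm_sng='_128e_to_134e_9s_to_16s'

# Convert a token such as 128e or 9s to a signed integer (s and w are negative)
to_signed() {
    case "$1" in
        *s|*w) echo "-${1%?}" ;;
        *)     echo "${1%?}" ;;
    esac
}

# Fields are separated by underscores; field 1 is empty, fields 3 and 6 are "to"
lon_wst=$(to_signed "$(echo "${rnm_sng}" | cut -d_ -f2)")
lon_est=$(to_signed "$(echo "${rnm_sng}" | cut -d_ -f4)")
lat_a=$(to_signed "$(echo "${rnm_sng}" | cut -d_ -f5)")
lat_b=$(to_signed "$(echo "${rnm_sng}" | cut -d_ -f7)")

# Order the two latitudes so that south precedes north
if [ "${lat_a}" -lt "${lat_b}" ]; then
    lat_sth=${lat_a}; lat_nrt=${lat_b}
else
    lat_sth=${lat_b}; lat_nrt=${lat_a}
fi

bb_wesn="${lon_wst}.0,${lon_est}.0,${lat_sth}.0,${lat_nrt}.0"
printf '%s\n' "--rnm_sng='${rnm_sng}' --bb_wesn='${bb_wesn}'"
```

For the suffix shown, the derived string matches the bounding-box used in the invocation above.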

RRG-mode supports most normal ncremap options, including input and output methods and regridding algorithms.

Advanced Regridding II: Sub-Gridscale Regridding (SGS-mode)

Advanced Regridding III: Make All Weight Files (MWF-mode)