This page is devoted to instruction in ncremap. It describes steps necessary to create grids, and to regrid datasets between different grids with ncremap. Some of the simpler regridding options supported by ncclimo are also described at Generate, Regrid, and Split Climatologies (climo files) with ncclimo. This page describes those features in more detail, and other, more boutique features often useful for custom regridding solutions.

The Zen of Regridding

Most modern climate/weather-related research requires a regridding step in its workflow. The plethora of geometric and spectral grids on which model and observational data are stored ensures that regridding is usually necessary to scientific insight, especially the focused and variable resolution studies that E3SM models conduct. Why does such a common procedure seem so complex? Because a mind-boggling number of options are required to support advanced regridding features that many users never need. To defer that complexity, this HOWTO begins with solutions to the prototypical regridding problem, without mentioning any other options. It demonstrates how to solve that problem simply, including the minimal software installation required. Once the basic regridding vocabulary has been introduced, we solve the prototype problem when one or more inputs are "missing", or need to be created. The HOWTO ends with descriptions of different regridding modes and workflows that use features customized to particular models, observational datasets, and formats. The overall organization, including TBD sections (suggest others, or vote for prioritizing, below), is:

Software Requirements
Prototypical Regridding I: Use Existing Map-file
Prototypical Regridding II: Create Map-file from Known Grid-files
Prototypical Regridding III: Infer Grid-file from Data-file
Prototypical Regridding IV: Manual Grid-file Generation
Intermediate Regridding I: MPAS-mode (TBD)
Intermediate Regridding II: Renormalization (TBD)
Intermediate Regridding III: TempestRemap (Done)
Intermediate Regridding IV: Parallelism (TBD)
Advanced Regridding I: Regional SE Output (RRG-mode) (Done!)
Advanced Regridding II: Sub-Gridscale Regridding (SGS-mode) (TBD)
Advanced Regridding III: Make All Weight Files (MWF-mode) (Done)

Software Requirements:

At a minimum, install a recent version of NCO on your executable $PATH with the corresponding library on your $LD_LIBRARY_PATH. NCO installation instructions are here (fxm: link). We highly recommend installing NCO through the conda package. That will automatically install another important piece of the regridding toolchain, the ESMF_RegridWeightGen (aka ERWG) executable. Execute 'ncremap --config' to verify you have a working installation:

...
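As a complement to the installation check described under Software Requirements, the following minimal sketch reports whether the executables named on this page are discoverable on $PATH. The helper have_tool is a hypothetical name introduced here for illustration; it is not part of NCO, and it does not replace 'ncremap --config'.

```shell
# have_tool is a hypothetical helper (not part of NCO).
# It succeeds when the named executable is discoverable on $PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of each executable mentioned on this page.
for tool in ncremap ncks ESMF_RegridWeightGen; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

A "missing" line usually means the conda environment that provides NCO is not activated, or that $PATH does not include the installation directory.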

The sections on Prototypical Regridding are intended to be read sequentially and introduce the most frequently required ncremap features. The Intermediate and Advanced regridding sections are an a la carte description of features most useful to particular component models, workflows, and data formats. Peruse these sections in any order.

Intermediate Regridding I: MPAS-mode (TBD)

Intermediate Regridding II: Renormalization (TBD)

Intermediate Regridding III: TempestRemap

TempestRemap is the chief alternative to ERWG for regridding weight-generation, and is slated for increased adoption in E3SM. Tempest algorithms, written by @paulullrich, have many numerical advantages described in papers and at https://acme-climate.atlassian.net/wiki/spaces/Docs/pages/178848194/Transition+to+TempestRemap+for+Atmosphere+grids. Tempest lacks pre-built binary package support under most OSs and on most DOE machines, so users must often download and install it by following the instructions at https://github.com/ClimateGlobalChange/tempestremap. Verify ncremap can access your Tempest installation as described in the above section on "Software Requirements". Once installed, Tempest can be as easy to use as ERWG with SCRIP grid-files, e.g.,

ncremap -a tempest -s ne30np4_pentagons.091226.nc -g 129x256_SCRIP.20150901.nc -m map_ne30np4_to_fv129x256_tempest.YYYYMMDD.nc

This command, the same as shown in the "Create Map-file from Known Grid-files" section above, except using alg_typ='tempest', is a jumping-off point for understanding Tempest features and quirks. First, simply note that the ncremap interfaces for ERWG and Tempest are the same even though the underlying ERWG and Tempest applications have different APIs. Second, Tempest accepts SCRIP-format grids (as shown) and Exodus-format grid-files, also stored in netCDF though typically with a '.g' suffix, e.g., ne30.g as described on the Transition-to-Tempest page just mentioned. Exodus grid-files contain grid "connectivity" and other information required to optimally treat advanced grids like SE. Hence this also works:

ncremap -a tempest -s ne30.g -g 129x256_SCRIP.20150901.nc -m map_ne30np4_to_fv129x256_tempest.YYYYMMDD.nc

This produces subtly different weights because ne30.g encodes the SE ne30np4 grid-specification, not its dual-grid FV representation. Experiment, intercompare, and find what works best. As mentioned in the Transition-to-Tempest page, the Tempest overlap-mesh generator expects to be sent the two grid-files in the order smaller then larger (ERWG has no corresponding restriction). For example, Tempest considers the global ocean to be a smaller domain than the global atmosphere since it covers less area (due to masked points). Hence ncremap must be told when the source grid-file covers a larger domain than the destination. Do this with the "--a2o" or "--l2s" switch (for "atmosphere-to-ocean" or "large-to-small", respectively):

ncremap --l2s -a se2fv_flx -s atm_se_grd.nc -g ocn_fv_grd.nc -m map.nc # Source larger than destination
ncremap -a fv2se_flx -s ocn_fv_grd.nc -g atm_se_grd.nc -m map.nc # Source smaller than destination
ncremap -a se2fv_flx -s atm_se_grd.nc -g atm_fv_grd.nc -m map.nc # Source same size as destination

Finally, the Tempest-transition page describes six specific global FV<->SE mappings that optimize different combinations of accuracy, conservation, and monotonicity desirable for remapping flux-variables (flx), state-variables (stt), and an alternative (alt) mapping. A plethora of boutique options and switches control the Tempest weight-generation algorithms for these six cases. To simplify their invocation, ncremap names these six algorithms fv2se_stt, fv2se_flx, fv2se_alt, se2fv_stt, se2fv_flx, and se2fv_alt. E3SM maps with these algorithms have adopted the suffixes "mono" (for both se2fv_flx and fv2se_alt), "highorder" (se2fv_stt, fv2se_stt), "intbilin" (se2fv_alt), and "monotr" (fv2se_flx). The six relevant Tempest map-files can be generated with:

ncremap -a fv2se_flx -s ocean.RRS.30-10km_scrip_150722.nc -g ne30.g -m map_oRRS30to10_to_ne30np4_monotr.20180901.nc
ncremap -a fv2se_stt -s ocean.RRS.30-10km_scrip_150722.nc -g ne30.g -m map_oRRS30to10_to_ne30np4_highorder.20180901.nc
ncremap -a fv2se_alt -s ocean.RRS.30-10km_scrip_150722.nc -g ne30.g -m map_oRRS30to10_to_ne30np4_mono.20180901.nc
ncremap --l2s -a se2fv_flx -s ne30.g -g ocean.RRS.30-10km_scrip_150722.nc -m map_ne30np4_to_oRRS30to10_mono.20180901.nc
ncremap --l2s -a se2fv_stt -s ne30.g -g ocean.RRS.30-10km_scrip_150722.nc -m map_ne30np4_to_oRRS30to10_highorder.20180901.nc
ncremap --l2s -a se2fv_alt -s ne30.g -g ocean.RRS.30-10km_scrip_150722.nc -m map_ne30np4_to_oRRS30to10_intbilin.20180901.nc
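The six commands above follow a regular pattern: the algorithm name determines both the map-file suffix and whether the --l2s switch is needed. The sketch below prints (but does not execute) the same six commands from that pattern. The helper names suffix_for and print_map_cmds, and the variables grd_ocn, grd_atm, nm_ocn, nm_atm, and dt_sng, are hypothetical names chosen for this illustration.

```shell
# Illustrative variables (hypothetical names) holding the grids and name parts.
grd_ocn='ocean.RRS.30-10km_scrip_150722.nc'
grd_atm='ne30.g'
nm_ocn='oRRS30to10'
nm_atm='ne30np4'
dt_sng='20180901'

# Map each algorithm name to the E3SM map-file suffix given in the text.
suffix_for() {
    case "$1" in
        fv2se_flx) echo 'monotr' ;;
        fv2se_stt|se2fv_stt) echo 'highorder' ;;
        fv2se_alt|se2fv_flx) echo 'mono' ;;
        se2fv_alt) echo 'intbilin' ;;
        *) echo "unknown algorithm: $1" >&2; return 1 ;;
    esac
}

# Dry run: print one ncremap command per algorithm without executing it.
print_map_cmds() {
    for alg in fv2se_flx fv2se_stt fv2se_alt se2fv_flx se2fv_stt se2fv_alt; do
        sfx=$(suffix_for "$alg") || return 1
        case "$alg" in
            fv2se_*)
                # Ocean (smaller) to atmosphere (larger): no switch needed
                echo "ncremap -a $alg -s $grd_ocn -g $grd_atm -m map_${nm_ocn}_to_${nm_atm}_${sfx}.${dt_sng}.nc" ;;
            se2fv_*)
                # Atmosphere (larger) to ocean (smaller): requires --l2s
                echo "ncremap --l2s -a $alg -s $grd_atm -g $grd_ocn -m map_${nm_atm}_to_${nm_ocn}_${sfx}.${dt_sng}.nc" ;;
        esac
    done
}

print_map_cmds
```

Printing the commands first makes it easy to review the map-file names before committing to the weight generation itself.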

These maps, applied to appropriate flux and state variables, should exactly reproduce the online remapping in the E3SM v2/v3 coupler. However, explicitly generating all six standard maps this way is not recommended because ncremap includes an MWF-mode (for "Make All Weight Files") described below. MWF-mode generates and names, with one command and in a self-consistent manner, all combinations of E3SM global atmosphere<->ocean maps with both ERWG and Tempest.

Intermediate Regridding IV: Parallelism (TBD)

Advanced Regridding I: Regional SE Output (RRG-mode)

EAM and CAM-SE will produce regional output when requested with the finclNlonlat namelist parameter. Output for a single region can be higher temporal resolution than the host global simulation. This facilitates detailed yet economical regional process studies. Regional output files are in a special format that we call RRG (for "regional regridding"). An RRG file may contain any number of rectangular regions. The coordinates and variables for one region do not interfere with other (possibly overlapping) regions because all variables and dimensions are named with a per-region suffix string, e.g., lat_128e_to_134e_9s_to_16s. ncremap can easily regrid RRG output from an FV-dycore because ncremap can infer (as discussed above) the regional grid from any FV data file. Regridding regional SE data, however, is more complex because SE gridcells are essentially weights without vertices, and SE weight-generators are not yet flexible enough to generate the regional weights. To summarize, regridding RRG data leads to three SE-specific difficulties (#1-3 below) and two difficulties (#4-5) shared with FV RRG files:

1. RRG files contain only regional gridcell center locations, not weights
2. Global SE grids have well-defined weights, not vertices, for each gridpoint
3. Grid-generation software (ESMF and TempestRemap) creates only global, not regional, SE grid-files
4. Non-standard variable names and dimension names
5. Regional files can contain multiple regions

ncremap's RRG mode resolves these issues to allow trouble-free regridding of SE RRG files. The user must provide two additional input arguments, '--dat_glb=dat_glb' and '--grd_glb=grd_glb' to point to a global SE dataset and grid, respectively, of the same resolution as the model that generated the RRG datasets. Hence a typical RRG regridding invocation is:

ncremap --dat_glb=dat_glb.nc --grd_glb=grd_glb.nc -g grd_rgn.nc dat_rgn.nc dat_rgr.nc

Here grd_rgn is a regional destination grid-file, dat_rgn is the RRG file to regrid, and dat_rgr is the regridded output. Typically grd_rgn is a uniform rectangular grid covering the same region as the RRG file. Generate this as described in the last example in the section above on "Manual Grid-file Generation". grd_glb is the standard dual-grid grid-file for the SE resolution of the simulation, e.g., ne30np4_pentagons.091226.nc. ncremap regrids the global data file dat_glb to the global dual-grid in order to produce an intermediate global file annotated with gridcell vertices. Then it hyperslabs the lat/lon coordinates (and vertices) from the regional domain to use with regridding the RRG file. A dat_glb file with only one 2D field suffices (and is fastest) for producing the information needed by the RRG procedure. One can prepare an optimal dat_glb file by subsetting any 2D variable (e.g., ncks -v FSNT in.nc dat_glb.nc) from a full global SE output dataset.
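The two steps described above (subset one 2D variable into dat_glb, then invoke RRG-mode) can be chained as in the following sketch. It defaults to a dry run that only prints the commands. The run wrapper and the DRY_RUN variable are hypothetical conveniences introduced here, and the file names are the placeholders already used on this page.

```shell
# DRY_RUN and run() are hypothetical conveniences, not ncremap features.
# Default: print commands instead of executing them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: keep a single 2D variable so dat_glb.nc is small and fast to regrid.
# in.nc stands for any full global SE output file from the same resolution.
run ncks -v FSNT in.nc dat_glb.nc

# Step 2: regrid the RRG file using the global data file and the dual-grid.
run ncremap --dat_glb=dat_glb.nc --grd_glb=ne30np4_pentagons.091226.nc \
    -g grd_rgn.nc dat_rgn.nc dat_rgr.nc
```

Set DRY_RUN=0 in the environment to execute the commands once the printed versions look right.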

ncremap RRG mode supports two additional options to override parameters set internally. First, the per-region suffix string may be set with '--rnm_sng=rnm_sng'. RRG mode will, by default, regrid the first region it finds in an RRG file. Explicitly set the desired region with rnm_sng for files with multiple regions, e.g., "--rnm_sng= ". Second, the bounding-box of the region may be explicitly set with '--bb_wesn=lon_wst,lon_est,lat_sth,lat_nrt'. The normal parsing of the bounding-box string from the suffix string may fail in (as yet undiscovered) corner cases, and the "--bb_wesn" option provides a workaround. The bounding-box string must include the entire RRG region, specified in WESN order. The two override options may be used independently or together, as in:

ncremap --rnm_sng='_128e_to_134e_9s_to_16s' --bb_wesn='128.0,134.0,-16.0,-9.0' --dat_glb=dat_glb.nc --grd_glb=grd_glb.nc -g grd_rgn.nc dat_rgn.nc dat_rgr.nc
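To make the relationship between the suffix string and the bounding-box concrete, here is a minimal sketch that converts one into the other. It is not ncremap's internal parser. The helper name sfx_to_bb is hypothetical, and the sketch assumes that the suffix has the form _<lon>e|w_to_<lon>e|w_<lat>n|s_to_<lat>n|s, that west longitudes and south latitudes map to negative numbers, and that the two longitudes already appear in west-to-east order.

```shell
# sfx_to_bb is a hypothetical helper, not ncremap's own parser.
# Assumed suffix form: _<lon>e|w_to_<lon>e|w_<lat>n|s_to_<lat>n|s
# Assumed sign convention: w and s hemispheres become negative values.
sfx_to_bb() {
    echo "$1" | awk -F'_' '
        function sgn_val(tok,   val, hem) {
            hem = substr(tok, length(tok), 1)          # trailing hemisphere letter
            val = substr(tok, 1, length(tok) - 1) + 0  # numeric part
            return (hem == "w" || hem == "s") ? -val : val
        }
        {
            # Fields after splitting on "_": empty, lon1, to, lon2, lat1, to, lat2
            lon_wst = sgn_val($2); lon_est = sgn_val($4)
            lat_a = sgn_val($5); lat_b = sgn_val($7)
            # Order the latitudes so the result is south first, then north
            lat_sth = (lat_a < lat_b) ? lat_a : lat_b
            lat_nrt = (lat_a < lat_b) ? lat_b : lat_a
            printf "%.1f,%.1f,%.1f,%.1f\n", lon_wst, lon_est, lat_sth, lat_nrt
        }'
}

sfx_to_bb '_128e_to_134e_9s_to_16s'   # prints 128.0,134.0,-16.0,-9.0
```

Note that the latitudes are swapped relative to their order in the suffix: "9s_to_16s" becomes a southern edge of -16.0 and a northern edge of -9.0, which is the WESN order that --bb_wesn expects.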

RRG-mode supports most normal ncremap options, including input and output methods and regridding algorithms.

Advanced Regridding II: Sub-Gridscale Regridding (SGS-mode) (TBD)

Advanced Regridding III: Make All Weight Files (MWF-mode)

As mentioned above in the TempestRemap section, ncremap includes an MWF-mode (for "Make All Weight Files") that generates and names, with one command and in a self-consistent manner, all combinations of E3SM global atmosphere<->ocean maps with both ERWG and Tempest. MWF-mode automates the laborious and error-prone process of generating numerous map-files with various switches. Its chief use occurs when developing and testing new global grid-pairs for the E3SM atmosphere and ocean components. Invoke MWF-mode with a number of specialized options to control the naming of the output map-files:

ncremap -P mwf -s grd_ocn -g grd_atm --nm_src=ocn_nm --nm_dst=atm_nm --dt_sng=date

where grd_ocn is the "global" ocean grid, grd_atm is the global atmosphere grid, nm_src sets the shortened name for the source (ocean) grid as it will appear in the output map-files, nm_dst sets, similarly, the shortened name for the destination (atmosphere) grid, and dt_sng sets the date-stamp in the output map-file name map_${nm_src}_to_${nm_dst}_${alg_typ}.${dt_sng}.nc. Setting nm_src, nm_dst, and dt_sng is optional though highly recommended. For example,

ncremap -P mwf -s ocean.RRS.30-10km_scrip_150722.nc -g t62_SCRIP.20150901.nc --nm_src=oRRS30to10 --nm_dst=T62 --dt_sng=20180901

produces the 12 map-files:

map_oRRS30to10_to_T62_aave.20180901.nc
map_oRRS30to10_to_T62_blin.20180901.nc
map_oRRS30to10_to_T62_ndtos.20180901.nc
map_oRRS30to10_to_T62_nstod.20180901.nc
map_oRRS30to10_to_T62_patc.20180901.nc
map_oRRS30to10_to_T62_tempest.20180901.nc
map_T62_to_oRRS30to10_aave.20180901.nc
map_T62_to_oRRS30to10_blin.20180901.nc
map_T62_to_oRRS30to10_ndtos.20180901.nc
map_T62_to_oRRS30to10_nstod.20180901.nc
map_T62_to_oRRS30to10_patc.20180901.nc
map_T62_to_oRRS30to10_tempest.20180901.nc
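The naming template map_${nm_src}_to_${nm_dst}_${alg_typ}.${dt_sng}.nc can be expanded by hand to predict, or later verify, the files listed above. The sketch below only prints names and never runs ncremap. The helpers mwf_names and mwf_missing are hypothetical names introduced for this illustration, and the algorithm list is copied from the twelve file names above.

```shell
# mwf_names is a hypothetical helper: it prints the expected map-file names
# for both directions, using the algorithm suffixes listed on this page.
mwf_names() {
    nm_src=$1; nm_dst=$2; dt_sng=$3
    for pair in "${nm_src}_to_${nm_dst}" "${nm_dst}_to_${nm_src}"; do
        for alg_typ in aave blin ndtos nstod patc tempest; do
            echo "map_${pair}_${alg_typ}.${dt_sng}.nc"
        done
    done
}

# mwf_missing (also hypothetical) lists expected map-files that are absent
# from the current directory, e.g., after an interrupted MWF-mode run.
mwf_missing() {
    mwf_names "$@" | while read -r fl; do
        [ -f "$fl" ] || echo "missing: $fl"
    done
}

mwf_names oRRS30to10 T62 20180901
```

Running mwf_missing with the same three arguments after an MWF-mode run gives a quick list of any maps that failed to generate.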

The ordering of source and destination grids is immaterial for ERWG maps since MWF-mode produces all map combinations. However, as described above in the TempestRemap section, the Tempest overlap-mesh generator must be called with the smaller grid preceding the larger grid. For this reason, always invoke MWF-mode with the smaller grid (i.e., the ocean) as the source; otherwise some Tempest map-files will fail to generate. The six optimized SE<->FV Tempest maps described above in the TempestRemap section will be generated when the destination grid has a ".g" suffix, which ncremap interprets as indicating an Exodus-format SE grid (NB: this assumption is an implementation convenience that can be modified if necessary). For example,

ncremap -P mwf -s ocean.RRS.30-10km_scrip_150722.nc -g ne30.g --nm_src=oRRS30to10 --nm_dst=ne30np4 --dt_sng=20180901

produces the 6 map-files:

map_oRRS30to10_to_ne30np4_monotr.20180901.nc
map_oRRS30to10_to_ne30np4_highorder.20180901.nc
map_oRRS30to10_to_ne30np4_mono.20180901.nc
map_ne30np4_to_oRRS30to10_mono.20180901.nc
map_ne30np4_to_oRRS30to10_highorder.20180901.nc
map_ne30np4_to_oRRS30to10_intbilin.20180901.nc

MWF-mode takes significant time to complete (~20 minutes on my MacBookPro) for the above grids. To accelerate this, consider installing the MPI-enabled instead of the serial version of ERWG. Then use the --wgt_cmd option to tell ncremap the MPI configuration to invoke ERWG with, for example:

ncremap -P mwf --wgt_cmd='mpirun -np 12 ESMF_RegridWeightGen' -s ocean.RRS.30-10km_scrip_150722.nc -g t62_SCRIP.20150901.nc --nm_src=oRRS30to10 --nm_dst=T62 --dt_sng=20180901

Background and distributed node parallelism (as described above in the Parallelism section) of MWF-mode are possible though not yet implemented. Please let us know if this feature is desired.