This page is devoted to instruction in NCO’s regridding operator, ncremap. It describes steps necessary to create grids, and to regrid datasets between different grids with ncremap. Some of the simpler regridding options supported by ncclimo are also described at Generate, Regrid, and Split Climatologies (climo files) with ncclimo. This page describes those features in more detail, and other, more boutique features often useful for custom regridding solutions.

The Zen of Regridding

Most modern climate/weather-related research requires a regridding step in its workflow. The plethora of geometric and spectral grids on which model and observational data are stored ensures that regridding is usually necessary to scientific insight, especially the focused and variable resolution studies that E3SM models conduct. Why does such a common procedure seem so complex? Because a mind-boggling number of options are required to support advanced regridding features that many users never need. To defer that complexity, this HOWTO begins with solutions to the prototypical regridding problem, without mentioning any other options. It demonstrates how to solve that problem simply, including the minimal software installation required. Once the basic regridding vocabulary has been introduced, we solve the prototype problem when one or more inputs are "missing", or need to be created. The HOWTO ends with descriptions of different regridding modes and workflows that use features customized to particular models, observational datasets, and formats. The overall organization, including TBD sections (suggest others, or vote for prioritizing, below), is:

...

At a minimum, install a recent version of NCO on your executable $PATH with the corresponding library on your $LD_LIBRARY_PATH. NCO installation instructions are here. Unless you have reason to do otherwise, we recommend installing NCO through the Conda package (conda install -c conda-forge nco) or via activating the E3SM-Unified environment. The Conda NCO package automagically installs two other important regridding tools, the ESMF_RegridWeightGen (aka ERWG) executable and the TempestRemap (aka TR) executables. Execute 'ncremap --config' to verify you have a working installation:

zender@aerosol:~$ ncremap --config
ncremap, the NCO regridder and grid, map, and weight-generator, version 4.9.3-alpha02 "Fuji Rolls"
...[Legal Stuff]...
Config: ncremap script located in directory /Users/zender/bin
Config: NCO binaries located in directory /Users/zender/bin, linked to netCDF library version 4.7.3
Config: No hardcoded machine-dependent path/module overrides. (If desired, turn-on NCO hardcoded paths at supported national labs with "export NCO_PATH_OVERRIDE=Yes").
Config: External (non-NCO) program availability:
Config: ESMF weight-generation command ESMF_RegridWeightGen version 7.1.0r found as /opt/local/bin/ESMF_RegridWeightGen
Config: MOAB-Tempest weight-generation command mbtempest not found
Config: MPAS depth coordinate addition command add_depth.py found as /Users/zender/bin/add_depth.py
Config: TempestRemap weight-generation command GenerateOfflineMap found as /usr/local/bin/GenerateOfflineMap

Only NCO is required for many operations including applying existing regridding weights (aka, regridding) and/or generating grids, maps, or conservative weights with the NCO algorithms. Generating new weights (and map-files) with ERWG or TR requires that you install those packages (both of which come with the NCO Conda package). MOAB-Tempest (MBTR) is only required to generate TR weights on the largest meshes. It is also available as a Conda package, and comes with the E3SM-Unified environment. Make sure ncremap reports a sufficiently working status as above before proceeding further.
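
As a quick complement to 'ncremap --config', the following minimal sketch only checks whether the executables named on this page are on $PATH. It does not verify versions or library linkage, so treat the ncremap report as the authoritative one.

```shell
#!/bin/sh
# Check which executables named in this HOWTO are on $PATH.
# Presence only: this does not test versions or linked libraries.
found=""
missing=""
for tool in ncremap ESMF_RegridWeightGen GenerateOfflineMap mbtempest; do
    if command -v "$tool" >/dev/null 2>&1; then
        found="$found $tool"
    else
        missing="$missing $tool"
    fi
done
echo "found:${found:- none}"
echo "missing:${missing:- none}"
```

A missing mbtempest is usually harmless, since it is only needed for TR weights on the largest meshes.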

...

This solution is deceptively simple because it conceals the choices, paths, and options required to create the appropriate map.nc for all situations. We will discuss creating map.nc later after showing more powerful and parallel ways to solve the prototype problem. The solution above only works for users savvy enough to know how to find appropriate pre-built map-files. Map-files used by the E3SM model are available at https://web.lcrc.anl.gov/public/e3sm/inputdata/cpl/gridmaps/ . Additional map-files useful in post-processing are available at https://web.lcrc.anl.gov/public/e3sm/diagnostics/maps/. Many commonly used maps and grids can also be found in my (@czender's) directories as ~zender/data/[maps,grids] at most DOE High Performance Computing (HPC) centers. Take a minute now to look at these locations.

...

Code Block
ncremap -m map_ne30np4_to_fv129x256_aave.20150901.nc dat_src.nc dat_rgr.nc # EAMv1
ncremap -m map_ne30pg2_to_cmip6_180x360_aave.20200201.nc dat_src.nc dat_rgr.nc # EAMv2
ncremap -m map_ne30pg2_to_cmip6_180x360_traave.20231201.nc dat_src.nc dat_rgr.nc # EAMv3

Before looking into map-file generation in the next section, try a few ncremap features. For speed's sake, regrid only selected variables:

...

The simplest regridding procedure applies an existing map-file to your data, as in the above example (public servers of pre-existing map-files are also linked to above). At most DOE High Performance Computing (HPC) centers these and others can also be found in my (Charlie Zender's) directory, ~zender/data/maps. If the desired map-file cannot be found, then you must create it. Creating a map-file requires a complete specification of both source and destination grids (meshes). The files that contain these grid specifications are called "grid-files". Many E3SM grid-files are publicly available within model-specific directories of the previous location, e.g., https://web.lcrc.anl.gov/public/e3sm/inputdata/ocn/mpas-o/oEC60to30v3/ . Many grids useful for post-processing are publicly served from https://web.lcrc.anl.gov/public/e3sm/diagnostics/grids/. At most DOE HPC centers these can also be found in my (@czender's) directory, ~zender/data/grids. Take a minute now to look there for the prototype problem grid-files, e.g., for FV 129x256, cmip6_180x360, and ne30pg2 grid-files.

You might find multiple grid-files that contain the string 129x256. Grid-file names are often ambiguous. The grid-file global metadata (ncks -M grid.nc) often displays the source of the grid. These metadata, and sometimes the actual data (fxm: link), are usually more complete and/or accurate in files with a YYYYMMDD-format date-stamp. For example, the metadata in file 129x256_SCRIP.20150901.nc clearly state it is an FV grid and not some other type of grid with 129x256 resolution. The metadata in 129x256_SCRIP.130510 tell the user nothing about the grid boundaries, and some of the data are flawed. When grids seem identical except for their date-stamp, use the grid with the later date-stamp. The curious can examine a grid-file (ncks -M -m grid.nc) and easily see it looks completely different from a typical model or observational data file. Grid-files and data-files are not interchangeable.
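
The "prefer the later date-stamp" rule can be sketched as a small name comparison. The candidate list below is hypothetical (only 129x256_SCRIP.20150901.nc is named on this page), and the sketch assumes names end in .YYYYMMDD.nc; it compares names only and does not inspect the metadata, which is still worth checking with ncks -M.

```shell
# Hypothetical candidate names that differ only in their date-stamp
candidates="129x256_SCRIP.20120101.nc 129x256_SCRIP.20150901.nc 129x256_SCRIP.20130510.nc"

latest_file=""
latest_stamp=0
for file in $candidates; do
    stamp=${file%.nc}      # drop the .nc extension
    stamp=${stamp##*.}     # keep the text after the last remaining dot
    case $stamp in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;  # accept YYYYMMDD only
        *) continue ;;                                # skip other names
    esac
    if [ "$stamp" -gt "$latest_stamp" ]; then
        latest_stamp=$stamp
        latest_file=$file
    fi
done
echo "$latest_file"
```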

Multiple grid-files also contain the string ne30. These are either slightly different grids, or the same grids stored in different formats meant for different post-processing tools. The different Spectral Element (SE) and Finite Volume (FV) grid types are described with figures here (https://acme-climate.atlassian.net/wiki/spaces/Docs/pages/34113147/Atmosphere+Grids). As explained there, for E3SMv1 data many people will want the "dual-grid" with pentagons. The correct grid-file for this is ne30np4_pentagons.091226.nc. Do not be tempted by SE grid-files named with latlon. Datasets from E3SM v2 and v3 simulations are all on FV grids. EAM v2 and v3 grids are named in the format neXXXpg2. ELM and MPAS names take a wider variety of forms, many of which appear below.

All grid-files discussed so far are in SCRIP-format, named for the Spherical Coordinate Remapping and Interpolation Package (authored by @pjones). Other formats exist and are increasingly important, especially for SE grids. For now just know that these other formats are also usually stored as netCDF, and that some tools allow non-SCRIP formats to be used interchangeably with SCRIP.

...

The map-files above are named with alg_typ=nco because the ncremap default interpolation algorithm is the first-order conservative NCO algorithm (NB: before NCO 4.9.1 the default algorithm was ESMF bilin). To re-create the aave map in the first example, invoke ncremap with -a esmfaave (the newest v3 naming convention) or -a conserve (same algorithm, different name in v1, v2):

Code Block
ncremap -a conserve -s ne30np4_pentagons.091226.nc -g 129x256_SCRIP.20150901.nc -m map_ne30np4_to_fv129x256_aave.YYYYMMDD.nc # EAMv1
ncremap -a conserve -s ne30pg2.nc -g cmip6_180x360_scrip.20181001.nc -m map_ne30pg2_to_cmip6_180x360_aave.YYYYMMDD.nc # EAMv2

...


ncremap -a esmfaave -s ne30pg2.nc -g cmip6_180x360_scrip.20181001.nc -m map_ne30pg2_to_cmip6_180x360_esmfaave.YYYYMMDD.nc # EAMv3

This takes a few minutes, so save custom-generated map-files for future use. Computing weights to generate map-files is much more computationally expensive and time-consuming than regridding, i.e., than applying the weights in the map-file to the data. We will gloss over most options that weight-generators can take into consideration, because their default values often work well. One option worth knowing now is -a. The invocation synonyms for -a are --alg_typ, --algorithm, and --regrid_algorithm. These are listed in the square brackets in the self-help message that ncremap prints when it is invoked without argument, or with --help:

...

-a alg_typ Algorithm for weight generation (default ncoaave) [alg_typ, algorithm, regrid_algorithm]
   ESMF algorithms: esmfbilin,bilinear|esmfaave,aave,conserve|conserve2nd|nearestdtos|neareststod|patch
   NCO algorithms: ncoaave,nco,nco_con|ncoidw,nco_dwe (inverse-distance-weighted interpolation/extrapolation)
   Tempest (and MOAB-Tempest) algorithms: traave,fv2fv_flx|trbilin|trfv2|trintbilin|tempest|fv2fv|fv2fv_stt

At least one valid option argument for each supported interpolation type is shown, separated by vertical bars. The arguments shown have multiple synonyms, separated by commas, that are equivalent. For example, -a esmfaave is equivalent to -a aave and to --alg_typ=conserve. Use the longer option form for clarity and precision, and the shorter form for conciseness. The full list of synonyms, and the complete documentation, is at http://nco.sf.net/nco.html#alg_typ. The NCO algorithm ncoaave is the default algorithm (because it is always available). Commonly used algorithms that invoke ERWG are esmfbilin (bilinear) and esmfaave (conservative). TR options are discussed below. As of E3SM v3, TR algorithms are preferred for all mappings (because they are more accurate). Peruse the list of options now, though defer a thorough investigation until you reach the "Intermediate Regridding" section.
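
To make the synonym groups concrete, here is a minimal sketch of a lookup that maps a few of the comma-separated synonyms from the help message onto one name per group. The function canon_alg is hypothetical and is not part of ncremap; treating the first name in each group as "canonical" is an assumption of this sketch, and only groups that list synonyms are covered.

```shell
# canon_alg NAME: print one representative name for a group of synonyms.
# Hypothetical helper; ncremap resolves synonyms internally.
canon_alg() {
    case $1 in
        esmfbilin|bilinear)     echo esmfbilin ;;
        esmfaave|aave|conserve) echo esmfaave ;;
        ncoaave|nco|nco_con)    echo ncoaave ;;
        ncoidw|nco_dwe)         echo ncoidw ;;
        traave|fv2fv_flx)       echo traave ;;
        *) echo "unknown algorithm: $1" >&2; return 1 ;;
    esac
}

canon_alg conserve   # same group as aave and esmfaave
canon_alg nco        # same group as ncoaave
```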

...

The most useful grid parameters (besides resolution) are latitude type (lat_typ), longitude type (lon_typ), title (ttl), and, for regional grids, the SNWE bounding box (snwe). The three supported varieties of global rectangular grids are Uniform/equiangular (lat_typ=uni), Cap/FV (lat_typ=cap), and Gaussian (lat_typ=gss). The four supported varieties of longitude types are the first (westernmost) gridcell centered at Greenwich (lon_typ=grn_ctr), western edge at Greenwich (lon_typ=grn_wst), or at the Dateline (lon_typ=180_ctr and lon_typ=180_wst, respectively). Grids are global, uniform, store latitudes from south-to-north, and have their first longitude centered at Greenwich by default. The grid-formula for this is 'lat_typ=uni#lon_typ=grn_ctr#lat_drc=s2n'. Some examples (remember, this API requires NCO 4.7.6+):

Code Block
ncremap -g grd.nc -G latlon=180,360 # 1x1 Uniform grid
ncremap -g grd.nc -G latlon=180,360#lon_typ=grn_wst # CMIP6 1x1 Uniform grid, Greenwich-west edge
ncremap -g grd.nc -G latlon=129,256#lat_typ=cap # 1.4x1.4 FV grid
ncremap -g grd.nc -G latlon=94,192#lat_typ=gss # T62 Gaussian grid

...


ncremap -g grd.nc -G latlon=721,1440#lat_drc=n2s#lat_typ=cap#lon_typ=grn_ctr # ECMWF ERA5 native grid
ncremap -g grd.nc -G latlon=1280,2560#lat_typ=gss#lon_typ=grn_ctr#lat_drc=n2s # ECMWF IFS F640 Full Gaussian
ncremap -g grd.nc -G latlon=360,720#lat_typ=uni#lon_typ=180_wst # "r05" ELM/MOSART 0.5x0.5 uniform grid
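
The grid-formula syntax used in the examples above is a '#'-separated list of key=value pairs. As a minimal sketch, the hypothetical helper grid_formula below assembles such a string and fills in the documented defaults (uni, grn_ctr, s2n) when arguments are omitted. It only builds and prints text; it does not call ncremap.

```shell
# grid_formula NLAT NLON [LAT_TYP] [LON_TYP] [LAT_DRC]
# Hypothetical helper that assembles a -G argument string.
grid_formula() {
    nlat=$1
    nlon=$2
    lat_typ=${3:-uni}      # default: uniform/equiangular
    lon_typ=${4:-grn_ctr}  # default: first gridcell centered at Greenwich
    lat_drc=${5:-s2n}      # default: south-to-north
    echo "latlon=${nlat},${nlon}#lat_typ=${lat_typ}#lon_typ=${lon_typ}#lat_drc=${lat_drc}"
}

# Print (not run) the command for the 129x256 FV grid
echo "ncremap -g grd.nc -G $(grid_formula 129 256 cap)"
```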

Regional grids are a powerful tool in regional process analyses, and can be much smaller in size than global datasets. Regional grids are always uniform. Specify the rectangular bounding box, i.e., the outside edges of the region, in SNWE order:

Code Block
ncremap -g grd.nc -G ttl="Equi-Angular 1x1 Greenland grid"#latlon=30,90#snwe=55.0,85.0,-90.0,0.0 # 1x1 Greenland grid
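
For a uniform regional grid, the latlon counts follow from the bounding box and the desired resolution: the Greenland box spans 30 degrees of latitude and 90 degrees of longitude, hence latlon=30,90 at 1x1 degree. The hypothetical helper below sketches that arithmetic with awk; it assumes the box edges are exact multiples of the resolution.

```shell
# regional_latlon SOUTH NORTH WEST EAST RESOLUTION_DEGREES
# Hypothetical helper: derive the latlon= counts for a uniform regional grid.
regional_latlon() {
    awk -v s="$1" -v n="$2" -v w="$3" -v e="$4" -v d="$5" 'BEGIN {
        nlat = int((n - s) / d + 0.5)   # rows between south and north edges
        nlon = int((e - w) / d + 0.5)   # columns between west and east edges
        printf "latlon=%d,%d\n", nlat, nlon
    }'
}

regional_latlon 55.0 85.0 -90.0 0.0 1.0
```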

Intermediate Regridding:

The sections on Prototypical Regridding were intended to be read sequentially and introduced the most frequently required ncremap features. The Intermediate and Advanced regridding sections are an a la carte description of features most useful to particular component models, workflows, and data formats. Peruse these sections in any order.

...

Code Block
ncremap --l2s -a se2fv_flx -s atm_se_grd.nc -g ocn_fv_grd.nc -m map.nc # Source larger than destination
ncremap -a fv2se_flx -s ocn_fv_grd.nc -g atm_se_grd.nc -m map.nc # Source smaller than destination
ncremap -a se2fv_flx -s atm_se_grd.nc -g atm_fv_grd.nc -m map.nc # Source same size as destination

As mentioned above, EAM v1 datasets are the only ones stored in an SE grid format. To accommodate the mixture of FV and SE grids needed for model evaluation, Transition to TempestRemap for Atmosphere grids describes eight specific global FV<->SE mappings that optimize different combinations of accuracy, conservation, and monotonicity desirable for remapping flux-variables (flx), state-variables (stt), and an alternative (alt) mapping. A plethora of boutique options and switches control the Tempest weight-generation algorithms for these cases. To simplify their invocation, ncremap names the algorithms fv2se_stt, fv2se_flx, fv2se_alt, fv2fv_flx, fv2fv_stt, se2fv_stt, se2fv_flx, se2fv_alt, and se2se. E3SM maps with these algorithms have adopted the suffixes "mono" (for fv2fv_flx, se2fv_flx, and fv2se_alt), "highorder" (fv2fv_stt, se2fv_stt, fv2se_stt), "intbilin" (se2fv_alt), and "monotr" (fv2se_flx). The relevant Tempest map-files can be generated with

...

These maps, applied to appropriate flux and state variables, should exactly reproduce the online remapping in the E3SM v1 coupler. However, explicitly generating all standard maps this way is not recommended because ncremap includes an MWF-mode (for "Make All Weight Files") described below. MWF-mode generates and names, with one command and in a self-consistent manner, all combinations of E3SM global atmosphere<->ocean maps for ERWG and Tempest. The E3SM v2/v3 configurations all use FV grids. Moreover, the map-file naming convention changed in v3.
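
The suffixes that E3SM maps adopted for the Tempest algorithms (mono, highorder, intbilin, monotr) can be summarized as a lookup. The helper map_suffix is hypothetical and covers only the algorithm names for which this page lists a suffix; given that the naming convention changed in v3, take it as a sketch of the earlier convention.

```shell
# map_suffix ALGORITHM: print the map-file suffix listed for a Tempest algorithm.
# Hypothetical helper; se2se has no suffix listed on this page.
map_suffix() {
    case $1 in
        fv2fv_flx|se2fv_flx|fv2se_alt) echo mono ;;
        fv2fv_stt|se2fv_stt|fv2se_stt) echo highorder ;;
        se2fv_alt)                     echo intbilin ;;
        fv2se_flx)                     echo monotr ;;
        *) echo "no suffix listed for: $1" >&2; return 1 ;;
    esac
}

map_suffix se2fv_flx
map_suffix fv2se_flx
```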

Intermediate Regridding III: MOAB/mbtempest support

...

Advanced procedures have in common that they activate non-standard processing modes for ncremap. These modes do something different, or in addition to, the standard weight-generation and application. Generally these modes were created in order to automate frequently recurring workflows that can leverage the ncremap infrastructure so long as various bells and whistles are introduced along the way. Please let us know if you have ideas for new or improved processing modes.

...

MPAS models produce output in their own format distinct from CESM-heritage models. MPAS-mode invokes three pre-processing steps to massage MPAS datasets until they are amenable to regridding. These steps are missing value annotation, missing value treatment, and dimension permutation, and they are described in that order below. First, though, MPAS-mode, like most other ncremap modes, is explicitly invoked with the -P md_nm option, where md_nm is the mode name, i.e., some variation of the MPAS component model name:

Code Block
ncremap -P mpas -m map.nc dat_src.nc dat_rgr.nc # Generic MPAS mode, imperfect, prefer component-specific mode names
ncremap -P mpasocean -m map.nc dat_src.nc dat_rgr.nc # Important for correct vertical interpolation
ncremap -P mpasseaice -m map.nc dat_src.nc dat_rgr.nc # Automatically weights by timeMonthly_avg_iceAreaCell
ncremap -P mali -m map.nc dat_src.nc dat_rgr.nc # Automatically uses MALI _FillValue

Many model and observational datasets use missing values that are not annotated in the standard manner. For example, the MPAS ocean and ice models use -9.99999979021476795361e+33 as the missing value, yet (at least from 2015-2018) do not store this in a _FillValue attribute with any variables. To prevent arithmetic from treating these values as valid, MPAS-mode automatically puts this value in a _FillValue attribute for all floating-point variables via

Code Block
ncatted -t -a _FillValue,,o,d,-9.99999979021476795361e+33 -a _FillValue,,o,f,-9.99999979021476795361e+33 dat_src.nc

Oddly, the MPAS land-ice model uses -1.0e36 for missing values, so currently MPAS-LI users must explicitly supply this missing value, or invoke ncremap with the -P mali option:

Code Block
ncremap -P mpas --mss_val=-1.0e36 -m map.nc dat_src.nc dat_rgr.nc # Explicitly supply missing value
ncremap -P mali -m map.nc dat_src.nc dat_rgr.nc # Let ncremap know dataset is from MALI

Next, MPAS datasets usually have masked regions (e.g., non-ocean cells), yet MPAS users like to visualize regridded data with realistic (not blocky) boundaries along those cells. MPAS-mode therefore treats missing values, by default, with the renormalization approach described above in the section on Treatment of Missing Values. Hence MPAS-mode automatically invokes ncremap with maximum renormalization, equivalent to

Code Block
ncremap --rnr_thr=0.0 -m map.nc dat_src.nc dat_rgr.nc

Finally, ncremap requires the horizontal spatial dimension(s), whether latitude and longitude or some unstructured dimension, to be the final (most-rapidly-varying) dimension(s) of input datasets. MPAS datasets natively place their horizontal spatial dimension (typically nCells) closer to the least-rapidly-varying position. While this makes perfect sense from an I/O-efficiency point-of-view for unstructured models, it does not play well with regridders. Hence all MPAS-modes permute the input dimensions to a regridder-friendly order (i.e., ending with nCells) with a command of the form

Code Block
ncpdq -a Time,nVertLevels,nVertLevelsP1,maxEdges,MaxEdges2,nCategories,ONE,nEdges,nCells dat_src.nc dat_tmp.nc

The combination of these three data manipulations defines MPAS-mode.
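
To see how the three manipulations fit together, the dry-run sketch below prints (rather than executes) a manual approximation of MPAS-mode using the commands quoted above. The ordering of steps 2 and 3 and the use of a temporary file are assumptions of this sketch, not a description of what ncremap literally runs internally; in practice -P mpas (or a component-specific mode) does all of this for you.

```shell
# Dry run: print a manual approximation of MPAS-mode, one command per step.
# File names are the placeholders used throughout this HOWTO.
fill='-9.99999979021476795361e+33'
src=dat_src.nc
tmp=dat_tmp.nc
out=dat_rgr.nc
map=map.nc

mpas_steps() {
    # 1. Annotate missing values (ncatted edits $src in place)
    echo "ncatted -t -a _FillValue,,o,d,$fill $src"
    # 2. Permute dimensions so that nCells varies most rapidly
    echo "ncpdq -a Time,nVertLevels,nVertLevelsP1,maxEdges,MaxEdges2,nCategories,ONE,nEdges,nCells $src $tmp"
    # 3. Regrid the permuted file with maximum renormalization
    echo "ncremap --rnr_thr=0.0 -m $map $tmp $out"
}

mpas_steps
```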

Advanced Regridding II: Regional Unstructured Output (RRG-mode)

EAM (and CAM-SE) will produce regional output if requested with the finclNlonlat namelist parameter. Output for a single region can be higher temporal resolution than the host global simulation. This facilitates detailed yet economical regional process studies. Regional output files are in a special format that we call RRG (for "regional regridding"). An RRG file may contain any number of rectangular regions. The coordinates and variables for one region do not interfere with other (possibly overlapping) regions because all variables and dimensions are named with a per-region suffix string, e.g., lat_128e_to_134e_9s_to_16s. ncremap can easily regrid RRG 2D logically rectangular output from an FV-dycore because ncremap can infer (as discussed above) the grid from any well-annotated regional FV data file. Regridding unstructured regional grid data, however, is more complex because unstructured grids lack cell vertices and unstructured-grid weight-generators are not yet flexible enough to output only regional (as opposed to global) grids with weights. To summarize, regridding unstructured RRG data leads to three SE-specific difficulties (#1-3 below) and two difficulties (#4-5) shared with FV RRG files:

1. RRG files contain only regional gridcell center locations, not weights
2. Global SE grids have well-defined weights not vertices for each gridpoint
3. Grid generation software (ESMF and TempestRemap) only create global not regional SE grid files
4. Non-standard variable names and dimension names
5. Regional files can contain multiple regions

ncremap's RRG mode resolves these issues to allow trouble-free regridding of SE RRG files. The user must provide two additional input arguments, --dat_glb=dat_glb and --grd_glb=grd_glb to point to a global SE dataset and grid, respectively, of the same resolution as the model that generated the RRG datasets. Hence a typical RRG regridding invocation is:

Code Block
ncremap --dat_glb=dat_glb.nc --grd_glb=grd_glb.nc -g grd_rgn.nc dat_rgn.nc dat_rgr.nc

Here grd_rgn is a regional destination grid-file, dat_rgn is the RRG file to regrid, and dat_rgr is the regridded output. Typically grd_rgn is a uniform rectangular grid covering the same region as the RRG file. Generate this as described in the last example in the section above on "Manual Grid-file Generation". grd_glb is the standard dual-grid grid-file for the SE resolution of the simulation, e.g., ne30np4_pentagons.091226.nc. ncremap regrids the global data file dat_glb to the global dual-grid in order to produce an intermediate global file annotated with gridcell vertices. Then it hyperslabs the lat/lon coordinates (and vertices) from the regional domain to use with regridding the RRG file. A dat_glb file with only one 2D field suffices (and is fastest) for producing the information needed by the RRG procedure. One can prepare an optimal dat_glb file by subsetting any 2D variable (e.g., ncks -v FSNT in.nc dat_glb.nc) from a full global SE output dataset.

ncremap RRG mode supports two additional options to override parameters set internally. First, the per-region suffix string may be set with --rnm_sng=rnm_sng. RRG mode will, by default, regrid the first region it finds in an RRG file. Explicitly set the desired region with rnm_sng for files with multiple regions, e.g., --rnm_sng= . Second, the bounding-box of the region may be explicitly set with --bb_wesn=lon_wst,lon_est,lat_sth,lat_nrt. The normal parsing of the bounding-box string from the suffix string may fail in (as yet undiscovered) corner cases, and the --bb_wesn option provides a workaround. The bounding-box string must include the entire RRG region, specified in WESN order. The two override options may be used independently or together, as in:

Code Block
ncremap --rnm_sng='_128e_to_134e_9s_to_16s' --bb_wesn='128.0,134.0,-16.0,-9.0' --dat_glb=dat_glb.nc --grd_glb=grd_glb.nc -g grd_rgn.nc dat_rgn.nc dat_rgr.nc
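
The relationship between the suffix string and the WESN bounding box in the example above can be sketched as a small parser. The helpers to_signed and bb_from_suffix are hypothetical, handle integer degrees only, and assume that west longitudes and south latitudes are written as negative numbers; ncremap's own parsing may differ, which is exactly why --bb_wesn exists as an override.

```shell
# to_signed TOKEN: convert 128e, 100w, 20n, or 9s to signed integer degrees.
# Assumption of this sketch: west and south are negative.
to_signed() {
    value=${1%[nsewNSEW]}
    case $1 in
        *[sSwW]) echo "-$value" ;;
        *)       echo "$value" ;;
    esac
}

# bb_from_suffix SUFFIX: print lon_wst,lon_est,lat_sth,lat_nrt (WESN order)
bb_from_suffix() {
    suffix=${1#_}                 # drop the leading underscore
    old_ifs=$IFS
    IFS=_
    set -- $suffix                # fields: lon to lon lat to lat
    IFS=$old_ifs
    lon_wst=$(to_signed "$1")
    lon_est=$(to_signed "$3")
    lat_a=$(to_signed "$4")
    lat_b=$(to_signed "$6")
    # The suffix may list the northern latitude first, so order the pair
    if [ "$lat_a" -le "$lat_b" ]; then
        lat_sth=$lat_a; lat_nrt=$lat_b
    else
        lat_sth=$lat_b; lat_nrt=$lat_a
    fi
    echo "${lon_wst}.0,${lon_est}.0,${lat_sth}.0,${lat_nrt}.0"
}

bb_from_suffix '_128e_to_134e_9s_to_16s'
```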

RRG-mode supports most normal ncremap options, including input and output methods and regridding algorithms.

Advanced Regridding III: Sub-Gridscale Regridding (SGS-mode)

ncremap has a sub-gridscale (SGS) mode that performs the special pre-processing and weighting necessary to conserve fields that represent fractional spatial portions of a gridcell, and/or fractional temporal periods of the analysis. Spatial fields output by most geophysical models are intensive, and so by default the regridder attempts to conserve the integral of the area times the field value such that the integral is equal on source and destination grids. However some models (like ELM, CLM, CICE, and MPAS-Seaice) output gridcell values intended to apply to only a fraction sgs_frc (for "sub-gridscale fraction") of the gridcell. The sub-gridscale fraction usually changes spatially with the distribution of land and ocean, and spatiotemporally with the distribution of sea ice and possibly vegetation. For concreteness consider a sub-grid field that represents the land fraction. Land fraction is less than one in gridcells that resolve coastlines or islands. ELM and CLM happily output temperature values valid only for a small (i.e., sgs_frc << 1) island within the larger gridcell. Model architecture dictates this behavior and savvy researchers expect it. The goal of the NCO weight-application algorithm is to treat SGS fields as seamlessly as possible so that those less familiar with sub-gridscale models can easily regrid them correctly.

Fortunately, models like ELM and CLM that run on the same horizontal grid as the overlying atmosphere can use the same mapping-file as the atmosphere, so long as the SGS weight-application procedure is invoked. Not invoking an SGS-aware weight application algorithm is equivalent to assuming sgs_frc = 1 everywhere. Regridding sub-grid values correctly versus incorrectly (e.g., with and without SGS-mode) alters global-mean answers for land-based quantities by about 1% for horizontal grid resolutions of about one degree. The resulting biases are in intricately shaped regions (coasts, lakes, sea-ice floes) and so are easy to overlook.

To invoke SGS mode and correctly regrid sub-gridscale data, specify the names of the fractional area sgs_frc and, if applicable, the mask variable sgs_msk (strictly, this is only necessary if these names differ from their respective defaults landfrac and landmask). Trouble will ensue if sgs_frc is a percentage or an absolute area rather than a fractional area (between zero and one). ncremap must know the normalization factor sgs_nrm by which sgs_frc must be divided (not multiplied) to obtain a true, normalized fraction. Datasets (such as those from CICE) that store sgs_frc in percent should specify the option --sgs_nrm=100 to instruct ncremap to normalize the sub-grid area appropriately before regridding. ncremap will re-derive sgs_msk based on the regridded values of sgs_frc: sgs_msk = 1 is assigned to destination gridcells with sgs_frc > 0.0, and all others sgs_msk = 0. As of NCO version 4.6.8 (released June, 2017), invoking any of the options --sgs_frc, --sgs_msk, or --sgs_nrm, automatically triggers SGS-mode, so that also invoking -P sgs is redundant though legal. As of NCO version 4.9.0 (released December, 2019), the values of the sgs_frc and sgs_msk variables should be explicitly specified. In previous versions they defaulted to landfrac and landmask, respectively, when -P sgs was selected. This behavior still exists but will likely be deprecated in a future version.
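
The normalization and mask rules in the paragraph above amount to two lines of arithmetic per gridcell: divide the stored value by sgs_nrm to get a true fraction, then set the mask to 1 wherever that fraction exceeds zero. The hypothetical helper normalize_sgs sketches this for a single value with awk; it illustrates the rule only and is not how ncremap processes whole fields.

```shell
# normalize_sgs RAW_VALUE SGS_NRM: print "fraction mask" for one gridcell.
# Hypothetical helper illustrating sgs_frc / sgs_nrm and the derived sgs_msk.
normalize_sgs() {
    awk -v raw="$1" -v nrm="$2" 'BEGIN {
        frc = raw / nrm               # true fraction, between zero and one
        msk = (frc > 0.0) ? 1 : 0     # mask is 1 only where fraction is positive
        printf "%.3f %d\n", frc, msk
    }'
}

normalize_sgs 25 100   # a value stored in percent, as with CICE
normalize_sgs 0 100    # an ice-free cell
```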

The area and sgs_frc fields in the regridded file will be in units of steradians and fraction, respectively. However, ncremap offers custom options to reproduce the idiosyncratic data and metadata format of two particular models, ELM and CICE. When invoked with -P elm (or -P clm), a final step converts the output area from steradians to square kilometers. When invoked with -P cice, the final step converts the output area from steradians to square meters, and the output sgs_frc from a fraction to a percent.

Code Block
# ELM/CLM: output "area" in [sr]
ncremap --sgs_frc=landfrac --sgs_msk=landmask in.nc out.nc
ncremap -P sgs in.nc out.nc # Deprecated in 4.9.0
# ELM/CLM pedantic format: output "area" in [km2]
ncremap -P elm in.nc out.nc # Same as -P clm, alm, ctsm

# CICE: output "area" in [sr]
ncremap --sgs_frc=aice --sgs_msk=tmask --sgs_nrm=100 in.nc out.nc
# CICE pedantic format: output "area" in [m2], sgs_frc in [%]
ncremap -P cice in.nc out.nc

As described above, three data manipulations (missing value annotation, missing value treatment, and dimension permutation) together define MPAS-mode. It can be difficult to learn the ocean mesh (i.e., grid) names and thus to find the appropriate pre-made map-files. The standard low-resolution ocean map-files for post-processing each version of MPAS are:

ncremap -P mpasocean --map=map_oEC60to30v3_to_cmip6_180x360_aave.20181001.nc mpov1.nc out.nc # MPAS Ocean v1
ncremap -P mpasocean --map=map_EC30to60E2r2_to_cmip6_180x360_aave.20220301.nc mpov2.nc out.nc # MPAS Ocean v2
ncremap -P mpasocean --map=map_IcoswISC30E3r5_to_cmip6_180x360_traave.20231201.nc mpov3.nc out.nc # MPAS Ocean v3

ncremap -P mpasseaice --map=map_oEC60to30v3_to_cmip6_180x360_aave.20181001.nc msiv1.nc out.nc # MPAS Seaice v1
ncremap -P mpasseaice --map=map_EC30to60E2r2_to_cmip6_180x360_aave.20220301.nc msiv2.nc out.nc # MPAS Seaice v2
ncremap -P mpasseaice --map=map_IcoswISC30E3r5_to_cmip6_180x360_traave.20231201.nc msiv3.nc out.nc # MPAS Seaice v3
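
Since the same three map-files serve both the ocean and sea-ice components, the version-to-map pairing above can be captured in a small lookup. The helper mpas_map is hypothetical; the map-file names are copied from the commands above, and the sketch only prints a command rather than running it.

```shell
# mpas_map VERSION: print the standard low-resolution map-file for v1, v2, or v3.
# Hypothetical helper; names are taken from the commands listed above.
mpas_map() {
    case $1 in
        v1) echo map_oEC60to30v3_to_cmip6_180x360_aave.20181001.nc ;;
        v2) echo map_EC30to60E2r2_to_cmip6_180x360_aave.20220301.nc ;;
        v3) echo map_IcoswISC30E3r5_to_cmip6_180x360_traave.20231201.nc ;;
        *) echo "no standard map listed for MPAS version: $1" >&2; return 1 ;;
    esac
}

# Print (not run) the command for MPAS Ocean v2 output in a placeholder file in.nc
echo "ncremap -P mpasocean --map=$(mpas_map v2) in.nc out.nc"
```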

Advanced Regridding IV: EAMXX-mode

EAMXX storage conventions and files differ from EAM (and CAM) in only two ways. ncremap introduced a -P eamxx mode to support these gotchas in 2022. Dimension permutation is the primary pre-processing step necessary to massage EAMXX datasets so that they are amenable to regridding.

Code Block
ncremap -P eamxx -m map.nc dat_src.nc dat_rgr.nc # Automatically permute dimensions for horizontal regridding 

First, ncremap requires the horizontal spatial dimension(s), whether latitude and longitude or some unstructured dimension, to be the final (most-rapidly-varying) dimension(s) of input datasets. EAMXX datasets natively place their horizontal spatial dimension (typically ncol) closer to the least-rapidly-varying position. While this makes perfect sense from an I/O-efficiency point-of-view for unstructured models, it does not play well with horizontal regridding. Hence EAMXX-mode permutes the input dimensions to a regridder-friendly order (i.e., ending with ncol) with a command of the form

Code Block
ncpdq -a ilev,lev,plev,dim2,ncol dat_src.nc dat_tmp.nc

Second, EAMXX names the surface pressure variable ps by default, not PS as in EAM. The distinction is important whenever vertical interpolation is invoked. Hence EAMXX mode automatically tells the vertical interpolation routine to look for ps not PS. The combination of these two pre-processing steps defines EAMXX-mode.
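
The dimension-order requirement behind the first step can be stated as a one-line test: the horizontal dimension must be the last entry in a variable's dimension list. The helper horiz_is_last and the example dimension orders are hypothetical; the sketch checks a comma-separated list written slowest-varying first and does not read any netCDF file.

```shell
# horiz_is_last DIM_LIST HORIZ_DIM: succeed if HORIZ_DIM is the final
# (most-rapidly-varying) entry of a comma-separated dimension list.
horiz_is_last() {
    last=${1##*,}          # text after the final comma
    [ "$last" = "$2" ]
}

dims="time,ncol,lev"       # hypothetical order with ncol not in last position
if horiz_is_last "$dims" ncol; then
    echo "ready to regrid"
else
    echo "permute first"
fi
```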

Advanced Regridding III: Sub-Gridscale Regridding (SGS-mode)

ncremap has a sub-gridscale (SGS) mode that performs the special pre-processing and weighting necessary to conserve fields that represent fractional spatial portions of a gridcell, and/or fractional temporal periods of the analysis. Spatial fields output by most geophysical models are intensive, and so by default the regridder attempts to conserve the integral of the area times the field value such that the integral is equal on source and destination grids. However some models (like ELM, CLM, CICE, and MPAS-Seaice) output gridcell values intended to apply to only a fraction sgs_frc (for "sub-gridscale fraction'') of the gridcell. The sub-gridscale fraction usually changes spatially with the distribution of land and ocean, and spatiotemporally with the distribution of sea ice and possibly vegetation. For concreteness consider a sub-grid field that represents the land fraction. Land fraction is less than one in gridcells that resolve coastlines or islands. ELM and CLM happily output temperature values valid only for a small (i.e., sgs_frc << 1) island within the larger gridcell. Model architecture dictates this behavior and savvy researchers expect it. The goal of the NCO weight-application algorithm is to treat SGS fields as seamlessly as possible so that those less familiar with sub-gridscale models can easily regrid them correctly.

Fortunately, models like ELM and CLM that run on the same horizontal grid as the overlying atmosphere can use the same mapping-file as the atmosphere, so long as the SGS weight-application procedure is invoked. Not invoking an SGS-aware weight application algorithm is equivalent to assuming sgs_frc = 1 everywhere. Regridding sub-grid values correctly versus incorrectly (e.g., with and without SGS-mode) alters global-mean answers for land-based quantities by about 1% for horizontal grid resolutions of about one degree. The resulting biases are in intricately shaped regions (coasts, lakes, sea-ice floes) and so are easy to overlook.
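The effect of ignoring sgs_frc can be illustrated with a toy calculation. The sketch below is not NCO's implementation: it assumes that SGS-aware weighting amounts to weighting each source value by its overlap weight times its sub-gridscale fraction, and the function name sgs_average and all input numbers are made up for illustration. Two source cells overlap one destination cell equally; one is fully land, the other holds a small island.

```shell
# Sketch: compare naive and SGS-aware averaging onto one destination cell.
# Each input line: overlap_weight sgs_frc field_value (illustrative numbers only)
sgs_average() {
    awk '
    { wgt = $1; frc = $2; val = $3
      sum_w   += wgt               # total overlap weight
      sum_wv  += wgt * val         # naive weighted sum (assumes sgs_frc = 1)
      sum_wf  += wgt * frc         # overlap weight scaled by fraction
      sum_wfv += wgt * frc * val } # SGS-aware weighted sum
    END {
      printf "naive=%.2f sgs=%.2f frc_out=%.3f\n",
             sum_wv / sum_w, sum_wfv / sum_wf, sum_wf / sum_w
    }'
}

# Cell 1: all land at 300 K. Cell 2: island covering 10% of the cell at 280 K.
printf '%s\n' '0.5 1.0 300.0' '0.5 0.1 280.0' | sgs_average
# prints: naive=290.00 sgs=298.18 frc_out=0.550
```

The naive mean gives the island as much influence as the fully land-covered cell, whereas the fraction-weighted mean is dominated by the cell that actually contains most of the land.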

To invoke SGS mode and correctly regrid sub-gridscale data, specify the names of the fractional area sgs_frc and, if applicable, the mask variable sgs_msk (strictly, this is only necessary if these names differ from their respective defaults landfrac and landmask). Trouble will ensue if sgs_frc is a percentage or an absolute area rather than a fractional area (between zero and one). ncremap must know the normalization factor sgs_nrm by which sgs_frc must be divided (not multiplied) to obtain a true, normalized fraction. Datasets (such as those from CICE) that store sgs_frc in percent should specify the option --sgs_nrm=100 to instruct ncremap to normalize the sub-grid area appropriately before regridding. ncremap will re-derive sgs_msk based on the regridded values of sgs_frc: sgs_msk = 1 is assigned to destination gridcells with sgs_frc > 0.0, and all others sgs_msk = 0. As of NCO version 4.6.8 (released June, 2017), invoking any of the options --sgs_frc, --sgs_msk, or --sgs_nrm, automatically triggers SGS-mode, so that also invoking -P sgs is redundant though legal. As of NCO version 4.9.0 (released December, 2019), the values of the sgs_frc and sgs_msk variables should be explicitly specified. In previous versions they defaulted to landfrac and landmask, respectively, when -P sgs was selected. This behavior still exists but will likely be deprecated in a future version.
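The normalization and mask rules described above reduce to simple arithmetic. The following sketch only illustrates that arithmetic for a single value; the helper name sgs_normalize is hypothetical and the real work is done inside ncremap on whole fields.

```shell
# sgs_normalize SGS_FRC SGS_NRM
# Print the normalized fraction and the mask that would be derived from it.
sgs_normalize() {
    awk -v frc="$1" -v nrm="$2" 'BEGIN {
        frc_nrm = frc / nrm              # divide (not multiply) by sgs_nrm
        msk = (frc_nrm > 0.0) ? 1 : 0    # mask is 1 wherever the fraction is positive
        printf "sgs_frc=%.3f sgs_msk=%d\n", frc_nrm, msk
    }'
}

sgs_normalize 37.5 100   # a value stored in percent: prints sgs_frc=0.375 sgs_msk=1
sgs_normalize 0.0 100    # no sub-grid area present:  prints sgs_frc=0.000 sgs_msk=0
```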

The area and sgs_frc fields in the regridded file will be in units of steradians and fraction, respectively. However, ncremap offers custom options to reproduce the idiosyncratic data and metadata format of two particular models, ELM and CICE. When invoked with -P elm (or -P clm), a final step converts the output area from steradians to square kilometers. When invoked with -P cice, the final step converts the output area from steradians to square meters, and the output sgs_frc from a fraction to a percent.

Code Block
# CICE pedantic format: output "area" in [m2], "aice" in [%]
ncremap -P cice in.nc out.nc

# MPAS-Seaice: both commands are equivalent
ncremap -P mpasseaice in.nc out.nc
ncremap --sgs_frc=timeMonthly_avg_iceAreaCell in.nc out.nc

It is sometimes convenient to store the sgs_frc field in an external file from the field(s) to be regridded. For example, CMIP-style timeseries are often written with only one variable per file. NCO supports this organization by accepting sgs_frc arguments in the form of a filename followed by a slash and then a variable name:

Code Block
ncremap --sgs_frc=sgs_landfrac_ne30.nc/landfrac -m map.nc in.nc out.nc

Files regridded using explicitly specified SGS options will differ slightly from those regridded using the -P elm or -P cice options. The former will have an area field in steradians, the generic units used internally by the regridder. The latter produce model-specific area fields in square kilometers (for ELM) or square meters (for CICE), as expected in the raw output from these two models. To convert from angular to areal values, NCO assumes a spherical Earth with radius 6,371,220 m or 6,371,229 m, for ELM and CICE, respectively. The output sgs_frc field is expressed as a decimal fraction in all cases except for -P cice, which stores the fraction in percent. Thus the generic SGS and model-specific convenience options produce equivalent results, and the latter are intended to be indistinguishable (in terms of metadata and units) from raw model output. This makes them more interoperable with many existing analysis scripts.
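The angular-to-areal conversion is the solid angle times the squared radius. A minimal sketch of that arithmetic follows, using the two radii quoted above; the helper name sr_to_area and its scale argument are illustrative and not part of NCO.

```shell
# sr_to_area AREA_SR RADIUS_M SCALE
# Area on a sphere of radius RADIUS_M [m] subtending AREA_SR [sr].
# SCALE=1000000 yields km2, SCALE=1 yields m2.
sr_to_area() {
    awk -v sr="$1" -v rds="$2" -v scl="$3" \
        'BEGIN { printf "%.6e\n", sr * rds * rds / scl }'
}

four_pi=12.566370614359172                 # solid angle of the whole sphere [sr]
sr_to_area "${four_pi}" 6371220 1000000    # ELM convention: whole sphere in km2
sr_to_area "${four_pi}" 6371229 1          # CICE convention: whole sphere in m2
```

Both calls should return roughly the surface area of the Earth (about 5.1e8 km2, or 5.1e14 m2), which is a quick sanity check on the units of a regridded area field.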

...

# ELM/CLM: output "area" in [sr]
ncremap --sgs_frc=landfrac --sgs_msk=landmask -m map.nc in.nc out.nc
ncremap -P sgs -m map.nc in.nc out.nc # Deprecated in 4.9.0
# ELM/CLM pedantic format: output "area" in [km2]
ncremap -P elm -m map.nc in.nc out.nc # Same as -P clm, alm, ctsm

# CICE: output "area" in [sr]
ncremap --sgs_frc=aice --sgs_msk=tmask --sgs_nrm=100 -m map.nc in.nc out.nc
# CICE pedantic format: output "area" in [m2], "aice" in [%]
ncremap -P cice -m map.nc in.nc out.nc

# MPAS-Seaice:
ncremap -P mpasseaice -m map.nc in.nc out.nc # Preferred (because it is forward-compatible)
ncremap -P mpas --sgs_frc=timeMonthly_avg_iceAreaCell -m map.nc in.nc out.nc # Equivalent to above

Advanced Regridding IV: Regional Unstructured Output (RRG-mode)

EAM (and CAM-SE) will produce regional output if requested to with the finclNlonlat namelist parameter. Output for a single region can be higher temporal resolution than the host global simulation. This facilitates detailed yet economical regional process studies. Regional output files are in a special format that we call RRG (for "regional regridding"). An RRG file may contain any number of rectangular regions. The coordinates and variables for one region do not interfere with other (possibly overlapping) regions because all variables and dimensions are named with a per-region suffix string, e.g., lat_128e_to_134e_9s_to_16s. ncremap can easily regrid RRG 2D logically rectangular output from an FV-dycore because ncremap can infer (as discussed above) the grid from any well-annotated regional FV data file. Regridding unstructured regional grid data, however, is more complex because unstructured grids lack cell vertices and unstructured grid weight-generators are not yet flexible enough to output only regional (as opposed to global) grids with weights. To summarize, regridding RRG data leads to three difficulties specific to unstructured grids (#1-3 below) and two difficulties (#4-5) shared with FV RRG files:

  1. RRG files contain only regional gridcell center locations, not weights

  2. Global SE grids have well-defined weights not vertices for each gridpoint

  3. Grid generation software (ESMF and TempestRemap) only create global not regional SE grid files

  4. Non-standard variable names and dimension names

  5. Regional files can contain multiple regions

ncremap's RRG mode resolves these issues to allow trouble-free regridding of SE RRG files. The user must provide two additional input arguments, --dat_glb=dat_glb and --grd_glb=grd_glb to point to a global SE dataset and grid, respectively, of the same resolution as the model that generated the RRG datasets. Hence a typical RRG regridding invocation is:

Code Block
ncremap --dat_glb=dat_glb.nc --grd_glb=grd_glb.nc -g grd_rgn.nc dat_rgn.nc dat_rgr.nc

Here grd_rgn is a regional destination grid-file, dat_rgn is the RRG file to regrid, and dat_rgr is the regridded output. Typically grd_rgn is a uniform rectangular grid covering the same region as the RRG file. Generate this as described in the last example in the section above on "Manual Grid-file Generation". grd_glb is the standard dual-grid grid-file for the SE resolution of the simulation, e.g., ne30np4_pentagons.091226.nc. ncremap regrids the global data file dat_glb to the global dual-grid in order to produce an intermediate global file annotated with gridcell vertices. Then it hyperslabs the lat/lon coordinates (and vertices) from the regional domain to use with regridding the RRG file. A dat_glb file with only one 2D field suffices (and is fastest) for producing the information needed by the RRG procedure. One can prepare an optimal dat_glb file by subsetting any 2D variable (e.g., ncks -v FSNT in.nc dat_glb.nc) from a full global SE output dataset.

ncremap RRG mode supports two additional options to override parameters set internally. First, the per-region suffix string may be set with --rnm_sng=rnm_sng. RRG mode will, by default, regrid the first region it finds in an RRG file. Explicitly set the desired region with rnm_sng for files with multiple regions, e.g., --rnm_sng='_128e_to_134e_9s_to_16s'. Second, the bounding-box of the region may be explicitly set with --bb_wesn=lon_wst,lon_est,lat_sth,lat_nrt. The normal parsing of the bounding-box string from the suffix string may fail in (as yet undiscovered) corner cases, and the --bb_wesn option provides a workaround. The bounding-box string must include the entire RRG region, specified in WESN order. The two override options may be used independently or together, as in:

Code Block
ncremap --rnm_sng='_128e_to_134e_9s_to_16s' --bb_wesn='128.0,134.0,-16.0,-9.0' --dat_glb=dat_glb.nc --grd_glb=grd_glb.nc -g grd_rgn.nc dat_rgn.nc dat_rgr.nc
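To show how a suffix string relates to a WESN bounding box, here is a hedged sketch that derives one from the other. The helper rnm_to_bb is hypothetical and is not how ncremap parses suffixes internally; it assumes the suffix has the form _<lon>[ew]_to_<lon>[ew]_<lat>[ns]_to_<lat>[ns] and maps west longitudes and south latitudes to negative degrees, which may not match every dataset's longitude convention.

```shell
# rnm_to_bb SUFFIX
# Convert a region suffix such as '_128e_to_134e_9s_to_16s' into 'W,E,S,N' degrees.
rnm_to_bb() {
    printf '%s\n' "$1" | awk -F'_' '
    function deg(tok,   num, hms) {
        hms = substr(tok, length(tok), 1)           # hemisphere letter: e, w, n, or s
        num = substr(tok, 1, length(tok) - 1) + 0   # numeric part
        if (hms == "w" || hms == "s") num = -num    # assumed sign convention
        return num
    }
    {
        # Fields after splitting on "_": "", lon, "to", lon, lat, "to", lat
        lon_wst = deg($2); lon_est = deg($4)
        lat_a = deg($5); lat_b = deg($7)
        lat_sth = (lat_a < lat_b) ? lat_a : lat_b   # order latitudes south to north
        lat_nrt = (lat_a < lat_b) ? lat_b : lat_a
        printf "%.1f,%.1f,%.1f,%.1f\n", lon_wst, lon_est, lat_sth, lat_nrt
    }'
}

rnm_to_bb '_128e_to_134e_9s_to_16s'   # prints 128.0,134.0,-16.0,-9.0
```

The printed string matches the --bb_wesn value used in the command above, so a helper like this can be used to cross-check a hand-written bounding box against the region suffix.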

RRG-mode supports most normal ncremap options, including input and output methods and regridding algorithms.

Advanced Regridding V: Make All Weight Files (MWF-mode)

As mentioned above in the TempestRemap section, ncremap includes an MWF-mode (for "Make All Weight Files") that generates and names, with one command and in a self-consistent manner, all combinations of E3SM global atmosphere<->ocean maps with both ERWG and Tempest. MWF-mode automates the laborious and error-prone process of generating numerous map-files with various switches. Its chief use occurs when developing and testing new global grid-pairs for the E3SM atmosphere and ocean components. Invoke MWF-mode with a number of specialized options to control the naming of the output map-files:

Code Block
ncremap -P mwf -s grd_ocn -g grd_atm --nm_src=ocn_nm --nm_dst=atm_nm --dt_sng=date

where grd_ocn is the "global" ocean grid, grd_atm is the global atmosphere grid, nm_src sets the shortened name for the source (ocean) grid as it will appear in the output map-files, nm_dst sets, similarly, the shortened name for the destination (atmosphere) grid, and dt_sng sets the date-stamp in the output map-file name map_${nm_src}_to_${nm_dst}_${alg_typ}.${dt_sng}.nc. Setting nm_src, nm_dst, and dt_sng is optional though highly recommended. For example,

% ncremap -P mwf -s ${DATA}/grids/ocean.QU.240km.scrip.181106.nc -g ${DATA}/grids/cmip6_180x360_scrip.20181001.nc --nm_src=QU240 --nm_dst=cmip6_180x360 --dt_sng=20240220

With --nm_src=oRRS30to10, --nm_dst=T62, and --dt_sng=20180901, MWF-mode produces the 10 ERWG map-files:

Code Block
map_oRRS30to10_to_T62_aave.20180901.nc
map_oRRS30to10_to_T62_blin.20180901.nc
map_oRRS30to10_to_T62_ndtos.20180901.nc
map_oRRS30to10_to_T62_nstod.20180901.nc
map_oRRS30to10_to_T62_patc.20180901.nc
map_T62_to_oRRS30to10_aave.20180901.nc
map_T62_to_oRRS30to10_blin.20180901.nc
map_T62_to_oRRS30to10_ndtos.20180901.nc
map_T62_to_oRRS30to10_nstod.20180901.nc
map_T62_to_oRRS30to10_patc.20180901.nc

The ordering of source and destination grids is immaterial for ERWG maps since MWF-mode produces all map combinations. However, as described above in the TempestRemap section, the Tempest overlap-mesh generator must be called with the smaller grid preceding the larger grid. For this reason, always invoke MWF-mode with the smaller grid (i.e., the ocean) as the source, otherwise some Tempest map-files will fail to generate. The six optimized SE<->FV Tempest maps described above in the TempestRemap section will be generated when the destination grid has a ".g" suffix, which ncremap interprets as indicating an Exodus-format SE grid (NB: this assumption is an implementation convenience that can be modified if necessary). For example,

Code Block
ncremap -P mwf -s ocean.RRS.30-10km_scrip_150722.nc -g ne30.g --nm_src=oRRS30to10 --nm_dst=ne30np4 --dt_sng=20180901

produces the 6 TempestRemap map-files:

Code Block
map_oRRS30to10_to_ne30np4_monotr.20180901.nc
map_oRRS30to10_to_ne30np4_highorder.20180901.nc
map_oRRS30to10_to_ne30np4_mono.20180901.nc
map_ne30np4_to_oRRS30to10_mono.20180901.nc
map_ne30np4_to_oRRS30to10_highorder.20180901.nc
map_ne30np4_to_oRRS30to10_intbilin.20180901.nc

MWF-mode takes significant time to complete (~20 minutes on my MacBookPro) for the above grids. The QU240 and cmip6_180x360 example above produces the sixteen map-files listed below. Note that the (v3 standard) map-file name is map_<src>_to_<dst>_<alg>.YYYYMMDD.nc. Not coincidentally, ncremap MWF (Make-Weight-File) mode generates maps for exactly these eight algorithms (in both directions). Many researchers do not want the global->ocean direction map-files. Those can simply be deleted.

% ls map*
map_QU240_to_cmip6_180x360_esmfaave.20240220.nc map_cmip6_180x360_to_QU240_esmfaave.20240220.nc
map_QU240_to_cmip6_180x360_esmfbilin.20240220.nc map_cmip6_180x360_to_QU240_esmfbilin.20240220.nc
map_QU240_to_cmip6_180x360_ncoaave.20240220.nc map_cmip6_180x360_to_QU240_ncoaave.20240220.nc
map_QU240_to_cmip6_180x360_ncoidw.20240220.nc map_cmip6_180x360_to_QU240_ncoidw.20240220.nc
map_QU240_to_cmip6_180x360_traave.20240220.nc map_cmip6_180x360_to_QU240_traave.20240220.nc
map_QU240_to_cmip6_180x360_trbilin.20240220.nc map_cmip6_180x360_to_QU240_trbilin.20240220.nc
map_QU240_to_cmip6_180x360_trfv2.20240220.nc map_cmip6_180x360_to_QU240_trfv2.20240220.nc
map_QU240_to_cmip6_180x360_trintbilin.20240220.nc map_cmip6_180x360_to_QU240_trintbilin.20240220.nc

For a subset of these maps, use the --alg_lst option, e.g.,
% ncremap -P mwf --alg_lst=esmfbilin,ncoidw,traave,trbilin -s ${DATA}/grids/ocean.QU.240km.scrip.181106.nc -g ${DATA}/grids/cmip6_180x360_scrip.20181001.nc --nm_src=QU240 --nm_dst=cmip6_180x360 --dt_sng=20240220
% ls map*
map_QU240_to_cmip6_180x360_esmfbilin.20240220.nc map_cmip6_180x360_to_QU240_esmfbilin.20240220.nc
map_QU240_to_cmip6_180x360_ncoidw.20240220.nc map_cmip6_180x360_to_QU240_ncoidw.20240220.nc
map_QU240_to_cmip6_180x360_traave.20240220.nc map_cmip6_180x360_to_QU240_traave.20240220.nc
map_QU240_to_cmip6_180x360_trbilin.20240220.nc map_cmip6_180x360_to_QU240_trbilin.20240220.nc
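The naming pattern in these listings can be reproduced without running the regridder, which is handy for checking that an MWF run produced every expected file. The sketch below is only a name generator under the naming template stated earlier (map_${nm_src}_to_${nm_dst}_${alg_typ}.${dt_sng}.nc); the function mwf_names is hypothetical and does not call ncremap.

```shell
# mwf_names NM_SRC NM_DST DT_SNG ALG_LST
# Print the map-file names expected for a comma-separated algorithm list,
# in both the source->destination and destination->source directions.
mwf_names() {
    nm_src=$1; nm_dst=$2; dt_sng=$3; alg_lst=$4
    for alg_typ in $(printf '%s' "${alg_lst}" | tr ',' ' '); do
        printf 'map_%s_to_%s_%s.%s.nc\n' "${nm_src}" "${nm_dst}" "${alg_typ}" "${dt_sng}"
        printf 'map_%s_to_%s_%s.%s.nc\n' "${nm_dst}" "${nm_src}" "${alg_typ}" "${dt_sng}"
    done
}

# Same subset as the --alg_lst example above: prints eight names
mwf_names QU240 cmip6_180x360 20240220 esmfbilin,ncoidw,traave,trbilin
```

Piping the output through a loop that tests each name with [ -f ] would flag any map-file that failed to generate.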

MWF-mode can take significant time to complete. To accelerate this, consider installing the MPI-enabled versions of ERWG and MBTR instead of the serial versions. Then use the --wgt_cmd option to tell ncremap the MPI configuration with which to invoke ERWG, for example:

...

Background and distributed node parallelism (as described above in the Parallelism section) of MWF-mode are possible though not yet implemented. Please let us know if this feature is desired.

Advanced Regridding VI: CMIP6 Timeseries

This section describes the recommended procedures to construct and regrid E3SM timeseries data to CMIP6 specifications. Most models provide data to CMIP6 in timeseries format, meaning one variable-per-file with multiple years per file. These timeseries must be regridded to at least one of the CMIP6 standard grids. The E3SM project chose to supply its v1 experiments to CMIP6 archived on rectangular, uniform (i.e., equiangular in latitude and longitude), one-degree (for standard-resolution) and quarter-degree (for high-resolution) grids. Generating these timeseries from experiments as lengthy as 500 model years, formatted to CMIP6 specifications, requires many non-standard options to both ncclimo (to construct the timeseries) and to ncremap (to regrid timeseries), and is a natural capstone exercise in using both together. This section is arranged in reverse order where first we present the final actual commands, followed by the descriptions, meanings, and reasons for particular options.

...

Code Block
# EAM/ELM:
drc_in='/p/user_pub/work/E3SM/1_0/piControl/1deg_atm_60-30km_ocean/atmos/native/model-output/mon/ens1/v1' # Input directory
drc_out="${DATA}/ne30/clm" # Native grid output directory
drc_rgr="${DATA}/ne30/rgr" # Regridded output directory
drc_tmp='/p/cscratch/acme/zender1/tmp' # Temporary directory for intermediate files
map="${DATA}/maps/map_ne30np4_to_cmip6_180x360_aave.20181001.nc" # Regridding map-file
cmip6_opt='-7 --dfl_lvl=1 --no_cll_msr --no_frm_trm --no_stg_grd' # CMIP6-specific options
spl_opt='--yr_srt=1 --yr_end=500 --ypf=500' # 2D Splitter options
vars='FSNT' # 2D
#spl_opt='--yr_srt=1 --yr_end=500 --ypf=25' # 3D Splitter options
#vars='T' # 3D
export TMPDIR=${drc_tmp};cd ${drc_in};/bin/ls 20180129.DECKv1b_piControl.ne30_oEC.edison.cam.h0.0???-*.nc | ncclimo --var=${vars} ${cmip6_opt} ${spl_opt} --map=${map} --drc_out=${drc_out} --drc_rgr=${drc_rgr} > ~/ncclimo.atm 2>&1 &

# MPAS:
drc_in='/p/user_pub/work/E3SM/1_0/piControl/1deg_atm_60-30km_ocean/ocean/native/model-output/mon/ens1/v1' # Input directory
drc_out="${DATA}/ne30/clm" # Native grid output directory
drc_rgr="${DATA}/ne30/rgr" # Regridded output directory
drc_tmp='/p/cscratch/acme/zender1/tmp' # Temporary/intermediate-file directory
map="${DATA}/maps/map_oEC60to30v3_to_cmip6_180x360_aave.20181001.nc" # Regridding map-file
mpas_opt='-m mpas --d2f' # MPAS-specific options
cmip6_opt='-7 --dfl_lvl=1 --no_cll_msr --no_frm_trm --no_stg_grd' # CMIP6-specific options
spl_opt='--yr_srt=1 --yr_end=500 --ypf=500' # 2D Splitter options
vars='timeMonthly_avg_longWaveHeatFluxUp' # 2D
#spl_opt='--yr_srt=1 --yr_end=500 --ypf=25' # 3D Splitter options
#vars='timeMonthly_avg_activeTracers_temperature' # 3D
export TMPDIR=${drc_tmp};cd ${drc_in};/bin/ls mpaso.hist.am.timeSeriesStatsMonthly.0???-*.nc | ncclimo --var=${vars} ${mpas_opt} ${cmip6_opt} ${spl_opt} --map=${map} --drc_out=${drc_out} --drc_rgr=${drc_rgr} > ~/ncclimo.ocn 2>&1 &

Take a moment to compare the methods for EAM and for MPAS. They are nearly identical except for the variable names, experiment names and directories, map-files (so far nothing surprising or important) AND the additional MPAS options in ${mpas_opt}. We will discuss those soon. Each command-set begins with setting experiment-dependent I/O directories and a map-file. Other experiments will require changing these to the appropriate I/O directories, yet the map-file remains the same unless the native or destination grid changes. The next three or four lines in each command-set configure the splitter and regridder with options that many ncclimo/ncremap users have never before tried. Finally the list of input files and all the configuration options are sent to ncclimo. The entire procedure for the user boils down to creating then executing this one splitter command for each desired variable.
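The splitter options determine how many timeseries files each variable yields. The sketch below assumes that --ypf sets the number of years per output file and that segments are labeled by their first and last months; the helper split_segments is hypothetical and only enumerates the year ranges, without invoking ncclimo.

```shell
# split_segments YR_SRT YR_END YPF
# Print one YYYYMM_YYYYMM label per output segment.
split_segments() {
    yr_srt=$1; yr_end=$2; ypf=$3
    yr=${yr_srt}
    while [ "${yr}" -le "${yr_end}" ]; do
        yr_lst=$((yr + ypf - 1))                       # last year of this segment
        if [ "${yr_lst}" -gt "${yr_end}" ]; then
            yr_lst=${yr_end}                           # final segment may be shorter
        fi
        printf '%04d01_%04d12\n' "${yr}" "${yr_lst}"
        yr=$((yr_lst + 1))
    done
}

split_segments 1 500 500   # 2D settings above: a single 500-year segment
split_segments 1 500 25    # 3D settings above: twenty 25-year segments
```

This makes the trade-off in the commented-out 3D options concrete: the smaller --ypf keeps each 3D file manageable at the cost of producing twenty files per variable instead of one.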

...

The main difference between generating the timeseries for the Historical ensemble and the DECK PI experiment is the need to loop over the ensemble. Here the splitter command is not backgrounded so that one member experiment is processed at a time (to avoid overwhelming nodes with I/O and RAM demands). Set the input directory in the ensemble loop, and ensure the globbing pattern for filenames matches the naming convention used for all five members. Consider whether to output to member-specific directories or to a single, ensemble-wide directory. If the former, then nothing special need be done. If the latter, use the --fml_nm (family-name) option as above to avoid identical timeseries names (that will overwrite one another) and to create instead member-specific timeseries names like

CLDLOW_H1_185001_201412.nc
CLDLOW_H2_185001_201412.nc
...
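The ensemble loop described above might be organized as in the following dry-run sketch, which prints one splitter command per member rather than executing it. The input directory layout, the *.nc globbing pattern, and the function name print_member_commands are placeholders, and the year range is inferred from the example filenames; ${map}, ${drc_out}, and ${drc_rgr} are left unexpanded for the user to define.

```shell
# Dry run: print (do not execute) one splitter command per ensemble member.
print_member_commands() {
    var='CLDLOW'
    for mbr in 1 2 3 4 5; do
        fml_nm="H${mbr}"                          # member-specific family name
        drc_in="/path/to/historical/ens${mbr}"    # hypothetical input directory
        echo "cd ${drc_in} && ls *.nc | ncclimo --var=${var} --fml_nm=${fml_nm} --yr_srt=1850 --yr_end=2014 --ypf=165 --map=\${map} --drc_out=\${drc_out} --drc_rgr=\${drc_rgr}"
        echo "# expected timeseries name: ${var}_${fml_nm}_185001_201412.nc"
    done
}

print_member_commands
```

Because the commands are printed sequentially and not backgrounded, running them in order processes one member at a time, as recommended above.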

Advanced Regridding VII: Initial Condition Files

First, use the right tool. ncremap can regrid an initial conditions (IC) file, both vertically and horizontally. However, generating scientifically validated IC files for new model resolutions is best done with a lengthy workflow (https://acme-climate.atlassian.net/wiki/spaces/ED/pages/872579110/Adding+support+for+new+grids) in which regridding plays a relatively small role. That said, ncremap is a good tool to place an atmospheric state onto a new grid where it can then be nudged into a valid IC file, or to place the contents of an IC file on a rectangular grid (as shown below) where it is easier to plot. Regridding atmosphere IC files was straightforward until E3SM v2 when the atmosphere separated the dynamical and physical grids. This resulted in IC files containing two sets of grid variables, so now we must regrid v2 IC files with two successive invocations of ncremap:

...

ncremap -R '--rgr col_nm=ncol_d' --map=map_ne30np4_to_cmip6_180x360_nco.20200901.nc foo.eam.i.2001-01-01-00000.nc foo.rgr.nc

Advanced Regridding VIII: Fixing Grid Files to Work as Intended

Gridfiles store a wealth of highly precise information using loosely standardized rules that are open to interpretation. One may encounter gridfiles that conform to one regridder's expectations though not another's. This section provides guidance on how to adjust or repair some of the most frequently encountered problems with gridfiles. The problems currently addressed are floating-point masks, _FillValue, and imprecise RLL grids.

Floating-point mask variables (grid_imask) in SCRIP files are non-standard and may break some software. TempestRemap's GenerateOverlapMesh program, for example, breaks (as of this writing, 20210428) when asked to ingest a SCRIP grid-file with a floating-point mask. This problem can occur when using grid-files from TR's old ConvertExodusToSCRIP program, which outputs floating-point masks. The new TR program, ConvertMeshToSCRIP, appears to have fixed this problem (as of this writing in 20240110). Also, users often create masks from floating-point variables (as described in the next section) and inadvertently leave the mask as a floating-point variable. The solution to the problem of floating-point masks can be as simple as a straightforward conversion of the mask to an integer:

...

ncks -C -m -A -v grid_imask ${grid} grid.nc # Replace newly generated default mask with GLOBE/Gardner landmask
ncks -O -7 -L 1 grid.nc grid.nc # Optional step to compress large grids

Advanced Regridding IX: Creating and Using Land Surface Grids with Masks

Regridding software is commonly used to convert continuous fields into binary (True/False) masks for use within numerical models. Land surface models, for example, use a "land mask" to distinguish gridcells containing land (mask is True = 1) from non-land gridcells (mask is False = 0). Similarly, glacier models use an "ice mask" to distinguish gridcells available to the ice sheet from non-ice gridcells. This section describes the steps to follow and potential crevasses to avoid when manipulating fields with masks into (primarily SCRIP-format) grid-files that can then be used as described above in the creation of weights to map between grids.

...