Regridding E3SM Data with ncremap

This page is devoted to instruction in NCO’s regridding operator, ncremap. It describes steps necessary to create grids, and to regrid datasets between different grids with ncremap. Some of the simpler regridding options supported by ncclimo are also described at Generate, Regrid, and Split Climatologies (climo files) with ncclimo. This page describes those features in more detail, and other, more boutique features often useful for custom regridding solutions.

The Zen of Regridding

Most modern climate/weather-related research requires a regridding step in its workflow. The plethora of geometric and spectral grids on which model and observational data are stored ensures that regridding is usually necessary for scientific insight, especially in the focused and variable-resolution studies that E3SM models conduct. Why does such a common procedure seem so complex? Because a mind-boggling number of options are required to support advanced regridding features that many users never need. To defer that complexity, this HOWTO begins with solutions to the prototypical regridding problem, without mentioning any other options. It demonstrates how to solve that problem simply, including the minimal software installation required. Once the basic regridding vocabulary has been introduced, we solve the prototype problem when one or more inputs are "missing", or need to be created. The HOWTO ends with descriptions of different regridding modes and workflows that use features customized to particular models, observational datasets, and formats.

Software Requirements:

At a minimum, install a recent version of NCO on your executable $PATH with the corresponding library on your $LD_LIBRARY_PATH. NCO installation instructions are here. Unless you have reason to do otherwise, we recommend installing NCO through the Conda package. The Conda NCO package automagically installs two other important regridding tools, the ESMF_RegridWeightGen (aka ERWG) executable and the TempestRemap (aka TR) executables. Execute 'ncremap --config' to verify you have a working installation:

zender@aerosol:~$ ncremap --config
ncremap, the NCO regridder and grid, map, and weight-generator, version 4.9.3-alpha02 "Fuji Rolls"
...[Legal Stuff]...
Config: ncremap script located in directory /Users/zender/bin
Config: NCO binaries located in directory /Users/zender/bin, linked to netCDF library version 4.7.3
Config: No hardcoded machine-dependent path/module overrides. (If desired, turn-on NCO hardcoded paths at supported national labs with "export NCO_PATH_OVERRIDE=Yes").
Config: External (non-NCO) program availability:
Config: ESMF weight-generation command ESMF_RegridWeightGen version 7.1.0r found as /opt/local/bin/ESMF_RegridWeightGen
Config: MOAB-Tempest weight-generation command mbtempest not found
Config: MPAS depth coordinate addition command add_depth.py found as /Users/zender/bin/add_depth.py
Config: TempestRemap weight-generation command GenerateOfflineMap found as /usr/local/bin/GenerateOfflineMap

Only NCO is required for many operations, including applying existing regridding weights (aka, regridding) and generating grids, maps, or conservative weights with the NCO algorithms. Generating new weights (and map-files) with ERWG or TR requires that you install those packages (both of which come with the NCO Conda package). MOAB-Tempest (MBTR) is not yet available in a pre-packaged format and must currently be built from scratch (MBTR is expected to come as a Conda package sometime in 2020). MBTR is only required for power-users. Make sure ncremap reports a working configuration, as above, before proceeding further.

Prototypical Regridding I: Use Existing Map-file

The regridding problem most commonly faced is converting output from a standard-resolution model simulation to equivalent data on a different grid for visualization or intercomparison with other data. The EAM v1 low-resolution simulations are performed and output on the ne30np4 SE (spectral element) grid, aka the "source grid". The corresponding EAM v2 simulations were conducted on the ne30pg2 FV (finite volume) grid. E3SM source grids like these are called "unstructured" because they have only one horizontal dimension (i.e., 1D), which makes them difficult to visualize. The recommended 2D (latitude-longitude) grid for analysis (aka the "destination grid") of v1 simulations was the 129x256 Cap grid (the gridcells at the poles look like yarmulke caps); since v2 it is the 180x360 equi-angular grid used by CMIP. The single most important capability of a regridder is the intelligent application of weights that transform data on the input grid to the desired output grid. These weights are stored in a "map-file", a single file that contains all the necessary weights and grid information. While most regridding problems revolve around creating the appropriate map-file, this prototype problem assumes that has already been done, so the appropriate map-file (map.nc) already exists and ncremap can immediately transform the input dataset (dat_src.nc) to the output (regridded) dataset (dat_rgr.nc):

ncremap -m map.nc dat_src.nc dat_rgr.nc

This solution is deceptively simple because it conceals the choices, paths, and options required to create the appropriate map.nc for all situations. We will discuss creating map.nc later after showing more powerful and parallel ways to solve the prototype problem. The solution above only works for users savvy enough to know how to find appropriate pre-built map-files. E3SM map-files are available at https://web.lcrc.anl.gov/public/e3sm/inputdata/cpl/gridmaps/ . Many commonly used maps and grids can also be found in my (@czender's) directories as ~zender/data/[maps,grids] at most DOE High Performance Computing (HPC) centers. Take a minute now to look at these locations.

Pre-built map-files use the (nearly) standardized naming convention map_srcgrd_to_dstgrd_algtyp.YYYYMMDD.nc, where srcgrd and dstgrd are the source and destination grid names, algtyp is a shorthand for the numerical regridding algorithm, and YYYYMMDD is the date assigned to the map (i.e., the date the map was created). The source grid in the example above is called ne30np4, and the destination is called fv129x256. A pre-built map for the v1 combination is map.nc = map_ne30np4_to_fv129x256_aave.20150901.nc, and for v2 is map_ne30pg2_to_cmip6_180x360_aave.20200201.nc. What is aave? Weight generators can use about a dozen interpolation algorithms for regridding, and each has a shorthand name. For now, it is enough to know that the two most common algorithms are (first-order) conservative area-average regridding (aave) and bilinear interpolation (bilin or blin). Hence the following commands conservatively regrid dat_src.nc to dat_rgr.nc with first-order accuracy:

ncremap -m map_ne30np4_to_fv129x256_aave.20150901.nc dat_src.nc dat_rgr.nc       # EAMv1
ncremap -m map_ne30pg2_to_cmip6_180x360_aave.20200201.nc dat_src.nc dat_rgr.nc   # EAMv2
ncremap -m map_ne30pg2_to_cmip6_180x360_traave.20231201.nc dat_src.nc dat_rgr.nc # EAMv3

Before looking into map-file generation in the next section, try a few ncremap features. For speed's sake, regrid only selected variables:

ncremap -v FSNT,AODVIS -m map.nc dat_src.nc dat_rgr.nc

To regrid multiple files with a single command, supply ncremap with the source and regridded directory names (drc_src, drc_rgr). It regrids every file in drc_src and places the output in drc_rgr:
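For example, a sketch using the placeholder directory names from the text (here the -I and -O options supply the input and output directories):

```shell
# Regrid every file in drc_src and write the results to drc_rgr
ncremap -m map.nc -I drc_src -O drc_rgr
```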

Or supply specific input filenames on the command-line, piped through standard input, or redirected from a file:
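Sketches of each invocation style (filenames are placeholders, and file_list.txt is a hypothetical text file containing one input filename per line):

```shell
# Filenames on the command-line
ncremap -m map.nc -O drc_rgr dat_src1.nc dat_src2.nc
# Filenames piped through standard input
ls dat_src?.nc | ncremap -m map.nc -O drc_rgr
# Filenames redirected from a file
ncremap -m map.nc -O drc_rgr < file_list.txt
```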

When an output directory is not specified, ncremap writes to the current working directory. When the output and input directories are the same, ncremap appends a string (based on the destination grid resolution) to each input filename to avoid name collisions. Finally, be aware that multiple-file invocations of ncremap execute in parallel by default. Power users will want to tune this as described in the section on "Intermediate Regridding".

Prototypical Regridding II: Create Map-file from Known Grid-files

The simplest regridding procedure applies an existing map-file to your data, as in the above example. E3SM map-files are publicly available at https://web.lcrc.anl.gov/public/e3sm/inputdata/cpl/gridmaps/. At most DOE HPC centers these can also be found in my (@czender's) directory, ~zender/data/maps. If the desired map-file cannot be found, then you must create it. Creating a map-file requires a complete specification of both source and destination grids (meshes). The files that contain these grid specifications are called "grid-files". Many E3SM grid-files are publicly available within model-specific directories of the previous location, e.g., https://web.lcrc.anl.gov/public/e3sm/inputdata/ocn/mpas-o/oEC60to30v3/ . At most DOE HPC centers these can also be found in my (@czender's) directory, ~zender/data/grids. Take a minute now to look there for the prototype problem grid-files, i.e., for the FV 129x256 and ne30np4 grid-files.

You might find multiple grid-files that contain the string 129x256. Grid-file names are often ambiguous. The grid-file global metadata (ncks -M grid.nc) often display the provenance of the grid. These metadata, and sometimes the actual data (fxm: link), are usually more complete and/or accurate in files with a YYYYMMDD-format date-stamp. For example, the metadata in file 129x256_SCRIP.20150901.nc clearly state it is an FV grid and not some other type of grid with 129x256 resolution. The metadata in 129x256_SCRIP.130510 tell the user nothing about the grid boundaries, and some of the data are flawed. When grids seem identical except for their date-stamp, use the grid with the later date-stamp. The curious can examine a grid-file (ncks -M -m grid.nc) and easily see that it looks completely different from a typical model or observational data file. Grid-files and data-files are not interchangeable.

Multiple grid-files also contain the string ne30. These are either slightly different grids, or the same grids stored in different formats meant for different post-processing tools. The different SE (and FV) grid types are described, with figures, here (https://acme-climate.atlassian.net/wiki/spaces/Docs/pages/34113147/Atmosphere+Grids). As explained there, many people will want the "dual-grid" with pentagons. The correct grid-file for this is ne30np4_pentagons.091226.nc. Do not be tempted by SE grid-files named with latlon.

All grid-files discussed so far are in SCRIP-format, named for the Spherical Coordinate Remapping and Interpolation Package (authored by @pjones). Other formats exist and are increasingly important, especially for SE grids. For now just know that these other formats are also usually stored as netCDF, and that some tools allow non-SCRIP formats to be used interchangeably with SCRIP.

Once armed with source and destination grid-files, one can generate their map-file with
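A sketch with placeholder grid-file names (-s and -g supply the source and destination grid-files, and -m names the map-file to create):

```shell
ncremap -s grd_src.nc -g grd_dst.nc -m map.nc
```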

Regrid a datafile at the same time as generating the map-file for archival:
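A sketch (placeholder names): adding input and output data filenames regrids the data while saving the map-file named by -m:

```shell
ncremap -s grd_src.nc -g grd_dst.nc -m map.nc dat_src.nc dat_rgr.nc
```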

Regrid a datafile without archiving the (internally generated) map-file:
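A sketch (placeholder names): omitting -m causes the internally generated map-file to be discarded after use:

```shell
ncremap -s grd_src.nc -g grd_dst.nc dat_src.nc dat_rgr.nc
```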

For the prototype problem, the map-file generation procedure becomes
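A sketch using the prototype grid-files named earlier (the output map-file name, including its lack of a date-stamp, is illustrative):

```shell
# Generate weights from the SE dual-grid to the FV 129x256 grid with the default NCO algorithm
ncremap -s ne30np4_pentagons.091226.nc -g 129x256_SCRIP.20150901.nc \
        -m map_ne30np4_to_fv129x256_nco.nc
```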

The map-files above are named with alg_typ=nco because the ncremap default interpolation algorithm is the first-order conservative NCO algorithm (NB: before NCO 4.9.1 the default algorithm was ESMF bilin). To re-create the aave map in the first example, invoke ncremap with -a esmfaave (the newest v3 naming convention) or -a conserve (same algorithm, different name in v1, v2):
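A sketch (the map-file name is illustrative):

```shell
# v3 naming convention
ncremap -a esmfaave -s ne30np4_pentagons.091226.nc -g 129x256_SCRIP.20150901.nc \
        -m map_ne30np4_to_fv129x256_aave.nc
# Same algorithm, v1/v2-era name
ncremap -a conserve -s ne30np4_pentagons.091226.nc -g 129x256_SCRIP.20150901.nc \
        -m map_ne30np4_to_fv129x256_aave.nc
```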

This takes a few minutes, so save custom-generated map-files for future use. Computing weights to generate map-files is much more computationally expensive and time-consuming than regridding, i.e., than applying the weights in the map-file to the data. We will gloss over most options that weight-generators can take into consideration, because their default values often work well. One option worth knowing now is -a. The invocation synonyms for -a are --alg_typ, --algorithm, and --regrid_algorithm. These are listed in the square brackets in the self-help message that ncremap prints when it is invoked without argument, or with --help:

At least one valid option argument for each supported interpolation type is shown separated by vertical bars. The arguments shown have multiple synonyms, separated by commas, that are equivalent. For example, -a esmfaave is equivalent to -a aave and --alg_typ=conserve. Use the longer option form for clarity and precision, and the shorter form for conciseness. The full list of synonyms, and the complete documentation, is at http://nco.sf.net/nco.html#alg_typ. The NCO algorithm ncoaave is the default. Commonly-used algorithms that invoke ERWG are esmfbilin and esmfaave. TR options are discussed below. Peruse the list of options now, though defer a thorough investigation until you reach the "Intermediate Regridding" section.

Prototypical Regridding III: Infer Grid-file from Data-file

Thus far we have explained how to apply a map-file to data, and how, if necessary, to generate a map-file from known grids. What if there is no map-file and the source or the destination grid-files (or both) are unavailable? Often, one knows the basic information about a grid (e.g., resolution) but lacks the grid-file that contains the complete information for that grid geometry. In such cases, one must create the grid-file via one of two methods. First, one can let ncremap attempt to infer the grid-file from a data file known to be on the desired grid. This procedure is called "inferral" and is fairly painless. Second, one can feed ncremap all the required parameters (for rectangular grids only) and it will generate a grid-file. This requires a precise specification of the grid geometry, and will be covered in the sub-section on "Manual Grid-file Generation".

Before we describe what the inferral procedure does, here is an example that demonstrates how easy it is. You can regrid an SE (or FV) dataset from our prototype example to the same grid as an evaluation dataset. Pick any 2D (e.g., MxN latitude-by-longitude) dataset to compare the E3SM simulations to. Inferral uses the grid information in the evaluation dataset, which is already on the desired destination grid, to create the (internally generated) destination grid-file. Supply ncremap with any dataset on the desired destination grid (dat_dst.nc) with -d (for "dataset") instead of -g (for "grid"):
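A sketch with the placeholder names from the text (grd_src.nc stands for a known source grid-file; the dataset given to -d is used only to infer the destination grid):

```shell
ncremap -s grd_src.nc -d dat_dst.nc dat_src.nc dat_rgr.nc
```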

This tells ncremap to infer the destination grid-file and to use it to generate the desired map-file, named with the supposed destination resolution (here, MxN degrees or gridcells). To archive the inferred destination grid for later use, supply a name for the grid with -g:
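A sketch (placeholder names): -g now names the file in which to store the inferred destination grid:

```shell
ncremap -s grd_src.nc -d dat_dst.nc -g grd_dst.nc dat_src.nc dat_rgr.nc
```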

One can infer a grid without having to regrid anything. Supply ncremap with a data-template file (dat.nc) and a grid-file name (grd.nc). Since there are no input files to regrid, ncremap exits after inferring the grid-file:
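A sketch (placeholder names): with only -d and -g supplied, ncremap writes the inferred grid to grd.nc and exits:

```shell
ncremap -d dat.nc -g grd.nc
```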

Grid-inferral is easier to try than manual grid-generation, and will work if the data file contains the necessary information. The only data needed to construct a SCRIP grid-file are the vertices of each gridcell. The gridcell vertices define the gridcell edges, and these in turn define the gridcell area, which is equivalent to the all-important weight parameter necessary to regrid data. Of course the gridcell vertices must be stored with recognizable names and/or metadata indicators. The Climate & Forecast (CF) metadata convention calls the gridcell vertices the "cell-bounds". Coordinates (like latitude and longitude) usually store cell-center values, and should, according to CF, have bounds attributes whose values point to variables (e.g., lat_bounds or lon_vertices) that contain the actual vertices. Relatively few datasets "in the wild" contain gridcell vertices, though the practice is, happily, on the rise. Formally, SE models have nodal points with weights but no fixed perimeter or vertices assigned to the nodes, so the lack of vertices in SE model output is a feature, not an oversight. The dual-grid (referenced above) addresses this by defining "pretend" gridcell vertices for each nodal point so that an SE dataset can be treated like an FV dataset. However, dual-grids are difficult to generate, and may not exist or be accurate for many SE grids. In that case, ncremap cannot infer the grid (because the vertices are unknown) and one needs to use a different package (such as TempestRemap, below) to construct the grid-file and the mapping weights.

Inferral works well on important categories of grids for which ncremap can guess the missing grid information. In the absence of gridcell vertex information, ncremap examines the location of and spacing between gridcell centers and can often determine from these what type of grid a data-file (not a grid-file!) is stored on. A data-file here simply means a file of the type preferred for creation/distribution of modeled/observed data. Hence ncremap has the (original and unique, so far as we know) ability to infer all useful rectangular grid-types from data-files that employ the grid. The key constraint here is "rectangular", meaning the grid must be orthogonal (though not necessarily regularly spaced) in latitude and longitude. This includes all uniform angle grids, FV grids, and Gaussian grids. For curvilinear grids (including most swath data), ncremap infers the underlying grid to be the set of curves that bisect the curves created by joining the gridcell centers. This often works well for curvilinear grids that do not cross a pole. Inferral works for unstructured (i.e., 1D) grids only when the cell-bounds are stored in the datafile as described above. Hence inferral will not work on raw output from SE models.

A few more examples will flesh-out how inferral can be used. First, ncremap can infer both source and destination grids in one command:
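A sketch (placeholder names): both positional arguments are data files, and -d supplies a dataset on the destination grid:

```shell
ncremap -d dat_dst.nc dat_src.nc dat_rgr.nc
```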

Here the user provides only data files (no grid- or map-files) yet still obtains regridded output! The first positional (i.e., not immediately preceded by an option indicator) argument (dat_src.nc) is interpreted as the data to regrid, and the second positional argument (dat_rgr.nc) is the output name. The -d argument is the name of any dataset (dat_dst.nc) on the desired destination grid. ncremap infers the source grid from dat_src.nc, then infers the destination grid from dat_dst.nc, then generates weights (with the default algorithm since none is specified) and creates a map-file that it uses to regrid dat_src.nc to dat_rgr.nc. No grid-file or map-file names were specified (with -g or -m) so both grid-files and the map-file are generated internally in temporary locations and erased after use.

Second, this procedure, like most ncremap features, works for multiple input files:
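A sketch (placeholder input filenames; -O supplies the output directory):

```shell
ncremap -d dat_dst.nc -O drc_rgr dat_src1.nc dat_src2.nc dat_src3.nc
```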

Unless a map-file or source grid-file is explicitly provided (with -m or -s, respectively), ncremap infers a separate source grid-file (and computes a corresponding map-file) for each input file. This allows it to regrid lists of uniquely gridded data (e.g., satellite swaths, each on its own grid) to a common destination grid. When all source files are on the same grid (as is typical with models), turn off the expensive multiple inferral and map-generation procedures with the -M switch to save time:
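A sketch (placeholder filenames): the -M switch tells ncremap to infer the source grid and generate the map only once:

```shell
ncremap -M -d dat_dst.nc -O drc_rgr dat_src1.nc dat_src2.nc dat_src3.nc
```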

Prototypical Regridding IV: Manual Grid-file Generation

If a desired grid-file is unavailable, and no dataset on that grid is available (so inferral cannot be used), then one must manually create a new grid. Users create new grids for many reasons including dataset intercomparisons, regional studies, and fine-tuned graphics. NCO and ncremap support manual generation of the most common rectangular grids as SCRIP-format grid-files. Create a grid by supplying ncremap with a grid-file name and "grid-formula" (grd_sng) that contains, at a minimum, the grid-resolution. The grid-formula is a hash-separated string of name-value pairs each representing a grid parameter. All parameters except grid resolution have reasonable defaults, so a grid-formula can be as simple as "latlon=180,360":
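A sketch (grd.nc is a placeholder output name): the grid-formula is passed with -G and the new grid-file name with -g:

```shell
# Generate a global uniform 180x360 (1x1 degree) SCRIP grid-file
ncremap -G latlon=180,360 -g grd.nc
```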

Congratulations! The new grid-file grd.nc is a valid source or destination grid for ncremap commands.

Grid-file generation documentation in the NCO Users Guide at http://nco.sf.net/nco.html#grid describes all the grid parameters and contains many examples. Note that the examples in this section use the grid-generation API of ncremap version 4.7.6 (August, 2018) and later. Earlier versions can use the ncks API explained in the Users Guide.

The most useful grid parameters (besides resolution) are latitude type (lat_typ), longitude type (lon_typ), title (ttl), and, for regional grids, the SNWE bounding box (snwe). The three supported varieties of global rectangular grids are Uniform/equiangular (lat_typ=uni), Cap/FV (lat_typ=cap), and Gaussian (lat_typ=gss). The four supported longitude types place the first (westernmost) gridcell with its center at Greenwich (lon_typ=grn_ctr), its western edge at Greenwich (lon_typ=grn_wst), its center at the Dateline (lon_typ=180_ctr), or its western edge at the Dateline (lon_typ=180_wst). By default, grids are global and uniform, store latitudes from south-to-north, and have their first longitude centered at Greenwich. The grid-formula for this default is 'lat_typ=uni#lon_typ=grn_ctr#lat_drc=s2n'. Some examples (remember, this API requires NCO 4.7.6+):
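Sketches for each global grid variety (output filenames, resolutions, and titles are illustrative; the formula keys are those named in the text):

```shell
# Uniform/equiangular 1x1 degree grid, first longitude centered at Greenwich
ncremap -G 'latlon=180,360#lat_typ=uni#lon_typ=grn_ctr#ttl=Equiangular_180x360' -g grd_uni.nc
# Cap/FV grid at the v1 analysis resolution
ncremap -G 'latlon=129,256#lat_typ=cap#ttl=FV_129x256' -g grd_cap.nc
# Gaussian grid
ncremap -G 'latlon=94,192#lat_typ=gss#ttl=Gaussian_94x192' -g grd_gss.nc
```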

Regional grids are a powerful tool in regional process analyses, and can be much smaller in size than global datasets. Regional grids are always uniform. Specify the rectangular bounding box, i.e., the outside edges of the region, in SNWE order:
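A sketch (bounding box and output name are illustrative): snwe lists the south, north, west, and east edges in order:

```shell
# 1x1 degree regional grid spanning 25N-50N, 125W-75W
ncremap -G 'latlon=25,50#snwe=25.0,50.0,-125.0,-75.0#ttl=Regional_1x1_degree' -g grd_rgn.nc
```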

Intermediate Regridding:

The sections on Prototypical Regridding were intended to be read sequentially and introduced the most frequently required ncremap features. The Intermediate and Advanced regridding sections are an a la carte description of features most useful to particular component models, workflows, and data formats. Peruse these sections in any order.

Intermediate Regridding I: Treatment of Missing Data

The terminology of this section can be confusing. The most important point to clarify is that the treatments of missing data described here are independent of the regridding algorithm (and weight-generator), which is specified by alg_typ. alg_typ only defines the algorithm used to generate the weights contained in a map-file, not how to apply those weights. ncremap applies the map-file weights (i.e., regrids) to data-files that may contain fields with spatio-temporally missing values, such as AODVIS in EAM, all fields with ocean gridpoints in ELM output, most fields in MPAS-Seaice, and many fields in satellite-retrieved datasets. Unless the locations of a field's (like AODVIS) missing values were supplied as a mask to the weight-generator algorithm at map-creation time, the map-file weights cannot and will not automatically take into account that some of the source gridcells contain invalid (aka missing) values. This section describes three options for how the weight-applicator (not weight-generator) is to reconstruct values in destination gridcells affected by missing source gridcells. One option preserves the integral value of the input gridcells and is called "integral-preserving" or "conservative" because it is first-order conservative in the affected gridcells locally, as well as globally conservative. At the other extreme is the option to preserve (locally) the gridcell mean of the input values in the affected output gridcells. This option is called "mean-preserving" or "renormalized" and is not conservative. The third option allows the user to specify a sliding scale between the integral- and mean-preserving options, which is to say it is a more general form of renormalization. Once again, these weight-application approaches are independent of the weight-generation algorithm defined by alg_typ.

Conservative weight-generation is, for first-order accurate algorithms, a straightforward procedure of identifying gridcell overlap and apportioning values correctly from source to destination. The absence of valid values (or presence of missing values) forces a choice on how to construct destination gridcell values where some but not all contributing source cells are valid. ncremap supports three distinct approaches: "Integral-preserving", "Mean-preserving", and "Sliding Scale". The integral-preserving approach uses all valid data from the input grid on the output grid once and only once. Destination cells receive the weighted valid values of the source cells. This is a first-order conservative procedure and will, in conjunction with weights from a conservative regridding algorithm, conserve and preserve the local and global integrals of the source and destination fields. The mean-preserving approach divides the destination value by the sum of the valid weights. This preserves and extends the mean of the valid input values throughout the entire destination gridcell. In other words, it extrapolates valid data to missing regions while preserving the mean of the valid data. Input and output integrals are unequal, so mean-preserving regridding is not conservative and does not preserve the integrals when missing data are involved. Furthermore, mean-preserving regridding only preserves the local (i.e., gridcell-level) mean of the input data, not the global mean. Both approaches, integral-preserving and mean-preserving (aka conservative and re-normalized), produce identical answers when no missing data map to the destination gridcell. Before explaining the nuance of the third approach ("sliding scale"), we demonstrate the symmetric ways of invoking integral- or mean-preserving regridding:
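A sketch (placeholder filenames), using the --preserve option and its two arguments named later in this section:

```shell
ncremap --preserve=integral -m map.nc dat_src.nc dat_rgr.nc # Integral-preserving (default)
ncremap --preserve=mean     -m map.nc dat_src.nc dat_rgr.nc # Mean-preserving (renormalized)
```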

By default, ncremap implements the integral-preserving, conservative approach because it has useful properties, is simpler to understand, and requires no additional parameters. However, this approach will often produce unrealistic gridpoint values (e.g., ocean temperatures < 100 K, and that is not a typo) along coastlines or near data gaps where state variables are regridded to/from small fractions of a gridcell. The mean-preserving approach solves this problem, yet incurs others, like non-conservation. The mean-preserving approach (aka re-normalization) ensures that the output values are physically consistent and do not stick out like sore thumbs in a plot, although the integral of their value times area is not conserved.

The third option, the "sliding scale" approach, is slightly more complex and much more flexible than the all-or-nothing integral- or mean-preserving examples above. The sliding scale refers to the fraction of the destination gridcell that must be overlapped by valid source gridcells. Users specify the renormalization threshold weight rnr_thr, the required valid fraction of destination gridcells, with the renormalization threshold option "--rnr_thr=rnr_thr". The weight-application algorithm then ensures that valid values overlap at least the fraction rnr_thr of each destination gridcell for that gridcell to meet the threshold for a non-missing destination value. When rnr_thr is exceeded, the mean valid value in the valid area is placed in the destination gridcell. If the valid area covers less than rnr_thr, then the destination gridcell is assigned the missing value. Valid values of rnr_thr range from zero to one. A threshold weight of rnr_thr = 0.0 indicates that any amount (no matter how small) of valid data should be represented by its mean value on the output grid. Keep in mind, though, that the actual valid area divides a sum (the area-weighted integral of the valid values), and values of zero or very near zero in the divisor can lead to floating-point underflow and divide-by-zero errors. A threshold weight of rnr_thr = 1.0 indicates that the entire destination gridcell (to within machine precision) must be overlapped by valid data in order for that destination gridcell to be assigned a valid (non-missing) value. Threshold weights 0.0 < rnr_thr < 1.0 invoke the real power of the sliding scale. Remote sensing classification schemes applied to L2 data may require, for example, valid retrievals over at least 50% of source pixels before assigning a valid value to a destination gridcell. This is equivalent to rnr_thr = 0.5. And so,
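Sketches of sliding-scale invocations at the two thresholds discussed next (filenames are placeholders):

```shell
ncremap --rnr_thr=0.25 -m map.nc dat_src.nc dat_rgr.nc # Require >= 25% valid coverage
ncremap --rnr_thr=0.5  -m map.nc dat_src.nc dat_rgr.nc # Require >= 50% valid coverage
```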

These sliding-scale examples specify that valid values must cover at least 25% and 50% of the destination gridcell to meet the threshold for a non-missing destination value. With actual valid destination areas of 25% or 50%, this approach would produce destination values greater than the conservative algorithm by factors of four and two, respectively. Careful readers may already have observed that the mean-preserving approach (--preserve=mean) is exactly equivalent to the sliding-scale approach with a renormalization threshold weight rnr_thr = 0.0. The latter approach is just a more numerical way of expressing the former. On the other hand, no numerical threshold weight produces answers equivalent to the integral-preserving approach. However, it is convenient for power users to be able to invoke all three missing-data approaches via the rnr_thr option alone. Hence, setting the threshold weight to the string "none" (--rnr_thr=none) is exactly equivalent to specifying the integral-preserving approach. In summary, the --preserve option, with its two valid arguments integral and mean that invoke the integral-preserving and mean-preserving algorithms, will suffice for most situations. The sliding-scale algorithm must be invoked via the --rnr_thr option with a numerical argument, although the string "none" will default to the integral-preserving algorithm.

In practice, it may make sense to use the default integral-preserving treatment with conservative weights, and the mean-preserving or sliding-scale treatment with other (non-conservative) weights such as those produced by bilinear interpolation or nearest-neighbor. Another consideration in selecting the weight-application algorithm for missing values is whether the fields to regrid are fluxes or state variables. For example, temperature (unlike heat) and concentrations (amount per unit volume) are not physically conserved quantities under areal-regridding so it often makes sense to interpolate them in a non-conservative fashion, to preserve their fine-scale structures. Few researchers can digest the unphysical values of temperature that the integral-preserving treatment produces in regions rife with missing values. On the other hand, mass and energy fluxes should be physically conserved under areal-regridding. Hence, one must consider both the type of field and its conservation properties when choosing a missing value treatment.

Intermediate Regridding II: TempestRemap

TempestRemap (TR) is the chief alternative to ERWG and NCO for regridding weight-generation. TR is replacing ERWG as the default on-line weight-generator in E3SMv2. Tempest algorithms, written by @paulullrich, have many numerical advantages described in papers and at Transition to TempestRemap for Atmosphere grids. Verify that ncremap can access your Tempest installation as described in the above section on "Software Requirements" before trying the examples below. Once installed, TR can be as easy to use as ERWG or NCO with FV grid-files, e.g.,
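A sketch (placeholder grid-file and map names), selecting a TR algorithm with -a:

```shell
ncremap -a fv2fv_flx -s grd_src.nc -g grd_dst.nc -m map.nc
```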

This command is the same as shown in the "Create Map-file from Known Grid-files" section above, except that it uses alg_typ='fv2fv_flx'. It is a jumping-off point for understanding Tempest features and quirks. First, simply note that the ncremap interfaces for the ERWG, NCO, and TR weight-generators are the same even though the underlying ERWG, NCO, and TR applications have different APIs. Second, Tempest accepts SCRIP-format grids (as shown) and Exodus-format grid-files, also stored in netCDF though typically with a '.g' suffix, e.g., ne30.g, as described at Transition to TempestRemap for Atmosphere grids. Exodus grid-files contain grid "connectivity" and other information required to optimally treat advanced grids like SE. Hence this also works:
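A sketch (destination grid-file and map names are placeholders), supplying the Exodus grid ne30.g as the source:

```shell
ncremap -a se2fv_flx -s ne30.g -g grd_dst.nc -m map.nc
```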

This produces subtly different weights because ne30.g encodes the SE ne30np4 grid-specification, not its dual-grid FV representation. TR generates superior weights when instructed to use an algorithm optimized for the type of remapped variable and the grid representation as described below. The above example employs the recommended algorithm to remap fluxes on the SE grid to the FV destination grid, i.e., se2fv_flx.

The exact options that NCO invokes for a specific TR algorithm like se2fv_flx can be discovered in multiple ways: First, all TR options that NCO employs are documented on the Transition to TempestRemap for Atmosphere grids page; Second, the NCO Users Guide documents TR options at http://nco.sf.net/nco.html#tr; Third, the options --dbg_lvl=1 or --dbg_lvl=2 cause ncremap to print the sub-commands it executes, including the TR commands with their options. Experiment, intercompare, and find the algorithm that works best for your purposes. Advanced users may be interested in the quantitative evaluation of the quality of the weights in a map-file provided by the map-checker (ncks --chk_map map.nc) described below.

As mentioned at Transition to TempestRemap for Atmosphere grids, the Tempest overlap-mesh generator expects to receive the two grid-files in smaller-then-larger order (ERWG and NCO have no corresponding restriction). For example, Tempest considers the global ocean to be a smaller domain than the global atmosphere since it covers less area (due to masked points). Hence ncremap must be told when the source grid-file covers a larger domain than the destination. Do this with the "--a2o" or "--l2s" switch (for "atmosphere-to-ocean" or "large-to-small", respectively):
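A sketch of such an invocation (grid-file names are placeholders) when the source (atmosphere) domain is larger than the destination (ocean) domain:

```shell
# Sketch only: tell ncremap the source covers a larger domain than the destination
ncremap -a se2fv_flx --a2o --src_grd=atm_ne30.g --dst_grd=ocn_grid.nc -m map_a2o.nc
```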

Transition to TempestRemap for Atmosphere grids describes eight specific global FV<->SE mappings that optimize different combinations of accuracy, conservation, and monotonicity desirable for remapping flux variables (flx), state variables (stt), and an alternative (alt) mapping. A plethora of boutique options and switches controls the Tempest weight-generation algorithms for these cases. To simplify their invocation, ncremap names these algorithms fv2se_stt, fv2se_flx, fv2se_alt, fv2fv_flx, fv2fv_stt, se2fv_stt, se2fv_flx, se2fv_alt, and se2se. E3SM maps built with these algorithms have adopted the suffixes "mono" (for fv2fv_flx, se2fv_flx, and fv2se_alt), "highorder" (fv2fv_stt, se2fv_stt, fv2se_stt), "intbilin" (se2fv_alt), and "monotr" (fv2se_flx). The relevant Tempest map-files can be generated with
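As an illustrative sketch (not the official production commands), several of these map-files could be generated in a loop; the grid-file names are placeholders:

```shell
# Sketch only: generate SE->FV maps with three of the named Tempest algorithms
for alg in se2fv_flx se2fv_stt se2fv_alt ; do
  ncremap -a ${alg} --src_grd=ne30.g --dst_grd=fv_grid.nc -m map_${alg}.nc
done
```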

These maps, applied to appropriate flux and state variables, should exactly reproduce the online remapping in the E3SM v2/v3 coupler. However, explicitly generating all standard maps this way is not recommended because ncremap includes an MWF-mode (for "Make All Weight Files") described below. MWF-mode generates and names, with one command and in a self-consistent manner, all combinations of E3SM global atmosphere<->ocean maps for ERWG and Tempest.

Intermediate Regridding III: MOAB/mbtempest support

Weight-generator computations (e.g., finding grid intersections) increase non-linearly with the grid size, so the largest grids are most efficiently computed with parallel algorithms. ERWG has long supported distributed computation in an MPI-environment, and NCO has always supported multi-threaded weight computation via OpenMP. A growing subset of TempestRemap algorithms have now been ported to the parallel MOAB (Mesh Oriented datABase) tool mbtempest (aka MBTR). ncremap can invoke the MOAB regridding toolchain as of NCO version 5.0.2 from September, 2021. The "spoiler" answer to "How do I get ncremap to invoke mbtempest?" is simple: Add the --mpi_nbr=mpi_nbr option to an ncremap command that already calls a TempestRemap algorithm. However, MOAB involves complex, MPI-enabled software, and support for it is continually changing and subject to important limitations. Read on to understand what to currently expect.

First, MOAB requires an MPI environment to perform well. Invoking MOAB (i.e., using --mpi_nbr=mpi_nbr with a TR algorithm name) in a non-MPI environment will result in an error. One can easily obtain an MPI-enabled MOAB environment with Conda. For example, install the MPI-enabled MOAB package with

conda install -c conda-forge moab=5.3.0=mpich_tempest esmf

Please ensure you have the latest version of ERWG, MOAB, and/or TempestRemap before reporting any related problems to NCO.

This section is intentionally placed after the TR section because mbtempest re-uses the TR algorithm names described in the previous section. For example, ncremap invokes TR to generate weights to remap fluxes from FV to FV grids when invoked with

ncremap -a fv2fv_flx --src_grd=src.g --dst_grd=dst.nc -m map.nc # Invoke TR

Add the --mpi_nbr option and ncremap will instead invoke the MOAB toolchain to compute weights for any TempestRemap algorithm (otherwise the TR toolchain would be used):

ncremap --mpi_nbr=8 -a fv2fv_flx --src_grd=src.g --dst_grd=dst.nc -m map.nc # Invoke MOAB

Although MOAB and TempestRemap implement the same numerical algorithms, they are likely to produce slightly different mapping weights due to round-off differences: MOAB is heavily parallelized, and computes and sums terms in an unpredictable order compared to the serial TempestRemap.

Transparently to the user, ncremap supplies the selected weight generator with the recommended options, which can be quite complex. For the preceding example, ncremap either invokes TempestRemap's GenerateOfflineMap with the boutique options --in_type fv --in_np 1 --out_type fv --out_np 1 --correct_areas that E3SM recommends for conservative and monotone remapping of fluxes, or it invokes multiple components of the MOAB toolchain:

mbconvert -B -o PARALLEL=WRITE_PART -O PARALLEL=BCAST_DELETE -O PARTITION=TRIVIAL -O PARALLEL_RESOLVE_SHARED_ENTS "src.g" "src.h5m"
mbconvert -B -o PARALLEL=WRITE_PART -O PARALLEL=BCAST_DELETE -O PARTITION=TRIVIAL -O PARALLEL_RESOLVE_SHARED_ENTS "dst.nc" "dst.h5m"
mbpart 8 --zoltan RCB "src.h5m" "src_8p.h5m"
mbpart 8 --zoltan RCB --recompute_rcb_box --scale_sphere --project_on_sphere 2 "dst.h5m" "dst_8p.h5m"
mpirun -n 8 mbtempest --type 5 --weights --load "src_8p.h5m" --load "dst_8p.h5m" --method fv --order 1 --method fv --order 1 --file "map.nc"

The purpose of the ncremap front-end to MOAB is to hide this complexity from the user while preserving the familiar look and feel of invoking other weight-generators. Once again, the MOAB toolchain should produce a map-file identical (to rounding precision) to one produced by TR. When speed matters (i.e., large grids), and the algorithm is supported (i.e., fv2fv_flx), invoke MOAB, otherwise invoke TR.

This section comes before the parallelism section because mbtempest supports MPI-enabled parallelism that is distinct from the ncremap workflow parallelism described in the next section.

Caveat lector: As of September, 2021, weights generated by MOAB (version 5.3.0 and earlier) are only trustworthy for the fv2fv_flx algorithm. The options for all other algorithms are implemented as indicated, though they should be invoked for testing purposes only. High-order and spectral element map algorithms are not fully implemented and will produce unreliable results. MOAB anticipates supporting more TempestRemap algorithms in the future. Always use the map-checker to test maps before use, e.g., with ncks --chk_map map.nc.

Another limitation is that mbtempest (version 5.3.0) currently generates only map-files that regrid to rank-1 (unstructured) grids. mbtempest "unrolls" any rank-2 (e.g., lat-lon) destination grid into its rank-1 equivalent. Users seeking to regrid to rank-2 grids can manually alter the MOAB-generated map-file with something like:

ncks -O -x -v dst_grid_dims ~/map.nc ~/map_fix.nc
ncap2 -O -s 'defdim("dst_grid_dims",2);dst_grid_dims($dst_grid_dims)={256,129};' ~/map_fix.nc ~/map_fix.nc

Replace 256,129 above with the lon,lat sizes (Fortran order!) of your grid. The resulting map-file, map_fix.nc, will regrid almost as intended. The regridded output will have curvilinear coordinates (e.g., lat(lat,lon)) instead of rectangular coordinates (e.g., lat(lat)), and the coordinate bounds variables (e.g., lat_bnds) will not be correct and may cause plotting issues at the poles and at Greenwich. Nevertheless, the weights are correct and of unsurpassed accuracy. MOAB anticipates supporting rank-2 map-files and TR high-order and spectral element algorithms in the future.

Intermediate Regridding IV: Parallelism

ncremap can exploit three types of parallelism: multiple nodes in a cluster, multiple simultaneous file regriddings on a single node, and multiple variables regridded simultaneously within a single file. Each level of parallelism reduces the wallclock time to complete the regridding workflow at the expense of increased resource requirements.

File-level parallelism accelerates throughput when regridding multiple files in one ncremap invocation, and has no effect when only one file is to be regridded. Note that the ncclimo and ncremap semantics for selecting file-level parallelism are identical, though their defaults differ (Background mode for ncclimo, Serial mode for ncremap). Select the desired mode with the argument to --par_typ=par_typ. Explicitly select Background mode with par_typ values of bck, background, or Background. The values mpi or MPI select MPI mode, and the values srl, serial, Serial, nil, and none all select Serial mode (which disables file-level parallelism, though still allows intra-file OpenMP parallelism).

The default file-level parallelism for ncremap is Serial mode (i.e., no file-level parallelism), in which ncremap processes one input file at a time. Background and MPI modes implement true file-level parallelism. Typically both parallel modes scale well with sufficient memory unless and until I/O contention becomes the bottleneck. In Background mode ncremap issues all commands to regrid the input file list as UNIX background processes on the local node. Nodes with multiple cores and sufficient RAM take advantage of this to regrid multiple files simultaneously. In MPI mode ncremap issues commands to regrid the input file list in round-robin fashion to all available compute nodes. Prior to NCO version 4.9.0 (released December, 2019), Background and MPI parallelism modes both regridded all the input files at one time and there was no way to limit the number of files being simultaneously regridded. Subsequent versions allow finer-grained parallelism by introducing the ability to limit the number of discrete workflow elements or "jobs" (i.e., file regriddings) to perform simultaneously within an ncremap invocation or "workflow".

As of NCO version 4.9.0 (released December, 2019), the --job_nbr=job_nbr option specifies the maximum number of files to regrid simultaneously on all nodes being harnessed by the workflow. Thus job_nbr is an additional parameter to fine-tune file level parallelism (it has no effect in Serial mode). In both parallel modes ncremap spawns processes in batches of job_nbr jobs, then waits for those processes to complete. Once a batch finishes, ncremap spawns the next batch. In Background mode, all jobs are spawned to the local node. In MPI mode, all jobs are spawned in round-robin fashion to all available nodes until job_nbr jobs are running.

If regridding consumes so much RAM (e.g., because variables are large and/or the number of threads is large) that a single node can perform only one regridding job at a time, then a reasonable value for job_nbr is the number of nodes, node_nbr. Often, however, nodes can regrid multiple files simultaneously. It can be more efficient to spawn multiple jobs per node than to increase the threading per job because I/O contention for write access to a single file prevents threading from scaling indefinitely.

By default job_nbr = 2 in Background mode, and job_nbr = node_nbr in MPI mode. This helps prevent users from overloading nodes with too many jobs. Subject to the availability of adequate RAM, expand the number of jobs per node by increasing job_nbr until, ideally, each core on the node is used. Remember that processes and threads are multiplicative in core use: four jobs with four threads each consume sixteen cores.

We have thus far demonstrated how to control file-level parallelism (with par_typ) and workflow-level parallelism (with job_nbr). The third level of parallelism mentioned above is that ncremap can use OpenMP shared-memory techniques to regrid multiple variables within a single file simultaneously. This shared-memory parallelism is efficient because it requires only a single copy of the regridding weights in physical memory to regrid multiple variables simultaneously. Even so, regridding multiple variables at high resolution may become memory-limited, meaning that insufficient RAM can limit the number of variables the system can simultaneously regrid.

By convention all variables to be regridded share the same regridding weights stored in a map-file, so that only one copy of the weights needs to be in memory, just as in Serial mode. However, the per-thread (i.e., per-variable) OpenMP memory demands are considerable: the memory required to regrid a variable amounts to no less than about 5-7 times (for type NC_FLOAT) or 2.5-3.5 times (for type NC_DOUBLE) the size of the uncompressed variable. Memory requirements are so high because the regridder performs all arithmetic in double precision to retain the highest accuracy, and must allocate separate buffers to hold the input and output (regridded) variable, a tally array to count the number of missing values, and an array to sum the weights contributing to each output gridcell (the last two arrays are only necessary for variables with a _FillValue attribute). The input, output, and weight-sum arrays are always double precision, and the tally array is composed of four-byte integers. Given the high memory demands, one strategy to optimize thr_nbr for repetitious workflows is to keep doubling it (1, 2, 4, ...) until throughput stops improving. With sufficient memory, the NCO regridder scales well up to 8-16 threads.

As an example, consider regridding 100 files with a single map. Say you have a five-node cluster, and each node has 16 cores and can simultaneously regrid two files using eight threads each. (One needs to test a bit to optimize these parameters.) Then an optimal (in terms of wallclock time) invocation would request five nodes with 10 simultaneous jobs of eight threads. On PBS or SLURM batch systems this would involve a scheduler command like qsub -l nodes=5 ... or sbatch --nodes=5 ..., respectively, followed by ncremap --par_typ=mpi --job_nbr=10 --thr_nbr=8 .... This job will likely complete between five and ten times faster than a Serial-mode invocation of ncremap to regrid the same files. The uncertainty range is due to unforeseeable, system-dependent load and I/O characteristics. Nodes that can simultaneously write to more than one file fare better with multiple jobs per node. Nodes with only one I/O channel to disk may be better exploited by utilizing more threads per process.
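A sketch of the corresponding SLURM batch script (directories and the map-file name are placeholders; the stdin file-list idiom is assumed):

```shell
#!/bin/bash
# Sketch of a batch script for the five-node example above
#SBATCH --nodes=5
# Feed the input file list to ncremap on stdin; 10 jobs x 8 threads across 5 nodes
ls ${DRC_IN}/*.nc | ncremap --par_typ=mpi --job_nbr=10 --thr_nbr=8 \
  --map=map.nc --drc_out=${DRC_OUT}
```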

Advanced Regridding:

Advanced procedures have in common that they activate non-standard processing modes for ncremap. These modes do something different, or in addition to, the standard weight-generation and application. Generally these modes were created in order to automate frequently recurring workflows that can leverage the ncremap infrastructure so long as various bells and whistles are introduced along the way. Please let us know if you have ideas for new or improved processing modes.

Advanced Regridding I: MPAS-mode

MPAS models produce output in their own format, distinct from that of CESM-heritage models. MPAS-mode invokes three pre-processing steps to massage MPAS datasets until they are amenable to regridding: missing value annotation, missing value treatment, and dimension permutation. We describe these steps in order below. First, though, note that MPAS-mode, like most other ncremap modes, is explicitly invoked with the -P md_nm option, where md_nm is some variation of the MPAS component model name:
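A sketch of such an invocation (the map-file and data-file names are placeholders):

```shell
# Sketch only: regrid MPAS-Ocean output with the MPAS-mode preset
ncremap -P mpasocean --map=map.nc mpaso_native.nc mpaso_regridded.nc
```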

Many model and observational datasets use missing values that are not annotated in the standard manner. For example, the MPAS ocean and ice models use -9.99999979021476795361e+33 as the missing value, yet (at least from 2015-2018) do not store this value in a _FillValue attribute of any variable. To prevent arithmetic from treating these values as valid, MPAS-mode automatically puts this value in a _FillValue attribute for all floating-point variables via
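The annotation amounts to a command of roughly this form (a sketch only; ncremap's internal invocation restricts the edit to floating-point variables, and exact flags vary by NCO version):

```shell
# Sketch: overwrite ('o') the _FillValue attribute, as a double ('d'),
# for all variables (empty variable name) in-place
ncatted -O -a _FillValue,,o,d,-9.99999979021476795361e+33 mpas_input.nc
```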

Oddly, the MPAS land-ice model uses -1.0e36 for missing values, so currently MPAS-LI users must explicitly supply this missing value, or invoke ncremap with the -P mali option
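For example (file names are placeholders; the explicit-value alternative assumes ncremap's --mss_val option):

```shell
# Sketch only: regrid MALI output with the land-ice preset
ncremap -P mali --map=map.nc mali_native.nc mali_regridded.nc
# ...or supply the missing value explicitly (option name assumed)
ncremap --mss_val=-1.0e36 --map=map.nc mali_native.nc mali_regridded.nc
```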

Next, MPAS datasets usually have masked regions (e.g., non-ocean cells), yet MPAS users like to visualize regridded data with realistic (not blocky) boundaries along those cells. The MPAS developers therefore decided to, by default, treat missing values with the renormalization approach described above in the section on Treatment of Missing Values. Hence MPAS-mode automatically invokes ncremap with maximum renormalization, equivalent to
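In other words, roughly (a sketch; the renormalization-threshold option spelling is assumed from recent NCO, and file names are placeholders):

```shell
# Sketch: maximum renormalization corresponds to a renormalization threshold of zero
ncremap --rnr=0.0 --map=map.nc mpas_input.nc mpas_regridded.nc
```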

Finally, ncremap requires the horizontal spatial dimension(s), whether latitude and longitude or some unstructured dimension, to be the final (most-rapidly-varying) dimension(s) of input datasets. MPAS datasets natively place their horizontal spatial dimension (typically nCells) closer to the least-rapidly-varying position. While this makes perfect sense from an I/O-efficiency point-of-view for unstructured models, it does not play well with regridders. Hence all MPAS-modes permute the input dimensions to a regridder-friendly order (i.e., ending with nCells) with a command of the form
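For a hypothetical dataset whose dimensions are Time, nVertLevels, and nCells, the permutation could look like:

```shell
# Sketch: ncpdq re-orders dimensions so nCells varies most rapidly
ncpdq -a Time,nVertLevels,nCells mpas_input.nc mpas_permuted.nc
```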

The combination of these three data manipulations defines MPAS-mode. It can be difficult to learn the ocean mesh (i.e., grid) names and thus to find the appropriate pre-made map-files. The standard low resolution ocean map-files for post-processing each version of MPAS are:

ncremap -P mpasocean --map=map_oEC60to30v3_to_cmip6_180x360_aave.20181001.nc mpov1.nc out.nc # MPAS Ocean v1
ncremap -P mpasocean --map=map_EC30to60E2r2_to_cmip6_180x360_aave.20220301.nc mpov2.nc out.nc # MPAS Ocean v2
ncremap -P mpasocean --map=map_IcoswISC30E3r5_to_cmip6_180x360_traave.20231201.nc mpov3.nc out.nc # MPAS Ocean v3

ncremap -P mpasseaice --map=map_oEC60to30v3_to_cmip6_180x360_aave.20181001.nc msiv1.nc out.nc # MPAS Seaice v1
ncremap -P mpasseaice --map=map_EC30to60E2r2_to_cmip6_180x360_aave.20220301.nc msiv2.nc out.nc # MPAS Seaice v2
ncremap -P mpasseaice --map=map_IcoswISC30E3r5_to_cmip6_180x360_traave.20231201.nc msiv3.nc out.nc # MPAS Seaice v3

Advanced Regridding II: EAMXX-mode

EAMXX storage conventions and files differ from those of EAM (and CAM) in only two ways. ncremap introduced a -P eamxx mode in 2022 to handle these gotchas. Dimension permutation is the primary pre-processing step necessary to massage EAMXX datasets so that they are amenable to regridding.

First, ncremap requires the horizontal spatial dimension(s), whether latitude and longitude or some unstructured dimension, to be the final (most-rapidly-varying) dimension(s) of input datasets. EAMXX datasets natively place their horizontal spatial dimension (typically ncol) closer to the least-rapidly-varying position. While this makes perfect sense from an I/O-efficiency point-of-view for unstructured models, it does not play well with horizontal regridding. Hence EAMXX-mode permutes the input dimensions to a regridder-friendly order (i.e., ending with ncol) with a command of the form
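For a hypothetical EAMXX dataset with dimensions time, lev, and ncol, the permutation could look like:

```shell
# Sketch: move ncol to the most-rapidly-varying position
ncpdq -a time,lev,ncol eamxx_input.nc eamxx_permuted.nc
```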

Second, EAMXX names the surface pressure variable ps by default, not PS as in EAM. The distinction is important whenever vertical interpolation is invoked. Hence EAMXX-mode automatically tells the vertical interpolation routine to look for ps not PS. The combination of these two pre-processing steps defines EAMXX-mode.

Advanced Regridding III: Sub-Gridscale Regridding (SGS-mode)

ncremap has a sub-gridscale (SGS) mode that performs the special pre-processing and weighting necessary to conserve fields that represent fractional spatial portions of a gridcell, and/or fractional temporal periods of the analysis. Spatial fields output by most geophysical models are intensive, and so by default the regridder attempts to conserve the integral of the area times the field value such that the integral is equal on source and destination grids. However some models (like ELM, CLM, CICE, and MPAS-Seaice) output gridcell values intended to apply to only a fraction sgs_frc (for "sub-gridscale fraction") of the gridcell. The sub-gridscale fraction usually changes spatially with the distribution of land and ocean, and spatiotemporally with the distribution of sea ice and possibly vegetation. For concreteness consider a sub-grid field that represents the land fraction. Land fraction is less than one in gridcells that resolve coastlines or islands. ELM and CLM happily output temperature values valid only for a small (i.e., sgs_frc << 1) island within the larger gridcell. Model architecture dictates this behavior and savvy researchers expect it. The goal of the NCO weight-application algorithm is to treat SGS fields as seamlessly as possible so that those less familiar with sub-gridscale models can easily regrid them correctly.

Fortunately, models like ELM and CLM that run on the same horizontal grid as the overlying atmosphere can use the same mapping-file as the atmosphere, so long as the SGS weight-application procedure is invoked. Not invoking an SGS-aware weight application algorithm is equivalent to assuming sgs_frc = 1 everywhere. Regridding sub-grid values correctly versus incorrectly (e.g., with and without SGS-mode) alters global-mean answers for land-based quantities by about 1% for horizontal grid resolutions of about one degree. The resulting biases are in intricately shaped regions (coasts, lakes, sea-ice floes) and so are easy to overlook.

To invoke SGS mode and correctly regrid sub-gridscale data, specify the names of the fractional area sgs_frc and, if applicable, the mask variable sgs_msk (strictly, this is only necessary if these names differ from their respective defaults landfrac and landmask). Trouble will ensue if sgs_frc is a percentage or an absolute area rather than a fractional area (between zero and one). ncremap must know the normalization factor sgs_nrm by which sgs_frc must be divided (not multiplied) to obtain a true, normalized fraction. Datasets (such as those from CICE) that store sgs_frc in percent should specify the option --sgs_nrm=100 to instruct ncremap to normalize the sub-grid area appropriately before regridding. ncremap re-derives sgs_msk based on the regridded values of sgs_frc: sgs_msk = 1 is assigned to destination gridcells with sgs_frc > 0.0, and sgs_msk = 0 to all others. As of NCO version 4.6.8 (released June, 2017), invoking any of the options --sgs_frc, --sgs_msk, or --sgs_nrm automatically triggers SGS-mode, so also invoking -P sgs is redundant though legal. As of NCO version 4.9.0 (released December, 2019), the names of the sgs_frc and sgs_msk variables should be explicitly specified. In previous versions they defaulted to landfrac and landmask, respectively, when -P sgs was selected. This behavior still exists but will likely be deprecated in a future version.
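Putting these options together (file names are placeholders, and the CICE fraction variable name aice is an assumption):

```shell
# Sketch: ELM/CLM-style data, naming the default fraction/mask variables explicitly
ncremap --sgs_frc=landfrac --sgs_msk=landmask --map=map.nc elm_native.nc elm_regridded.nc
# Sketch: CICE-style data whose sub-grid fraction is stored in percent
ncremap --sgs_frc=aice --sgs_nrm=100 --map=map.nc cice_native.nc cice_regridded.nc
```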

The area and sgs_frc fields in the regridded file will be in units of steradians and fraction, respectively. However, ncremap offers custom options to reproduce the idiosyncratic data and metadata formats of two particular models, ELM and CICE. When invoked with -P elm (or -P clm), a final step converts the output area from steradians to square kilometers. When invoked with -P cice, the final step converts the output area from steradians to square meters, and the output sgs_frc from a fraction to a percent.

It is sometimes convenient to store the sgs_frc field in an external file from the field(s) to be regridded. For example, CMIP-style timeseries are often written with only one variable per file. NCO supports this organization by accepting sgs_frc arguments in the form of a filename followed by a slash and then a variable name:
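A sketch of the filename/variable-name syntax (all names are placeholders):

```shell
# Sketch only: read the sub-grid fraction landfrac from an external file
ncremap --sgs_frc=sgs_file.nc/landfrac --map=map.nc ts_FSNT.nc out.nc
```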

Files regridded using explicitly specified SGS options will differ slightly from those regridded using the -P elm or -P cice options. The former will have an area field in steradians, the generic units used internally by the regridder. The latter produce model-specific area fields in square kilometers (for ELM) or square meters (for CICE), as expected in the raw output from these two models. To convert from angular to areal values, NCO assumes a spherical Earth with radius 6,371,220 m or 6,371,229 m, for ELM and CICE, respectively. The output sgs_frc field is expressed as a decimal fraction in all cases except -P cice, which stores the fraction in percent. Thus the generic SGS and model-specific convenience options produce equivalent results, and the latter are intended to be indistinguishable (in terms of metadata and units) from raw model output. This makes them more interoperable with many existing analysis scripts.

Advanced Regridding IV: Regional Unstructured Output (RRG-mode)

EAM (and CAM-SE) will produce regional output if requested with the finclNlonlat namelist parameter. Output for a single region can be at higher temporal resolution than the host global simulation. This facilitates detailed yet economical regional process studies. Regional output files are in a special format that we call RRG (for "regional regridding"). An RRG file may contain any number of rectangular regions. The coordinates and variables for one region do not interfere with other (possibly overlapping) regions because all variables and dimensions are named with a per-region suffix string, e.g., lat_128e_to_134e_9s_to_16s. ncremap can easily regrid RRG 2D logically rectangular output from an FV-dycore because ncremap can infer (as discussed above) the grid from any well-annotated regional FV data file. Regridding unstructured regional grid data, however, is more complex because unstructured grids without cell vertices and unstructured grid weight-generators are not yet flexible enough to output only regional (as opposed to global) grids with weights. To summarize, regridding RRG data leads to three difficulties (#1-3 below) and two difficulties (#4-5) shared with FV RRG files:

  1. RRG files contain only regional gridcell center locations, not weights

  2. Global SE grids have well-defined weights not vertices for each gridpoint

  3. Grid generation software (ESMF and TempestRemap) only create global not regional SE grid files

  4. Non-standard variable names and dimension names

  5. Regional files can contain multiple regions

ncremap's RRG mode resolves these issues to allow trouble-free regridding of SE RRG files. The user must provide two additional input arguments, --dat_glb=dat_glb and --grd_glb=grd_glb to point to a global SE dataset and grid, respectively, of the same resolution as the model that generated the RRG datasets. Hence a typical RRG regridding invocation is:
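A sketch of that invocation, using the file names defined in the next paragraph (all placeholders):

```shell
# Sketch only: regional SE regridding with global helper dataset and grid
ncremap --dat_glb=dat_glb.nc --grd_glb=grd_glb.nc -g grd_rgn.nc dat_rgn.nc dat_rgr.nc
```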

Here grd_rgn is a regional destination grid-file, dat_rgn is the RRG file to regrid, and dat_rgr is the regridded output. Typically grd_rgn is a uniform rectangular grid covering the same region as the RRG file. Generate this as described in the last example in the section above on "Manual Grid-file Generation". grd_glb is the standard dual-grid grid-file for the SE resolution of the simulation, e.g., ne30np4_pentagons.091226.nc. ncremap regrids the global data file dat_glb to the global dual-grid in order to produce an intermediate global file annotated with gridcell vertices. Then it hyperslabs the lat/lon coordinates (and vertices) from the regional domain to use when regridding the RRG file. A grd_glb file with only one 2D field suffices (and is fastest) for producing the information needed by the RRG procedure. One can prepare an optimal dat_glb file by subsetting any 2D variable (e.g., ncks -v FSNT in.nc dat_glb.nc) from a full global SE output dataset.

ncremap RRG mode supports two additional options to override parameters set internally. First, the per-region suffix string may be set with --rnm_sng=rnm_sng. RRG mode will, by default, regrid the first region it finds in an RRG file. Explicitly set the desired region with rnm_sng for files with multiple regions, e.g., --rnm_sng='_128e_to_134e_9s_to_16s'. Second, the bounding-box of the region may be explicitly set with --bb_wesn=lon_wst,lon_est,lat_sth,lat_nrt. The normal parsing of the bounding-box string from the suffix string may fail in (as yet undiscovered) corner cases, and the --bb_wesn option provides a workaround. The bounding-box string must include the entire RRG region, specified in WESN order. The two override options may be used independently or together, as in:
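A sketch combining both overrides (suffix, bounding-box, and file names are illustrative placeholders):

```shell
# Sketch only: explicit region suffix plus its bounding-box in WESN order
ncremap --rnm_sng='_128e_to_134e_9s_to_16s' --bb_wesn='128.0,134.0,-16.0,-9.0' \
  --dat_glb=dat_glb.nc --grd_glb=grd_glb.nc -g grd_rgn.nc dat_rgn.nc dat_rgr.nc
```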

RRG-mode supports most normal ncremap options, including input and output methods and regridding algorithms.

Advanced Regridding V: Make All Weight Files (MWF-mode)

As mentioned above in the TempestRemap section, ncremap includes an MWF-mode (for "Make All Weight Files") that generates and names, with one command and in a self-consistent manner, all combinations of E3SM global atmosphere<->ocean maps with both ERWG and Tempest. MWF-mode automates the laborious and error-prone process of generating numerous map-files with various switches. Its chief use occurs when developing and testing new global grid-pairs for the E3SM atmosphere and ocean components. Invoke MWF-mode with a number of specialized options to control the naming of the output map-files:
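A sketch of an MWF-mode invocation (option spellings assumed from recent NCO; grid-files and short names are placeholders):

```shell
# Sketch only: generate all atmosphere<->ocean map combinations in one command
ncremap -P mwf -s grd_ocn.nc -g grd_atm.g \
  --nm_src=ocn_grid_name --nm_dst=atm_grid_name --dt_sng=20200101
```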

where grd_ocn is the "global" ocean grid, grd_atm is the global atmosphere grid, nm_src sets the shortened name for the source (ocean) grid as it will appear in the output map-files, nm_dst similarly sets the shortened name for the destination (atmosphere) grid, and dt_sng sets the date-stamp in the output map-file name map_${nm_src}_to_${nm_dst}_${alg_typ}.${dt_sng}.nc. Setting nm_src, nm_dst, and dt_sng is optional though highly recommended. For example,

produces the 10 ERWG map-files:

The ordering of source and destination grids is immaterial for ERWG maps since MWF-mode produces all map combinations. However, as described above in the TempestRemap section, the Tempest overlap-mesh generator must be called with the smaller grid preceding the larger grid. For this reason, always invoke MWF-mode with the smaller grid (i.e., the ocean) as the source, otherwise some Tempest map-files will fail to generate. The six optimized SE<->FV Tempest maps described above in the TempestRemap section will be generated when the destination grid has a ".g" suffix, which ncremap interprets as indicating an Exodus-format SE grid (NB: this assumption is an implementation convenience that can be modified if necessary). For example,

produces the 6 TempestRemap map-files:

MWF-mode takes significant time to complete (~20 minutes on my MacBookPro) for the above grids. To accelerate this, consider installing the MPI-enabled rather than the serial version of ERWG. Then use the --wgt_cmd option to tell ncremap the MPI configuration with which to invoke ERWG, for example:
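A sketch of such an invocation (task count, grid-files, and names are placeholders; MWF-mode option spellings assumed from recent NCO):

```shell
# Sketch only: hand ncremap an MPI-enabled ERWG command
ncremap -P mwf -s grd_ocn.nc -g grd_atm.g --nm_src=ocn --nm_dst=atm --dt_sng=20200101 \
  --wgt_cmd='mpirun -np 8 ESMF_RegridWeightGen'
```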

Background and distributed node parallelism (as described above in the Parallelism section) of MWF-mode are possible though not yet implemented. Please let us know if this feature is desired.

Advanced Regridding VI: CMIP6 Timeseries

This section describes the recommended procedures to construct and regrid E3SM timeseries data to CMIP6 specifications. Most models provide data to CMIP6 in timeseries format, meaning one variable per file with multiple years per file. These timeseries must be regridded to at least one of the CMIP6 standard grids. The E3SM project chose to supply its v1 experiments to CMIP6 archived on rectangular, uniform (i.e., equiangular in latitude and longitude), one-degree (for standard-resolution) and quarter-degree (for high-resolution) grids. Generating these timeseries from experiments as lengthy as 500 model years, formatted to CMIP6 specifications, requires many non-standard options to both ncclimo (to construct the timeseries) and ncremap (to regrid the timeseries), and is a natural capstone exercise in using both together. This section is arranged in reverse order: first we present the final commands, then the descriptions, meanings, and reasons for particular options.

The recommended procedures for generating EAM and MPAS-O timeseries of the 500-yr DECK pre-industrial simulations for CMIP6 are:
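As a hedged sketch of the shape of such a command-set (all paths, the variable list, and the map-file name are placeholders; each option is explained in the paragraphs below):

```shell
# Sketch of an EAM 2D-variable splitter/regridder command-set (placeholder names throughout)
drc_in=/path/to/native/monthly/history    # native-grid monthly input files
drc_out=/path/to/native/timeseries        # native-grid timeseries output
drc_rgr=/path/to/regridded/timeseries     # regridded (CMIP6) timeseries output
map=map_ne30np4_to_cmip6_180x360_aave.20181001.nc # placeholder map-file name
cmip6_opt='-7 --dfl_lvl=1 --no_cll_msr --no_frm_trm --no_stg_grd'
spl_opt='--yr_srt=1 --yr_end=500 --ypf=500'
cd ${drc_in}
ls *.eam.h0.*.nc | ncclimo ${cmip6_opt} ${spl_opt} --var=FSNT,FLNT \
   --map=${map} --drc_out=${drc_out} --drc_rgr=${drc_rgr}
```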

Take a moment to compare the methods for EAM and for MPAS. They are nearly identical except for the variable names, experiment names, directories, and map-files (so far nothing surprising or important) AND the additional MPAS options in ${mpas_opt}. We will discuss those soon. Each command-set begins by setting the experiment-dependent I/O directories and a map-file. Other experiments require changing these to the appropriate I/O directories, yet the map-file remains the same unless the native or destination grid changes. The next three or four lines in each command-set configure the splitter and regridder with options that many ncclimo/ncremap users have never before tried. Finally, the list of input files and all the configuration options are sent to ncclimo. For the user, the entire procedure boils down to creating and then executing this one splitter command for each desired variable.

Regridding is performed only if the splitter (i.e., ncclimo) is invoked with the --map option that supplies a suitable map-file from the native to the desired destination grid. CMIP6 will only distribute data on 2D structured grids, yet E3SM will itself distribute the timeseries on native (unstructured) grids. Hence the commands above construct both the native timeseries (stored in ${drc_out}) and the regridded timeseries for CMIP6 (stored in ${drc_rgr}). Internally, the splitter constructs the native-grid timeseries for the same time segment for all requested variables in parallel, waits for completion, then calls the regridder (ncremap) in parallel with all timeseries for that segment, waits, and continues to the next segment.

The first configuration options to discuss are the MPAS-specific options. In order to automatically trigger a number of MPAS-specific behaviors, the regridder must first know that the model type is MPAS. When invoked with '-m mpas' the splitter will pass the MPAS-flag to the regridder. The splitter itself simply creates timeseries and does nothing different for MPAS files other than pass options through to the regridder. MPAS outputs its native grid data in double precision, not single precision like EAM/ELM/CAM. Thus raw MPAS datasets are twice the size needed by most analyses. The --d2f flag tells the regridder to demote doubles (unless they are coordinate variables) to floats in an additional pre-processing step. Otherwise, regridded MPAS output would be twice the size with no appreciable benefits for analysis.

The CMIP6-specific options (${cmip6_opt}) collectively ensure that the timeseries are compliant, compact, and concise. CMIP6 requires datasets be in netCDF4-Classic format, i.e., netCDF4 storage constrained to the netCDF3 API. This is achieved with the -7 switch (mnemonic: 7=4+3). Additionally, CMIP6 requires datasets use netCDF4's internal compression, the DEFLATE algorithm (same as in gzip). We recommend deflation level 1 (i.e., --dfl_lvl=1) since higher levels compress only marginally better yet require significantly more wallclock time.

The next three CMIP6 options trim the timeseries to exclude variables that could otherwise be included. The --no_cll_msr (no-cell-measures) switch excludes variables typically listed in the CF cell_measures attribute, such as gridcell area and volume. The --no_frm_trm (no-formula-terms) switch excludes variables that appear in the CF formula_terms attribute, notably the 2D surface pressure (PS) for EAM. The --no_stg_grd (no-staggered-grid) switch excludes the offset (aka staggered) grid that ncremap normally adds to rectangular output grids; the specific variables excluded are slat, slon, and w_stag. This option has no downside for MPAS data, although it can cause problems for older versions of the AMWG diagnostics. Thus timeseries processed with these options include no "extras" that might inflate their size (or, alas, their convenience).

The splitter options (${spl_opt}) configure the timeseries length and number of segments. The splitter expects the number of (monthly) input files to equal the number of years (between ${yr_srt} and ${yr_end}, inclusive) times twelve. This sanity check helps prevent inadvertent omission/inclusion of unwanted months. Each timeseries is split, if necessary, into a number of segments of equal length, plus possibly one shorter tail segment. The --ypf option specifies the number of years per file (i.e., segment). CMIP6 recommends file sizes be no greater than a few gigabytes. Factors that influence segment filesize include the segment length, the variable rank and number of layers if 3D, and, for regridded timeseries, the grid resolution and the presence of missing values (e.g., due to ocean bathymetry). A compromise that meets these criteria is segment lengths of up to 500 years for 2D variables and 25 years for 3D variables. For consistency, these same segment-length limits are used for all E3SM v1 models and experiments in CMIP6. Note that because the segment lengths differ for 2D and 3D variables, the splitter must be called at least twice per experiment, once with the 2D variables (supplied with --var) and their segment size, and again with the 3D variables and theirs.
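Concretely, the two invocations might look like the following sketch (variable lists are placeholders; ${cmip6_opt}, ${map}, and the directory variables are assumed set as in the surrounding discussion):

```shell
# One splitter call per segment-length class (variable lists are illustrative)
ls *.eam.h0.*.nc | ncclimo ${cmip6_opt} --var=FSNT,FLNT,TS --ypf=500 \
   --yr_srt=1 --yr_end=500 --map=${map} --drc_out=${drc_out} --drc_rgr=${drc_rgr} # 2D
ls *.eam.h0.*.nc | ncclimo ${cmip6_opt} --var=T,Q,U,V --ypf=25 \
   --yr_srt=1 --yr_end=500 --map=${map} --drc_out=${drc_out} --drc_rgr=${drc_rgr} # 3D
```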

These 500-year and 25-year segment lengths yield native-grid files of sizes ~800 MB and 2.3 GB for v1 EAM 2D and 3D variables, respectively, that have regridded (to 1x1 degree) sizes of ~1.0 GB and 3.0 GB. For v1 MPAS-Ocean data, these segment lengths yield native-grid files of sizes ~9.7 GB and 22 GB for MPAS-Ocean 2D and 3D variables, respectively, that have regridded (to 1x1 degree) sizes of ~900 MB and 1.5 GB. Hence all the regridded E3SM data distributed by CMIP6 will be in files of sizes between 1-3 GB. Note MPAS regridded data consumes ~90% less space than native grid data. This is due to two factors: 1. Raw MPAS data do not utilize the netCDF _FillValue attribute (which would substantially improve compression), and 2. Raw MPAS data are double precision not single precision.

The RAM overhead of timeseries generation can also be a factor on small nodes. Splitting does most of its work on disk and so requires only as much RAM as required to store a single timestep of a single variable. Regridding is a different kettle of fish, a bird of another feather, and potentially a can of worms. The maximum RAM usage is about three times the uncompressed size of the entire timeseries. For the 500-yr 2D and 25-yr 3D segments considered here, expect peak RAM usage of 20 GB and 64 GB, respectively, for MPAS data. If the regridder exhausts available memory when called with multiple variables, then reduce the parallelization over variables using the --job=${job_nbr} option (not shown). This is unlikely to occur on beefy nodes because job_nbr defaults to 2 (i.e., variables are split and regridded in groups of two). The splitter parallelizes well for typical timeseries of 2D variables, and can be invoked with ${job_nbr} exceeding 100 when no regridding (which consumes much more memory than splitting) is performed.

Now that the content of the rather lengthy CMIP6 splitter/regridder commands has been explained, it is worthwhile describing the method of invocation. The splitter accepts filenames supplied in numerous ways (command-line arguments, pipes to stdin, directory contents, redirection operators) as described above. For large numbers of input files typical of CMIP6 experiments, piping filenames as output by ls from the input file directory into the splitter is preferred for two reasons. First, ls automatically sorts files into alphanumeric order. This is equivalent to timeseries order because of the filename conventions employed by E3SM. Thus ls ensures that timeseries monotonically increase. Moreover, ls understands command-line globbing to simplify culling only required time periods from directories with longer simulations. Second, issuing ls from the input file directory removes the lengthy path component of each filename received by the splitter. For a 500-year pre-industrial DECK simulation, this removes 500*12=6000 copies of the same ~100-character directory path from the provenance metadata maintained in the history attribute of each downstream file.
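For example, a sketch of the preferred invocation pattern (the glob pattern and option bundles are illustrative):

```shell
# Issue ls from within the input directory: sorted, path-free names reach the splitter
cd ${drc_in}
ls *.eam.h0.????-??.nc | ncclimo ${cmip6_opt} ${spl_opt} --map=${map} \
   --drc_out=${drc_out} --drc_rgr=${drc_rgr}
```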

Finally, note that we explicitly set ${TMPDIR} to a capacious writable directory prior to execution. The regridder writes all intermediate files to this directory, and removes them only upon successful completion. (The splitter itself never writes to ${TMPDIR}). However, for MPAS files, the regridder may write as many as three or four intermediate files per output file to ${TMPDIR}. Since some 3D MPAS-Ocean DECK PI files are 10's of GB in size, it is best to ensure the intermediate files are written to volatile storage. They will be automatically deleted upon successful completion of regridding. Should the splitter or regridder fail for any reason, the files will remain in ${TMPDIR} to assist in debugging. Thus it is best if ${TMPDIR} is automatically scrubbed every so often, e.g., on re-boots as with most Linux and MacOS workstations.
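For example (a sketch; the scratch location is a placeholder for any large, periodically scrubbed filesystem):

```shell
# Direct intermediate regridder files to capacious, auto-scrubbed storage
# ${SCRATCH} is a placeholder; it falls back to /tmp here if unset
export TMPDIR=${SCRATCH:-/tmp}/ncremap_tmp
mkdir -p ${TMPDIR}
```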

This discussion of splitting and regridding has focused on "one-off" experiments such as the DECK 500-yr pre-industrial simulation. The above methods with minor modifications also apply to ensemble experiments such as those with historical forcing since 1850. For example, this generates CMIP6 timeseries to analyze historical cloud radiative effects in the ensemble of five E3SM v1 simulations designated H1-H5:

The main difference between generating the timeseries for the Historical ensemble and the DECK PI experiment is the need to loop over the ensemble. Here the splitter command is not backgrounded so that one member experiment is processed at a time (to avoid overwhelming nodes with I/O and RAM demands). Set the input directory in the ensemble loop, and ensure the globbing pattern for filenames matches the naming convention used for all five members. Consider whether to output to member-specific directories or to a single, ensemble-wide directory. If the former, then nothing special need be done. If the latter, use the --fml_nm (family-name) option as above to avoid identical timeseries names (that will overwrite one another) and to create instead member-specific timeseries names like

CLDLOW_H1_185001_201412.nc
CLDLOW_H2_185001_201412.nc
...
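A hedged sketch of such an ensemble loop (paths, years, and the variable list are illustrative; note the splitter command is not backgrounded, so one member is processed at a time):

```shell
# Process one Historical member at a time; --fml_nm keys output names to the member
for mbr in H1 H2 H3 H4 H5 ; do
  drc_in=/path/to/${mbr}/monthly/history # member-specific input (placeholder path)
  cd ${drc_in}
  ls *.eam.h0.*.nc | ncclimo ${cmip6_opt} --fml_nm=${mbr} --var=CLDLOW \
     --yr_srt=1850 --yr_end=2014 --ypf=500 --map=${map} \
     --drc_out=${drc_out} --drc_rgr=${drc_rgr}
done
```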

Advanced Regridding VII: Initial Condition Files

First, use the right tool. ncremap can regrid an initial conditions (IC) file, both vertically and horizontally. However, generating scientifically validated IC files for new model resolutions is best done with a lengthy workflow (https://acme-climate.atlassian.net/wiki/spaces/ED/pages/872579110/Adding+support+for+new+grids) in which regridding plays a relatively small role. That said, ncremap is a good tool to place an atmospheric state onto a new grid where it can then be nudged into a valid IC file, or to place the contents of an IC file on a rectangular grid (as shown below) where it is easier to plot. Regridding atmosphere IC files was straightforward until E3SM v2 when the atmosphere separated the dynamical and physical grids. This resulted in IC files containing two sets of grid variables, so now we must regrid v2 IC files with two successive invocations of ncremap:

ncremap --map=map_ne30pg2_to_cmip6_180x360_nco.20200901.nc foo.eam.i.2001-01-01-00000.nc foo.nc
ncremap --map=map_ne30np4_to_cmip6_180x360_nco.20200901.nc foo.nc foo.eam.i.2001-01-01-00000.rgr.nc

The first command remaps the variables on the PG2 physics grid to the desired output grid. This is straightforward since the physics-grid variables (SICTHK, ICEFRAC, TS1, ...) use ncol (the ncremap default) as the horizontal coordinate. The second command remaps the remaining variables, stored on the NP4 dynamics grid with horizontal dimension ncol_d, to the same output grid. The second invocation automatically regrids all the variables on the dynamics grid (with the ncol_d dimension) because the intermediate file foo.nc no longer has an ncol dimension.

To regrid only the variables with a non-default (e.g., not ncol) horizontal dimension in a file with multiple horizontal dimensions, one would explicitly select the non-default dimension using the ncremap -R option, whose argument is passed directly to the underlying regridder (ncks):

ncremap -R '--rgr col_nm=ncol_d' --map=map_ne30np4_to_cmip6_180x360_nco.20200901.nc foo.eam.i.2001-01-01-00000.nc foo.rgr.nc

Advanced Regridding VIII: Fixing Grid Files to Work as Intended

Gridfiles store a wealth of highly precise information using loosely standardized rules that are open to interpretation. One may encounter gridfiles that conform to one regridder's expectations though not another's. This section provides guidance on how to adjust or repair some of the most frequently encountered problems with gridfiles. The problems currently addressed are floating-point masks, _FillValue attributes, and imprecise RLL grids.

Consider first floating-point mask variables (grid_imask) in SCRIP files: they are non-standard and may break some software. TempestRemap's GenerateOverlapMesh program, for example, breaks (as of this writing, 20210428) when asked to ingest a SCRIP grid-file with a floating-point mask. This problem can occur when using grid-files from TR's old ConvertExodusToSCRIP program, which output floating-point masks. The newer TR program, ConvertMeshToSCRIP, appears to have fixed this problem (as of this writing, 20240110). Also, users often create masks from floating-point variables (as described in the next section) and inadvertently leave the mask as a floating-point variable. The solution to the problem of floating-point masks can be as simple as a straightforward conversion of the mask to an integer:

ncap2 -s 'grid_imask=int(grid_imask)' grd.nc # Integerize mask

However, ncap2 uses implicit type-conversion rules that truncate floating-point values, i.e., round down toward zero, so an input floating-point mask value of 0.99999 would be converted to an integer mask value of zero in the output file. If this is not the desired behavior, consider

ncap2 -s 'grid_imask=int(round(grid_imask))' grd.nc # Integerize mask

This converts input floating point mask values between 0.5 and 1.5 to integer mask values of one in the output file.
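Either way, one can confirm the conversion took hold by printing the variable's metadata with ncks:

```shell
# Inspect grid_imask metadata; its declared type should now be an integer type
ncks -m -v grid_imask grd.nc
```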

The attribute to identify data whose values are missing, impossible, or not-yet defined in a netCDF dataset has long been _FillValue. Many people are used to adding _FillValue attributes to all variables in data files. That is usually fine and it is understandable how that practice sometimes gets carried into grid-files. However, there is no reason for a SCRIP grid-file to include any _FillValue attributes. The SCRIP format allows the incorporation of a mask variable (grid_imask, discussed further in the next section) that performs some of the roles of a _FillValue attribute. Masks designate cells intended to be ignored by the weight generation algorithm; this does not require that data values be ill-defined in masked-out cells, just that masked cells not contribute to (for source-grid masks) or receive (for destination-grid masks) mapping weights.

In any case, the interpretation of values identified as missing due to their equality with the _FillValue attribute is not standardized and will likely vary among weight-generation software. To avoid this ambiguity, we recommend removing the _FillValue attribute from all variables in the grid-file:

ncatted -a _FillValue,,d,, grd.nc # Remove all _FillValue attributes

The last issue we address here is imprecise RLL grids. Until about a decade ago, climate modelers often generated rectangular (including Gaussian) grids with only single-precision arithmetic. That often works well enough, especially for low-resolution grids. However, we design weight-generation software to be as precise and repeatable as possible, and this requires double-precision accuracy. Moreover, for reasons not completely understood (though perhaps related to the density of RLL meridians near the poles), high-resolution single-precision RLL grids break some weight-generation software. For example, single-precision globally uniform 3-minute (0.05x0.05 degree) RLL grids tend to either break NCO or cause it to generate bad weights. ERWG usually completes weight generation with such grids, though the weights are often bad.

The first step in avoiding imprecise RLL grids is recognizing the symptoms. If ERWG or NCO produce bad weights, e.g., max(frac_a,frac_b) > 1.01, this might be due to an imprecise grid. The next step is to verify whether the RLL gridcell corners exhibit random behavior after the seventh digit or so. For example, examine the first (or last) few gridcells and verify that the increments between corners (and/or centers) are as expected:

zender@cori06:~$ grid=/global/cfs/cdirs/e3sm/inputdata/lnd/clm2/mappingdata/grids/SCRIPgrid_3minx3min_GLOBE-Gardner_c120922.nc
zender@cori06:~$ ncks --trd -H -d grid_size,,2 -v grid_corner_lat,grid_corner_lon,grid_center_lat,grid_center_lon ${grid}
grid_size[0] grid_center_lat[0]=-89.9749984741
grid_size[1] grid_center_lat[1]=-89.9749984741
grid_size[2] grid_center_lat[2]=-89.9749984741

grid_size[0] grid_center_lon[0]=-179.975006104
grid_size[1] grid_center_lon[1]=-179.925003052
grid_size[2] grid_center_lon[2]=-179.875
...

These gridcell centers exhibit random behavior after the seventh digit. If we regenerate this grid with full double-precision accuracy, the grid vertices and centers will be exactly 0.05 degrees apart, as desired:

zender@cori06:~$ ncremap -G latlon=3600,7200#lat_typ=uni#lon_typ=180_wst -g grd.nc
zender@cori06:~$ ncks --trd -H -d grid_size,,2 -v grid_corner_lat,grid_corner_lon,grid_center_lat,grid_center_lon grd.nc
grid_size[0] grid_center_lat[0]=-89.975
grid_size[1] grid_center_lat[1]=-89.975
grid_size[2] grid_center_lat[2]=-89.975

grid_size[0] grid_center_lon[0]=-179.975
grid_size[1] grid_center_lon[1]=-179.925
grid_size[2] grid_center_lon[2]=-179.875
...

Once the grid contains coordinates accurate to double-precision, other grid-file fields such as the mask (grid_imask) can be appended at will, including from the original, imprecise file:

ncks -C -m -A -v grid_imask ${grid} grid.nc # Replace newly generated default mask with GLOBE/Gardner landmask
ncks -O -7 -L 1 grid.nc grid.nc # Optional step to compress large grids

Advanced Regridding IX: Creating and Using Land Surface Grids with Masks

Regridding software is commonly used to convert continuous fields into binary (True/False) masks for use within numerical models. Land surface models, for example, use a "land mask" to distinguish gridcells containing land (mask is True = 1) from non-land gridcells (mask is False = 0). Similarly, glacier models use an "ice mask" to distinguish gridcells available to the ice sheet from non-ice gridcells. This section describes the steps to follow, and the potential crevasses to avoid, when manipulating fields with masks into (primarily SCRIP-format) grid-files that can then be used, as described above, in the creation of weights to map between grids.

Masks intended for use in regridding must satisfy constraints imposed by most or all weight-generation software. Such masks must be integer-valued and time-invariant (no time dimension). Regridding masks are thus one-dimensional for unstructured grids, or two-dimensional for rectangular or curvilinear grids. Some weight-generation software (TempestRemap, for example) will crash if it encounters a floating-point valued mask. No known software will create, or apply, time-varying map-files, so time dimensions in grid files must be avoided.

Three generic tasks often occur during mask creation: 1) Using rules to combine continuous field(s) into a mask, 2) Merging a mask into a grid-file, 3) Inferring a mask from an existing continuous field.

As an example of defining a mask from a set of rules applied to continuous input data, consider the problem of defining an ice mask (e.g., for ELM/E3SM) from fields like bathymetry/topography and ice thickness, which MALI produces in the output variables bedTopography and thickness, respectively. Say one wishes the mask to include all gridcells that have multi-year land-based ice as well as gridcells that might form multi-year ice in a simulation. NCO can process these (and other) conditions into an integer-valued output field using the ncap2 where() operator:

ncap2 -s 'icemask[lat,lon]=0;where(thickness > 0 || bedTopography > 0) icemask=1' in.nc mask.nc

This command ensures an integer-valued mask by initializing the icemask variable to 0 (note the lack of a decimal point), an integer. Initializing to 0.0f or 0.0 would instead define icemask as a single- or double-precision floating-point variable, respectively. If the input variables include a leading time dimension, then the above command can be modified to define a time-invariant mask in terms of the variables at the initial timestep:

ncap2 -s 'icemask[lat,lon]=0;where(thickness(0,:,:) > 0 || bedTopography(0,:,:) > 0) icemask=1' in.nc mask.nc

A mask defined from a data file in this way must then be added to the gridfile in order to be utilized when generating remapping weights. This is straightforward so long as the name, type, rank, and dimensionality of the mask variable match the expectations of the gridfile:

ncrename -v icemask,grid_imask mask.nc
ncks -A -C -v grid_imask mask.nc grd.nc

Next, consider inferring a mask from a single continuous input field. Suppose the input field has non-zero floating point values where the mask is to be true, and _FillValue or zero-values where the mask is to be false. ncremap will automatically infer the correct mask from a data file containing that input field. For example, RACMO ice sheet model data contains a field named Icemask_GR that is true only over multi-year Greenland ice. One can create a gridfile that contains a mask variable (grid_imask) derived from the Icemask_GR field like this:

ncremap --msk_dst=Icemask_GR --dst_fl=racmo_data.nc --grd_dst=grid_icemask.nc

In this case the grid-file is complete and maps may be created from it, e.g.,

ncremap --grd_src=grid_icemask.nc --grd_dst=1x1.nc --map=map.nc

Such a map applied to a RACMO dataset would allow only points inside the Greenland Ice Sheet to contribute to the destination (analysis) grid.

Epilogue: User-Suggested Examples

E3SM employs some of the most advanced grid formulations used in ESM modeling today, and since regridding is a complex subject full of details, it is unlikely that the foregoing documentation already answers everyone's questions. Moreover, since some of the greatest E3SM innovations and optimizations rely on even newer grid treatments, we are unlikely to ever stop needing to learn newer regridding techniques. If you have read this far, you know about ncremap's main capabilities, yet perhaps you (like U2) still haven't found what you're looking for, or you have a hunch about what one of those relevant-sounding-though-undocumented (at least in Confluence) options that appears with ncremap --help is for. In that spirit, we welcome user requests for annotated examples for their real-world regridding issues. Fire away!