Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The surface topography dataset grid SCRIPgrid_1km-merge-10min_HYDRO1K-merge-nomask_c130402.nc (hereafter the HYDRO1K grid or hydro1k.nc for short) used by E3SM and CESM is self-overlapping. It contains numerous (about 17) quadrilateral gridcells that partially overlap one another. These are cells of differing orientations and sizes that seem to originate from different measurements. They are not repeated or duplicate gridcells. Weight-generators that receive the same input location area twice might (if they do not take precaustions precautions to idenfity the issue, which no known weight-generators do) double-weight the self-overlapped region(s). In other words, self-overlapping input grids can lead weight-generators to produce values frac_b >> 1.0. Applying these weights would lead to exaggerated values on the destination grid.The best solution to this issue is to adjust the input grid to avoid self-overlap. However, this solution may be difficult or impractical where the original data, producer, or algorithm are unavailable or unclear. In such cases, the --frac_b_nrm flag provides a workaround. Please understand that

Before describing the NCO method to address self-overlapping, we should mention that the HYDRO1k grid also contains one gridcell whose vertices are stored in clockwise (CW) not counter-clockwise (CCW) order. This gridcell violates the CCW ordering convention followed by all known weight generators and can lead to failture and carnage if not corrected. The practical solution is to re-order the vertices in that gridcell, thereby creating a modified input grid that is completely CCW (yet still suffers from self-overlapping gridcells):

ncap2 -O --no_tmp_fl -s 'grid_corner_lon(50085604,:)={100.024421691894531,100.024421691894531,100.012207031250000,100.012207031250000};grid_corner_lat(50085604,:)={42.598628997802734,42.589633941650391,42.589637756347656,42.598632812500000};' hydro1k.nc hydro1k_ccw.nc

NCO commands to produce maps that convert HYDRO1K elevation data (on the fixed, pure-CCW grid) to any desired output grid should include appropriate options for niceness, format, and speed. First, the HYDRO1K grid is memory-intensive. Map-files based on it must be created on a node with plenty of RAM. For example, the HYDRO1K-to-ne1024np4 map requires about 200 GB RAM. Etiquette requires lowering the priority of memory and CPU-intensive jobs with the nice command on shared resources such as login-nodes. Second, the output weight and vertice arrays may exceed the capacities of the netCDF3 classic and 64-bit offset formats so we recommend setting the output format to CDF5 or to netCDF4-classic with the -5 or -7 switches, respectively. Third, the weight-generation code is threaded and our preliminary benchmarks show that it can scale well up to at least six threads. These options yield commands of the form

nice ncremap -5 --thr_nbr=6 --src_grd=hydro1k_ccw.nc --dst_grd=ne1024np4_scrip_c20191023.nc --map=map_hydro1k_to_ne1024np4_nco.20200401.nc

Such maps will suffer in correctness due to self-overlapping input gridcells as discussed above. The map-checker helps diagnose the extent of the self-overlapping issue:

% ncks --chk_map map_hydro1k_to_ne1024np4_nco.20200401.nc
...
frac_b max: 1.9322116675425409 = 1.0+9.3e-01 // Maximum in grid B cell [27052765,+27.0731,+142.207]
...
WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING
Danger, Will Robinson! max(frac_b) or min(frac_b) error exceeds 1.0e-01
Regridding with these embarrassing weights will produce funny results
Suggest re-generating weights with a better algorithm/weight-generator
Have both input grid-files been validated? If not, one might be barmy
For example, a source grid that overlaps itself will usually result in frac_b >> 1
WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING WARNING

The map-checker reveals that the maximum frac_b, in grid B cell 27052765 is nearly two! This particular destination cell is nearly completely by two independent input gridcells. If this map were applied to the HYDRO1K elevation data (despite the WARNING message not to do so), and the input elevations are nearly equal in the two overlapping input gridcells, then then elevation in the output gridcell would be nearly twice as high as the input elevation. We can learn more information about the self-overlapping problems by invoking the map-checker with a higher debugging level:

% ncks --dbg_lvl=2 --chk_map map_hydro1k_to_ne1024np4_nco.20200401.nc
...
WARNING consistency = 1.0555483035171453 = 1.0+5.6e-02 for grid B cell [17512046,+28.6551,+61.0204]
WARNING consistency = 1.4909408462334364 = 1.0+4.9e-01 for grid B cell [22288166,-10.0243,+142.847]
WARNING consistency = 1.5083400424248017 = 1.0+5.1e-01 for grid B cell [26813058,+24.7855,+141.304]
WARNING consistency = 1.8095310966429947 = 1.0+8.1e-01 for grid B cell [26813059,+24.7929,+141.328]
WARNING consistency = 1.1003605275652104 = 1.0+1.0e-01 for grid B cell [26813062,+24.8271,+141.328]
WARNING consistency = 1.5691716370205997 = 1.0+5.7e-01 for grid B cell [27006682,+26.712,+142.119]
WARNING consistency = 1.3010253856656488 = 1.0+3.0e-01 for grid B cell [27006683,+26.6633,+142.143]
WARNING consistency = 1.5450145730783844 = 1.0+5.5e-01 for grid B cell [27006686,+26.6981,+142.143]
WARNING consistency = 1.6607209155522704 = 1.0+6.6e-01 for grid B cell [27043555,+27.0515,+142.207]
WARNING consistency = 1.3608697313423308 = 1.0+3.6e-01 for grid B cell [27052764,+27.0654,+142.183]
WARNING consistency = 1.9322116675425409 = 1.0+9.3e-01 for grid B cell [27052765,+27.0731,+142.207]
WARNING consistency = 1.1687177202454959 = 1.0+1.7e-01 for grid B cell [27052767,+27.1003,+142.183]
WARNING consistency = 1.3914390175753097 = 1.0+3.9e-01 for grid B cell [27052768,+27.108,+142.207]
WARNING consistency = 1.7110750268542088 = 1.0+7.1e-01 for grid B cell [27052771,+27.1296,+142.207]
WARNING consistency = 1.0005420980107826 = 1.0+5.4e-04 for grid B cell [50257987,+61.086,+59.2809]
WARNING consistency = 1.0004283863341064 = 1.0+4.3e-04 for grid B cell [50257997,+61.0767,+59.3492]
WARNING consistency = 1.0004192624042263 = 1.0+4.2e-04 for grid B cell [50607727,+65.2985,+61.3002]
WARNING non-consistent row-sums (error exceeds tolerance = 1.0e-08) for 17 of 56623106 grid B cells
...

The map checker lists the 17 destination gridcells whose frac_b exceeds 1.0 by a pre-set tolerance of 1.0e-8. We can examine the input grid at these 17 latitudes and longitudes. For the HYDRO1K dataset, we found that there are self-overlapping gridcells at these locations, which include some islands (e.g., Iwo Jima) and continental regions. The resulting elevation biases in these gridcells range between factors of 1.10-1.93. Considering that the other 56623089 destination gridcells trigger no warnings, the ESM simulations are probably not significantly degraded by faulty elevation in only 17 gridcells. However, this remains to be tested.

The best solution to the self-overlapping issue is, of course, to pre-emptively adjust the input grid and input datasets to avoid self-overlap. However, this solution may be difficult or impractical where the original data, producer, or algorithm are unavailable or where it is unclear which of the overlapped gridcells to retain. In such cases, NCO (versions 4.9.2 and later) provides a practical workaround. The ncks --frac_b_nrm flag invokes a corrective procedure that adjusts for the influence of self-overlapping gridcells.

The procedure first identifies destination gridcells as those with frac_b > 1.0+1.0e-8, as shown above.
Weights that contribute to any such gridcell are then normalized by the sum of the weights (i.e., by frac_b) that contribute to the gridcell. In the example above, the maximum frac_b is 1.93, so all weights contributing to that gridcell are divided by 1.93. The adjusted frac_b is then 1.0, which prevents "double counting" values regridded into that gridcell. The weights that were altered (to avoid double-counting values during regridding) will also alter the frac_a of ALL source gridcells involved. A shortcoming of this procedure is that it misses any self-overlapped input gridcells that contribute to destination gridcells with frac_b < 1, such as can occur along coastlines.

Invoking the corrective procedure is as simple as it is irreversible. Please understand that this procedure alters map-files in-place, so backup the original file first:

% /bin/cp map_hydro1k_to_ne1024np4_nco.20200401.nc map_hydro1k_to_ne1024np4_nco_fix.20200401.nc
% ncks --frac_b_nrm map_hydro1k_to_ne1024np4_nco_fix.nc is designed to alter map.nc in-place, so backup the original file first.20200401.nc

The map-checker will show that the "fixed" map-file no longer generates the above WARNINGs associated with frac_b. Instead, the WARNINGs are now associated with frac_a due to the weight adjustment described above:

% ncks --frac_b_nrm dbg_lvl=2 --chk_map map_hydro1k_to_ne1024np4_nco_fix.2020030120200401.nc
...
...

The new frac_a WARNINGs indicate that the intersection portion of each self-overlapped input gridcells will be weighted less than the portion of the gridcell that is not self-overlapped. This is expected and desired so these WARNINGs may be safely ignored.