Past versions of E3SM initialized some variables to NaN. As a result, simulations that use older restart files will still initialize variables such as TWS_MONTH_BEGIN to NaN, which may cause runtime errors (examples of how these errors may appear in e3sm.log.* files are included below).

Solution

  • Ensure that the initial fix is in your branch (here: commit with fix).

  • If your simulation uses an initial condition file for land (finidat), replace the NaNs in TWS_MONTH_BEGIN with the fill value of 1.e+36. Two methods to perform the conversion are given below. Both overwrite the input file, so make a copy of it first.

    • Using NCO functions:

      Code Block
      languagenone
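      # Temporarily declare NaN as the fill value of TWS_MONTH_BEGIN
      ncatted -a _FillValue,TWS_MONTH_BEGIN,o,f,NaN infile.nc
      # Change the fill value; NCO also rewrites the data values that matched the old one
      ncatted -a _FillValue,TWS_MONTH_BEGIN,m,f,1.0e36 infile.nc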
    • Using a Python script:

      Code Block
      languagepy
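      from netCDF4 import Dataset
      import numpy as np
      
      # Open in read/write mode; infile.nc is modified in place
      ofile = Dataset('infile.nc', 'r+')
      var = ofile.variables['TWS_MONTH_BEGIN']
      var.set_auto_mask(False)  # work with raw values, not masked arrays
      data = var[:]  # read the values into memory
      data[np.isnan(data)] = 1.e+36  # replace NaNs with the fill value
      var[:] = data  # write the modified values back to the file
      ofile.close()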
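
Because both methods above overwrite the input file, it can help to wrap the conversion in a small helper that makes the backup copy itself and reports how many values were changed. The following is a minimal sketch, not part of E3SM: the names `replace_nans`, `fix_file`, and the `.bak` suffix are hypothetical, and it assumes the netCDF4 and NumPy packages are installed.

```python
import shutil

import numpy as np

FILL_VALUE = 1.0e36  # fill value named in the instructions above


def replace_nans(values, fill_value=FILL_VALUE):
    """Return (cleaned copy of values, number of NaNs replaced)."""
    cleaned = np.array(values)  # np.array copies, so the input is left untouched
    nan_mask = np.isnan(cleaned)
    cleaned[nan_mask] = fill_value
    return cleaned, int(nan_mask.sum())


def fix_file(path, varname="TWS_MONTH_BEGIN", backup_suffix=".bak"):
    """Back up `path` (hypothetical helper), then replace NaNs in `varname` in place."""
    # Imported here so that replace_nans can be used without netCDF4 installed.
    from netCDF4 import Dataset

    shutil.copy2(path, path + backup_suffix)  # keep an untouched copy of the file
    with Dataset(path, "r+") as ds:
        var = ds.variables[varname]
        var.set_auto_mask(False)  # read raw values instead of masked arrays
        cleaned, n_replaced = replace_nans(var[:])
        var[:] = cleaned  # write the cleaned values back to the file
    return n_replaced
```

Calling `fix_file('infile.nc')` would leave the original in `infile.nc.bak` and return the number of NaNs replaced. A return value of 0 suggests the file did not need converting.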

E3SM v2.1

E3SM v3

Example backtrace due to floating point exception

...

896: PIO: FATAL ERROR: Aborting... An error occured, Writing variables (number of variables = 180) to file (./E3SM.2023-SCIDAC.ne30pg2_EC30to60E2r2.AMIP.EF_0.13.CF_22.HD_0.56.elm.h0.1984-01.nc, ncid=150) using PIO_IOTYPE_PNETCDF iotype failed. Non blocking write for variable (TWS_MONTH_BEGIN, varid=206) failed (Number of subarray requests/regions=1, Size of data local to this process = 982). NetCDF: Numeric conversion not representable (err=-60). Aborting since the error handler was set to PIO_INTERNAL_ERROR... (/global/u1/w/whannah/E3SM/E3SM_SRC2/externals/scorpio/src/clib/pio_darray_int.c: 395)
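
To check whether a failed run matches this signature, the log text can be searched for the failing variable and the NetCDF error. The sketch below is an illustration only: the regular expression is modeled on the single example message above, so other PIO versions may word the message differently, and the function name `find_failed_writes` is hypothetical.

```python
import re

# Pattern modeled on the example PIO message above (an assumption, not a PIO spec).
PIO_WRITE_FAILURE = re.compile(
    r"Non blocking write for variable \((?P<name>\w+), varid=(?P<varid>\d+)\) failed"
    r".*?NetCDF: (?P<reason>[^.(]+?) \(err=(?P<err>-?\d+)\)"
)


def find_failed_writes(log_text):
    """Return (variable name, NetCDF reason, error code) for each matching log line."""
    hits = []
    for line in log_text.splitlines():
        match = PIO_WRITE_FAILURE.search(line)
        if match:
            hits.append((match["name"], match["reason"], int(match["err"])))
    return hits
```

Applied to the contents of an e3sm.log.* file, a hit for TWS_MONTH_BEGIN with the reason "Numeric conversion not representable" points to the NaN problem described on this page.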

History

  • Tests failed restart comparison due to missing TWS_MONTH_BEGIN restart variable (Issue #4649)

  • Longer tests that restart at beginning of the month failed restart comparison due to col_ws%endwb not being on the restart file. (Issue #5079)

  • The initial condition for a production test had to be converted from NaNs (PR #5811)