The purpose of this page is to document the procedure for adding support for new atmosphere grids. The process should be the same for new uniform resolutions as well as for new regionally-refined meshes, although some settings will need to be changed for new regionally-refined mesh configurations. This page is a work in progress, and will be updated as this process is refined and (eventually) made more automated. This documentation is an update of a document written by Mark Taylor and Colin Zarzycki, available as a Google Doc here.

...

  1. Create a batch script and change "account" in the sbatch directives at the top of the script (a combined sketch of such a script follows this list). For example, set #SBATCH --account=e3sm

  2. cmake -C /path/to/e3sm/components/homme/cmake/machineFiles/cori-knl.cmake  -DPREQX_NP=4 /path/to/workingdir

  3. Make sure a working NCL is in your PATH. On Cori, add the following to the script: module load ncl.
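    A minimal sketch combining these steps into a single batch script (the account, module, and file paths are examples for Cori; adjust for your machine and working directory):

    Code Block
    #!/bin/bash
    #SBATCH --account=e3sm
    #SBATCH --nodes=1
    #SBATCH --time=00:30:00

    # NCL must be in the PATH for the grid tooling
    module load ncl

    # Configure standalone HOMME (preqx, np=4) in the working directory
    cmake -C /path/to/e3sm/components/homme/cmake/machineFiles/cori-knl.cmake \
        -DPREQX_NP=4 /path/to/workingdir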

2C. Atmospheric mesh quality

Atmospheric RRM mesh quality can be measured with the “Max Dinv-based element distortion” metric. This is printed in the log file for standalone HOMME or EAM simulations (and can be obtained from the log files during the topo generation step). It measures how distorted the elements become in the mesh transition region: it is the ratio of the two singular values of the 2x2 derivative matrix of the element map to the unit square, i.e. the ratio of the largest length scale to the smallest length scale. A grid of perfect quadrilaterals has a value of 1.0, and the equal-angle cubed-sphere grid has a value of 1.7. A high-quality regionally refined grid will have a value less than 4; with such a grid, one can usually run with the timesteps used on a uniform grid of the matching fine resolution. RRM grids with a value greater than 4 may require smaller timesteps for stability. Very large values indicate a problem with the grid, and it should be redesigned.
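For example, the metric can be pulled out of a run log with a simple grep (the log file path below is a placeholder):

Code Block
grep -i "Dinv-based element distortion" /path/to/run/e3sm.log.*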

3. Generate mapping files

...

...

  1. Create mapping files for each land surface type if needed. An (older and deprecated) example of doing this can be found here. Updated instructions follow:

    1. Obtain or generate a target grid file in SCRIP format. For this example, we will use a ne1024pg2 grid file, which we will need to create (note that most np4 grid files can be found within the inputdata repository; for example, the ne1024np4 grid file is at https://web.lcrc.anl.gov/public/e3sm/mapping/grids/ne1024np4_scrip_c20191023.nc). To generate the pg2 SCRIP file:

      Code Block
      ${tempest_root}/bin/GenerateCSMesh --alt --res 1024 --file ne1024.g
      ${tempest_root}/bin/GenerateVolumetricMesh --in ne1024.g --out ne1024pg2.g --np 2 --uniform
      ${tempest_root}/bin/ConvertExodusToSCRIP --in ne1024pg2.g --out ne1024pg2_scrip.nc
    2. Get the list of input grid files for each land surface input data file. This is done by running the components/elm/tools/mkmapdata/mkmapdata.sh script in debug mode to output a list of needed files (along with the commands that will be used to generate each map file); make sure GRIDFILE is set to the SCRIP file from the above step:

      Code Block
      cd ${e3sm_root}/components/elm/tools/mkmapdata
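      # Example placeholder paths (assumptions, adjust to your setup): GRIDFILE is the
      # SCRIP file generated in the previous step; INPUTDATA_ROOT is your inputdata root
      GRIDFILE=/path/to/ne1024pg2_scrip.nc
      INPUTDATA_ROOT=/path/to/inputdata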
      ./mkmapdata.sh --gridfile ${GRIDFILE} --inputdata-path ${INPUTDATA_ROOT} --res ne1024pg2 --gridtype global --output-filetype 64bit_offset --debug -v --list
    3. Download needed input grid files. The above command will output a list of needed files to clm.input_data_list. We need to download all of these before calling the script without the debug flag to actually perform the mapping. This is possible using check_input_data in CIME, but it needs to be done from a dummy case directory: one can create a dummy case, cd to that case, and then call ./check_input_data --data-list-dir <path where mkmapdata was run from> --download (a sketch of this route follows the script below). However, this failed to connect to the CESM SVN server for me, so instead I used the following one-off script:

      Code Block
      #!/bin/bash
      e3sm_inputdata_repository="https://web.lcrc.anl.gov/public/e3sm"
      cesm_inputdata_repository="https://svn-ccsm-inputdata.cgd.ucar.edu/trunk"
      inputdata_list=clm.input_data_list
      cat $inputdata_list | while read line; do
          localpath=`echo ${line} | sed 's:.* = \(.*\):\1:'`
          url1=${e3sm_inputdata_repository}/`echo ${line} | sed 's:.*\(inputdata/lnd/.*\):\1:'`
          url2=${cesm_inputdata_repository}/`echo ${line} | sed 's:.*\(inputdata/lnd/.*\):\1:'`
          if [ ! -f ${localpath} ]; then
              echo "${url1} -> ${localpath}"
              mkdir -p `dirname ${localpath}`
              cd `dirname ${localpath}`
              # Try to download using first URL, if that fails then use the second
              wget ${url1} || wget ${url2}
          else
              echo "${localpath} exists, skipping."
          fi
      done
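      For reference, the check_input_data route described above might look like the following sketch (the case name, compset, resolution, and machine are placeholders; any combination that lets create_newcase build a case directory will do):

      Code Block
      # Sketch only: create a throwaway case, then download the listed files from its case directory
      cd ${e3sm_root}/cime/scripts
      ./create_newcase --case ${HOME}/dummy_case --compset F2010 --res ne30pg2_EC30to60E2r2 --mach ${machine}
      cd ${HOME}/dummy_case
      ./check_input_data --data-list-dir ${e3sm_root}/components/elm/tools/mkmapdata --download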
    4. Create mapping files. It should just be a matter of running the above mkmapdata.sh command without the --debug --list flags. We need to append the --output-filetype 64bit_offset flag for our large files (there is no reason not to do this by default anyway). NOTE: this step requires NCL, which is no longer part of the E3SM unified environment. If the machine you are using does not have an NCL module, creating a custom environment that includes NCL is an easy workaround. Fixing this issue to avoid the NCL dependency will require rewriting the rmdups.ncl and mkunitymap.ncl scripts in another language (python+xarray would make sense). We will also need to write a version of the gc_qarea() function, unless the geocat project writes a port that we can use (see geocat issue #31).

      Code Block
      ./mkmapdata.sh --gridfile ${GRIDFILE} --inputdata-path ${INPUTDATA_ROOT} --res ne1024pg2 --gridtype global --output-filetype 64bit_offset -v
  2. Compile surface dataset source code (NOTE: ${e3sm_root}/components/clm/tools/clm4_5/mksurfdata_map/src/Makefile.common needs to be edited to build on most machines; this is fixed in https://github.com/E3SM-Project/E3SM/pull/2757):

    Code Block
    # Setup environment (should work on any E3SM-supported machine)
    eval $(${e3sm_root}/cime/CIME/Tools/get_case_env)
    ${e3sm_root}/cime/CIME/scripts/configure --macros-format Makefile --mpilib mpi-serial
    source .env_mach_specific.sh
    
    # Build mksurfdata_map
    cd ${e3sm_root}/components/elm/tools/mksurfdata_map/src
    INC_NETCDF="`nf-config --includedir`" \
        LIB_NETCDF="`nc-config --libdir`" USER_FC="`nc-config --fc`" \
        USER_LDFLAGS="`nf-config --flibs`" make
    

    Note for Perlmutter (Jan 2023) - The build line above did not work on PM until it was modified as follows:

    Code Block
    INC_NETCDF="`nf-config --includedir`" LIB_NETCDF="`nc-config --libdir`" \
        USER_FC="`nc-config --fc`" USER_FCTYP="ftn" \
        USER_FFLAGS='-fallow-invalid-boz -fallow-argument-mismatch -ffree-line-length-none' make


  3. Run the mksurfdata.pl script in "debug" mode to generate the namelist (use year 2010 on ne120np4 grids as an example). 

    Code Block
    # For supported resolutions
    #(use year 2010 on ne120np4 grids as an example)
    cd $e3sm_dir/components/elm/tools/mksurfdata_map
    ./mksurfdata.pl -res ne120np4 -y 2010 -d -dinlc /global/cfs/cdirs/e3sm/inputdata -usr_mapdir /global/cfs/cdirs/e3sm/inputdata/lnd/clm2/mappingdata/maps/ne120np4
    
    # For unsupported, user-specified resolutions
    # (use year 2010 on ne50np4 grid as an example)
    # (Assuming the mapping files created in step 1 have a time stamp of '190409' in their filenames and are located in '/whatever/directory/you/put/mapping/files')
    ./mksurfdata.pl -res usrspec -usr_gname ne50np4 -usr_gdate 190409 -y 2010 -d -dinlc /global/cfs/cdirs/e3sm/inputdata -usr_mapdir /whatever/directory/you/put/mapping/files

    (However, ./mksurfdata.pl -h shows that -y defaults to 2010, yet when running without the "-y" option the standard output says sim_year 2000. I suspect the mksurfdata.pl help information is wrong; to be confirmed.)

  4. Modify namelist file
    (Should the correct namelist settings be picked up automatically if the default land build namelist settings are modified accordingly?)

    Time-evolving land use and land cover change (LULCC) data should not be used for fixed-time compsets; instead, the LULCC information for that particular year should be used (right?)
    Manually change mksrf_fvegtyp to '/global/cfs/cdirs/e3sm/inputdata/lnd/clm2/rawdata/AA_mksrf_landuse_rc_1850-2015_06062017_LUH2/AA_mksrf_landuse_rc_2010_06062017.nc' for the F2010 ne120 compset (one way to make this edit is shown below).
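    For example, assuming the namelist generated in the previous step is named "namelist" (matching the mksurfdata_map invocation in step 5), the edit could be made with sed:

    Code Block
    # Point mksrf_fvegtyp at the 2010 LULCC raw data file (namelist file name assumed)
    sed -i "s|^ *mksrf_fvegtyp *=.*|mksrf_fvegtyp = '/global/cfs/cdirs/e3sm/inputdata/lnd/clm2/rawdata/AA_mksrf_landuse_rc_1850-2015_06062017_LUH2/AA_mksrf_landuse_rc_2010_06062017.nc'|" namelist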

  5. Create the land surface data via an interactive or batch job

    Code Block
    rm -f surfdata_ne120np4_simyr2010.bash
    # Quote the heredoc delimiter so variables and backticks expand inside the batch job (after module loads), not at script-creation time
    cat <<'EOF' >> surfdata_ne120np4_simyr2010.bash
    #!/bin/bash
    
    #SBATCH  --job-name=mksurfdata2010
    #SBATCH  --account=acme
    #SBATCH  --nodes=1
    #SBATCH  --output=mksurfdata.o%j
    #SBATCH  --exclusive
    #SBATCH  --time=00:30:00
    #SBATCH  --qos=debug
    
    # Load modules
    module load nco
    module load ncl
    module load cray-netcdf
    module load cray-hdf5
    
    # mksurfdata_map is dynamically linked
    export LIB_NETCDF=$NETCDF_DIR/lib
    export INC_NETCDF=$NETCDF_DIR/include
    export USER_FC=ifort
    export USER_CC=icc
    export USER_LDFLAGS="-L$NETCDF_DIR/lib -lnetcdf -lnetcdff -lnetcdf_intel"
    export USER_LDFLAGS=$USER_LDFLAGS" -L$HDF5_DIR/lib -lhdf5 -lhdf5_fortran -lhdf5_cpp -lhdf5_fortran_intel -lhdf5_hl_intel -lhdf5hl_fortran_intel"
    
    cd /global/homes/t/tang30/ACME_code/MkLandSurf/components/clm/tools/clm4_5/mksurfdata_map
    
    CDATE=c`date +%y%m%d` # current date
    
    ./mksurfdata_map < namelist
    EOF
    
    sbatch surfdata_ne120np4_simyr2010.bash

    The land surface data file in NetCDF format will be created in the current directory. (How do we verify the file is correct? A quick sanity check is sketched below.)
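    As a minimal sanity check (not a full validation; the file name pattern below is an assumption, so check fsurdat in the namelist for the actual output name), one can inspect the header of the new dataset and confirm that the horizontal dimension matches the number of columns of the target grid and that the expected fields (e.g. PCT_NATVEG, PCT_CROP) are present:

    Code Block
    # Inspect dimensions and variables of the newly created surface dataset
    ncdump -h surfdata_ne120np4_simyr2010_c*.nc | less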

...