...

Code Block
output_root=${HOME}/cscratch/e3sm/grids/ne4
mkdir -p ${output_root}

Types of Atmosphere grid metadata files

See SE Atmosphere Grid Overview (EAM & CAM) for a description of the spectral elements, GLL nodes, subcell grid, and dual grid.

  • Exodus file: "ne4.g".  This is a NetCDF file following Exodus conventions.  It gives the corners of all elements on the sphere and their connectivity.  It is independent of the polynomial order used inside the element ("np").

    • This file is used by TempestRemap (TR) to generate mapping files.  The polynomial order is a command line option and the GLL nodes are internally generated by TR.  

  • SCRIP file:  "ne4pg2.scrip.nc".   This file contains a description of the atmosphere physics grid in the format used by the original incremental remap tool, SCRIP. It is used for most model output, for generating mapping files between components, and for post-processing of most output.

  • Less common “GLL” metadata files needed for specialized purposes:

    • SCRIP file:  "ne4np4_scrip.nc".   This file contains a description (SCRIP format) of the GLL dual grid. It includes the locations of the GLL nodes and artificial bounding polygons around those nodes.   Ideally the spherical area of each polygon will match the GLL weight ("exact GLL areas"), but not all tools can achieve exact areas.  Inexact areas do not impact the accuracy of the resulting mapping algorithm; they just mean that mass will not be exactly conserved by the mapping.

    • latlon file:  "ne4np4_latlon.nc".   This file contains a list of all the GLL nodes in the mesh (in latitude/longitude coordinates).   The list of GLL nodes must be in the internal HOMME global id ordering, matching the ordering used in CAM and EAM native grid output.   It also contains the connectivity of the GLL subcell grid.

      • This file is used by CAM's interpic_new utility, and by the graphics programs ParaView and VisIt when plotting native-grid GLL output.
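
The file-naming conventions above can be summarized in a small helper (the `grid_file_type` function is hypothetical, for illustration only; adjust the patterns to your own filenames):

```shell
# Classify an atmosphere grid metadata file by the naming conventions
# described above (hypothetical helper, illustration only).
grid_file_type () {
    case "$1" in
        *_latlon.nc) echo "GLL latlon metadata" ;;
        *scrip.nc)   echo "SCRIP grid description" ;;
        *.g)         echo "Exodus element mesh" ;;
        *)           echo "unknown" ;;
    esac
}

grid_file_type ne4.g             # -> Exodus element mesh
grid_file_type ne4pg2.scrip.nc   # -> SCRIP grid description
grid_file_type ne4np4_latlon.nc  # -> GLL latlon metadata
```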


...

Step-by-step guide

1. Generate a new "grid" file

Requirements: TempestRemap

...

The Exodus file contains only information about the positions of the spectral elements on the sphere. SE-aware utilities such as TempestRemap can use the polynomial order and the reference element map to fill in necessary data such as the locations of the nodal GLL points. For non-SE-aware utilities, we need additional metadata, described in the next section.
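
For reference, the elided command for this step is typically a single TempestRemap call (a sketch: `${tempest_root}` is assumed to point at a TempestRemap build, and the command is only echoed here so the snippet is safe to run anywhere; the `--alt` flag matches the usage elsewhere on this page):

```shell
# Build the ne4 cubed-sphere Exodus mesh with TempestRemap (sketch).
tempest_root=${tempest_root:-$HOME/codes/tempestremap}   # assumed install path
cmd="${tempest_root}/bin/GenerateCSMesh --alt --res 4 --file ne4.g"
echo "$cmd"   # echo only; run the command itself to actually generate ne4.g
```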

2A. Generate control volume mesh files for E3SM v2 "pg2" grids 

Requirements:

  • exodus mesh file

  • TempestRemap

...

Code Block
${tempest_root}/bin/GenerateVolumetricMesh --in ne4.g --out ne4pg2.g --np 2 --uniform                                  
${tempest_root}/bin/ConvertExodusToSCRIP --in ne4pg2.g --out ne4pg2.scrip.nc                     
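
As a quick sanity check on the resulting mesh sizes, the element and column counts follow directly from `ne` (standard cubed-sphere bookkeeping, not output of the tools above):

```shell
# Cubed-sphere bookkeeping for an ne4 grid.
ne=4
nelem=$((6 * ne * ne))        # spectral elements: 6 faces x ne x ne
npg2_cols=$((nelem * 4))      # pg2 physics columns: 2x2 FV cells per element
ngll_cols=$((nelem * 9 + 2))  # unique np4 GLL nodes: (np-1)^2 per element,
                              # +2 from the Euler characteristic of the sphere
echo "elements=$nelem pg2=$npg2_cols gll=$ngll_cols"
# -> elements=96 pg2=384 gll=866
```

The same formulas give the familiar ne30 numbers: 5400 elements, 21600 pg2 columns, and 48602 GLL columns.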


2B. Generate "dual grid" mesh files (SCRIP and lat/lon format) for E3SM v1 "np4" GLL grids

Requirements:

  • exodus mesh file

  • Matlab or Fortran utility.

...

Code Block
mv *_latlon.nc ${output_root}
mv *_scrip.nc ${output_root}

3. Generate mapping files

Requirements:

  • TempestRemap

  • ESMF_RegridWeightGen

  • ncremap

  • grid descriptor files for each component that exists on a different grid
    (atmosphere, ocean, possibly land if on a different grid than the atmosphere)

...

Code Block
atm_grid_file=ne30pg2.g
atm_scrip_grid_file=ne30pg2_scrip.nc
ocn_grid_file=ocean.oEC60to30v3.scrip.181106.nc
lnd_grid_file=SCRIPgrid_0.5x0.5_nomask_c110308.nc

atm_name=ne30pg2
ocn_name=oEC60to30v3
lnd_name=r05

## Conservative, monotone maps.

alg_name=mono

date=200110

function run {
    echo "src $src dst $dst map $map"
    ncremap -a tempest --src_grd=$src --dst_grd=$dst -m $map \
        -W '--in_type fv --in_np 1 --out_type fv --out_np 1 --out_format Classic --correct_areas' \
        $extra
}

extra=""

src=$ocn_grid_file
dst=$atm_grid_file
map="map_${ocn_name}_to_${atm_name}_${alg_name}.${date}.nc"
run

src=$atm_grid_file
dst=$ocn_grid_file
map="map_${atm_name}_to_${ocn_name}_${alg_name}.${date}.nc"
extra=--a2o
run
extra=""

src=$lnd_grid_file
dst=$atm_grid_file
map="map_${lnd_name}_to_${atm_name}_${alg_name}.${date}.nc"
run

src=$atm_grid_file
dst=$lnd_grid_file
map="map_${atm_name}_to_${lnd_name}_${alg_name}.${date}.nc"
run

## Nonconservative, monotone maps.

alg_name=bilin

src=$atm_scrip_grid_file
dst=$lnd_grid_file
map="map_${atm_name}_to_${lnd_name}_${alg_name}.${date}.nc"
ncremap -a bilinear -s $src -g $dst -m $map -W '--extrap_method  nearestidavg'

src=$atm_scrip_grid_file
dst=$ocn_grid_file
map="map_${atm_name}_to_${ocn_name}_${alg_name}.${date}.nc"
ncremap -a bilinear -s $src -g $dst -m $map -W '--extrap_method  nearestidavg'
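
The conservative map files generated above all follow one naming convention, which can be sketched as a loop over source/destination pairs (illustrative only, reusing the names defined in the script above):

```shell
# Reconstruct the conservative map-file names used above (illustration only).
atm_name=ne30pg2; ocn_name=oEC60to30v3; lnd_name=r05
alg_name=mono; date=200110

for pair in "${ocn_name}:${atm_name}" "${atm_name}:${ocn_name}" \
            "${lnd_name}:${atm_name}" "${atm_name}:${lnd_name}"; do
    src_name=${pair%%:*}
    dst_name=${pair##*:}
    echo "map_${src_name}_to_${dst_name}_${alg_name}.${date}.nc"
done
```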

4. Generate domain files

Domain files are needed by the coupler and the land model at runtime.    The land model uses the mask to determine where to run, and the coupler uses the land fraction to merge fluxes from multiple surface types to the atmosphere above them.  Domain files are created from the mapping files created in the previous step, using a tool provided with CIME in ${e3sm_root}/cime/tools/mapping/gen_domain_files. This directory contains the source code for the tool (in Fortran 90) and a Makefile.    Cloning E3SM is now required to obtain code within the distribution.   To clone E3SM,

...

Code Block
domain.lnd.${atm_grid}_${ocn_grid}.${datestring}.nc
domain.ocn.${atm_grid}_${ocn_grid}.${datestring}.nc
domain.ocn.${ocn_grid}.${datestring}.nc
domain.lnd.${lnd_grid}_${ocn_grid}.${datestring}.nc
domain.ocn.${lnd_grid}_${ocn_grid}.${datestring}.nc
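
The gen_domain invocation that produces files like those above looks like the following (a sketch: the tool must first be built in the gen_domain_files directory, the map and grid names are assumptions carried over from the previous step, and the command is echoed rather than executed):

```shell
# Generate domain files from the conservative ocn->atm map (sketch).
map_file=map_oEC60to30v3_to_ne30pg2_mono.200110.nc   # assumed name
ocn_name=oEC60to30v3
atm_name=ne30pg2
cmd="./gen_domain -m ${map_file} -o ${ocn_name} -l ${atm_name}"
echo "$cmd"   # echo only; run from the gen_domain_files build directory
```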


5. Generate topography file 

Generating the topography and related surface roughness data sets is a detailed process that has been moved to its own page, with detailed instructions depending on the model version (V1, V2, V3):

Topography Generation

6. Generate and spin-up a new atmosphere initial condition

Generating a new initial condition for the atmosphere is a two-step process. First, an existing initial condition is interpolated to the target resolution; then the interpolated initial condition is used to spin up a new initial condition that is in balance and consistent with the dynamics (bottom line: without this step the model will probably blow up if you try to run with the interpolated initial condition, at least for RRM grids). The spin-up requires initially lowering the timestep and increasing the hyperviscosity, and then gradually relaxing these back to more reasonable values. Both of these steps are described below.

Step 1:  Generating a "first-guess" initial condition

A starting point for a new initial condition is first interpolated from an existing initial condition. Traditionally, this has been done using the interpic_new tool, which does both horizontal (for new grids) and vertical (for potentially different vertical grids/numbers of levels) interpolation of the fields in the initial condition file. This is a Fortran code that is included in E3SM, within the atmosphere tools directory. Unfortunately, the tool only supports interpolating from lat/lon grids, and cannot interpolate from unstructured to unstructured. So, if you want to use this tool to interpolate an existing initial condition to an SE grid, you will have to start with an older initial condition on an FV grid or similar. Thus, the use of interpic_new is NO LONGER SUPPORTED OR RECOMMENDED. The script below for building and using interpic_new is included only to document the process in case someone wanted to revive this workflow. The script will not work on NERSC, as the paths do not exist:

...

Note that although this approach is flexible in the source grid, it does require an existing initial condition with the same vertical grid. However, NCO will now do vertical interpolation, and will also wrap TempestRemap horizontal remapping. This is well documented on the NCO homepage (http://nco.sourceforge.net/nco.html).
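
As a concrete (hypothetical) sketch: once a mapping file exists, ncremap applies it to an input file via the -m option. The filenames below are placeholders, and the command is echoed rather than executed; see the NCO documentation for the vertical-interpolation options:

```shell
# Horizontally remap an existing initial condition with an existing
# map file (placeholder names; sketch only).
map=map_ne30np4_to_ne120np4_mono.200110.nc
in_ic=existing_ic_ne30np4.nc
out_ic=first_guess_ic_ne120np4.nc
cmd="ncremap -m ${map} ${in_ic} ${out_ic}"
echo "$cmd"   # echo only; remove the echo to actually run the remap
```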


Step 2:   Generate the atmosphere initial condition 

Two options are now available to generate the atmosphere initial condition.   The first method is documented on the page: Generate atm initial condition from analysis data.  The second method is documented here in the section Spinning up the atmosphere.

Option 1:  Generate atm initial condition from analysis data


Option 2:  Spinning up the atmosphere

The following procedure is copied from the recommendations in Mark and Colin's Google Doc https://docs.google.com/document/d/1ymlTgKz2SIvveRS72roKvNHN6a79B4TLOGrypPjRvg0/edit on running new RRM configurations (TODO: clean this up and update):

...

During this tuning process, it is useful to compare the smallest ‘dx’ from the atmosphere log file to the smallest ‘dx’ from the global uniform high resolution run.  Use the ‘dx’ based on the singular values of Dinv, not the ‘dx’ based on element area. If the ‘dx’ for newmesh.g is 20% smaller than the value from the global uniform grid, it suggests the advective timesteps might need to be 20% smaller, and the viscous timesteps might need to be 36% smaller (they go like dx^2).  The code prints out CFL estimates that are rough approximations that can be used to check whether you are in the correct ballpark.
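
The scaling argument works out as follows (a worked example, assuming the dx^2 scaling quoted above):

```shell
# If the RRM's smallest dx is 20% smaller than the uniform grid's,
# stable timesteps scale like dx (advective) and dx^2 (viscous).
awk 'BEGIN {
    r = 0.8   # dx ratio: 20% smaller
    printf "advective factor %.2f, viscous factor %.2f\n", r, r * r
}'
# -> advective factor 0.80, viscous factor 0.64
```

A viscous factor of 0.64 is what "36% smaller" refers to.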

7. Generate land surface data (fsurdat)

Requirements:

...

  1. Create mapping files for each land surface type if needed. An (older and deprecated) example of doing this can be found here. Updated instructions follow:

    1. Obtain or generate a target grid file in SCRIP format. For this example, we will use an ne1024pg2 grid file, which we will need to create (note that most np4 grid files can be found within the inputdata repository; for example, the ne1024np4 grid file is at https://web.lcrc.anl.gov/public/e3sm/mapping/grids/ne1024np4_scrip_c20191023.nc). To generate the pg2 SCRIP file:

      Code Block
      ${tempest_root}/bin/GenerateCSMesh --alt --res 1024 --file ne1024.g
      ${tempest_root}/bin/GenerateVolumetricMesh --in ne1024.g --out ne1024pg2.g --np 2 --uniform
      ${tempest_root}/bin/ConvertExodusToSCRIP --in ne1024pg2.g --out ne1024pg2_scrip.nc
    2. Get list of input grid files for each land surface input data file. This is done by running the components/clm/tools/shared/mkmapdata/mkmapdata.sh script in debug mode to output a list of needed files (along with the commands that will be used to generate each map file; also make sure GRIDFILE is set to the SCRIP file from the above step): 

      Code Block
      cd ${e3sm_root}/components/clm/tools/shared/mkmapdata
      ./mkmapdata.sh --gridfile ${GRIDFILE} --inputdata-path ${INPUTDATA_ROOT} --res ne1024pg2 --gridtype global --output-filetype 64bit_offset --debug -v --list
    3. Download needed input grid files. The above command will output a list of needed files to clm.input_data_list. We need to download all of these before calling the script without the debug flag to actually perform the mapping. This is possible using check_input_data in CIME, but needs to be done from a dummy case directory. So, one can create a dummy case, cd to that case, and then call ./check_input_data --data-list-dir <path where mkmapdata was run from> --download. However, this failed to connect to the CESM SVN server for me. So instead, I used the following one-off script: 

      Code Block
      #!/bin/bash
      e3sm_inputdata_repository="https://web.lcrc.anl.gov/public/e3sm"
      cesm_inputdata_repository="https://svn-ccsm-inputdata.cgd.ucar.edu/trunk"
      inputdata_list=clm.input_data_list
      cat $inputdata_list | while read line; do
          localpath=`echo ${line} | sed 's:.* = \(.*\):\1:'`
          url1=${e3sm_inputdata_repository}/`echo ${line} | sed 's:.*\(inputdata/lnd/.*\):\1:'`
          url2=${cesm_inputdata_repository}/`echo ${line} | sed 's:.*\(inputdata/lnd/.*\):\1:'`
      if [ ! -f ${localpath} ]; then
          echo "${url1} -> ${localpath}"
          mkdir -p `dirname ${localpath}`
          # Download in a subshell so the cd does not persist into the next
          # loop iteration (which would break the relative paths in the list).
          # Try the first URL; if that fails, use the second.
          ( cd `dirname ${localpath}` && { wget ${url1} || wget ${url2}; } )
          else
              echo "${localpath} exists, skipping."
          fi
      done
    4. Create mapping files. You should be able to just run the above mkmapdata.sh command without the --debug --list flags. We need to append the --output-filetype 64bit_offset flag for our large files (no reason not to do this by default anyway):

      Code Block
      ./mkmapdata.sh --gridfile ${GRIDFILE} --inputdata-path ${INPUTDATA_ROOT} --res ne1024pg2 --gridtype global --output-filetype 64bit_offset -v
  2. Compile surface dataset source code (NOTE: ${e3sm_root}/components/clm/tools/clm4_5/mksurfdata_map/src/Makefile.common needs to be edited to build on most machines; this is fixed in https://github.com/E3SM-Project/E3SM/pull/2757):

    Code Block
    # Setup environment (should work on any E3SM-supported machine)
    ${e3sm_dir}/cime/tools/configure --macros-format=Makefile && source .env_mach_specific.sh
    
    # Set environment variables expected by mksurfdata_map Makefile;
    # Note that NETCDF_DIR is probably specific to NERSC and may need
    # to be adjusted for other systems
    export LIB_NETCDF=$NETCDF_DIR/lib
    export INC_NETCDF=$NETCDF_DIR/include
    export USER_FC=ifort
    export USER_CC=icc
    
    # Build mksurfdata_map
    cd $e3sm_dir/components/clm/tools/clm4_5/mksurfdata_map/src/ && gmake


  3. Run the mksurfdata.pl script in "debug" mode to generate the namelist (use year 2010 on ne120np4 grids as an example). 

    Code Block
    # For supported resolutions
    #(use year 2010 on ne120np4 grids as an example)
    cd $e3sm_dir/components/clm/tools/clm4_5/mksurfdata_map
    ./mksurfdata.pl -res ne120np4 -y 2010 -d -dinlc /global/cfs/cdirs/e3sm/inputdata -usr_mapdir /global/cfs/cdirs/e3sm/inputdata/lnd/clm2/mappingdata/maps/ne120np4
    
    # For unsupported, user-specified resolutions
    # (use year 2010 on ne50np4 grid as an example)
    # (Assuming the mapping files created in step 1 has a time stamp of '190409' in the filenames and the location of mapping files are '/whatever/directory/you/put/mapping/files')
    ./mksurfdata.pl -res usrspec -usr_gname ne50np4 -usr_gdate 190409 -y 2010 -d -dinlc /global/cfs/cdirs/e3sm/inputdata -usr_mapdir /whatever/directory/you/put/mapping/files

    (However, ./mksurfdata.pl -h shows that -y defaults to 2010, while running without the "-y" option produces standard output saying sim_year 2000. I suspect the mksurfdata.pl help information is wrong; to be confirmed.)

  4. Modify namelist file
    (Should the correct namelist settings be automatically picked up if the default land-build namelist settings are modified accordingly?)

    Time-evolving Land use land cover change (LULCC) data should not be used for fixed-time compsets, but the LULCC information for that particular year should be used (right?)
    Manually change to mksrf_fvegtyp = '/global/cfs/cdirs/e3sm/inputdata/lnd/clm2/rawdata/AA_mksrf_landuse_rc_1850-2015_06062017_LUH2/AA_mksrf_landuse_rc_2010_06062017.nc' for the F2010 ne120 compset.

  5. Create the land surface data by interactive or batch job

    Code Block
    rm -f surfdata_ne120np4_simyr2010.bash
    cat <<EOF >> surfdata_ne120np4_simyr2010.bash
    #!/bin/bash
    
    #SBATCH  --job-name=mksurfdata2010
    #SBATCH  --account=acme
    #SBATCH  --nodes=1
    #SBATCH  --output=mksurfdata.o%j
    #SBATCH  --exclusive
    #SBATCH  --time=00:30:00
    #SBATCH  --qos=debug
    
    # Load modules
    module load nco
    module load ncl
    module load cray-netcdf
    module load cray-hdf5
    
    # mksurfdata_map is dynamically linked
    export LIB_NETCDF=$NETCDF_DIR/lib
    export INC_NETCDF=$NETCDF_DIR/include
    export USER_FC=ifort
    export USER_CC=icc
    export USER_LDFLAGS="-L$NETCDF_DIR/lib -lnetcdf -lnetcdff -lnetcdf_intel"
    export USER_LDFLAGS=$USER_LDFLAGS" -L$HDF5_DIR/lib -lhdf5 -lhdf5_fortran -lhdf5_cpp -lhdf5_fortran_intel -lhdf5_hl_intel -lhdf5hl_fortran_intel"
    
    cd /global/homes/t/tang30/ACME_code/MkLandSurf/components/clm/tools/clm4_5/mksurfdata_map
    
    CDATE=c`date +%y%m%d` # current date
    
    ./mksurfdata_map < namelist
    EOF
    
    sbatch surfdata_ne120np4_simyr2010.bash

    The land surface data in NetCDF format will be created in the current directory. (How to verify the file is correct?)
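
On the verification question: one minimal sanity check is to confirm the expected percentage fields are present in the header. A hedged sketch (the file name is a placeholder, `ncdump` is assumed to be available, and the check is skipped when the file is absent):

```shell
# Minimal header check of a generated surface dataset (sketch; the
# PCT_ field naming is an assumption about the CLM surfdata format).
check_surfdata () {
    if [ -f "$1" ]; then
        # Count the PCT_ fields in the header; expect several.
        ncdump -h "$1" | grep -c 'PCT_'
    else
        echo "missing: $1"
    fi
}

check_surfdata surfdata_ne120np4_simyr2010.nc   # placeholder file name
```

A more thorough check (that the land-unit percentages sum to 100 in every cell) is possible with NCO but depends on the exact field layout, so it is not sketched here.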


8. Generate a new land initial condition (finidat)

Three options:

  • cold start:  finidat="", no file necessary.  Lets us get up and running, but not suitable for climate science applications

  • Interpolate a spunup state from a previous simulation.  This is reasonable for many applications, but not suitable for official published E3SM simulations.

  • spin-up a new initial condition following best practices from land model developers.  

...

What I would recommend in general is to start the spin-up process with a land cold-start condition, using a reanalysis data atmosphere and having all the other land settings just the way you want them for the eventual coupled simulation (resolution, domain file, land physics and BGC settings). Then run for enough years to get an approximate steady state (the number depends on what land options you are using – no BGC means a lot fewer years). Then use the resulting restart file as the initial condition for a coupled run that has at least the same atmosphere settings you will eventually use for your production run. Save high-frequency output for multiple years (preferably 10 or more, but a high res run might not have that luxury). Use that atm output to drive a second offline land spin-up, to equilibrate the land to the expected initial climate from the atmosphere. Then use the resulting land restart as finidat for the start of your fully-coupled spin-up simulation, and let it run for a while to assess drifts in all coupled components. Only after you are satisfied that everything is stable are you in a safe state to begin a science experiment production run. An off-the-shelf finidat file is not likely to save you much time in this process, because it will not be spun up to your experimental conditions and tolerances.

9. Create a new atmospheric dry deposition file

From the README for the mkatmsrffile tool at components/cam/tools/mkatmsrffile:

...

The output file produced using the above procedure was compared against an existing file (/project/projectdirs/e3sm/inputdata/atm/cam/chem/trop_mam/atmsrf_ne30np4_110920.nc​) using a script from Peter Caldwell. The following figures show the comparison:

...


10. Create a new compset and/or new supported grid by modifying CIME's xml files

???


11. Implement tests for the new grid

(TODO:  develop this section.  One example might be to include a case-building test as done in ${e3sm_root}/cime/config/e3sm/tests.py)


...

Summary

After reading through the info above, I (Peter Caldwell) created lists of stuff we should create tests for, merge to master, and revise to avoid dual-grid dependency. Ben Hillman - am I missing anything?

Tools we should create tests for:

  1. TempestRemap for generating uniform grids
    (in Paul’s external git repo - may have its own tests?)

  2. SQuadGen for generating RRM grids
    (in Paul’s external repo - may have its own tests?)

  3. No longer needed:   run_dualgrid.sh to obtain scrip and latlon files 
    (in PreAndPostProcessingScripts repo; uses a matlab file).  

  4. makegrid.job for generating dualgrid and latlon files (in components/homme/tests/template, may have its own tests?)

    1. Replaced by "homme_tool", tests added 2020/5

  5. smoothtopo.job for applying dycore-specific smoothing to topography (in components/homme/tests/template, may have its own tests?)

    1. Replaced by "homme_tool", tests added, 2020/5

  6. run ncremap (an NCO command) to generate mapping files

  7. components/cam/tools/topo_tool/cube_to_target

  8. cime/tools/mapping/gen_domain_files

  9. mksurfdata.pl to generate the namelist needed to make fsurdat file

  10. use mksurfdata_map for fsurdat

  11. use the interpic_new tool to regrid atmos state to new grid for initial condition

...