IN PROGRESS

Contact: Ryan Forsyth

...

On other machines, the paths are the same, except for the <simulations_dir>.

On compy (PNNL):

<simulations_dir>: /compyfs/<username>/E3SM_simulations

On cori (NERSC):

<simulations_dir>: ${CSCRATCH}/E3SM_simulations
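The per-machine values above can be collected in one place. The following is a minimal bash sketch, not part of the E3SM run script: the function name `simulations_dir_for` and its two arguments (machine name, username) are hypothetical, and only the compy and cori paths are taken from this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print <simulations_dir> for a machine and username.
# Only the compy and cori entries come from this page.
simulations_dir_for() {
    local machine="$1"
    local username="$2"
    case "${machine}" in
        compy) echo "/compyfs/${username}/E3SM_simulations" ;;
        cori)  echo "${CSCRATCH}/E3SM_simulations" ;;   # CSCRATCH is set by NERSC
        *)     echo "unknown machine: ${machine}" >&2; return 1 ;;
    esac
}
```

For example, `simulations_dir_for compy jdoe` would print the compy path for a hypothetical user `jdoe`.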

Configuring the Model Run – Old Run Script

A template for running the model is provided at https://github.com/E3SM-Project/E3SM/blob/master/run_e3sm.template.csh . Notice there is a section at the top labeled "THINGS USERS USUALLY CHANGE (SEE END OF SECTION FOR GUIDANCE)". These are the settings that you are most likely to change. The "EXPLANATION FOR OPTIONS ABOVE:" section explains these parameters.

Create a new run script or copy an existing one. The path to it should be <run_scripts_dir>/run.<case_name>.csh

For ease of use, below are further explanatory notes:

BASIC INFO ABOUT RUN

  • set job_name = v2_test01.piControl:

    • v2_test01 is a short custom description to help identify the simulation.

    • piControl is the type of simulation. Other options include, but are not limited to, amip and F2010.

  • set resolution = ne30pg2_EC30to60E2r2-1900_ICG:

    • ne30 is the number of spectral elements for the atmospheric grid.

    • EC30to60E2r2 is the ocean and sea-ice resolution.

    • rrm for regionally refined mesh is an option to replace other resolutions.

SOURCE CODE OPTIONS

  • fetch_code: if you have not run the model before, want to incorporate new changes, or want to use a new branch, set this to true. Otherwise, you can set this to false, which saves the time otherwise spent checking out code.

  • e3sm_tag: the specific hash of the E3SM repo you want to run the model with. Note that if you use a branch rather than a specific hash, you’ll be using the head of the branch, which may change as commits are added.

  • tag_name: a short name you pick to use in place of e3sm_tag. Typically this is a date (e.g., "20210122" for 2021-01-22). It is good practice to use year-month-day order so that ls lists runs chronologically.
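As an illustration of the year-month-day convention, here is a small bash sketch (the run script template itself is csh). The sample tags are made up for the example; the point is that a plain lexical sort of such tags is also a chronological sort.

```shell
#!/usr/bin/env bash
# Build a year-month-day tag from today's date, e.g. 20210122.
tag_name="$(date +%Y%m%d)"
echo "tag_name = ${tag_name}"

# Illustrative tags only: sorting them as plain text orders them by date.
sorted_tags="$(printf '%s\n' 20210409 20201231 20210122 | sort | paste -sd' ' -)"
echo "${sorted_tags}"
```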

CUSTOM CASE_NAME

  • set case_name = ${tag_name}.${job_name}.${resolution}:

    • Note that job_name (see BASIC INFO ABOUT RUN) typically has two parts (separated by a period), so case_name will actually have four parts.

    • If you are comparing the same case across different machines, add ${machine}: ${tag_name}.${job_name}.${resolution}.${machine}.
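The composition of case_name can be sketched as follows. This is written in bash for illustration (the template uses csh `set` syntax), the tag, job name, and resolution are the example values from this page, and the `machine` value is an assumption chosen for the example.

```shell
#!/usr/bin/env bash
tag_name="20210122"
job_name="v2_test01.piControl"               # two parts, separated by a period
resolution="ne30pg2_EC30to60E2r2-1900_ICG"
machine="compy"                              # assumed value, for illustration

# Default form: tag, job name (two parts), resolution.
case_name="${tag_name}.${job_name}.${resolution}"

# Variant for comparing the same case across machines.
case_name_with_machine="${case_name}.${machine}"

# Count the period-separated parts of the default form.
num_parts="$(echo "${case_name}" | awk -F. '{print NF}')"

echo "${case_name}"
echo "${case_name_with_machine}"
echo "parts: ${num_parts}"
```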

PROCESSOR CONFIGURATION

  • set processor_config = S: the available sizes are S, M, and L, among others specified in the "EXPLANATION FOR OPTIONS ABOVE:" section. Use S for short tests; full simulations should use L. The size determines how many nodes are used, and the exact number of nodes differs among machines.

DIRECTORIES

Code Block
set code_root_dir               = ~/E3SM/code/
set e3sm_simulations_dir        = <simulations_dir>
set case_build_dir              = ${e3sm_simulations_dir}/${case_name}/build
set case_run_dir                = ${e3sm_simulations_dir}/${case_name}/run
set short_term_archive_root_dir = ${e3sm_simulations_dir}/${case_name}/archive

LENGTH OF SIMULATION, RESTARTS, AND ARCHIVING

For a short run, this section might look like:

Code Block
set stop_units       = nmonths      # Units will be number of months
set stop_num         = 1            # Stop after running one month (one stop_unit)
set restart_units    = $stop_units
set restart_num      = $stop_num
set num_resubmits    = 0

For a long run, this section might look like:

Code Block
set stop_units       = nyears       # Units will be number of years
set stop_num         = 20           # Stop running after 20 years (20 stop_units)
set restart_units    = $stop_units  # Units will also be number of years
set restart_num      = 5            # Write restart file after running 5 years (5 stop units)
set num_resubmits    = 4            # Submit 4 times after the initial submit (4+1 submits * 20 years/submit = 100 years)

In the above configuration, the model is submitted 5 times (initially and then 4 times after). Each submission covers 20 simulated years, so this will run 100 simulated years. On each submission, restart files will be written every 5 years. Since each submission covers 20 simulated years, each one will have 4 restart files written.
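The arithmetic in the long-run configuration can be checked with a short bash sketch. The variable names mirror the csh settings; the derived names (`submissions`, `total_years`, `restarts_per_submission`) are introduced here only for illustration.

```shell
#!/usr/bin/env bash
stop_num=20        # simulated years per submission
restart_num=5      # write a restart file every 5 simulated years
num_resubmits=4    # resubmissions after the initial submit

submissions=$(( num_resubmits + 1 ))                   # initial submit + resubmits
total_years=$(( submissions * stop_num ))              # total simulated years
restarts_per_submission=$(( stop_num / restart_num ))  # restart files per submission

echo "submissions:             ${submissions}"
echo "total simulated years:   ${total_years}"
echo "restarts per submission: ${restarts_per_submission}"
```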

A model run must return the same results whether or not it is restarted partway through. If it does not, a non-bit-for-bit change has been introduced.

Configuring the Model Run – Run Script

Start with an example of a run script for a low-resolution coupled simulation: run.20210409.v2beta4.piControl.ne30pg2_EC30to60E2r2.chrysalis.sh. Create a new run script or copy an existing one. The path to it should be <run_scripts_dir>/run.<case_name>.sh

# Machine and project

  • readonly MACHINE=chrysalis: the name of the machine you’re running on.

  • readonly PROJECT="e3sm": SLURM project accounting (typically e3sm).

...

  • readonly COMPSET="WCYCL1850": compset (configuration).

  • readonly RESOLUTION="ne30pg2_EC30to60E2r2": resolution (low-resolution coupled simulation in this case)

    • ne30 is the number of spectral elements for the atmospheric dynamics grid, while pg2 refers to the physics grid option. The grid spacing of this mesh is approximately 110 km.

    • EC30to60E2r2 is the ocean and sea-ice resolution. The grid spacing varies between 30 and 60 km.

    • For simulations with regionally refined meshes, such as the North American atmosphere grid coupled to the WC14 ocean and sea-ice grid, replace this value with northamericax4v1pg2_WC14to60E2r3.

  • readonly DESCRIPTOR="v2beta4.piControl":

    • v2beta4 is a short custom description to help identify the simulation.

    • piControl is the type of simulation. Other options include, but are not limited to, amip and F2010.

  • readonly CASE_GROUP="v2beta4.piControl":

    • This will let you mark multiple cases as part of the same group for later processing (e.g., with PACE).

...

To gzip log files from failed jobs, run gzip *.log.<job ID>.*
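The gzip command above can be wrapped in a small bash function. This is a sketch: the function name `gzip_job_logs` is hypothetical, and it assumes the log files sit in the current directory and follow the `*.log.<job ID>.*` pattern given above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: gzip every log file in the current directory
# that belongs to the given job ID.
gzip_job_logs() {
    local job_id="$1"
    local f
    for f in *.log."${job_id}".*; do
        # Skip the literal pattern when nothing matches.
        [ -e "${f}" ] || continue
        # Skip files that are already compressed.
        case "${f}" in *.gz) continue ;; esac
        gzip -- "${f}"
    done
}
```

Calling `gzip_job_logs <job ID>` from the directory that holds the logs leaves logs from other jobs untouched.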

Post-Processing with zppy (needs update)

To post-process a model run, do the following steps. Note that to post-process up to year n, you must first have short-term archived up to year n.

...