...

  1. cori optimized build has an internal compiler error (in shoc_assumed_pdf.cpp?). Avoid this by building in debug mode.

  2. perlmutter optimized build yields corrupted answers (really hot planet). Avoid this by building in debug mode.

  3. ne120 currently fails with OOM(?) errors when SPA is active. Avoid this by deleting the SPA process in namelist_scream.xml.

  4. For perlmutter: you need to use the gnugpu compiler and set --gpu-bind=none in env_batch.xml:

    Code Block
    <directives compiler="gnugpu">
          <directive> --gpus-per-task=1</directive>
          <directive> --gpu-bind=none</directive>
    </directives>

Step 1: Conda environment

...

The SCREAMv1 build requires a handful of python libraries. NERSC machines provide these by default, but on other machines the easiest way to get an appropriate environment is to create a conda environment with those packages installed. For example,
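a minimal sketch might look like the following (the package list here is an assumption, not an official requirement; add whatever the build later complains is missing):

Code Block
# Create and activate a conda environment for the SCREAMv1 build.
# The package list is a guess at the usual suspects -- adjust as needed.
conda create --name scream_v1_build python=3 pyyaml pylint psutil
conda activate scream_v1_build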

...

Code Block
export CODE_ROOT=~/gitwork/scream/ #or wherever you cloned the scream repo
export COMPSET=F2010-SCREAMv1      #or whatever compset you want
export RES=ne4_ne4                 #or whatever resolution you want
export CASE_NAME=${RES}.${COMPSET}.test1  #name for your simulation.
export PECOUNT=96x1                # number of MPI tasks x number of threads; should be divisible by node size
export QUEUE=pdebug                #whatever the name of your debug or batch queue is
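These variables feed the case-creation step (elided below). A sketch of a typical invocation, assuming standard CIME flag names (your machine may need extras such as --compiler or --project):

Code Block
# Sketch of case creation using the variables above; standard CIME flags assumed.
cd ${CODE_ROOT}/cime/scripts
./create_newcase --case ${CASE_NAME} --compset ${COMPSET} --res ${RES} \
    --pecount ${PECOUNT} --queue ${QUEUE}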

...

Resolution → Grid name (aka $RES):

  • ne4 → ne4_ne4

  • ne30 → ne30_ne30

  • ne120 → ne120_r0125_oRRS18to6v3

  • ne256 → (none listed)

  • ne512 → ne512_r0125_oRRS18to6v3

  • ne1024 → (none listed)

Suggested PECOUNTs (not necessarily performant, just something to get started). Maximum PE counts by resolution: ne4 = 96, ne30 = 5,400, ne120 = 86,400, ne256 = 393,216, ne512 = 1,572,864, ne1024 = 6,291,456.

  • cori-knl (68 cores/node; 96+16 GB/node): 16x1; 675x1 (NTASKS=675)

  • perlmutter (64 cores/node; 4 GPUs/node; 256 GB/node): 12x1 (NTASKS=12)

  • syrah (16 cores/node; 64 GB/node): 32x1, 160x1, 320x1

  • quartz (36 cores/node; 128 GB/node): 72x1, 180x1, 360x1

  • summit (8 cores/node?; 6 GPUs/node; 512+96 GB/node): (none listed)

...

Code Block
./xmlchange ATM_NCPL=288
./xmlchange DEBUG=TRUE #debug rather than optimized build.
./xmlchange JOB_QUEUE=pdebug #debug if on cori or perlmutter
./xmlchange JOB_WALLCLOCK_TIME=0:30:00
./xmlchange STOP_OPTION=ndays #how long to run for
./xmlchange STOP_N=1
./xmlchange HIST_OPTION=ndays #how often to write cpl.hi files
./xmlchange HIST_N=1
./xmlchange NTASKS=675 #change how many MPI tasks to use

The purpose of these changes is, respectively:

  1. change the atm timestep to 288 steps per day (300 sec). This needs to be done via ATM_NCPL, or else the land model will get confused about how frequently it is coupling with the atmosphere

  2. compile in debug mode. The run will be ~10x slower but will provide better error messages, and on some machines (see the known issues above) the model doesn't run in any other mode.

  3. change the default queue and wallclock from the standard queue (1 hr walltime) to the debug queue (30 min max walltime) in order to get through the queue faster.

  4. change the default length of the run from just a few steps to 1 day (or whatever you choose). This change is made in both env_run.xml and env_test.xml because case.submit seems to grab one or the other file according to confusing rules; it is easier to just change both.

  5. HIST_OPTION and HIST_N set the frequency of coupler snapshot (cpl.hi) files, which are useful for figuring out whether SCREAM is getting bad data from, or giving bad data to, the surface models

  6. NTASKS is the number of MPI tasks to use (which sets the number of nodes the job asks for). You can also set this via the --pecount argument to create_newcase.
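To confirm these changes stuck, the standard CIME xmlquery tool can read the values back from the case directory, e.g.:

Code Block
# Read back the settings changed above (run from the case directory).
./xmlquery ATM_NCPL,DEBUG,STOP_OPTION,STOP_N,HIST_OPTION,HIST_N,NTASKS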

Step 5: Change SCREAM settings

As of this writing, this is done by modifying namelist_scream.xml, either by hand or by using the atm-config-chg script which now comes bundled when you create a case. Explore namelist_scream.xml for variables you might want to change (though you shouldn't have to change anything just to run).
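For example, a single value can be changed from the case directory like this (se_tstep is used purely for illustration; the number shown is not a recommendation):

Code Block
# Illustration only: change one namelist_scream.xml value in place.
./atm-config-chg se_tstep=300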

** For perlmutter: in env_batch.xml, set --gpu-bind=none for the "gnugpu" compiler (see the known issues above).

...

  • if you want to run with the non-hydrostatic dycore:

    • Set tstep_type=9 (or run ./atm-config-chg tstep_type=9 in the case directory)

    • Set theta_hydrostatic_mode=False (or run ./atm-config-chg theta_hydrostatic_mode=False)

  • To modify what output gets written, edit the scream_output.yaml file under the run/data/ directory

  • Some bugs are affected by chunk length for vectorization, which is handled by “pack” size in v1. Pack size can be tweaked by editing the cmake machine file for the current machine (components/scream/cmake/machine-files/$machine.cmake).

Step 6: Config/Compile/Run
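The detailed commands are elided below, but the usual CIME sequence from the case directory is roughly:

Code Block
# Standard CIME workflow, run from the case directory.
./case.setup     # generate namelists and batch scripts
./case.build     # compile; the first build takes a while
./case.submit    # submit the run to the batch queue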

...

  1. Change Vertical Coordinate Filename to use the initial condition file for your new resolution

  2. Change Filename under the Initial Conditions → Physics GLL subsection to also use that new initial condition file

  3. Change SPA Remap File to use one appropriate for mapping ne30 to your new resolution

  4. Change se_ne as appropriate

  5. Change se_tstep and nu_top (recommended defaults for these and for dtime are given in the awesome table on the EAM's HOMME Dycore Recommended Settings (THETA) page); a sketch using atm-config-chg follows below
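A sketch of items 4 and 5 using atm-config-chg (the numbers are placeholders; take real se_tstep and nu_top values from the recommended-settings table linked above):

Code Block
# Placeholders only -- consult the HOMME Dycore Recommended Settings (THETA) table
# for the actual values at your target resolution.
./atm-config-chg se_ne=30
./atm-config-chg se_tstep=300
./atm-config-chg nu_top=2.5e5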

** Other options **

Changing output frequency and variables

  1. Modify the scream_output.yaml file under the run/data/ directory

Running with non-hydrostatic dycore

  1. Set tstep_type=9 (or run ./atm-config-chg tstep_type=9 in the case directory)

  2. Set theta_hydrostatic_mode=False (or run ./atm-config-chg theta_hydrostatic_mode=False)