...
NERSC (cori-knl, perlmutter): nothing to do - everything is available by default
Summit: just
```shell
module load python
```
LLNL machines (quartz, syrah):
...
Create a conda environment with
...
the needed packages:
```shell
conda create -n scream_v1_build pyyaml pylint psutil
```
This creates an environment named scream_v1_build with the appropriate packages. Creating the environment is a one-time step for a given machine; before each build, activate it:
```shell
conda activate scream_v1_build
```
...
Step 2: Define convenience variables
...
Suggested PECOUNTs (not necessarily performant, just something to get started). Note that EAMv1 currently uses something like 0.04 to 0.07 GB/element, so make sure you don’t put more elements per node than you have memory for.
| Machine | ne4 (max = 96) | ne30 (max = 5,400) | ne120 (max = 86,400) | ne256 (max = 393,216) | ne512 (max = 1,572,864) | ne1024 (max = 6,291,456) |
|---|---|---|---|---|---|---|
| cori-knl (68 cores/node; 96+16 GB/node) | 16x1 | 675x1 | | | | |
| perlmutter (64 cores/node; 4 GPUs/node; 256 GB/node) | 12x1 | | | | | |
| syrah (16 cores/node; 64 GB/node) | 32x1 | 160x1 | 320x1, 1600x1 | | | |
| quartz (36 cores/node; 128 GB/node) | 72x1 | 180x1 | 360x1, 1800x1 | | | |
| summit (8 cores/node?; 6 GPUs/node; 512+96 GB/node) | 256x1 | 4096x1 | | | | |
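As a quick sanity check on the memory note above, the maximum elements per node can be estimated from node memory and the worst-case 0.07 GB/element figure. This is a sketch using integer arithmetic; the 128 GB value is quartz's node memory from the table:

```shell
# Rough memory budget check (sketch; 0.07 GB/element is the worst case quoted above).
# Elements per node must stay below node_mem_gb / 0.07.
node_mem_gb=128            # e.g. quartz, from the table above
gb_per_elem_x100=7         # 0.07 GB/element, scaled by 100 for integer math
max_elems_per_node=$(( node_mem_gb * 100 / gb_per_elem_x100 ))
echo "max elements/node: ${max_elems_per_node}"
```

Divide the total element count for your resolution by this number to get a lower bound on nodes needed.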
Available compilers are listed in $CODE_ROOT/cime_config/machines/config_compilers.xml. Options for the various machines are listed below. CIME also puts simulations in a different scratch directory on each machine. Figuring out where output will go can be confusing, so defaults are listed below.
| Machine | Location where CIME puts the run | Available compilers |
|---|---|---|
| cori-knl | /global/cscratch1/sd/${USER}/e3sm_scratch/cori-knl/ | intel, gnu |
| perlmutter | /pscratch/sd/{first-letter-of-username}/${USER}/e3sm_scratch/perlmutter/ | gnugpu, nvidiagpu, gnu, nvidia |
| syrah | /p/lustre2/${USER}/e3sm_scratch/syrah/ | intel |
| quartz | /p/lustre2/${USER}/e3sm_scratch/quartz/ | intel |
| summit | /autofs/nccs-svm1_home1/${USER}/ for $CASE and /gpfs/alpine/cli115/proj-shared/${USER}/e3sm_scratch/ for run output | gnugpu, ibmgpu, pgigpu, gnu, ibm, pgi |
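To make the table concrete, here is a sketch of assembling the expected output location on quartz; the case name is hypothetical, and the trailing `run` subdirectory follows the usual CIME layout (output lands in the run subdirectory under the case's scratch directory):

```shell
# Sketch: where to look for model output on quartz, per the table above.
# CASE is a hypothetical case name; substitute your own.
CASE=scream_v1_test
RUN_DIR="/p/lustre2/${USER}/e3sm_scratch/quartz/${CASE}/run"
echo "Look for output in: ${RUN_DIR}"
```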
...
- Change the atm timestep to 288 steps per day (300 s). This needs to be done via ATM_NCPL, or else the land model will get confused about how frequently it is coupling with the atmosphere.
- Compile in debug mode. This runs ~10x slower but provides better error messages, and on some machines the model doesn’t run in any other mode.
- Change the default queue and wallclock from the standard queue with its 1 hr wallclock to the debug queue and its max of 30 min, to get through the queue faster. Note that the format for summit wallclock limits is hh:mm instead of the hh:mm:ss used on other machines.
- Change the default length of the run from just a few steps to 1 day (or whatever you choose). This change is made in both env_run.xml and env_test.xml because case.submit seems to grab one or the other file according to confusing rules; it’s easier to just change both.
- HIST_OPTION and HIST_N set the frequency of coupler snapshot (cpl.hi) files, which are useful for figuring out whether SCREAM is getting or giving bad data from/to the surface models.
- NTASKS is the number of MPI tasks to use (which sets the number of nodes the submission asks for). You can also set this via the --pecount option to create_newcase.
- Changing PIO_NETCDF_FORMAT to 64bit_data is needed at very high resolutions to avoid exceeding the maximum variable size limit.
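The list above maps onto `xmlchange` calls roughly as follows. This is a sketch: the queue name, wallclock, and NTASKS value are illustrative assumptions, and the commands are written into a helper script rather than run directly, since they must execute inside a case directory:

```shell
# Sketch only: xmlchange calls corresponding to the list above.
# Queue name, wallclock, and NTASKS are illustrative; run these inside your case dir.
cat > tweak_case.sh <<'EOF'
#!/bin/bash
./xmlchange ATM_NCPL=288                  # 288 atm steps/day = 300 s timestep
./xmlchange DEBUG=TRUE                    # debug mode: ~10x slower, better errors
./xmlchange JOB_QUEUE=debug               # queue name varies by machine (assumption)
./xmlchange JOB_WALLCLOCK_TIME=00:30:00   # hh:mm on summit instead of hh:mm:ss
./xmlchange --file env_run.xml STOP_OPTION=ndays,STOP_N=1
./xmlchange --file env_test.xml STOP_OPTION=ndays,STOP_N=1
./xmlchange HIST_OPTION=nsteps,HIST_N=1   # coupler snapshot frequency (illustrative)
./xmlchange NTASKS=675                    # or set via --pecount at create_newcase
./xmlchange PIO_NETCDF_FORMAT=64bit_data  # needed at very high resolution
EOF
chmod +x tweak_case.sh
```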
...
You can check its progress via `squeue -u <username>` on LLNL and NERSC systems; use `jobstat -u <username>` or `bjobs` on Summit. Model output will be in the run subdirectory.
...