NERSC Quick Reference for E3SM Tutorial
This page is meant to be a “cheat sheet” of basic NERSC commands, for reference during the tutorial.
To log in to Perlmutter: `ssh -l $USER perlmutter.nersc.gov` (see https://docs.nersc.gov/connect/)
For the E3SM tutorial, all participants will be added to a temporary project named `ntrain6` (this is not a username; it is your unix group and the account that compute hours are charged to). To check whether you are in `ntrain6`, try the command `groups`.
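As a small sketch of the membership check, the function below looks for a group name in a space-separated list and defaults to the output of `id -Gn` for the current user. The function name `has_group` is hypothetical and not part of any NERSC tooling.

```shell
# Return success if group $1 appears in the space-separated group list $2.
# With no second argument, the current user's groups (from `id -Gn`) are used.
has_group() {
    local wanted="$1"
    local groups_list="${2:-$(id -Gn)}"
    local g
    for g in $groups_list; do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

if has_group ntrain6; then
    echo "ntrain6 membership found"
else
    echo "not in ntrain6 yet"
fi
```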
To ask for a compute node: `salloc -A ntrain6 --reservation=e3sm_day1 -C cpu -N 1 -c 32 -t 30 -q shared`
(`-t` is the time limit, in minutes; `-q shared` means you would be sharing a compute node.)
Note that you don’t need `--qos=interactive` for reservations, but you would want to use it when not using a reservation.
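The choice between the two cases can be sketched as a helper that only prints the `salloc` command rather than running it. The function name `salloc_cmd` is hypothetical, and the flag set in the no-reservation branch (one CPU node on the interactive qos) is an assumption based on the note above, not a command taken from this page.

```shell
# Print (do not run) a salloc command for one CPU node.
#   $1 = account, $2 = time limit in minutes, $3 = reservation name (optional)
salloc_cmd() {
    local account="$1" minutes="$2" reservation="${3:-}"
    if [ -n "$reservation" ]; then
        # Inside a reservation: same flags as the tutorial command above.
        echo "salloc -A $account --reservation=$reservation -C cpu -N 1 -c 32 -t $minutes -q shared"
    else
        # No reservation (assumed flag set): request the interactive qos instead.
        echo "salloc -A $account -C cpu -N 1 -t $minutes --qos=interactive"
    fi
}

salloc_cmd ntrain6 30 e3sm_day1
salloc_cmd ntrain6 30
```

Printing the command first makes it easy to check the flags before copying the line into a terminal.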
Notes about reservations (nodes set aside for a given time for use by one or several users):
To use one, pass these flags to sbatch: `--reservation e3sm_day1 -A ntrain6`, where `-A` changes the account to be used, so the job is charged against `ntrain6` instead of your default account.
You can do this on the `case.submit` command via something like: `case.submit -a="-A ntrain6 --reservation=e3sm_day1"`
Note that the reservation name is different for each day. On Tuesday, May 7th, the reservation name is `e3sm_day1`; on May 8th, `e3sm_day2`; on May 9th, `e3sm_day3`; and on May 10th, `e3sm_day4`. The reservations are scheduled to run each day from 1pm to midnight, except on the last day (1-5pm).
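The date-to-name mapping above can be sketched as a lookup, so that the right reservation is picked up automatically on each tutorial day. The function name `reservation_for_day` and the `MM-DD` argument format are illustrative choices, not part of E3SM or NERSC tooling.

```shell
# Map a tutorial date (MM-DD) to that day's reservation name.
# Returns non-zero for dates that have no tutorial reservation.
reservation_for_day() {
    case "$1" in
        05-07) echo "e3sm_day1" ;;
        05-08) echo "e3sm_day2" ;;
        05-09) echo "e3sm_day3" ;;
        05-10) echo "e3sm_day4" ;;
        *) echo "no tutorial reservation for $1" >&2; return 1 ;;
    esac
}

# Print the case.submit line for today's reservation, if there is one.
if res=$(reservation_for_day "$(date +%m-%d)"); then
    echo "case.submit -a=\"-A ntrain6 --reservation=$res\""
fi
```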
After a job is submitted, Slurm allows a user to change a few things (before the job starts to run!). Examples:
scontrol update qos=debug jobid=xx -- move job xx to the debug qos
scontrol update qos=debug timelimit=30 jobid=xx -- move job xx to the debug qos and change walltime to 30 min
scontrol update qos=regular jobid=xx -- move job xx to regular qos
scontrol update reservation=e3sm_day1 account=ntrain6 jobid=xx -- move job xx to reservation noted
scontrol update reservation=e3sm_day1 account=ntrain6 timelimit=60 jobid=xx -- move job xx to reservation noted and set walltime to 60 min
To see a list of your jobs: squeue -u $USER or sqs -u $USER
You can also change the format of the output:
squeue -u $USER --format "%.8i %.42j %.9P %3f %.9q %.6a %.3u %.3t %.10M %.10l %.6D %19V %S"
Unrelated to reservations, here is a possible solution to the issue of some users not having write access to the e3sm space:
/global/cfs/cdirs/ntrain6/www (contents will be removed after the tutorial)
/global/cfs/cdirs/e3sm/www/Tutorials/2024/users
Note that, in general on Perlmutter, when you do not have access to a special reservation, the machine has a `debug` qos that can allow for much shorter queue wait times. To use the debug qos, a job must use 8 nodes or fewer and run for 30 minutes or less, and you can have at most 5 jobs at a time in the debug qos.
https://docs.nersc.gov/jobs/policy/
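The debug qos limits quoted above can be sketched as a quick pre-submission check. The function name `fits_debug_qos` is hypothetical, and treating "5 jobs at a time" as "fewer than 5 already in the debug qos before adding one more" is an interpretation; the policy page linked above is the authority on current limits.

```shell
# Return success if a job request fits the debug qos limits quoted above.
#   $1 = nodes, $2 = walltime in minutes, $3 = jobs you already have in debug
fits_debug_qos() {
    local nodes="$1" minutes="$2" current_jobs="$3"
    [ "$nodes" -le 8 ] && [ "$minutes" -le 30 ] && [ "$current_jobs" -lt 5 ]
}

# Example: a 4-node, 30-minute job with one debug job already queued.
if fits_debug_qos 4 30 1; then
    echo "ok for debug"
else
    echo "use regular"
fi
```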