NERSC Quick Reference for E3SM Tutorial

This page is a “cheat sheet” of basic NERSC commands, meant for reference during the tutorial.

To log in to Perlmutter: `ssh -l $USER perlmutter.nersc.gov` (see https://docs.nersc.gov/connect/)

For the E3SM tutorial, all participants will be added to a temporary project named ntrain6. This is not a username: it is your Unix group and the account that compute hours are charged to. To check whether you are in ntrain6, run the command `groups`.
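The membership check can also be scripted. The following is a minimal sketch, assuming bash; the function name `has_group` is hypothetical and not part of any NERSC tooling. It looks for an exact group name in a space-separated list, which defaults to the output of `id -nG` (the same list `groups` prints for the current user).

```shell
#!/usr/bin/env bash
# has_group is a hypothetical helper name, used here only for illustration.
# Return 0 if the group named in $1 appears in the space-separated list in $2.
# When $2 is omitted, the list is the current user's groups from `id -nG`.
has_group() {
    local wanted="$1"
    local groups_list="${2:-$(id -nG)}"
    local g
    for g in $groups_list; do
        # Exact match only, so "ntrain" does not match "ntrain6".
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}
```

On a login node this could be used as `has_group ntrain6 && echo "in ntrain6"`.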

To request a compute node: `salloc -A ntrain6 --reservation=e3sm_day1 -C cpu -N 1 -c 32 -t 30 -q shared` (`-t` is the time limit in minutes; `-q shared` means you will share the compute node with other jobs).

Note that you don’t need --qos=interactive when using a reservation, but you would want it when not using a reservation.

Notes about reservations (nodes set aside for a given time for use by one or several users):

To use a reservation, pass these flags to sbatch: `--reservation e3sm_day1 -A ntrain6`

Here -A changes the account used, so the job is charged against ntrain6 instead of your default account.

You can also pass these flags through case.submit, with something like: `case.submit -a="-A ntrain6 --reservation=e3sm_day1"`

Note that the reservation name is different for each day. On Tuesday, May 7th, the reservation name is e3sm_day1; on May 8th, e3sm_day2; on May 9th, e3sm_day3; and on May 10th, e3sm_day4. The reservations run from 1pm to midnight each day, except on the last day (1-5pm).
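Because the reservation name changes each day, it can be convenient to look it up by date. The sketch below, assuming bash, encodes the schedule listed above; the function name `reservation_for_date` and the `MM-DD` input format are illustrative choices, not an existing NERSC or E3SM command.

```shell
#!/usr/bin/env bash
# reservation_for_date is a hypothetical helper, shown only as a sketch.
# Print the tutorial reservation name for a date given as MM-DD.
# With no argument, today's date from `date +%m-%d` is used.
reservation_for_date() {
    local month_day="${1:-$(date +%m-%d)}"
    case "$month_day" in
        05-07) echo "e3sm_day1" ;;
        05-08) echo "e3sm_day2" ;;
        05-09) echo "e3sm_day3" ;;
        05-10) echo "e3sm_day4" ;;
        *)
            # Any other date has no tutorial reservation.
            echo "no tutorial reservation for $month_day" >&2
            return 1
            ;;
    esac
}
```

One possible use is `case.submit -a="-A ntrain6 --reservation=$(reservation_for_date)"`, which fills in the name for the current day.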

After a job is submitted, Slurm allows the user to change a few things (only before the job starts to run!). Examples:

scontrol update qos=debug jobid=xx -- move job xx to the debug qos
scontrol update qos=debug timelimit=30 jobid=xx -- move job xx to the debug qos and change walltime to 30 min
scontrol update qos=regular jobid=xx -- move job xx to the regular qos
scontrol update reservation=e3sm_day1 account=ntrain6 jobid=xx -- move job xx to the noted reservation
scontrol update reservation=e3sm_day1 account=ntrain6 timelimit=60 jobid=xx -- move job xx to the noted reservation and set walltime to 60 min

To see a list of your jobs: `squeue -u $USER` or `sqs -u $USER`

You can also change the format of the output:

squeue -u $USER --format "%.8i %.42j %.9P %3f %.9q %.6a %.3u %.3t %.10M %.10l %.6D %19V %S"

Unrelated to reservations, here is a possible solution for users who do not have write access to the e3sm space:

/global/cfs/cdirs/ntrain6/www (contents will be removed after the tutorial)
/global/cfs/cdirs/e3sm/www/Tutorials/2024/users

 

Note that in general on Perlmutter, when you do not have access to a special reservation, the machine has a debug qos that can allow much shorter queue wait times. To use the debug qos, a job must use 8 nodes or fewer and run for 30 minutes or less, and you can have only 5 jobs at a time in the debug qos.

https://docs.nersc.gov/jobs/policy/
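
The debug qos limits described above can be checked before submitting. This is a minimal sketch, assuming bash and assuming the limits stated on this page (8 nodes, 30 minutes, 5 jobs); the policy page is the authoritative source and the limits may change. The function name `fits_debug_qos` is hypothetical.

```shell
#!/usr/bin/env bash
# fits_debug_qos is a hypothetical helper; the limits are the ones quoted on
# this page and should be checked against the NERSC policy page.
# Arguments: node count, walltime in minutes, and (optionally) the number of
# jobs you already have in the debug qos (defaults to 0).
# Returns 0 when the request fits the debug qos, 1 otherwise.
fits_debug_qos() {
    local nodes="$1"
    local minutes="$2"
    local debug_jobs="${3:-0}"
    [ "$nodes" -le 8 ] && [ "$minutes" -le 30 ] && [ "$debug_jobs" -lt 5 ]
}
```

For example, `fits_debug_qos 4 20 && echo "ok for debug"` would print the message for a 4-node, 20-minute job.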