This page is meant to be a “cheat sheet” of basic NERSC commands for reference during the tutorial. I’m making notes here now, but will try to simplify. Let me know what else could be included.

To log in to Perlmutter: `ssh -l $USER perlmutter.nersc.gov` (there are multiple login nodes, and all can submit jobs to CPU/GPU compute nodes). See https://docs.nersc.gov/connect/

For the E3SM tutorial, all participants will be added to a temporary project named ntrain6 (this is not a username; it is your unix group and the account that compute hours are charged to). To check whether you are in ntrain6, run the command `groups`.
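As a small sketch of the membership check above, the snippet below tests for the group instead of reading the `groups` output by eye. The helper name `has_group` is hypothetical (not a NERSC tool); it relies only on the standard `id -Gn` command.

```shell
# has_group is a hypothetical helper, not a NERSC-provided command.
# It succeeds if the current user belongs to the unix group named in $1.
has_group() {
    # id -Gn prints the names of all groups of the current user, space-separated;
    # put one name per line and look for an exact whole-line match.
    id -Gn | tr ' ' '\n' | grep -qx "$1"
}

if has_group ntrain6; then
    echo "You are in ntrain6"
else
    echo "Not in ntrain6 (yet)"
fi
```

If the check fails right after you were added to the project, logging out and back in may be needed before the new group shows up.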

To ask for a compute node: `salloc -A ntrain6 --reservation=e3sm_day1 -C cpu -N 1 -c 32 -t 30 -q shared` (-t is the time limit, in minutes; -q shared means you would be sharing a compute node).

Note that you don’t need --qos=interactive when using a reservation, but you would want to use it when not using one.
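The two cases (with and without a reservation) could be wrapped in one helper, sketched below. The function name `build_salloc` is hypothetical, and it only prints the command so you can inspect it first. The no-reservation variant, including dropping `-c 32` and `-q shared`, is an assumption based on the note above and not a tested recipe.

```shell
# build_salloc is a hypothetical helper: it prints an salloc command line.
# Pass a reservation name as $1, or nothing when no reservation is available.
build_salloc() {
    local reservation="${1:-}"
    if [ -n "$reservation" ]; then
        # Reservation case: the same flags as the salloc example in the text.
        echo "salloc -A ntrain6 --reservation=$reservation -C cpu -N 1 -c 32 -t 30 -q shared"
    else
        # No reservation (assumed variant): request the interactive qos instead.
        echo "salloc -A ntrain6 -C cpu -N 1 -t 30 --qos=interactive"
    fi
}

build_salloc e3sm_day1   # with the day-1 reservation
build_salloc             # without a reservation
```

Copy the printed line into your shell once it looks right.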

...

where the -A is indeed changing the account to be used – it charges against ntrain6 instead.

You can do this with the case.submit command via something like: `case.submit -a="-A ntrain6 --reservation=e3sm_day1"`

Note that the reservation name is different for each day. On Tuesday, May 7th, the reservation name is e3sm_day1; on May 8th, e3sm_day2; on May 9th, e3sm_day3; and on May 10th, e3sm_day4. The reservations are scheduled to run from 1pm to midnight each day, except on the last day (1-5pm).
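Since the reservation name changes each day, a lookup by date can save some typing. This is only a sketch: the function name `reservation_for_date` is hypothetical, and the year 2024 is an assumption taken from the tutorial path on this page. It does not check the time-of-day window.

```shell
# reservation_for_date is a hypothetical helper.
# It maps a YYYY-MM-DD date to that day's reservation name (year 2024 assumed)
# and returns non-zero for any other date.
reservation_for_date() {
    case "$1" in
        2024-05-07) echo "e3sm_day1" ;;
        2024-05-08) echo "e3sm_day2" ;;
        2024-05-09) echo "e3sm_day3" ;;
        2024-05-10) echo "e3sm_day4" ;;
        *) return 1 ;;
    esac
}

today="$(date +%F)"   # today's date as YYYY-MM-DD
if resv="$(reservation_for_date "$today")"; then
    # Print the salloc line for today's reservation (nothing is submitted here).
    echo "salloc -A ntrain6 --reservation=$resv -C cpu -N 1 -c 32 -t 30 -q shared"
else
    echo "no tutorial reservation on $today"
fi
```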

...

scontrol update qos=debug jobid=xx                # move job xx to the debug qos
scontrol update qos=debug timelimit=30 jobid=xx   # move job xx to the debug qos and change walltime to 30 min
scontrol update qos=regular jobid=xx              # move job xx to the regular qos
scontrol update reservation=e3sm_day1 account=ntrain6 jobid=xx   # move job xx to the named reservation
scontrol update reservation=e3sm_day1 account=ntrain6 timelimit=60 jobid=xx   # move job xx to the named reservation and set walltime to 60 min

To see a list of your jobs: `squeue -u $USER` or `sqs -u $USER`

Can also change format of output:

squeue -u $USER  --format "%.8i %.42j %.9P %3f %.9q %.6a %.3u %.3t %.10M %.10l %.6D %19V %S"
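To avoid retyping the format string, one option is Slurm's `SQUEUE_FORMAT` environment variable, which squeue reads as a default for `--format`. A minimal sketch, assuming you add the export to your own shell startup file (that placement is a suggestion, not something this page prescribes):

```shell
# Default output format for squeue, using the same format string as above.
export SQUEUE_FORMAT="%.8i %.42j %.9P %3f %.9q %.6a %.3u %.3t %.10M %.10l %.6D %19V %S"

# With the variable set, a plain squeue call picks up the format.
# The guard only keeps this snippet harmless on machines without Slurm.
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER"
fi
```

An explicit `--format` on the command line still overrides the variable.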

Unrelated to reservations, here is a possible solution for users who do not have write access to the e3sm space:

/global/cfs/cdirs/ntrain6/www (just tell them the contents will be removed after the tutorial)
/global/cfs/cdirs/e3sm/www/Tutorials/2024/users – Wuyin notes this may also work, as he has changed the permissions?

Note that, in general on Perlmutter, when you do not have access to a special reservation, the machine has a debug qos that can allow much shorter queue wait times. To use the debug qos, a job must use 8 nodes or fewer and run for 30 minutes or less, and you can have only 5 jobs in the debug qos at a time.

https://docs.nersc.gov/jobs/policy/
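As a quick sanity check before moving a job to the debug qos, the node and walltime limits quoted above can be encoded in a small test. The function name `fits_debug_qos` is hypothetical, and the numbers are the ones stated on this page; check the policy link for current values. The sketch does not check the 5-jobs-at-a-time limit.

```shell
# fits_debug_qos NODES MINUTES  (hypothetical helper)
# Succeeds if the request is within the debug qos limits quoted on this page:
# at most 8 nodes and at most 30 minutes.
fits_debug_qos() {
    local nodes="$1" minutes="$2"
    [ "$nodes" -le 8 ] && [ "$minutes" -le 30 ]
}

if fits_debug_qos 4 30; then
    echo "4 nodes for 30 min: fits the debug qos"
fi
if ! fits_debug_qos 16 30; then
    echo "16 nodes for 30 min: too large for the debug qos"
fi
```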