...
Check that the rpointer files all point to the last restart. On very rare occasions, there might be some inconsistency if the model crashed at the end.
Run
head -n 1 rpointer.*
to see the restart date.
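The consistency check above can be sketched as a small shell function. This is only an illustration: the function name `check_rpointers` is hypothetical, and it assumes that the first line of each rpointer file contains a timestamp of the form YYYY-MM-DD-SSSSS, which may not hold for every component.

```shell
# Sketch: report whether all rpointer.* files reference the same restart date.
# ASSUMPTION: the first line of each file contains a YYYY-MM-DD-SSSSS timestamp.
# check_rpointers is a hypothetical helper name, not part of the model scripts.
check_rpointers() {
    dir="${1:-.}"
    # Collect the distinct timestamps found on the first line of each file.
    dates=$(for f in "$dir"/rpointer.*; do
                [ -f "$f" ] && head -n 1 "$f"
            done | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{5}' | sort -u)
    # Count the non-empty lines, i.e. the number of distinct timestamps.
    count=$(printf '%s\n' "$dates" | grep -c . || true)
    if [ "$count" -eq 1 ]; then
        echo "consistent: $dates"
    else
        echo "inconsistent or missing restart dates:"
        printf '%s\n' "$dates"
        return 1
    fi
}
```

Calling `check_rpointers` in the run directory prints a single date when the files agree; otherwise it lists every date it found and returns a non-zero status.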
gzip all the *.log files from the faulty segment so that they get moved during the next short-term archiving. To gzip log files from failed jobs, run
gzip *.log.<job ID>.*
(where <job ID> has no periods/dots in it).
Delete core or error files if there are any. MPAS components will sometimes produce a large number of them. The following commands are useful for checking for these files:
ls | grep -in core
ls | grep -in err
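As a variation on the two commands above, the following sketch lists only regular files in one directory whose names contain "core" or "err", so they can be reviewed before deletion. The function name `list_debris` is hypothetical, and the name patterns are an assumption about how such files are named. Like the grep commands, it can match unrelated files (for example, a name containing "score"), so inspect the list before removing anything.

```shell
# Sketch: list candidate core/error files in a directory without deleting them.
# ASSUMPTION: such files have "core" or "err" somewhere in their names.
# list_debris is a hypothetical helper name.
list_debris() {
    dir="${1:-.}"
    # -maxdepth 1 keeps the search to the given directory;
    # -iname matches the file name case-insensitively.
    find "$dir" -maxdepth 1 -type f \( -iname '*core*' -o -iname '*err*' \) | sort
}
```

Running `list_debris` with no argument checks the current directory and prints one path per line.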
If you are re-submitting the initial job, you will need to run
./xmlchange -id CONTINUE_RUN -val TRUE
...