...

The atmosphere and ocean model output is significantly larger than the land and ice model output. These commands account for that by using different parallelization strategies, which may (rhea standard queue) or may not (cooley or rhea bigmem queue) be required, depending on the memory capacity ("fatness") of the analysis nodes, as explained below.

Extended climos:

ncclimo can re-use previous work and produce extended (i.e., longer duration) climatologies, either by combining two previously computed climatologies (the binary method) or by computing a new climatology from raw monthly model output and then combining it with a previously computed climatology (the incremental method). Producing an extended climatology by the incremental method requires, at minimum, specifying (with -S) the start year of the previously computed climo. Producing an extended climatology by the binary method requires, at minimum, specifying both the start years (with -S and -s) and end years (with -E and -e) of both pre-computed climatologies.
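As a sketch of the incremental method (with illustrative caseid and paths, mirroring the binary examples below), combining a pre-computed climo that starts in year 2 with a new climo computed from raw monthly output for years 6-15 might look like the following; the exact minimal flags follow the description above (-S for the pre-computed climo, -s/-e for the new one):

```shell
# Incremental method sketch (caseid and paths are illustrative):
# -S 2      start year of the previously computed climo
# -s 6 -e 15  year range of the new climo computed from raw monthly output
ncclimo -p mpi -c ${caseid} -m cam -S 2 -s 6 -e 15 -i ${DATA}/acme/atm -o ${DATA}/acme/atm
```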

Following are two examples of computing extended climatologies using the binary method (where both input climatologies are already computed). In the first example, note that the input directory is the same for both pre-computed climatologies, and is the same as the output directory for the extended climatology. This is perfectly legal and keeps the number of required options to a minimum. It amounts to storing all native grid climatologies in the same directory:

ncclimo -p mpi -c ${caseid} -m cam  -S 2 -E 5 -s 6 -e 15 -i ${DATA}/acme/atm -o ${DATA}/acme/atm
ncclimo        -c ${caseid} -m clm2 -S 2 -E 5 -s 6 -e 15 -i ${DATA}/acme/lnd -o ${DATA}/acme/lnd
ncclimo -p mpi -c hist      -m ocn  -S 2 -E 5 -s 6 -e 15 -i ${DATA}/acme/ocn -o ${DATA}/acme/ocn 
ncclimo        -c hist      -m ice  -S 2 -E 5 -s 6 -e 15 -i ${DATA}/acme/ice -o ${DATA}/acme/ice

Memory Considerations:

It is important to employ the optimal ncclimo parallelization strategy for your computer hardware. Select from the three available choices with the '-p par_typ' switch: serial mode ('-p nil' or '-p serial'), background-mode parallelism ('-p bck'), and MPI parallelism ('-p mpi'). The default is background-mode parallelism, which is appropriate for lower-resolution (e.g., ne30L30) simulations on most nodes at high-performance computing centers.

Use (or at least start with) serial mode on personal laptops/workstations. Serial mode requires about one-twelfth the RAM of the parallel modes, and is much less likely to deadlock or cause OOM (out-of-memory) conditions on your personal computer. If the available RAM (+swap) is < 12*4*sizeof(monthly input file), then try serial mode first (12 is the optimal number of parallel processes for monthly climos, and the computational overhead is a factor of four).

CAM-SE ne30L30 output is ~1 GB per month, so each month requires about 4 GB of RAM. CAM-SE ne30L72 output (with LINOZ) is ~10 GB/month, so each month requires ~40 GB of RAM. CAM-SE ne120 output is ~12 GB/month, so each month requires ~48 GB of RAM.

The computer does not actually use all this memory at one time, and many kernels compress RAM usage to below what top reports, so the actual physical usage is hard to pin down, but may be a factor of 2.5-3.0 (rather than 4.0) times the size of the input file. For instance, my 16 GB MacBook Pro will successfully run an ne30L30 climatology (which requests 48 GB of RAM) in background mode, but the laptop will be slow and unresponsive for other uses until the climos finish (in 6-8 minutes). Experiment a bit and choose the parallelization option that works best for you.
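The rule of thumb above (12 simultaneous monthly processes, each needing roughly 4x its input file size) can be turned into a quick back-of-the-envelope check before choosing a parallel mode; the monthly size below is the approximate ne30L30 figure quoted in the text:

```shell
# Rough peak-RAM estimate for the parallel (bck/mpi) modes:
# 12 simultaneous monthly processes, each needing ~4x its input file size.
monthly_gb=1                      # CAM-SE ne30L30: ~1 GB/month (from the text)
ram_gb=$((12 * 4 * monthly_gb))   # total RAM requested across all 12 processes
echo "ne30L30 parallel climo requests ~${ram_gb} GB RAM"
# prints: ne30L30 parallel climo requests ~48 GB RAM
```

If this estimate exceeds your available RAM (+swap), start with '-p serial' instead.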

...