Basic:

The basic way to use climo_nco.sh is to bring up a terminal window and simply type:

climo_nco.sh         -s start_yr -e end_yr -c run_id -i drc_in -o drc_out # CAM
climo_nco.sh -m clm2 -s start_yr -e end_yr -c run_id -i drc_in -o drc_out # ALM/CLM

A complete description of all available flags is given in comments embedded in climo_nco.sh, and for convenience here is a summary:

-s: start year (example: 1980). Unless the optional flag "-a sdd" is specified, the first month used will be Dec of the year before the start year you specify (to allow for contiguous DJF climos). If "-a sdd" is specified, the first month used will be Jan of the specified start year.
-e: end year (example: 2000). Unless the optional flag "-a sdd" is specified, the last month used will be Nov of the specified end year. If "-a sdd" is specified, the last month will be Dec of the specified end year.
-i: directory containing all netcdf files to be used for input to this code.
-o: directory where computed climo files should be placed.

-c: caseid, i.e., simulation name. For input files like famipc5_ne30_v0.3_00001.cam.h0.1980-01.nc, you would specify "-c famipc5_ne30_v0.3_00001". The ".cam." and ".h0." bits are added to the filenames internally by default, and can be modified via the "-m mdl_nm" and "-h hst_nm" switches if needed. See the comments in climo_nco.sh for details.

-m: model type. Default is "cam". Other options are "clm2", "ocn", "ice", "cism", "cice", "pop".
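
The effect of "-s", "-e", and "-a sdd" on the months included can be sketched as a small shell function. This is only an illustration of the rules stated above, not code taken from climo_nco.sh; the function name climo_span and its argument order are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (NOT part of climo_nco.sh): print the first and last
# months a climatology would span, following the -s/-e/-a rules above.
# Usage: climo_span start_yr end_yr [sdd]
climo_span() {
    local yr_srt=$1 yr_end=$2 mode=${3:-default}
    if [ "${mode}" = "sdd" ]; then
        # With "-a sdd": Jan of start year through Dec of end year
        printf '%04d-01 %04d-12\n' "${yr_srt}" "${yr_end}"
    else
        # Default: Dec of the year before start year through Nov of end year
        printf '%04d-12 %04d-11\n' "$((yr_srt - 1))" "${yr_end}"
    fi
}

climo_span 1980 2000       # prints: 1979-12 2000-11
climo_span 1980 2000 sdd   # prints: 1980-01 2000-12
```

In other words, with the default settings the input directory needs the December file from the year before start_yr.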

MPAS O/I considerations:

MPAS ocean and ice models currently have their own (non-CESM'ish) naming convention for monthly output files. climo_nco.sh recognizes input files as being MPAS-style when invoked with "-c hist" and "-m ocn" or "-m ice". Use the optional "-f fml_nm" switch to replace "hist" with a more descriptive simulation name for the output. Invocation looks like this:

climo_nco.sh -c hist -m ocn -s 1980 -e 1983 -i drc_in -o drc_out # MPAS-O
climo_nco.sh -c hist -m ice -s 1980 -e 1983 -i drc_in -o drc_out # MPAS-I

MPAS climos are unaware of missing values until/unless the input files are "fixed". I recommend that the person who produces the simulation annotate all floating point variables with the appropriate _FillValue prior to invoking climo_nco.sh. Run something like this once in the history file directory:

# Set the _FillValue attribute in every MPAS history file in this directory
for fl in hist.* ; do
  ncatted -O -t -a _FillValue,,o,d,-9.99999979021476795361e+33 "${fl}"
done

If/when MPAS O/I generates the _FillValue attributes itself, this step can and should be skipped. All other climo_nco.sh features, like regridding (below), are invoked identically for MPAS as for CAM/CLM, although under the hood climo_nco.sh does some special pre-processing (dimension permutation, metadata annotation) for MPAS. A five-year oEC60to30 MPAS-O climo with regridding to T62 takes < 10 minutes on rhea.

...

climo_nco.sh will (optionally) regrid during climatology generation and produce climatology files on both the native and desired analysis grids. This regridding is virtually free, because it is performed on idle nodes/cores after the monthly climatologies have been computed and while the seasonal climatologies are being computed. This load-balancing can save half an hour on ne120 datasets. To regrid, simply pass the desired mapfile name with "-r map.nc", e.g., "-r ${DATA}/maps/map_ne120np4_to_fv257x512_aave.20150901.nc". Although this should not be necessary for normal use, you may pass any options specific to regridding with "-R opt1 opt2", e.g., "-R '--rgr col_nm=lndgrid'" (to regrid ALM or CLM).

Specifying '-O drc_rgr' (NB: uppercase "O") causes climo_nco.sh to place the regridded files in the directory ${drc_rgr}. These files have the same names as the native grid climos from which they were derived. There is no namespace conflict because they are in separate directories. These files also have symbolic links to their AMWG filenames. If '-O drc_rgr' is not specified, climo_nco.sh places all regridded files in the native grid climo output directory, ${drc_out}, specified by '-o drc_out' (NB: lowercase "o"). To avoid namespace conflicts when both climos are stored in the same directory, the names of the regridded files are suffixed by the destination geometry string obtained from the mapfile, e.g., '*_climo_fv257x512_bilin.nc'. These files also have symbolic links to their AMWG filenames.
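
The naming rule for regridded files can be sketched as follows. This is a rough illustration of the behavior described above, not code from climo_nco.sh: the function name rgr_path is made up, and the native filename foo_01_climo.nc is a placeholder rather than an actual climo_nco.sh output name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (NOT part of climo_nco.sh): show where the regridded
# copy of a native-grid climo file would be written.
# Usage: rgr_path fl_ntv drc_out grd_sng [drc_rgr]
#   fl_ntv  : native-grid climo filename (placeholder name in this example)
#   drc_out : native-grid output directory (the '-o' argument)
#   grd_sng : destination geometry string taken from the mapfile
#   drc_rgr : optional regrid directory (the '-O' argument)
rgr_path() {
    local fl_ntv=$1 drc_out=$2 grd_sng=$3 drc_rgr=${4:-}
    if [ -n "${drc_rgr}" ]; then
        # Separate directory: same filename as the native-grid climo
        echo "${drc_rgr}/${fl_ntv}"
    else
        # Shared directory: append the geometry string to avoid a name clash
        echo "${drc_out}/${fl_ntv%.nc}_${grd_sng}.nc"
    fi
}

rgr_path foo_01_climo.nc drc_out fv257x512_bilin drc_rgr
# prints: drc_rgr/foo_01_climo.nc
rgr_path foo_01_climo.nc drc_out fv257x512_bilin
# prints: drc_out/foo_01_climo_fv257x512_bilin.nc
```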

...

See the full ncremap documentation for more examples (including MPAS!).

 

Coupled Runs:

climo_nco.sh works on all ACME models. It can simultaneously generate climatologies for a coupled run, where climatologies mean both native and regridded monthly, seasonal, and annual averages as per the AG specification. Here are template commands for a recent simulation:

caseid=20160121.A_B2000ATMMOD.ne30_oEC.titan.a00
drc_in=/lustre/atlas1/cli112/proj-shared/golaz/ACME_simulations/20160121.A_B2000ATMMOD.ne30_oEC.titan.a00/run
map_atm=${DATA}/maps/map_ne30np4_to_fv129x256_aave.20150901.nc
map_lnd=$map_atm
map_ocn=${DATA}/maps/map_oEC60to30_to_t62_bilin.20160301.nc
map_ice=$map_ocn
climo_nco.sh -p mpi -c ${caseid} -m cam  -s 2 -e 5 -i $drc_in -r $map_atm -o ${DATA}/acme/atm
climo_nco.sh -c ${caseid} -m clm2 -s 2 -e 5 -i $drc_in -r $map_lnd -o ${DATA}/acme/lnd
climo_nco.sh -p mpi -c hist -m ocn -s 2 -e 5 -i $drc_in -r $map_ocn -o ${DATA}/acme/ocn
climo_nco.sh -c hist -m ice -s 2 -e 5 -i $drc_in -r $map_ice -o ${DATA}/acme/ice

The atmosphere and ocean model output is significantly larger than the land and ice model output. These commands account for that by using different parallelization strategies: MPI mode ('-p mpi') for the atmosphere and ocean, and the default mode for land and ice. MPI mode may (rhea standard queue) or may not (cooley or rhea bigmem queue) be required, depending on how much memory the analysis nodes have, as explained below.

Memory Considerations:

It is important to employ the optimal climo_nco.sh parallelization strategy for your computer hardware resources. Select from the three available choices with the '-p par_typ' switch. The options are serial mode ('-p nil' or '-p serial'), background mode parallelism ('-p bck'), and MPI parallelism ('-p mpi'). The default is background mode parallelism, which is appropriate for lower resolution (e.g., ne30L30) simulations on most nodes at high-performance computer centers. Use (or at least start with) serial mode on personal laptops/workstations. Serial mode requires twelve times less RAM than the parallel modes, and is much less likely to deadlock or cause OOM (out-of-memory) conditions on your personal computer. If the available RAM (+swap) is < 12*4*sizeof(monthly input file), then try serial mode first (12 is the optimal number of parallel processes for monthly climos, and the computational overhead is a factor of four).

CAM-SE ne30L30 output is ~1 GB per month so each month requires about 4 GB of RAM. CAM-SE ne30L72 output (with LINOZ) is ~10 GB/month so each month requires ~40 GB RAM. CAM-SE ne120 output is ~12 GB/month so each month requires ~48 GB RAM. The computer does not actually use all this memory at one time, and many kernels compress RAM usage to below what top reports, so the actual physical usage is hard to pin down, but may be a factor of 2.5-3.0 (rather than a factor of four) times the size of the input file. For instance, my 16 GB MacBookPro will successfully run an ne30L30 climatology (that requests 48 GB RAM) in background mode, but the laptop will be slow and unresponsive for other uses until it finishes (in 6-8 minutes) the climos. Experiment a bit and choose the parallelization option that works best for you.
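
The rule of thumb above (12 processes times an overhead factor of four times the monthly file size) can be turned into a quick first guess at which '-p' option to try. This is a minimal sketch under the assumptions stated in the text; the function names climo_ram_gb and suggest_par_typ are made up for this example, and sizes are rounded to whole GB.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (NOT part of climo_nco.sh).
# Assumptions from the text: 12 parallel processes, overhead factor of four.

# RAM (GB) requested by the parallel modes for a monthly file of fl_sz_gb GB
climo_ram_gb() {
    local fl_sz_gb=$1
    echo $((12 * 4 * fl_sz_gb))
}

# Conservative first guess for '-p': serial if available RAM (+swap) is
# below the requested amount, otherwise the default background mode
suggest_par_typ() {
    local fl_sz_gb=$1 ram_avl_gb=$2
    if [ "${ram_avl_gb}" -lt "$(climo_ram_gb "${fl_sz_gb}")" ]; then
        echo serial
    else
        echo bck
    fi
}

climo_ram_gb 1          # ne30L30, ~1 GB/month: prints 48
suggest_par_typ 1 16    # 16 GB laptop: prints serial
suggest_par_typ 1 64    # 64 GB node: prints bck
```

The suggestion is deliberately conservative: as noted above, actual physical usage is often lower than the requested amount, so background mode may still work on a machine for which this sketch suggests serial mode.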

...