Use ncclimo if possible. It requires, and comes with, NCO version 4.6.0 or later. Since about early 2018, the preferred way to obtain NCO for E3SM analysis is the E3SM-Unified Conda package, which installs numerous analysis packages in a platform-independent manner and, as importantly, allows you to skip reading the rest of this paragraph. Those who need only NCO, or who wish to avoid Conda, should read on. The newest versions of NCO are installed on rhea/titan.ccs.ornl.gov at ORNL, pileus.ornl.gov (CADES at ORNL), blues/anvil.lcrc.anl.gov at ANL, cori/edison.nersc.gov (NERSC), aims4.llnl.gov (LLNL), cheyenne.ucar.edu (NCAR), and compy.pnl.gov (PNNL). The ncclimo and ncremap scripts should find the latest versions automatically, and do not require any module or path changes. Using other NCO executables (besides the ncclimo and ncremap scripts) from the command line or from your own scripts may require loading modules. This is site-specific and not under my (CZ's) control. At OLCF, for example, "module load gcc" helps to run NCO from the command line or scripts. You might check that the default NCO is recent enough (try "module load nco", then "ncks --version") or use the developers' executables/libraries (in ~zender/[bin,lib] on all machines). Follow these directions on the NCO homepage to install on your own machines/directories. It can be as easy as "apt-get install nco", "dnf install nco", or "conda install -c conda-forge nco", or you can build/install from scratch with "configure;make;make install".
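The "is the default NCO recent enough" check mentioned above can be scripted. The following is a minimal sketch, not part of NCO: the helper name ver_ge is hypothetical, the comparison relies on "sort -V" (available in GNU coreutils), and the parsing assumes that "ncks --version" prints a dotted version number such as 4.9.4 somewhere in its output.

```shell
#!/usr/bin/env bash
# Sketch: check that the NCO found on your PATH is at least a required version.
# Assumption: "ncks --version" prints a dotted version (e.g., 4.9.4) on stdout
# or stderr; adjust the grep pattern if your build reports it differently.

# ver_ge A B: succeed when version A >= version B (hypothetical helper).
# The smaller of the two versions sorts first, so A >= B when B comes out first.
ver_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required=4.6.0  # minimum NCO version that includes ncclimo, per the text above

if command -v ncks >/dev/null 2>&1; then
  # Take the first dotted triple that appears in the version banner
  found=$(ncks --version 2>&1 | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
  if [ -n "$found" ] && ver_ge "$found" "$required"; then
    echo "NCO $found is recent enough"
  else
    echo "NCO ${found:-unknown} is older than $required; try 'module load nco'"
  fi
else
  echo "ncks not found; try 'module load nco' or install NCO"
fi
```

Set required to 4.9.4 instead if your workflow depends on the newer default for the -a option.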

Timeseries Reshaping mode, aka Splitting:

ncclimo will reshape a series of input files into outputs that are continuous timeseries of each variable taken from all input files. Timeseries to be reshaped (split) often come with hard-to-predict names, e.g., because the number of days or months in a file, or the number of timesteps per day or month, may vary. Thus ncclimo in splitter mode requires the user to supply the input filenames; unlike monthly or annual climo generation mode, it will not construct input filenames itself. As of version 4.6.4, ncclimo automatically switches to timeseries reshaping mode if it receives a list of files through stdin (redirected from a file or piped), or as positional arguments (after the last command-line option). If neither of these is done and no caseid is specified, ncclimo also switches to reshaping mode and assumes that all *.nc files in drc_in constitute the input file list. These examples invoke reshaping mode in the four possible ways (choose your poison):

Code Block
drc_in=/scratch2/scratchdirs/golaz/ACME_simulations/20161117.beta0.A_WCYCL1850S.ne30_oEC_ICG.edison/run
drc_out=${DATA}/timeseries # Hypothetical output directory, set to your own
map_fl=${DATA}/maps/map_ne30np4_to_fv129x256_aave.20150901.nc
#1 Read list from file
ls $drc_in/*cam.h0.0[012]??* > input_list
ncclimo --dbg=0 --yr_srt=1 --yr_end=250 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out < input_list
#2 Pipe list to stdin
cd $drc_in
ls *cam.h0.0[012]??* | ncclimo --dbg=0 --yr_srt=1 --yr_end=250 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out
#3 List as positional arguments
ncclimo --var=FSNT,AODVIS --yr_srt=1 --yr_end=250 --map=$map_fl --drc_out=$drc_out $drc_in/*cam.h0.0[012]??*.nc
#4 Ingest entire directory
ncclimo --var=T,Q,RH --yr_srt=1 --yr_end=250 --drc_in=$drc_in --map=$map_fl --drc_out=$drc_out

The output is a collection of per-variable timeseries such as FSNT_YYYYMM_YYYYMM.nc, AODVIS_YYYYMM_YYYYMM.nc, etc. The output is split into segments each containing no more than ypf_max (default 50) years-per-file, e.g., FSNT_000101_005012.nc, FSNT_005101_009912.nc, FSNT_010001_014912.nc, etc. Change the maximum number of years-per-output-file with the --ypf_max=ypf_max option.

One caveat worth noting is that, for technical reasons, non-interactive batch queues cannot unambiguously distinguish filelists provided via stdin from batch queue information provided via stdin. To address this issue, ncclimo supports the --stdin switch to cause the filelist to be read from stdin. This switch is only necessary when running ncclimo in non-interactive batch mode. The switch is permitted though redundant when running ncclimo interactively (i.e., from a terminal shell).

The remainder of the ncclimo documentation refers to climatology-generation mode, not to splitter mode.
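As a sketch of the batch-queue caveat for splitter mode, the fragment below shows what the body of a non-interactive job script might look like. The --stdin and --ypf_max switches are the ones described above. The directory defaults, the 25-year segment length, and the opts variable are assumptions made for illustration, and scheduler directives are omitted because they are site-specific.

```shell
#!/usr/bin/env bash
# Sketch of the body of a non-interactive batch job (scheduler directives omitted).
# drc_in and drc_out are hypothetical placeholders; set them for your own run.
drc_in=${drc_in:-$PWD}
drc_out=${drc_out:-$PWD/timeseries}

# --stdin states explicitly that the file list arrives on stdin.
# --ypf_max=25 caps each output segment at 25 years instead of the default 50.
opts="--stdin --ypf_max=25 --var=FSNT,AODVIS --yr_srt=1 --yr_end=250 --drc_out=$drc_out"

if command -v ncclimo >/dev/null 2>&1; then
  # $opts is left unquoted on purpose so that it splits into separate options
  # (this assumes that drc_out contains no whitespace)
  ls "$drc_in"/*cam.h0.0[012]??* | ncclimo $opts
else
  echo "ncclimo not found; would run: ls $drc_in/*cam.h0.0[012]??* | ncclimo $opts"
fi
```

Run interactively, the same fragment behaves identically, since the switch is redundant but permitted in a terminal shell.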

Climatology generation mode (produce monthly + seasonal + annual climos from monthly input files):

The usual way to use ncclimo is to bring up a terminal window and type:

Code Block
ncclimo         -s start_yr -e end_yr -c run_id -i drc_in -o drc_out # EAM/CAM/CAM-SE
ncclimo -v FSNT -s start_yr -e end_yr -c run_id -i drc_in -o drc_out # EAM subset
ncclimo -m clm2 -s start_yr -e end_yr -c run_id -i drc_in -o drc_out # ELM/ALM/CLM

Each option can be accessed by a handful of long-option synonyms to suit users' tastes. With long options the first example above may be rewritten as

Code Block
ncclimo --start=start_yr --end=end_yr --case=run_id --input=drc_in --output=drc_out

When invoked without options, ncclimo outputs a handy table of all available options, their long-option synonyms, and some examples. The NCO documentation describes the full meaning of all options. A short summary of the most common options is:

-a: type of DJF average. Either -a sdd (default) or -a scd. scd means seasonally continuous December. The first month used will be Dec of the year before the start year you specify with -s. sdd means seasonally discontinuous December. The first month used will be Jan of the specified start year. (Prior to NCO 4.9.4, released in August 2020, the default was scd not sdd.)

-C: Climatology mode. Either "mth" (default, and for monthly input files) or "ann" (for annual input files). 

-c: caseid, i.e., simulation name. For input files like famipc5_ne30_v0.3_00001.cam.h0.1980-01.nc, specify "-c famipc5_ne30_v0.3_00001". The ".cam." and ".h0." bits are added to the filenames internally by default, and can be modified via the "-m mdl_nm" and "-h hst_nm" switches if needed. See comments in ncclimo for documentation. 

-e: end year (example: 2000). With "-a sdd" (the default), the last month used will be Dec of the specified end year. With "-a scd", the last month used will be Nov of the specified end year.

-h: history file volume that separates the model name from the date in the input file name. Default is "h0".  Other common values are "h1" and "h". 

-i: directory containing all netCDF files to be used as input.

-m: model type. Default is "cam". Other options are "clm2", "ocn", "ice", "cism", "cice", "pop".

-o: directory where computed native grid climo files will be placed. Regridded climos will also be placed here unless a separate directory for them is specified with -O (NB: capital "O") 

-O: directory where regridded climo files will be placed.

-s: start year (example: 1980). With "-a sdd" (the default), the first month used will be Jan of the specified start year. With "-a scd", the first month used will be Dec of the year before the start year you specify (example: Dec 1979, to allow for contiguous DJF climos).

-v: variable list, e.g., FSNT,AODVIS,PREC.? (yes, regular expressions work, so this expands to PRECC,PRECL,PRECSC,PRECSL).
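To make the interplay of -a, -c, -m, -h, -s, and -e concrete, the sketch below lists the monthly input files that the option descriptions above imply for CESM-style names. It is a reading of those descriptions, not ncclimo's own code: the function name expected_inputs is hypothetical, the filename pattern caseid.mdl_nm.hst_nm.YYYY-MM.nc is inferred from the example filename given for -c, and years are assumed to be passed without leading zeros.

```shell
#!/usr/bin/env bash
# expected_inputs DJF_MODE CASEID START_YR END_YR [MDL_NM] [HST_NM]
# Print, one per line, the monthly input filenames implied by the options:
#   scd: Dec of the year before START_YR through Nov of END_YR
#   sdd: Jan of START_YR through Dec of END_YR
# Hypothetical helper for illustration; it only prints names, it reads no files.
expected_inputs() {
  local djf=$1 caseid=$2 yr_srt=$3 yr_end=$4 mdl=${5:-cam} hst=${6:-h0}
  local yr=$yr_srt mth=1 last_mth=12
  if [ "$djf" = scd ]; then
    yr=$((yr_srt - 1)); mth=12  # seasonally continuous: start in previous Dec
    last_mth=11                 # and stop in Nov of the end year
  fi
  while [ "$yr" -lt "$yr_end" ] ||
        { [ "$yr" -eq "$yr_end" ] && [ "$mth" -le "$last_mth" ]; }; do
    printf '%s.%s.%s.%04d-%02d.nc\n' "$caseid" "$mdl" "$hst" "$yr" "$mth"
    mth=$((mth + 1))
    if [ "$mth" -gt 12 ]; then mth=1; yr=$((yr + 1)); fi
  done
}

# First and last files for a 1980-2000 climatology in each DJF mode
for mode in sdd scd; do
  echo "$mode:"
  expected_inputs "$mode" famipc5_ne30_v0.3_00001 1980 2000 | sed -n '1p;$p'
done
```

Comparing this list against "ls drc_in" before a long run is a cheap way to catch a missing December file, which matters only in scd mode.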

MPAS O/I considerations:

MPAS ocean and ice models currently have their own (non-CESM'ish) naming convention that guarantees output files have the same names for all simulations. By default ncclimo analyzes the "timeSeriesStatsMonthly" analysis member output (tell CZ if you want options for other AM output). ncclimo recognizes input files as being MPAS-style when invoked with "-m mpaso" or "-m mpascice" like this:

...