...

In climatology generation mode, the NCO operator ncclimo ingests "raw" data consisting of interannual sets of files, each containing sub-daily (diurnal), daily, monthly, or yearly averages, and from these produces climatological daily, monthly, seasonal, and/or annual means. Alternatively, in timeseries reshaping (aka “splitter”) mode, ncclimo will subset and temporally split the input timeseries into per-variable files spanning the entire period. ncclimo will optionally regrid (by calling ncremap) all output files in either mode. The primary ncremap documentation is here. This presentation, given at the Albuquerque workshop on 20151104, conveys much of the information presented below, and some newer information, in a more graphical format. 

Prerequisites

ncclimo requires and comes with NCO version 4.6.0 and later. Since early 2018, the preferred way to obtain NCO for E3SM analysis is with the E3SM-Unified Conda package, which installs numerous analysis packages in a platform-independent manner and, as importantly, allows you to skip reading the rest of this paragraph. Those who need only NCO, or who wish to avoid Conda, should read on. The newest versions of NCO are installed on all major DOE supercomputers in C. Zender’s home directory (usually ~zender/[bin,lib]), and semi-recent versions are sometimes available as machine modules (e.g., module load nco). Module availability is site-specific and not under my (CZ's) control. Follow these directions on the NCO homepage to install on your own machines/directories. It can be as easy as apt-get install nco, dnf install nco, or conda install -c conda-forge nco, or you can build/install from scratch with ./configure; make; make install.
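
To confirm that the NCO on your path meets the minimum version, a comparison along the following lines may help. This is only a minimal sketch: the helper name version_ge is hypothetical, it relies on the version-sort option of sort (sort -V, as in GNU coreutils), and it assumes that the first dotted number printed by ncks --version is the NCO version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 >= version $2.
# Relies on 'sort -V' (version sort), as provided by GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

nco_min='4.6.0' # Minimum NCO version that ships ncclimo (from the text above)

if command -v ncks >/dev/null 2>&1; then
    # Assumption: the first dotted number in the version banner is the NCO version
    nco_vrs=$(ncks --version 2>&1 | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
    if version_ge "${nco_vrs:-0.0.0}" "${nco_min}"; then
        echo "NCO ${nco_vrs} is recent enough for ncclimo"
    else
        echo "NCO ${nco_vrs:-unknown} is older than ${nco_min}"
    fi
else
    echo "ncks not found on PATH; install NCO first"
fi
```

The same helper can be reused for the other version thresholds mentioned on this page, e.g., version_ge "${nco_vrs}" 4.9.4 before attempting high-frequency splitting.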

...

Timeseries Reshaping mode, aka Splitting

ncclimo will reshape input files that are a series of snapshots of all model variables into outputs that are continuous timeseries of each individual variable taken from all input files. Timeseries to be reshaped (split) often come with hard-to-predict names, e.g., because the number of days or months in a file, or the number of timesteps per day or month, may vary. Thus ncclimo in splitter mode requires the user to supply the input filenames. ncclimo will not construct input filenames itself in splitter mode (unlike monthly or annual climo generation mode). As of version 5.0.4, ncclimo employs timeseries reshaping mode if it receives the --split switch or the --ypf_max option described below. In addition, it must receive the list of input files in one of the following ways: redirected from a file to stdin, piped to stdin, or placed as positional arguments (after the last command-line option). If none of these is done and no caseid is specified, ncclimo assumes that all *.nc files in drc_in constitute the input file list. These examples invoke reshaping mode in the four possible ways (choose your poison):

Code Block
# Abbreviations
# Assumes ${drc_out} is already set to a writable output directory
drc_in=~zender/data/ne30/raw
map_fl=${DATA}/maps/map_ne30/map_ne30pg2_to_cmip6_180x360_nco.20200901.nc
# Splitter Input Mode #1: Read input filename list from file
ls $drc_in/*eam.h0.201[01234]??*.nc > input_list
ncclimo --split --yr_srt=2013 --yr_end=2014 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out < input_list
# Splitter Input Mode #2: Pipe input filenames to stdin
cd $drc_in
ls $drc_in/*eam.h0.201[01234]??*.nc | ncclimo --split --yr_srt=2013 --yr_end=2014 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out
# Splitter Input Mode #3: Append filenames as positional arguments
ncclimo --split --var=FSNT,AODVIS --yr_srt=2013 --yr_end=2014 --map=$map_fl --drc_out=$drc_out $drc_in/*eam.h0.201[01234]??*.nc
# Splitter Input Mode #4: Ingest entire directory (be sure the directory contains only files to be split!)
ncclimo --split --var=T,Q,RH --yr_srt=2013 --yr_end=2014 --drc_in=$drc_in --map=$map_fl --drc_out=$drc_out
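
Because splitter mode relies entirely on the file list it is given, it can be worth checking what a glob actually selects before invoking ncclimo. The following self-contained sketch creates empty placeholder files in a temporary directory and counts the matches. The case name run and the file-naming pattern are hypothetical stand-ins rather than the names of any real simulation, and ncclimo itself is not called.

```shell
#!/usr/bin/env bash
# Self-contained demo: empty placeholder files stand in for model output.
# The case name 'run' and the file names are hypothetical.
drc_tmp=$(mktemp -d)
for yr in 2012 2013 2014; do
    for mth in 01 02; do
        : > "${drc_tmp}/run.eam.h0.${yr}-${mth}.nc"
    done
done
: > "${drc_tmp}/run.eam.h1.2013-01-01-00000.nc" # Different stream, must not be selected

# Build the list with the same kind of glob used for input modes #1 and #2
ls "${drc_tmp}"/*eam.h0.201[34]??*.nc > "${drc_tmp}/input_list"

# Count the selected files (tr strips the padding some wc versions add)
fl_nbr=$(wc -l < "${drc_tmp}/input_list" | tr -d ' ')
echo "Glob selected ${fl_nbr} files"

# Guard: an empty list usually means the glob is wrong
if [ "${fl_nbr}" -eq 0 ]; then
    echo "No input files matched; fix the glob before calling ncclimo" >&2
fi

rm -rf "${drc_tmp}" # Remove the placeholder files
```

Here the 2012 files and the h1 file are excluded, so only the four h0 files for 2013 and 2014 reach the list.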

...

ncclimo can (as of NCO 4.9.4) reshape timeseries with temporal resolution finer than one month, aka high-frequency timeseries. For E3SM, this typically means timeseries with daily or finer (e.g., hourly) resolution, such as is often output in EAM/ELM h1-h4 datasets. EAM/ELM output these datasets with a fixed number of records (i.e., timesteps) per file, for example, fifteen daily timesteps or 24 hourly timesteps per file. A primary difficulty in processing such datasets is that their file boundaries often do not coincide with the desired analysis interval, which might start and end on even boundaries of a month or year. Aligning timeseries to even month or year boundaries requires extra processing logic, which users must invoke by setting the climatology mode option to high-frequency splitting (hfs), i.e., --clm_md=hfs:

cd $drc_in;ls *.eam.h1.000?? > ~/input_list
ncclimo --clm_md=hfs --var=PRECT --ypf=1 --yr_srt=1 --yr_end=3 --map=map.nc --drc_out=${drc_out} < ~/input_list

The output of the above would be three files, each containing the values of PRECT for exactly one year, no matter what the time resolution or the file boundaries of the input. Omitting the --clm_md=hfs option for high-frequency timeseries would result in output segments not evenly aligned on year boundaries.
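
The relationship between the year range, the years-per-file setting, and the number of aligned output segments can be sketched with simple arithmetic. The helper segment_years below is hypothetical and is not how ncclimo names or computes its outputs. It only illustrates the bookkeeping, under the assumption that a final partial segment is truncated at the last year.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the year range covered by each output segment.
# Arguments: first year, last year, years per file (all plain decimal integers)
segment_years() {
    local yr_srt=$1 yr_end=$2 ypf=$3
    local yr_bgn yr_lst
    for (( yr_bgn = yr_srt; yr_bgn <= yr_end; yr_bgn += ypf )); do
        yr_lst=$(( yr_bgn + ypf - 1 ))
        # Assumption: the last segment is truncated at yr_end
        if (( yr_lst > yr_end )); then yr_lst=${yr_end}; fi
        printf '%04d-%04d\n' "${yr_bgn}" "${yr_lst}"
    done
}

# Years 1 through 3 with one year per file yield three one-year segments
segment_years 1 3 1
```

Calling segment_years 1 10 4 under the same assumption would list two four-year segments followed by a two-year remainder.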

Climatology generation mode (produce monthly, seasonal, and annual climatologies from monthly-mean input data)

A common task for ncclimo is to produce climatological monthly, seasonal, and annual means from an interannual series of monthly-mean input files with commands like these:

Code Block
ncclimo         -s yr_srt -e yr_end -c run_id -i drc_in -o drc_out # EAM/CAM/CAM-SE
ncclimo -v FSNT -s yr_srt -e yr_end -c run_id -i drc_in -o drc_out # EAM subset
ncclimo -m clm2 -s yr_srt -e yr_end -c run_id -i drc_in -o drc_out # ELM/ALM/CLM

...

Code Block
ncclimo --start=yr_srt --end=yr_end --case=run_id --input=drc_in --output=drc_out

...