...
ncclimo will reshape input files that are a series of snapshots of all model variables into outputs that are continuous timeseries of each individual variable taken from all input files. Timeseries to be reshaped (split) often come with hard-to-predict names, e.g., because the number of days or months in a file, or the number of timesteps per day or month, may vary. Thus ncclimo in splitter mode requires the user to supply the input filenames; it will not construct input filenames itself in splitter mode (unlike in monthly or annual climo generation mode). As of version 5.0.4, ncclimo employs timeseries reshaping mode if it receives the --split switch or the --ypf_max option described below. In addition, it must receive a list of files through a pipe to stdin or, alternatively, as positional arguments (after the last command-line option). If neither of these is done and no caseid is specified, ncclimo assumes that all *.nc files in drc_in constitute the input file list. These examples invoke reshaping mode in the four possible ways (choose your poison):
...
# Sample Abbreviations
drc_in=~zender/data/ne30/raw
map_fl=${DATA}/maps/map_ne30/map_ne30pg2_to_cmip6_180x360_nco.20200901.nc
# Splitter Input Mode #1: Read input filename list from file
ls $drc_in/*eam.h0.201[34]*.nc > input_list
ncclimo --split --yr_srt=2013 --yr_end=2014 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out < input_list
# Splitter Input Mode #2: Pipe input filenames to stdin
cd $drc_in
ls $drc_in/*eam.h0.201[34]*.nc | ncclimo --split --yr_srt=2013 --yr_end=2014 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out
# Splitter Input Mode #3: Append filenames as positional arguments
ncclimo --split --var=FSNT,AODVIS --yr_srt=2013 --yr_end=2014 --map=$map_fl --drc_out=$drc_out $drc_in/*cam.h0.0[012]??*.nc
# Splitter Input Mode #4: Ingest entire directory (be sure the directory contains only files to be climatologized!)
ncclimo --split --var=T,Q,RH --yr_srt=2013 --yr_end=2014 --drc_in=$drc_in --map=$map_fl --drc_out=$drc_out
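Because splitter mode relies entirely on the file list it is given, it can help to build and check that list before handing it to ncclimo. The following is a minimal sketch of one way to do so, not something ncclimo provides: the function name `make_input_list`, the temporary directory, and the placeholder filenames are all hypothetical and exist only so the example runs anywhere.

```shell
# Hypothetical helper (not part of NCO): write a sorted list of the files in
# directory $1 that match glob pattern $2 to list file $3; fail if none match.
make_input_list() {
    # find + sort gives a deterministic order without parsing ls output
    find "$1" -maxdepth 1 -type f -name "$2" | sort > "$3"
    if [ ! -s "$3" ]; then
        echo "ERROR: no files match $2 in $1" >&2
        return 1
    fi
}

# Demonstration with empty placeholder files (assumed names, illustration only)
drc_tmp=$(mktemp -d)
touch "$drc_tmp/case.eam.h0.2013-01.nc" \
      "$drc_tmp/case.eam.h0.2014-01.nc" \
      "$drc_tmp/case.eam.h0.2015-01.nc"

make_input_list "$drc_tmp" '*eam.h0.201[34]*.nc' "$drc_tmp/input_list"
fl_nbr=$(wc -l < "$drc_tmp/input_list" | tr -d ' ')
echo "Found $fl_nbr input files"

# The checked list can then be supplied as in Splitter Input Mode #1, e.g.:
# ncclimo --split --yr_srt=2013 --yr_end=2014 --var=FSNT,AODVIS --drc_out=$drc_out < "$drc_tmp/input_list"
```

Failing early on an empty list avoids launching ncclimo with no input, which in splitter mode could otherwise fall through to the ingest-entire-directory behavior.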
The output is a collection of per-variable timeseries such as FSNT_YYYYMM_YYYYMM.nc, AODVIS_YYYYMM_YYYYMM.nc, etc. The output is split into segments each containing no more than ypf_max (default 50) years-per-file, e.g., FSNT_000101_005012.nc, FSNT_005101_009912.nc, FSNT_010001_014912.nc, etc. Change the maximum number of years-per-output-file with the --ypf_max=ypf_max option.
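Since no output file holds more than ypf_max years, each variable needs at least the number of years divided by ypf_max, rounded up, output files. The sketch below illustrates only that arithmetic; the function name `sgm_nbr_min` is hypothetical, and the exact segment boundaries are chosen by ncclimo itself.

```shell
# Hypothetical helper: smallest number of files per variable when no file
# may hold more than ypf_max years (third argument, default 50).
# Pass years without leading zeros: shell arithmetic reads 0100 as octal.
sgm_nbr_min() {
    yr_nbr=$(( $2 - $1 + 1 ))            # inclusive year count
    ypf=${3:-50}                         # years-per-file limit
    echo $(( (yr_nbr + ypf - 1) / ypf )) # integer ceiling of yr_nbr/ypf
}

sgm_nbr_min 1 149        # years 1-149 with the default limit
sgm_nbr_min 2013 2014 1  # two years, one year per file
```

The two calls print 3 and 2 respectively, so a 149-year run with the default limit yields at least three files per variable.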
...
A common task for ncclimo is to produce climatological monthly, seasonal, and annual means from an interannual series of monthly-mean input files with commands like these:
...
...
ncclimo -P eam -s $yr_srt -e $yr_end -c $caseid -i $drc_in -o $drc_out # EAM/CAM/CAM-SE
ncclimo -P eam -v FSNT -s $yr_srt -e $yr_end -c $caseid -i $drc_in -o $drc_out # EAM subset
ncclimo -P elm -s $yr_srt -e $yr_end -c $caseid -i $drc_in -o $drc_out # ELM/ALM/CLM
Each option can be accessed by a handful of long-option synonyms to suit users' tastes. With long options the first example above may be rewritten as
...
ncclimo --prc_typ=eam --start=$yr_srt --end=$yr_end --case=$caseid --input=$drc_in --output=$drc_out
Note that -P eam above is not necessary since the default processing type is EAM. However, it is a good habit to specify the component model (if any) to ncclimo since ncremap may require this information in the regridding step. When invoked without options, ncclimo outputs a handy table of all available options, their long-option synonyms, and some examples. The NCO documentation describes the full meaning of all options. The most common options are:
...
MPAS ocean and ice models have their own (non-CESM'ish) naming convention that guarantees output files have the same names for all simulations. By default ncclimo analyzes the timeSeriesStatsMonthly analysis member (AM) output (tell Charlie Zender if you want options for other AM output). ncclimo recognizes input files as being MPAS-style when invoked with -m mpaso or -m mpasseaice (or synonyms) like this:
...
ncclimo -P mpasocean -s 1980 -e 1983 -i $drc_in -o $drc_out
ncclimo -P mpasseaice -s 1980 -e 1983 -i $drc_in -o $drc_out
Some data are best evaluated with custom-defined seasons, e.g., JFM instead of DJF, or two-month seasons such as FM or ON. ncclimo supports up to eleven (and counting) seasons, although by default it only computes MAM, JJA, SON, and DJF. As of NCO 4.6.8, use the --seasons (or --csn) option to specify additional or alternate seasons:
...
ncclimo -P mpasseaice --seasons=jfm,jas,ann -s 1980 -e 1983 -i $drc_in -o $drc_out
The climatological annual mean, ANN, is also computed automatically when MAM, JJA, SON, and DJF are all requested (which is the default, so ANN is always computed by default). Use --seasons=none to completely turn off seasonal and annual-mean climatologies.
MPAS climos are unaware of missing values until/unless the input files are "fixed". We recommend that the person who produces the simulation annotate all floating point variables with the appropriate _FillValue prior to invoking ncclimo. Run something like this once in the history file directory:
...
# Overwrite the _FillValue attribute of every floating point variable in each history file
for fl in hist.*; do
  ncatted -O -t -a _FillValue,,o,d,-9.99999979021476795361e+33 "${fl}"
done
If/when MPAS generates the _FillValue attributes itself, this step can and should be skipped (MPAS developers: please let Charlie Zender know when MPAS “fixes” this “feature”). All other ncclimo features like regridding (below) are invoked identically for MPAS as for EAM/ELM users, although under the hood ncremap (if invoked) specially pre-processes (dimension permutation, metadata annotation) MPAS data.
...
As of NCO 4.9.4 (September, 2020), ncclimo can produce climatologies that retain the diurnal cycle resolution provided by the input data. These “high frequency climos” are useful for characterizing the diurnal cycle of processes typically retained in EAM/ELM h1-h4 history files, high-frequency observational analyses (e.g., MERRA2, ERA5), and similar data. In all respects except two, high frequency climo features are invoked and controlled by the same options as traditional climo generation from monthly mean input. The most significant difference is that the user must supply the filenames of high-frequency input data via any of the four methods outlined above for splitting. High-frequency climo input dataset names are too complex for ncclimo to automatically generate (as it does for monthly-mean input), so one must supply the names via standard input, positional arguments, filename globbing, or directory location, exactly as for splitter mode described above. The second difference is that the user must supply the --clm_md=hfc option to tell ncclimo to operate in climo-generation rather than splitter mode:
...
ncclimo -P eam --clm_md=hfc --yr_srt=1 --yr_end=250 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out < input_list
ls cam.h0.0[012]?? | ncclimo -P eam --clm_md=hfc --yr_srt=1 --yr_end=250 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out
ncclimo -P eam --clm_md=hfc --var=FSNT,AODVIS --yr_srt=1 --yr_end=250 --map=$map_fl --drc_out=$drc_out $drc_in/*eam.h4.0[012]??*.nc
ncclimo -P eam --clm_md=hfc --var=T,Q,RH --yr_srt=1 --yr_end=250 --drc_in=$drc_in --map=$map_fl --drc_out=$drc_out
In high-frequency mode, ncclimo automatically determines the number of timesteps per day (which must be an integer >= 1). In high-frequency mode the --caseid option is optional since the user provides all the input filenames. If provided, caseid is used to rename the output filenames (similar to the --fml_nm option).
...
The above commands perform a climatology without regridding, then with regridding (all climos stored in ${drc_out}), then with regridding and storing regridded files separately (in ${drc_rgr}). Paths specified by $drc_in, $drc_out, and $drc_rgr may be relative or absolute. An alternative to regridding during climatology generation is to manually regrid afterwards with ncremap, which has more specialized features built-in for regridding. To use ncremap to regrid a climatology in $drc_out and place the results in $drc_rgr, use something like
...
ncremap --map=map.nc -I $drc_out -O $drc_rgr
ls $drc_out/*climo* | ncremap --map=map.nc -O $drc_rgr
As of 20170526 and version 4.6.7, ncremap supports sub-gridscale (SGS) regridding. Though designed for ELM and MPAS-Seaice, this feature is configurable for other SGS datasets as well. In sub-grid mode, ncremap performs substantial pre- and post-processing to ensure that regridding conserves fields that may represent only a fraction of the entire gridcell. The sub-gridscale fraction represented by each field is contained in a separate variable (set with the option --sgs_frc, which defaults to landfrac). SGS mode eases regridding of datasets (e.g., from ELM, CLM, CICE, MPAS-Seaice) that output data normalized to a gridcell fraction rather than to its entire extent. SGS mode automatically derives new binary masks (--sgs_msk, defaults to landmask) and allows for additional normalization (--sgs_nrm). Specific flavors of SGS can be selected (with -P elm, -P clm, -P cice, or -P mpasseaice). These ensure regridded datasets recreate the idiosyncratic units (e.g., %, km2) employed by raw ELM, CLM, CICE, and MPAS-Seaice model output.
...
ncclimo works on all E3SM component models, including the coupler. It can simultaneously generate climatologies for a coupled run, where climatologies mean both native and regridded monthly, seasonal, and annual averages. Here are template commands to fully climatologize and regrid a coupled simulation:
...
caseid=v2.LR.historical_0101
drc_in=/lcrc/group/e3sm/ac.forsyth2/E3SMv2/v2.LR.historical_0101/archive
map_atm=${DATA}/maps/map_ne30pg2_to_cmip6_180x360_nco.20200901.nc
map_lnd=$map_atm
map_ocn=${DATA}/maps/map_EC30to60E2r2_to_cmip6_180x360_aave.20220301.nc
map_ice=$map_ocn
ncclimo -P eam -p mpi -c $caseid -s 2 -e 5 -i $drc_in/atm/hist -r $map_atm -o ${DATA}/e3sm/atm
ncclimo -P elm -c $caseid -s 2 -e 5 -i $drc_in/lnd/hist -r $map_lnd -o ${DATA}/e3sm/lnd
ncclimo -P mpasocean -p mpi -s 2 -e 5 -i $drc_in/ocn/hist -r $map_ocn -o ${DATA}/e3sm/ocn
ncclimo -P mpasseaice -s 2 -e 5 -i $drc_in/ice/hist -r $map_ice -o ${DATA}/e3sm/ice
The atmosphere and ocean model output is significantly larger than the land and ice model output. These commands account for that by using different parallelization strategies, which may be required depending on the RAM fatness of the analysis nodes, as explained below. MPAS models do not utilize the $caseid option; they use their own naming convention. By default, ncclimo processes the MPAS hist.am.timeSeriesStatsMonthly analysis members.
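The per-component differences described here (MPI parallelism for the larger atmosphere and ocean output, no caseid for MPAS components) can be captured in a small wrapper. The following is only a sketch that prints the commands rather than running them; the function name `print_cmd`, the case name, the paths, and the map filenames are placeholders invented for illustration, and only options that appear in the template commands are used.

```shell
# Placeholder settings (assumptions for illustration, not real paths)
caseid=mycase
drc_in=/path/to/archive
drc_out=/path/to/climo
yr_srt=2
yr_end=5

# Hypothetical helper: print (do not run) the ncclimo command for one component.
# $1 = archive subdirectory (atm, lnd, ocn, ice), $2 = -P value, $3 = map file
print_cmd() {
    cmd="ncclimo -P $2"
    case "$1" in
        atm|ocn) cmd="$cmd -p mpi" ;;    # larger output: MPI mode, as in the template commands
    esac
    case "$2" in
        eam|elm) cmd="$cmd -c $caseid" ;; # MPAS components do not use caseid
    esac
    echo "$cmd -s $yr_srt -e $yr_end -i $drc_in/$1/hist -r $3 -o $drc_out/$1"
}

print_cmd atm eam map_atm.nc
print_cmd lnd elm map_atm.nc
print_cmd ocn mpasocean map_ocn.nc
print_cmd ice mpasseaice map_ocn.nc
```

Printing the commands first makes it easy to inspect them before pasting them into a batch script or piping them to a shell.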
...
caseid=somelongname
drc_in=/scratch1/scratchdirs/zender/acmee3sm/${caseid}/atm
ncclimo -c ${caseid} -m cam -S 10 -E 20 -s 21 -e 50 -i ${drc_in}
When no output directory is specified, ncclimo internal logic automatically places the extended climo in the input climo directory. Files are not overwritten because the extended climos have different names than the input climos. The next example utilizes the directory structure and options that Chris Golaz adopted for coupled ACME/E3SM simulations. The extra options (compared to the idealized example above) supply important information. The input climos were generated in seasonally discontiguous December (sdd) mode so the extended climatology must also be generated with the '-a sdd' option (or else ncclimo will not find the pre-computed input files). The input directory for the first pre-computed input climatology is specified with -x. The second pre-computed input climatology is specified with the usual -i option. A new output directory for the extended climos is specified with -X.
...