...

ncclimo will reshape input files that are a series of snapshots of all model variables into outputs that are continuous timeseries of each individual variable taken from all input files. Timeseries to be reshaped (split) often come with hard-to-predict names, e.g., because the number of days or months in a file, or the number of timesteps per day or month, may vary. Thus ncclimo in splitter mode requires the user to supply the input filenames; unlike monthly or annual climo generation mode, it will not construct input filenames itself. As of version 5.0.4, ncclimo employs timeseries reshaping mode if it receives the --split switch or the --ypf_max option described below. In addition, it must receive the list of input files in one of four ways: redirected from a file to stdin, piped to stdin, supplied as positional arguments (after the last command-line option), or, if none of these is done and no caseid is specified, inferred from drc_in, in which case ncclimo assumes that all *.nc files in drc_in constitute the input file list. These examples invoke reshaping mode in the four possible ways (choose your poison):

# Sample Abbreviations
drc_in=~zender/data/ne30/raw
map_fl=${DATA}/maps/map_ne30/map_ne30pg2_to_cmip6_180x360_nco.20200901.nc
# Splitter Input Mode #1: Read input filename list from file
ls $drc_in/*eam.h0.201[34]*.nc > input_list
ncclimo --split --yr_srt=2013 --yr_end=2014 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out < input_list
# Splitter Input Mode #2: Pipe input filenames to stdin
cd $drc_in
ls $drc_in/*eam.h0.201[34]*.nc | ncclimo --split --yr_srt=2013 --yr_end=2014 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out
# Splitter Input Mode #3: Append filenames positional arguments
ncclimo --split --var=FSNT,AODVIS --yr_srt=2013 --yr_end=2014 --map=$map_fl --drc_out=$drc_out $drc_in/*cam.h0.0[012]??*.nc
# Splitter Input Mode #4: Ingest entire directory (be sure the directory contains only files to be climatologized!)
ncclimo --split --var=T,Q,RH --yr_srt=2013 --yr_end=2014 --drc_in=$drc_in --map=$map_fl --drc_out=$drc_out
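
Because splitter mode never constructs input filenames itself, it can help to check the file list before handing it to ncclimo. The following is a minimal sketch, not part of ncclimo: the helper name make_input_list and the dummy filenames are hypothetical, the scratch directory exists only for the demonstration, and the final ncclimo command is printed rather than run.

```shell
# Hypothetical helper: print the sorted list of files in a directory whose
# names match a glob pattern, one per line; fail if nothing matches.
make_input_list() {
    dir=$1
    pattern=$2
    # find avoids "argument list too long" errors in very large directories
    list=$(find "$dir" -maxdepth 1 -type f -name "$pattern" | sort)
    if [ -z "$list" ]; then
        echo "no files match $pattern in $dir" >&2
        return 1
    fi
    printf '%s\n' "$list"
}

# Demonstration on dummy (empty) files in a scratch directory
demo_dir=$(mktemp -d)
touch "$demo_dir/case.eam.h0.2013-01.nc" \
      "$demo_dir/case.eam.h0.2014-01.nc" \
      "$demo_dir/case.eam.h0.2015-01.nc"
make_input_list "$demo_dir" '*eam.h0.201[34]*.nc' > "$demo_dir/input_list"
n_files=$(wc -l < "$demo_dir/input_list" | tr -d ' ')
echo "found $n_files input files"

# Print (do not run) the splitter command that would consume the list
echo "ncclimo --split --yr_srt=2013 --yr_end=2014 --var=FSNT,AODVIS --drc_out=\$drc_out < $demo_dir/input_list"
```

Failing early on an empty list avoids launching ncclimo with no input, which is easy to do by accident when a glob pattern does not match the actual history-file names.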

The output is a collection of per-variable timeseries such as FSNT_YYYYMM_YYYYMM.nc, AODVIS_YYYYMM_YYYYMM.nc, etc. The output is split into segments, each containing no more than ypf_max (default 50) years per file, e.g., FSNT_000101_005012.nc, FSNT_005101_009912.nc, FSNT_010001_014912.nc, etc. Change the maximum number of years per output file with the --ypf_max=ypf_max option.
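
To check how many years a given output segment covers, the VAR_YYYYMM_YYYYMM.nc naming pattern can be parsed directly. This is a minimal sketch under stated assumptions: the function names strip_zeros and years_in_segment are hypothetical, and the year count assumes each segment starts in month 01 and ends in month 12, as in the filenames quoted above.

```shell
# Strip leading zeros so that years such as 0050 are never read as octal
strip_zeros() {
    n=$(printf '%s' "$1" | sed 's/^0*//')
    echo "${n:-0}"
}

# Print the number of calendar years spanned by a VAR_YYYYMM_YYYYMM.nc name.
# Only the last two underscore-separated fields are used, so variable
# names that themselves contain underscores are handled.
years_in_segment() {
    base=${1%.nc}          # drop the .nc suffix
    end=${base##*_}        # trailing YYYYMM
    rest=${base%_*}
    start=${rest##*_}      # leading YYYYMM
    y0=$(strip_zeros "${start%??}")   # drop the two month digits
    y1=$(strip_zeros "${end%??}")
    echo $((y1 - y0 + 1))
}

for fl in FSNT_000101_005012.nc FSNT_005101_009912.nc FSNT_010001_014912.nc; do
    echo "$fl spans $(years_in_segment "$fl") years"
done
```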

...

A common task for ncclimo is to produce climatological monthly, seasonal, and annual means from an interannual series of monthly-mean input files with commands like these:

ncclimo -P eam -s $yr_srt -e $yr_end -c $caseid -i $drc_in -o $drc_out # EAM/CAM/CAM-SE
ncclimo -P eam -v FSNT -s $yr_srt -e $yr_end -c $caseid -i $drc_in -o $drc_out # EAM subset
ncclimo -P elm -s $yr_srt -e $yr_end -c $caseid -i $drc_in -o $drc_out # ELM/ALM/CLM

Each option can be accessed by a handful of long-option synonyms to suit users' tastes. With long options the first example above may be rewritten as

ncclimo --prc_typ=eam --start=$yr_srt --end=$yr_end --case=$caseid --input=$drc_in --output=$drc_out

Note that -P eam above is not necessary, since the default processing type is EAM. However, it is a good habit to specify the component model (if any) to ncclimo since ncremap may require this information in the regridding step. When invoked without options, ncclimo outputs a handy table of all available options, their long-option synonyms, and some examples. The NCO documentation describes the full meaning of all options. The most common options are:

...

MPAS ocean and ice models have their own (non-CESM'ish) naming convention that guarantees output files have the same names for all simulations. By default ncclimo analyzes the timeSeriesStatsMonthly analysis member (AM) output (tell Charlie Zender if you want options for other AM output). ncclimo recognizes input files as being MPAS-style when invoked with -P mpasocean or -P mpasseaice (or synonyms) like this:

ncclimo -P mpasocean -s 1980 -e 1983 -i $drc_in -o $drc_out
ncclimo -P mpasseaice -s 1980 -e 1983 -i $drc_in -o $drc_out

Some data are best evaluated with custom-defined seasons, e.g., JFM instead of DJF, or two-month seasons such as FM or ON. ncclimo supports up to eleven (and counting) seasons, although by default it only computes MAM, JJA, SON, and DJF. As of NCO 4.6.8, use the --seasons (or --csn) option to specify additional or alternate seasons:

ncclimo -P mpasseaice --seasons=jfm,jas,ann -s 1980 -e 1983 -i $drc_in -o $drc_out

The climatological annual mean, ANN, is also computed automatically when MAM, JJA, SON, and DJF are all requested (which is the default, so ANN is always computed by default). Use --seasons=none to completely turn off seasonal and annual-mean climatologies.

MPAS climos are unaware of missing values until/unless the input files are "fixed". We recommend that the person who produces the simulation annotate all floating point variables with the appropriate _FillValue prior to invoking ncclimo. Run something like this once in the history file directory:

for fl in `ls hist.*` ; do
  ncatted -O -t -a _FillValue,,o,d,-9.99999979021476795361e+33 ${fl}
done
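
A slightly more defensive variant of the loop above iterates over the shell glob directly instead of parsing ls output, and skips cleanly when nothing matches. This is a sketch only: the function name annotate_fill_value and the DRY_RUN switch are hypothetical conveniences, and by default the ncatted commands are printed rather than executed.

```shell
# Set DRY_RUN=0 to actually call ncatted (assumed to be on PATH);
# the default only prints the commands that would be run.
DRY_RUN=${DRY_RUN:-1}
fill_value=-9.99999979021476795361e+33

annotate_fill_value() {
    for fl in "$@"; do
        # An unmatched glob is passed through literally; skip it
        [ -e "$fl" ] || continue
        if [ "$DRY_RUN" = 1 ]; then
            echo "ncatted -O -t -a _FillValue,,o,d,$fill_value $fl"
        else
            ncatted -O -t -a "_FillValue,,o,d,$fill_value" "$fl"
        fi
    done
}

# Run once in the history file directory
annotate_fill_value hist.*
```

Quoting "$fl" keeps the loop correct for filenames that contain spaces, which the backtick form does not handle.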

If/when MPAS generates the _FillValue attributes itself, this step can and should be skipped (MPAS developers: please let Charlie Zender know when MPAS “fixes” this “feature”). All other ncclimo features like regridding (below) are invoked identically for MPAS as for EAM/ELM, although under the hood ncremap (if invoked) specially pre-processes MPAS data (dimension permutation, metadata annotation).

...

As of NCO 4.9.4 (September, 2020), ncclimo can produce climatologies that retain the diurnal cycle resolution provided by the input data. These “high frequency climos” are useful for characterizing the diurnal cycle of processes typically retained in EAM/ELM h1-h4 history files, high-frequency observational analyses (e.g., MERRA2, ERA5), and similar data. In all respects except two, high frequency climo features are invoked and controlled by the same options as traditional climo generation from monthly mean input. The most significant difference is that the user must supply the filenames of high-frequency input data via any of the four methods outlined above for splitting. High-frequency climo input dataset names are too complex for ncclimo to generate automatically (as it does for monthly-mean input), so one must supply the names via standard input, positional arguments, filename globbing, or directory location, exactly as for splitter mode described above. The second difference is that the user must supply the --clm_md=hfc option to tell ncclimo to operate in climo-generation rather than splitter mode:

ncclimo -P eam --clm_md=hfc --yr_srt=1 --yr_end=250 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out < input_list
ls cam.h0.0[012]?? | ncclimo -P eam --clm_md=hfc --yr_srt=1 --yr_end=250 --var=FSNT,AODVIS --map=$map_fl --drc_out=$drc_out
ncclimo -P eam --clm_md=hfc --var=FSNT,AODVIS --yr_srt=1 --yr_end=250 --map=$map_fl --drc_out=$drc_out $drc_in/*eam.h4.0[012]??*.nc
ncclimo -P eam --clm_md=hfc --var=T,Q,RH --yr_srt=1 --yr_end=250 --drc_in=$drc_in --map=$map_fl --drc_out=$drc_out

In high-frequency mode, ncclimo automatically determines the number of timesteps per day (which must be an integer >= 1). The --caseid option is optional in this mode since the user provides all the input filenames. If provided, caseid is used to rename the output files (similar to the --fml_nm option).
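
The integer requirement can be sanity-checked by hand before running a long job. The sketch below is a hypothetical pre-check, not how ncclimo determines the value internally: the function name steps_per_day is invented here, and the record and day counts are assumed to be known to the user (e.g., from inspecting the time dimension of one input file).

```shell
# Print timesteps per day given the number of time records in a file and
# the number of whole days the file covers; fail unless the ratio is an
# integer >= 1.
steps_per_day() {
    n_rec=$1
    n_day=$2
    if [ "$n_day" -lt 1 ] || [ "$n_rec" -lt "$n_day" ] || [ $((n_rec % n_day)) -ne 0 ]; then
        echo "records ($n_rec) are not a whole multiple (>= 1) of days ($n_day)" >&2
        return 1
    fi
    echo $((n_rec / n_day))
}

# Example: a 31-day month of 3-hourly records holds 248 time records
steps_per_day 248 31
```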

...

The above commands perform a climatology without regridding, then with regridding (all climos stored in ${drc_out}), then with regridding and storing regridded files separately (in ${drc_rgr}). Paths specified by $drc_in, $drc_out, and $drc_rgr may be relative or absolute. An alternative to regridding during climatology generation is to manually regrid afterwards with ncremap, which has more specialized features built-in for regridding. To use ncremap to regrid a climatology in $drc_out and place the results in $drc_rgr, use something like

ncremap --map=map.nc -I $drc_out -O $drc_rgr
ls $drc_out/*climo* | ncremap --map=map.nc -O $drc_rgr

As of 20170526 and version 4.6.7, ncremap supports sub-gridscale (SGS) regridding. Though designed for ELM and MPAS-Seaice, this feature is configurable for other SGS datasets as well. In sub-grid mode, ncremap performs substantial pre- and post-processing to ensure that regridding conserves fields that may represent only a fraction of the entire gridcell. The sub-gridscale fraction represented by each field is contained in a separate variable (settable with the option --sgs_frc, defaults to landfrac). SGS mode eases regridding of datasets (e.g., from ELM, CLM, CICE, MPAS-Seaice) that output data normalized to a gridcell fraction rather than to its entire extent. SGS mode automatically derives new binary masks (--sgs_msk, defaults to landmask) and allows for additional normalization (--sgs_nrm). Specific flavors of SGS can be selected (with -P elm, -P clm, -P cice, or -P mpasseaice). These ensure regridded datasets recreate the idiosyncratic units (e.g., %, km2) employed by raw ELM, CLM, CICE, and MPAS-Seaice model output.

...

ncclimo works on all E3SM component models, including the coupler. It can simultaneously generate climatologies for a coupled run, where climatologies mean both native and regridded monthly, seasonal, and annual averages. Here are template commands to fully climatologize and regrid a coupled simulation:

caseid=v2.LR.historical_0101
drc_in=/lcrc/group/e3sm/ac.forsyth2/E3SMv2/v2.LR.historical_0101/archive
map_atm=${DATA}/maps/map_ne30pg2_to_cmip6_180x360_nco.20200901.nc
map_lnd=$map_atm
map_ocn=${DATA}/maps/map_EC30to60E2r2_to_cmip6_180x360_aave.20220301.nc
map_ice=$map_ocn
ncclimo -P eam -p mpi -c $caseid -s 2 -e 5 -i $drc_in/atm/hist -r $map_atm -o ${DATA}/e3sm/atm
ncclimo -P elm -c $caseid -s 2 -e 5 -i $drc_in/lnd/hist -r $map_lnd -o ${DATA}/e3sm/lnd
ncclimo -P mpasocean -p mpi -s 2 -e 5 -i $drc_in/ocn/hist -r $map_ocn -o ${DATA}/e3sm/ocn
ncclimo -P mpasseaice -s 2 -e 5 -i $drc_in/ice/hist -r $map_ice -o ${DATA}/e3sm/ice

The atmosphere and ocean model output is significantly larger than the land and ice model output. These commands recognize that by using different parallelization strategies that may be required, depending on the RAM fatness of the analysis nodes, as explained below. MPAS models do not utilize the $caseid option. They use their own naming convention. By default, ncclimo processes the MPAS hist.am.timeSeriesStatsMonthly analysis members.
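
The per-component differences described above (MPI parallelism for atmosphere and ocean, no caseid for the MPAS components) can be kept in one small table and expanded into commands. This is a minimal sketch that only prints the commands: the function name coupled_commands is hypothetical, and the paths and map filenames are placeholders, not real locations.

```shell
caseid=v2.LR.historical_0101
drc_in=/path/to/archive            # placeholder, not a real path
drc_out=/path/to/climo             # placeholder, not a real path
map_atm=map_atm.nc                 # placeholder map files
map_lnd=$map_atm
map_ocn=map_ocn.nc
map_ice=$map_ocn

# Each table row holds: processing type, history subdirectory, map file,
# then any extra options. MPAS rows omit -c because MPAS ignores caseid.
coupled_commands() {
    while read -r prc sub map extra; do
        echo "ncclimo -P $prc${extra:+ $extra} -s 2 -e 5 -i $drc_in/$sub/hist -r $map -o $drc_out/$sub"
    done <<EOF
eam atm $map_atm -p mpi -c $caseid
elm lnd $map_lnd -c $caseid
mpasocean ocn $map_ocn -p mpi
mpasseaice ice $map_ice
EOF
}

# Print (do not run) one ncclimo command per component
coupled_commands
```

Printing the commands first makes it easy to review them, or to paste them into a batch script, before committing compute time.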

...

Code Block
caseid=somelongname
drc_in=/scratch1/scratchdirs/zender/acmee3sm/${caseid}/atm
ncclimo -c ${caseid} -m cam -S 10 -E 20 -s 21 -e 50 -i ${drc_in}

When no output directory is specified, ncclimo internal logic automatically places the extended climo in the input climo directory. Files are not overwritten because the extended climos have different names than the input climos. The next example utilizes the directory structure and options that Chris Golaz adopted for coupled ACME/E3SM simulations. The extra options (compared to the idealized example above) supply important information. The input climos were generated in seasonally discontiguous December (sdd) mode so the extended climatology must also be generated with the '-a sdd' option (or else ncclimo will not find the pre-computed input files). The input directory for the first pre-computed input climatology is specified with -x. The second pre-computed input climatology is specified with the usual -i option. A new output directory for the extended climos is specified with -X.

...