Generate atm initial condition from analysis data

  NERSC Directory Change Notice  

Due to project's name change at NERSC from 'ACME' to 'E3SM' and NERSC’s file system update, the directory  '/project/projectdirs/acme/' is now '/cfs/cdirs/e3sm'

This is to document a new procedure for generating atm initial condition file from analysis data. The essence of the workflow is no different than the commonly used ones (e.g., CAM tool interpic) for generating initial condition for new grids from existing ones, namely, horizontal and vertical remapping from source data grid to target grid. The emphasis of this workflow is to make use of builtin functionalities of the powerful NCO. Necessary intermediate steps are introduced to facilitate the use of NCO utilities to generate dynamically reasonably balanced initial conditions that are most likely ready for model simulations without additional spinup to filter fast gravity waves. This is effectively realized by adjusting surface pressure field to account for the mismatch in topography between the source and destination grids. The workflow is first put together to generate  input files for SCREAM project's initialized simulations using IFS analysis provided by the DYAMOND Initiative. It can be easily adapted to other analysis or model data, including remapping from initial conditions from other SE grids used by EAM.

The following illustration uses ne512 as target grid and IFS analysis as source data. The workflow will be streamlined to minimize user input and the intermediate codes (esp. IDL) will be hardened for portability.

Note that to generate the initial file for a specific grid (uniform resolution or RRM), the grid description file is required (a SCRIP or a .g file). For high-resolution grids, a .g grid file is preferred in order to use TempestRemap to generate high-order map file. See more in step 3. 

NOTE: The procedure to generate new initial conditions has been built into into the HICCUP tool (see https://github.com/E3SM-Project/HICCUP), which is a set of flexible and robust python routines to automate and streamline the task of generating a new atmospheric initial condition for E3SM. It can use ERA5 reanlaysis data or regrid (either horizontal, vertical or both) an existing atmospheric initial condition file. HICCUP has its own surface adjustment routine that follows the original adjustment routine published by ECMWF, which was also  reproduced in similar tools by Jerry Olson and Wuyin Lin. HICCUP was built with code readability and documentation as a top priority and includes some unit tests for adjustment routines. HICCUP has also successfully been used to generate initial conditions for ne1024 grids using large memory analysis nodes available at HPC centers. Development of HICCUP is ongoing and new features are planned, so any feedback or feature requests are welcome. 


 1. Preparation of analysis data

The analysis data are typically provided at regular lat/lon grids. The following fields are mandatory: physical fields of PS, PHSI, T, Q, U, V, and vertical coordinate related variables, namely, P0,  hyam, hybm, hyai, hybi for hybrid simga-pressure coordinate and  lev and ilev for constant pressure levels.

The IFS data provided by the DYAMOND initial is at full 640 gaussian grids, ifs_oper_T1279_2016080100.nc. It contains the required fields but need to be either renamed (t,q,u,v,z_2_ or converted (e.g., lhsp). It has the hybrid coordinate coefficient variables but using different vertical coordinate dimension names than the ones used to define the physical fields.  NCO vertical interpolation function may have been updated to recognize such hybrid coefficient variables. The workflow as used here opt to re-define these variables using consistent vertical dimension names.

The above can be done using  /project/projectdirs/m2136/analysis-IC/ifs_modify.sh.

The modified file is named as ifs_oper_T1279_2016080100_mod.nc. Not all  variables in the file are needed for IC. A subset of it, ifs_oper_T1279_2016080100_mod_subset.nc, is used for the remaining steps.

 2. Source data grid description

Grid description info (e.g., SCRIP file) is needed for the source data in order to generate map files for remapping to target model grid. Charlie Zender generated the SCRIP file for the IFS data used here. 

/global/homes/z/zender/data/grids/f640_scrip.20190601.nc

In the case of not having a source grid description file, it can be inferred using NCO tool in the following format. (see also section 3.23 of NCO documentation).

~zender/bin_cori/ncremap -d ifs_oper_T1279_2016080100.nc -g ifs_oper_T1279_scrip.nc 

 3. Generate map file

Normally, horizontal map files can be generated using either ESMF_RegridWeightGen or TempestRemap via wrapper ncremap. However, during the process of generating the IC file for ne1024, it is found that the resulting data using the mapfile generated by ESMF_regrid contain invalid data outside of normal range of values. The cause has not been identified. Though ESMF_regrid generated mapfile appears to work ok for ne512, TempestRemap is chosen to suit the general need of SCREAM project for global simulations at km scale resolution.

The mapfile from IFS f640 to ne512 can be generated using TempestRemap in the following format. Use e3sm_unified to set up proper path (cori as an example), including path to TempestRemap executables and ncremap

source /global/project/projectdirs/acme/software/anaconda_envs/load_latest_e3sm_unified.sh

nice ncremap -a fv2se_stt --src_grd=/global/homes/w/wlin/ACME/remap/IFS_T1279/f640_scrip.20190601_CCW_nc6.nc \
     --dst_grd=/project/projectdirs/acme/bhillma/grids/ne512np4/descriptor_files/ne512.g \
     -m /global/homes/w/wlin/gscrc/DYAMOND/map_f640_to_ne512np4_highorder.20190618.nc

Do apply nice when running on login nodes because it can take very high amount of memory (70+ GB for ne1024)

The example command is recorded in  /project/projectdirs/m2136/analysis-IC/gen.mapfiles.TR.ne512.hiorder

 4. Horizontal remapping

Horizontal remapping using the subset of modified IFS file (see step 1) and TempestRemap generated mapfile

nice ~zender/bin_cori/ncremap -5 --map=map_f640_to_ne512np4_highorder.20190618.nc \
ifs_oper_T1279_2016080100_mod_subset.nc ifs_oper_T1279_2016080100_mod_subset_to_e3sm_ne512np4.nc

This command is recorded in /project/projectdirs/m2136/analysis-IC/map_ifs_to_e3sm_horiz.ne512.sh. Similar commands can be found under that directory for other grid resolutions.

 5. Adjustment of topography mismatch and vertfile for nco vertical interpolation

This step is in preparation for vertical interpolation using the new NCO functionality. NCO vertical interpolation uses a vertfile to define destination vertical grid layout.  For EAM's hybrid vertical coordinate, it takes PS as well as hybrid coordinate coefficients – If PS is not provided in the vertfile,  it would assume the same PS as in the source data. In this workflow, the vertfile included the adjusted PS that account for the mismatch in topography between source data and model. The topography in source data has been remapped to destination grid after step 4. Without adjusting for the mismatch, the resulting interpolated initial file could excite fast gravity wave via dynamical adjustment during first few steps of simulation, hence requiring much shorter time stepping than intended for the target resolution – a process commonly called spin-up.

This step is handled by /project/projectdirs/m2136/analysis-IC/vertfile_topoadj_ne512.sh

It requires a generic EAM vertical coordinate file, the topo file for target grid, and PS, PHIS and bottom level temperature extracted from horizontally remapped file after step 4. The generic vertical coordinate file must have same number of levels as for the target configuration (i.e., the required IC file). See the comments in the script for more info.

 6. Vertical interpolation

This is the last step of the IC generation. It uses a new capability of NCO to perform vertical interpolation. As shown, only a further subset (optional) of variables are involved.

vertfile='E3SM_vert_coor_ne512.nc'

export PATH=~zender/bin_cori:$PATH

nice ~zender/bin_cori/ncremap --vrt_fl=$vertfile -v T,Q,U,V,P0,PS,PHIS,CLDLIQ,CLDICE,O3,skt,date,datesec,hyam,hybm,hyai,hybi,lev,ilev \
ifs_oper_T1279_2016080100_mod_subset_to_e3sm_ne512np4.nc ifs_oper_T1279_2016080100_mod_subset_to_e3sm_ne512np4_topoadj_L72.nc

The above commands are recorded in /project/projectdirs/m2136/analysis-IC/map_ifs_to_e3sm_ne512_vert.sh, similar files can be found under the same directory for other grids.

 Extra: Notes from testing by Chris

Chris Terai 2019-11-12: I was able to follow the steps above to regrid and initial condition file on cori, but here are things that needed to be modified to make it work for me. Note: The nco version used for these operations was loaded from e3sm_unified environment (which ncks →/global/project/projectdirs/acme/software/anaconda_envs/cori/base/envs/e3sm_unified_1.3.0_cdatx/bin/ncks).

RESOLVED Step 1: When running ifs_modify.sh,

export IDL_PATH=/usr/common/usg/idl/idl82/lib:+/global/u2/w/wlin/idl

should be replaced with

export IDL_PATH=/global/common/cori/software/idl/idl85/idl/lib:+/global/u2/w/wlin/idl

Step 3: With this and many other steps, I attempted to use compute nodes (10+ nodes) for creating ne1024 maps, but ran out of memory, so applying 'nice' to commands and running on login nodes worked best, as suggested above.

Step 5:

It might make it more streamlined to remove the PS data and other headers (dimensions) from the vertical coordinate file before adding the PS from the topo-adjustment to the vertical coordinate file. Currently, this is performed in Step 6 but performing this at the beginning of Step 5 also reduces the likelihood of getting errors when trying to append the topo-adjusted PS.

The step to remove the unnecessary information is done manually (as described in map_ifs_to_e3sm_ne512_vert.sh and pasted below):

# perform vertical interpolation using NCO, using horizontally remapped data                                                                                               
#                                                                                                                                                                          
# extract a vertical coordinate file from an E3SM file, must contain P0,hyam,hybm,hyai,hybi,                                                                               
# PS to take from source. However, ncks always extract PS when asking for hyam,hybm,hyai,hybi                                                                              
# Using ncdump -v P0,hyam,hybm,hyai,hybi,lev,ilev > E3SM_vert_coor.txt                                                                                                     
#       manually edit the file to remove other header info like ncol or other dimensions. 
#       CRT addition: also remove the old PS from the vertical coordinate data.                                                       
#       ncgen E3SM_vert_coor.txt -o E3SM_vert_coor.nc   #Generates netcdf from edited .txt file.                                                                                           

In step 2 of vert_topo_adj_ne1024... using

ncks -3 -d lev,136 ... 

gave me an error. Therefore, had to use

ncks -7 -d lev,136 ...

RESOLVED: In step 3, where we run the idl script, ana_ini_psreset.pro needed to be modified. It is written to accept a 2D (lat,lon) set of data, where the SE is 1-dimensional in x-y. I kept the two sets of for loops, but set nlat = 0 and let nlon = ncol.

  result=size(ps)
  nlon=result(1)
  psl=fltarr(nlon)
  nlat = 1