Creating SSP370 and SSP585 compsets for E3SM

 

Metadata

Origin: May 2022
Documentation author(s): Jim Benedict (LANL)
E3SMv2 SSP* collaborators and contributors: Xingying Huang, @Hailong Wang, @Mingxuan Wu, @Alan Di Vittorio, @Gautam Bisht, @Qi Tang, @Wuyin Lin, @Chris Golaz, @Philip Cameron-Smith, @Michael J Prather

 

Overview

Disclaimer

The information that follows is an outline of the steps that were taken to develop the SSP370 and SSP585 component sets ("compsets") for E3SM version 2 (released September 2021).  These same steps may or may not work for previous or future versions of E3SM.  The instructions are accurate to the best of the author's knowledge, but it must be noted that the author is not an expert in the E3SM land model or the E3SM aerosol and atmospheric chemistry modules. If errors are found in this document, please notify @Jim Benedict or add a comment to this page. A high-level overview of compset creation (from 2016, some information may be obsolete!) can be found in https://acme-climate.atlassian.net/wiki/spaces/DOC/pages/46891102; the information outlined below is intended to supplement this workflow overview by providing greater detail and is specific to the SSP370 and SSP585 compsets for E3SMv2.

Creating a local workspace

The Energy Exascale Earth System Model (E3SM) is freely available via a GitHub repository.  For production simulations, it is recommended to clone the latest "maintenance" version of E3SM to a local directory, following instructions here. One example of cloning the latest E3SM 2.0 maintenance version (where YYYYMMDD represents the current date):

git clone -b maint-2.0 git@github.com:E3SM-Project/E3SM.git E3SMv2_maint2.0_YYYYMMDD

Leveraging existing SSP585 compset to create SSP370 compset

Let's define $E3SMROOT = /path/to/E3SMv2_maint2.0_YYYYMMDD as the cloned local model workspace.  (Note to user:  the full list of commonly used directory paths such as $E3SMROOT can be found in the Definitions section near the end of this manual.)  Important information about which files must be modified to create the new SSP370 compset can be obtained by recursively searching for instances of 'ssp5' in $E3SMROOT:

cd $E3SMROOT
grep -r -i ssp5 ./

This step provides a long list of files in which 'ssp5' appears.  Note:  It is strongly recommended to grep for the simple phrase 'ssp5' instead of anything more detailed, such as 'ssp585' or 'ssp5_85'.  Some of the returns may be irrelevant for the user's needs;  for example, there are many "BGC" versions of the SSP585 compset that could be ignored when creating the simple SSP370 compset.  The relevant entries returned from the grep search can guide the identification of files that must be modified for SSP370.  Below, in Creation of E3SM SSP370 input files, we focus on the key input files required for SSP370.  Later, in E3SM SSP370:  Configuration settings, we examine required modifications to the source code and namelist settings.
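The recursive search above can also be scripted, which makes it easier to filter and group the results.  The following is a minimal Python sketch and is not part of E3SM:  the function name find_scenario_references and the heuristic of dropping any matching line that mentions "bgc" are assumptions made for illustration only.

```python
import os
import re


def find_scenario_references(root, needle="ssp5", skip_substrings=("bgc",)):
    """Return (path, line_number, line) for each case-insensitive match of needle.

    Lines that also contain any of skip_substrings (compared in lower case)
    are dropped. This filter is a hypothetical convenience, not an E3SM rule.
    """
    pattern = re.compile(re.escape(needle), re.IGNORECASE)
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into version-control metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if not pattern.search(line):
                            continue
                        lowered = line.lower()
                        if any(s in lowered for s in skip_substrings):
                            continue
                        hits.append((path, number, line.rstrip("\n")))
            except OSError:
                # Unreadable files are skipped rather than aborting the search.
                continue
    return hits
```

Calling find_scenario_references with the path stored in $E3SMROOT returns the same kind of hit list as the grep command, minus the lines that mention "bgc";  passing skip_substrings=() disables the filter.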

Creation of E3SM SSP370 input files

ELM land use files (SSP370)

ELM setup for SSP simulations primarily centers on the creation of two input files, called "fsurdat" and "fdyndat" in the ELM namelist, which represent the land cover properties (fsurdat = surfdata*.nc) and temporal land use changes (fdyndat = landuse.timeseries*.nc).  @Alan Di Vittorio has created a descriptive and highly informative summary of steps needed to complete the ELM setup.  Below are the key steps, with reference to Alan's page.  For all steps, do NOT have the E3SM unified software environment activated as this may interfere with the necessary environment setup.

  1. Check if the desired ELM input files might already exist in $INPUTDATA/lnd/clm2/surfdata_map.  Search for files of the form sspN_rcpN.N, for example ssp3_rcp7.0.

  2. If the desired ELM input files do not yet exist, certain tools/scripts exist in every E3SM download that can be used to create the two ELM input files.  One could use the copy of E3SM from within $E3SMROOT, but because these ELM tools generate large data files it is recommended instead that the user clone a second copy of the latest version of E3SM into a directory with sufficient disk space (a project directory is preferred, otherwise the scratch disk... avoid the home directory).  Example for E3SM master branch:

    git clone -b master --recursive https://github.com/E3SM-Project/E3SM.git E3SMv2-master-YYYYMMDD

    We will define $ELMTOOLSROOT = /path/to/E3SMv2-master-YYYYMMDD.

  3. Data files from the input4mips repository must be preprocessed and reformatted before they can be used by ELM.  Some of the preprocessing is undertaken by ELM developers to convert the data from "LUH2" format to "LUH1" format.  For SSP370, the following files were posted to $INPUTDATA/lnd/clm2/rawdata/LUT_input_files_current:  LUH2_SSP3_RCP70_LUH1f_c08292020.nc and LUH2_SSP3_RCP70_LUH1f_c08292020_harvest.nc.  The fields in these files are on a 0.5° grid (LON=720, LAT=360) with slightly different time steps due to an offset for the harvest data.  Further processing of these files is needed.  Use the Land Use Translator (LUT) to convert LUH1-formatted data to ELM plant functional types (PFTs) and harvest fractions.  See here for supplemental details.  For SSP370, do the following:

     

  4. A series of output files are written to ./output.  If there are no problems creating the LUT* output files, they may be copied to, e.g., $INPUTDATA/lnd/clm2/rawdata/LUT_LUH2_SSP3_RCP70_LUH1f_MMDDYYYY (note the change in date format).  Note that these LUT* files are on a 0.5° lat-lon grid, so there is no E3SM grid dependence yet.

  5. A "file list" text file must be created that lists full paths to the yearly LUT* files that ELM will use during the simulation.  For example, if the simulation will span years 2015-2100, this "file list" file must include separate lines pointing first to the LUT* 2015 file, then the LUT* 2016 file, and so on to the LUT* 2100 file.  The year is included on the same line as the full file path.  Use $INPUTDATA/lnd/clm2/rawdata/LUT_LUH2_SSP3_RCP70_LUH1f_02262021/LUT_LUH2_SSP3_RCP70_LUH1f_list.txt as a template.  Note:  The year stamp within the "file list" file MUST be placed exactly at character 197.

  6. Use the script mksurfdata.pl to generate the two ELM input files "fsurdat" and "fdyndat".  First, load required modules and configure the environment (below are the instructions for NERSC-Cori; for Compy, see here):


    Then, compile the mksurfdata code:


    Run mksurfdata.pl in "debug" mode (-d) simply to create a namelist file that can be modified as needed.  For SSP370 and the "ne30np4" horizontal grid on NERSC-Cori, this would be:


    A description of mksurfdata.pl command options:
    -res ne30np4.pg2:  This will need to correspond exactly to a supported mapping file.  Let's use ne30np4pg2 as an example.  First, check that relevant mapping files exist in $INPUTDATA/lnd/clm2/mappingdata/maps/ne30np4pg2.  In this directory, the files map_0.5x0.5_*_to_ne30pg2*.nc are the relevant files and they do exist.  In $E3SMROOT/components/elm/bld/namelist_files/namelist_defaults.xml, a search of "map" and "to_hgrid" shows:

    What is entered for the -res option must match the desired to_hgrid setting;  therefore, for E3SMv2, we set -res ne30np4.pg2.

    -years 2015:  Recall that the two key ELM input files created in this exercise are fsurdat = surfdata*.nc and fdyndat = landuse.timeseries*.nc.  For standard configurations of E3SM/ELM, fsurdat should represent land cover properties for the first year of the simulation.  Therefore, for SSP* simulations that begin on 2015-01-01, -years 2015 should be used.


    -rcp 3-7.0:  This sets the RCP/SSP scenario.  For historical data omit the -rcp argument.

    -d:  Indicates "debug" mode, which produces only a pre-populated namelist file as noted above and does not produce any data files.

    -dinlc:  Should point to $INPUTDATA.

    -usr_mapdir:  Should point to the directory in which the relevant mapping files exist.


    This creates a pre-populated namelist file (based on the options supplied to mksurfdata.pl) called "namelist" in the current directory.  It is recommended to rename this file for better description and traceability, such as namelist_ssp370_YYYYMMDD.  Five key settings in the resulting namelist file should be checked and modified as needed:

    (1) mksrf_fvegtyp (input):  This should point to the LUT* land cover file corresponding to the first year of the simulation (in our SSP* example case, 2015, which is considered a "historical" year):  mksrf_fvegtyp  = '/global/cfs/cdirs/e3sm/inputdata/lnd/clm2/rawdata/LUT_LUH2_HIST_LUH1f_07082020/LUT_LUH2_historical_2015_c07082020.nc'
    (2) fsurdat (output):  The name of the created land cover file (a file containing a single year of land cover data in 12 monthly time steps) corresponding to the -years option and representing the first year of the simulation.  For the developed SSP370 compset for E3SMv2, the entry for fsurdat (<fsurdat>lnd/clm2/surfdata_map/surfdata_ne30np4.pg2_simyr1850_2015_c211105.nc</fsurdat>) represents land surface conditions for 1850 instead of the preferred 2015 conditions.  This is a known oversight, but for reasons described here this error has negligible impact on standard SSP370 simulations run with RUN_TYPE = 'hybrid'.
    (3) fsurlog (output):  A verbose logfile for fsurdat:  fsurlog        = 'surfdata_ne30np4.pg2_SSP3_RCP70_simyr2015_cYYMMDD.log'
    (4) mksrf_fdynuse (input):  This should point to the "filelist" file that lists LUT* files to be used in the E3SM simulation.  For a SSP* run, the listed files should span 2015-2100 inclusive:  mksrf_fdynuse  = '/global/cfs/cdirs/e3sm/inputdata/lnd/clm2/rawdata/LUT_LUH2_SSP3_RCP70_LUH1f_02262021/LUT_LUH2_SSP3_RCP70_LUH1f_list_JJB.txt'
    (5) fdyndat (output):  The name of the created land use file that will contain yearly data for each year of the E3SM simulation.  The yearly time span in the fdyndat filename should represent the span of years within mksrf_fdynuse:  fdyndat        = 'landuse.timeseries_ne30np4.pg2_SSP3_RCP70_simyr2015-2100_cYYMMDD.nc'

    Next, rerun ./mksurfdata.pl (not in debug mode) by supplying the newly created namelist to produce the output files:


    The output files are saved to the current directory ($E3SMROOT/components/elm/tools/clm4_5/mksurfdata_map).  If they will be used in E3SM production runs, and if the user is part of the e3sm UNIX group, the surfdata* and landuse* files may be copied to $INPUTDATA/lnd/clm2/surfdata_map for wider use.  See E3SM SSP370:  Configuration settings for recommended namelist settings.
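The "file list" layout from step 5 and the year-span check from namelist setting (5) can be sketched in Python as follows.  This is a minimal illustration under stated assumptions, not the tool's own logic:  it assumes the path is left-justified and padded with spaces, that column counting is 1-based, and that at least one space separates the path from the year.  The helper names and the {year} path template are hypothetical;  the template file named in step 5 remains the authoritative reference for the layout.

```python
import re

YEAR_COLUMN = 197  # 1-based column where the year must start (from the note in step 5)


def format_filelist_line(path, year, year_column=YEAR_COLUMN):
    """Return one file-list line with the 4-digit year starting at year_column."""
    # Assumption: space padding, with at least one space between path and year.
    if len(path) > year_column - 2:
        raise ValueError(
            f"path is too long to place the year at column {year_column}: {path}"
        )
    return path.ljust(year_column - 1) + f"{year:04d}"


def build_filelist(path_template, first_year, last_year):
    """Build one line per year; path_template must contain a {year} placeholder."""
    return [
        format_filelist_line(path_template.format(year=year), year)
        for year in range(first_year, last_year + 1)
    ]


def years_in_filelist(lines, year_column=YEAR_COLUMN):
    """Read the year back from each line at the fixed column."""
    start = year_column - 1  # convert the 1-based column to a 0-based index
    return [int(line[start:start + 4]) for line in lines]


def span_matches_fdyndat(fdyndat_name, years):
    """Check that the simyrYYYY-YYYY stamp in fdyndat matches the listed years."""
    match = re.search(r"simyr(\d{4})-(\d{4})", fdyndat_name)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) == (min(years), max(years))
```

A typical use would be to build the lines for 2015 through 2100, write them to the "file list" text file, and then confirm with span_matches_fdyndat that the fdyndat filename carries the same first and last year.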

Further reading:  For additional details on creating the landuse* and surfdata* files on a new grid, see this site.

EAM aerosol emissions files (SSP370)

A series of EAM input files representing the scenario-dependent time evolution of aerosol emissions are required.  Importantly, while the aerosol emissions files are scenario-dependent, they are independent of the spatial grid (they are interpolated to whatever spatial grid is used for the E3SM simulation).  At least for SSP* simulations in the E3SMv2 framework, the emissions of the various aerosol species are prescribed ("specified").  The aerosol files are grouped into two categories, denoted by "specifiers" in the EAM namelist, and their namelist entries for SSP370 (which uses MAM4) are listed below:

  1. ext_frc_specifier (elevated emissions, aerosol production away from the surface):  $INPUTDATA/atm/cam/chem/trop_mozart_aero/emis/CMIP6_SSP370_ne30/cmip6_ssp370_mam4_[species]_elev_2015-2100_c210216.nc, where [species] is so2, soag, bc_a4, num_a1, num_a2, num_a4, pom_a4, so4_a1, and so4_a2.

  2. srf_emis_specifier (surface emissions):  $INPUTDATA/atm/cam/chem/trop_mozart_aero/emis/CMIP6_SSP370_ne30/cmip6_ssp370_mam4_[species]_surf_2015-2100_c210216.nc, where [species] is so2, bc_a4, num_a1, num_a2, num_a4, pom_a4, so4_a1, and so4_a2.  NOTE:  Another aerosol emissions file for DMS ($INPUTDATA/atm/cam/chem/trop_mozart_aero/emis/DMSflux.1850-2100.1deg_latlon_conserv.POPmonthlyClimFromACES4BGC_c20160727.nc) is included as part of the srf_emis_specifier entry, but it is scenario independent and is found in a separate directory.
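Before a run, it can be useful to confirm that every scenario-dependent file named in the two specifier entries above is present.  The Python sketch below rebuilds the expected file names from the pattern and species lists given above;  the function names are hypothetical, and the scenario-independent DMS file is deliberately left out because it lives in a separate directory.

```python
import os

# Species lists copied from the two specifier entries above.
ELEV_SPECIES = ["so2", "soag", "bc_a4", "num_a1", "num_a2", "num_a4",
                "pom_a4", "so4_a1", "so4_a2"]
SURF_SPECIES = ["so2", "bc_a4", "num_a1", "num_a2", "num_a4",
                "pom_a4", "so4_a1", "so4_a2"]

# Directory below $INPUTDATA that holds the SSP370 emissions files.
EMIS_SUBDIR = "atm/cam/chem/trop_mozart_aero/emis/CMIP6_SSP370_ne30"


def expected_emission_files(inputdata_root):
    """Map (species, kind) to the full path expected under inputdata_root."""
    paths = {}
    for kind, species_list in (("elev", ELEV_SPECIES), ("surf", SURF_SPECIES)):
        for species in species_list:
            name = f"cmip6_ssp370_mam4_{species}_{kind}_2015-2100_c210216.nc"
            paths[(species, kind)] = os.path.join(inputdata_root, EMIS_SUBDIR, name)
    return paths


def missing_emission_files(inputdata_root):
    """Return the sorted list of expected paths that do not exist on disk."""
    expected = expected_emission_files(inputdata_root)
    return sorted(p for p in expected.values() if not os.path.isfile(p))
```

An empty list from missing_emission_files, called with the path stored in $INPUTDATA, indicates that all 17 scenario-dependent files (9 elevated, 8 surface) are in place.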

For the E3SMv2 SSP370 compset, the aerosol input files were created and provided by @Hailong Wang.  However, Xingying Huang and @Jim Benedict independently reproduced as many of the aerosol input fields as possible (some inputs are simulation-derived and could not easily be reproduced) to learn and validate the process.  Instructions that identify required aerosol files and a template script to process them were provided by Hailong:

E3SM_aerosol_emissions.pdf: Instruction manual provided by @Hailong Wang that identifies input4mips raw data files needed to create E3SM aerosol emissions input files.
Create_emission_f19_MAM4_HIST_ORIG.m: Original MATLAB script provided by @Hailong Wang that processes historical input4mips emissions files only.

A modified version of Create_emission_f19_MAM4_HIST_ORIG.m specific to SSP370 is Create_emission_f19_MAM4_SSP370.m:

The general process is outlined below, with supplemental details:

  1. Download required input4mips aerosol data files.  The list of required files is included in Hailong's instruction manual.  An example download session is:

    1. Go to:  https://esgf-node.llnl.gov/search/input4mips/

    2. Enter in search box:  OC-em-anthro-openburning AND ScenarioMIP AND ssp370

    3. Results show four data sets ("data sets" are collections of files).  Click on "List files" to see which files are contained in each data set.

    4. For this search, I found file OC-em-anthro_input4MIPs_emissions_ScenarioMIP_IAMC-AIM-ssp370-1-1_gn_201501-210012.nc in the fourth data set.

    5. Click "Add to Data Cart"

    6. Once all data sets are added to cart, click on "My Data Cart" (top right)

    7. Click button for "Select All Datasets", then ABOVE this, near the top, click on "WGET script" following "Collective Services for All Selected Datasets".  DO NOT click on "WGET script" for each individual dataset.

    8. Save the wget script locally, then scp it to the preferred supercomputing facility.

    9. Run wget script. An example for NERSC-Cori:

       

  2. The downloaded input4mips files were interpolated to CESM's fv19 (nominal 1.9°x2.5° finite volume) grid, although this is not a strict requirement.  Remapping to a coarser grid expedites processing, and the fv19 grid has traditionally been used.  Recall that the aerosol files are interpolated to the model grid automatically during runtime.  Hailong's instructions include steps to do the remapping using NCL, and an alternate approach using ncremap via a bash script is provided here.

  3. A modified version of Hailong's MATLAB script (Create_emission_f19_MAM4_SSP370.m) was created to convert the remapped CMIP6 (input4mips) aerosol emissions data files to the required E3SM input file format for SSP370.  A description of what the MATLAB script does is included in Hailong's instruction manual, but essentially certain assumptions are made regarding aerosol vertical distribution and species categorization.

    1. In the script, modify inputDataRoot and outputDataRoot to point to the locations where the input files exist and where the output files should be written.

    2. Confirm that year, day, and year2 are appropriate for the desired simulation.  Note that year2 should have "buffer" years, offset by 1 year, before and after year.

    3. Hailong's original script, designed to process historical aerosol emissions files, was primarily modified by changing the instances of infile.

    4. Additional notes before running the MATLAB script:

      1. The raw input4mips emissions files represent monthly seasonal cycles for 2015, 2020, 2030, …, 2100, as represented by the year and date arrays.  Among other changes, the MATLAB script adds "buffer" years to the beginning and end of the time series by copying the 2015 seasonal cycle to "2014" and by copying the 2100 seasonal cycle to "2101", as represented by the year2 and date2 arrays.

      2. The script systematically processes the following species:  BC, BC_ELEV, POM, POM_ELEV, SO2, SO2_ELEV, SO4_a1, SO4_a1_ELEV, SO4_a2, SO4_a2_ELEV, NUM_a1 & NUM_a4 (includes num_a1_BC_*, num_a1_POM_*, num_a1_SO4_*;  some "a1" variables are actually written to "a4" file), NUM_a1_ELEV & NUM_a4_ELEV (see previous comment), NUM_a2, NUM_a2_ELEV, BIGALK, BIGENE, ISOPRENE, TERPENE, TOLUENE.

      3. Beginning with "BIGALK" in the species list above, there are files of the form folder4='/Volumes/disk3/CEDS/regrid/VOC04_anthro_185001-185012.nc' for which future-scenario analogs were not available.  Hailong's note on this:
        This section of the MATLAB script is to produce the SOAG emissions following the old AR5 way for CESM, in which the VOC emissions were based on the combination of some NMVOC datasets and an atmospheric chemistry model output if I remember it correctly. (I will confirm it with Yang Yang.)  For the E3SMv1 historical simulations, we decided to do something different for SOAG,  as I described in the emission document and also in the Wang et al. (2020) aerosol overview paper for more details. However, the rescaling of SOAG from OC emissions requires a separate simulation that has an explicit SOA treatment, which was based on the CESM1 model and is being implemented in the E3SMv3 as part of the NGD task. We did have such a CESM1 simulation for the historical SOAG emissions. For the SSP585 and SSP370 emissions that Yang Yang helped produce, we used the same historical CESM1 simulation for the rescaling, which is not the best practice. It won’t be used for future versions of the E3SM model, so I didn’t include the rescaling procedure and simulation output. For your SSP370 simulations, you may choose to use the SOAG we generated. To generate it from the MATLAB script (as for the historical ones), you would need to find ways to obtain the required VOC species that are not provided by the input4MIPS. I do expect that SOAG emissions are treated differently in all CMIP6 models.
        Therefore, the SOAG file Hailong provided (cmip6_ssp370_mam4_soag_elev_2015-2100_c210216.nc) cannot be easily verified independently.  Instead, effort was made to verify that the total mass emissions for the various species match those from past versions of the files.

  4. Validation of aerosol emissions files

    1. Various methods to compare the aerosol files provided by Hailong with those created "in-house" show that differences are generally within machine precision over all global areas except immediately along 0° longitude, where differences were larger but not substantial.  A test using esmf_regrid via NCL (instead of ncremap) shows that the 0° longitude differences reduce to within machine precision, suggesting that these differences are entirely due to the selected remapping scheme.  Using NCL's esmf_regrid produces results that are nearly identical to Hailong's data files.  Using ncremap produces generally very similar values, with the largest differences along longitude 0°E.  For E3SMv2 SSP370 aerosol files, the version created from esmf_regrid (i.e., those files provided by Hailong) was used.

    2. Simple comparison between input4mips files and final aerosol emissions files provided by Hailong:  Surface emissions were compared using this script, which converts the E3SM input file aerosol data back to the original input4mips format and computes global sums of the available species.  For all surface emissions (could not check SOAG as noted earlier), total global summed mass fluxes were within ~0.02% for all sectors, where sectors refers to the different types of emission sources including agriculture (AGR), energy (ENE), industry (IND), international shipping (SHP), residential and commercial (RCO), solvent production and application (SLV), transportation (TRA), and waste (WST) (see Gidden et al. 2019).

    3. A more detailed validation of surface aerosol emissions data between the E3SM aerosol input files and the input4mips files was also undertaken.  There is no straightforward way to evaluate elevated aerosol emissions since the original mass fluxes are all at the surface and the elevated emissions are simply scaled by the surface emissions.  It is recommended to run a test simulation and write out the surface and elevated emissions in mass fluxes to validate.  Three independent 1-year test simulations were conducted for years 2015 (RUN_STARTDATE="2015-01-01"), 2050 (RUN_STARTDATE="2050-01-01"), and 2100 (RUN_STARTDATE="2100-01-01").  For each run, the existing E3SMv2 SSP585 compset was used but the SSP370 aerosol files were substituted for the default SSP585 aerosol input files.  Also, the user should set history_aerosol = .true. and history_verbose = .true. to write out the required history fields for the comparisons.  The model output was compared to the original input4mips files using the following scripts (attached at the end of this subsection):

      1. validate_sfc_emissions_DRIVER.py:  Iteratively calls the NCL scripts and defines some variables

      2. validate_sfc_emissions_inputs.ncl:  Namelist for corresponding NCL script

      3. validate_sfc_emissions.ncl:  Does the primary analysis steps and plotting

      4. validate_elev_emissions_DRIVER.py:  Iteratively calls the NCL scripts and defines some variables

      5. validate_elev_emissions_inputs.ncl:  Namelist for corresponding NCL script

      6. validate_elev_emissions.ncl:  Does the primary analysis steps and plotting

    4. Based on item 3 above, each file listed below falls into one of two categories:  successfully validated, or could not be readily validated.

      1. SURFACE emissions

        1. .../cmip6_ssp370_mam4_bc_a4_surf_2015-2100_c210216.nc:  AGR, ENE, IND, TRA, RCO, SLV, WST, SHP (time, lat, lon): Compare sum across sectors to model output variable:  SFbc_a4

        2. .../cmip6_ssp370_mam4_num_a1_surf_2015-2100_c210216.nc:  num_a1_SO4_AGR, num_a1_SO4_SHP, num_a1_SO4_SLV, num_a1_SO4_WST  (time, lat, lon)

          1. Compare sum across sectors to model output variable:  SFnum_a1

          2. Cannot easily check this and Hailong suggested that if the mass fluxes are close between the input and output files then so will be the number concentrations.  E3SM's SFnum_a1 contains SO4 but also dust, sea salt and marine organic aerosols.  However, it appears that there are no other variables in the output to account for number fluxes of sea salt and marine organics. A rough estimate of the magnitude could probably be obtained from the respective mass concentration SFncl_a1 and SFmom_a1, but then we'd have to estimate sizes and this is not worth the trouble.

        3. .../cmip6_ssp370_mam4_num_a2_surf_2015-2100_c210216.nc:  num_a2_SO4_RCO, num_a2_SO4_TRA (time, lat, lon): Compare sum across sectors to model output variable:  SFnum_a2. Cannot easily check this, see note for SFnum_a1.

        4. .../cmip6_ssp370_mam4_num_a4_surf_2015-2100_c210216.nc:  num_a1_BC_AGR, num_a1_BC_ENE, num_a1_BC_IND, num_a1_BC_RCO, num_a1_BC_SHP, num_a1_BC_SLV, num_a1_BC_TRA, num_a1_BC_WST, num_a1_POM_AGR, num_a1_POM_ENE, num_a1_POM_IND, num_a1_POM_RCO, num_a1_POM_SHP, num_a1_POM_SLV, num_a1_POM_TRA, num_a1_POM_WST

          1. Compare sum across sectors to model output variable:  SFnum_a4

          2. Note 1:  per Hailong's suggestion, sum the species and sectors in the input file and convert units from "(particles/cm2/s) * 6.022e26" to "1/m2/s" by multiplying the input file data by (1./6.022E26)*(100**2) -- that is, mfactor in validate_sfc_emissions_DRIVER.py.

          3. Note 2:  Data from the input files is consistently 0.7% lower than in the model output, so it's possible that there are species contained in SFnum_a4 that are not included in the input file… but the difference is small enough (and temporally consistent) that this raises no alarms.

        5. .../cmip6_ssp370_mam4_pom_a4_surf_2015-2100_c210216.nc:  AGR, ENE, IND, TRA, RCO, SLV, WST, SHP (time, lat, lon): Compare sum across sectors to model output variable:  SFpom_a4

        6. .../cmip6_ssp370_mam4_so2_surf_2015-2100_c210216.nc:  AGR, TRA, RCO, SLV, WST, SHP (time, lat, lon): Compare sum across sectors to model output variable:  SFSO2

        7. .../cmip6_ssp370_mam4_so4_a1_surf_2015-2100_c210216.nc:  AGR, SLV, WST, SHP (time, lat, lon): Compare sum across sectors to model output variable:  SFso4_a1

        8. .../cmip6_ssp370_mam4_so4_a2_surf_2015-2100_c210216.nc:  RCO, TRA (time, lat, lon): Compare sum across sectors to model output variable:  SFso4_a2

        9. NOTE:  Model output fields (e.g., SFbc_a1, SFbc_a3, SFpom_a1, SFpom_a3, and SFso4_a3) that do not have an analog in the input4mips files should be zero in the model output.

      2. ELEVATED emissions

        1. .../cmip6_ssp370_mam4_bc_a4_elev_2015-2100_c210216.nc:  BB (time, altitude=13, lat, lon) in units "molecules/cm3/s"

          1. Compare sum across sectors to model output variable:  bc_a4_CLXF

          2. Per Hailong's suggestion:  Compare vertical integral of BB to bc_a4_CLXF, acknowledging that resulting spatial map differences may arise due to known remapping issues/limitations.  This applies to all "elevated" emissions fluxes.

        2. .../cmip6_ssp370_mam4_num_a1_elev_2015-2100_c210216.nc:  num_a1_SO4_ELEV_BB, num_a1_SO4_ELEV_ENE, num_a1_SO4_ELEV_IND, num_a1_SO4_ELEV_contvolc  (time, altitude=13, lat, lon) in units "(particles/cm3/s) * 6.022e26"

          1. Compare sum across sectors to model output variable:  num_a1_CLXF

          2. Per Hailong's suggestion:  Compare the vertical integral of the sum of num_a1_SO4_ELEV_* from input file to num_a1_CLXF from model output.  Note that there are no natural contributions within the elevated number emissions fluxes, as there were for surface num_a[1,2], so the issue of "extra" sources is avoided here.  No need to look at spatial map differences because of known remapping issues/limitations.

        3. .../cmip6_ssp370_mam4_num_a2_elev_2015-2100_c210216.nc:  num_a2_SO4_ELEV_contvolc (time, altitude=13, lat, lon) in units "(particles/cm3/s) * 6.022e26"

          1. Compare sum across sectors to model output variable:  num_a2_CLXF

          2. Per Hailong's suggestion:  Compare the vertical integral of the sum of num_a2_SO4_ELEV_* from input file to num_a2_CLXF from model output.  Note that there are no natural contributions within the elevated number emissions fluxes, as there were for surface num_a[1,2], so the issue of "extra" sources is avoided here.  No need to look at spatial map differences because of known remapping issues/limitations.

        4. .../cmip6_ssp370_mam4_num_a4_elev_2015-2100_c210216.nc:  num_a1_BC_ELEV_BB, num_a1_POM_ELEV_BB  (time, altitude=13, lat, lon) in units "(particles/cm3/s) * 6.022e26". Compare sum across sectors to model output variable:  num_a4_CLXF

        5. .../cmip6_ssp370_mam4_pom_a4_elev_2015-2100_c210216.nc:  BB (time, altitude=13, lat, lon) in units "molecules/cm3/s". Compare sum across sectors to model output variable:  pom_a4_CLXF

        6. .../cmip6_ssp370_mam4_so2_elev_2015-2100_c210216.nc:  BB, ENE_ELEV, IND_ELEV, contvolc (time, altitude=13, lat, lon) in units "molecules/cm3/s". Compare sum across sectors to model output variable:  SO2_CLXF

        7. .../cmip6_ssp370_mam4_so4_a1_elev_2015-2100_c210216.nc:  BB, ENE_ELEV, IND_ELEV, contvolc  (time, altitude=13, lat, lon) in units "molecules/cm3/s". Compare sum across sectors to model output variable:  so4_a1_CLXF

        8. .../cmip6_ssp370_mam4_so4_a2_elev_2015-2100_c210216.nc:  contvolc  (time, altitude=13, lat, lon) in units "molecules/cm3/s". Compare sum across sectors to model output variable:  so4_a2_CLXF

        9. .../cmip6_ssp370_mam4_soag_elev_2015-2100_c210216.nc:  SOAbb_src, SOAbg_src, SOAff_src  (time, altitude=12, lat, lon) in units "molecules/cm3/s". Compare sum across sectors to model output variable:  SOAG_CLXF
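The arithmetic behind these comparisons can be sketched in a few lines of Python.  The sketch below is illustrative only and is not taken from the validation scripts:  MFACTOR reproduces the expression quoted in Note 1 for the surface num_a4 file, while the function names, the dictionary of input fields, and the layer thicknesses passed to column_integral are assumptions (the source does not state how the altitude layers are defined).

```python
INPUT_UNIT_SCALE = 6.022e26  # factor carried in the input-file units (from Note 1)
CM2_PER_M2 = 100 ** 2        # square centimeters per square meter

# Same expression as the mfactor quoted in Note 1:
# "(particles/cm2/s) * 6.022e26" -> "1/m2/s"
MFACTOR = (1.0 / INPUT_UNIT_SCALE) * CM2_PER_M2


def summed_number_flux_per_m2(input_fields):
    """Sum all species/sector fields from the input file and convert to 1/m2/s."""
    return sum(input_fields.values()) * MFACTOR


def percent_difference(input_total, model_total):
    """Signed difference of the input total relative to the model total, in percent."""
    return 100.0 * (input_total - model_total) / model_total


def column_integral(rates_per_cm3, layer_thickness_cm):
    """Integrate an elevated emission rate (per cm3 per s) over altitude layers.

    The result is per cm2 per s. Layer thicknesses are supplied by the caller
    and are a hypothetical input here.
    """
    if len(rates_per_cm3) != len(layer_thickness_cm):
        raise ValueError("need one thickness per altitude layer")
    return sum(rate * dz for rate, dz in zip(rates_per_cm3, layer_thickness_cm))
```

The values in input_fields may be plain numbers or NumPy arrays of matching shape.  With percent_difference, the 0.7% offset described in Note 2 would appear as a value near -0.7 when the input total is passed first.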

 

EAM greenhouse gas/ozone/oxidation files (SSP370)

Several radiative forcing files specific to a future SSP must be created.  For E3SMv2, these files are associated with the following EAM namelist entries:

  1. chlorine_loading_file

  2. linoz_data_file

  3. bndtvghg

  4. tracer_cnst_file

One additional namelist entry, linoz_data_path, defines the path to linoz_data_file.  Below are instructions on how to convert raw input4mips data files to input files for E3SM.  These instructions assume the user has access to IDL and Fortran compilers, and the workflow was invoked on NERSC-Cori.
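As a quick sanity check, the five namelist entries named above can be looked up in a block of namelist text.  The Python sketch below is a hypothetical helper, not part of E3SM:  it assumes simple "name = value" lines with optional "!" comments (such as might appear in a user_nl_eam file, which is itself an assumption about where the entries are set), and it does not handle values that contain "!" or that span multiple lines.

```python
import re

# The four file entries listed above plus linoz_data_path.
REQUIRED_ENTRIES = (
    "chlorine_loading_file",
    "linoz_data_file",
    "linoz_data_path",
    "bndtvghg",
    "tracer_cnst_file",
)


def parse_namelist_assignments(text):
    """Return {name: value} for simple 'name = value' lines in namelist text."""
    entries = {}
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop trailing comments
        match = re.match(r"^([A-Za-z_]\w*)\s*=\s*(.+?)\s*,?$", line)
        if match:
            # Names are stored in lower case; surrounding quotes are removed.
            entries[match.group(1).lower()] = match.group(2).strip("'\"")
    return entries


def missing_entries(text, required=REQUIRED_ENTRIES):
    """List the required entries that are not assigned in the namelist text."""
    entries = parse_namelist_assignments(text)
    return [name for name in required if name not in entries]
```

An empty list from missing_entries means that all five entries are assigned;  it says nothing about whether the files they point to exist or are correct for SSP370.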

chlorine_loading_file and linoz_data_file

For both chlorine_loading_file and linoz_data_file, the UCI chemistry box model (we’ll denote this BOXMODEL) provided by @Michael J Prather and @Philip Cameron-Smith was used to process the raw input4mips files into a form usable by E3SMv2.  It is recommended that the user download the BOXMODEL directory from the weblink using either http downloads or wget (see, e.g., this note on wget). We will first create the chlorine_loading_file following instructions in $BOXMODEL/Linoz_Input/CMIP6_derived_files/README.txt.

  • NOTE:  The steps outlined below have already been completed as part of the process to create the SSP370 compset.  The user may wish to repeat the steps independently, or leverage what already exists in the $BOXMODEL workspace and customize existing files as needed.

  • Step 1: Create a combined GHG concentration file.  The instructions point to two directories:
    (a) ../../GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0/input4MIPS_data/full_set_of_historical_GHG_files/
    and
    (b) ../../GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0/input4MIPS_data/full_set_of_SSP585_GHG_files/

    • For (a):  No action is required for historical data.  In the (a) path, there is a .csh script and many subdirectories, each a path to a single netCDF file containing one GHG species specific to historical conditions.  The .csh script reads all the individual .nc files and combines them into a single file, and output, which already exists as part of the provided $BOXMODEL workspace, is located here: $BOXMODEL/GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0/input4MIPS_data/full_set_of_historical_GHG_files/combined_GHG_concentrations_CMIP6.nc.

    • In (b), there is a .csh script and many subdirectories, each a path to a single netCDF file containing one GHG species specific to SSP585.  The .csh script reads all the individual .nc files and combines them into a single file:  $BOXMODEL/GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0/input4MIPS_data/full_set_of_SSP585_GHG_files/combined_GHG_concentrations_SSP585_2015-2500.nc.

    • For SSP370, two steps must be taken:

      • (1) Raw input4mips SSP370 data files must be downloaded – see the list of files in $BOXMODEL/GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0/input4MIPS_data/full_set_of_SSP370_GHG_files.

      • (2) A SSP370 analog to $BOXMODEL/GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0/input4MIPS_data/full_set_of_SSP585_GHG_files/combine_GHG_concentrations.csh must be created.  Note that in the SSP370 example version of $BOXMODEL/GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0/input4MIPS_data/full_set_of_SSP370_GHG_files/combine_GHG_concentrations.csh, the variable file_list is slightly different than the SSP585 version in order to accommodate the different input4mips file organization in SSP370.

  • Step 1b: For convenience, create a symbolic link (symlink) from this directory to the combined GHG file (optional)

    • Create a symlink to the "combined" SSP370 file:

    • There was already a symlink to the historical "combined" file, so no action required.

  • Step 2: Generate input concentration file for PRATMO

    • From $BOXMODEL/Linoz_Input/CMIP6_derived_files:

      Output to: ./LINOZ_E3SM/data/CMIP6_ghg.dat

    • Modify file Extract_SSP585_GHG_for_PRATMO.pro:

      (Change "SSP585" to "SSP370" in several instances.)

    • Run the "SSP370" version of the script:

      Output to: ./LINOZ_E3SM/data/CMIP6_ghg_SSP370.dat

  • Step 3: Run PRATMO first for the historical period and then for SSP370.  Detailed instructions for running PRATMO as part of step 3 only can be found in $BOXMODEL/Linoz_Input/CMIP6_derived_files/LINOZ_E3SM/README.txt -- note:  this is a separate file from the previous README.txt mentioned above.

    • For part (1) in $BOXMODEL/Linoz_Input/CMIP6_derived_files/LINOZ_E3SM/README.txt, no action is needed, assuming that the listed fortran executables have been properly set.

    • For part (2), it is recommended to create multiple copies of bctmx.f, one for each configuration.  For example, the user could create bctmx.f.Historical containing settings for the historical GHG configuration, bctmx.f.SSP370 containing settings for the SSP370 configuration, and so on.  Whenever PRATMO is run, simply copy the desired configuration file to bctmx.f, which is the only version that PRATMO will use.  Note: It is recommended to use the same procedure for batmo.f as well (see “part 3” below).  First, check bctmx.f.Historical near L376 to ensure that the following line is being used:

      Then copy the "Historical" version of the file to the "active" bctmx.f version to be compiled:

    • For part (3), make copies of batmo.f (see note in part 2 above):  batmo.f.Historical, batmo.f.SSP370, etc.  Check batmo.f.Historical to ensure that:

      • L6:  character*23 FNAME

      • L29:  iyear spans 1845,2015,5

      • L35:  fname='init_fspecies_0000'

      • L168:  CHARACTER*22 FNAME

      • L182:  fname='linoz0000_2010jpl'
        Copy batmo.f.Historical to batmo.f, the version of the file to be compiled.

    • For part (4), compile the executable:

      You may get a few warnings, but these should not be fatal.  Then, run the script.  NOTE:  The script takes ~3.5 hours to finish, so one option is to run the script on a login node using tmux, which is supported on most supercomputing centers and otherwise is freely available:

      Output is:  $BOXMODEL/Linoz_Input/CMIP6_derived_files/LINOZ_E3SM/... 
      ...init_fspecies_1845 to init_fspecies_2015  (linoz v2 table)
      ...linoz1845_2010jpl to linoz2015_2010jpl  (output abundances of long-lived species in mole/mole as a function of lat, mon, and z)
      See $BOXMODEL/Linoz_Input/CMIP6_derived_files/LINOZ_E3SM/README.txt for a description of outputs.

    • REPEAT parts (2-4) but for SSP370:

      • SSP370 part (2): Check bctmx.f.SSP370 near L376 to ensure that the following line is used:

        Copy bctmx.f.SSP370 to bctmx.f.

      • SSP370 part (3): Check batmo.f.SSP370 to ensure that:

        • L6:  character*25 FNAME

        • L29:  iyear spans 2020,2500,5

        • L35:  fname='init_fspecies_0000_SSP370'

        • L168:  CHARACTER*24 FNAME

        • L182:  fname='linoz0000_2010jpl_SSP370'
          Copy batmo.f.SSP370 to batmo.f.

      • SSP370 part (4):  Compile the code...

        You may get a few warnings, but these should not be fatal.  Run the script.  NOTE:  The script takes ~8 hours to finish, so it is highly recommended that the user either submit a batch job (sample batch script for NERSC-Cori is available at $BOXMODEL/Linoz_Input/CMIP6_derived_files/LINOZ_E3SM/linoz_file_generator_SSP370_BATCH) or use a tmux window (see 'historical' above for example). The output is:
        $BOXMODEL/Linoz_Input/CMIP6_derived_files/LINOZ_E3SM/...
        .../init_fspecies_2020_SSP370 to init_fspecies_2500_SSP370  (Linoz v2 table)
        .../linoz2020_2010jpl_SSP370 to linoz2500_2010jpl_SSP370  (output abundances of long-lived species in mole/mole as a function of lat, mon and z).
        See $BOXMODEL/Linoz_Input/CMIP6_derived_files/LINOZ_E3SM/README.txt for a description of outputs.
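
Because the PRATMO runs in Step 3 take hours, it can help to script the "copy the desired configuration to the active file" pattern and to check afterwards that every expected output file exists.  The sketch below is illustrative only: the function names are hypothetical, and it assumes (from the output listings above) that the 0000 placeholder in fname is replaced by each value of iyear.

```python
import shutil
from pathlib import Path


def activate_config(workdir: Path, config: str) -> None:
    """Copy bctmx.f.<config> and batmo.f.<config> over the active sources."""
    for stem in ("bctmx.f", "batmo.f"):
        shutil.copyfile(workdir / f"{stem}.{config}", workdir / stem)


def expected_outputs(first_year: int, last_year: int, step: int = 5,
                     suffix: str = "") -> list:
    """List the output file names implied by the iyear loop bounds."""
    names = []
    for year in range(first_year, last_year + 1, step):
        names.append(f"init_fspecies_{year:04d}{suffix}")
        names.append(f"linoz{year:04d}_2010jpl{suffix}")
    return names


def missing_outputs(workdir: Path, names: list) -> list:
    """Return the expected names that are not present in workdir."""
    return [name for name in names if not (workdir / name).is_file()]
```

For the historical run one would check expected_outputs(1845, 2015), and for SSP370 expected_outputs(2020, 2500, suffix="_SSP370").  An empty list from missing_outputs suggests the run completed.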

  • Step 4:  Convert PRATMO output to E3SM input format (return to $BOXMODEL/Linoz_Input/CMIP6_derived_files/README.txt)

    • This process is slightly different from the Fortran files above because we're not swapping in/out modified scripts.  Instead, there are independent scripts here for both historical and SSP* configurations.

    • Run the following from within IDL (cannot be run from shell prompt):

      • .run Convert_UCI_data_to_CESM_input.pro

        • Set input_dir   = './LINOZ_E3SM/'

        • Output to: $BOXMODEL/Linoz_Input/CMIP6_derived_files/Data_for_E3SM/linoz1850-2015_2010JPL_CMIP6_10deg_58km_cYYYYMMDD.nc       ; use current date

      • .run Convert_UCI_data_to_CESM_input_SSP370.pro

        • Ensure that the following lines are set:

          • L45:  output_file = 'linoz_2020-2500_CMIP6_SSP370_10deg_58km_cYYYYMMDD.nc'       ; use current date

          • L84:  file_in = 'linoz'+Year_String+'_2010jpl_SSP370'

        • Set input_dir   = './LINOZ_E3SM/'

        • Output to:  $BOXMODEL/Linoz_Input/CMIP6_derived_files/Data_for_E3SM/linoz_2020-2500_CMIP6_SSP370_10deg_58km_cYYYYMMDD.nc       ; use current date

      • The following note appears in the README file at this point:
        # For the SSP585 data, the last years (2014-17) from the historical file were included in the Linoz_SSP585 file.
        This issue, also noted in Step 6 of the README, becomes irrelevant because we will be combining both historical and SSP* Linoz files.  Philip Cameron-Smith comments:  "The issue is that air from the troposphere takes time to mix up into the stratosphere.  Hence, the historical GHG concentrations in the troposphere should be shifted by 3 years as a crude method to provide the GHG concentrations in the stratosphere.  Historical GHGs from CMIP6 go up to 2014.  Hence, the corresponding stratospheric GHG concentrations go up to 2017.  And the SSP370 will provide stratospheric GHG concentrations starting in 2018.  Hence, when creating the E3SM Linoz files, the last part of the historical Linoz data files needs to be grafted on to the start of the SSP370 linoz files.  However, if you create a single combined linoz data file then this is no longer an issue."
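
The 3-year tropospheric-to-stratospheric shift described in the quote above explains the year ranges that appear in the chlorine loading file names used later (0003-2017 and 2018-2503).  A small worked sketch of that arithmetic follows.  The start year of 0 for the historical GHG record is an assumption inferred from the "0003" in the file name, not something stated in the README.

```python
STRAT_LAG_YEARS = 3  # crude troposphere-to-stratosphere mixing delay


def stratospheric_span(tropospheric_span, lag=STRAT_LAG_YEARS):
    """Shift a (first_year, last_year) tropospheric span by the lag."""
    first, last = tropospheric_span
    return first + lag, last + lag


# Historical CMIP6 GHGs end in 2014 (start year 0 is assumed here).
historical = stratospheric_span((0, 2014))
# SSP370 GHGs cover 2015-2500.
ssp370 = stratospheric_span((2015, 2500))

print("historical stratospheric span:", historical)
print("SSP370 stratospheric span:", ssp370)
```

The SSP370 span begins the year after the historical span ends, which is why merging the two files removes the need to graft historical years onto the scenario file.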

  • Step 5: Generate the Chlorine_loading input file for E3SM

    • idl Create_chlorine_loading_file.pro

      • Inputs from: './combined_GHG_concentrations_CMIP6.nc'   (this is the symlink created in Step 1b above)

      • Outputs to: './Data_for_E3SM/Linoz_Chlorine_Loading_CMIP6_0003-2017_cYYYYMMDD.nc'       ; use current date

    • idl Create_chlorine_loading_SSP370.pro: Ensure that the following lines are set:

      • L3:  ;   chlorine_loading for SSP370 scenario.

      • L14:  input_file  = 'combined_GHG_concentrations_SSP370_2015-2500.nc'

      • L16:  output_file = 'Linoz_Chlorine_Loading_CMIP6_SSP370_2018-2503_cYYYYMMDD.nc'       ; use current date

      • L25:  ghg_year       = FLOOR(time/365) + 1850    ; Note: CF time dimension in historical and SSP370 GHG files from CMIP6 are different.

      • L77:  NCDF_ATTPUT, file_id, /GLOBAL, "E3SM_SSP370", update author and date
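
The L25 expression above converts the SSP370 time coordinate to a calendar year.  The Python equivalent below is a sketch that assumes the time axis is in days since the start of 1850 on a 365-day calendar, which is what the IDL expression implies; as the inline note says, the historical file uses a different time convention, so this conversion should not be reused there without checking.

```python
import math


def ghg_year(days_since_1850: float) -> int:
    """Mirror of the IDL expression FLOOR(time/365) + 1850."""
    return math.floor(days_since_1850 / 365) + 1850


# A mid-year value 165 full years after 1850 falls in 2015.
print(ghg_year(165 * 365 + 182))
```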

  • Step 6: Add years from historical data. NOTE: because of the 3-year offset, it may be necessary to append data from other linoz data files, e.g., for CMIP6 future scenarios it is necessary to add at least the 2015 data from the historical inputdata.  The easiest thing is to combine the historical and future linoz files, to cover almost any time period.  In merge_linoz_Hist_SSP370.csh, ensure that the following lines are set:

    • L10:  set linoz_input1 = 'Data_for_E3SM/linoz1850-2015_2010JPL_CMIP6_10deg_58km_cYYYYMMDD.nc'    # use appropriate date matched to file name

    • L11:  set linoz_input2 = 'Data_for_E3SM/linoz_2020-2500_CMIP6_SSP370_10deg_58km_cYYYYMMDD.nc'    # use appropriate date matched to file name

    • L13:  set linoz_output = 'Data_for_E3SM/linoz_1850-2500_CMIP6_Hist_SSP370_10deg_58km_cYYYYMMDD.nc'    # use current date

    • L15:  set chlorine_input1 = 'Data_for_E3SM/Linoz_Chlorine_Loading_CMIP6_0003-2017_cYYYYMMDD.nc'    # use appropriate date matched to file name

    • L16:  set chlorine_input2 = 'Data_for_E3SM/Linoz_Chlorine_Loading_CMIP6_SSP370_2018-2503_cYYYYMMDD.nc'    # use appropriate date matched to file name

    • L18:  set chlorine_output = 'Data_for_E3SM/Linoz_Chlorine_Loading_CMIP6_Hist_SSP370_0003-2503_cYYYYMMDD.nc'    # use current date

    • To run: ./merge_linoz_Hist_SSP370.csh

    • Outputs:

      • $BOXMODEL/Linoz_Input/CMIP6_derived_files/Data_for_E3SM/linoz_1850-2500_CMIP6_Hist_SSP370_10deg_58km_cYYYYMMDD.nc (this represents EAM namelist entries linoz_data_path and linoz_data_file)

      • $BOXMODEL/Linoz_Input/CMIP6_derived_files/Data_for_E3SM/Linoz_Chlorine_Loading_CMIP6_Hist_SSP370_0003-2503_cYYYYMMDD.nc (this is EAM namelist entry chlorine_loading_file)
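
Before running the merge in Step 6, a quick sanity check on the year ranges embedded in the input file names can catch a mismatched pair of files.  The sketch below is a hypothetical helper, not part of the BOXMODEL scripts; it only parses the first "YYYY-YYYY" pattern in each name and does not open the netCDF files.

```python
import re

SPAN = re.compile(r"(\d{4})-(\d{4})")


def year_span(filename: str):
    """Return (first_year, last_year) from the first YYYY-YYYY in a name."""
    match = SPAN.search(filename)
    if match is None:
        raise ValueError(f"no year span found in {filename!r}")
    return int(match.group(1)), int(match.group(2))


def merged_span(first_file: str, second_file: str):
    """Span covered by merging two files, requiring the second to start later."""
    a_first, a_last = year_span(first_file)
    b_first, b_last = year_span(second_file)
    if b_first <= a_last:
        raise ValueError("second file starts before the first file ends")
    return a_first, b_last
```

With the names above, the chlorine inputs give (3, 2503) and the linoz inputs give (1850, 2500), matching the spans in the output file names.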

  • Step 7:  Plot data from E3SM files to check for oddities. A copy of Plot_Linoz_SSP5_8.5.pro was saved to Plot_Linoz_SSP3_7.0.pro and the following changes were made:

    • L18:  file_Chlorine = file_dir+'Linoz_Chlorine_Loading_CMIP6_Hist_SSP370_0003-2503_cYYYYMMDD.nc'     # use appropriate date matched to file name

    • L19:  file_Linoz    = file_dir+'linoz_1850-2500_CMIP6_Hist_SSP370_10deg_58km_cYYYYMMDD.nc'     # use appropriate date matched to file name

    • Additional minor changes were made to add the input file names to the plot titles

    • To run: idl Plot_Linoz_SSP3_7.0.pro

    • An example (embarrassingly simple) Python plot script is attached (check_GHG.py).

 

bndtvghg file

File bndtvghg contains the prescribed time evolution of global GHGs (CO2, CFCs f11 and f12, CH4, and N2O).  We’ll again use the UCI chemistry box model (denoted BOXMODEL).  It is recommended that the user follow along in:   $BOXMODEL/GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0/README.txt.

  • NOTE:  The steps outlined below have already been completed as part of the process to create the SSP370 compset.  The user may wish to repeat the steps independently, or leverage what already exists in the $BOXMODEL workspace and customize existing files as needed.

  • Step 1:  Run convert_GHG_input4MIPS_to_E3SM to convert SSP370 GHGs from CMIP6 to E3SM format.

    • cd $BOXMODEL/GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0

    • On NERSC-Cori, prior to running this script, activate the E3SM unified software environment and load the IDL module:

    • For SSP370, file convert_GHG_input4MIPS_to_E3SM_for_SSP3_7.0 already exists.  If one were to modify this file for other compsets, the following lines should be checked:

      • L3 (SSP designation)

      • L14 (input_dir)

      • L15-19:  Check that the species file names are consistent with what appears in input_dir.  Best to copy-and-paste full file name, in case not just the "ssp" part changes!

      • L64-68:  This if-block related to $pjc_HOST is commented out, assuming that NERSC-Cori is being used and that e3sm_unified has been activated.  The user may need to customize this.

      • L70:  Again, commented out module load idl in favor of loading IDL module prior to running the script.

      • L71 (SSP designation)

    • BEFORE RUNNING convert_GHG_input4MIPS_to_E3SM_for_SSP3_7.0, ensure that settings in Add_date_for_SSP3_7.0.pro are correct:

      • cd $BOXMODEL/GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0

      • File Add_date_for_SSP3_7.0.pro already exists.  If one were to modify this file for other compsets, the following lines should be checked:

        • L6 & L17:  Check SSP designation, also change date stamp

    • Run shell script:

      • Inputs:  The relevant input4mips species files in $BOXMODEL/GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0/input4MIPS_data/SSP3_7.0_v1.2.1/*.nc

      • cd $BOXMODEL/GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0

      • ./convert_GHG_input4MIPS_to_E3SM_for_SSP3_7.0

      • Outputs: The shell script creates a combined output file ./Data_for_E3SM/temp4.nc, which is then modified by Add_date_for_SSP3_7.0.pro.  The final output file is ./Data_for_E3SM/GHG_CMIP_SSP370-1-2-1_Annual_Global_2015-2500_cYYYYMMDD.nc    # Use current date  (this is the EAM namelist entry bndtvghg)
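
Many file names on this page end in a cYYYYMMDD creation stamp that must be updated by hand ("use current date").  A minimal sketch of generating such a name is shown below; the helper stamped_name is hypothetical and is not part of the BOXMODEL scripts.

```python
from datetime import date


def stamped_name(stem: str, when: date = None, ext: str = ".nc") -> str:
    """Append a cYYYYMMDD creation stamp (today by default) to a file stem."""
    when = when or date.today()
    return f"{stem}_c{when:%Y%m%d}{ext}"


print(stamped_name("GHG_CMIP_SSP370-1-2-1_Annual_Global_2015-2500",
                   date(2021, 5, 9)))
```

With the date shown, this reproduces the bndtvghg file name that was used for the SSP370 EAM use case.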

  • Steps 2 & 3 can be skipped for SSP370.

  • Step 4:  Plot GHG values:  This is only for checking the generated datasets.  This plot routine is set up to be run from within IDL.  See the end of the script to change it so it will run from command line.

    • cd $BOXMODEL/GHG_concentrations/CMIP6_DECK_GHG_annual-means_v1.2.0

    • File Plot_GHG_SSP3_7.0.pro already exists.  If one were to modify this file for other compsets/uses, the following lines should be checked:

      • L7:  SSP designation for script name

      • L18:  SSP designation for input file name -and- modify date stamp as needed

    • Output:  X-window plots (plots to screen)

  • Step 5:  extract_* scripts were not needed for SSP370.

tracer_cnst_file

File tracer_cnst_file contains tracer constituents such as ozone (O3), halons other than CFC11 and CFC12 (HALONS), HO2, NO3, OH, and H2O2.  These fields are used for aerosol oxidation.  Critical notes:

  • For SSP585 in E3SMv1, concentrations were provided by NCAR (WACCM) with one missing species (H2O2) coming from another internal NCAR data source.  O3 was replaced with input4MIPS data.

  • For SSP370, we start with the SSP585 tracer_cnst_file and only replace O3 with input4MIPS data for the SSP370 scenario, keeping all other oxidants the same as SSP585.

The process to create the SSP370 tracer_cnst_file for E3SMv2 is as follows:

  • Workspace: https://web.lcrc.anl.gov/public/e3sm/e3sm_support/compset_generation/ssp370_ssp585/atm_ghg/chem_tracer_cnst_file/, which we’ll denote TRACERROOT. Again, it is recommended that the user download the TRACERROOT directory from the weblink using either http downloads or wget (see, e.g., this note on wget).

  • We will use the SSP585 version of tracer_cnst_file in SSP370 for all gas species except O3, which instead will come from the raw input4mips data.  A copy of the existing SSP585 version of tracer_cnst_file was put in the local working directory:

  • Ozone input4mips data:  The ozone input4mips data file has already been downloaded to $TRACERROOT/CMIP6_DECK_Ozone.  If the user had to retrieve this ozone input4mips data file, the following steps could be used:

    • On , enter "UReading-CCMI-ssp370" in the search box as this text was in the SSP585 ozone ('oxid') file used in E3SM (as deduced from the use case)

    • Downloaded collection:  input4MIPs.CMIP6.ScenarioMIP.UReading.UReading-CCMI-ssp370-1-0.atmos.mon.vmro3.gn

      • This collection contains two files:

        • (1) vmro3_input4MIPs_ozone_ScenarioMIP_UReading-CCMI-ssp370-1-0_gn_201501-204912.nc

        • (2) vmro3_input4MIPs_ozone_ScenarioMIP_UReading-CCMI-ssp370-1-0_gn_205001-209912.nc

      • Click on "WGET Script" under collection name

      • Save the wget script locally, scp it to NERSC-Cori

      • Run wget script on NERSC-Cori:

  • From $TRACERROOT/CMIP6_DECK_Ozone/README.txt, the Historical oxidants section at the top may be skipped when creating the SSP370 compset.  For the SSP5_8.5 future oxidants section, SSP370 versions of the following scripts and their associated output files have already been created, but the user should check that these files exist and could customize them as needed:

    • (1) cp merge_regrid_SSP585_files.csh merge_regrid_SSP370_files.csh

      • Several changes to merge_regrid_SSP370_files.csh were made, mostly changing "SSP585" to "SSP370" and commenting out several unnecessary tasks... best seen by:

      • Script merge_regrid_SSP370_files.csh was run from $TRACERROOT/CMIP6_DECK_Ozone/CMIP6_DECK_Ozone:

        • Output: $TRACERROOT/CMIP6_DECK_Ozone/merged_regridded_data/O3_input4MIPS_SSP370_1.9x2.5x66L_2015-2099.nc

    • (2) cp Combine_oxidants_SSP585.pro Combine_oxidants_SSP370.pro

      • A copy of the 'oxid' template file was made, renaming it with the new date stamp (note that “SSP370” was added for clarity):

      • IDL script Combine_oxidants_SSP370.pro was modified to add header comments, substitute in the SSP370 O3 input file created by running merge_regrid_SSP370_files.csh, and add explanatory notes as global variables to the final output file

      • Run:

        • Output (this is the file that can be used for SSP370, see E3SM SSP370: configuration settings next):   $TRACERROOT/CMIP6_DECK_Ozone/Data_for_E3SM/oxid_SSP370_1.9x2.5_L70_2015-2100_cYYYYMMDD.nc

 

E3SM SSP370: Configuration settings

As noted at the beginning of this document, important information about which files must be modified to create the new SSP370 compset can be obtained by recursively searching for instances of 'ssp585' in $E3SMROOT using, e.g., grep -r -i ssp5 $E3SMROOT. Here, we highlight recommended steps to create or modify files needed for the SSP370 compset.  As the SSP370 compset has already been created for E3SMv2, the user does not need to repeat these steps, but they are presented as reference in the event that new customizations or compsets must be created.
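
For readers who prefer to script the recursive search, the following is a rough Python counterpart to grep -r -i ssp5 $E3SMROOT.  It is a sketch only: the function name is hypothetical, it reports file paths rather than matching lines, and skipping .git directories is a convenience added here that plain grep -r does not do.

```python
import os


def files_mentioning(root: str, needle: str = "ssp5") -> list:
    """Return sorted paths under root whose contents contain needle (any case)."""
    needle = needle.lower()
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune version-control metadata in place so os.walk skips it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as handle:
                    if any(needle in line.lower() for line in handle):
                        hits.append(path)
            except OSError:
                continue  # unreadable file or broken link
    return sorted(hits)
```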

Creation of SSP370 "use case" files

"Use case" files are simply collections of namelist entries for a particular compset.  They serve as the first update to namelist entries from their hardwired default values but, importantly, use case settings are superseded by any user_nl_* settings.  It is important to remember this order of operations when configuring the model!

EAM SSP370 use case

What was done to create the SSP370 EAM use case file:

  • Copy the SSP585 use case file to a SSP370 version:

  • In $E3SMROOT/components/eam/bld/namelist_files/use_cases/SSP370_cam5_CMIP6.xml, change the following – NOTE: File date stamps below correspond to the values used to create SSP370:

    • solar_data_*:  Keep as is

    • bndtvghg

      • OLD (SSP585):  $INPUTDATA/atm/cam/ggas/GHG_CMIP_SSP585-1-2-1_Annual_Global_2015-2500_c20190310.nc

      • NEW (SSP370):  $INPUTDATA/atm/cam/ggas/GHG_CMIP_SSP370-1-2-1_Annual_Global_2015-2500_c20210509.nc

    • prescribed_volcaero_*

      • Keep as is.  These settings/files will not change between SSP585 and SSP370, per Hailong Wang:  "[This] file is not specific to SSP585 or model version, so they should be the same for SSP370 and in v2."

    • fsnowoptics

      • SSP585 set to:  lnd/clm2/snicardata/snicar_optics_5bnd_mam_c160322.nc

      • Keep as is:  Hailong said this file applies to both historical and future conditions and there is no need to change it from the SSP585 setting.

    • ext_frc_specifier & srf_emis_specifier

      • ext_frc_specifier  (SSP370 settings)

      • srf_emis_specifier  (SSP370 settings)

        • NOTE:  dms_emis_file will not change between SSP585 and SSP370, per Hailong Wang:  "[This] file is not specific to SSP585 or model version, so they should be the same for SSP370 and in v2.”

    • tracer_cnst_file

      • OLD:  <tracer_cnst_file    >oxid_1.9x2.5_L70_2015-2100_c20190421.nc</tracer_cnst_file>

      • NEW:  <tracer_cnst_file    >oxid_SSP370_1.9x2.5_L70_2015-2100_c20211006.nc</tracer_cnst_file>

      • Commands for staging to $INPUTDATA:

    • mam_mom_*

      • These settings/files will not change between SSP585 and SSP370, per Hailong Wang:  "[This] file is not specific to SSP585 or model version, so they should be the same for SSP370 and in v2.”

    • chlorine_loading_file

      • OLD:  <chlorine_loading_file      >atm/cam/chem/trop_mozart/ub/Linoz_Chlorine_Loading_CMIP6_Hist_SSP585_0003-2503_c20190414.nc</chlorine_loading_file>

      • NEW:  <chlorine_loading_file      >atm/cam/chem/trop_mozart/ub/Linoz_Chlorine_Loading_CMIP6_Hist_SSP370_0003-2503_c20210202.nc</chlorine_loading_file>

      • Commands for staging to $INPUTDATA:

    • linoz_data_file

      • OLD:  <linoz_data_file            >linoz_1850-2500_CMIP6_Hist_SSP585_10deg_58km_c20190414.nc</linoz_data_file>

      • NEW:  <linoz_data_file            >linoz_1850-2500_CMIP6_Hist_SSP370_10deg_58km_c20210202.nc</linoz_data_file>

      • Commands for staging to $INPUTDATA:

    • sim_year:  Should be set to 2015-2100 for SSP*

    • For SSP370 in E3SMv2, all other EAM use case settings were kept as is.
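
The OLD/NEW substitutions listed above were made by hand, but the same edits can be sketched programmatically.  The example below is an illustration under stated assumptions: the root element name namelist_defaults and the three-entry snippet are stand-ins for the real use case file, and retarget_use_case is a hypothetical helper.  Note that ElementTree does not preserve comments or the padding inside tags, so a text editor remains the safer choice for the real file.

```python
import xml.etree.ElementTree as ET

# Stand-in for part of the SSP585 use case file (root element name assumed).
SSP585_SNIPPET = """<namelist_defaults>
<tracer_cnst_file    >oxid_1.9x2.5_L70_2015-2100_c20190421.nc</tracer_cnst_file>
<chlorine_loading_file      >atm/cam/chem/trop_mozart/ub/Linoz_Chlorine_Loading_CMIP6_Hist_SSP585_0003-2503_c20190414.nc</chlorine_loading_file>
<linoz_data_file            >linoz_1850-2500_CMIP6_Hist_SSP585_10deg_58km_c20190414.nc</linoz_data_file>
</namelist_defaults>"""

# NEW values taken from the list above.
SSP370_VALUES = {
    "tracer_cnst_file": "oxid_SSP370_1.9x2.5_L70_2015-2100_c20211006.nc",
    "chlorine_loading_file": "atm/cam/chem/trop_mozart/ub/"
        "Linoz_Chlorine_Loading_CMIP6_Hist_SSP370_0003-2503_c20210202.nc",
    "linoz_data_file": "linoz_1850-2500_CMIP6_Hist_SSP370_10deg_58km_c20210202.nc",
}


def retarget_use_case(xml_text: str, replacements: dict) -> str:
    """Replace the text of every element whose tag is a key in replacements."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.tag in replacements:
            elem.text = replacements[elem.tag]
    return ET.tostring(root, encoding="unicode")


print(retarget_use_case(SSP585_SNIPPET, SSP370_VALUES))
```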

ELM SSP370 use case

What was done to create the SSP370 ELM use case file – NOTE: File date stamps below correspond to the values used to create SSP370:

  • Copy the SSP585 use case file to a SSP370 version:

  • In $E3SMROOT/components/elm/bld/namelist_files/use_cases/2015-2100_SSP370_transient.xml, change/check the following:

    • fsurdat

      • NEW setting – this should be the surfdata* file produced from section ELM land use files above.

    • flanduse_timeseries

      • NEW setting – this should be the landuse.timeseries* file produced from section ELM land use files above.

    • The following namelist settings should be set in the ELM use case file:

    • For SSP370 in E3SMv2, all other ELM use case settings were kept as is.
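
The order of operations described at the start of this section (hardwired defaults first, then use case settings, then user_nl_* settings) can be pictured as successive dictionary updates.  This is a conceptual sketch only and is not how the E3SM build scripts are implemented; the setting values are placeholders.

```python
def resolve_namelist(defaults: dict, use_case: dict, user_nl: dict) -> dict:
    """Apply settings in increasing priority: defaults, use case, user_nl."""
    resolved = dict(defaults)
    resolved.update(use_case)   # use case overrides hardwired defaults
    resolved.update(user_nl)    # user_nl_* overrides the use case
    return resolved


# Placeholder values for illustration.
defaults = {"bndtvghg": "default_ghg.nc", "sim_year": "2000"}
use_case = {"bndtvghg": "ssp370_ghg.nc", "sim_year": "2015-2100"}
user_nl = {"bndtvghg": "my_custom_ghg.nc"}

print(resolve_namelist(defaults, use_case, user_nl))
```

Here bndtvghg ends up with the user_nl value, while sim_year keeps the use case value because the user did not override it.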

Source code modifications (SSP370)

The following are modifications that were made to the existing E3SMv2 code base to create the SSP370 compset.  The user need not make these modifications as SSP370 is now a supported compset in E3SMv2, but they are presented here as a reference in the event that new customizations or compsets must be created.

  • $E3SMROOT/driver-mct/cime_config/config_component_e3sm.xml:

    • Emulating SSP585 entries, add 2 lines:

    • Support for BGC version(s) of SSP370 is NOT included at the time this document was created, but may be added in the future.

  • $E3SMROOT/components/eam/cime_config/config_component.xml:

    • Emulating SSP585 entries, add 1 line:

  • $E3SMROOT/components/elm/cime_config/config_component.xml: Add:

  • $E3SMROOT/components/elm/bld/namelist_files/namelist_defaults_clm4_5_tools.xml, update the description of available SSPs to read: “For transient LULCC only 3 RCPs are currently available: SSP2 RCP4.5, SSP3 RCP7.0, and SSP5 RCP8.5 (LUT files need to be created for the others)“. Did not need to make any other changes because file lists for SSP370 were already present.

  • $E3SMROOT/components/data_comps/datm/cime_config/config_component.xml: No changes needed, SSP370 entries were already staged.

  • $E3SMROOT/components/data_comps/datm/cime_config/namelist_definition_datm.xml: No changes needed – everywhere there was a SSP585 reference or file path, there was also one for SSP370.

  • $E3SMROOT/driver-moab/cime_config/config_compsets.xml: No changes needed.

  • $E3SMROOT/driver-moab/cime_config/config_component_e3sm.xml: Emulating SSP585 entries, add 2 lines:

    • Support for BGC version(s) of SSP370 is NOT included at the time this document was created, but may be added in the future.

  • $E3SMROOT/cime_config/allactive/config_compsets.xml:

    • To emulate the non-BGC SSP585 compset, add:

    • To emulate the non-BGC SSP585 compset, add: <value  compset="SSP370.*_EAM">2015-01-01</value>

 

E3SM SSP370:  Running the model

The standard SSP* compset is designed to be run in "hybrid" mode in which the simulation branches from an existing fully coupled historical simulation.  CMIP6-style SSP* simulations are initialized at 2015-01-01, which corresponds to the end of the historical run.  An example run script (based on the template run script from an April 2022 clone of the E3SMv2 maintenance branch), which includes appropriate settings for some of the XML files, is attached.  Additional details can be accessed at the E3SM Documentation page.

 

 

Creation of E3SM SSP585 input files

ELM land use files (SSP585)

As for SSP370, SSP585 versions of ELM input files fsurdat and fdyndat, which represent the land cover properties (fsurdat = surfdata*.nc) and temporal land use changes (fdyndat = landuse.timeseries*.nc), must be created.  Note again that these files are dependent on the horizontal grid on which the simulation is run.  @Alan Di Vittorio has created a descriptive and highly informative summary of steps needed to complete the ELM setup.  Below are the key steps, with reference to Alan's page.  For all steps, do NOT have the E3SM unified software environment activated.

  1. Check if the desired ELM input files might already exist in $INPUTDATA/lnd/clm2/surfdata_map.  Search for files of the form sspN_rcpN.N, for example ssp5_rcp8.5.

  2. If the desired ELM input files do not yet exist, the same ELM tools/scripts used to create the SSP370 ELM input files can also be used to create those for SSP585.  See instructions in section ELM land use files (SSP370) above.

  3. For SSP585, the preliminary files (LUT_LUH2_*) necessary to create the final fsurdat and fdyndat ELM input files had already been staged by ELM developers in:  $INPUTDATA/lnd/clm2/rawdata/LUT_LUH2_SSP5_RCP85_LUH1f_07082020.  Although these preliminary files existed, additional steps were needed to convert them to the appropriate format and horizontal grid for E3SMv2 -- see next.

  4. For SSP585, the "file list" text file had already been created, see $INPUTDATA/lnd/clm2/rawdata/LUT_LUH2_SSP5_RCP85_LUH1f_07082020/LUT_LUH2_SSP5_RCP85_LUH1f_list.txt.  If edits must be made to the filelist file, recall that the year stamp MUST be placed exactly at character 197.

  5. Repeat step 5 in ELM land use files (SSP370), making the following edits for SSP585:

    1. When running mksurfdata.pl in "debug" mode (-d), use the following command (the -rcp option was changed):

    2. When editing the resulting namelist file, consider the following:

      1. mksrf_fvegtyp (input):  This should point to the LUT* land cover file corresponding to the first year of the simulation (in our SSP* example case, 2015, which is considered a "historical" year):  mksrf_fvegtyp  = '/global/cfs/cdirs/e3sm/inputdata/lnd/clm2/rawdata/LUT_LUH2_HIST_LUH1f_07082020/LUT_LUH2_historical_2015_c07082020.nc'

      2. fsurdat (output):  The name of the created land cover file (a file containing a single year of land cover data in 12 monthly time steps) corresponding to the -years option and representing the first year of the simulation.  For the developed SSP585 compset for E3SMv2,  the entry for fsurdat (<fsurdat>lnd/clm2/surfdata_map/surfdata_ne30np4.pg2_simyr1850_2015_c211105.nc </fsurdat>) represents land surface conditions for 1850 instead of the preferred 2015 conditions.  This is a known oversight, but for reasons described here this error has negligible impact on standard SSP585 simulations run using the RUN_TYPE = 'hybrid'.

      3. fsurlog (output):  A verbose logfile for fsurdat:  fsurlog  = 'surfdata_ne30np4.pg2_SSP5_RCP85_simyr2015_cYYMMDD.log'

      4. mksrf_fdynuse (input):  This should point to the "filelist" file that lists LUT* files to be used in the E3SM simulation.  For a SSP* run, the listed files should span 2015-2100 inclusive:  mksrf_fdynuse  = '/global/cfs/cdirs/e3sm/inputdata/lnd/clm2/rawdata/LUT_LUH2_SSP5_RCP85_LUH1f_07082020/LUT_LUH2_SSP5_RCP85_LUH1f_list.txt'

      5. fdyndat (output):  The name of the created land use file that will contain yearly data for each year of the E3SM simulation.  The yearly time span in the fdyndat filename should represent the span of years within mksrf_fdynuse:  fdyndat  = 'landuse.timeseries_ne30np4.pg2_SSP5_RCP85_simyr2015-2100_cYYMMDD.nc'

    3. Next, rerun ./mksurfdata.pl (not in debug mode) by supplying the newly created namelist to produce the output files:

      $ELMTOOLSROOT/components/elm/tools/clm4_5/mksurfdata_map> ./mksurfdata_map < namelist_ssp585_YYYYMMDD

    4. The output files are saved to the current directory ($ELMTOOLSROOT/components/elm/tools/clm4_5/mksurfdata_map).  If they will be used in E3SM production runs, and if the user is part of the 'e3sm' UNIX group, the surfdata* and landuse* files may be copied to $INPUTDATA/lnd/clm2/surfdata_map for wider use.  See E3SM SSP585:  Configuration settings for recommended namelist settings.

Further reading:  For additional details on creating the landuse* and surfdata* files on a new grid, see this site for more complete instructions.
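
The requirement in step 4 that the year stamp sit exactly at character 197 of each "file list" line is easy to break when editing by hand.  The sketch below shows one way to build and check such lines.  It assumes that column 197 is counted from 1 and that each line is a left-justified path padded with spaces up to the year; both are assumptions and should be verified against an existing file list before relying on them.  The function names are hypothetical.

```python
YEAR_COLUMN = 197  # 1-based column where the 4-digit year is assumed to start


def format_entry(path: str, year: int, column: int = YEAR_COLUMN) -> str:
    """Pad path with spaces so the year begins at the given column."""
    if len(path) >= column - 1:
        raise ValueError("path is too long to leave room before the year")
    return path.ljust(column - 1) + f"{year:04d}"


def has_year_at_column(line: str, column: int = YEAR_COLUMN) -> bool:
    """True if a 4-digit year starts at the given 1-based column."""
    field = line.rstrip("\n")[column - 1 : column + 3]
    return len(field) == 4 and field.isdigit()


# Hypothetical path used only to exercise the helpers.
entry = format_entry("/path/to/LUT_file_for_2015.nc", 2015)
print(has_year_at_column(entry))
```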

EAM aerosol emissions files (SSP585)

Because the aerosol emissions input files are independent of the spatial grid (they are interpolated to whatever spatial grid is used for the E3SM simulation), the same aerosol files used in E3SMv1 SSP* simulations can be used for v2 simulations.  It was found that these aerosol files already existed in $INPUTDATA/atm/cam/chem/trop_mozart_aero/emis/CMIP6_SSP585_ne30.  Once identified, the listing of these files was added to ext_frc_specifier (elevated emissions, aerosol production away from the surface) and srf_emis_specifier (surface emissions) in the SSP585 use case (see EAM SSP585 use case below).

EAM greenhouse gas/ozone/oxidation files (SSP585)

As with the EAM aerosol emissions files, the radiative forcing files are specific to a future SSP but are independent of the spatial grid.  It was found that the radiative forcing files associated with namelist entries chlorine_loading_file, linoz_data_file, bndtvghg, and tracer_cnst_file already existed and the paths and filenames were already included in the SSP585 E3SMv2 use case file.  No additional action was required.

 

E3SM SSP585: Configuration settings

As noted at the beginning of this document, important information about which files must be modified to create the new SSP585 compset can be obtained by recursively searching for instances of 'ssp585' in $E3SMROOT using, e.g., grep -r -i ssp5 $E3SMROOT.

Here, we highlight recommended steps to create or modify files needed for the SSP585 compset.  As the SSP585 compset has already been created for E3SMv2, the user does not need to repeat these steps, but they are presented as reference in the event that new customizations or compsets must be created.

Review of SSP585 "use case" files

"Use case" files are simply collections of namelist entries for a particular compset.  They serve as the first update to namelist entries from their hardwired default values but, importantly, use case settings are superseded by any user_nl_* settings.  It is important to remember this order of operations when configuring the model!

EAM SSP585 use case

No changes were required and the use case file that existed for E3SMv2 SSP585 was left unmodified.

ELM SSP585 use case

File $E3SMROOT/components/elm/bld/namelist_files/use_cases/2015-2100_SSP585_transient.xml already existed. Change/check the following – NOTE: File date stamps below correspond to the values used to create SSP585:

  • fsurdat

    • NEW setting – this should be the surfdata* file produced from section ELM land use files (SSP585) above:  <fsurdat>lnd/clm2/surfdata_map/surfdata_ne30np4.pg2_simyr1850_2015_c211105.nc </fsurdat>

  • flanduse_timeseries

    • NEW setting – this should be the landuse.timeseries* file produced from section ELM land use files (SSP585) above:  <flanduse_timeseries>lnd/clm2/surfdata_map/landuse.timeseries_ne30pg2_SSP3_RCP70_simyr2015-2100_c211015.nc</flanduse_timeseries>

  • The following namelist settings should be set in the ELM use case file:

  • For SSP585 in E3SMv2, all other ELM use case settings were kept as is.

Source code modifications (SSP585)

All changes to the source code files have already been made for SSP585 in E3SMv2.

E3SM SSP585:  Running the model

See E3SM SSP370: Running the model.

 

Definitions

YYYYMMDD: Standard 4-digit year, 2-digit month, and 2-digit day format used as a timestamp for model clone date and/or file creation date. In some instances for ELM input files, the date stamp convention changes to YYMMDD.

$INPUTDATA:  Path to E3SM input data directory.  On NERSC, this is /global/cfs/cdirs/e3sm/inputdata

$E3SMROOT = /path/to/E3SMv2_maint2.0_YYYYMMDD:  Path to local model workspace (to where E3SM was cloned), used for running E3SMv2 simulations. For production simulations, it is generally recommended to use the “maintenance” branch of E3SM (not “master”).

$ELMTOOLSROOT = /path/to/E3SMv2_master_YYYYMMDD:  Path to model workspace containing ELM tools (recommended to use latest “master” branch, in case this toolkit has been updated).  $E3SMROOT may be used instead, if sufficient disk space exists at that location.

 

List of potentially useful links and paths (as of May 2022)

  • High-level overview of compset creation (from 2016, some information may be obsolete!):

  • Public repository of selected GHG-related scripts and data files (BOXMODEL and TRACERROOT):

  • CMIP6 DECK compset overview: 

  • CMIP6 future compset notes: 

  • Future forcing dataset summary: 

  • E3SM horizontal grid resources

    • Repository of mapping files on NERSC:  /global/homes/z/zender/data/maps

    • Repository of grid files on NERSC:  /global/homes/z/zender/data/grids