Statement on ELM land use behavior for future projection simulations

Metadata

Origin: 9 May 2022
Document author(s): @Jim Benedict (LANL)
Contributors: @Alan Di Vittorio @Gautam Bisht @Wuyin Lin

Overview

This document summarizes the ELM configuration of SSP370 and SSP585 compsets for E3SMv2, a known (essentially inconsequential) discrepancy in the setting of one of the ELM input files for these compsets, a known (very minor) bug in the ELM code base that was discovered during the process, test simulations used to uncover these issues, our findings, and recommendations.

Issue

ELM land use and land cover change input data for E3SM future projections are defined by a list of many files.  Two of these files are most relevant for this discussion: "fsurdat" and "fdyndat" in the ELM namelist, which represent the land cover properties (fsurdat = surfdata*.nc) and temporal land use changes (fdyndat = landuse.timeseries*.nc).  These files are created using a script packaged with the E3SM download:  $E3SMROOT/components/elm/tools/clm4_5/mksurfdata_map/mksurfdata.pl.  Creation of fdyndat for SSP370 (or SSP585) for E3SMv2 was relatively straightforward, but there was some uncertainty involving fsurdat.  We discovered that (a) a version of fsurdat that represents 1850 land surface conditions (rather than those for the initialization year of 2015) is being used in the E3SMv2 setup, and (b) the ELM codebase does not correctly handle land conditions from user-provided ELM initialization files (set via finidat) in a "hybrid" simulation setup; these files should overwrite all data from fsurdat but do not.

Test simulations

Two E3SMv2 test simulations were conducted using the E3SMv2.0 maintenance branch cloned on 2022-04-20.  The two runs are identical EXCEPT for the specification of fsurdat:

  • Case A:  v2.LR.SSP370_0101.maint-2.0_testELMfsurdat1850

    • fsurdat = '/global/cfs/cdirs/e3sm/inputdata/lnd/clm2/surfdata_map/surfdata_ne30np4.pg2_simyr1850_2015_c211105.nc'

  • Case B:  v2.LR.SSP370_0101.maint-2.0_testELMfsurdat2015

    • fsurdat = '/global/cfs/cdirs/e3sm/inputdata/lnd/clm2/surfdata_map/surfdata_ne30np4.pg2_SSP3_RCP70_simyr2015_c220420.nc' 

In Run A, fsurdat points to a file that represents land surface conditions from 1850. In Run B, fsurdat points to a file that represents land surface conditions from 2015. The same E3SM run script was used to produce Runs A & B, and all other settings between the runs were identical, including:

COMPSET="WCYCLSSP370"
RESOLUTION="ne30pg2_EC30to60E2r2"
MODEL_START_TYPE="hybrid"  # 'initial', 'continue', 'branch', 'hybrid'
START_DATE="2015-01-01"    # for SSP370 this is set automatically via the compset config
GET_REFCASE=TRUE
RUN_REFDIR="/global/cscratch1/sd/jjbenedi/ssp370/rest_for_hybrid/v2.LR.historical_0101/init/2015-01-01-00000"
RUN_REFCASE="v2.LR.historical_0101"
RUN_REFDATE="2015-01-01"   # same as MODEL_START_DATE for 'branch', can be different for 'hybrid'

do_transient_pfts = .true.
check_finidat_fsurdat_consistency = .false.
check_finidat_year_consistency = .true.
check_finidat_pct_consistency = .true.
check_dynpft_consistency = .false.
flanduse_timeseries = '/global/cfs/cdirs/e3sm/inputdata/lnd/clm2/surfdata_map/landuse.timeseries_ne30pg2_SSP3_RCP70_simyr2015-2100_c211015.nc'

Note again that Runs A & B use the same flanduse_timeseries file.

Test simulation findings

Runs A & B are not bit-for-bit (BFB) identical. BFB agreement was expected a priori, in part because it was assumed that, in a hybrid simulation, the user-provided ELM restart files would supersede the input files associated with the default SSP370 configuration (specifically fsurdat, the only setting that differed between the runs). Between the two fsurdat files, the following variables differ:

Fields that differ between fsurdat from Runs A & B | Corresponding field names in flanduse_timeseries
PCT_NATVEG(gridcell)                               | PCT_NATVEG(gridcell)
PCT_NAT_PFT(natpft, gridcell)                      | PCT_NAT_PFT(time, natpft, gridcell)
PCT_LAKE(gridcell)                                 | PCT_LAKE(gridcell)
PCT_GLACIER(gridcell)                              | PCT_GLACIER(gridcell)
PCT_URBAN(numurbl, gridcell)                       | PCT_URBAN(numurbl, gridcell)

These fields also appear in flanduse_timeseries (as shown in the right column), though PCT_NAT_PFT now includes the time dimension.  In standard ELM, land unit values are all constant because the standard land surface files (non-crop-model) do not include time-varying land unit values.  Also, the finidat/restart file always has precedence for the land unit values, and fsurdat is used only if there is no finidat/restart specified (which is usually a cold start).

Our SSP simulations, and most current transient E3SMv2 simulations in general, are designed to use dynamic PFTs but not dynamic land units.  The current generation of land surface input files has transient PFTs only.  For the do_transient_pfts=.true. case, only transient PFT weights are read in from the flanduse_timeseries file.  Dynamic land unit code is supported but requires special land surface files (e.g., using transient crops).  For transient_crops, a different flanduse_timeseries file is needed that also has transient land unit weights because the crop land unit is separate.  This is not standard for most current transient simulations and is not used for E3SMv2 SSP configurations.

The differences between land unit values in the Run A & B fsurdat files are at “model precision” level:

PCT_NATVEG:   -4.4805677e-07   9.1621397e-07
PCT_LAKE:     -3.2981192e-07   2.5376499e-07
PCT_GLACIER:  -1.2037546e-07   1.1948390e-07
PCT_URBAN:    -4.3546921e-07   3.2966447e-07

Part of these differences is likely attributable to rounding and renormalization steps within the mksurfdata.pl file generation script.  The differences in PCT_NAT_PFT between the Run A & B fsurdat files can be much larger, but this is of no concern because PCT_NAT_PFT in the SSP* (MODEL_START_TYPE="hybrid") simulations is sourced from flanduse_timeseries, which we know is correct.  The PCT_NAT_PFT values from fsurdat are compared with the first time slice of PCT_NAT_PFT from flanduse_timeseries if check_dynpft_consistency = .true., but the E3SMv2 SSP* compsets as currently configured use check_dynpft_consistency = .false. instead.
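The kind of field-by-field comparison reported above can be sketched in a few lines. This is only an illustration: it assumes the two numbers listed per field are the minimum and maximum element-wise differences, it uses small made-up lists in place of the real fsurdat arrays, and the helper names (diff_range, summarize) and the tolerance value are hypothetical choices, not part of any E3SM tool.

```python
# Synthetic stand-ins for fields read from two surface data files.
# The values are invented for illustration; they are NOT taken from
# the actual 1850 or 2015 fsurdat files.

def diff_range(field_a, field_b):
    """Return (min, max) of the element-wise difference b - a."""
    if len(field_a) != len(field_b):
        raise ValueError("fields must have the same length")
    diffs = [b - a for a, b in zip(field_a, field_b)]
    return min(diffs), max(diffs)


def summarize(file_a, file_b, tol):
    """Map each shared field name to (min_diff, max_diff, within_tol)."""
    summary = {}
    for name in sorted(set(file_a) & set(file_b)):
        lo, hi = diff_range(file_a[name], file_b[name])
        summary[name] = (lo, hi, max(abs(lo), abs(hi)) <= tol)
    return summary


surf_1850 = {
    "PCT_LAKE": [1.0, 2.5, 0.0],
    "PCT_NAT_PFT": [40.0, 35.0, 25.0],
}
surf_2015 = {
    "PCT_LAKE": [1.0000002, 2.4999998, 0.0],
    "PCT_NAT_PFT": [30.0, 45.0, 25.0],
}

# tol is an arbitrary illustrative threshold for "precision-level" differences.
for name, (lo, hi, ok) in summarize(surf_1850, surf_2015, tol=1e-6).items():
    print(f"{name}: {lo: .7e} {hi: .7e} within_tol={ok}")
```

With real data, the lists would be replaced by arrays read from the two surfdata*.nc files; the sign convention of the difference (here, second file minus first) is also an assumption.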

It is expected that, in a hybrid simulation with finidat set, ALL information in fsurdat will be superseded by information in finidat (this would also be expected to be the case for restarts).  However, we find that this is not strictly true as ELM is currently configured.  At the time of this writing, the addition of ELM's "topounits" does not appear to be fully fleshed out with respect to finidat and restarts. From @Alan Di Vittorio:

subgridRest_write_and_read reads all of the weights, but does not update the fraction of land unit on the topounit, which is not in the restart file, but is fundamental for calculating the rest of the weights in the land hierarchy.  More generally, the relevant topounit weights are not in the restart file, as topounits are not fully implemented (they are available for testing and require special land surface files that provide the topounit information) but are part of the master code and are essentially a layer that is equal to the land grid cell. This topounit layer is essentially hardcoded (this is determined by the lack of topounit info in the land surface file) with a single topounit with weight 1 on the land grid cell, and the land unit weights on the single topounit in a gridcell are set equal to the available landunit weights on the land gridcell.

The fsurdat data are read in initialize1(), before initGridCells() is called. initGridCells() accounts for the topounit code and sets lun_pp%wttopounit equal to lun_pp%wtgcell (which here is from fsurdat).

In initialize2(), subgridRest_write_and_read reads the land weights and sets them directly, but there are limited topounit-related variables in finidat/restart files. To be consistent with the rest of the initialization code, lun_pp%wttopounit should be set equal to lun_pp%wtgcell here also, but it is not, and other code is not subsequently called to do this.

So when the weights are recalculated with compute_higher_order_weights(), which happens on a regular basis, lun_pp%wtgcell and associated column and pft weights are recalculated using the fsurdat lun_pp%wttopounit=lun_pp%wtgcell values, and not the weights from finidat or restart. So the different fsurdat weights are indeed used, although I don’t think that is how it is supposed to be.

Are the non-zero diffs cause for concern? Given the precision-level differences for the land-unit weights, which might have been around for quite some time, my intuition says no. My experience using this model to assess the impacts of land change on climate and carbon is what gives me this 'intuition.' It usually takes a significantly measurable amount of land/pft change to affect the outputs. Since this is a very small one-time shift at the land unit level, it is unlikely to produce output differences that are even comparable to known noise levels.
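The initialization order described in the quote above can be illustrated with a toy model. This is a simplified sketch, not ELM code: the Python functions only mirror the names of the Fortran routines mentioned above, a single land unit with a single topounit of weight 1 is assumed, and the sync_topounit flag is a hypothetical switch added here to contrast the described behavior with the expected one.

```python
TOPOUNIT_WEIGHT = 1.0  # single topounit covering the whole gridcell


class LandUnit:
    """Toy land unit carrying a gridcell weight and a topounit weight."""

    def __init__(self):
        self.wtgcell = 0.0
        self.wttopounit = 0.0


def initialize1(lun, fsurdat_weight):
    # Land unit weight on the gridcell is first read from fsurdat.
    lun.wtgcell = fsurdat_weight


def init_grid_cells(lun):
    # Topounit weight is copied from the (fsurdat-based) gridcell weight.
    lun.wttopounit = lun.wtgcell


def restart_read(lun, finidat_weight, sync_topounit=False):
    # The finidat/restart read overwrites the gridcell weight only.
    lun.wtgcell = finidat_weight
    if sync_topounit:
        # Hypothetical step: keep the topounit weight consistent.
        lun.wttopounit = lun.wtgcell


def compute_higher_order_weights(lun):
    # Gridcell weight is recomputed from the topounit weight.
    lun.wtgcell = lun.wttopounit * TOPOUNIT_WEIGHT


def run(fsurdat_weight, finidat_weight, sync_topounit):
    lun = LandUnit()
    initialize1(lun, fsurdat_weight)
    init_grid_cells(lun)
    restart_read(lun, finidat_weight, sync_topounit)
    compute_higher_order_weights(lun)
    return lun.wtgcell


print(run(0.25, 0.5, sync_topounit=False))  # fsurdat value survives
print(run(0.25, 0.5, sync_topounit=True))   # finidat value is used
```

In the first call the recomputed weight falls back to the fsurdat value because the topounit weight was never refreshed after the restart read, which is the behavior described in the quote. The second call shows the outcome one would expect if the topounit weight were kept in sync.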

Conclusions

  • As is currently set in the v2 SSP370 and SSP585 compsets, the corresponding flanduse_timeseries files are correct.

  • The released version of SSP370 and SSP585 compsets for E3SMv2 will use the same “1850” version of fsurdat (<fsurdat>lnd/clm2/surfdata_map/surfdata_ne30np4.pg2_simyr1850_2015_c211105.nc </fsurdat>). Alone, this is not optimal, but this setting becomes irrelevant for hybrid-type simulations, the run type of SSP*.

  • Technically, an incorrect fsurdat ELM input file is being used by the current configuration of the SSP370 (or SSP585) compset for E3SMv2, but the implications for this appear to be negligible.

    • A fsurdat version representing 1850 land surface conditions is being used when a 2015 version should be used instead to make the land surface conditions more compatible with the 2015-01-01 run initialization.

    • Only 5 variables differ between the "1850" and "2015" fsurdat file versions.  One of these variables (PCT_NAT_PFT) is overwritten by data in flanduse_timeseries and so has no impact on the simulation.  The remaining 4 variables differ "within machine precision".

  • Data from finidat/restart in a hybrid simulation does not entirely supersede that from fsurdat due to a minor bug in ELM's "topounits" code, but this bug is not expected to have a meaningful impact on the resulting simulations.  Effectively, the lack of topounit-related variables in finidat/restart files allows some land unit information from fsurdat to avoid being overwritten.

Recommendations