EAM's HOMME dycore
EAM and SCREAM/EAMxx uses the HOMME dycore package. HOMME has both Fortran and C++ versions with different options for the prognostic thermodynamic variable (preqx use temperature while theta-l uses potential temperature) and those in turn have some additional capabilities.
- 2 Documentation on parameter tuning
- 3 Documentation on algorithms
- 4 Related pages
Summary of HOMME capabilities available for different versions.
| SL-transport | phys grid | non-hydrostatic option | GPU option | Allows RRM grids | E3SM version used |
HOMME preqx Fortran | no | yes | no | no | yes | v1 |
HOMME preqx C++ | no | yes | no | yes (Kokkos) | no | only available in standalone HOMME |
HOMME theta-l Fortran | yes (Fortran calling C++/Kokkos) | yes | yes | yes1 | yes | v2,v3 |
HOMME theta-l C++ | yes | yes | yes | yes (Kokkos) | not yet | v4, SCREAM |
When building Kokkos for theta-l Fortran, the "cpu" target is used by default because running only the SL on the GPU would not give good overall performance.
New features in EAMxx to be considered for EAM V3:
Physics mass adjustment due to phase change: apply locally to dp3d instead of to surface pressure. (reduces magnitude of artificial pressure work term
Turn on topographic improvements (pgrad_correction=1, hv_ref_profiles=6) and switch to rougher “x6t” topography
Using the THETA dycore in E3SM:
As of 2020/12 in E3SM master (efb184d525), Theta dycore running in hydrostatic mode with SL transport is the default.
To switch to the older Eulerian transport add to user_nl_eam
transport_alg=0, hypervis_subcycle_q=1, dt_tracer_factor=1
To switch to NH mode:
theta_hydrostatic_mode=.false., tstep_type=9.
In NH mode, tstep_type=9 is our current best recommendation. Other options:
tstep_type=5 is the 3rd order accurate KGU53 explicit method used for hydrostatic simulations. It can be used with NH simulations with very small timestepes, ~0.5s
tstep_type=9 is an IMEX method, combining an explicit Butcher table (KGU53, used for most equations) and an implicit Butcher table (used for terms responsible for vertically propagating acoustic waves not present in hydrostatic models).
tstep_type=10 is another IMEX method which uses KG52 for the explicit table. It has a slightly larger stability region.
Older preqx dycore is still available, but wont support newer namelist options:
./xmlchange CAM_TARGET=preqx
Detailed description of timesteps and viscosity parameters
The PREQX and tHETA dycores share much of the HOMME infrastructure and have several timesteps:
dtime: The physics timestep. set in env_run.XML via ATM_NCPL. Note: dtime can be entered in user_nl_cam, but it will be silently ignored.
dt_remap: The Lagrangian vertical level remap timestep
dt_tracer: The tracer advection timestep
dt_dyn Dynamics (u,v,temperature,mass) timestep
dt_vis Hyperviscosity timestep for dynamics
dt_vis_tom top-of-model Laplacian diffusion (sponge layer)
dt_vis_q hyperviscosity timestep for tracers
These timesteps are controlled by various splitting parameters. In EAM, "dtime" is set outside the dycore and used to define all other timesteps. The splitting parameters enforce the fact that the various timesteps must evenly divide other timesteps.
See also Timestepping within the ACME v1 Atmosphere Model
THETA dycore: (CAM_TARGET=theta-l)
EAM v2 is using the "theta" dycore, which supports both hydrostatic and nonhydrostatic dynamics and semi-Lagrangian tracer transport. The theta dycore also supports the ability to have dt_tracer > dt_remap, and introduces a separate timestep for the TOM sponge layer.
New namelist variables: EAMv2 now allows the user to directly set the main dycore timesteps with new namelist options replacing "nsplit, rsplit and qsplit". When using the new namelist options, the old variables are set to: nsplit=-1, rsplit=-1 and qsplit=-1. The older namelist variables should be avoided, and they will removed when E3SM drops support for V1 compsets and the preqx dycore. For information on the older namelist variables, see the PREQX section below.
namelist inputs: tstep, dt_remap_factor, dt_tracer_factor, hypervis_subcycle, hypervis_subcycle_q, hypervis_subcycle_tom
Resulting timesteps:
dt_remap = tstep*dt_remap_factor ( must evenly divide EAM's dtime, checked at runtime)
dt_tracer = tstep*dt_tracer_factor ( must evenly divide EAM's dtime, checked at runtime)
dt_dyn = tstep
dt_vis = dt_dyn/hypervis_subcycle
dt_vis_q = dt_tracer/hypervis_subcycle_q
dt_vis_tom = dt_dyn / max(hypervis_subcycle, hypervis_subcycle_tom)
An additional constraint is that either dt_remap must evenly divide dt_tracer or dt_tracer must evenly divide dt_remap.
CFL Conditions
dt_dyn: The CFL condition for dt_dyn with uniform resolutions is well understood and depends on the horizontal resolution and the speed of the Lamb wave (~340m/s). This assumes the use of the KG5 RK method (tstep_type=5) for hydrostatic, and the related IMEX method (tstep_type=9 or 10) for NH.
NE30: dt_dyn = 300
NE60: dt_dyn = 150
NE120: dt_dyn = 75
NE240: dt_dyn = 40
NE256: dt_dyn <= 36
NE512: dt_dyn <= 18
NE1024 dt_dyn <= 9
The code prints an estimate of the viscous CFL condition in the log file, in the form "dt < S * max_eigen_value", where S depends on the timestepping method and for dt_dyn, assumes a gravity wave speed of 342 m/s. For KGU53 S=3.8. In practice, dt is often ~30% lower, so taking S=2.65 is a good target for new grids.
dt_tracer: The tracer CFL condition is controlled by the maximum advective velocity (~200m/s) which usually occurs near the model top. For Eulerian advection HOMME uses a RK3 SSP and the CFL also depends on the mesh spacing. For SL tracers, the limit is governed by the size of the halo exchange.
Eulerian tracers, dt_tracer = dt_dyn. (assuming the default RK3 SSP method)
SL tracers, dt_tracer = 6*dt_dyn.
dt_remap: This timestep is usually limited by physical constraints in order to resolve large divergence/convergence created by physical parameterizations. It is also the only source of vertical dissipation in the dycore, so at extreme resolutions or perhaps in the NH dycore it may have to be decreased to increase the amount of vertical dissipation.
dt_remap = 2*dt_dyn
dt_vis, dt_vis_q, dt_vis_tom: These timesteps have a CFL condition that depends on the horizontal resolution and the viscosity coefficient. In the case of the TOM diffusion, the viscosity coefficient depends on the number of levels, bringing in an additional dependency.
With the recommended hyper viscosity settings, one should take:
dt_vis = dt_vis_q = dt_dyn
The code prints an estimate of the viscous CFL condition in the log file, in the form "dt < S * max_eigen_value", where S depends on the timestepping method. Hyperviscosity is applied with forward Euler, suggesting S=2. However, numerical experiments on cubed sphere grids suggest the HV operator goes unstable at S=.80 and so we recommend taking S<.70 when estimating the maximum stable timestep. For RRM grids S might need to be smaller.
The TOM sponge layer uses a laplacian operator with coefficient, nu_top. We used to keep this coefficient fixed (nu_top=2.5e5) for all resolutions. But above 1/4 degree, this creates a severe CFL condition. For v2 and beyond, we now recommend running 2.5e5 for resolutions up to 1 degree (NE30) and for higher resolution run with a CFL number of 1.9. This number is based on the CFL condition printed by the code. Experiments suggest the operator is stable up to S=2.89, but unstable at S=3.38.
In V1, we had nu_div >> nu, requiring a much smaller dt_vis that must be tuned specially. This is not needed in V2.
Recommended timestep tuning for new grids
Do not get mislead by spinup issues! Determining stable timestep values is complicated by spinup. If the initial condition file is very much out of balance (say it came from interpolation from observations using different topography) then you may require very small timesteps and larger hyperviscosity coefficients in order to get past an initial adjustment period. Only perform the tuning for the final timesteps with a spun up initial condition file.
Old procedures from https://docs.google.com/document/d/1ymlTgKz2SIvveRS72roKvNHN6a79B4TLOGrypPjRvg0/edit?usp=sharing
For the THETA dycore, we need to determine the stable dt_remap, dt_dyn, dt_vis and dt_vis_tom. We then take dt_vis_q=dt_vis and with SL tracers, we will assume that dt_tracer = 6*dt_dyn.
Run 10 days with very small, inefficient timesteps. Take dt_remap=dt_dyn and use small values for dt_dyn, dt_vis and dt_vis_tom. Create a new IC file (see INITHIST in EAM). Use this new IC file for all future runs below
dt_vis_tom: Use the CFL condition (take S=2) printed in atm.log to determine the largest safe value of dt_vis_tom. In practice the code appears to be unstable around S=3.3, and it is good not to run right at the stability limit.
dt_dyn: Keeping all other timesteps below their known stable limits (which takes some care due to how they are all set) make a serious of runs increasing just dt_dyn until the run crashes. May need to make relatively long runs (several months) to ensure stability.
dt_vis: Keeping all other timesteps fixed find the largest stable value. For THETA, with the recommended viscosity coefficients, dt_vis = dt should be stable. For some RRM grids, dt_vis might be need to be slightly smaller due to mesh distortion. When the viscous CFL is violated (dt_vis too large), the run usually crashes within a couple of steps.
The procedure outlined above can find timesteps that are borderline unstable, but don’t blow up do to various dissipation mechanisms. Hence it is a good idea to run 3 months, and look at the monthly mean OMEGA500 from the 3rd month. This field will be noisy, but there should not be any obvious grid artifacts. Weak instabilities can be masked by the large transients in flow snapshots, so it best to look at time averages.
dt_remap: Using 2*dt_dyn is a conservative choice that is well tested up to ne240. Decreasing dt_remap results in more frequent vertical remaps resulting in increased vertical dissipation. If dt_remap is too large, the code may crash with negative layer thickness errors. The code has a "dp3d" limiter that can prevent some of these crashes. If this limiter is triggered (warnings in e3sm.log file), that can mean either dt_remap is too large of one of the CFL conditions has been violated and the code is unstable.
During this tuning process, it is useful to compare the smallest ‘dx’ from the atmosphere log file to the smallest ‘dx’ from the global uniform high resolution run. Use the ‘dx’ based on the singular values of Dinv, not the ‘dx’ based on element area. If the ‘dx’ for newmesh.g is 20% smaller than the value from the global uniform grid, it suggests the advective timesteps might need to be 20% smaller. The code prints out CFL estimates that are rough approximation that can be used to check if you are in the correct ballpark.
RRM Grids
With RRM grids, the timesteps will be controlled by the highest resolution region. So with an RRM grid with refinement down to NE120, the timesteps should be close to what we run on a uniform cubed-sphere NE120 grid. The timesteps may need to be slightly smaller because of the deformed elements in the transition region. With a hiqh quality RRM mesh ( Max Dinv-based element distortion metric <= 4, see Generate the Grid Mesh (Exodus) File for a new Regionally-Refined Grid) we can usually run with the expected dt_dyn and dt_tracer values, and only the viscosity timesteps need to be slightly reduced.
Tensor vs constant coefficient hyperviscosity
HOMME’s tensor hyperviscosity operator uses resolution aware tensor coefficients to scale automatically with resolution and to adapt to deformed elements ( https://gmd.copernicus.org/articles/7/2803/2014/ ) It is required for RRM grids, and was adopted for all grids with E3SM v3. The strength is controlled by namelist variable “nu”, given in units of 1/s. The viscosity is scaled with resolution via hypervis_scalings=3, meaning the coefficient is reduced as dx^3. (for RRM simulations using the preqx dycore, hypervis_scalings=3.2 was used)
The CAM-SE and E3SM v2 hyperviscosity used for cubed-sphere grids was specified by “nu” in physical units, m^4/s. For a uniform grid with resolution “dx”, the tensor hyperviscosity will be identical to the constant coefficient hyperviscosity, with coefficients related by:
nu_tensor = nu_const *( 2*rearth /((np-1)*dx))^{hypervis_scaling} * rearth^{-4.0}.
i.e. tensor nu=3.4e-8 when used at 1 degree resolution (dx=110,000m, np=4, rearth=6.376e6, hypervis_scaling=3) is equivalent to 1e15 m^4/s. Spreadsheet for looking at scaling with resolution of constant and tensor coefficient HV: https://docs.google.com/spreadsheets/d/1LHTl2_A065pfdWC69OHmXvNL_v1484cXg7ZowhyEbPU/edit?usp=sharing
For regular viscosity:
nu_tom_tensor = nu_tom_const *( 2*rearth /((np-1)*dx))^{laplace_scaling} * rearth^{-2.0}.
i.e. tensor nu=3.2e-7 when used at 3.25km (dx=3250m, np=4, rearth=6.376e6, laplace_scaling=1) is equivalent to 1e4 m^2/s. Tensor coefficient for regular viscosity is not currently in the model but may be added for use in the sponge layer.
Topo smoothing: The topo smoothing code uses hyperviscosity tensor (with hypervis_scalings=2) but with the regular laplace operator, not hyperviscosity. The coefficient is thus incorrectly scaled by rearth^4 instead of rearth^2. To convert that coefficient (4e-16) to a physical diffusion coefficient one must use rearth^{-4} in the above formula.
Recommended settings (THETA)
All resolutions: (default for theta-l dycore in master as of https://github.com/E3SM-Project/E3SM/pull/3368 )
se_ftype=2
se_limiter_option=9
cubed_sphere_map=2 (change to 0 in user_nl_cam when using old compsets with ESMF maps)
transport_alg=12
semi_lagrange_cdr_alg=3 (2020/4: default changed to 3 reproduce some aspects of the v1 climate in v2)
semi_lagrange_nearest_point_lev = 256
semi_lagrange_hv_q = 1
theta_advect_form =1
vert_remap_q_alg=10
nu_div=nu_p=nu_q=nu_s=-1 (use nu for all values)
nu, nu_top: Resolution dependent
se_nsplit=-1 ( timesteps set via se_tstep, dt_remap_factor, dt_tracer_factor)
qsplit=-1
rsplit=-1
se_tstep (Resolution dependent)
dt_tracer_factor = 6
dt_remap_factor = 2
hypervis_subcycle_q = dt_tracer_factor
hypervis_subcycle=1
hypervis_subcycle_tom=1
hypervis_scaling=3.0 ( default for grids as of 2023/4 )
nu=3.4e-8
WARNING on dtime: in the table below we give "dtime", the physics timestep. This is also a namelist option in EAM, but it will be ignored. The only way to set dtime is to set ATM_NCPL = ( 24*60*60 / dtime) in env_run.XML
Resolution | Timesteps | Namelist settings | Notes | Tested? |
---|---|---|---|---|
7.5 degree | dtime=7200 dt_tracer=3600 | nu_top = 2.5e5 se_tstep=600 | Ultra low res for regression testing only. | |
1 degree (NE30) | dtime=1800 dt_tracer=1800 | nu_top = 2.5e5 se_tstep=300 E3SM v2 uses constant coefficient HV, enabled with: | With larger dt_remap( dt_remap=1800), we see occasional (every 2-3 years) dp3d limiter activation, meaning that the model levels are approaching zero. This appears to be due to strong divergence above tropical cyclones created by one of the parameterizations. | HS+topo(72L): H and NH (H can run at t 360s but not 400s with either tstep_type=4 or 5). dt_remap=600 runs with no limiter warnings, while dt_remap=900 crashes with dp3d<0 at surface. F-EAMv1-AQP1: H and NH FAV1C-L: H and NH |
NE45 | dtime=1200. dt_tracer=1200 dt_vis_tom=200 | nu_top=2.5e5 se_tstep=200 | ||
1/4 degree (NE120) | dtime=900 dt_remap=150 | nu_top=1e5 se_tstep=75 | CFL estimates suggest: dt_vis_tom*nu_top <= 31*2.5e5 nu_top=2.5e5 would need | HS+topo(72L): H and NH (with dt_remap=75 and theta limiter to handle unphysical boundary layer) F-EAMv1-AQP1: H and NH, both 72 and 128 levels (1+ years) FC5AV1C-H01B: NH 72L runs several years. |
12km (NE256) | dtime=600 dt_tracer=200 NOTE: these defaults were updated 2021/9 based on SCREAM v0.1 3km runs. But NE256 is known to run stably at slightly larger timsteps: dt_tracer=300 | nu_top=4e4 | nu_tom=4e4 is running at the code's estimate of the CFL limit with S=1.9 | F-EAMv1-AQP1:
FC5AV1C-H01B: NH 128L run for several months dt=37.5/75/300/600. Occasional problems near coastlines - considering ( Mar 26, 2020 ) reducing dtime, increasing HV, tunning CLUBB dtime=600 used in: https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2021MS002805 |
6km (NE512) | dtime=200 dt_tracer=100 | nu_top=2e4 | CFL estimates suggest: dt_vis_tom*nu_top <= 1.7*2.5e5, nu_top=2.5e5 needs hypervis_subcycle_tom=13 | F-EAMv1-AQP1:
FC5AV1C-H01B: NH 128L run for 1 day with dt=18.75/37.5/150/300, then NaNs in microphysics (not yet debugged) |
3km (NE1024) | dtime=100 dt_tracer=50 | nu_top=1e4 | CFL estimates suggest: dt_vis_tom*nu_top <= 0.43*2.5e5 with nu_top=2.5e5, | F-EAMv1-AQP1:
FC5AV1C-H01B: SCREAM v0: run for 40 days with constantHV, dt=9.375/18.75/75/75. SCREAM v0.1: switch to tensorHV (slightly less diffusion), needs dt_dyn<=9s, dt_tracer<60 |
RRM | dtime=??? dycore timesteps should be set based on the finest region in the RRM. | nu_top=Uncertain - needs more research. Should probably switch to tensor laplacian. For NE30→NE120 grids, start with NE120 constant coefficient value, 1e5. |
PREQX Default Settings (for reference)
PREQX timesteps are controlled by namelist inputs: nsplit, rsplit, qsplit, hypervis_subcycle, hypervis_subcycle_q
Resulting timesteps:
dt_remap = dtime/nsplit
dt_tracer = dt_remap/rsplit
dt_dyn = dt_tracer/qsplit
dt_vis = dt_dyn/hypervis_subcycle
dt_vis_tom = dt_vis
dt_vis_q = dt_tracer/hypervis_subcycle_q
se_ftype=2
se_limiter_option=8
cubed_sphere_map=0 (2 when using TR maps or RRM Grids)
transport_alg=0
vert_remap_q_alg=1
nu_top = 2.5e5
nu_div = 2.5*nu
nu_p=nu_q=nu_s=nu
se_tstep=-1 ( timesteps set via se_nsplit, rsplit, qsplit)
dt_remap_factor=-1
dt_tracer_factor=-1
Resolution | Timesteps | Namelist settings | Notes |
---|---|---|---|
1 degree (NE30) | dtime=1800 (ATM_NCPL=48) dt_remap=900 | nu=1e15 se_nsplit=2 | hypervis and TOM sponge layer done together. With timesplit sponge layer, can we get dt_dyn_vis=300? |
1/4 degree (NE120) | dtime=900 (ATM_NCPL=96) dt_remap=150 | nu=1.5e13 se_nsplit=6 | CFL estimates suggest: hypervis_subcycle=3 should work. if we timesplit out the sponge layer, we would probably need: hypervis_subcycle=2, hypervis_subcycle_tom=3 which is unlikely to be more efficient. |
1/8 degree (NE240) | |||
Documentation on parameter tuning
ftype experiments including spectra sensitivity to PG2 and transport.
Documentation on algorithms
SL transport
NH formulation
Related pages
old page documenting XML changes to integrate theta-l: E3SM NH integration: namelist defaults