EAM and SCREAM use the HOMME dycore package. HOMME has both Fortran and C++ versions, with different options for the prognostic thermodynamic variable (preqx uses temperature, while theta-l uses potential temperature), and each of these has some additional capabilities.
...
Summary of HOMME capabilities available for different versions.
Version | SL-transport (C++ only) | phys grid | non-hydrostatic option | GPU option | Allows RRM grids | E3SM version used |
---|---|---|---|---|---|---|
HOMME preqx Fortran | no | yes | no | yes (OpenACC) | yes | v1 |
HOMME preqx C++ | no | yes | no | yes (Kokkos) | no | only available in standalone HOMME |
HOMME theta-l Fortran | yes (Fortran calling C++/Kokkos) | yes | yes | yes (SL only via Kokkos, in a branch) [1] | yes | v2 |
HOMME theta-l C++ | yes (in a branch) | not yet | yes | yes (Kokkos) | not yet | v3, SCREAM |
[1] When building Kokkos for theta-l Fortran, the "cpu" target is used by default because running only the SL on the GPU would not give good performance.
New features that did not make the V2 cutoff:
These features will hopefully be turned on for use in V3 development, after the V2 maintenance branch is created.
Physics mass adjustment due to phase change: apply locally to dp3d instead of to surface pressure. (reduces magnitude of artificial pressure work term)
Adopt tensorHV for NE30 grids. (it is the default for all other grids)
Turn on topographic improvements and switch to rougher topography
Using the THETA dycore in E3SM:
...
dt_dyn: The CFL condition for dt_dyn with uniform resolutions is well understood and depends on the horizontal resolution and the speed of the Lamb wave (~340m/s). This assumes the use of the KG5 RK method (tstep_type=5) for hydrostatic, and the related IMEX method (tstep_type=9 or 10) for NH.
NE30: dt_dyn = 300 (probably stable up to 360)
NE60: dt_dyn = 150
NE120: dt_dyn = 75
NE240: dt_dyn = 40
NE256: dt_dyn <= 37.5
NE512: dt_dyn <= 18.75
NE1024: dt_dyn <= 9.375
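As a rough cross-check, the values listed above scale approximately as 1/NE. The following is a minimal Python sketch that tabulates the product NE*dt_dyn for each entry. The names `DT_DYN` and `ne_times_dt` are illustrative only, and the NE256 entry is taken as 37.5 s, the value quoted for the NE256 test runs in the settings table.

```python
# dt_dyn values (seconds) copied from the list above.
# Assumption: NE256 is taken as 37.5 s, the value used in the NE256 test runs.
DT_DYN = {30: 300.0, 60: 150.0, 120: 75.0, 240: 40.0,
          256: 37.5, 512: 18.75, 1024: 9.375}


def ne_times_dt(table):
    """Return {NE: NE * dt_dyn}; a roughly constant product means dt_dyn ~ 1/NE."""
    return {ne: ne * dt for ne, dt in table.items()}


products = ne_times_dt(DT_DYN)
for ne, prod in sorted(products.items()):
    print(f"NE{ne}: dt_dyn={DT_DYN[ne]:g} s, NE*dt_dyn={prod:g}")
```

The product stays between 9000 and 9600 for every listed resolution, so a first guess for an unlisted uniform resolution might be dt_dyn of about 9000/NE. That guess would still need to be confirmed by actual stability runs.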
dt_tracer: The tracer CFL condition is controlled by the maximum advective velocity (~200m/s) which usually occurs near the model top. For Eulerian advection HOMME uses a RK3 SSP and the CFL also depends on the mesh spacing. For SL tracers, the limit is governed by the size of the halo exchange.
...
For the THETA dycore, we need to determine the stable dt_remap, dt_dyn, dt_vis and dt_vis_tom. We then take dt_vis_q=dt_vis and with SL tracers, we will assume that dt_tracer = 6*dt_dyn.
Run 10 days with very small, inefficient timesteps. Take dt_remap=dt_dyn and use small values for dt_dyn, dt_vis and dt_vis_tom. Create a new IC file (see INITHIST in EAM). Use this new IC file for all of the runs below.
dt_vis_tom: Use the CFL condition (take S=2) printed in atm.log to determine the largest safe value of dt_vis_tom. In practice the code appears to be unstable around S=3.3, and it is good not to run right at the stability limit.
dt_dyn: Keeping all other timesteps below their known stable limits (which takes some care because of how they are all set), make a series of runs increasing just dt_dyn until the run crashes. Relatively long runs (several months) may be needed to ensure stability.
dt_vis: Keeping all other timesteps fixed, find the largest stable value. For THETA, with the recommended viscosity coefficients, dt_vis = dt should be stable. For some RRM grids, dt_vis might need to be slightly smaller due to mesh distortion. When the viscous CFL is violated (dt_vis too large), the run usually crashes within a couple of steps.
The procedure outlined above can find timesteps that are borderline unstable but don’t blow up due to various dissipation mechanisms. Hence it is a good idea to run 3 months and look at the monthly mean OMEGA500 from the 3rd month. This field will be noisy, but there should not be any obvious grid artifacts. Weak instabilities can be masked by the large transients in flow snapshots, so it is best to look at time averages.
dt_remap: Using 2*dt_dyn is a conservative choice that is well tested up to ne240. Decreasing dt_remap results in more frequent vertical remaps, which increases vertical dissipation. If dt_remap is too large, the code may crash with negative layer thickness errors. The code has a "dp3d" limiter that can prevent some of these crashes. If this limiter is triggered (warnings in e3sm.log file), that can mean either dt_remap is too large or one of the CFL conditions has been violated and the code is unstable.
During this tuning process, it is useful to compare the smallest ‘dx’ from the atmosphere log file to the smallest ‘dx’ from the global uniform high resolution run. Use the ‘dx’ based on the singular values of Dinv, not the ‘dx’ based on element area. If the ‘dx’ for newmesh.g is 20% smaller than the value from the global uniform grid, it suggests the advective timesteps might need to be 20% smaller. The code prints out CFL estimates that are rough approximations and can be used to check whether you are in the correct ballpark.
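Once a stable dt_dyn has been found, the dependent timesteps follow from the factors quoted above (dt_remap = 2*dt_dyn as the conservative choice, dt_tracer = 6*dt_dyn with SL tracers). The following is a minimal Python sketch of that bookkeeping. The function name and its keyword arguments are hypothetical and are not HOMME namelist variables.

```python
def theta_timesteps(dt_dyn, remap_factor=2, tracer_factor=6):
    """Derive dt_remap and dt_tracer from dt_dyn.

    The default factors are the ones quoted in the text; they are starting
    points for tuning, not guaranteed-stable values.
    """
    if dt_dyn <= 0:
        raise ValueError("dt_dyn must be positive")
    return {
        "dt_dyn": dt_dyn,
        "dt_remap": remap_factor * dt_dyn,
        "dt_tracer": tracer_factor * dt_dyn,
    }


ne30 = theta_timesteps(300)
print(ne30)
```

For NE30 (dt_dyn=300) this gives dt_remap=600 and dt_tracer=1800, which agrees with the NE30 row of the settings table.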
...
With RRM grids, the timesteps will be controlled by the highest resolution region. So with an RRM grid with refinement down to NE120, the timesteps should be close to what we run on a uniform cubed-sphere NE120 grid. The timesteps may need to be slightly smaller because of the deformed elements in the transition region. With a high-quality RRM mesh (Max Dinv-based element distortion metric <= 4, see Generate the Grid Mesh (Exodus) File for a new Regionally-Refined Grid) we can usually run with the expected dt_dyn and dt_tracer values, and only the viscosity timesteps need to be slightly reduced.
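The dx comparison used during tuning (a 20% smaller Dinv-based dx suggests roughly 20% smaller advective timesteps) amounts to a linear rescaling. A minimal Python sketch, assuming both dx values are read by hand from the atmosphere log files. The function name and the sample numbers are hypothetical.

```python
def scale_timestep(dt_ref, dx_ref, dx_new):
    """Scale a reference timestep linearly with the ratio of smallest dx values.

    dt_ref: timestep known to be stable on the uniform reference grid (s)
    dx_ref: smallest Dinv-based dx on the reference grid
    dx_new: smallest Dinv-based dx on the new (e.g. RRM) grid, same units
    """
    if dx_ref <= 0 or dx_new <= 0:
        raise ValueError("dx values must be positive")
    return dt_ref * dx_new / dx_ref


# Hypothetical case: the new mesh has a smallest dx 20% below the NE120 value.
dt_suggested = scale_timestep(dt_ref=75.0, dx_ref=1.0, dx_new=0.8)
print(f"suggested dt_dyn: about {dt_suggested:.1f} s")
```

The result is only a first estimate. In practice the value would normally be rounded down to something that divides the physics timestep and then checked with the stability runs described above.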
Spreadsheet for looking at scaling with resolution of constant and tensor coefficient HV:
https://docs.google.com/spreadsheets/d/1LHTl2_A065pfdWC69OHmXvNL_v1484cXg7ZowhyEbPU/edit?usp=sharing
...
WARNING on dtime: the table below gives "dtime", the physics timestep. dtime is also a namelist option in EAM, but the namelist value is ignored. The only way to set dtime is to set ATM_NCPL = (24*60*60 / dtime) in env_run.xml
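The conversion from dtime to ATM_NCPL is simple arithmetic. A minimal Python sketch follows. The function name is hypothetical, and the check that dtime divides a day evenly is an assumption based on ATM_NCPL being a count of coupling steps per day.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def atm_ncpl(dtime):
    """Return ATM_NCPL = 86400 / dtime for an integer dtime in seconds.

    Assumption: dtime must divide one day evenly, since ATM_NCPL counts
    coupling steps per day.
    """
    if dtime <= 0 or SECONDS_PER_DAY % dtime != 0:
        raise ValueError(f"dtime={dtime} does not divide {SECONDS_PER_DAY} s evenly")
    return SECONDS_PER_DAY // dtime


for dtime in (1800, 900, 600):
    print(f"dtime={dtime} -> ATM_NCPL={atm_ncpl(dtime)}")
```

dtime=1800 gives ATM_NCPL=48 and dtime=900 gives ATM_NCPL=96, matching the values listed with the PREQX defaults.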
Resolution | Timesteps | Namelist settings | Notes | Tested? |
---|---|---|---|---|
7.5 degree | dtime=7200 dt_tracer=3600 | nu_top = 2.5e5 se_tstep=600 | Ultra low res for regression testing only. | |
1 degree (NE30) | dtime=1800 dt_tracer=1800 | nu_top = 2.5e5 se_tstep=300 If backward compatibility is needed for V2 water cycle, change HV defaults to: hypervis_scaling=0 | E3SMv2 NE30 does not use tensorHV to avoid having to retune. Should change in v3. With dt_remap=1800, we see occasional (every 2-3 years) dp3d limiter activation, meaning that the model level thicknesses are approaching zero. This appears to be due to strong divergence above tropical cyclones created by one of the parameterizations. | HS+topo(72L): H and NH (H can run at 360s but not 400s with either tstep_type=4 or 5). dt_remap=600 runs with no limiter warnings, while dt_remap=900 crashes with dp3d<0 at surface. F-EAMv1-AQP1: H and NH FAV1C-L: H and NH |
NE45 | dtime=1200 dt_tracer=1200 dt_vis_tom=200 | nu_top=2.5e5 se_tstep=200 | ||
1/4 degree (NE120) | dtime=900 dt_remap=150 | nu_top=1e5 se_tstep=75 | CFL estimates suggest: dt_vis_tom*nu_top <= 31*2.5e5 nu_top=2.5e5 would need | HS+topo(72L): H and NH (with dt_remap=75 and theta limiter to handle unphysical boundary layer) F-EAMv1-AQP1: H and NH, both 72 and 128 levels (1+ years) FC5AV1C-H01B: NH 72L runs several years. |
12km (NE256) | dtime=600 dt_tracer=200 NOTE: these defaults were updated 2021/9 based on SCREAM v0.1 3km runs, but NE256 is known to run stably at slightly larger timesteps: dt_tracer=300 | nu_top=4e4 | nu_tom=4e4 is running at the code's estimate of the CFL limit with S=1.9 | F-EAMv1-AQP1:
FC5AV1C-H01B: NH 128L run for several months dt=37.5/75/300/600. Occasional problems near coastlines - considering reducing dtime, increasing HV, tuning CLUBB. dtime=600 used in: https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2021MS002805 |
6km (NE512) | dtime=200 dt_tracer=100 | nu_top=2e4 | CFL estimates suggest: dt_vis_tom*nu_top <= 1.7*2.5e5, nu_top=2.5e5 needs hypervis_subcycle_tom=13 | F-EAMv1-AQP1:
FC5AV1C-H01B: NH 128L run for 1 day with dt=18.75/37.5/150/300, then NaNs in microphysics (not yet debugged) |
3km (NE1024) | dtime=100 dt_tracer=50 | nu_top=1e4 | CFL estimates suggest: dt_vis_tom*nu_top <= 0.43*2.5e5 with nu_top=2.5e5, | F-EAMv1-AQP1:
FC5AV1C-H01B: SCREAM v0: run for 40 days with constantHV, dt=9.375/18.75/75/75. SCREAM v0.1: switch to tensorHV (slightly less diffusion), needs dt_dyn<=9s, dt_tracer<60 |
RRM | dtime=??? dycore timesteps should be set based on the finest region in the RRM. | nu_top=Uncertain - needs more research. Should probably switch to tensor Laplacian. For NE30→NE120 grids, start with NE120 constant coefficient value, 1e5. | RRM uses a tensor HV formulation which scales with resolution as dx^3.0 (for preqx, we used a dx^3.2 scaling). To determine the effective HV coefficient at a given resolution "dx", use: nu_tensor = nu_const * (2*rearth / ((np-1)*dx))^{hv_scaling} * rearth^{-4.0}. For example, tensor nu=3.4e-8 when used at 1 degree resolution (dx=110,000m, np=4, rearth=6.376e6) is equivalent to 1e15 m^4/s. |
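The conversion between constant-coefficient and tensor HV values in the RRM row can be checked numerically. A minimal Python sketch of the quoted formula follows. The function and argument names are hypothetical. hv_scaling=3.0 is taken from the dx^3.0 scaling stated for the tensor formulation, and the other defaults are the values given in the example.

```python
def nu_tensor(nu_const, dx, np_pts=4, rearth=6.376e6, hv_scaling=3.0):
    """Tensor HV coefficient equivalent to a constant coefficient nu_const (m^4/s).

    Implements: nu_const * (2*rearth / ((np-1)*dx))**hv_scaling * rearth**-4
    dx is the grid spacing in metres; np_pts is the number of GLL points (np).
    """
    if dx <= 0 or np_pts < 2:
        raise ValueError("dx must be positive and np_pts must be at least 2")
    return nu_const * (2.0 * rearth / ((np_pts - 1) * dx)) ** hv_scaling * rearth ** -4.0


# 1 degree resolution: dx = 110,000 m with nu_const = 1e15 m^4/s.
nu_1deg = nu_tensor(1e15, 110_000.0)
print(f"nu_tensor at 1 degree: {nu_1deg:.2e}")
```

With these inputs the result is close to the tensor nu=3.4e-8 quoted in the table.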
PREQX Default Settings (for reference)
...
se_tstep=-1 ( timesteps set via se_nsplit, rsplit, qsplit)
dt_remap_factor=-1
dt_tracer_factor=-1
Resolution | Timesteps | Namelist settings | Notes |
---|---|---|---|
1 degree (NE30) | dtime=1800 (ATM_NCPL=48) dt_remap=900 | nu=1e15 se_nsplit=2 | hypervis and TOM sponge layer done together. With timesplit sponge layer, can we get dt_dyn_vis=300? |
1/4 degree (NE120) | dtime=900 (ATM_NCPL=96) dt_remap=150 | nu=1.5e13 se_nsplit=6 | CFL estimates suggest: hypervis_subcycle=3 should work. if we timesplit out the sponge layer, we would probably need: hypervis_subcycle=2, hypervis_subcycle_tom=3 which is unlikely to be more efficient. |
1/8 degree (NE240) | |||
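With se_tstep=-1, the preqx timesteps follow from dtime and the split factors. A minimal Python sketch, assuming the usual HOMME convention dt_remap = dtime/se_nsplit, dt_tracer = dt_remap/rsplit and dt_dyn = dt_tracer/qsplit with rsplit > 0. Only the dt_remap relation can be checked against the table above. The function name and the rsplit and qsplit values in the example are illustrative assumptions.

```python
def preqx_timesteps(dtime, se_nsplit, rsplit, qsplit):
    """Derive preqx timesteps from dtime and the split factors.

    Assumed convention (not stated explicitly in the table):
      dt_remap  = dtime / se_nsplit
      dt_tracer = dt_remap / rsplit   (requires rsplit > 0)
      dt_dyn    = dt_tracer / qsplit
    """
    if min(se_nsplit, rsplit, qsplit) <= 0:
        raise ValueError("split factors must be positive")
    dt_remap = dtime / se_nsplit
    dt_tracer = dt_remap / rsplit
    dt_dyn = dt_tracer / qsplit
    return {"dt_remap": dt_remap, "dt_tracer": dt_tracer, "dt_dyn": dt_dyn}


# NE30 row: dtime=1800, se_nsplit=2; rsplit=3 and qsplit=1 are assumed here.
ne30_preqx = preqx_timesteps(1800, se_nsplit=2, rsplit=3, qsplit=1)
print(ne30_preqx)
```

For the NE30 and NE120 rows this reproduces dt_remap=900 and dt_remap=150 respectively.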
Documentation on parameter tuning
...