Infrastructure Breakouts Notes

Speed Dating Topics

Round 1: Water Cycle and HES

GCAM coupling: several code bases. LULCC and emissions are passed to E3SM.

Want to rewrite the part that passes data from GCAM to E3SM.

Current working version is based on v2.1

Rebase to v3 is underway by Balwinder. Should be straightforward.

Can run create_newcase. Have a few cases with other models. Need to do some extra file moving.

GCAM: human component mesh same as land grid.

GCAM internally does the mapping from GCAM to that grid.

HES (BGC): BGC PRs are coming in that will change climate.


Running ensembles: doing one member at a time so no need to manage many at once right now.

Partially CMOR-ize output to get variables into time-series format.
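
A minimal sketch of the time-series extraction step (assuming xarray-readable monthly history files; the production path uses tools like ncclimo and e3sm_to_cmip, so file pattern and variable names here are illustrative):

```python
# Sketch: concatenate history files in time, write one file per variable.
# The file glob and variables (TREFHT, PRECT) are illustrative only.
import xarray as xr

def extract_timeseries(history_glob, variables, outdir):
    ds = xr.open_mfdataset(history_glob, combine="by_coords")
    for var in variables:
        ds[var].to_netcdf(f"{outdir}/{var}.nc")  # CMIP-style per-variable file

extract_timeseries("run/case.eam.h0.*.nc", ["TREFHT", "PRECT"], "ts")
```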

Postprocessing of ensembles: space limitations. Splitting jobs between compy and chrysalis.

Any publishing of v3? Not yet. Also ESGF is changing. “CMIP6” closing.

Land group is interested in data compression options

Round 2: EAMxx + Omega

Tools/tests: both groups are using ctest.

Skywalker: compares output in YAML from two small subroutines, one Fortran and one C++, and flags non-BFB diffs. Provides infrastructure to add to the Fortran side to write these files. Low-level unit testing with limited input and output.
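
The core comparison idea, as a hedged sketch (not Skywalker's actual API): load the YAML written by each side and flag any value that is not exactly equal.

```python
# Sketch only: walk two YAML documents (Fortran vs. C++ output) and report
# any value that differs at all, i.e. any non-BFB difference.
import yaml

def flag_non_bfb(path_f90, path_cpp):
    a = yaml.safe_load(open(path_f90))
    b = yaml.safe_load(open(path_cpp))

    def walk(x, y, key=""):
        if isinstance(x, dict):
            for k in x:
                walk(x[k], y.get(k), f"{key}/{k}")
        elif isinstance(x, list):
            for i, (xi, yi) in enumerate(zip(x, y)):
                walk(xi, yi, f"{key}[{i}]")
        elif x != y:  # exact equality: anything else is non-BFB
            print(f"non-BFB diff at {key}: {x} != {y}")

    walk(a, b)
```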

EAMxx generates LOTS of ctests: different ranks, different threads. Write a few lines, get dozens of tests generated. Part of ekat: generic CMake, ctest, and Kokkos utilities.
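
The real fan-out is done by ekat's CMake/ctest macros; as a language-neutral illustration of how a short spec expands into many test invocations (the command line is hypothetical):

```python
# Illustration of the rank x thread test matrix EAMxx generates.
from itertools import product

ranks, threads = [1, 2, 4], [1, 2, 8]
for r, t in product(ranks, threads):
    # Hypothetical command; real tests are registered via CMake's add_test().
    print(f"mpiexec -n {r} ./eamxx_unit_tests --kokkos-num-threads={t}")
```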

Documentation of all that? Needs to be worked on. Examples are in EAMxx.

Omega and EAMxx have been talking about C++ standards.

Shared utilities directory: E3SM/share/ is for any shared code that lives in the monorepo.

E3SM/externals is for actual submodules.


Rolling incremental releases? Will still do that.


Versioning of external libraries. Like Kokkos. Need to test against candidate releases.


Round 3: Land and v3 Atm


Status of land on MPAS: ICOM is generating the meshes, which can include features like dams, watersheds, and coastal refinement; the global mesh is generated in one shot. E3SM needs to define the resolution we want for the simulation campaign, generate the mesh, then generate the land surface datasets and the MOSART stream network. Land/river I-cases can then be run. Need to coordinate with MPAS-O.

Tools are on GitHub under ICOM.

High-res ELM: branch has been rebased recently. Will split into a few PRs. CPU first.

GPU compile of shared code?

Post run: zppy and land. Whenever e3sm_diags gets run, do available land pieces as well. Time series, lat-lon plots, iLAMB pieces.

Global time series is done.

For iLAMB, need more formulas for generating CMIP6 variables from ELM output.
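
An example of the kind of formula needed, as a sketch (the ELM variable names QOVER/QDRAI/QRGWL follow CLM conventions and should be verified against actual ELM output): CMIP6 mrro as the sum of runoff components.

```python
# Sketch: derive CMIP6 "mrro" (total runoff) from ELM runoff components.
# mm/s is numerically equal to the CMIP6 unit kg m-2 s-1.
import xarray as xr

ds = xr.open_dataset("case.elm.h0.nc")  # illustrative file name
mrro = ds["QOVER"] + ds["QDRAI"] + ds["QRGWL"]  # surface + drainage + glacier/wetland/lake
mrro.attrs.update(long_name="total_runoff", units="kg m-2 s-1")
mrro.to_netcdf("mrro.nc")
```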

Add a base-flow index diagnostic? Need to talk to RUBISCO to get that into iLAMB.

Also soil moisture memory.
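
One common definition of soil moisture memory (definitions vary; this one is my assumption, not what was agreed in the room) is the lag at which the anomaly autocorrelation drops below 1/e:

```python
# Sketch: soil moisture memory as an e-folding timescale of the
# lag autocorrelation of the anomaly time series.
import numpy as np

def soil_moisture_memory(sm, max_lag=24):
    anom = sm - sm.mean()
    var = np.dot(anom, anom)
    for lag in range(1, max_lag):
        r = np.dot(anom[:-lag], anom[lag:]) / var
        if r < 1.0 / np.e:
            return lag  # in units of the input sampling interval
    return max_lag
```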

ELM is reporting water, energy, and carbon; a missing energy term needs to be passed into the coupler for the conservation check. The coupler only cares about fluxes at the surface.
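
Schematically, the check amounts to comparing the change in stored energy against the time-integrated surface fluxes (a sketch, not the actual coupler code):

```python
# Sketch: a nonzero residual (beyond roundoff) means some term, like the
# missing ELM energy term above, is not reaching the coupler.
def conservation_residual(energy_before, energy_after, net_surface_flux, dt):
    """Storage in J/m^2, flux in W/m^2, dt in s."""
    return (energy_after - energy_before) - net_surface_flux * dt
```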

EAMv3 is reporting water, energy, and aerosols (dust, black carbon; more with chemistry).

Coupling of aerosols/dust with land? Nitrogen deposition coupling is important. There are not very good data sets to do it offline.

Land-atmosphere interaction diags: ARM sites have these. Can they be applied globally?

“coupling strength” metric? Ongoing research in ARM on identifying those.

We incrementally add new ARM features to e3sm_diags. Can do them globally if the fields are available globally. Paul Dirmeyer? web page.

Tools for making RRM data? interpinic works but needs to know the compset. Inline interpolation was in ELM when we started; not sure it's still working. Not straight interpolation: neighboring PFTs.
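
A sketch of the "neighboring PFTs" idea (not interpinic itself): initialize each destination column from the nearest source column with the same PFT, instead of interpolating across unlike surface types.

```python
# Sketch: nearest-neighbor state transfer within matching PFTs.
# Assumes every destination PFT exists somewhere in the source.
import numpy as np
from scipy.spatial import cKDTree

def remap_state(src_xy, src_pft, src_state, dst_xy, dst_pft):
    out = np.empty(len(dst_xy))
    for pft in np.unique(dst_pft):
        src_sel = src_pft == pft
        dst_sel = dst_pft == pft
        tree = cKDTree(src_xy[src_sel])        # only same-PFT sources
        _, idx = tree.query(dst_xy[dst_sel])   # nearest same-PFT column
        out[dst_sel] = src_state[src_sel][idx]
    return out
```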


Round 4: Polar and Ice

High-order mapping was turned off for rain/snow in v3 because of conservation errors.

Can turn it back on if you want to see its effect.

Or turn it on just for atmosphere to land.

Andrew B and Jon are working on the conservation issue.

LIVVkit will be made part of the zppy workflow this year, after the Antarctic port of its diagnostics.

Can add things to zppy as a plug-in or more directly.

zppy workflow items can depend on other items' output.

Coupler support? Jon has been helping add things.

MOAB team will handle transition.

Time-evolving domains: still in the future. Developments needed in the land model.

Will there be documentation on how the workflow changes with the MOAB coupler (setting things up)? Yes.

MALI builds will be on limited machines. Need to report the installed version and record it in history.

Kokkos compatibility with EAMxx, Albany, OMEGA.

Albany will need to also test against upcoming Kokkos releases.

MPAS-seaice is reporting global integrals (not salt)

MALI: working on these.

Salt budget needs to be added.

cprnc testing of MPAS files. Need to improve on hacky workarounds.

Sea-ice has a separate test that does check bit-for-bit, via a custom script. See how much can be converted to CIME.
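
The core of such a bit-for-bit check is small (a sketch; cprnc and the custom script do much more):

```python
# Sketch: exact, bitwise-equivalent comparison of two MPAS output files.
import numpy as np
from netCDF4 import Dataset

def bfb_compare(file_a, file_b):
    a, b = Dataset(file_a), Dataset(file_b)
    a.set_auto_mask(False); b.set_auto_mask(False)
    same = True
    for name in a.variables:
        if not np.array_equal(a[name][:], b[name][:]):  # exact equality
            print(f"non-BFB: {name}")
            same = False
    return same
```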


Round 5: Performance, Coupled, AI/ML

Pacer will provide GPU timing, internal or through an interface.

Model tracker:

One page to get to model results, provenance, data, performance.

Page generated automatically. Editable. Manual notes.

Way of designating which pages have the official runs: campaign production sims, latest best master.

Automate creation of the latest-best-master climate.

Need a better way to get around machine failures during long runs. Less babysitting; robust recovery from failures. Archiving to HPSS as you go, with zstash. Want notifications. md5 checking. Work on globus-zstash interaction.
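
The md5 piece is simple in isolation (zstash has its own check step; this sketch just shows verifying a file against the checksum recorded at archive time):

```python
# Sketch: recompute a file's md5 and compare to the recorded value;
# a mismatch means the transfer (e.g., via Globus) should be retried.
import hashlib

def md5sum(path, chunk=2**20):
    h = hashlib.md5()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()
```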

Globus certs used to be good for weeks; now they need renewal every 2 days. There should be a way to get around that.


Getting around bad nodes: give some extra nodes to the Slurm job and include a flag to not exit if a node fails. The nature of the node failure matters. Restart the run without exiting the queue.
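
A hedged sketch of the restart-in-place idea (Slurm's --no-kill keeps an allocation alive after a node failure; recovery details depend on how the node failed, and the run command here is hypothetical):

```python
# Sketch: retry the model inside the same allocation instead of
# re-entering the queue, relying on restart files for continuity.
import subprocess

MAX_RETRIES = 3
for attempt in range(MAX_RETRIES):
    ret = subprocess.run(["srun", "./e3sm.exe"]).returncode
    if ret == 0:
        break  # run segment completed; no resubmission needed
```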


MOAB and future versions: v4 is easier because there's one global surface mesh. Just need the mask for which points are land, ocean, sea-ice, land-ice.


Beyond v4: wetting and drying. Mesh doesn’t change. Just need to add logic for dynamic masks.

The fewer fractions the better.
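
Schematic of the one-mesh-plus-masks idea (my illustration, not MOAB's data structures): each surface cell carries fractions that partition it among components, and wetting/drying would update the fractions in time without changing the mesh.

```python
# Sketch: per-cell component fractions on a single global surface mesh.
import numpy as np

ncells = 4
frac = {
    "land":     np.array([1.0, 0.3, 0.0, 0.0]),
    "ocean":    np.array([0.0, 0.7, 0.6, 1.0]),
    "sea_ice":  np.array([0.0, 0.0, 0.4, 0.0]),
    "land_ice": np.array([0.0, 0.0, 0.0, 0.0]),
}
# The fractions must partition every cell exactly.
assert np.allclose(sum(frac.values()), np.ones(ncells))
```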


mbtempest and self-intersecting elements: bring in NCO's workaround.

Mesh generation with the uniform surface mesh: need to run the ICOM tools.

Data management for v3 HR. Lossless compression? Yes, let's do it.

2x savings in file size. Best-performing option is to write with ADIOS, then use a tool to convert to compressed NetCDF.

History files are output in ADIOS BP format in RUNDIR

The tool converts them to NetCDF. Integrated into CIME automatically if you choose ADIOS as PIO_TYPENAME. A separate job is launched on 1/4 of the parent job's processors.
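
Independent of the ADIOS path, the lossless-compression payoff can be sketched with the netCDF4-python API (chunking and deflate level would be tuned in practice):

```python
# Sketch: rewrite a NetCDF file with lossless deflate compression.
from netCDF4 import Dataset

src = Dataset("history.nc")
dst = Dataset("history_compressed.nc", "w", format="NETCDF4")
for name, dim in src.dimensions.items():
    dst.createDimension(name, None if dim.isunlimited() else len(dim))
for name, var in src.variables.items():
    out = dst.createVariable(name, var.dtype, var.dimensions,
                             zlib=True, complevel=1)  # lossless deflate
    out[:] = var[:]
dst.close()
```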

Output to Zarr?

NetCDF4 can write/read Zarr v2. A PR is up for Zarr v3.

Give a flag when opening the file to say whether it's Zarr; the API doesn't change.
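
A sketch of the "flag at open, same API" point (the NCZarr mode fragment syntax varies by netCDF version; treat the URL form as an assumption to verify):

```python
# Sketch: same netCDF4 API whether the store is classic NetCDF or Zarr.
from netCDF4 import Dataset

nc = Dataset("history.nc")                          # classic NetCDF
zr = Dataset("file://history.zarr#mode=zarr", "r")  # NCZarr-style URL
print(list(nc.variables), list(zr.variables))
```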

Could convert ADIOS to Zarr.