This page documents the E3SM project’s policy for managing E3SM simulations that need to be archived, including official simulations and important internal working simulations. The following steps must be completed immediately after a simulation finishes to avoid data loss or corruption and time-consuming re-runs.
Note the change in E3SM policy: all three steps are now required.
1. Short term archiving
...
Note that using zstash is required when archiving E3SM data. On systems that do not provide HPSS, use zstash with “--hpss=none” to create tar files, which are then copied to LLNL NERSC HPSS for permanent storage.
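As an illustration of the “--hpss=none” case above, the following is a minimal sketch that only assembles and prints a `zstash create` command line; it does not run zstash. The helper name `zstash_create_command` and the directory `/path/to/case_dir` are hypothetical placeholders, not part of zstash or of E3SM policy. Check the zstash documentation for the options that apply to your machine.

```python
import shlex


def zstash_create_command(case_dir, hpss="none"):
    """Build a `zstash create` command as an argument list.

    hpss="none" asks zstash to write tar files locally only, which is
    the case for systems without HPSS. On a system with HPSS, pass the
    HPSS destination path instead.
    """
    return ["zstash", "create", f"--hpss={hpss}", case_dir]


# Hypothetical simulation directory; replace with the real case path.
cmd = zstash_create_command("/path/to/case_dir")

# Print the command for review before running it by hand or in a job script.
print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; the resulting line can be pasted into a batch script on the machine where the data resides.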
The original model output should be archived on NERSC HPSS using zstash:
...
Documenting the HPSS locations on a central Confluence page is a required step, and it helps everyone in the project who might need to locate the files. Members of the infrastructure group closely monitor these pages. Once a new simulation is entered, the data will be copied to a centralized space at LLNL (E3SM Archive - Data Source and Transfer Status) for further post-processing (e.g., ESGF publication). A default set of simulation data will be published to ESGF for official simulations that follow CMIP protocols. Native model output will, by default, be available from NERSC HPSS.
After the v3 release, native datasets are published to ESGF on demand only, and publication must be requested by group leaders. Please email Jill Chengzhu Zhang (zhang40@llnl.gov) and e3sm-data-support@llnl.gov with any special publication requirements (e.g., specific experiments, specific years, CMIP publication).
We use a set of Confluence pages to document data locations and other details once a simulation is completed, as follows:
For v2 and v3: project-wide pages have been created (/wiki/spaces/ED/pages/2766340117 and /wiki/spaces/ED/pages/4282679297), and all production runs are required to have an additional copy available through the NERSC Science Gateway:
...
Cryosphere: /wiki/spaces/ECG/pages/1736933506
A project-wide page was later created to consolidate the data locations: /wiki/spaces/ED/pages/4495441922
The public-facing data documentation webpage for published simulations is maintained through e3sm_data_docs: https://docs.e3sm.org/e3sm_data_docs/_build/html/index.html