...
The Domain “Dataset Spec”: This global specification contains the static configuration that “anchors” the process to a specific domain (e.g. “E3SM datasets”). The document (/p/user_pub/e3sm/staging/resource/dataset_spec.yaml) records the metadata that defines each (E3SM) dataset, experiment, model version(s), resolutions, realms, grids, frequencies, etc. By “walking” the branches of this document, the complete list of E3SM dataset_ids (as reflected in the ESGF “master_id”) can be generated. Subsets of these dataset_ids are passed as tokens to the processes that operate on the corresponding datasets.
The Process “Transition Graph”: This global specification contains the transition rules that define the path(s) of conditional processing.
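The “walking” of the Dataset Spec described above can be sketched as a recursive traversal of a nested mapping. This is only an illustration: the SPEC dictionary below is a hypothetical, heavily reduced stand-in for the parsed dataset_spec.yaml, the function name walk is invented here, and real dataset_ids carry more facets (resolution, grid, ensemble, etc.) than this toy tree shows.

```python
from typing import Iterator, Tuple

# Hypothetical stand-in for the parsed dataset_spec.yaml.
# The real file's keys, nesting depth, and values are NOT reproduced here.
SPEC = {
    "E3SM": {
        "1_0": {
            "piControl": {
                "atmos": ["mon", "day"],
                "ocean": ["mon"],
            }
        }
    }
}


def walk(node, prefix: Tuple[str, ...] = ()) -> Iterator[str]:
    """Yield one dot-joined ID per root-to-leaf branch of the spec tree."""
    if isinstance(node, dict):
        # Interior node: descend into every child, extending the prefix.
        for key, child in node.items():
            yield from walk(child, prefix + (str(key),))
    elif isinstance(node, list):
        # A list holds the alternative values of the final facet.
        for leaf in node:
            yield ".".join(prefix + (str(leaf),))
    else:
        # A scalar is a single final facet.
        yield ".".join(prefix + (str(node),))


ids = list(walk(SPEC))

# A subset of IDs, as might be handed to a process as tokens.
atmos_ids = [d for d in ids if ".atmos." in d]
```

Because the traversal depends only on the shape of the tree, adding a new experiment or frequency to the spec would extend the generated list without any change to the code.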
...
Beyond checkpointing and conditioning the state of future processing, these files can be surveyed to report the status of the entire dataset warehouse (which datasets are at a particular stage of processing), and to answer questions such as “How often was process X engaged?”, “How much time was spent in a particular processing stage?”, or “What fraction of time is spent per stage?”.
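A survey of this kind might look like the following sketch. The entry layout used here (one “YYYYMMDD_HHMMSS:Stage” line per transition) is an assumption made purely for illustration and is not the documented file format; the function names and the “Done” marker are likewise hypothetical. Time in a stage is taken as the gap between an entry and the next one.

```python
from collections import Counter, defaultdict
from datetime import datetime

TS_FORMAT = "%Y%m%d_%H%M%S"  # assumed timestamp layout, not the real one


def parse_entry(line):
    """Split a hypothetical 'timestamp:stage' line into (datetime, stage)."""
    stamp, stage = line.strip().split(":", 1)
    return datetime.strptime(stamp, TS_FORMAT), stage


def engagement_counts(lines):
    """How often each stage was entered."""
    return Counter(parse_entry(l)[1] for l in lines if l.strip())


def time_per_stage(lines):
    """Seconds spent in each stage; the last entry has no successor,
    so it contributes no duration."""
    entries = sorted(parse_entry(l) for l in lines if l.strip())
    totals = defaultdict(float)
    for (start, stage), (end, _next_stage) in zip(entries, entries[1:]):
        totals[stage] += (end - start).total_seconds()
    return dict(totals)


def stage_fractions(totals):
    """Fraction of total recorded time spent per stage."""
    grand_total = sum(totals.values())
    if not grand_total:
        return {}
    return {stage: secs / grand_total for stage, secs in totals.items()}


# Invented sample entries for a single dataset.
sample = [
    "20210101_000000:Validate",
    "20210101_010000:PostProcess",
    "20210101_040000:Publish",
    "20210101_050000:Done",
]
totals = time_per_stage(sample)
fractions = stage_fractions(totals)
```

Running the same functions over every such file in the warehouse and merging the results would give the warehouse-wide view.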
(work in progress)
[This section under construction: random notes may appear here as long as “(work in progress)” appears above.]
NOTE: Publication:Pub_Push operation.
Moves a publishable warehouse dataset to the correspondingly-versioned publication directory.
If the mapfile has already been generated in the warehouse, the version directory’s “.mapfile” is moved as well; otherwise it is generated afterwards and written to that location.
The warehouse ensemble directory “.status” file remains in the warehouse until the 'warehouse:Eviction' occurs, at which point the .status file is moved to the publication ensemble directory.
Therefore, processes that operate primarily over the publication directory (possibly Mapfile_Gen, Pub_Commit, Pub_Verify) must be able to translate between a dataset’s warehouse and publication paths, in order to locate and update the .status file. For a detailed exposition, see: /wiki/spaces/EIDMG/pages/2907766794
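The path translation described above could be approached roughly as follows. This is a sketch under stated assumptions: the two root directories are placeholders rather than the real mount points, the function names are invented, and the layout below each root is assumed to be identical in the warehouse and in publication. The .status file is looked for in the warehouse ensemble directory first (its location before eviction) and then in the publication ensemble directory.

```python
import os
from pathlib import PurePosixPath

# Placeholder roots; the real warehouse and publication roots are not given here.
WAREHOUSE_ROOT = PurePosixPath("/data/warehouse")
PUBLICATION_ROOT = PurePosixPath("/data/publication")


def ensemble_pair(ensemble_dir):
    """Return (warehouse_dir, publication_dir) for an ensemble directory
    given under either root. Raises ValueError if it is under neither."""
    path = PurePosixPath(ensemble_dir)
    try:
        relative = path.relative_to(WAREHOUSE_ROOT)
    except ValueError:
        relative = path.relative_to(PUBLICATION_ROOT)
    return WAREHOUSE_ROOT / relative, PUBLICATION_ROOT / relative


def locate_status(ensemble_dir, exists=os.path.exists):
    """Find the .status file wherever it currently lives.

    `exists` is injectable so the lookup can be exercised without a
    real filesystem.
    """
    warehouse_dir, publication_dir = ensemble_pair(ensemble_dir)
    for candidate in (warehouse_dir / ".status", publication_dir / ".status"):
        if exists(str(candidate)):
            return candidate
    return None


# Example: a process working in the publication tree before eviction.
pub_dir = "/data/publication/E3SM/1_0/piControl/ens1"
present = {"/data/warehouse/E3SM/1_0/piControl/ens1/.status"}
found = locate_status(pub_dir, exists=lambda p: p in present)
```

Deriving both directories from one relative path keeps the two trees in step, so a caller never needs to know on which side of the eviction the dataset currently sits.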
Operational State Machine
To install and operate the existing warehouse state machine (Validate, PostProcess, Publish), see: https://github.com/E3SM-Project/esgfpub/blob/master/docs/3_warehouse.md