Archive Irregularities - HR-v1

This is the catalog of archive issues in HR-v1, obtained by post-processing the zstash index.db file manifests (filenames only, without extracting or opening the files).
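As an illustration of the filename-only approach, the sketch below reads the manifest from an index.db file with Python's standard sqlite3 module. It assumes the manifest is an SQLite database with a `files` table whose `name` column holds each archived path; check this against the actual index.db schema before relying on it. The function name `list_archived_names` is hypothetical.

```python
import sqlite3


def list_archived_names(db_path):
    """Return every path recorded in the manifest, sorted by name.

    Assumption: the manifest has a `files` table with a `name` column.
    No archived file is extracted or opened; only the index is read.
    """
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute("SELECT name FROM files ORDER BY name").fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]
```

The returned list can then be grouped by dataset and year to look for gaps, extra files, or duplicates.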

Note: Only the “Production runs” (1950-Control-HR and 1950-Control-21yrContHiVol-HR) are considered here.

The 1950-Control-21yrContHiVol-HR simulation was zstashed uniformly in the archive “theta.20190910.branch_noCNT.n438b.unc03.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG”.

The 1950-Control-HR simulation was stored in a single archive, “theta.20180906.branch_noCNT.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG”, but is divided into two separate runs with distinct internal tar paths. The first run (years 0006 through 0045) is under the tar path “run-0006-01-01-180907--0046-01-01-190111/”, and the follow-on run (years 0046 through 0055) is archived under the path simply labeled “run/”.
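Because one archive holds two runs, the same file name can appear under both tar paths. A minimal sketch of how such duplicates might be detected from a list of manifest names follows. It assumes the names are stored relative to the archive root and begin with one of the two tar paths quoted above; the helper names are hypothetical.

```python
import posixpath

# Tar paths as quoted in the text above.
FIRST_RUN = "run-0006-01-01-180907--0046-01-01-190111/"
SECOND_RUN = "run/"


def split_by_run(names):
    """Collect file basenames separately for each of the two tar paths."""
    first, second = set(), set()
    for name in names:
        base = posixpath.basename(name)
        if name.startswith(FIRST_RUN):
            first.add(base)
        elif name.startswith(SECOND_RUN):
            second.add(base)
    return first, second


def duplicated_basenames(names):
    """Return the basenames that occur under both tar paths."""
    first, second = split_by_run(names)
    return sorted(first & second)
```

A name reported by `duplicated_basenames` only signals a collision by file name. Whether the two copies are identical cannot be decided from the manifest names alone.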

The following table summarizes the irregularities found:

| Archive | Dataset | Issue |
| --- | --- | --- |
| theta.20190910.branch_noCNT.n438b.unc03.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG | HR-v1,1_0,1950-Control-21yrContHiVol-HR,ens1,ocn_nat_5day | year 0077 contains only file 0077-01-01_00.00.00 (extra file?) |
| theta.20190910.branch_noCNT.n438b.unc03.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG | HR-v1,1_0,1950-Control-21yrContHiVol-HR,ens1,ocn_nat_globalStats | year 0077 contains only file 0077-01-01_00.00.00 (extra file?) |
| theta.20180906.branch_noCNT.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG | HR-v1,1_0,1950-Control-HR,ens1,atm_nat_3hr | set 0006 to 0045 ends “0046-01-11-10800”, set 0046 to 0055 begins “0046-01-01-10800” |
| theta.20180906.branch_noCNT.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG | HR-v1,1_0,1950-Control-HR,ens1,atm_nat_6hr | set 0006 to 0045 ends “0046-01-11-10800”, set 0046 to 0055 begins “0046-01-01-10800” |
| theta.20180906.branch_noCNT.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG | HR-v1,1_0,1950-Control-HR,ens1,atm_nat_6hr_snap | set 0006 to 0045 ends “0046-01-11-10800”, set 0046 to 0055 begins “0046-01-01-10800” |
| theta.20180906.branch_noCNT.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG | HR-v1,1_0,1950-Control-HR,ens1,atm_nat_day | set 0006 to 0045 ends “0046-01-12-00000”, set 0046 to 0055 begins “0046-01-02-00000” |
| theta.20180906.branch_noCNT.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG | HR-v1,1_0,1950-Control-HR,ens1,atm_nat_mon | Both set 0006 to 0045 and set 0046 to 0055 contain file 0045-12; the latter is a broken link [*] |
| theta.20180906.branch_noCNT.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG | HR-v1,1_0,1950-Control-HR,ens1,ocn_nat_5day | Both set 0006 to 0045 and set 0046 to 0055 contain file 0046-01-01. Equal? |
| theta.20180906.branch_noCNT.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG | HR-v1,1_0,1950-Control-HR,ens1,ocn_nat_globalStats | Both set 0006 to 0045 and set 0046 to 0055 contain file 0046-01-01. Equal? |
| theta.20180906.branch_noCNT.A_WCYCL1950S_CMIP6_HR.ne120_oRRS18v3_ICG | HR-v1,1_0,1950-Control-HR,ens1,sea-ice_nat_day | MISSING set 0006 to 0045 except 0045-11, 0045-12, and 0046-01. |

[*] The issue with 1950-Control-HR,ens1,atm_nat_mon was caught before publication and was not obvious. The files for years 0026-0045 were extracted from the first archive path and were sound, including the last file (0045-12). The second set (years 0046-0055) inadvertently contained another entry for the file 0045-12, but that entry was a broken link to a non-local directory. Extracting the second set in preparation for publication clobbered the “good” 0045-12 with the broken link, so that file had to be re-extracted from the first archive.
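A check of the following kind, run on the extraction directory after each set is unpacked, could flag a clobbered file like this one. It is only a sketch using the Python standard library, and the function name `find_broken_links` is hypothetical. It reports symbolic links whose targets do not exist on the local filesystem.

```python
import os


def find_broken_links(root):
    """Return the paths under `root` that are symlinks with missing targets."""
    broken = []
    for dirpath, dirnames, filenames in os.walk(root):
        for entry in dirnames + filenames:
            path = os.path.join(dirpath, entry)
            # islink() inspects the link itself; exists() follows it,
            # so a dangling link is a link for which exists() is False.
            if os.path.islink(path) and not os.path.exists(path):
                broken.append(path)
    return sorted(broken)
```

An empty result means no dangling links were found. It does not confirm that the extracted files are otherwise intact.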