B04. zstash: HPSS long-term archiving tool
Title
zstash: HPSS long-term archiving tool
Authors
Chris Golaz, LLNL
Abstract
E3SM simulations generate large amounts of data that need to be archived on HPSS (High Performance Storage System). For optimal performance, storage on HPSS should consist of a relatively small number of large files. Therefore, it is not practical to archive individual E3SM model output files directly on HPSS.
Zstash is a Python command-line utility developed to serve E3SM long-term archiving needs. With zstash, files are archived into standard tar files with a user-specified maximum size. Tar files are created locally, then transferred to HPSS. For improved performance, MD5 checksums of input files are computed on the fly during archiving. Checksums and additional metadata are stored in a database. File integrity is verified by computing checksums on the fly during extraction.
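The archiving scheme described above can be illustrated with a minimal sketch. This is not zstash's actual code: the names HashingReader, archive_files and verify_archive, the SQLite table layout, and the tar naming pattern are all assumptions made for illustration, and the transfer to HPSS is omitted. The sketch assumes regular files only. It shows how a checksum can be computed while the bytes stream into the tar file, so each input file is read only once.

```python
import hashlib
import os
import sqlite3
import tarfile


class HashingReader:
    """Hypothetical helper: updates an MD5 digest as bytes are read."""

    def __init__(self, fileobj):
        self._fileobj = fileobj
        self._md5 = hashlib.md5()

    def read(self, size=-1):
        data = self._fileobj.read(size)
        self._md5.update(data)
        return data

    def hexdigest(self):
        return self._md5.hexdigest()


def archive_files(paths, out_dir, max_size, db_path):
    """Pack regular files into numbered tar files, recording MD5s in SQLite.

    max_size caps the payload bytes per tar (tar header overhead is ignored).
    A single file larger than max_size still gets a tar of its own.
    """
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(name TEXT, size INTEGER, md5 TEXT, tar TEXT)"
    )
    tar, tar_name, tar_bytes, tar_names = None, None, 0, []
    for path in paths:
        size = os.path.getsize(path)
        # Open a new tar if none is open or this file would exceed the cap.
        if tar is None or (tar_bytes > 0 and tar_bytes + size > max_size):
            if tar is not None:
                tar.close()
            tar_name = f"{len(tar_names):06x}.tar"  # assumed naming pattern
            tar = tarfile.open(os.path.join(out_dir, tar_name), "w")
            tar_names.append(tar_name)
            tar_bytes = 0
        info = tar.gettarinfo(path)
        with open(path, "rb") as f:
            reader = HashingReader(f)
            tar.addfile(info, reader)  # checksum is updated as tarfile reads
        con.execute(
            "INSERT INTO files VALUES (?, ?, ?, ?)",
            (info.name, info.size, reader.hexdigest(), tar_name),
        )
        tar_bytes += size
    if tar is not None:
        tar.close()
    con.commit()
    con.close()
    return tar_names


def verify_archive(out_dir, db_path):
    """Recompute each member's MD5 from its tar; return names that mismatch."""
    con = sqlite3.connect(db_path)
    rows = con.execute("SELECT name, md5, tar FROM files").fetchall()
    con.close()
    bad = []
    for name, md5, tar_name in rows:
        digest = hashlib.md5()
        with tarfile.open(os.path.join(out_dir, tar_name), "r") as tar:
            member = tar.extractfile(name)
            for chunk in iter(lambda: member.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != md5:
            bad.append(name)
    return bad
```

In this sketch, calling archive_files(paths, out_dir, max_size, db_path) returns the list of tar file names it created, and verify_archive(out_dir, db_path) returns an empty list when every stored checksum matches the archived data.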