B04. zstash: HPSS long-term archiving tool



Poster Title

zstash: HPSS long-term archiving tool

First Author: Chris Golaz
Topic: software tools, infrastructure
Affiliation: E3SM Water Cycle
Link to document


Title

zstash: HPSS long-term archiving tool

Authors

Chris Golaz, LLNL

Abstract

E3SM simulations generate large amounts of data that need to be archived on HPSS (High Performance Storage System). For optimal performance, storage on HPSS should consist of a relatively small number of large files. Archiving individual E3SM model output files directly on HPSS is therefore impractical.

Zstash is a Python command-line utility developed to serve E3SM long-term archiving needs. With zstash, files are archived into standard tar files with a user-specified maximum size. Tar files are created locally, then transferred to HPSS. For improved performance, MD5 checksums of input files are computed on-the-fly during archiving. Checksums and additional metadata are stored in a database. File integrity is verified by computing checksums on-the-fly during extraction.
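The idea of computing checksums on-the-fly can be illustrated with a minimal Python sketch. This is not zstash's actual code: the names `HashingReader`, `archive_files`, and `verify_archive`, the SQLite table layout, and the choice of SQLite itself are assumptions made for illustration. The sketch also omits the maximum tar size and the transfer to HPSS. It shows only how each input file can be read once, with the same bytes feeding both the tar file and the MD5 digest, and how the stored checksums can be checked again while reading the archive back.

```python
import hashlib
import os
import sqlite3
import tarfile


class HashingReader:
    """Hypothetical helper: wraps a binary file and hashes every chunk read."""

    def __init__(self, fileobj):
        self._fileobj = fileobj
        self._md5 = hashlib.md5()

    def read(self, size=-1):
        chunk = self._fileobj.read(size)
        self._md5.update(chunk)  # checksum is updated as tarfile consumes data
        return chunk

    def hexdigest(self):
        return self._md5.hexdigest()


def archive_files(root, names, tar_path, db_path):
    """Add regular files (paths relative to root) to a tar file.

    Each file is read only once; its MD5 is recorded in a SQLite table
    (illustrative schema, not the one used by zstash).
    """
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(name TEXT, size INTEGER, md5 TEXT, tar TEXT)"
    )
    with tarfile.open(tar_path, "w") as tar:
        for name in names:
            path = os.path.join(root, name)
            info = tar.gettarinfo(path, arcname=name)
            with open(path, "rb") as f:
                reader = HashingReader(f)
                tar.addfile(info, reader)
            con.execute(
                "INSERT INTO files VALUES (?, ?, ?, ?)",
                (info.name, info.size, reader.hexdigest(),
                 os.path.basename(tar_path)),
            )
    con.commit()
    con.close()


def verify_archive(tar_path, db_path):
    """Return names of archived files whose MD5 differs from the database."""
    con = sqlite3.connect(db_path)
    expected = dict(
        con.execute(
            "SELECT name, md5 FROM files WHERE tar = ?",
            (os.path.basename(tar_path),),
        )
    )
    con.close()

    mismatches = []
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            md5 = hashlib.md5()
            src = tar.extractfile(member)
            # Hash the member in 1 MiB chunks while reading it from the tar.
            for chunk in iter(lambda: src.read(1 << 20), b""):
                md5.update(chunk)
            if md5.hexdigest() != expected.get(member.name):
                mismatches.append(member.name)
    return mismatches
```

A call such as `archive_files("run_dir", ["a.nc", "b.nc"], "000000.tar", "index.db")` followed by `verify_archive("000000.tar", "index.db")` would return an empty list when every checksum matches; the directory and file names here are placeholders.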