[Doc]: Update docs to not place log file in directory being archived. #332

Closed
forsyth2 opened this issue Apr 8, 2024 · 3 comments · Fixed by #333

forsyth2 commented Apr 8, 2024

Describe your documentation update

In https://e3sm-project.github.io/zstash/_build/html/main/best_practices.html, under “Archive”, the log file needs to go in a different directory than the one being archived; otherwise zstash will crash trying to archive a file that is still being modified. Typically, put it under the zstash subdirectory instead (mkdir zstash; ... tee zstash/zstash_create_20190226.log). Check whether other parts of the docs do the same and fix those too.
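
One way to make the log placement hard to get wrong is to compute the log path with a small helper. This is only a sketch: `make_zstash_log_path` is a hypothetical function, not part of zstash, and it assumes the log should live in a `zstash` subdirectory of the directory being archived, as described above.

```shell
# Hypothetical helper (not part of zstash): create the zstash/ subdirectory
# under the directory to be archived and print a dated log path inside it.
make_zstash_log_path() {
  root=${1:-.}
  mkdir -p "$root/zstash" || return 1
  printf '%s/zstash/zstash_create_%s.log\n' "$root" "$(date +%Y%m%d)"
}

# Intended use (left commented out so the block runs without zstash installed):
#   log=$(make_zstash_log_path .)
#   zstash create --hpss=<archive path> . 2>&1 | tee "$log"
```

The `<archive path>` placeholder stands for whatever HPSS path (or `none`) the run actually uses.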

@forsyth2 forsyth2 added the Documentation Files in `docs` modified label Apr 8, 2024
@forsyth2 forsyth2 self-assigned this Apr 8, 2024

forsyth2 commented Apr 8, 2024

@golaz Does this only happen when there is a large number of files/tars? Or only with a command other than create?

I tried to make a Minimal Complete Verifiable Example (MCVE) to fill out a bug report (to eventually fix the underlying issue), but it seems to work for me:

$ mkdir zstash_20240408; echo 'file0 stuff' > zstash_20240408/file0.txt
$ cd zstash_20240408/
$ zstash create --hpss=zstash_archive_20240408 . 2>&1 | tee zstash_create_20240408.log
$ ls
file0.txt  zstash  zstash_create_20240408.log
$ ls zstash
index.db

The workaround also works:

$ mkdir zstash_20240408v2; echo 'file0 stuff' > zstash_20240408v2/file0.txt
$ cd zstash_20240408v2/
$ ls
$ mkdir zstash
$ zstash create --hpss=zstash_archive_20240408v2 . 2>&1 | tee zstash/zstash_create_20240408v2.log
$ ls
file0.txt  zstash
$ ls zstash/
index.db  zstash_create_20240408v2.log


golaz commented Apr 11, 2024

@forsyth2, two reasons why your example failed to reproduce the bug:

  1. The log file cannot be the last file being archived.
  2. The error is only visible when you run zstash extract or zstash check.

Here is a modified example that reproduces the bug (verified on Chrysalis):

# Create zstash archive
mkdir zstash_20240408
echo 'file0 stuff' > zstash_20240408/file0.txt
cd zstash_20240408/
zstash create --hpss=none . 2>&1 | tee 20240408.log

# Now, try to extract it
rm -f 20240408.log file0.txt
zstash extract --hpss=none "*"
For help, please see https://e3sm-project.github.io/zstash. Ask questions at https://github.com/E3SM-Project/zstash/discussions/categories/q-a.
INFO: zstash/000000.tar exists. Checking expected size matches actual size.
INFO: Opening tar archive zstash/000000.tar
INFO: Extracting 20240408.log
ERROR: md5 mismatch for: 20240408.log
ERROR: md5 of extracted file: a8600c75b3d84cdaefd020cf13fb6556
ERROR: md5 of original file:  00a33f0fdfbe470ae5b32123cc3e372c
INFO: Extracting file0.txt
Traceback (most recent call last):
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.3_login/lib/python3.10/site-packages/zstash/extract.py", line 535, in extractFiles
    tarinfo: tarfile.TarInfo = tar.tarinfo.fromtarfile(tar)
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.3_login/lib/python3.10/tarfile.py", line 1293, in fromtarfile
    obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.3_login/lib/python3.10/tarfile.py", line 1237, in frombuf
    raise InvalidHeaderError("bad checksum")
tarfile.InvalidHeaderError: bad checksum
ERROR: Retrieving file0.txt
ERROR: Encountered an error for files:
ERROR: 20240408.log in 000000.tar
ERROR: file0.txt in 000000.tar
ERROR: The following tar archives had errors:
ERROR: 000000.tar

# Furthermore, try extracting the tar file directly
cd zstash
tar xvf 000000.tar 
20240408.log
tar: Skipping to next header
tar: Exiting with failure status due to previous errors

# The only way to salvage data from the tar file is to use cpio (ironically)
cpio -ivd -H ustar < 000000.tar
././@PaxHeader
20240408.log
cpio: invalid header: checksum error
cpio: warning: skipped 29 bytes of junk
cpio: ././@PaxHeader not created: newer or same age version exists
././@PaxHeader
file0.txt
10 blocks

# At least the data file is recoverable even if the log file is not
cat file0.txt
file0 stuff
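
A check along the following lines could catch the problem before `zstash create` is run. This is only a sketch: `log_would_be_archived` is a hypothetical helper, not a zstash command, and it assumes that zstash skips its own `zstash` cache subdirectory when archiving, which is what the workaround earlier in this thread relies on.

```shell
# Hypothetical helper (not part of zstash).
# Usage: log_would_be_archived ROOT LOGFILE
# Returns 0 if LOGFILE sits inside ROOT but outside ROOT/zstash, i.e. the log
# would be picked up while it is still being written. Returns 1 if the log is
# in ROOT/zstash or outside ROOT, and 2 if a directory cannot be resolved.
log_would_be_archived() {
  root=$(cd "$1" >/dev/null 2>&1 && pwd -P) || return 2
  log_dir=$(cd "$(dirname "$2")" >/dev/null 2>&1 && pwd -P) || return 2
  case "$log_dir/" in
    "$root/zstash/"*) return 1 ;;  # assumed to be skipped by zstash
    "$root/"*)        return 0 ;;  # inside the tree being archived
    *)                return 1 ;;  # outside the tree entirely
  esac
}

# Example guard before archiving the current directory:
#   if log_would_be_archived . 20240408.log; then
#     echo "Move the log to zstash/ or outside this directory" >&2
#   fi
```

The log file itself does not need to exist yet; only its parent directory must.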

@forsyth2

Thanks @golaz. I documented the issue in #335.
