Job status query fails if slurm accounting storage is disabled #38
Comments
Thank you for this report. We definitely need to update the error message! We had the fallback in the executor, but decided to drop it to be able to check the states in asynchronous mode with one command. A cluster without an accounting db is pretty unusual. Re-introducing the fallback might not be so easy. Is your particular cluster in an experimental stage?
Thanks for looking at this, I know this is a weird edge case. The cluster in question is somewhat artisanal. I think the slurm cluster profile may be a workable fallback for me. And it looks like 6a197ae fixes the issue of
Perhaps. Then again, you might want to use storage plugins and/or other plugins. That would be a mess. Is there any chance your admins set up the cluster ... eh, properly?
As you mentioned, sometimes slurm accounting is not enabled on non-production clusters, and the way to support this is to use the cluster-generic plugin to run the job. Is that correct?
I get an error when running a job with a slurm instance whose accounting storage is disabled (i.e. the `sacct` command just replies `Slurm accounting storage is disabled`). Here's the stack trace:

Looks like there's some error handling here:

snakemake-executor-plugin-slurm/snakemake_executor_plugin_slurm/__init__.py
Line 221 in 7e3de33

But after the loop over attempts to get job status, the rest of the code assumes no error and treats `status_of_jobs` as a valid set.

The slurm profile also uses `sacct` but falls back to `scontrol` if that fails, which might be a solution: https://github.com/Snakemake-Profiles/slurm/blob/c44315217d1ce36493dc7dccbd013528657747f9/%7B%7Bcookiecutter.profile_name%7D%7D/slurm-status.py#L40
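For illustration, here is a minimal sketch of the `sacct`-then-`scontrol` fallback described above. It is not the plugin's or the profile's actual code: the names `run_command`, `parse_sacct`, `parse_scontrol` and `query_job_states` are hypothetical, and it assumes that a cluster without accounting either makes `sacct` exit non-zero or print the "accounting storage is disabled" message.

```python
import re
import subprocess
from typing import Callable, Dict, Iterable, List, Optional

# Message reported in this issue when accounting is switched off.
ACCOUNTING_DISABLED = "Slurm accounting storage is disabled"


def run_command(cmd: List[str]) -> Optional[str]:
    """Return stdout, or None if the command is missing, fails, or
    reports that accounting storage is disabled (assumed behaviour)."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    if ACCOUNTING_DISABLED in result.stdout + result.stderr:
        return None
    return result.stdout


def parse_sacct(output: str) -> Dict[str, str]:
    """Parse 'jobid|STATE' lines as printed by sacct --parsable2 --noheader."""
    states = {}
    for line in output.splitlines():
        job_id, sep, state = line.partition("|")
        if sep and state.strip():
            # States such as 'CANCELLED by 1000' keep only the first word.
            states[job_id.strip()] = state.split()[0]
    return states


def parse_scontrol(output: str) -> Optional[str]:
    """Extract the JobState field from 'scontrol -o show job <id>' output."""
    match = re.search(r"JobState=(\w+)", output)
    return match.group(1) if match else None


def query_job_states(
    job_ids: Iterable[str],
    run: Callable[[List[str]], Optional[str]] = run_command,
) -> Dict[str, str]:
    """Try one sacct call for all jobs; fall back to scontrol per job."""
    job_ids = list(job_ids)
    out = run(["sacct", "-X", "--parsable2", "--noheader",
               "--format=JobIdRaw,State", "--jobs", ",".join(job_ids)])
    if out is not None:
        return parse_sacct(out)
    states = {}
    for job_id in job_ids:
        out = run(["scontrol", "-o", "show", "job", job_id])
        state = parse_scontrol(out) if out else None
        if state is not None:
            states[job_id] = state
    return states
```

A job missing from the returned dictionary means its state is unknown, so a caller should handle that case explicitly rather than treat the result as a complete set. Note that the fallback costs one `scontrol` call per job, which gives up the single-command status check mentioned in the comments.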