This repository has been archived by the owner on Jan 30, 2024. It is now read-only.

Commit

Overwritten "Payload exceeded max allowed memory" fix
PalNilsson committed Aug 6, 2018
1 parent 95899e6 commit 074ae3e
Showing 2 changed files with 11 additions and 0 deletions.
7 changes: 7 additions & 0 deletions ATLASExperiment.py
@@ -1439,6 +1439,13 @@ def interpretPayloadStdout(self, job, res, getstatusoutput_was_interrupted, curr
 else:
     job.pilotErrorDiag = "Payload failed due to unknown reason (check payload stdout)"
     job.result[2] = error.ERR_UNKNOWN
+
+# Any errors due to signals can be ignored if the job was killed because of out of memory
+if os.path.exists(os.path.join(job.workdir, "MEMORYEXCEEDED")):
+    tolog("Ignoring any previously detected errors (like signals) since MEMORYEXCEEDED file was found")
+    job.pilotErrorDiag = "Payload exceeded maximum allowed memory"
+    job.result[2] = error.ERR_PAYLOADEXCEEDMAXMEM
+
 tolog("!!FAILED!!3000!! %s" % (job.pilotErrorDiag))

 # set the trf diag error
4 changes: 4 additions & 0 deletions CHANGES
@@ -113,6 +113,10 @@ Log tailing (requested by R. Walker)
 list_replicas()
 - Specifying --pfn in rucio download, stageIn(), which will prevent list_replicas() from being used on server side (rucio_sitemover)

+Overwritten "Payload exceeded max allowed memory" fix
+- Now setting ERR_PAYLOADEXCEEDMAXMEM if MEMORYEXCEEDED file detected at the end of interpretPayloadStdout() to prevent
+  signal error from being set instead. Requested by R. Walker (ATLASExperiment)
+
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

TODO:
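
The override added to interpretPayloadStdout() in this commit can be illustrated in isolation. The following is only a minimal sketch, not the pilot's actual code: `resolve_error()` and `demo()` are hypothetical helpers, and the numeric error codes are placeholders, since the real values are defined in the pilot's error class and are not shown in this diff. Only the marker file name and the diagnostic message are taken from the commit.

```python
import os
import tempfile

# Hypothetical stand-ins for the pilot's error codes; the real values are
# defined in the pilot's error class and do not appear in this commit.
ERR_UNKNOWN = 1
ERR_PAYLOADEXCEEDMAXMEM = 2

# Marker file name taken from the diff.
MEMORY_MARKER = "MEMORYEXCEEDED"


def resolve_error(workdir, code, diag):
    """Return (code, diag), letting the memory marker override earlier errors."""
    if os.path.exists(os.path.join(workdir, MEMORY_MARKER)):
        # The marker file takes precedence over whatever was detected before,
        # for example an error derived from a kill signal.
        return ERR_PAYLOADEXCEEDMAXMEM, "Payload exceeded maximum allowed memory"
    return code, diag


def demo():
    """Call resolve_error() before and after the marker file is created."""
    workdir = tempfile.mkdtemp()
    before = resolve_error(workdir, ERR_UNKNOWN, "Payload killed by signal")
    open(os.path.join(workdir, MEMORY_MARKER), "w").close()
    after = resolve_error(workdir, ERR_UNKNOWN, "Payload killed by signal")
    return before, after
```

Because the check runs last, an earlier signal-related diagnosis is replaced only when the marker file is actually present in the work directory; otherwise the previously detected error is passed through unchanged.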
