Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to deal with long running processes? #52

Open
sckott opened this issue Dec 20, 2024 · 5 comments
Open

How to deal with long running processes? #52

sckott opened this issue Dec 20, 2024 · 5 comments
Labels
infrastructure Infrastructure fix to execute WDL GitHub Actions question Further information is requested

Comments

@sckott
Copy link
Collaborator

sckott commented Dec 20, 2024

(pulling info from a slack thread convo with @dtenenba on 2024-12-20 so it doesn't get lost)

We have some "long" running processes that we'll have to deal with for tests. Long here may be 30 seconds or multiple hours, both of which need specific handling - a request to get metadata about job X after 2 seconds may error or return a different set of data than the same request after job X has completed.

Some options:

  • use test fixtures - record a real http response and then use that cached http response insetad of making a real request to the cromwell server - this will be very fast
  • use mocks - similar to fixtures, but instead of the whole http response saved to a file that we read (the fixture) we just specify the returned json in a variable
  • forget about tests that depend on longer running operations? seems not ideal
  • on a cron schedule submit all the wdls/jsons that have longer running operations - say once a day - when we run tests we just use the results already in the test user DB

If things can take a long time and we're okay with that (we don't use the above approaches of fixtures, mocks, etc.), then we need a different testing setup than is typical (where you do X and Y and then run the unit test right away). How would that work?

Some background info:

curl -X POST \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer YOUR_GITHUB_TOKEN" \
  https://api.github.com/repos/OWNER/REPO/actions/workflows/WORKFLOW_ID/dispatches \
  -d '{"ref":"main"}'

It's not clear yet what the solution is, but likely involves some system where one workflow (.yml file) submits workflows, then another runs tests that depend on those workflows (presumably all in final state on cromwell) - how the 2nd system knows when to run is up for debate; possibly some script polling for all workflows being done

@sckott sckott added the question Further information is requested label Dec 20, 2024
@sckott
Copy link
Collaborator Author

sckott commented Jan 6, 2025

Once we have a solution here, we'll want to add additional tests that deal with final states, e.g., right now the cromwell api tests are dealing with initial state after workflow submission, which of course only covers initial state conditions.

@sckott
Copy link
Collaborator Author

sckott commented Jan 6, 2025

@dtenenba @tefirman @seankross Thoughts on this?

@seankross
Copy link
Collaborator

@sckott, based on our discussion, let me know if you agree with this (others are encouraged to weigh in too of course):

For testing the PROOF API, we'll have the following kinds of tests:

  • Short jobs that submitted on PR, push, or any other circumstance that is appropriate.
  • Fixtures that are updated frequently via cron.

Also, tests should only evaluate jobs when they have just been submitted, or they are completely finished running (at either the start or end of their lifecycle).

@sckott
Copy link
Collaborator Author

sckott commented Jan 9, 2025

Yep, agree @seankross

@sckott
Copy link
Collaborator Author

sckott commented Jan 9, 2025

Work started locally on my machine on branch vcr incorporating vcrpy and some other pkgs to cache http requests/responses. This will stay local only until we've worked out how to minimize any exposed sensitive strings in test fixtures (aka: vcr cassettes).

vcr notes and todos:

  • vcr was branched off of api-tests-adding-more - that branch merged to main, so vcr branch is off of main now (merged locally on my machine)
  • the change of dir cromwell-api to cromwellapi originally occurred on vcr branche has been cherry picked onto api-tests-adding-more main
  • make sure that anyone working on vcr cassettes locally is using the token for the test user in 1password
  • maybe change vcr matchers to ignore the hostname that can change (e.g., gizmok99.fhcrc.org:44444) even for the same user

To do still for vcr work:

  • figure out what things need to be hidden. so far excluding authorization header from all requests
  • api-tests.yml workflow has a new rewrite cassettes job, BUT it needs a step in the job that commits any changes to repo
  • think about what tests really need vcr and what tests do not benefit from it (and may add more complexity than is really needed)
    • focus on FINAL state (succeeded, failed, etc.) testing and probably leave initial state to not use vcr
  • ...

@tefirman tefirman added the infrastructure Infrastructure fix to execute WDL GitHub Actions label Jan 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
infrastructure Infrastructure fix to execute WDL GitHub Actions question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants