Use new nf-test features #6286
edmundmiller commented Aug 23, 2024
- ci: Attempt to split everything out
- ci: Add changed since, sharding, and ci
- ci: Add filter to try to get jobs split up
- ci: Switch to only-changed
- ci: See if follow-dependencies works without "related-tests"
489e5fb
to
26653c8
Compare
That would close #391 as well, and might encourage people to make smaller PRs if we're only giving them 9 CI runners per PR anyway.
6030ae8
to
5594071
Compare
The current blocker is nf-core/tools#3140, or we can give up and keep the paths-filter step for that.
9cd5419
to
8748f04
Compare
Removed the need to use nf-core lint with pre-commit, as that wasn't truly blocking this from getting merged. We can make a follow-up issue to run every linting step through pre-commit.
Looking good. There are still some leftover comments in the nf-test yml, but I like it.
@maxulysse I think I've addressed the comments that can be addressed. @adamrtalbot I think this goes slightly against #4545 (comment), but I think it's worth splitting up since we should be running pytest-workflow less and less, and the sharding is a completely different flow. Though it will probably start up the pytest-workflow workflow on every PR...
da25f80
to
2516a34
Compare
My main concern is getting stuck in a loop where there is too much data per runner and having no way of resolving it.
We over shard now, but at least it's a 1:1 relationship which makes it easy to diagnose an issue. We need to be confident we don't end up in a death loop in the middle of a hackathon.
In addition, we're over-provisioning 3 machines when 1 would suffice.
Still, this has fewer moving parts and is easier to replicate on a local machine, which makes it better under certain situations.
- uses: mirpedrol/paths-filter@main
  id: filter
  with:
    filters: "tests/config/pytest_modules.yml"
This relies on the pytest_modules.yml tags, is that what you want?
Nah, probably just want it to run on any module that's changed. We could include both the nf-test and pytest_modules ymls though I think.
.github/workflows/nf-test.yml
Outdated
matrix:
  filter: [process, workflow]
  profile: [conda, docker_self_hosted, singularity]
  shard: [1, 2, 3]
3 is a very small number for sharding.
If we edit UNTAR, we will run hundreds of tests packed onto the same machines. Remember, if too many tests are packed onto 1 machine there is no way to fix it other than editing this file which essentially blocks the person who opened the PR.
Conversely, if we run 1 nf-test we will add 2 machines that do nothing but install nf-test and close down.
This might be the most practical way of doing this, so I'm not 100% against it, but what's the rationale for the number 3, and can we document it somewhere?
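To make the two extremes above concrete, here is a minimal Python sketch of how a fixed shard count spreads tests across machines. It assumes a simple round-robin assignment, which may not match what nf-test actually does, and `shard_sizes` and the test counts are hypothetical, chosen only for illustration.

```python
def shard_sizes(n_tests: int, n_shards: int) -> list[int]:
    """Return how many tests each shard receives under round-robin assignment.

    Assumption: test i goes to shard i % n_shards. This is an illustrative
    model only, not nf-test's documented behaviour.
    """
    sizes = [0] * n_shards
    for i in range(n_tests):
        sizes[i % n_shards] += 1
    return sizes


if __name__ == "__main__":
    # A change to a widely used module: every shard is heavily loaded.
    print(shard_sizes(300, 3))
    # A single changed test: two shards start up and have nothing to run.
    print(shard_sizes(1, 3))
```

With 3 shards there is no middle ground: a large change packs many tests onto each machine, and a small change leaves most machines idle.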
Is it possible to set it dynamically based on number of tests with an estimate of how many we can pack onto a machine? e.g. n_tests %% 5?
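One way to read that suggestion is "roughly 5 tests per machine, rounded up". A hedged Python sketch of that calculation follows; `shard_count`, `tests_per_shard` and `max_shards` are hypothetical names, and the cap of 9 is only an illustrative upper bound, not a value taken from the workflow.

```python
import math


def shard_count(n_tests: int, tests_per_shard: int = 5, max_shards: int = 9) -> int:
    """Estimate how many shards to launch for a given number of tests.

    Assumptions: each machine can comfortably run `tests_per_shard` tests,
    and the number of machines is capped at `max_shards`.
    """
    if n_tests <= 0:
        return 0  # nothing changed, so no runner is needed
    return min(max_shards, math.ceil(n_tests / tests_per_shard))


if __name__ == "__main__":
    for n in (1, 12, 400):
        print(n, "tests ->", shard_count(n), "shard(s)")
```

The cap keeps the job count bounded, but it also means a very large change still ends up with many tests per machine.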
Of course, by making it dynamic we're back to the current problem of having a dynamic workflow ID, which causes issues with GH branch protection.
So let's go back to my initial point. If you modify a major process like UNTAR, will this still succeed or will it run out of storage? Remember every nf-test invocation downloads all the data it needs.
Is it possible to set it dynamically based on number of tests with an estimate of how many we can pack onto a machine? e.g. n_tests %% 5?
That would be the dream. But yeah I also wanted to avoid the dynamic number of jobs.
3 was just a random number that I picked. Trying to walk the line between runners sitting there, and not enough jobs.
Hot take: What if we just did a --dryRun at the beginning of the shard and if there's no jobs, exit before we set anything else up.
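A rough Python sketch of that early-exit check, assuming the dry run prints the same "Nothing to do." line that nf-test 0.9.0 prints in the log quoted later in this thread. That is an assumption about the dry-run output, and `shard_has_work` is a hypothetical helper.

```python
def shard_has_work(dry_run_output: str) -> bool:
    """Return False when the captured dry-run output reports no tests.

    Assumption: an empty shard prints the line "Nothing to do.", as seen in
    the nf-test 0.9.0 log quoted in this conversation.
    """
    return "Nothing to do." not in dry_run_output


if __name__ == "__main__":
    empty_shard = "Load .nf-test/plugins/nft-bam/0.3.0/nft-bam-0.3.0.jar\nNothing to do.\n"
    if not shard_has_work(empty_shard):
        print("No tests in this shard, skipping the rest of the setup")
```

A shard that reports no work could then skip the remaining setup steps while still reporting success, so the job names stay fixed for branch protection.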
.github/workflows/nf-test.yml
Outdated
strategy:
  fail-fast: false
  matrix:
    filter: [process, workflow]
filter: [process, workflow] | |
filter: [process, workflow] |
Function?
Do we have any in this repo?
Yep: https://github.com/nf-core/modules/blob/master/subworkflows/nf-core/utils_nfcore_pipeline/main.nf
Also, I'd like to encourage people to put this stuff in modules or a plugin so we don't fix it to the template.
confirm-pass:
  runs-on: ubuntu-latest
  needs: [pytest-changes, pytest]
  if: always()
  steps:
    - name: All tests ok
      if: ${{ success() || !contains(needs.*.result, 'failure') }}
      run: exit 0
    - name: One or more tests failed
      if: ${{ contains(needs.*.result, 'failure') }}
      run: exit 1

    - name: debug-print
      if: always()
      run: |
        echo "toJSON(needs) = ${{ toJSON(needs) }}"
        echo "toJSON(needs.*.result) = ${{ toJSON(needs.*.result) }}"
You need to add one of these to every dynamic matrix, so it must be added to all the linting workflow as well.
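For anyone replicating this gate in another workflow, the intent of the `confirm-pass` job can be modelled roughly as below. This is a simplified Python sketch of the two `contains(needs.*.result, 'failure')` conditions, not the workflow itself, and `confirm_pass` is a hypothetical name.

```python
def confirm_pass(results: list[str]) -> bool:
    """Model of the gate: pass unless any needed job reported 'failure'.

    `results` stands in for needs.*.result, whose values are one of
    'success', 'failure', 'cancelled' or 'skipped'.
    """
    return "failure" not in results


if __name__ == "__main__":
    print(confirm_pass(["success", "success"]))    # all matrix jobs passed
    print(confirm_pass(["success", "failure"]))    # one matrix job failed
    print(confirm_pass(["skipped", "cancelled"]))  # nothing ran to completion
```

Under this model, `skipped` and `cancelled` results do not fail the gate, since only the string `failure` is checked.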
Yeah, I left it out originally because I'm hoping to get everything to run in pre-commit so we can do away with the filters and make the branch protections happy.
There's no way of forcing someone to use a pre-commit hook, and I'm wary of forcing people to install stuff on their laptops. In general, it's a good idea to verify code at the source of truth (GitHub), not in isolation. It's much easier to prevent bad code being added than to remove it later.
Perhaps we could run pre-commit before merging? Never really looked into it.
https://github.com/nf-core/modules/blob/3c464e75051db485c1b37ab9f1ea2182fb3d3533/.github/workflows/test.yml#L29C2-L39C36
We already are ✨
I was talking about pre-commit the tool, not a pre-commit git hook.
Agreed on not expecting people to jump through a ton of hoops to get their local dev environments set up. (But I think this might be over since gitpod switched to self-hosted, but they also adopt devcontainers, which if we can get those setup well, anyone with vscode and docker should be able to start hacking quickly.)
Also - the GHA says it uses actions/setup-java v4 but so does master and they're in conflict 😬 You should add v4.4.0 for that action as a comment since you are referencing an explicit commit.
Stress test: #6716
Oh just spotted. Currently the tests aren't doing anything (this is every test!):
🚀 nf-test 0.9.0
https://www.nf-test.com/
(c) 2021 - 2024 Lukas Forer and Sebastian Schoenherr
Load .nf-test/plugins/nft-bam/0.3.0/nft-bam-0.3.0.jar
Load .nf-test/plugins/nft-vcf/1.0.7/nft-vcf-1.0.7.jar
Nothing to do.
d924406
to
c47e407
Compare
Co-authored-by: mashehu <mashehu@users.noreply.github.com>
nextflow secrets set SENTIEON_AUTH_DATA $(python3 tests/modules/nf-core/sentieon/license_message.py encrypt --key "$SENTIEON_ENCRYPTION_KEY" --message "$SENTIEON_LICENSE_MESSAGE")

# TODO If conda-fail.yml exists and matrix = conda skip
# Really we need in-verse tag selection https://github.com/askimed/nf-test/issues/260
while we wait for that, maybe we can add it here: adamrtalbot/detect-nf-test-changes#12
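Until either of those lands, the TODO's rule could be approximated with a small check like the following Python sketch. It assumes `conda-fail.yml` sits directly in the directory being tested, which the TODO does not specify, and `should_skip` is a hypothetical helper.

```python
from pathlib import Path


def should_skip(test_dir: Path, profile: str) -> bool:
    """Skip a test directory on the conda profile if it is marked as failing.

    Assumption: the marker file is named conda-fail.yml and lives directly
    inside `test_dir`; the actual location is not stated in the TODO.
    """
    return profile == "conda" and (test_dir / "conda-fail.yml").exists()


if __name__ == "__main__":
    # Hypothetical path, used only to show the call shape.
    print(should_skip(Path("modules/nf-core/example"), "conda"))
```

Other profiles are never skipped by this check, so docker and singularity runs keep exercising the module.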
    apptainer.runOptions = '--no-mount tmp --writable-tmpfs --nv'
    singularity.runOptions = '--no-mount tmp --writable-tmpfs --nv'
    use_gpu = true
}
docker_self_hosted {
the other gpu profile is outdated, needs platform argument
So cut the other one?
Well doesn't matter which one to keep, but make sure it has all the settings from the one in master
There's two on master though 😆
modules/tests/config/nf-test.config
Lines 35 to 51 in af84433
gpu {
    docker.runOptions = '-u $(id -u):$(id -g) --gpus device=0'
    apptainer.runOptions = '--no-mount tmp --writable-tmpfs --nv'
    singularity.runOptions = '--no-mount tmp --writable-tmpfs --nv'
    use_gpu = true
}
docker_self_hosted {
    docker.enabled = true
    docker.fixOwnership = true
    docker.runOptions = '--platform=linux/amd64'
}
gpu {
    docker.runOptions = '-u $(id -u):$(id -g) --gpus all'
    apptainer.runOptions = '--nv'
    singularity.runOptions = '--nv'
    use_gpu = true
}
Co-authored-by: mashehu <mashehu@users.noreply.github.com>
https://github.com/nf-core/modules/pull/6286/files#r1846856369 Co-authored-by: mashehu <mashehu@users.noreply.github.com>
0764ed0
to
f97c351
Compare
3ab97fa
to
a5557af
Compare
This reverts commit d134e78.
Going to merge since the broken tests aren't in scope, and hopefully I catch any bugs before Europe wakes up 🤞🏻