diff --git a/sites/main-site/src/content/blog/2024/maintainers-minutes-2024-11-29.mdx b/sites/main-site/src/content/blog/2024/maintainers-minutes-2024-11-29.mdx
index 32130b5a44..99ceaa0501 100644
--- a/sites/main-site/src/content/blog/2024/maintainers-minutes-2024-11-29.mdx
+++ b/sites/main-site/src/content/blog/2024/maintainers-minutes-2024-11-29.mdx
@@ -22,80 +22,95 @@ by providing brief summaries of the monthly team meetings.

 In the November meeting we discussed (amongst others) the following topics:

-- [Stricter SemVer versioning](stricter-semver-versioning)
-- [Push for BAM to CRAM?](push-for-bam-to-cram)
-- [Removal of `workflowCitation()`](removal-of-workflowcitation)
-- [Recent CI approach changes](recent-ci-approach-changes)
-- [Modules piling up](modules-piling-up)
+- [Stricter SemVer versioning](#Stricter SemVer versioning)
+- [Push for BAM to CRAM?](#Push for BAM to CRAM?)
+- [Removal of `workflowCitation()`](#Removal of `workflowCitation()`)
+- [Recent CI approach changes](#Recent CI approach changes)
+- [Modules piling up](#Modules piling up)

 ## Stricter SemVer versioning

-We dived immediately into what was expected to be a prickly topic: Jon Manning raised the 'loose' use of [Semantic Versioning (SemVer)](https://semver.org/) within nf-core pipeline release version numbers.
+We immediately dove into what was expected to be a prickly topic: Jon Manning raised concerns about the 'loose' use of [Semantic Versioning (SemVer)](https://semver.org/), within nf-core pipeline release versioning.

-Jon brought up that different pipelines take different approaches, and few follow 'true' SemVer rules.
-The main issue was 'major releases': in SemVer, going from 1.0.0 to 2.0.0 indicates a _breaking change_, where the way a user (or computer) interacts with the software has changed in a way that the way they are used to using the software will not work any more.
-However in many nf-core pipelines, developers have used the major version bump to indicate major new functionalities or large back-end refactoring of the code base, even if it doesn't affect the way a user interacts with the pipeline.
+Jon pointed out that different pipelines take different approaches, with few adhering strictly to ‘true’ SemVer rules. The primary issue revolved around ‘major releases.’ According to SemVer, transitioning from version 1.0.0 to 2.0.0 signifies a breaking change—a change that alters the software’s functionality such that existing usage patterns no longer work. However, many nf-core pipelines have used major version bumps to indicate substantial new functionalities or extensive back-end codebase refactoring, even when these changes do not disrupt user interactions.

-There was a discussion on pros and cons of this with other versioning systems.
+This sparked a discussion about the pros and cons of alternative versioning systems.

-James Fellows Yates and Jose Espinosa felt
-that the 'misuse' of SemVer within nf-core was partly due to the SemVer specification being very technical, and hard to easily
-'convert' to how this applies to pipelines, and thus misinterpreting what it means - particularly when comparing to the definition
-of 'major' releases in existing pipelines.
+James Fellows Yates and Jose Espinosa suggested that the ‘misuse’ of SemVer within nf-core stems partly from the technical nature of the SemVer specification, which can be challenging to apply to pipelines. This complexity often leads to misinterpretation, particularly when comparing the formal SemVer definition of ‘major’ releases to how they are currently used in pipelines.

-Florian Wuennemann continued to display his love of AI ([Nextflow advent calendar](https://flowuenne.github.io/nextflow_advent_calender/)
-anyone?) by proposing having an LLM scan a code base on `dev` branches and work out the versioning for you. This was so not
-well received by the AI skeptics in the group, but it did bring up the question: could we provide tooling and automation
-to help developers evaluate what is the best version bump type for a particular release.
+Florian Wuennemann showcasing his enthusiasm for AI ([Nextflow advent calendar](https://flowuenne.github.io/nextflow_advent_calender/)
+anyone?), proposed using a large language model (LLM) to scan the codebase on dev branches and determine the appropriate versioning automatically. While this idea was met with skepticism from some AI critics in the group, it sparked a productive conversation about whether tooling and automation could assist developers in evaluating the appropriate version bump for a release.

-We agreed that Jon would prepare a [PR](https://github.com/nf-core/website/pull/2842) where the rest of the community can propose what they consider to be the 'rules' about what defines different types of release, following SemVer as far as possible (note the PR has since been merged, but it's still open for updating by the rest of the community).
-Then with these specifications Júlia Mir Pedrol and the infrastructure team would explore tooling to propose during linting the 'best fitting' version bump prior to a new release.
+The group agreed that Jon would prepare a [PR](https://github.com/nf-core/website/pull/2842) where the community could collectively define ‘rules’ for different release types, adhering to SemVer principles as much as possible. (Note: The PR has since been merged, but it remains open for updates from the community.) Following this, Júlia Mir Pedrol and the infrastructure team committed to exploring tooling that could suggest the most appropriate version bump during the linting process before a new release.

 ## Push for BAM to CRAM?

 Maxime Garcia spoke on behalf of Friederike Hanssen (Rike)
- who asked whether we should have a wholesale push to replace BAMs with CRAMs across all modules in nf-core.
+ who raised the question of whether nf-core should initiate a widespread push to replace BAM files with CRAM files across all modules.

-[CRAMs]() are a highly compressed variant of SAM and BAM files, that can in many places greatly reduce hard-drive usage of reference genome-aligned DNA/RNA sequencing data files.
+[CRAM]() files are a highly compressed version of SAM and BAM files, offering significant reductions in hard drive usage for reference genome-aligned DNA/RNA sequencing data.

-While CRAMs are natively supported by modern versions the `htslib` `SAMTools` suite of bioinformatics tools and libraries, there was a lot of hesitation from many of the maintainers to start requiring or pushing harder on this (despite the potential benefits).
-Many of the weary bioinformaticians in the group were concerned that this would require a lot more work than it may seem (Adam Talbot and James Fellows Yates simultaneously bitterly laughed in 'C' and 'Ancient DNA' tools), as knowing bioinformaticians, many tools commonly used in nf-core pipelines may not support more recent versions of `HTSlib` or may have hardcoded BAM as a required input format.
-Getting this to work everywhere would likely require not just updating the nf-core module but in many cases patching the code-base of the tools themselves - something that would be a whole other project to execute due to variety of languages the tools are in and the number of 'abandoned' tools.
+Although CRAM files are natively supported by modern versions of the HTSlib SAMTools suite, many maintainers hesitated to enforce or advocate for their adoption despite the potential benefits. Concerns centered around the substantial effort that such a shift would entail. (Adam Talbot and James Fellows Yates exchanged knowing, bitter laughs over their experiences with ‘C’ programming and ancient DNA tools.)
-We concluded on asking Rike to propose ways of making more awareness during reviewing and maybe stronger emphasis in the module specifications to encourage the adoption of the format where possible.
+The primary challenge lies in tool compatibility. Many bioinformatics tools used in nf-core pipelines either lack support for recent versions of HTSlib or have hardcoded BAM as the required input format. Transitioning to CRAM would not only involve updating nf-core modules but could also necessitate patching the codebases of numerous tools—often written in diverse programming languages, and many of which are no longer actively maintained.
+
+The group agreed to avoid a broad mandate for now. Instead, Rike was encouraged to propose ways to raise awareness of CRAM’s benefits during the review process. Additionally, there was consensus on emphasizing CRAM adoption in the module specifications to encourage its use where feasible.

 ## Removal of `workflowCitation()`

-Júlia Mir Pedrol showed her [proposal](https://github.com/nf-core/modules/pull/7094)
-for removing the `workflowCitation()` function from the nf-core pipeline template.
+Júlia Mir Pedrol presented her [proposal](https://github.com/nf-core/modules/pull/7094)
+to remove the workflowCitation() function from the nf-core pipeline template.

-This function was previously used for printing citation information at the beginning of a pipeline run, to encourage users to cite the pipeline's publication, the nf-core project and other dependencies.
-However it had not been used elsewhere in the pipeline template for a while.
+This function was originally used to display citation information at the start of a pipeline run, encouraging users to cite the pipeline’s publication, the nf-core project, and relevant dependencies. However, it had not been actively used elsewhere in the pipeline template for some time.
-While she had previously asked on Slack, she wanted to check with the maintainers team if anyone was using it before dropping it - to which she she received a unanimous '[PURGE](https://github.com/nf-core/modules/pull/7094#pullrequestreview-2470115528) IT'.
+Although Júlia had previously inquired about its usage on Slack, she wanted to confirm with the maintainers whether anyone was still relying on it before removing it entirely. The response was a unanimous '[PURGE](https://github.com/nf-core/modules/pull/7094#pullrequestreview-2470115528) IT'.

-So in case you have being using the function for your own purposes - be aware it'll no longer be in the nf-core template from the next nf-core/tools release!
+If you’ve been using this function for your own purposes, take note: it will no longer be part of the nf-core template starting with the next release of nf-core/tools!

 ## Recent CI approach changes

-Sateesh Peri gave an overview of the major changes in way the nf-core/modules CI
-tests execute that were implemented over a couple of weeks in November by a small team including himself,
-Edmund Miller, and a recent maintainer team member addition Usman Rashid, among others.
+Sateesh Peri provided an overview of significant changes to the nf-core/modules CI testing approach, implemented over a few weeks in November by a small team that included himself, Edmund Miller, and a recent addition to maintainers team, Usman Rashid, among others.
+
+Key Updates to CI Testing
+1. Introduction of nf-test ‘Shards’
+The use of nf-test shards enables tests defined in the nf-test script to run in parallel instead of sequentially. This significantly speeds up module development cycles by providing faster feedback, especially for failures, enabling quicker iteration and resolution.
+2. --changed-since Feature for Smarter Testing
+The --changed-since flag allows tests to focus on code changes made since a specified commit hash or branch name. By default, this parameter uses HEAD, but developers can specify other targets for more targeted validation. This feature improves CI efficiency by avoiding redundant tests and reduces the environmental impact of running tests.
+3. Support for GPU-Dependent Module Testing
+Modules requiring GPU-enabled tools can now be tested effectively. A new gpu tag triggers these tests to run on ‘self-hosted’ runners outside GitHub Actions, utilizing nf-core credits generously donated by AWS.
+
+> The nf-core/modules CI can be seen in the modules repo `.github` folder [here](https://github.com/nf-core/modules/tree/master/.github/workflows)
+
+Pipeline CI Workflow Updates:
+
+Pipelines using GPU-dependent modules must update their CI workflows to support GPU testing. Currently, nf-core/methylseq serves as a test case for these CI updates.

-The two main changes were the use of nf-test 'shards', allowing different tests defined in the nf-test script file to run in parallel rather than sequentially.
-This means a faster development cycle of modules by having faster successes, or more importantly, fails.
-A second major change is the ability to properly test modules that use tools that require GPUs, with a specific tag now triggering the test to run on 'self-hosted' runners outside of GitHub using nf-core credits kindly donated by AWS.
+The updated CI workflow involves two main github actions:
+• **nf-test-shard**: Performs a dry run of nf-test with the `--changed-since` flag, filtering by tags. It outputs the test shard and total number of shards.
+• **nf-test**: Accepts inputs such as the profile, shard, and total shard count. It filters by tags and runs the tests accordingly.

-Some of the other maintainers brought up a couple of issues such as the now 'uninformative' test names making it harder to know which test failed without clicking through main layers of the GitHub Actions interface, however there are more improvements in the works for this from various areas.
+There are now two YAML workflows in the repository:
+• **nf-test.yml**: For non-GPU pipelines, running the nf-test-shard and nf-test actions.
+• **nf-test-gpu.yml**: For GPU-dependent pipelines, combining the same shard and test actions but tailored for GPU testing.

-The team is not finished yet however,
+> The nf-core/methylseq CI updates are under review and will be included in the upcoming `v2.8.0` release. In the meantime, the implementation PR can be viewed [here](https://github.com/nf-core/methylseq/pull/478)
+
+Current Challenges
+
+While these improvements mark significant progress, they aren’t without challenges. Some maintainers noted that the test names have become less informative, making it harder to quickly identify failing tests without navigating through several layers of the GitHub Actions interface. However, the team is already working on further enhancements to address these usability issues.
+
+Work in Progress
+
+The team emphasized that these changes are just the beginning, with additional refinements and optimizations still in the pipeline especially with more nf-test CI features like `--excludeTags` feature for easier handling of tests.

 ## Modules piling up

-To wrap up, our favourite module maestro Simon Pearce brought up that the number of open pull-requests on the module repository is starting to pile up.
+To close out the discussion, our friendly neighbourhood module maestro Simon Pearce raised a concern about the growing number of open pull requests (PRs) in the module repository.
+
+Simon encouraged community members to spare some time to help review and merge as many PRs as possible before the new year. Even 10 minutes of effort could make a significant difference in reducing the backlog!

-It would be great to have more community members help get as many as possible in before the new year, so if you have a spare 10 minutes help us get them reviewed and merged in!
+Sateesh Peri highlighted another issue: many PRs lack proper labels—such as the `Ready for Review` label—which could help developers filter and prioritize PRs that are ready for review. Sateesh suggested making an announcement to encourage contributors to add appropriate labels based on the status of their PRs, streamlining the review process.

 ## The end