From 073eb27795b7cb91f3a3448ebb3036d6f69a30f6 Mon Sep 17 00:00:00 2001
From: FranBonath
Date: Wed, 20 Mar 2024 09:54:38 +0100
Subject: [PATCH 1/3] new structure tools training WIP I

---
 .../gitpod_environment.md    |   82 ++
 .../nf_core_basic_training/index.md |   76 ++
 .../nf_core_create_tool.md   | 1143 +++++++++++++++++
 .../template_walk_through.md |  198 +++
 4 files changed, 1499 insertions(+)
 create mode 100644 src/content/docs/contributing/nf_core_basic_training/gitpod_environment.md
 create mode 100644 src/content/docs/contributing/nf_core_basic_training/index.md
 create mode 100644 src/content/docs/contributing/nf_core_basic_training/nf_core_create_tool.md
 create mode 100644 src/content/docs/contributing/nf_core_basic_training/template_walk_through.md

diff --git a/src/content/docs/contributing/nf_core_basic_training/gitpod_environment.md b/src/content/docs/contributing/nf_core_basic_training/gitpod_environment.md
new file mode 100644
index 0000000000..09d51b5bdc
--- /dev/null
+++ b/src/content/docs/contributing/nf_core_basic_training/gitpod_environment.md
@@ -0,0 +1,82 @@
+---
+title: Basic training to create an nf-core pipeline
+subtitle: A guide to create Nextflow pipelines using nf-core tools
+---
+
+## Preparation
+
+### Prerequisites
+
+- Familiarity with Nextflow syntax and configuration.
+
+### Follow the training videos
+
+This training can be followed either using this documentation alone, or along with a training video hosted on YouTube. You can find the video in the YouTube playlist below:
+
+(no such video yet)
+
+### Gitpod
+
+For this tutorial we will use Gitpod, which runs in the learner's web browser. Gitpod provides a preconfigured Nextflow development environment,
+which includes a terminal, file editor, file browser, Nextflow, and nf-core tools. To use Gitpod, you will need:
+
+- A GitHub account
+- A web browser (Google Chrome, Firefox)
+- An internet connection
+
+Click the link and log in using your GitHub account to start the tutorial:
+
+Launch GitPod
+
+For more information about Gitpod, including how to make your own Gitpod environment, see the Gitpod bytesize talk on YouTube (link to the bytesize talk),
+check the [nf-core Gitpod documentation](gitpod/index) or [Gitpod's own documentation](https://www.gitpod.io/docs).
+
+<details>
+<summary>Expand this section for instructions to explore your Gitpod environment</summary>
+
+#### Explore your Gitpod interface
+
+You should now see something similar to the following:
+
+(insert Gitpod welcome image)
+
+- **The sidebar** allows you to customize your Gitpod environment and perform basic tasks (copy, paste, open files, search, git, etc.). Click the Explorer button to see which files are in this repository.
+- **The terminal** allows you to run all the programs in the repository. For example, both `nextflow` and `docker` are installed and can be executed.
+- **The main window** allows you to view and edit files. Clicking on a file in the explorer will open it within the main window. You should also see the nf-training material browser.
+
+To test that the environment is working correctly, type the following into the terminal:
+
+```bash
+nextflow info
+```
+
+This should print the Nextflow version and runtime information:
+
+```
+Version: 23.10.0 build 5889
+Created: 15-10-2023 15:07 UTC (15:07 GMT)
+System: Linux 6.1.54-060154-generic
+Runtime: Groovy 3.0.19 on OpenJDK 64-Bit Server VM 17.0.8-internal+0-adhoc..src
+Encoding: UTF-8 (UTF-8)
+```
+
+#### Reopening a Gitpod session
+
+When a Gitpod session is not used for a while, i.e., goes idle, it will time out and close the interface.
+You can reopen the environment from your list of workspaces on the Gitpod website. Find your previous environment in the list, then select the ellipsis (three dots icon) and select Open.
+
+If you have saved the URL for your previous Gitpod environment, you can simply open it in your browser.
+
+Alternatively, you can start a new workspace by following the Gitpod URL for the training repository.
+
+If you have lost your environment, you can find the main scripts used in this tutorial in the `nf-training` directory.
+
+#### Saving files from Gitpod to your local machine
+
+To save any file locally from the explorer panel, right-click the file and select Download.
+
+</details>
diff --git a/src/content/docs/contributing/nf_core_basic_training/index.md b/src/content/docs/contributing/nf_core_basic_training/index.md
new file mode 100644
index 0000000000..bafb263a47
--- /dev/null
+++ b/src/content/docs/contributing/nf_core_basic_training/index.md
@@ -0,0 +1,76 @@
+---
+title: Basic training to create an nf-core pipeline
+subtitle: A guide to create Nextflow pipelines using nf-core tools
+---
+
+## Scope
+
+- How do I create a pipeline using nf-core tools?
+- How do I incorporate modules from nf-core modules?
+- How can I use custom code in my pipeline?
+
+:::note
+
+### Learning objectives
+
+- The learner will create a simple pipeline using the nf-core template.
+- The learner will identify key files in the pipeline.
+- The learner will lint their pipeline code to identify work to be done.
+- The learner will incorporate modules from nf-core/modules into their pipeline.
+- The learner will add custom code as a local module into their pipeline.
+- The learner will build an nf-core schema to describe and validate pipeline parameters.
+
+:::
+
+This training course aims to demonstrate how to build an nf-core pipeline using the nf-core pipeline template and nf-core modules, as well as custom, local modules. Be aware that we are not going to explain any fundamental Nextflow concepts; we therefore advise anyone taking this course to first complete the [Basic Nextflow Training Workshop](https://training.nextflow.io/).
+
+:::warning
+During this course we are going to build a simple RNA-seq workflow.
+This workflow is by no means meant to be a useful bioinformatics workflow;
+it is only intended to teach the objectives of the course, so please,
+**DO NOT use this workflow to analyse RNA sequencing data**!
+:::
+
+## Overview
+
+### Layout of the pipeline
+
+The course is going to build a (totally unscientific and useless) RNA-seq pipeline that does the following:
+
+1. Indexing of a transcriptome file
+2. Quality control
+3. Quantification of transcripts
+4. [whatever the custom script does]
+5. Generation of a MultiQC report
+
+### Let's get started
+
+1. **Setting up the Gitpod environment for the course**
+
+   The course uses Gitpod to avoid spending time on downloading and installing tools and data. [Learn how to set up the Gitpod environment](/docs/contributing/nf_core_basic_training/gitpod_environment.md)
+
+2. **Creating a new nf-core pipeline from the nf-core template**
+
+   a) Generate the pipeline with `nf-core create`
+
+   b) The template git repository
+
+   c) Running the pipeline using the test profile
+
+   d) Linting the pipeline
+
+   [These steps are described in the section "Generate a pipeline with nf-core tools"](/docs/contributing/nf_core_basic_training/nf_core_create_tool.md)
+
+3. **Exploring the nf-core template files**
+
+   The template contains a range of important files and directories. Check them out in the [walk-through of the template files](/docs/contributing/nf_core_basic_training/template_walk_through.md)
+
+4. 
**Building a nf-core pipeline using the template** + + a) [Adding a nf-core module to your pipeline]() + + b) [Adding a local custom module to your pipeline]() + + c) [Working with Nextflow schema]() + + d) [Linting your modules]() diff --git a/src/content/docs/contributing/nf_core_basic_training/nf_core_create_tool.md b/src/content/docs/contributing/nf_core_basic_training/nf_core_create_tool.md new file mode 100644 index 0000000000..fb2e42d1a4 --- /dev/null +++ b/src/content/docs/contributing/nf_core_basic_training/nf_core_create_tool.md @@ -0,0 +1,1143 @@ +--- +title: Basic training to create an nf-core pipeline +subtitle: A guide to create Nextflow pipelines using nf-core tools +--- + +## Explore nf-core/tools + +The nf-core/tools package is already installed in the gitpod environment. Now you can check out which pipelines, subworkflows and modules are available via tools. To see all available commands of nf-core tools, run the following: + +```bash +nf-core --help +``` + +We will touch on most of the commands for developers later throughout this tutorial. + +## Create a pipeline from template + +To get started with your new pipeline, run the create command: + +```bash +nf-core create +``` + +This should open a command prompt similar to this: + +``` + + ,--./,-. + ___ __ __ __ ___ /,-._.--~\ + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + + nf-core/tools version 2.10 - https://nf-co.re + + +? Workflow name demotest +? Description test pipeline for demo +? Author FBo +? Do you want to customize which parts of the template are used? No +INFO Creating new nf-core pipeline: 'nf-core/demotest' +INFO Initialising pipeline git repository +INFO Done. Remember to add a remote and push to GitHub: + cd /workspace/basic_training/nf-core-demotest + git remote add origin git@github.com:USERNAME/REPO_NAME.git + git push --all origin +INFO This will also push your newly created dev branch and the TEMPLATE branch for syncing. +INFO !!!!!! IMPORTANT !!!!!! + If you are interested in adding your pipeline to the nf-core community, + PLEASE COME AND TALK TO US IN THE NF-CORE SLACK BEFORE WRITING ANY CODE! + + Please read: https://nf-co.re/developers/adding_pipelines#join-the-community +``` + +Although you can provide options on the command line, it’s easiest to use the interactive prompts. For now we are assuming that we want to create a new nf-core pipeline, so we chose not to customize the template. +It is possible to use nf-core tools for non-nf-core pipelines, but the setup of such pipelines will be handled in a later chapter # ARE WE GOING TO DO THIS? + +### Pipeline git repo + +The nf-core create command has made a fully fledged pipeline for you. 
Before getting too carried away looking at all of the files, note that it has also initiated a git repository:
+
+```bash
+cd nf-core-demotest
+git status
+```
+
+```
+On branch master
+nothing to commit, working tree clean
+```
+
+It has actually created three branches for you:
+
+```bash
+git branch
+```
+
+```
+  TEMPLATE
+  dev
+* master
+```
+
+Each has the same initial commit, with the vanilla template:
+
+```bash
+git log
+```
+
+```
+commit 77e77783aab19e47ce6b2d736d766fbef2483de8 (HEAD -> master, dev, TEMPLATE)
+Author: Franziska Bonath
+Date:   Tue Oct 31 15:21:33 2023 +0000
+
+    initial template build from nf-core/tools, version 2.10
+```
+
+This is important, because this shared git history, with the unmodified nf-core template in the `TEMPLATE` branch, is how the nf-core automated template synchronisation works (see the docs for more details).
+
+The main thing to remember with this is:
+
+Never make changes to the `TEMPLATE` branch, otherwise it will interfere with the synchronisation of nf-core updates.
+
+Ideally, code should be developed on feature branches (i.e. a new branch made with `git checkout -b my_new_feature`), and when ready merged into the `dev` branch upon a successful code review. The `dev` branch is then merged into the `master` branch when a stable release of the workflow is ready to be made.
+
+When creating a new repository on GitHub, create it as an empty repository without a README or any other file. Then push the repo with the template of your new pipeline from your local clone.
+
+:::tip{title="Exercise 1 - Getting around the git environment"}
+
+1. Create and switch to a new git branch called `demo`.
+<details>
+<summary>solution 1</summary>
+
+```bash
+git checkout -b demo
+```
+
+</details>
+
+ +2. Display all available git branches. +
+<details>
+<summary>solution 2</summary>
+
+```bash
+git branch
+```
+
+</details>
+
+ +3. Create a directory within the new pipeline directory called `results` and add it to the `.gitignore` file. +
+<details>
+<summary>solution 3</summary>
+
+```bash
+mkdir results
+```
+
+```txt title=".gitignore"
+.nextflow*
+work/
+data/
+results/
+.DS_Store
+testing/
+testing*
+*.pyc
+```
+
+</details>
+
+ +4. Commit the changes you have made. +
+<details>
+<summary>solution 4</summary>
+
+```bash
+git add .
+git commit -m "creating results dir and adding it to gitignore"
+```
+
+</details>
+
+ ::: + +### Run the new pipeline + +The new pipeline should run with Nextflow, right out of the box. Let’s try: + +```bash +cd ../ +nextflow run nf-core-demotest/ -profile test,docker --outdir test_results +``` + +This basic template pipeline contains already the FastQC and MultiQC modules, which do run on a selection of test data. + + +## Customising the template + +In many of the files generated by the nf-core template, you’ll find code comments that look like this: + +``` +// TODO nf-core: Do something here +``` + +These are markers to help you get started with customising the template code as you write your pipeline. Editor tools such as Todo tree help you easily navigate these and work your way through them. + +## Linting your pipeline + +Customising the template is part of writing your new pipeline. However, not all files should be edited - indeed, nf-core strives to promote standardisation amongst pipelines. + +To try to keep pipelines up to date and using the same code where possible, we have an automated code linting tool for nf-core pipelines. Running nf-core lint will run a comprehensive test suite against your pipeline: + +```bash +cd nf-core-demotest/ +nf-core lint +``` + +The first time you run this command it will download some modules and then perform the linting tests. Linting tests can have one of four statuses: pass, ignore, warn or fail. For example, at first you will see a large number of warnings about TODO comments, letting you know that you haven’t finished setting up your new pipeline. Warnings are ok at this stage, but should be cleared up before a pipeline release. + +``` + ,--./,-. + ___ __ __ __ ___ /,-._.--~\ + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + + nf-core/tools version 2.10 - https://nf-co.re + + +INFO Testing pipeline: . + +╭─ [!] 24 Pipeline Test Warnings ──────────────────────────────────────────────────────────────────────────────────────────────────╮ +│ │ +│ readme: README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release). │ +│ pipeline_todos: TODO string in README.md: TODO nf-core: │ +│ pipeline_todos: TODO string in README.md: Include a figure that guides the user through the major workflow steps. Many nf-core │ +│ pipeline_todos: TODO string in README.md: Fill in short bullet-pointed list of the default steps in the pipeline │ +│ pipeline_todos: TODO string in README.md: Describe the minimum required steps to execute the pipeline, e.g. how to prepare │ +│ samplesheets. │ +│ pipeline_todos: TODO string in README.md: update the following command to include all required parameters for a minimal example │ +│ pipeline_todos: TODO string in README.md: If applicable, make list of people who have also contributed │ +│ pipeline_todos: TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo │ +│ doi and badge at the top of this file. 
│ +│ pipeline_todos: TODO string in README.md: Add bibliography of tools and data used in your pipeline │ +│ pipeline_todos: TODO string in main.nf: Remove this line if you don't need a FASTA file │ +│ pipeline_todos: TODO string in nextflow.config: Specify your pipeline's command line flags │ +│ pipeline_todos: TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required │ +│ pipeline_todos: TODO string in ci.yml: You can customise CI pipeline run tests as required │ +│ pipeline_todos: TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, │ +│ e.g. add publication citation for this pipeline │ +│ pipeline_todos: TODO string in base.config: Check the defaults for all processes │ +│ pipeline_todos: TODO string in base.config: Customise requirements for specific processes. │ +│ pipeline_todos: TODO string in test.config: Specify the paths to your test data on nf-core/test-datasets │ +│ pipeline_todos: TODO string in test.config: Give any required params for the test so that command line flags are not needed │ +│ pipeline_todos: TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly │ +│ in repositories, e.g. SRA) │ +│ pipeline_todos: TODO string in test_full.config: Give any required params for the test so that command line flags are not needed │ +│ pipeline_todos: TODO string in output.md: Write this documentation describing your workflow's output │ +│ pipeline_todos: TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, │ +│ please point to (and add to) the main nf-core website. │ +│ pipeline_todos: TODO string in WorkflowDemotest.groovy: Optionally add in-text citation tools to this list. │ +│ pipeline_todos: TODO string in WorkflowMain.groovy: Add Zenodo DOI for pipeline after first release │ +│ │ +╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ + +╭─ [!] 3 Module Test Warnings ─────────────────────────────────────────────────────────────────────────────────────────────────────╮ +│ ╷ ╷ │ +│ Module name │ File path │ Test message │ +│╶──────────────────────────────────────────┼──────────────────────────────────────────┼──────────────────────────────────────────╴│ +│ custom/dumpsoftwareversions │ modules/nf-core/custom/dumpsoftwarevers… │ New version available │ +│ fastqc │ modules/nf-core/fastqc │ New version available │ +│ multiqc │ modules/nf-core/multiqc │ New version available │ +│ ╵ ╵ │ +╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ +╭───────────────────────╮ +│ LINT RESULTS SUMMARY │ +├───────────────────────┤ +│ [✔] 183 Tests Passed │ +│ [?] 0 Tests Ignored │ +│ [!] 27 Test Warnings │ +│ [✗] 0 Tests Failed │ +╰───────────────────────╯ +``` + +Failures are more serious however, and will typically prevent pull-requests from being merged. For example, if you edit CODE_OF_CONDUCT.md, which should match the template, you’ll get a pipeline lint test failure: + +```bash +echo "Edited" >> CODE_OF_CONDUCT.md +nf-core lint +``` + +``` +[...] +╭─ [✗] 1 Pipeline Test Failed ─────────────────────────────────────────────────────╮ +│ │ +│ files_unchanged: CODE_OF_CONDUCT.md does not match the template │ +│ │ +╰──────────────────────────────────────────────────────────────────────────────────╯ +[...] 
+╭───────────────────────╮ +│ LINT RESULTS SUMMARY │ +├───────────────────────┤ +│ [✔] 182 Tests Passed │ +│ [?] 0 Tests Ignored │ +│ [!] 27 Test Warnings │ +│ [✗] 1 Test Failed │ +╰───────────────────────╯ +[...] +``` + +:::tip{title="Exercise 3 - ToDos and linting"} + +1. Add the following bullet point list to the README file, where the ToDo indicates to describe the default steps to execute the pipeline + + ```groovy title="pipeline overview" + - Indexing of a transcriptome file + - Quality control + - Quantification of transcripts + - [whatever the custom script does] + - Generation of a MultiQC report + ``` + +
+<details>
+<summary>solution 1</summary>
+
+```md title="README.md"
+[...]
+
+## Introduction
+
+**nf-core/demotest** is a bioinformatics pipeline that ...
+
+Default steps:
+
+- Indexing of a transcriptome file
+- Quality control
+- Quantification of transcripts
+- [whatever the custom script does]
+- Generation of a MultiQC report
+
+1. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
+2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))
+
+[...]
+```
+
+</details>
+
+ +2. Lint the changes you have made +
+<details>
+<summary>solution 2</summary>
+
+```bash
+nf-core lint
+```
+
+You should see that we now get one less `warning` in our lint overview, since we removed one of the TODO items.
+
+</details>
+
+ +3. Commit your changes +
+<details>
+<summary>solution 3</summary>
+
+```bash
+git add .
+git commit -m "adding pipeline overview to pipeline README"
+```
+
+</details>
+
+ + ::: + +### Adding an existing nf-core module + +#### Identify available nf-core modules + +The nf-core pipeline template comes with a few nf-core/modules pre-installed. You can list these with the command below: + +```bash +nf-core modules list local +``` + +``` + ,--./,-. + ___ __ __ __ ___ /,-._.--~\ + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + + nf-core/tools version 2.10 - https://nf-co.re + + +INFO Modules installed in '.': + +┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓ +┃ Module Name ┃ Repository ┃ Version SHA ┃ Message ┃ Date ┃ +┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩ +│ custom/dumpsoftwareversions │ https://github.com/nf-cor… │ 911696ea0b62df80e900ef244d… │ Remove quay from │ 2023-05-04 │ +│ │ │ │ biocontainers (#3380) │ │ +│ fastqc │ https://github.com/nf-cor… │ bd8092b67b5103bdd52e300f75… │ Add singularity.registry = │ 2023-07-01 │ +│ │ │ │ 'quay.io' for tests │ │ +│ │ │ │ (#3499) │ │ +│ multiqc │ https://github.com/nf-cor… │ 911696ea0b62df80e900ef244d… │ Remove quay from │ 2023-05-04 │ +│ │ │ │ biocontainers (#3380) │ │ +└─────────────────────────────┴────────────────────────────┴─────────────────────────────┴────────────────────────────┴────────────┘ + +``` + +These version hashes and repository information for the source of the modules are tracked in the modules.json file in the root of the repo. This file will automatically be updated by nf-core/tools when you create, remove or update modules. + +Let’s see if all of our modules are up-to-date: + +```bash +nf-core modules update +``` + +``` + ,--./,-. + ___ __ __ __ ___ /,-._.--~\ + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + + nf-core/tools version 2.10 - https://nf-co.re + + +? Update all modules or a single named module? All modules +? Do you want to view diffs of the proposed changes? No previews, just update everything +INFO Updating 'nf-core/custom/dumpsoftwareversions' +INFO Updating 'nf-core/fastqc' +INFO Updating 'nf-core/multiqc' +INFO Updates complete ✨ +``` + +You can list all of the modules available on nf-core/modules via the command below but we have added search functionality to the nf-core website to do this too! + +```bash +nf-core modules list remote +``` + +#### Install a remote nf-core module + +To install a remote nf-core module from the website, you can first get information about a tool, including the installation command by executing: + +```bash +nf-core modules info salmon/index +``` + +``` + ,--./,-. 
+ ___ __ __ __ ___ /,-._.--~\ + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + + nf-core/tools version 2.10 - https://nf-co.re + +╭─ Module: salmon/index ───────────────────────────────────────────────────────────────────────────────────────────────────────────╮ +│ 🌐 Repository: https://github.com/nf-core/modules.git │ +│ 🔧 Tools: salmon │ +│ 📖 Description: Create index for salmon │ +╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ +╷ ╷ +📥 Inputs │Description │Pattern +╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━╸ +genome_fasta (file) │Fasta file of the reference genome │ +╶────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────┼───────╴ +transcriptome_fasta (file)│Fasta file of the reference transcriptome │ +╵ ╵ +╷ ╷ +📤 Outputs │Description │ Pattern +╺━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━╸ +index (directory)│Folder containing the star index files │ salmon +╶───────────────────┼───────────────────────────────────────────────────────────────────────���──────────────────────────┼────────────╴ +versions (file) │File containing software versions │versions.yml +╵ ╵ + +💻 Installation command: nf-core modules install salmon/index + +``` + +:::tip{title="Exercise 4 - Identification of available nf-core modules"} + +1. Get information abou the nf-core module `salmon/quant`. +
+<details>
+<summary>solution 1</summary>
+
+```bash
+nf-core modules info salmon/quant
+```
+
+</details>
+
+ +2. Is there any version of `salmon/quant` already installed locally? +
+ solution 2 + + ``` + nf-core modules list local + ``` + + If `salmon/quant` is not listed, there is no local version installed. + +
+<details>
+<summary>solution 2</summary>
+
+```bash
+nf-core modules list local
+```
+
+If `salmon/quant` is not listed, there is no local version installed.
+
+</details>
+
+:::
+
+The output from the info command will, among other things, give you the nf-core/tools installation command. Let's see what it is doing:
+
+```bash
+nf-core modules install salmon/index
+```
+
+```
+
+                                          ,--./,-.
+          ___     __   __   __   ___     /,-._.--~\
+    |\ | |__  __ /  ` /  \ |__) |__         }  {
+    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
+                                          `._,._,'
+
+    nf-core/tools version 2.10 - https://nf-co.re
+
+
+INFO     Installing 'salmon/index'
+INFO     Use the following statement to include this module:
+
+ include { SALMON_INDEX } from '../modules/nf-core/salmon/index/main'
+```
+
+(lots of steps missing here)
+exercise to add a different module would be nice! => salmon/quant!
+comparison to simple nextflow pipeline from the basic Nextflow training would be nice!)
+
+:::tip{title="Exercise 5 - Installing a remote module from nf-core"}
+
+1. Install the nf-core module `adapterremoval`
+<details>
+<summary>solution 1</summary>
+
+```bash
+nf-core modules install adapterremoval
+```
+
+</details>
+
+
+2. Which file(s) were added, and what do they do?
+
+<details>
+<summary>solution 2</summary>
+
+Installation added the module directory `/workspace/basic_training/nf-core-demotest/modules/nf-core/adapterremoval`:
+
+```
+.
+├── environment.yml
+├── main.nf
+├── meta.yml
+└── tests
+    ├── main.nf.test
+    ├── main.nf.test.snap
+    └── tags.yml
+```
+
+The `tests` directory contains all information required to perform basic tests for the module and rarely needs to be changed. `main.nf` is the main workflow file that contains the module code. All input and output variables of the module are described in the `meta.yml` file, whereas the `environment.yml` file contains the dependencies of the module.
+
+</details>
+
+
+3. Import the installed `adapterremoval` module into your main workflow.
+
+<details>
+<summary>solution 3</summary>
+
+```groovy title="workflows/demotest.nf"
+[...]
+/*
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+*/
+
+include { FASTQC                 } from '../modules/nf-core/fastqc/main'
+include { MULTIQC                } from '../modules/nf-core/multiqc/main'
+include { paramsSummaryMap       } from 'plugin/nf-validation'
+include { paramsSummaryMultiqc   } from '../subworkflows/nf-core/utils_nfcore_pipeline'
+include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline'
+include { methodsDescriptionText } from '../subworkflows/local/utils_nfcore_demotest_pipeline'
+include { ADAPTERREMOVAL         } from '../modules/nf-core/adapterremoval/main'
+
+[...]
+```
+
+</details>
+
+ +4. Call the `ADAPTERREMOVAL` process in your workflow +
+<details>
+<summary>solution 4</summary>
+
+```groovy title="workflows/demotest.nf"
+[...]
+    FASTQC (
+        ch_samplesheet
+    )
+    ch_multiqc_files = ch_multiqc_files.mix(FASTQC.out.zip.collect{it[1]})
+    ch_versions = ch_versions.mix(FASTQC.out.versions.first())
+
+    //
+    // MODULE: ADAPTERREMOVAL
+    //
+    ADAPTERREMOVAL(
+
+    )
+[...]
+```
+
+</details>
+
+
+5. Add the required parameters for `adapterremoval` to the `ADAPTERREMOVAL` process.
+
+<details>
+<summary>solution 5</summary>
+
+`adapterremoval` requires three inputs: `meta`, `reads` and `adapterlist`, as outlined in the `meta.yml` of the module. `meta` and `reads` are typically given in one channel as a meta map, whereas the `adapterlist` will be its own channel for which we should give a path. See here:
+
+```groovy title="adapterremoval/main.nf"
+[...]
+    input:
+    tuple val(meta), path(reads)
+    path(adapterlist)
+[...]
+```
+
+The meta map containing the metadata and the reads can be taken directly from the samplesheet, as is the case for FastQC, therefore we can give it the input channel `ch_samplesheet`. The `adapterlist` could either be a fixed path, or a parameter that is given on the command line. For now, we will just add a dummy channel called `adapterlist`, assuming that it will be a parameter given on the command line. With this, the new module call for adapterremoval looks as follows:
+
+```groovy title="workflows/demotest.nf"
+[...]
+    //
+    // MODULE: ADAPTERREMOVAL
+    //
+    ADAPTERREMOVAL(
+        ch_samplesheet,
+        params.adapterlist
+    )
+[...]
+```
+
+</details>
+
+ +6. Add the input parameter `adapterlist` +
+<details>
+<summary>solution 6</summary>
+
+In order to use `params.adapterlist` we need to add the parameter to the `nextflow.config`:
+
+```groovy title="nextflow.config"
+// Global default params, used in configs
+params {
+
+    // TODO nf-core: Specify your pipeline's command line flags
+    // Input options
+    input       = null
+    adapterlist = null
+
+[...]
+```
+
+Then use the `nf-core schema build` tool to have the new parameter integrated into `nextflow_schema.json`. The output should look as follows:
+
+```
+gitpod /workspace/basic_training/nf-core-demotest (master) $ nf-core schema build
+
+                                          ,--./,-.
+          ___     __   __   __   ___     /,-._.--~\
+    |\ | |__  __ /  ` /  \ |__) |__         }  {
+    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
+                                          `._,._,'
+
+    nf-core/tools version 2.13.1 - https://nf-co.re
+
+INFO     [✓] Default parameters match schema validation
+INFO     [✓] Pipeline schema looks valid (found 32 params)
+✨ Found 'params.adapterlist' in the pipeline config, but not in the schema. Add to pipeline schema? [y/n]: y
+```
+
+Select `y` on the final prompt to launch a web browser to edit your schema graphically.
+
+</details>
+
+ +7. Lint your pipeline +
+<details>
+<summary>solution 7</summary>
+
+```bash
+nf-core lint
+```
+
+</details>
+
+ +8. Run the pipeline and inspect the results +
+<details>
+<summary>solution 8</summary>
+
+To run the pipeline, be aware that we now need to specify a file containing the adapters. As such, we create a new file called "adapterlist.txt" and add the adapter sequence "[WE NEED AN ADAPTER SEQUENCE HERE]" to it. Then we can run the pipeline as follows:
+
+```bash
+nextflow run nf-core-demotest/ -profile test,docker --outdir test_results --adapterlist /path/to/adapterlist.txt
+```
+
+</details>
+
+ +9. Commit the changes +
+<details>
+<summary>solution 9</summary>
+
+```bash
+git add .
+git commit -m "add adapterremoval module"
+```
+
+</details>
+
+
+:::
+
+### Adding a local module
+
+If there is no nf-core module available for the software you want to include, the nf-core tools package can also aid in the generation of a local module that is specific to your pipeline. To add a local module, run the following:
+
+```bash
+nf-core modules create
+```
+
+Open `./modules/local/demo/module.nf` and start customising this to your needs whilst working your way through the extensive TODO comments!
+
+### Making a local module for a custom script
+
+To generate a module for a custom script, you need to follow the same steps as when adding a remote module.
+Then, you can supply the command for your script in the `script` block, but your script needs to be present
+and _executable_ in the `bin`
+folder of the pipeline.
+In nf-core pipelines,
+this folder is in the main directory, as you can see in [`rnaseq`](https://github.com/nf-core/rnaseq).
+Let's look at a publicly available example in this pipeline,
+for instance [`tximport.r`](https://github.com/nf-core/rnaseq/blob/master/bin/tximport.r).
+This is an R script present in the [`bin`](https://github.com/nf-core/rnaseq/tree/master/bin) folder of the pipeline.
+We can find the module that runs this script in
+[`modules/local/tximport`](https://github.com/nf-core/rnaseq/blob/master/modules/local/tximport/main.nf).
+As we can see, the script is called in the `script` block; note that `tximport.r` is
+executed as if it was called from the command line and therefore needs to be _executable_.
+
+
+<details>
+<summary>TL;DR</summary>
+
+1. Write your script in any language (python, bash, R,
+   ruby), e.g. `maf2bed.py`.
+2. If not there yet, move your script to the `bin` folder of
+   the pipeline and make it
+   executable (`chmod +x <your_script>`).
+3. Create a module with a single process to call your script from within the workflow, e.g. `./modules/local/convert_maf2bed/main.nf`.
+4. Include your new module in your workflow with the command `include { CONVERT_MAF2BED } from './modules/local/convert_maf2bed/main'`, written before the workflow call.
+
+</details>
+
+_Tip: Try to follow best practices when writing a script for
+reproducibility and maintenance purposes: add the
+shebang (e.g. `#!/usr/bin/env python`), and a header
+with description and type of license._
+
+### 1. Write your script
+
+Let's create a simple custom script that converts a MAF file to a BED file, called `maf2bed.py`, and place it in the bin directory of our nf-core-demotest:
+
+```python title="maf2bed.py"
+#!/usr/bin/env python
+"""
+Author: Raquel Manzano - @RaqManzano
+Script: Convert MAF to BED format keeping ref and alt info
+License: MIT
+"""
+import argparse
+import pandas as pd
+
+
+def argparser():
+    parser = argparse.ArgumentParser(description="")
+    parser.add_argument("-maf", "--mafin", help="MAF input file", required=True)
+    parser.add_argument("-bed", "--bedout", help="BED output file", required=True)
+    parser.add_argument(
+        "--extra", help="Extra columns to keep (space separated list)", nargs="+", required=False, default=[]
+    )
+    return parser.parse_args()
+
+
+def maf2bed(maf_file, bed_file, extra):
+    maf = pd.read_csv(maf_file, sep="\t", comment="#")
+    bed = maf[["Chromosome", "Start_Position", "End_Position"] + extra]
+    bed.to_csv(bed_file, sep="\t", index=False, header=False)
+
+
+def main():
+    args = argparser()
+    maf2bed(maf_file=args.mafin, bed_file=args.bedout, extra=args.extra)
+
+
+if __name__ == "__main__":
+    main()
+```
+
+### 2. Make sure your script is in the right folder
+
+Now, let's move it to the correct directory and make sure it is executable:
+
+```bash
+mv maf2bed.py /path/where/pipeline/is/bin/.
+chmod +x /path/where/pipeline/is/bin/maf2bed.py
+```
+
+### 3. Create your custom module
+
+Then, let's write our module. We will call the process
+"CONVERT_MAF2BED" and add any tags and/or labels that
+are appropriate (this is optional) and directives (via
+conda and/or container) for
+the definition of dependencies.
+
+
+<details>
+<summary>Some additional info that might be of interest</summary>
+
+<details>
+<summary>More info on labels</summary>
+
+A `label` will
+annotate the processes with a reusable identifier of your
+choice that can be used for configuring. E.g. we use the
+`label` 'process_single', which looks as follows in the configuration:
+
+```groovy
+withLabel:process_single {
+    cpus   = { check_max( 1    * task.attempt, 'cpus'   ) }
+    memory = { check_max( 1.GB * task.attempt, 'memory' ) }
+    time   = { check_max( 1.h  * task.attempt, 'time'   ) }
+}
+```
+
+</details>
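+
+On the process side, a module opts into this configuration simply by declaring the label. A minimal sketch (illustrative only, not a module from this pipeline):
+
+```groovy
+process FOO {
+    // picks up the cpus/memory/time configured for 'process_single' above
+    label 'process_single'
+
+    script:
+    """
+    your_command --here
+    """
+}
+```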
+ +
+<details>
+<summary>More info on tags</summary>
+
+A `tag` is simply a user-provided identifier associated with
+the task. In our process example, the input is a tuple
+comprising a hash of metadata for the maf file called
+`meta` and the path to the `maf` file. It may look
+similar to: `[[id:'123', data_type:'maf'],
+/path/to/file/example.maf]`. Hence, when Nextflow makes
+the call and `$meta.id` is `123`, the name of the job
+will be "CONVERT_MAF2BED(123)". If `meta` does not have
+`id` in its hash, then this will be literally `null`.
+
+</details>
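+
+As a sketch, the process side just declares the tag, interpolating values from the input tuple (illustrative only):
+
+```groovy
+process CONVERT_MAF2BED {
+    // with meta = [id:'123', data_type:'maf'], the job shows up as CONVERT_MAF2BED (123)
+    tag "$meta.id"
+
+    input:
+    tuple val(meta), path(maf)
+
+    script:
+    """
+    echo "processing ${meta.id}"
+    """
+}
+```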
+ +
+<details>
+<summary>More info on conda/container directives</summary>
+
+The `conda` directive allows for the definition of the
+process dependencies using the [Conda package manager](https://docs.conda.io/en/latest/). Nextflow automatically sets up an environment for the given package names listed in the conda directive. For example:
+
+```groovy
+process foo {
+    conda 'bwa=0.7.15'
+
+    '''
+    your_command --here
+    '''
+}
+```
+
+Multiple packages can be specified separating them with a blank space, e.g. `bwa=0.7.15 samtools=1.15.1`. The name of the channel from which a specific package needs to be downloaded can be specified using the usual Conda notation, i.e. prefixing the package with the channel name, as shown here: `bioconda::bwa=0.7.15`
+
+```groovy
+process foo {
+    conda 'bioconda::bwa=0.7.15 bioconda::samtools=1.15.1'
+
+    '''
+    your_bwa_cmd --here
+    your_samtools_cmd --here
+    '''
+}
+```
+
+Similarly, we can apply the `container` directive to execute the process script in a [Docker](http://docker.io/) or [Singularity](https://docs.sylabs.io/guides/3.5/user-guide/introduction.html) container. When running Docker, it requires the Docker daemon to be running on the machine where the pipeline is executed, i.e. the local machine when using the local executor or the cluster nodes when the pipeline is deployed through a grid executor.
+
+```groovy
+process foo {
+    conda 'bioconda::bwa=0.7.15 bioconda::samtools=1.15.1'
+    container 'dockerbox:tag'
+
+    '''
+    your_bwa_cmd --here
+    your_samtools_cmd --here
+    '''
+}
+```
+
+Additionally, the `container` directive allows for a more sophisticated choice of container, and whether it is Docker or Singularity, depending on the user's choice of container engine. This practice is quite common in official nf-core modules.
+
+```groovy
+process foo {
+    conda "bioconda::fastqc=0.11.9"
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0' :
+        'biocontainers/fastqc:0.11.9--0' }"
+
+    '''
+    your_fastqc_command --here
+    '''
+}
+```
+
+</details>
+ +
+</details>
+
+Since `maf2bed.py` is in the `bin` directory we can directly call it in the script block of our new module `CONVERT_MAF2BED`. You only have to be careful with how you call variables (i.e. when to use `${variable}` vs. `$variable`).
+A process may contain any of the following definition blocks: directives, inputs, outputs, when clause, and the process script. Here is how we write it:
+
+```groovy
+process CONVERT_MAF2BED {
+    // HEADER
+    tag "$meta.id"
+    label 'process_single'
+
+    // DEPENDENCIES DIRECTIVES
+    conda "anaconda::pandas=1.4.3"
+    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+        'https://depot.galaxyproject.org/singularity/pandas:1.4.3' :
+        'quay.io/biocontainers/pandas:1.4.3' }"
+
+    // INPUT BLOCK
+    input:
+    tuple val(meta), path(maf)
+
+    // OUTPUT BLOCK
+    output:
+    tuple val(meta), path('*.bed'), emit: bed
+    path "versions.yml"           , emit: versions
+
+    // WHEN CLAUSE
+    when:
+    task.ext.when == null || task.ext.when
+
+    // SCRIPT BLOCK
+    script: // This script is bundled with the pipeline in bin
+    def args = task.ext.args ?: ''
+    def prefix = task.ext.prefix ?: "${meta.id}"
+
+    """
+    maf2bed.py --mafin $maf --bedout ${prefix}.bed
+    """
+}
+```
+
+More on nextflow's process components in the [docs](https://www.nextflow.io/docs/latest/process.html).
+
+### Include your module in the workflow
+
+In general, we will call our nextflow module file `main.nf` and save it in the `modules` folder, under another folder called `convert_maf2bed`. If you believe your custom script could be useful for others, is potentially reusable, or calls a tool that is not yet present in nf-core modules, you can start the process of making it official by adding a `meta.yml` [explained above](#adding-modules-to-a-pipeline). The overall tree for the pipeline skeleton will look as follows:
+
+```
+pipeline/
+├── bin/
+│   └── maf2bed.py
+├── modules/
+│   ├── local/
+│   │   └── convert_maf2bed/
+│   │       ├── main.nf
+│   │       └── meta.yml
+│   └── nf-core/
+├── config/
+│   ├── base.config
+│   └── modules.config
+...
+```
+
+To use our custom module located in `./modules/local/convert_maf2bed` within our workflow, we use a module inclusion command as follows (this has to be done before we invoke our workflow):
+
+```groovy title="workflows/demotest.nf"
+include { CONVERT_MAF2BED } from './modules/local/convert_maf2bed/main'
+
+workflow {
+    input_data = [[id:123, data_type:'maf'], file('/path/to/maf/example.maf')]
+    CONVERT_MAF2BED(input_data)
+}
+```
+
+:::tip{title="Exercise 6 - Adding a custom module"}
+In the directory `exercise_6` you will find the custom script `print_hello.py`, which will be used for this and the next exercise.
+
+1. Create a local module that runs the `print_hello.py` script
+2. Add the module to your main workflow
+3. Run the pipeline
+4. Lint the pipeline
+5. Commit your changes
+<details>
+<summary>solution 1</summary>
+
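+A minimal sketch of what such a local module could look like is shown below. Since the contents of `print_hello.py` are not shown here, the output name and the container are assumptions; adjust them to the actual script:
+
+```groovy title="modules/local/print_hello/main.nf"
+process PRINT_HELLO {
+    label 'process_single'
+
+    // assumption: any image providing python3 works; this one is used elsewhere in the template
+    container 'biocontainers/python:3.8.3'
+
+    output:
+    path 'hello.txt'    , emit: txt
+    path 'versions.yml' , emit: versions
+
+    script:
+    """
+    print_hello.py > hello.txt
+
+    cat <<-END_VERSIONS > versions.yml
+    "${task.process}":
+        python: \$(python --version | sed 's/Python //')
+    END_VERSIONS
+    """
+}
+```
+
+It can then be imported in `workflows/demotest.nf` with `include { PRINT_HELLO } from '../modules/local/print_hello/main'` and called in the workflow body with `PRINT_HELLO()`.
+
+</details>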
+
+:::
+
+### Further reading and additional notes
+
+#### What happens if I want to use containers but there is no image created with the packages I need?
+
+No worries, this can be done fairly easily thanks to [BioContainers](https://biocontainers-edu.readthedocs.io/en/latest/what_is_biocontainers.html), see instructions [here](https://github.com/BioContainers/multi-package-containers). If you see the combination that you need in the repo, you can also use [this website](https://midnighter.github.io/mulled) to find out the "mulled" name of this container.
+
+#### I want to know more about software dependencies!
+
+You are in luck, we have more documentation [here](https://nf-co.re/docs/contributing/modules#software-requirements)
+
+#### I want to know more about modules!
+
+See more info about modules in the nf-core docs [here](https://nf-co.re/docs/contributing/modules#software-requirements)
+
+## Lint all modules
+
+As well as the pipeline template, you can lint individual or all modules with a single command:
+
+```bash
+nf-core modules lint --all
+```
+
+## Nextflow Schema
+
+All nf-core pipelines can be run with `--help` to see usage instructions. We can try this with the demo pipeline that we just created:
+
+```bash
+cd ../
+nextflow run nf-core-demotest/ --help
+```
+
+### Working with Nextflow schema
+
+If you peek inside the `nextflow_schema.json` file you will see that it is quite an intimidating thing. The file is large and complex, and very easy to break if edited manually.
+
+Thankfully, we provide a user-friendly tool for editing this file: `nf-core schema build`.
+
+To see this in action, let's add some new parameters to `nextflow.config`:
+
+```groovy
+params {
+    demo = 'param-value-default'
+    foo  = null
+    bar  = false
+    baz  = 12
+    // rest of the config file..
+```
+
+Then run `nf-core schema build`:
+
+```bash
+cd nf-core-demotest/
+nf-core schema build
+```
+
+The CLI tool should then prompt you to add each new parameter.
+Here in the schema editor you can edit:
+
+- Description and help text
+- Type (string / boolean / integer etc)
+- Grouping of parameters
+- Whether a parameter is required, or hidden from help by default
+- Enumerated values (choose from a list)
+- Min / max values for numeric types
+- Regular expressions for validation
+- Special formats for strings, such as file-path
+- Additional fields for files such as mime-type
+
+:::tip{title="Exercise 7 - Using nextflow schema to add command line parameters"}
+
+1. Feed a string from the command line to your custom script from exercise 6. Add the parameter to `nextflow.config` and use `nf-core schema build` to add it to the pipeline schema.
+
+:::
+
+:::note
+
+### Key points
+
+- `nf-core create <pipeline_name>` creates a pipeline from the nf-core template.
+- `nf-core lint` lints the pipeline code for things that must be completed.
+- `nf-core modules list local` lists modules currently installed into your pipeline.
+- `nf-core modules list remote` lists modules available to install into your pipeline.
+- `nf-core modules install <tool>` installs the tool module into your pipeline.
+- `nf-core modules create` creates a module locally to add custom code into your pipeline.
+- `nf-core modules lint --all` lints your module code for things that must be completed.
+- `nf-core schema build` opens an interface to allow you to describe your pipeline parameters and set default values, and which values are valid.
+ +::: + +``` + +``` diff --git a/src/content/docs/contributing/nf_core_basic_training/template_walk_through.md b/src/content/docs/contributing/nf_core_basic_training/template_walk_through.md new file mode 100644 index 0000000000..6bed8a9529 --- /dev/null +++ b/src/content/docs/contributing/nf_core_basic_training/template_walk_through.md @@ -0,0 +1,198 @@ +--- +title: Basic training to create an nf-core pipeline +subtitle: A guide to create Nextflow pipelines using nf-core tools +--- + +### Template code walk through + +Now let us have a look at the files that were generated within the `nf-core-demotest` directory when we created this pipeline. You can see all files and directories either on the left hand side in the Explorer, or by running the command: + +```bash +cd nf-core-demotest +tree +``` + +``` +. +├── assets +│ ├── adaptivecard.json +│ ├── email_template.html +│ ├── email_template.txt +│ ├── methods_description_template.yml +│ ├── multiqc_config.yml +│ ├── nf-core-demotest_logo_light.png +│ ├── samplesheet.csv +│ ├── schema_input.json +│ ├── sendmail_template.txt +│ └── slackreport.json +├── bin +│ └── check_samplesheet.py +├── CHANGELOG.md +├── CITATIONS.md +├── CODE_OF_CONDUCT.md +├── conf +│ ├── base.config +│ ├── igenomes.config +│ ├── modules.config +│ ├── test.config +│ └── test_full.config +├── docs +│ ├── images +│ │ ├── mqc_fastqc_adapter.png +│ │ ├── mqc_fastqc_counts.png +│ │ ├── mqc_fastqc_quality.png +│ │ ├── nf-core-demotest_logo_dark.png +│ │ └── nf-core-demotest_logo_light.png +│ ├── output.md +│ ├── README.md +│ └── usage.md +├── lib +│ ├── nfcore_external_java_deps.jar +│ ├── NfcoreTemplate.groovy +│ ├── Utils.groovy +│ ├── WorkflowDemotest.groovy +│ └── WorkflowMain.groovy +├── LICENSE +├── main.nf +├── modules +│ ├── local +│ │ └── samplesheet_check.nf +│ └── nf-core +│ ├── custom +│ │ └── dumpsoftwareversions +│ │ ├── main.nf +│ │ ├── meta.yml +│ │ └── templates +│ │ └── dumpsoftwareversions.py +│ ├── fastqc +│ │ ├── main.nf +│ │ └── meta.yml +│ └── multiqc +│ ├── main.nf +│ └── meta.yml +├── modules.json +├── nextflow.config +├── nextflow_schema.json +├── pyproject.toml +├── README.md +├── subworkflows +│ └── local +│ └── input_check.nf +├── tower.yml +└── workflows + └── demotest.nf +``` + +These are the files in detail: + +1. **main.nf** + + This file contains the main nextflow pipeline code. Mostly this file is not touched. + +2. **workflows/demotest.nf** + + This file is where the pipeline is going to be assembled. It connects the different modules and subworkflows. + +3. **CHANGELOG.md, CODE_OF_CONDUCT.md, LICENSE, README.md, CITATIONS.md** + + These are standard files created for github repositories. As a default, this pipeline will be under an MIT licence. The CODE_OF_CONDUCT is specific for the nf-core community. + +4. **assets/** + + This directory contains different templates such as email templates or the MultiQC config. In this course contents of this directory can be largely ignored. + +5. **bin/** + + The `bin` directory contains custom executable scripts, and is automatically added to the `PATH` by Nextflow allowing these scripts to become findable by Nextflow. As such, they can be called by name without using their absolute or relative path. The python script `check_samplesheet.py` is part of the nf-core template, since typically, nf-core pipelines require a samplesheet as one of their inputs. + +6. **conf/** and **nextflow.config** + + The `nextflow.config` file is the main config file. 
In addition to supplying default parameters, it imports all the configurations in `conf/`. Importantly, `conf/` contains the `test.config` file, which is used for pipeline testing. In this course we are not going to touch config files, but they have been extensively covered in the following bytesize talks: [How nf-core configs work (nf-core/bytesize #2)](https://www.youtube.com/watch?v=cXBYusdjrc0&list=PL3xpfTVZLcNiF4hgkW0yXeNzr0d35qlIB&index=8&pp=gAQBiAQB), [Making a new institutional config profile (nf-core/bytesize #10)](https://www.youtube.com/watch?v=Ym1s6sKGzkw&list=PL3xpfTVZLcNiF4hgkW0yXeNzr0d35qlIB&index=9&pp=gAQBiAQB), [nf-core/bytesize: Using nf-core configs in custom pipelines](https://www.youtube.com/watch?v=zgcrI_0SUgg&list=PL3xpfTVZLcNiF4hgkW0yXeNzr0d35qlIB&index=40&pp=gAQBiAQB) + +7. **docs/** + + This directory contains additional information to the README file. The most important files are the `output.md` and the `usage.md` files. `usage.md` should describe what exactly is needed to run the pipeline and `output.md` should describe all outputs that can be expected. Importantly, for nf-core pipelines, the information from these two files will automatically be displayed on the nf-core website page of the pipeline. + +8. **lib/** + + This directory contains groovy functions and classes that are imported into the `main.nf` file to provide additional functionality not native to Nextflow. + +9. **modules/local** + + This is where all your custom non-nf-core modules go. We will cover when and how to make local modules later in the course. + +10. **modules/nf-core** + + All nf-core modules that are installed using the nf-core tooling will automatically show up in this directory. Keep them here, it is important for automatic updates. + +11. **modules.json** + +This file keeps track of modules installed using nf-core tools from the nf-core/modules repository. This file should only be updated using nf-core tools, and never manually. + +12. **nextflow_schema.json** + + This file hosts all the parameters for the pipeline. Any new parameter should be added to this file using the `nf-core schema build` command. Similar to `output.md` and `usage.md`, the contents of `nextflow_schema.json` will get displayed on the pipeline page of nf-core pipelines. + +13. **pyproject.toml** +14. **subworkflows/local** +15. **tower.yml** +16. **hidden directories and files** + + a) _.devcontainer/devcontainer.json_ + + b) _.github/_ + + Files in here are used for Continuous integration tests (CI) with github actions as well as other github related defaults, such as a template for issues. We will not touch on these in the course. + + c) _.gitignore_ + + d) _.editorconfig_ + + e) _.gitpod.yml_ + + This file provides settings to create a Cloud development environment in your browser using Gitpod. It comes installed with the tools necessary to develop and test nf-core pipelines, modules, and subworkflows, allowing you to develop from anywhere without installing anything locally. + + f) _.nf-core.yml_ + + g) _.pre-commit-config.yaml_ + + h) _.prettierignore_ + + i) _.prettierrc.yml_ + +:::tip{title="Exercise 2 - Test your knowledge of the nf-core pipeline structure"} + +1. In which directory can you find the main script of the nf-core module `fastqc` +
+<details>
+<summary>solution 1</summary>
+
+```
+modules/nf-core/fastqc/
+```
+
+</details>
+
+ +2. Which file contains the main workflow of your new pipeline? +
+<details>
+<summary>solution 2</summary>
+
+```
+workflows/demotest.nf
+```
+
+</details>
+
+
+3. `check_samplesheet.py` is a script that can be called by any module of your pipeline. Where is it located?
+
+<details>
+<summary>solution 3</summary>
+
+```
+bin/
+```
+
+This directory can also contain custom scripts that you may wish to call from within a custom module.
+
+</details>
+
+ +[MORE QUESTIONS CAN BE ADDED HERE] +::: From e0a9b5e1502f4d3db0cc62bdad7c00509446909b Mon Sep 17 00:00:00 2001 From: FranBonath Date: Wed, 20 Mar 2024 10:02:26 +0100 Subject: [PATCH 2/3] package-lock.json after npm update --- package-lock.json | 238 +++++++++++++++++++++++++++------------------- 1 file changed, 141 insertions(+), 97 deletions(-) diff --git a/package-lock.json b/package-lock.json index 0c5b9ab9ca..bf2f58af71 100644 --- a/package-lock.json +++ b/package-lock.json @@ -263,9 +263,9 @@ } }, "node_modules/@astrojs/compiler": { - "version": "2.1.0", - "resolved": "https://registry.npmjs.org/@astrojs/compiler/-/compiler-2.1.0.tgz", - "integrity": "sha512-Mp+qrNhly+27bL/Zq8lGeUY+YrdoU0eDfIlAeGIPrzt0PnI/jGpvPUdCaugv4zbCrDkOUScFfcbeEiYumrdJnw==" + "version": "2.7.0", + "resolved": "https://registry.npmjs.org/@astrojs/compiler/-/compiler-2.7.0.tgz", + "integrity": "sha512-XpC8MAaWjD1ff6/IfkRq/5k1EFj6zhCNqXRd5J43SVJEBj/Bsmizkm8N0xOYscGcDFQkRgEw6/eKnI5x/1l6aA==" }, "node_modules/@astrojs/internal-helpers": { "version": "0.2.1", @@ -403,12 +403,12 @@ } }, "node_modules/@astrojs/sitemap": { - "version": "3.0.1", - "resolved": "https://registry.npmjs.org/@astrojs/sitemap/-/sitemap-3.0.1.tgz", - "integrity": "sha512-ErCthOQF0Yt6KkvaS+v/7CM6TxztOPHJjla4cbM3fBAFsDQbCS3tvoWSMqPJtmTFiy7veQ1eC5gH58FhPe85zg==", + "version": "3.1.1", + "resolved": "https://registry.npmjs.org/@astrojs/sitemap/-/sitemap-3.1.1.tgz", + "integrity": "sha512-qPgdBIcDUaea98mTtLfi5z9oXZpzSjEn/kes70/Ex8FOZZ+DIHVKRYOLOtvy8p+FTXr/9oc7BjmIbTYmYLLJVg==", "dependencies": { "sitemap": "^7.1.1", - "zod": "3.21.1" + "zod": "^3.22.4" } }, "node_modules/@astrojs/svelte": { @@ -428,9 +428,9 @@ } }, "node_modules/@astrojs/telemetry": { - "version": "3.0.3", - "resolved": "https://registry.npmjs.org/@astrojs/telemetry/-/telemetry-3.0.3.tgz", - "integrity": "sha512-j19Cf5mfyLt9hxgJ9W/FMdAA5Lovfp7/CINNB/7V71GqvygnL7KXhRC3TzfB+PsVQcBtgWZzCXhUWRbmJ64Raw==", + "version": "3.0.4", + "resolved": "https://registry.npmjs.org/@astrojs/telemetry/-/telemetry-3.0.4.tgz", + "integrity": "sha512-A+0c7k/Xy293xx6odsYZuXiaHO0PL+bnDoXOc47sGDF5ffIKdKQGRPFl2NMlCF4L0NqN4Ynbgnaip+pPF0s7pQ==", "dependencies": { "ci-info": "^3.8.0", "debug": "^4.3.4", @@ -2160,9 +2160,9 @@ } }, "node_modules/@octokit/app": { - "version": "14.0.1", - "resolved": "https://registry.npmjs.org/@octokit/app/-/app-14.0.1.tgz", - "integrity": "sha512-4opdXcWBVhzd6FOxlaxDKXXqi9Vz2hsDSWQGNo49HbYFAX11UqMpksMjEdfvHy0x19Pse8Nvn+R6inNb/V398w==", + "version": "14.0.2", + "resolved": "https://registry.npmjs.org/@octokit/app/-/app-14.0.2.tgz", + "integrity": "sha512-NCSCktSx+XmjuSUVn2dLfqQ9WIYePGP95SDJs4I9cn/0ZkeXcPkaoCLl64Us3dRKL2ozC7hArwze5Eu+/qt1tg==", "dependencies": { "@octokit/auth-app": "^6.0.0", "@octokit/auth-unauthenticated": "^5.0.0", @@ -2170,16 +2170,16 @@ "@octokit/oauth-app": "^6.0.0", "@octokit/plugin-paginate-rest": "^9.0.0", "@octokit/types": "^12.0.0", - "@octokit/webhooks": "^12.0.1" + "@octokit/webhooks": "^12.0.4" }, "engines": { "node": ">= 18" } }, "node_modules/@octokit/auth-app": { - "version": "6.0.1", - "resolved": "https://registry.npmjs.org/@octokit/auth-app/-/auth-app-6.0.1.tgz", - "integrity": "sha512-tjCD4nzQNZgmLH62+PSnTF6eGerisFgV4v6euhqJik6yWV96e1ZiiGj+NXIqbgnpjLmtnBqVUrNyGKu3DoGEGA==", + "version": "6.0.4", + "resolved": "https://registry.npmjs.org/@octokit/auth-app/-/auth-app-6.0.4.tgz", + "integrity": "sha512-TPmJYgd05ok3nzHj7Y6we/V7Ez1wU3ztLFW3zo/afgYFtqYZg0W7zb6Kp5ag6E85r8nCE1JfS6YZoZusa14o9g==", "dependencies": { "@octokit/auth-oauth-app": "^7.0.0", 
"@octokit/auth-oauth-user": "^4.0.0", @@ -2188,7 +2188,7 @@ "@octokit/types": "^12.0.0", "deprecation": "^2.3.1", "lru-cache": "^10.0.0", - "universal-github-app-jwt": "^1.1.1", + "universal-github-app-jwt": "^1.1.2", "universal-user-agent": "^6.0.0" }, "engines": { @@ -2196,9 +2196,9 @@ } }, "node_modules/@octokit/auth-app/node_modules/lru-cache": { - "version": "10.0.1", - "resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-10.0.1.tgz", - "integrity": "sha512-IJ4uwUTi2qCccrioU6g9g/5rvvVl13bsdczUUcqbciD9iLr095yj8DQKdObriEvuNSx325N1rV1O0sJFszx75g==", + "version": "10.2.0", + "resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-10.2.0.tgz", + "integrity": "sha512-2bIM8x+VAf6JT4bKAljS1qUWgMsqZRPGJS6FSahIMPVvctcNhyVp7AJu7quxOW9jwkryBReKZY5tY5JYv2n/7Q==", "engines": { "node": "14 || >=16.14" } @@ -2314,9 +2314,9 @@ } }, "node_modules/@octokit/oauth-app": { - "version": "6.0.0", - "resolved": "https://registry.npmjs.org/@octokit/oauth-app/-/oauth-app-6.0.0.tgz", - "integrity": "sha512-bNMkS+vJ6oz2hCyraT9ZfTpAQ8dZNqJJQVNaKjPLx4ue5RZiFdU1YWXguOPR8AaSHS+lKe+lR3abn2siGd+zow==", + "version": "6.1.0", + "resolved": "https://registry.npmjs.org/@octokit/oauth-app/-/oauth-app-6.1.0.tgz", + "integrity": "sha512-nIn/8eUJ/BKUVzxUXd5vpzl1rwaVxMyYbQkNZjHrF7Vk/yu98/YDF/N2KeWO7uZ0g3b5EyiFXFkZI8rJ+DH1/g==", "dependencies": { "@octokit/auth-oauth-app": "^7.0.0", "@octokit/auth-oauth-user": "^4.0.0", @@ -2340,37 +2340,24 @@ } }, "node_modules/@octokit/oauth-methods": { - "version": "4.0.0", - "resolved": "https://registry.npmjs.org/@octokit/oauth-methods/-/oauth-methods-4.0.0.tgz", - "integrity": "sha512-dqy7BZLfLbi3/8X8xPKUKZclMEK9vN3fK5WF3ortRvtplQTszFvdAGbTo71gGLO+4ZxspNiLjnqdd64Chklf7w==", + "version": "4.0.1", + "resolved": "https://registry.npmjs.org/@octokit/oauth-methods/-/oauth-methods-4.0.1.tgz", + "integrity": "sha512-1NdTGCoBHyD6J0n2WGXg9+yDLZrRNZ0moTEex/LSPr49m530WNKcCfXDghofYptr3st3eTii+EHoG5k/o+vbtw==", "dependencies": { "@octokit/oauth-authorization-url": "^6.0.2", "@octokit/request": "^8.0.2", "@octokit/request-error": "^5.0.0", - "@octokit/types": "^11.0.0", + "@octokit/types": "^12.0.0", "btoa-lite": "^1.0.0" }, "engines": { "node": ">= 18" } }, - "node_modules/@octokit/oauth-methods/node_modules/@octokit/openapi-types": { - "version": "18.1.1", - "resolved": "https://registry.npmjs.org/@octokit/openapi-types/-/openapi-types-18.1.1.tgz", - "integrity": "sha512-VRaeH8nCDtF5aXWnjPuEMIYf1itK/s3JYyJcWFJT8X9pSNnBtriDf7wlEWsGuhPLl4QIH4xM8fqTXDwJ3Mu6sw==" - }, - "node_modules/@octokit/oauth-methods/node_modules/@octokit/types": { - "version": "11.1.0", - "resolved": "https://registry.npmjs.org/@octokit/types/-/types-11.1.0.tgz", - "integrity": "sha512-Fz0+7GyLm/bHt8fwEqgvRBWwIV1S6wRRyq+V6exRKLVWaKGsuy6H9QFYeBVDV7rK6fO3XwHgQOPxv+cLj2zpXQ==", - "dependencies": { - "@octokit/openapi-types": "^18.0.0" - } - }, "node_modules/@octokit/openapi-types": { - "version": "19.0.0", - "resolved": "https://registry.npmjs.org/@octokit/openapi-types/-/openapi-types-19.0.0.tgz", - "integrity": "sha512-PclQ6JGMTE9iUStpzMkwLCISFn/wDeRjkZFIKALpvJQNBGwDoYYi2fFvuHwssoQ1rXI5mfh6jgTgWuddeUzfWw==" + "version": "20.0.0", + "resolved": "https://registry.npmjs.org/@octokit/openapi-types/-/openapi-types-20.0.0.tgz", + "integrity": "sha512-EtqRBEjp1dL/15V7WiX5LJMIxxkdiGJnabzYx5Apx4FkQIFgAfKumXeYAqqJCj1s+BMX4cPFIFC4OLCR6stlnA==" }, "node_modules/@octokit/plugin-paginate-graphql": { "version": "4.0.0", @@ -2384,17 +2371,17 @@ } }, "node_modules/@octokit/plugin-paginate-rest": { - "version": "9.0.0", - "resolved": 
"https://registry.npmjs.org/@octokit/plugin-paginate-rest/-/plugin-paginate-rest-9.0.0.tgz", - "integrity": "sha512-oIJzCpttmBTlEhBmRvb+b9rlnGpmFgDtZ0bB6nq39qIod6A5DP+7RkVLMOixIgRCYSHDTeayWqmiJ2SZ6xgfdw==", + "version": "9.2.1", + "resolved": "https://registry.npmjs.org/@octokit/plugin-paginate-rest/-/plugin-paginate-rest-9.2.1.tgz", + "integrity": "sha512-wfGhE/TAkXZRLjksFXuDZdmGnJQHvtU/joFQdweXUgzo1XwvBCD4o4+75NtFfjfLK5IwLf9vHTfSiU3sLRYpRw==", "dependencies": { - "@octokit/types": "^12.0.0" + "@octokit/types": "^12.6.0" }, "engines": { "node": ">= 18" }, "peerDependencies": { - "@octokit/core": ">=5" + "@octokit/core": "5" } }, "node_modules/@octokit/plugin-rest-endpoint-methods": { @@ -2471,21 +2458,21 @@ } }, "node_modules/@octokit/types": { - "version": "12.0.0", - "resolved": "https://registry.npmjs.org/@octokit/types/-/types-12.0.0.tgz", - "integrity": "sha512-EzD434aHTFifGudYAygnFlS1Tl6KhbTynEWELQXIbTY8Msvb5nEqTZIm7sbPEt4mQYLZwu3zPKVdeIrw0g7ovg==", + "version": "12.6.0", + "resolved": "https://registry.npmjs.org/@octokit/types/-/types-12.6.0.tgz", + "integrity": "sha512-1rhSOfRa6H9w4YwK0yrf5faDaDTb+yLyBUKOCV4xtCDB5VmIPqd/v9yr9o6SAzOAlRxMiRiCic6JVM1/kunVkw==", "dependencies": { - "@octokit/openapi-types": "^19.0.0" + "@octokit/openapi-types": "^20.0.0" } }, "node_modules/@octokit/webhooks": { - "version": "12.0.3", - "resolved": "https://registry.npmjs.org/@octokit/webhooks/-/webhooks-12.0.3.tgz", - "integrity": "sha512-8iG+/yza7hwz1RrQ7i7uGpK2/tuItZxZq1aTmeg2TNp2xTUB8F8lZF/FcZvyyAxT8tpDMF74TjFGCDACkf1kAQ==", + "version": "12.1.2", + "resolved": "https://registry.npmjs.org/@octokit/webhooks/-/webhooks-12.1.2.tgz", + "integrity": "sha512-+nGS3ReCByF6m+nbNB59x7Aa3CNjCCGuBLFzfkiJP1O3uVKKuJbkP4uO4t46YqH26nlugmOhqjT7nx5D0VPtdA==", "dependencies": { "@octokit/request-error": "^5.0.0", - "@octokit/webhooks-methods": "^4.0.0", - "@octokit/webhooks-types": "7.1.0", + "@octokit/webhooks-methods": "^4.1.0", + "@octokit/webhooks-types": "7.3.2", "aggregate-error": "^3.1.0" }, "engines": { @@ -2493,17 +2480,17 @@ } }, "node_modules/@octokit/webhooks-methods": { - "version": "4.0.0", - "resolved": "https://registry.npmjs.org/@octokit/webhooks-methods/-/webhooks-methods-4.0.0.tgz", - "integrity": "sha512-M8mwmTXp+VeolOS/kfRvsDdW+IO0qJ8kYodM/sAysk093q6ApgmBXwK1ZlUvAwXVrp/YVHp6aArj4auAxUAOFw==", + "version": "4.1.0", + "resolved": "https://registry.npmjs.org/@octokit/webhooks-methods/-/webhooks-methods-4.1.0.tgz", + "integrity": "sha512-zoQyKw8h9STNPqtm28UGOYFE7O6D4Il8VJwhAtMHFt2C4L0VQT1qGKLeefUOqHNs1mNRYSadVv7x0z8U2yyeWQ==", "engines": { "node": ">= 18" } }, "node_modules/@octokit/webhooks-types": { - "version": "7.1.0", - "resolved": "https://registry.npmjs.org/@octokit/webhooks-types/-/webhooks-types-7.1.0.tgz", - "integrity": "sha512-y92CpG4kFFtBBjni8LHoV12IegJ+KFxLgKRengrVjKmGE5XMeCuGvlfRe75lTRrgXaG6XIWJlFpIDTlkoJsU8w==" + "version": "7.3.2", + "resolved": "https://registry.npmjs.org/@octokit/webhooks-types/-/webhooks-types-7.3.2.tgz", + "integrity": "sha512-JWOoOgtWTFnTSAamPXXyjTY5/apttvNxF+vPBnwdSu5cj5snrd7FO0fyw4+wTXy8fHduq626JjhO+TwCyyA6vA==" }, "node_modules/@playwright/test": { "version": "1.38.1", @@ -3271,9 +3258,9 @@ } }, "node_modules/@types/aws-lambda": { - "version": "8.10.123", - "resolved": "https://registry.npmjs.org/@types/aws-lambda/-/aws-lambda-8.10.123.tgz", - "integrity": "sha512-YBxxrA5fZzt1UFslJFBXH4AJUjkxu5ePv9RPjrpq5Tif5Xi2YhG3P8RAJfXXWL5/M6wPrlzlAMqaMM5ysci/EQ==" + "version": "8.10.136", + "resolved": 
"https://registry.npmjs.org/@types/aws-lambda/-/aws-lambda-8.10.136.tgz", + "integrity": "sha512-cmmgqxdVGhxYK9lZMYYXYRJk6twBo53ivtXjIUEFZxfxe4TkZTZBK3RRWrY2HjJcUIix0mdifn15yjOAat5lTA==" }, "node_modules/@types/babel__core": { "version": "7.20.2", @@ -3313,9 +3300,9 @@ } }, "node_modules/@types/btoa-lite": { - "version": "1.0.0", - "resolved": "https://registry.npmjs.org/@types/btoa-lite/-/btoa-lite-1.0.0.tgz", - "integrity": "sha512-wJsiX1tosQ+J5+bY5LrSahHxr2wT+uME5UDwdN1kg4frt40euqA+wzECkmq4t5QbveHiJepfdThgQrPw6KiSlg==" + "version": "1.0.2", + "resolved": "https://registry.npmjs.org/@types/btoa-lite/-/btoa-lite-1.0.2.tgz", + "integrity": "sha512-ZYbcE2x7yrvNFJiU7xJGrpF/ihpkM7zKgw8bha3LNJSesvTtUNxbpzaT7WXBIryf6jovisrxTBvymxMeLLj1Mg==" }, "node_modules/@types/d3-scale": { "version": "4.0.5", @@ -3388,9 +3375,9 @@ } }, "node_modules/@types/jsonwebtoken": { - "version": "9.0.3", - "resolved": "https://registry.npmjs.org/@types/jsonwebtoken/-/jsonwebtoken-9.0.3.tgz", - "integrity": "sha512-b0jGiOgHtZ2jqdPgPnP6WLCXZk1T8p06A/vPGzUvxpFGgKMbjXJDjC5m52ErqBnIuWZFgGoIJyRdeG5AyreJjA==", + "version": "9.0.6", + "resolved": "https://registry.npmjs.org/@types/jsonwebtoken/-/jsonwebtoken-9.0.6.tgz", + "integrity": "sha512-/5hndP5dCjloafCXns6SZyESp3Ldq7YjH3zwzwczYnjxIT0Fqzk5ROSYVGfFyczIue7IUEj8hkvLbPoLQ18vQw==", "dependencies": { "@types/node": "*" } @@ -3643,14 +3630,14 @@ } }, "node_modules/astro": { - "version": "3.3.1", - "resolved": "https://registry.npmjs.org/astro/-/astro-3.3.1.tgz", - "integrity": "sha512-PzXJN8IIjQrxf+D39utnBtxscdqizwslOKoXXGtVFCA66IPh37HzGts0VLtB/kSf8ouTojk5JBQWbevDiW53VQ==", + "version": "3.6.5", + "resolved": "https://registry.npmjs.org/astro/-/astro-3.6.5.tgz", + "integrity": "sha512-fyVubQfb6+Bc2/XigXNeJ06HHwNFB9EnAQohJcrja6RB1PhIcZRusyIamywHWjoJ4a1T0HxMkzBLNW96DBqpNw==", "dependencies": { - "@astrojs/compiler": "^2.1.0", + "@astrojs/compiler": "^2.3.0", "@astrojs/internal-helpers": "0.2.1", - "@astrojs/markdown-remark": "3.3.0", - "@astrojs/telemetry": "3.0.3", + "@astrojs/markdown-remark": "3.5.0", + "@astrojs/telemetry": "3.0.4", "@babel/core": "^7.22.10", "@babel/generator": "^7.22.10", "@babel/parser": "^7.22.10", @@ -3681,9 +3668,11 @@ "js-yaml": "^4.1.0", "kleur": "^4.1.4", "magic-string": "^0.30.3", + "mdast-util-to-hast": "12.3.0", "mime": "^3.0.0", "ora": "^7.0.1", "p-limit": "^4.0.0", + "p-queue": "^7.4.1", "path-to-regexp": "^6.2.1", "preferred-pm": "^3.1.2", "probe-image-size": "^7.2.3", @@ -3702,7 +3691,7 @@ "vitefu": "^0.2.4", "which-pm": "^2.1.1", "yargs-parser": "^21.1.1", - "zod": "3.21.1" + "zod": "^3.22.4" }, "bin": { "astro": "astro.js" @@ -3725,6 +3714,30 @@ "svgo": "^2.8.0" } }, + "node_modules/astro/node_modules/@astrojs/markdown-remark": { + "version": "3.5.0", + "resolved": "https://registry.npmjs.org/@astrojs/markdown-remark/-/markdown-remark-3.5.0.tgz", + "integrity": "sha512-q7vdIqzYhxpsfghg2YmkmSXCfp4w7lBTYP+SSHw89wVhC5Riltr3u8w2otBRxNLSByNi+ht/gGkFC23Shetytw==", + "dependencies": { + "@astrojs/prism": "^3.0.0", + "github-slugger": "^2.0.0", + "import-meta-resolve": "^3.0.0", + "mdast-util-definitions": "^6.0.0", + "rehype-raw": "^6.1.1", + "rehype-stringify": "^9.0.4", + "remark-gfm": "^3.0.1", + "remark-parse": "^10.0.2", + "remark-rehype": "^10.1.0", + "remark-smartypants": "^2.0.0", + "shikiji": "^0.6.8", + "unified": "^10.1.2", + "unist-util-visit": "^4.1.2", + "vfile": "^5.3.7" + }, + "peerDependencies": { + "astro": "^3.0.0" + } + }, "node_modules/astro/node_modules/unist-util-visit": { "version": "4.1.2", "resolved": 
"https://registry.npmjs.org/unist-util-visit/-/unist-util-visit-4.1.2.tgz", @@ -5186,9 +5199,9 @@ } }, "node_modules/dset": { - "version": "3.1.2", - "resolved": "https://registry.npmjs.org/dset/-/dset-3.1.2.tgz", - "integrity": "sha512-g/M9sqy3oHe477Ar4voQxWtaPIFw1jTdKZuomOjhCcBx9nHUNn0pu6NopuFFrTh/TRZIKEj+76vLWFu9BNKk+Q==", + "version": "3.1.3", + "resolved": "https://registry.npmjs.org/dset/-/dset-3.1.3.tgz", + "integrity": "sha512-20TuZZHCEZ2O71q9/+8BwKwZ0QtD9D8ObhrihJPr+vLLYlSuAU3/zL4cSlgbfeoGHTjCSJBa7NGcrF9/Bx/WJQ==", "engines": { "node": ">=4" } @@ -5395,6 +5408,11 @@ "@types/estree": "^1.0.0" } }, + "node_modules/eventemitter3": { + "version": "5.0.1", + "resolved": "https://registry.npmjs.org/eventemitter3/-/eventemitter3-5.0.1.tgz", + "integrity": "sha512-GWkBvjiSZK87ELrYOSESUYeVIc9mvLLf/nXalMOS5dYrgZq9o5OVkbZAVM06CVxYsCwH9BDZFPlQTlPA1j4ahA==" + }, "node_modules/execa": { "version": "8.0.1", "resolved": "https://registry.npmjs.org/execa/-/execa-8.0.1.tgz", @@ -8544,11 +8562,11 @@ } }, "node_modules/octokit": { - "version": "3.1.1", - "resolved": "https://registry.npmjs.org/octokit/-/octokit-3.1.1.tgz", - "integrity": "sha512-AKJs5XYs7iAh7bskkYpxhUIpsYZdLqjnlnqrN5s9FFZuJ/a6ATUHivGpUKDpGB/xa+LGDtG9Lu8bOCfPM84vHQ==", + "version": "3.1.2", + "resolved": "https://registry.npmjs.org/octokit/-/octokit-3.1.2.tgz", + "integrity": "sha512-MG5qmrTL5y8KYwFgE1A4JWmgfQBaIETE/lOlfwNYx1QOtCQHGVxkRJmdUJltFc1HVn73d61TlMhMyNTOtMl+ng==", "dependencies": { - "@octokit/app": "^14.0.0", + "@octokit/app": "^14.0.2", "@octokit/core": "^5.0.0", "@octokit/oauth-app": "^6.0.0", "@octokit/plugin-paginate-graphql": "^4.0.0", @@ -8671,6 +8689,32 @@ "url": "https://github.com/sponsors/sindresorhus" } }, + "node_modules/p-queue": { + "version": "7.4.1", + "resolved": "https://registry.npmjs.org/p-queue/-/p-queue-7.4.1.tgz", + "integrity": "sha512-vRpMXmIkYF2/1hLBKisKeVYJZ8S2tZ0zEAmIJgdVKP2nq0nh4qCdf8bgw+ZgKrkh71AOCaqzwbJJk1WtdcF3VA==", + "dependencies": { + "eventemitter3": "^5.0.1", + "p-timeout": "^5.0.2" + }, + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, + "node_modules/p-timeout": { + "version": "5.1.0", + "resolved": "https://registry.npmjs.org/p-timeout/-/p-timeout-5.1.0.tgz", + "integrity": "sha512-auFDyzzzGZZZdHz3BtET9VEz0SE/uMEAx7uWfGPucfzEwwe/xH0iVeZibQmANYE/hp9T2+UUZT5m+BKyrDp3Ew==", + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/sponsors/sindresorhus" + } + }, "node_modules/p-try": { "version": "2.2.0", "resolved": "https://registry.npmjs.org/p-try/-/p-try-2.2.0.tgz", @@ -12642,12 +12686,12 @@ } }, "node_modules/universal-github-app-jwt": { - "version": "1.1.1", - "resolved": "https://registry.npmjs.org/universal-github-app-jwt/-/universal-github-app-jwt-1.1.1.tgz", - "integrity": "sha512-G33RTLrIBMFmlDV4u4CBF7dh71eWwykck4XgaxaIVeZKOYZRAAxvcGMRFTUclVY6xoUPQvO4Ne5wKGxYm/Yy9w==", + "version": "1.1.2", + "resolved": "https://registry.npmjs.org/universal-github-app-jwt/-/universal-github-app-jwt-1.1.2.tgz", + "integrity": "sha512-t1iB2FmLFE+yyJY9+3wMx0ejB+MQpEVkH0gQv7dR6FZyltyq+ZZO0uDpbopxhrZ3SLEO4dCEkIujOMldEQ2iOA==", "dependencies": { "@types/jsonwebtoken": "^9.0.0", - "jsonwebtoken": "^9.0.0" + "jsonwebtoken": "^9.0.2" } }, "node_modules/universal-user-agent": { @@ -12781,9 +12825,9 @@ } }, "node_modules/vite": { - "version": "4.4.11", - "resolved": "https://registry.npmjs.org/vite/-/vite-4.4.11.tgz", - "integrity": 
"sha512-ksNZJlkcU9b0lBwAGZGGaZHCMqHsc8OpgtoYhsQ4/I2v5cnpmmmqe5pM4nv/4Hn6G/2GhTdj0DhZh2e+Er1q5A==", + "version": "4.5.2", + "resolved": "https://registry.npmjs.org/vite/-/vite-4.5.2.tgz", + "integrity": "sha512-tBCZBNSBbHQkaGyhGCDUGqeo2ph8Fstyp6FMSvTtsXeZSPpSMGlviAOav2hxVTqFcx8Hj/twtWKsMJXNY0xI8w==", "dependencies": { "esbuild": "^0.18.10", "postcss": "^8.4.27", @@ -13454,9 +13498,9 @@ } }, "node_modules/zod": { - "version": "3.21.1", - "resolved": "https://registry.npmjs.org/zod/-/zod-3.21.1.tgz", - "integrity": "sha512-+dTu2m6gmCbO9Ahm4ZBDapx2O6ZY9QSPXst2WXjcznPMwf2YNpn3RevLx4KkZp1OPW/ouFcoBtBzFz/LeY69oA==", + "version": "3.22.4", + "resolved": "https://registry.npmjs.org/zod/-/zod-3.22.4.tgz", + "integrity": "sha512-iC+8Io04lddc+mVqQ9AZ7OQ2MrUKGN+oIQyq1vemgt46jwCwLfhq7/pwnBnNXXXZb8VTVLKwp9EDkx+ryxIWmg==", "funding": { "url": "https://github.com/sponsors/colinhacks" } From d6c4b12d8e4021535c8e1e84b92ed7bdf9025533 Mon Sep 17 00:00:00 2001 From: FranBonath Date: Wed, 20 Mar 2024 11:09:53 +0100 Subject: [PATCH 3/3] restructure tools training WIP II --- .../contributing/nf_core_basic_training.md | 1490 ----------------- .../add_custom_module.md | 303 ++++ .../add_nf_core_module.md | 454 +++++ .../gitpod_environment.md | 14 +- .../nf_core_basic_training/index.md | 31 +- .../nf_core_basic_training/linting_modules.md | 14 + .../nf_core_create_tool.md | 788 +-------- .../nf_core_basic_training/nf_schema.md | 64 + .../template_walk_through.md | 2 +- 9 files changed, 865 insertions(+), 2295 deletions(-) delete mode 100644 src/content/docs/contributing/nf_core_basic_training.md create mode 100644 src/content/docs/contributing/nf_core_basic_training/add_custom_module.md create mode 100644 src/content/docs/contributing/nf_core_basic_training/add_nf_core_module.md create mode 100644 src/content/docs/contributing/nf_core_basic_training/linting_modules.md create mode 100644 src/content/docs/contributing/nf_core_basic_training/nf_schema.md diff --git a/src/content/docs/contributing/nf_core_basic_training.md b/src/content/docs/contributing/nf_core_basic_training.md deleted file mode 100644 index c9a01dd2c0..0000000000 --- a/src/content/docs/contributing/nf_core_basic_training.md +++ /dev/null @@ -1,1490 +0,0 @@ ---- -title: Basic training to create an nf-core pipeline -subtitle: A guide to create Nextflow pipelines using nf-core tools ---- - -## Scope - -- How do I create a pipeline using nf-core tools? -- How do I incorporate modules from nf-core modules? -- How can I use custom code in my pipeline? - -:::note - -### Learning objectives - -- The learner will create a simple pipeline using the nf-core template. -- The learner will identify key files in the pipeline. -- The learner will lint their pipeline code to identify work to be done. -- The learner will incorporate modules from nf-core/modules into their pipeline. -- The learner will add custom code as a local module into their pipeline. -- The learner will build an nf-core schema to describe and validate pipeline parameters. - -::: - -This training course aims to demonstrate how to build an nf-core pipeline using the nf-core pipeline template and nf-core modules as well as custom, local modules. Be aware that we are not going to explain any fundamental Nextflow concepts, as such we advise anyone taking this course to have completed the [Basic Nextflow Training Workshop](https://training.nextflow.io/). - -```md -During this course we are going to build a Simple RNA-Seq workflow. 

This workflow is by no means meant to be a useful bioinformatics workflow;
it only serves the learning objectives of the course, so please,
**DO NOT use this workflow to analyse RNA sequencing data**!
```

## Overview

### Layout of the pipeline

The course is going to build a (totally unscientific and useless) RNA-seq pipeline that does the following:

1. Indexing of a transcriptome file
2. Quality control
3. Quantification of transcripts
4. [whatever the custom script does]
5. Generation of a MultiQC report

### Outline of the Course

The following sections will be handled in the course:

1. **Setting up the Gitpod environment for the course**

The course uses Gitpod to avoid spending time on downloading and installing tools and data.

2. **Exploring the nf-core tools command**

A very basic walk-through of what can be done with nf-core tools.

3. **Creating a new nf-core pipeline from the nf-core template**

4. **Exploring the nf-core template**

   a) The git repository

   b) Running the pipeline

   c) Linting the pipeline

   d) Walk-through of the template files

5. **Building an nf-core pipeline using the template**

   a) Adding an nf-core module to your pipeline

   b) Adding a local custom module to your pipeline

   c) Working with the Nextflow schema

   d) Linting your modules

## Preparation

### Prerequisites

- Familiarity with Nextflow syntax and configuration.

### Follow the training videos

This training can be followed either based on this documentation alone, or via a training video hosted on YouTube. You can find the video in the YouTube playlist below:

(no such video yet)

### Gitpod

For this tutorial we will use Gitpod, which runs in the learner's web browser. The Gitpod environment contains a preconfigured Nextflow development environment,
which includes a terminal, file editor, file browser, Nextflow, and nf-core tools. To use Gitpod, you will need:

- A GitHub account
- Web browser (Google Chrome, Firefox)
- Internet connection

Click the link and log in using your GitHub account to start the tutorial:


- - Launch GitPod - -


For more information about Gitpod, including how to make your own Gitpod environment, see the Gitpod bytesize talk on YouTube (link to the bytesize talk),
check the [nf-core Gitpod documentation](gitpod/index) or [Gitpod's own documentation](https://www.gitpod.io/docs).

- Expand this section for instructions to explore your Gitpod environment - -#### Explore your Gitpod interface - -You should now see something similar to the following: - -(insert Gitpod welcome image) - -- **The sidebar** allows you to customize your Gitpod environment and perform basic tasks (copy, paste, open files, search, git, etc.). Click the Explorer button to see which files are in this repository. -- **The terminal** allows you to run all the programs in the repository. For example, both `nextflow` and `docker` are installed and can be executed. -- **The main window** allows you to view and edit files. Clicking on a file in the explorer will open it within the main window. You should also see the nf-training material browser (). - -To test that the environment is working correctly, type the following into the terminal: - -```bash -nextflow info -``` - -This should come up with the Nextflow version and runtime information: - -``` -Version: 23.10.0 build 5889 -Created: 15-10-2023 15:07 UTC (15:07 GMT) -System: Linux 6.1.54-060154-generic -Runtime: Groovy 3.0.19 on OpenJDK 64-Bit Server VM 17.0.8-internal+0-adhoc..src -Encoding: UTF-8 (UTF-8) -``` - -#### Reopening a Gitpod session - -When a Gitpod session is not used for a while, i.e., goes idle, it will timeout and close the interface. -You can reopen the environment from . Find your previous environment in the list, then select the ellipsis (three dots icon) and select Open. - -If you have saved the URL for your previous Gitpod environment, you can simply open it in your browser. - -Alternatively, you can start a new workspace by following the Gitpod URL: - -If you have lost your environment, you can find the main scripts used in this tutorial in the `nf-training` directory. - -#### Saving files from Gitpod to your local machine - -To save any file locally from the explorer panel, right-click the file and select Download. - -

## Explore nf-core/tools

The nf-core/tools package is already installed in the Gitpod environment. Now you can check out which pipelines, subworkflows and modules are available via nf-core tools. To see all available commands of nf-core tools, run the following:

```bash
nf-core --help
```

We will touch on most of the commands for developers later throughout this tutorial.

## Create a pipeline from template

To get started with your new pipeline, run the create command:

```bash
nf-core create
```

This should open a command prompt similar to this:

```

                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~\
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'

    nf-core/tools version 2.10 - https://nf-co.re


? Workflow name demotest
? Description test pipeline for demo
? Author FBo
? Do you want to customize which parts of the template are used? No
INFO     Creating new nf-core pipeline: 'nf-core/demotest'
INFO     Initialising pipeline git repository
INFO     Done. Remember to add a remote and push to GitHub:
          cd /workspace/basic_training/nf-core-demotest
          git remote add origin git@github.com:USERNAME/REPO_NAME.git
          git push --all origin
INFO     This will also push your newly created dev branch and the TEMPLATE branch for syncing.
INFO     !!!!!! IMPORTANT !!!!!!
         If you are interested in adding your pipeline to the nf-core community,
         PLEASE COME AND TALK TO US IN THE NF-CORE SLACK BEFORE WRITING ANY CODE!

         Please read: https://nf-co.re/developers/adding_pipelines#join-the-community
```

Although you can provide options on the command line, it's easiest to use the interactive prompts. For now we are assuming that we want to create a new nf-core pipeline, so we chose not to customize the template.
It is possible to use nf-core tools for non-nf-core pipelines, but the setup of such pipelines will be handled in a later chapter # ARE WE GOING TO DO THIS?

### Pipeline git repo

The nf-core create command has made a fully-fledged pipeline for you. Before getting too carried away looking at all of the files, note that it has also initiated a git repository:

```bash
cd nf-core-demotest
git status
```

```
On branch master
nothing to commit, working tree clean
```

It has actually created three branches for you:

```bash
git branch
```

```
  TEMPLATE
  dev
* master
```

Each has the same initial commit, with the vanilla template:

```bash
git log
```

```
commit 77e77783aab19e47ce6b2d736d766fbef2483de8 (HEAD -> master, dev, TEMPLATE)
Author: Franziska Bonath
Date:   Tue Oct 31 15:21:33 2023 +0000

    initial template build from nf-core/tools, version 2.10
```

This is important, because this shared git history with the unmodified nf-core template in the TEMPLATE branch is how the nf-core automated template synchronisation works (see the docs for more details).

The main thing to remember with this is:

Never make changes to the TEMPLATE branch, otherwise it will interfere with the synchronisation of nf-core updates.

Ideally, code should be developed on feature branches (i.e. a new branch made with `git checkout -b my_new_feature`), and when ready merged into the `dev` branch upon a successful code review. The `dev` branch is then merged into the `master` branch when a stable release of the workflow is ready to be made.

When creating a new repository on GitHub, create it as an empty repository without a README or any other file. Then push the repo with the template of your new pipeline from your local clone.
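To make the branch flow concrete, a typical development cycle could look like the following sketch; the branch name and commit message are illustrative, not part of the template:

```bash
# start from the development branch
git checkout dev

# create a feature branch for your change
git checkout -b my_new_feature

# ...edit files, then record the change...
git add .
git commit -m "Add my new feature"

# after a successful code review, merge the feature branch back into dev
git checkout dev
git merge my_new_feature
```

Keeping `master` for releases only and doing day-to-day work on feature branches is what keeps the TEMPLATE branch untouched and the automated sync working.
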
- -:::tip{title="Exercise 1 - Getting around the git environment"} - -1. Create and switch to a new git branch called `demo`. -
- solution 1 - - ```bash - git checkout -b demo - ``` - -
- -2. Display all available git branches. -
- solution 2 - - ```bash - git branch - ``` - -
- -3. Create a directory within the new pipeline directory called `results` and add it to the `.gitignore` file. -
- solution 3

  ```bash
  mkdir results
  ```

  ```groovy title=".gitignore"
  .nextflow*
  work/
  data/
  results/
  .DS_Store
  testing/
  testing*
  *.pyc
  ```

  Note that the nf-core template `.gitignore` already lists `results/`; if your results directory has a different name, add that name on its own line instead.

- -4. Commit the changes you have made. -
- solution 4 - - ```bash - git add . - git commit -m "creating results dir and adding it to gitignore" - ``` - -
- ::: - -### Run the new pipeline - -The new pipeline should run with Nextflow, right out of the box. Let’s try: - -```bash -cd ../ -nextflow run nf-core-demotest/ -profile test,docker --outdir test_results -``` - -This basic template pipeline contains already the FastQC and MultiQC modules, which do run on a selection of test data. - -### Template code walk through - -Now let us have a look at the files that were generated within the `nf-core-demotest` directory when we created this pipeline. You can see all files and directories either on the left hand side in the Explorer, or by running the command: - -```bash -cd nf-core-demotest -tree -``` - -``` -. -├── assets -│ ├── adaptivecard.json -│ ├── email_template.html -│ ├── email_template.txt -│ ├── methods_description_template.yml -│ ├── multiqc_config.yml -│ ├── nf-core-demotest_logo_light.png -│ ├── samplesheet.csv -│ ├── schema_input.json -│ ├── sendmail_template.txt -│ └── slackreport.json -├── bin -│ └── check_samplesheet.py -├── CHANGELOG.md -├── CITATIONS.md -├── CODE_OF_CONDUCT.md -├── conf -│ ├── base.config -│ ├── igenomes.config -│ ├── modules.config -│ ├── test.config -│ └── test_full.config -├── docs -│ ├── images -│ │ ├── mqc_fastqc_adapter.png -│ │ ├── mqc_fastqc_counts.png -│ │ ├── mqc_fastqc_quality.png -│ │ ├── nf-core-demotest_logo_dark.png -│ │ └── nf-core-demotest_logo_light.png -│ ├── output.md -│ ├── README.md -│ └── usage.md -├── lib -│ ├── nfcore_external_java_deps.jar -│ ├── NfcoreTemplate.groovy -│ ├── Utils.groovy -│ ├── WorkflowDemotest.groovy -│ └── WorkflowMain.groovy -├── LICENSE -├── main.nf -├── modules -│ ├── local -│ │ └── samplesheet_check.nf -│ └── nf-core -│ ├── custom -│ │ └── dumpsoftwareversions -│ │ ├── main.nf -│ │ ├── meta.yml -│ │ └── templates -│ │ └── dumpsoftwareversions.py -│ ├── fastqc -│ │ ├── main.nf -│ │ └── meta.yml -│ └── multiqc -│ ├── main.nf -│ └── meta.yml -├── modules.json -├── nextflow.config -├── nextflow_schema.json -├── pyproject.toml -├── README.md -├── subworkflows -│ └── local -│ └── input_check.nf -├── tower.yml -└── workflows - └── demotest.nf -``` - -These are the files in detail: - -1. **main.nf** - - This file contains the main nextflow pipeline code. Mostly this file is not touched. - -2. **workflows/demotest.nf** - - This file is where the pipeline is going to be assembled. It connects the different modules and subworkflows. - -3. **CHANGELOG.md, CODE_OF_CONDUCT.md, LICENSE, README.md, CITATIONS.md** - - These are standard files created for github repositories. As a default, this pipeline will be under an MIT licence. The CODE_OF_CONDUCT is specific for the nf-core community. - -4. **assets/** - - This directory contains different templates such as email templates or the MultiQC config. In this course contents of this directory can be largely ignored. - -5. **bin/** - - The `bin` directory contains custom executable scripts, and is automatically added to the `PATH` by Nextflow allowing these scripts to become findable by Nextflow. As such, they can be called by name without using their absolute or relative path. The python script `check_samplesheet.py` is part of the nf-core template, since typically, nf-core pipelines require a samplesheet as one of their inputs. - -6. **conf/** and **nextflow.config** - - The `nextflow.config` file is the main config file. In addition to supplying default parameters, it imports all the configurations in `conf/`. Importantly, `conf/` contains the `test.config` file, which is used for pipeline testing. 
In this course we are not going to touch config files, but they have been extensively covered in the following bytesize talks: [How nf-core configs work (nf-core/bytesize #2)](https://www.youtube.com/watch?v=cXBYusdjrc0&list=PL3xpfTVZLcNiF4hgkW0yXeNzr0d35qlIB&index=8&pp=gAQBiAQB), [Making a new institutional config profile (nf-core/bytesize #10)](https://www.youtube.com/watch?v=Ym1s6sKGzkw&list=PL3xpfTVZLcNiF4hgkW0yXeNzr0d35qlIB&index=9&pp=gAQBiAQB), [nf-core/bytesize: Using nf-core configs in custom pipelines](https://www.youtube.com/watch?v=zgcrI_0SUgg&list=PL3xpfTVZLcNiF4hgkW0yXeNzr0d35qlIB&index=40&pp=gAQBiAQB) - -7. **docs/** - - This directory contains additional information to the README file. The most important files are the `output.md` and the `usage.md` files. `usage.md` should describe what exactly is needed to run the pipeline and `output.md` should describe all outputs that can be expected. Importantly, for nf-core pipelines, the information from these two files will automatically be displayed on the nf-core website page of the pipeline. - -8. **lib/** - - This directory contains groovy functions and classes that are imported into the `main.nf` file to provide additional functionality not native to Nextflow. - -9. **modules/local** - - This is where all your custom non-nf-core modules go. We will cover when and how to make local modules later in the course. - -10. **modules/nf-core** - - All nf-core modules that are installed using the nf-core tooling will automatically show up in this directory. Keep them here, it is important for automatic updates. - -11. **modules.json** - -This file keeps track of modules installed using nf-core tools from the nf-core/modules repository. This file should only be updated using nf-core tools, and never manually. - -12. **nextflow_schema.json** - - This file hosts all the parameters for the pipeline. Any new parameter should be added to this file using the `nf-core schema build` command. Similar to `output.md` and `usage.md`, the contents of `nextflow_schema.json` will get displayed on the pipeline page of nf-core pipelines. - -13. **pyproject.toml** -14. **subworkflows/local** -15. **tower.yml** -16. **hidden directories and files** - - a) _.devcontainer/devcontainer.json_ - - b) _.github/_ - - Files in here are used for Continuous integration tests (CI) with github actions as well as other github related defaults, such as a template for issues. We will not touch on these in the course. - - c) _.gitignore_ - - d) _.editorconfig_ - - e) _.gitpod.yml_ - - This file provides settings to create a Cloud development environment in your browser using Gitpod. It comes installed with the tools necessary to develop and test nf-core pipelines, modules, and subworkflows, allowing you to develop from anywhere without installing anything locally. - - f) _.nf-core.yml_ - - g) _.pre-commit-config.yaml_ - - h) _.prettierignore_ - - i) _.prettierrc.yml_ - -:::tip{title="Exercise 2 - Test your knowledge of the nf-core pipeline structure"} - -1. In which directory can you find the main script of the nf-core module `fastqc` -
- solution 1 - - ``` - modules/nf-core/fastqc/ - ``` - -
- -2. Which file contains the main workflow of your new pipeline? -
- solution 2 - - ``` - workflows/demotest.nf - ``` - -
- -3. `check_samplesheet.py` is a script that can be called by any module of your pipeline, where is it located? -
- solution 3

  ```
  bin/
  ```

  This directory can also contain custom scripts that you may wish to call from within a custom module.

- -[MORE QUESTIONS CAN BE ADDED HERE] -::: - -## Customising the template - -In many of the files generated by the nf-core template, you’ll find code comments that look like this: - -``` -// TODO nf-core: Do something here -``` - -These are markers to help you get started with customising the template code as you write your pipeline. Editor tools such as Todo tree help you easily navigate these and work your way through them. - -## Linting your pipeline - -Customising the template is part of writing your new pipeline. However, not all files should be edited - indeed, nf-core strives to promote standardisation amongst pipelines. - -To try to keep pipelines up to date and using the same code where possible, we have an automated code linting tool for nf-core pipelines. Running nf-core lint will run a comprehensive test suite against your pipeline: - -```bash -cd nf-core-demotest/ -nf-core lint -``` - -The first time you run this command it will download some modules and then perform the linting tests. Linting tests can have one of four statuses: pass, ignore, warn or fail. For example, at first you will see a large number of warnings about TODO comments, letting you know that you haven’t finished setting up your new pipeline. Warnings are ok at this stage, but should be cleared up before a pipeline release. - -``` - ,--./,-. - ___ __ __ __ ___ /,-._.--~\ - |\ | |__ __ / ` / \ |__) |__ } { - | \| | \__, \__/ | \ |___ \`-._,-`-, - `._,._,' - - nf-core/tools version 2.10 - https://nf-co.re - - -INFO Testing pipeline: . - -╭─ [!] 24 Pipeline Test Warnings ──────────────────────────────────────────────────────────────────────────────────────────────────╮ -│ │ -│ readme: README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release). │ -│ pipeline_todos: TODO string in README.md: TODO nf-core: │ -│ pipeline_todos: TODO string in README.md: Include a figure that guides the user through the major workflow steps. Many nf-core │ -│ pipeline_todos: TODO string in README.md: Fill in short bullet-pointed list of the default steps in the pipeline │ -│ pipeline_todos: TODO string in README.md: Describe the minimum required steps to execute the pipeline, e.g. how to prepare │ -│ samplesheets. │ -│ pipeline_todos: TODO string in README.md: update the following command to include all required parameters for a minimal example │ -│ pipeline_todos: TODO string in README.md: If applicable, make list of people who have also contributed │ -│ pipeline_todos: TODO string in README.md: Add citation for pipeline after first release. Uncomment lines below and update Zenodo │ -│ doi and badge at the top of this file. │ -│ pipeline_todos: TODO string in README.md: Add bibliography of tools and data used in your pipeline │ -│ pipeline_todos: TODO string in main.nf: Remove this line if you don't need a FASTA file │ -│ pipeline_todos: TODO string in nextflow.config: Specify your pipeline's command line flags │ -│ pipeline_todos: TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required │ -│ pipeline_todos: TODO string in ci.yml: You can customise CI pipeline run tests as required │ -│ pipeline_todos: TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, │ -│ e.g. add publication citation for this pipeline │ -│ pipeline_todos: TODO string in base.config: Check the defaults for all processes │ -│ pipeline_todos: TODO string in base.config: Customise requirements for specific processes. 
│ -│ pipeline_todos: TODO string in test.config: Specify the paths to your test data on nf-core/test-datasets │ -│ pipeline_todos: TODO string in test.config: Give any required params for the test so that command line flags are not needed │ -│ pipeline_todos: TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly │ -│ in repositories, e.g. SRA) │ -│ pipeline_todos: TODO string in test_full.config: Give any required params for the test so that command line flags are not needed │ -│ pipeline_todos: TODO string in output.md: Write this documentation describing your workflow's output │ -│ pipeline_todos: TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, │ -│ please point to (and add to) the main nf-core website. │ -│ pipeline_todos: TODO string in WorkflowDemotest.groovy: Optionally add in-text citation tools to this list. │ -│ pipeline_todos: TODO string in WorkflowMain.groovy: Add Zenodo DOI for pipeline after first release │ -│ │ -╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ - -╭─ [!] 3 Module Test Warnings ─────────────────────────────────────────────────────────────────────────────────────────────────────╮ -│ ╷ ╷ │ -│ Module name │ File path │ Test message │ -│╶──────────────────────────────────────────┼──────────────────────────────────────────┼──────────────────────────────────────────╴│ -│ custom/dumpsoftwareversions │ modules/nf-core/custom/dumpsoftwarevers… │ New version available │ -│ fastqc │ modules/nf-core/fastqc │ New version available │ -│ multiqc │ modules/nf-core/multiqc │ New version available │ -│ ╵ ╵ │ -╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ -╭───────────────────────╮ -│ LINT RESULTS SUMMARY │ -├───────────────────────┤ -│ [✔] 183 Tests Passed │ -│ [?] 0 Tests Ignored │ -│ [!] 27 Test Warnings │ -│ [✗] 0 Tests Failed │ -╰───────────────────────╯ -``` - -Failures are more serious however, and will typically prevent pull-requests from being merged. For example, if you edit CODE_OF_CONDUCT.md, which should match the template, you’ll get a pipeline lint test failure: - -```bash -echo "Edited" >> CODE_OF_CONDUCT.md -nf-core lint -``` - -``` -[...] -╭─ [✗] 1 Pipeline Test Failed ─────────────────────────────────────────────────────╮ -│ │ -│ files_unchanged: CODE_OF_CONDUCT.md does not match the template │ -│ │ -╰──────────────────────────────────────────────────────────────────────────────────╯ -[...] -╭───────────────────────╮ -│ LINT RESULTS SUMMARY │ -├───────────────────────┤ -│ [✔] 182 Tests Passed │ -│ [?] 0 Tests Ignored │ -│ [!] 27 Test Warnings │ -│ [✗] 1 Test Failed │ -╰───────────────────────╯ -[...] -``` - -:::tip{title="Exercise 3 - ToDos and linting"} - -1. Add the following bullet point list to the README file, where the ToDo indicates to describe the default steps to execute the pipeline - - ```groovy title="pipeline overview" - - Indexing of a transcriptome file - - Quality control - - Quantification of transcripts - - [whatever the custom script does] - - Generation of a MultiQC report - ``` - -
- solution 1

  ```md title="README.md"
  [...]

  ## Introduction

  **nf-core/demotest** is a bioinformatics pipeline that ...

  Default steps:

  - Indexing of a transcriptome file
  - Quality control
  - Quantification of transcripts
  - [whatever the custom script does]
  - Generation of a MultiQC report

  1. Read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
  2. Present QC for raw reads ([`MultiQC`](http://multiqc.info/))

  [...]
  ```

- -2. Lint the changes you have made -
- solution 2 - - ```bash - nf-core lint - ``` - - You should see that we now get one less `warning` in our lint overview, since we removed one of the TODO items. - -
- -3. Commit your changes -
- solution 3 - - ```bash - git add . - git commit -m "adding pipeline overview to pipeline README" - ``` - -
- - ::: - -## Adding Modules to a pipeline - -### Adding an existing nf-core module - -#### Identify available nf-core modules - -The nf-core pipeline template comes with a few nf-core/modules pre-installed. You can list these with the command below: - -```bash -nf-core modules list local -``` - -``` - ,--./,-. - ___ __ __ __ ___ /,-._.--~\ - |\ | |__ __ / ` / \ |__) |__ } { - | \| | \__, \__/ | \ |___ \`-._,-`-, - `._,._,' - - nf-core/tools version 2.10 - https://nf-co.re - - -INFO Modules installed in '.': - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓ -┃ Module Name ┃ Repository ┃ Version SHA ┃ Message ┃ Date ┃ -┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩ -│ custom/dumpsoftwareversions │ https://github.com/nf-cor… │ 911696ea0b62df80e900ef244d… │ Remove quay from │ 2023-05-04 │ -│ │ │ │ biocontainers (#3380) │ │ -│ fastqc │ https://github.com/nf-cor… │ bd8092b67b5103bdd52e300f75… │ Add singularity.registry = │ 2023-07-01 │ -│ │ │ │ 'quay.io' for tests │ │ -│ │ │ │ (#3499) │ │ -│ multiqc │ https://github.com/nf-cor… │ 911696ea0b62df80e900ef244d… │ Remove quay from │ 2023-05-04 │ -│ │ │ │ biocontainers (#3380) │ │ -└─────────────────────────────┴────────────────────────────┴─────────────────────────────┴────────────────────────────┴────────────┘ - -``` - -These version hashes and repository information for the source of the modules are tracked in the modules.json file in the root of the repo. This file will automatically be updated by nf-core/tools when you create, remove or update modules. - -Let’s see if all of our modules are up-to-date: - -```bash -nf-core modules update -``` - -``` - ,--./,-. - ___ __ __ __ ___ /,-._.--~\ - |\ | |__ __ / ` / \ |__) |__ } { - | \| | \__, \__/ | \ |___ \`-._,-`-, - `._,._,' - - nf-core/tools version 2.10 - https://nf-co.re - - -? Update all modules or a single named module? All modules -? Do you want to view diffs of the proposed changes? No previews, just update everything -INFO Updating 'nf-core/custom/dumpsoftwareversions' -INFO Updating 'nf-core/fastqc' -INFO Updating 'nf-core/multiqc' -INFO Updates complete ✨ -``` - -You can list all of the modules available on nf-core/modules via the command below but we have added search functionality to the nf-core website to do this too! - -```bash -nf-core modules list remote -``` - -#### Install a remote nf-core module - -To install a remote nf-core module from the website, you can first get information about a tool, including the installation command by executing: - -```bash -nf-core modules info salmon/index -``` - -``` - ,--./,-. 
          ___     __   __   __   ___     /,-._.--~\
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'

    nf-core/tools version 2.10 - https://nf-co.re

╭─ Module: salmon/index ──────────────────────────────────────────────────────────╮
│ 🌐 Repository: https://github.com/nf-core/modules.git                           │
│ 🔧 Tools: salmon                                                                │
│ 📖 Description: Create index for salmon                                         │
╰─────────────────────────────────────────────────────────────────────────────────╯
                            ╷                                            ╷
 📥 Inputs                  │ Description                                │ Pattern
╺━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━╸
 genome_fasta (file)        │ Fasta file of the reference genome         │
╶───────────────────────────┼────────────────────────────────────────────┼─────────────╴
 transcriptome_fasta (file) │ Fasta file of the reference transcriptome  │
                            ╵                                            ╵
                            ╷                                            ╷
 📤 Outputs                 │ Description                                │ Pattern
╺━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━╸
 index (directory)          │ Folder containing the salmon index files   │ salmon
╶───────────────────────────┼────────────────────────────────────────────┼─────────────╴
 versions (file)            │ File containing software versions          │ versions.yml
                            ╵                                            ╵

 💻  Installation command: nf-core modules install salmon/index

```

:::tip{title="Exercise 4 - Identification of available nf-core modules"}

1. Get information about the nf-core module `salmon/quant`.

- solution 1 - - ``` - nf-core modules info salmon/quant - ``` - -
- -2. Is there any version of `salmon/quant` already installed locally? -
- solution 2 - - ``` - nf-core modules list local - ``` - - If `salmon/quant` is not listed, there is no local version installed. - -
- ::: - -The output from the info command will among other things give you the nf-core/tools installation command, lets see what it is doing: - -```bash -nf-core modules install salmon/index -``` - -``` - - ,--./,-. - ___ __ __ __ ___ /,-._.--~\ - |\ | |__ __ / ` / \ |__) |__ } { - | \| | \__, \__/ | \ |___ \`-._,-`-, - `._,._,' - - nf-core/tools version 2.10 - https://nf-co.re - - -╭─ Module: salmon/index ───────────────────────────────────────────────────────────────────────────────────────────────────────────╮ -│ 🌐 Repository: https://github.com/nf-core/modules.git │ -│ 🔧 Tools: salmon │ -│ 📖 Description: Create index for salmon │ -╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ - ╷ ╷ - 📥 Inputs │Description │Pattern -╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━╸ - genome_fasta (file) │Fasta file of the reference genome │ -╶────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────┼───────╴ - transcriptome_fasta (file)│Fasta file of the reference transcriptome │ - ╵ ╵ - ╷ ╷ - 📤 Outputs │Description │ Pattern -╺━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━╸ - index (directory)│Folder containing the star index files │ salmon -╶───────────────────┼───────────────────────────────────────────────────────────────────────���──────────────────────────┼────────────╴ - versions (file) │File containing software versions │versions.yml - ╵ ╵ - - 💻 Installation command: nf-core modules install salmon/index - -gitpod /workspace/basic_training/nf-core-demotest (master) $ nf-core modules install salmon/index - - ,--./,-. - ___ __ __ __ ___ /,-._.--~\ - |\ | |__ __ / ` / \ |__) |__ } { - | \| | \__, \__/ | \ |___ \`-._,-`-, - `._,._,' - - nf-core/tools version 2.10 - https://nf-co.re - - -INFO Installing 'salmon/index' -INFO Use the following statement to include this module: - - include { SALMON_INDEX } from '../modules/nf-core/salmon/index/main' -``` - -(lots of steps missing here) -exercise to add a different module would be nice! => salmon/quant! -comparison to simple nextflow pipeline from the basic Nextflow training would be nice!) - -:::tip{title="Exercise 5 - Installing a remote module from nf-core"} - -1. Install the nf-core module `adapterremoval` -
- solution 1 - - ```bash - nf-core modules install adapterremoval - ``` - -
- -2. Which file(s) were/are added and what does it / do they do? -
- solution 2

  Installation added the module directory `/workspace/basic_training/nf-core-demotest/modules/nf-core/adapterremoval`:

  ```
  .
  ├── environment.yml
  ├── main.nf
  ├── meta.yml
  └── tests
      ├── main.nf.test
      ├── main.nf.test.snap
      └── tags.yml
  ```

  The `tests` directory contains everything required to run basic tests for the module and rarely needs to be changed. `main.nf` is the main workflow file that contains the module code. All input and output variables of the module are described in the `meta.yml` file, whereas the `environment.yml` file contains the dependencies of the module.

- -3. Import the installed `adapterremoval` pipeline into your main workflow. -
- solution 3 - - ```bash title="workflows/demotest.nf" - [...] - /* - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - */ - - include { FASTQC } from '../modules/nf-core/fastqc/main' - include { MULTIQC } from '../modules/nf-core/multiqc/main' - include { paramsSummaryMap } from 'plugin/nf-validation' - include { paramsSummaryMultiqc } from '../subworkflows/nf-core/utils_nfcore_pipeline' - include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline' - include { methodsDescriptionText } from '../subworkflows/local/utils_nfcore_demotest_pipeline' - include { ADAPTERREMOVAL } from '../modules/nf-core/adapterremoval/main' - - [...] - - ``` - -
- -4. Call the `ADAPTERREMOVAL` process in your workflow -
- solution 4 - - ```bash title="workflows/demotest.nf" - [...] - FASTQC ( - ch_samplesheet - ) - ch_multiqc_files = ch_multiqc_files.mix(FASTQC.out.zip.collect{it[1]}) - ch_versions = ch_versions.mix(FASTQC.out.versions.first()) - - // - // MODULE: ADAPTERREMOVAL - // - ADAPTERREMOVAL( - - ) - [...] - ``` - -
- -5. Add required parameters for `adapterremoval`to the `ADAPTERREMOVAL` process -
- solution 5

  `adapterremoval` requires three inputs: `meta`, `reads` and `adapterlist`, as outlined in the `meta.yml` of the module. `meta` and `reads` are typically given in one channel as a meta map, whereas the `adapterlist` will be its own channel, for which we should give a path. See here:

  ```bash title="adapterremoval/main.nf"
  [...]
  input:
  tuple val(meta), path(reads)
  path(adapterlist)
  [...]
  ```

  The meta map containing the metadata and the reads can be taken directly from the samplesheet, as is the case for FastQC, so we can give it the input channel `ch_samplesheet`. The `adapterlist` could either be a fixed path or a parameter given on the command line. For now, we will assume it is a parameter given on the command line. Note that the two inputs must be separated by a comma (a consolidated sketch of the whole exercise is shown after the last step). With this, the new module call for adapterremoval looks as follows:

  ```bash title="workflows/demotest.nf"
  [...]
  //
  // MODULE: ADAPTERREMOVAL
  //
  ADAPTERREMOVAL(
      ch_samplesheet,
      params.adapterlist
  )
  [...]
  ```

- -6. Add the input parameter `adapterlist` -
- solution 6

  In order to use `params.adapterlist` we need to add the parameter to the `nextflow.config`.

  ```bash title="nextflow.config"
  // Global default params, used in configs
  params {

      // TODO nf-core: Specify your pipeline's command line flags
      // Input options
      input       = null
      adapterlist = null

  [...]
  ```

  Then use the `nf-core schema build` tool to have the new parameter integrated into `nextflow_schema.json`. The output should look as follows.

  ```
  gitpod /workspace/basic_training/nf-core-demotest (master) $ nf-core schema build

                                            ,--./,-.
            ___     __   __   __   ___     /,-._.--~\
      |\ | |__  __ /  ` /  \ |__) |__         }  {
      | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                            `._,._,'

      nf-core/tools version 2.13.1 - https://nf-co.re

  INFO     [✓] Default parameters match schema validation
  INFO     [✓] Pipeline schema looks valid (found 32 params)
  ✨ Found 'params.adapterlist' in the pipeline config, but not in the schema. Add to pipeline schema? [y/n]: y
  ```

  Select y on the final prompt to launch a web browser to edit your schema graphically.

- -7. Lint your pipeline -
- solution 7 - - ```bash - nf-core lint - ``` - -
- -8. Run the pipeline and inspect the results -
- solution 8 - - To run the pipeline, be aware that we now need to specify a file containing the adapters. As such, we create a new file called "adapterlist.txt" and add the adapter sequence "[WE NEED AN ADAPTER SEQUENCE HERE]" to it. Then we can run the pipeline as follows: - - ```bash - nextflow run nf-core-demotest/ -profile test,docker --outdir test_results --adapterlist /path/to/adapterlist.txt - - ``` - -
- -9. Commit the changes -
- solution 9 - - ```bash - git add . - git commit -m "add adapterremoval module" - ``` - -
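Putting the pieces of this exercise together, the relevant additions might look like the following sketch; the adapter list contents and its path are placeholders that you must supply yourself:

```groovy title="workflows/demotest.nf (sketch)"
include { ADAPTERREMOVAL } from '../modules/nf-core/adapterremoval/main'

// ...inside the workflow block...
ADAPTERREMOVAL(
    ch_samplesheet,     // tuple val(meta), path(reads), taken from the samplesheet
    params.adapterlist  // path to the adapter list file, set on the command line
)
ch_versions = ch_versions.mix(ADAPTERREMOVAL.out.versions.first())
```
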
- -::: - -### Adding a local module - -If there is no nf-core module available for the software you want to include, the nf-core tools package can also aid in the generation of a local module that is specific for your pipeline. To add a local module run the following: - -``` - -nf-core modules create - -``` - -Open ./modules/local/demo/module.nf and start customising this to your needs whilst working your way through the extensive TODO comments! - -### Making a local module for a custom script - -To generate a module for a custom script you need to follow the same steps when adding a remote module. -Then, you can supply the command for your script in the `script` block but your script needs to be present -and _executable_ in the `bin` -folder of the pipeline. -In the nf-core pipelines, -this folder is in the main directory and you can see in [`rnaseq`](https://github.com/nf-core/rnaseq). -Let's look at an publicly available example in this pipeline, -for instance [`tximport.r`](https://github.com/nf-core/rnaseq/blob/master/bin/tximport.r). -This is an Rscript present in the [`bin`](https://github.com/nf-core/rnaseq/tree/master/bin) of the pipeline. -We can find the module that runs this script in -[`modules/local/tximport`](https://github.com/nf-core/rnaseq/blob/master/modules/local/tximport/main.nf). -As we can see the script is being called in the `script` block, note that `tximport.r` is -being executed as if it was called from the command line and therefore needs to be _executable_. - -
- -

TL;DR

- -1. Write your script on any language (python, bash, R, - ruby). E.g. `maf2bed.py` -2. If not there yet, move your script to `bin` folder of - the pipeline and make it - executable (`chmod +x `) -3. Create a module with a single process to call your script from within the workflow. E.g. `./modules/local/convert_maf2bed/main.nf` -4. Include your new module in your workflow with the command `include {CONVERT_MAF2BED} from './modules/local/convert_maf2bed/main'` that is written before the workflow call. -

_Tip: Try to follow best practices when writing a script for
reproducibility and maintenance purposes: add the
shebang (e.g. `#!/usr/bin/env python`), and a header
with description and type of license._

### 1. Write your script

Let's create a simple custom script that converts a MAF file to a BED file called `maf2bed.py` and place it in the `bin` directory of our nf-core-demotest pipeline:

```python title="maf2bed.py"
#!/usr/bin/env python
"""
Author: Raquel Manzano - @RaqManzano
Script: Convert MAF to BED format keeping ref and alt info
License: MIT
"""
import argparse
import pandas as pd


def argparser():
    parser = argparse.ArgumentParser(description="")
    parser.add_argument("-maf", "--mafin", help="MAF input file", required=True)
    parser.add_argument("-bed", "--bedout", help="BED output file", required=True)
    parser.add_argument(
        "--extra", help="Extra columns to keep (space separated list)", nargs="+", required=False, default=[]
    )
    return parser.parse_args()


def maf2bed(maf_file, bed_file, extra):
    maf = pd.read_csv(maf_file, sep="\t", comment="#")
    bed = maf[["Chromosome", "Start_Position", "End_Position"] + extra]
    bed.to_csv(bed_file, sep="\t", index=False, header=False)


def main():
    args = argparser()
    maf2bed(maf_file=args.mafin, bed_file=args.bedout, extra=args.extra)


if __name__ == "__main__":
    main()
```

### 2. Make sure your script is in the right folder

Now, let's move it to the correct directory and make sure it is executable:

```bash
mv maf2bed.py /path/where/pipeline/is/bin/.
chmod +x /path/where/pipeline/is/bin/maf2bed.py
```

### 3. Create your custom module

Then, let's write our module. We will call the process
"CONVERT_MAF2BED" and add any tags and/or labels that
are appropriate (this is optional) and directives (via
conda and/or container) for
the definition of dependencies.
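Before worrying about dependencies and the other directives, a bare first sketch of the process could look something like this (the names are ours to choose, and the full version is developed below):

```groovy
process CONVERT_MAF2BED {
    input:
    tuple val(meta), path(maf)

    output:
    tuple val(meta), path('*.bed'), emit: bed

    script:
    """
    maf2bed.py --mafin $maf --bedout ${meta.id}.bed
    """
}
```
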
- -

Some additional infos that might be of interest

- -
More info on labels

A `label` will
annotate the processes with a reusable identifier of your
choice that can be used for configuring. E.g. we use the
`label` 'process_single', which looks as follows:

```
withLabel:process_single {
    cpus   = { check_max( 1    * task.attempt, 'cpus'   ) }
    memory = { check_max( 1.GB * task.attempt, 'memory' ) }
    time   = { check_max( 1.h  * task.attempt, 'time'   ) }
}
```

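To connect the two sides: a process opts in to this configuration simply by declaring the label. A minimal sketch, in which `FOO` and its command are placeholders:

```groovy
process FOO {
    label 'process_single' // picks up the withLabel:process_single resources above

    script:
    """
    your_command --here
    """
}
```
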
- -
More info on tags

A `tag` is simply a user-provided identifier associated to
the task. In our process example, the input is a tuple
comprising a hash of metadata for the maf file called
`meta` and the path to the `maf` file. It may look
similar to: `[[id:'123', data_type:'maf'],
/path/to/file/example.maf]`. Hence, when Nextflow makes
the call and `$meta.id` is `123`, the name of the job
will be "CONVERT_MAF2BED(123)". If `meta` does not have
`id` in its hash, then this will be literally `null`.

- -
More info on conda/container directives

The `conda` directive allows for the definition of the
process dependencies using the [Conda package manager](https://docs.conda.io/en/latest/). Nextflow automatically sets up an environment for the given package names listed in the conda directive. For example:

```
process foo {
    conda 'bwa=0.7.15'

    '''
    your_command --here
    '''
}
```

Multiple packages can be specified by separating them with a blank space, e.g. `bwa=0.7.15 samtools=1.15.1`. The name of the channel from which a specific package needs to be downloaded can be specified using the usual Conda notation, i.e. prefixing the package with the channel name, as shown here: `bioconda::bwa=0.7.15`

```
process foo {
    conda 'bioconda::bwa=0.7.15 bioconda::samtools=1.15.1'

    '''
    your_bwa_cmd --here
    your_samtools_cmd --here
    '''
}
```

Similarly, we can apply the `container` directive to execute the process script in a [Docker](http://docker.io/) or [Singularity](https://docs.sylabs.io/guides/3.5/user-guide/introduction.html) container. When running Docker, it requires the Docker daemon to be running on the machine where the pipeline is executed, i.e. the local machine when using the local executor, or the cluster nodes when the pipeline is deployed through a grid executor.

```
process foo {
    conda 'bioconda::bwa=0.7.15 bioconda::samtools=1.15.1'
    container 'dockerbox:tag'

    '''
    your_bwa_cmd --here
    your_samtools_cmd --here
    '''
}
```

Additionally, the `container` directive allows for a more sophisticated choice of container, switching between Docker and Singularity depending on the user's choice of container engine. This practice is quite common in official nf-core modules.

```
process foo {
    conda "bioconda::fastqc=0.11.9"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0' :
        'biocontainers/fastqc:0.11.9--0' }"

    '''
    your_fastqc_command --here
    '''
}
```

- -

Since `maf2bed.py` is in the `bin` directory we can directly call it in the script block of our new module `CONVERT_MAF2BED`. You only have to be careful with how you call variables (i.e. when to use `${variable}` vs. `$variable`).
A process may contain any of the following definition blocks: directives, inputs, outputs, when clause, and the process script. Here is how we write it:

```
process CONVERT_MAF2BED {
    // HEADER
    tag "$meta.id"
    label 'process_single'
    // DEPENDENCIES DIRECTIVES
    conda "anaconda::pandas=1.4.3"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/pandas:1.4.3' :
        'quay.io/biocontainers/pandas:1.4.3' }"

    // INPUT BLOCK
    input:
    tuple val(meta), path(maf)

    // OUTPUT BLOCK
    output:
    tuple val(meta), path('*.bed') , emit: bed
    path "versions.yml"            , emit: versions

    // WHEN CLAUSE
    when:
    task.ext.when == null || task.ext.when

    // SCRIPT BLOCK
    script: // This script is bundled with the pipeline in bin
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"

    """
    maf2bed.py --mafin $maf --bedout ${prefix}.bed
    """
}
```

More on Nextflow's process components in the [docs](https://www.nextflow.io/docs/latest/process.html).

### Include your module in the workflow

In general, we will call our Nextflow module file `main.nf` and save it in the `modules` folder under another folder called `convert_maf2bed`. If you believe your custom script could be useful for others, is potentially reusable, or calls a tool that is not yet present in nf-core modules, you can start the process of making it official by adding a `meta.yml` [explained above](#adding-modules-to-a-pipeline). The overall tree for the pipeline skeleton will look as follows:

```
pipeline/
├── bin/
│   └── maf2bed.py
├── modules/
│   ├── local/
│   │   └── convert_maf2bed/
│   │       ├── main.nf
│   │       └── meta.yml
│   └── nf-core/
├── config/
│   ├── base.config
│   └── modules.config
...
```

To use our custom module located in `./modules/local/convert_maf2bed` within our workflow, we use a module inclusion command as follows (this has to be done before we invoke our workflow):

```bash title="workflows/demotest.nf"
include { CONVERT_MAF2BED } from './modules/local/convert_maf2bed/main'

workflow {
    input_data = [[id:'123', data_type:'maf'], '/path/to/maf/example.maf']
    CONVERT_MAF2BED(input_data)
}
```

:::tip{title="Exercise 6 - Adding a custom module"}
In the directory `exercise_6` you will find the custom script `print_hello.py`, which will be used for this and the next exercise.

1. Create a local module that runs the `print_hello.py` script
2. Add the module to your main workflow
3. Run the pipeline
4. Lint the pipeline
5. Commit your changes

- solution 1 - - ``` - - ``` - -
- -::: - -### Further reading and additional notes - -#### What happens in I want to use containers but there is no image created with the packages I need? - -No worries, this can be done fairly easy thanks to [BioContainers](https://biocontainers-edu.readthedocs.io/en/latest/what_is_biocontainers.html), see instructions [here](https://github.com/BioContainers/multi-package-containers). If you see the combination that you need in the repo, you can also use [this website](https://midnighter.github.io/mulled) to find out the "mulled" name of this container. - -### I want to know more about software dependencies! - -You are in luck, we have more documentation [here](https://nf-co.re/docs/contributing/modules#software-requirements) - -#### I want to know more about modules! - -See more info about modules in the nextflow docs [here](https://nf-co.re/docs/contributing/modules#software-requirements.) - -## Lint all modules - -As well as the pipeline template you can lint individual or all modules with a single command: - -``` - -nf-core modules lint --all - -``` - -## Nextflow Schema - -All nf-core pipelines can be run with --help to see usage instructions. We can try this with the demo pipeline that we just created: - -``` - -cd ../ -nextflow run nf-core-demo/ --help - -``` - -### Working with Nextflow schema - -If you peek inside the nextflow_schema.json file you will see that it is quite an intimidating thing. The file is large and complex, and very easy to break if edited manually. - -Thankfully, we provide a user-friendly tool for editing this file: nf-core schema build. - -To see this in action, let’s add some new parameters to nextflow.config: - -``` - -params { -demo = 'param-value-default' -foo = null -bar = false -baz = 12 -// rest of the config file.. - -``` - -Then run nf-core schema build: - -``` - -cd nf-core-demo/ -nf-core schema build - -``` - -The CLI tool should then prompt you to add each new parameter. -Here in the schema editor you can edit: - -- Description and help text -- Type (string / boolean / integer etc) -- Grouping of parameters -- Whether a parameter is required, or hidden from help by default -- Enumerated values (choose from a list) -- Min / max values for numeric types -- Regular expressions for validation -- Special formats for strings, such as file-path -- Additional fields for files such as mime-type - -:::tip{title="Exercise 7 - Using nextflow schema to add command line parameters"} - -1. Feed a string to your custom script from exercise 6 from the command line. Use `nf-core schema build` to add the parameter to the `nextflow.config` file. - - - -::: - -:::note - -### Key points - -- `nf-core create ` creates a pipeline from the nf-core template. -- `nf-core lint` lints the pipeline code for things that must be completed. -- `nf-core modules list local` lists modules currently installed into your pipeline. -- `nf-core modules list remote` lists modules available to install into your pipeline. -- `nf-core modules install ` installs the tool module into your pipeline. -- `nf-core modules create` creates a module locally to add custom code into your pipeline. -- `nf-core modules lint --all` lints your module code for things that must be completed. -- `nf-core schema build` opens an interface to allow you to describe your pipeline parameters and set default values, and which values are valid. 
- -::: - -``` - -``` diff --git a/src/content/docs/contributing/nf_core_basic_training/add_custom_module.md b/src/content/docs/contributing/nf_core_basic_training/add_custom_module.md new file mode 100644 index 0000000000..3812f9946c --- /dev/null +++ b/src/content/docs/contributing/nf_core_basic_training/add_custom_module.md @@ -0,0 +1,303 @@ +--- +title: Basic training to create an nf-core pipeline +subtitle: Adding a local custom module to your pipeline +--- + +### Adding a local module + +If there is no nf-core module available for the software you want to include, the nf-core tools package can also aid in the generation of a local module that is specific for your pipeline. To add a local module run the following: + +``` + +nf-core modules create + +``` + +Open ./modules/local/demo/module.nf and start customising this to your needs whilst working your way through the extensive TODO comments! For further help and guidelines for the modules code, check out the [modules specific documentation](https://nf-co.re/docs/contributing/tutorials/dsl2_modules_tutorial). + +### Making a local module for a custom script + +To generate a module for a custom script you need to follow the same steps when adding a remote module. +Then, you can supply the command for your script in the `script` block but your script needs to be present +and _executable_ in the `bin` +folder of the pipeline. +In the nf-core pipelines, +this folder is in the main directory and you can see in [`rnaseq`](https://github.com/nf-core/rnaseq). +Let's look at an publicly available example in this pipeline, +for instance [`tximport.r`](https://github.com/nf-core/rnaseq/blob/master/bin/tximport.r). +This is an Rscript present in the [`bin`](https://github.com/nf-core/rnaseq/tree/master/bin) of the pipeline. +We can find the module that runs this script in +[`modules/local/tximport`](https://github.com/nf-core/rnaseq/blob/master/modules/local/tximport/main.nf). +As we can see the script is being called in the `script` block, note that `tximport.r` is +being executed as if it was called from the command line and therefore needs to be _executable_. + +
+ +

TL;DR


1. Write your script in any language (e.g. Python, Bash, R, Ruby). E.g. `maf2bed.py`
2. If it is not there yet, move your script to the `bin` folder of the pipeline and make it executable (`chmod +x <your_script>`)
3. Create a module with a single process to call your script from within the workflow. E.g. `./modules/local/convert_maf2bed/main.nf`
4. Include your new module in your workflow with the statement `include {CONVERT_MAF2BED} from './modules/local/convert_maf2bed/main'`, written before the workflow call.
</details>
_Tip: Try to follow best practices when writing a script for reproducibility and maintenance purposes: add the shebang (e.g. `#!/usr/bin/env python`), and a header with a description and the type of license._

### 1. Write your script

Let's create a simple custom script that converts a MAF file to a BED file, called `maf2bed.py`, and place it in the `bin` directory of our nf-core-testpipeline:

```python title="maf2bed.py"
#!/usr/bin/env python
"""
Author: Raquel Manzano - @RaqManzano
Script: Convert MAF to BED format keeping ref and alt info
License: MIT
"""
import argparse
import pandas as pd


def argparser():
    parser = argparse.ArgumentParser(description="")
    parser.add_argument("-maf", "--mafin", help="MAF input file", required=True)
    parser.add_argument("-bed", "--bedout", help="BED output file", required=True)
    parser.add_argument(
        "--extra", help="Extra columns to keep (space separated list)", nargs="+", required=False, default=[]
    )
    return parser.parse_args()


def maf2bed(maf_file, bed_file, extra):
    maf = pd.read_csv(maf_file, sep="\t", comment="#")
    bed = maf[["Chromosome", "Start_Position", "End_Position"] + extra]
    bed.to_csv(bed_file, sep="\t", index=False, header=False)


def main():
    args = argparser()
    maf2bed(maf_file=args.mafin, bed_file=args.bedout, extra=args.extra)


if __name__ == "__main__":
    main()
```

### 2. Make sure your script is in the right folder

Now, let's move it to the correct directory and make sure it is executable:

```bash
mv maf2bed.py /path/where/pipeline/is/bin/.
chmod +x /path/where/pipeline/is/bin/maf2bed.py
```

### 3. Create your custom module

Then, let's write our module. We will call the process "CONVERT_MAF2BED" and add any tags and/or labels that are appropriate (this is optional) and directives (via conda and/or container) for the definition of dependencies.

<details>
+ +

Some additional info that might be of interest

+ +
<summary>More info on labels</summary>

A `label` will annotate the processes with a reusable identifier of your choice that can be used for configuring. E.g. we use the `label` 'process_single', this looks as follows:

```
withLabel:process_single {
    cpus   = { check_max( 1    * task.attempt, 'cpus'   ) }
    memory = { check_max( 1.GB * task.attempt, 'memory' ) }
    time   = { check_max( 1.h  * task.attempt, 'time'   ) }
}
```

</details>
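
On the process side, a process opts in to such a configuration simply by declaring the label as a directive. A minimal sketch, following the same `process foo` style used in the dependency examples below:

```
process foo {
    // reusable identifier picked up by the withLabel configuration above
    label 'process_single'

    '''
    your_command --here
    '''
}
```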
+ +
<summary>More info on tags</summary>

A `tag` is simply a user-provided identifier associated with the task. In our process example, the input is a tuple comprising a hash of metadata for the maf file called `meta` and the path to the `maf` file. It may look similar to: `[[id:'123', data_type:'maf'], /path/to/file/example.maf]`. Hence, when Nextflow makes the call and `$meta.id` is `123`, the name of the job will be "CONVERT_MAF2BED(123)". If `meta` does not have `id` in its hash, then this will be literally `null`.

</details>
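
In a process definition the tag is, again, just a directive. A minimal sketch, reusing the `meta`/`maf` tuple input described above (the command itself is a placeholder):

```
process CONVERT_MAF2BED {
    // uses the id from the meta map as the task identifier
    tag "$meta.id"

    input:
    tuple val(meta), path(maf)

    '''
    your_command --here
    '''
}
```

With `meta.id` set to `123`, the task will then show up in the Nextflow run log as `CONVERT_MAF2BED (123)`.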
+ +
+More info on conda/container directives + +The `conda` directive allows for the definition of the +process dependencies using the [Conda package manager](https://docs.conda.io/en/latest/). Nextflow automatically sets up an environment for the given package names listed by in the conda directive. For example: + +``` + +process foo { +conda 'bwa=0.7.15' + +''' +your_command --here +''' +} + +``` + +Multiple packages can be specified separating them with a blank space e.g. `bwa=0.7.15 samtools=1.15.1`. The name of the channel from where a specific package needs to be downloaded can be specified using the usual Conda notation i.e. prefixing the package with the channel name as shown here `bioconda::bwa=0.7.15` + +``` + +process foo { +conda 'bioconda::bwa=0.7.15 bioconda::samtools=1.15.1' + +''' +your_bwa_cmd --here +your_samtools_cmd --here +''' +} + +``` + +Similarly, we can apply the `container` directive to execute the process script in a [Docker](http://docker.io/) or [Singularity](https://docs.sylabs.io/guides/3.5/user-guide/introduction.html) container. When running Docker, it requires the Docker daemon to be running in machine where the pipeline is executed, i.e. the local machine when using the local executor or the cluster nodes when the pipeline is deployed through a grid executor. + +``` + +process foo { +conda 'bioconda::bwa=0.7.15 bioconda::samtools=1.15.1' +container 'dockerbox:tag' + +''' +your_bwa_cmd --here +your_samtools_cmd --here +''' +} + +``` + +Additionally, the `container` directive allows for a more sophisticated choice of container and if it Docker or Singularity depending on the users choice of container engine. This practice is quite common on official nf-core modules. + +``` + +process foo { +conda "bioconda::fastqc=0.11.9" +container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? +'https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0' : +'biocontainers/fastqc:0.11.9--0' }" + +''' +your_fastqc_command --here +''' +} + +``` + +
+ +

Since `maf2bed.py` is in the `bin` directory, we can directly call it in the script block of our new module `CONVERT_MAF2BED`. You only have to be careful with how you call variables (e.g. when to use `${variable}` vs. `$variable`). A process may contain any of the following definition blocks: directives, inputs, outputs, when clause, and the process script. Here is how we write it:

```
process CONVERT_MAF2BED {
    // HEADER
    tag "$meta.id"
    label 'process_single'

    // DEPENDENCIES DIRECTIVES
    conda "anaconda::pandas=1.4.3"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/pandas:1.4.3' :
        'quay.io/biocontainers/pandas:1.4.3' }"

    // INPUT BLOCK
    input:
    tuple val(meta), path(maf)

    // OUTPUT BLOCK
    output:
    tuple val(meta), path('*.bed'), emit: bed
    path "versions.yml"           , emit: versions

    // WHEN CLAUSE
    when:
    task.ext.when == null || task.ext.when

    // SCRIPT BLOCK
    script: // This script is bundled with the pipeline in bin
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"

    """
    maf2bed.py --mafin $maf --bedout ${prefix}.bed

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        python: \$(python --version | sed 's/Python //g')
    END_VERSIONS
    """
}
```

More on Nextflow's process components in the [docs](https://www.nextflow.io/docs/latest/process.html).

### Include your module in the workflow

In general, we will call our Nextflow module `main.nf` and save it in the `modules` folder, under another folder called `convert_maf2bed`. If you believe your custom script could be useful for others, is potentially reusable, or calls a tool that is not yet present in nf-core modules, you can start the process of making it official by adding a `meta.yml` [explained above](#adding-modules-to-a-pipeline). The overall tree for the pipeline skeleton will look as follows:

```
pipeline/
├── bin/
│   └── maf2bed.py
├── modules/
│   ├── local/
│   │   └── convert_maf2bed/
│   │       ├── main.nf
│   │       └── meta.yml
│   └── nf-core/
├── config/
│   ├── base.config
│   └── modules.config
...
```

To use our custom module located in `./modules/local/convert_maf2bed` within our workflow, we use a module inclusion statement as follows (this has to be done before we invoke our workflow):

```bash title="workflows/demotest.nf"
include { CONVERT_MAF2BED } from './modules/local/convert_maf2bed/main'

workflow {
    input_data = [[id:'123', data_type:'maf'], file('/path/to/maf/example.maf')]
    CONVERT_MAF2BED(input_data)
}
```

:::tip{title="Exercise 6 - Adding a custom module"}
In the directory `exercise_6` you will find the custom script `print_hello.py`, which will be used for this and the next exercise.

1. Create a local module that runs the `print_hello.py` script
2. Add the module to your main workflow
3. Run the pipeline
4. Lint the pipeline
5. Commit your changes
   <details>
+ solution 1 + + ``` + + ``` + +
+ +::: diff --git a/src/content/docs/contributing/nf_core_basic_training/add_nf_core_module.md b/src/content/docs/contributing/nf_core_basic_training/add_nf_core_module.md new file mode 100644 index 0000000000..110b68f77a --- /dev/null +++ b/src/content/docs/contributing/nf_core_basic_training/add_nf_core_module.md @@ -0,0 +1,454 @@ +--- +title: Basic training to create an nf-core pipeline +subtitle: Adding a nf-core module to your pipeline +--- + +# Building a pipeline from existing components + +Nextflow pipelines can be build in a very modular fashion. In nf-core, we have simple building blocks available: nf-core/modules. Usually, they are wrappers around individual tools. In addition, we have subworkflows: smaller pre-build pipeline chunks. You can think about the modules as Lego bricks and subworkflows as pre-build chunks that can be added to various sets. These components are centrally available for all Nextflow pipelines. To make working with them easy, you can use `nf-core/tools`. + +## Identify available nf-core modules + +The nf-core pipeline template comes with a few nf-core/modules pre-installed. You can list these with the command below: + +```bash +nf-core modules list local +``` + +``` + ,--./,-. + ___ __ __ __ ___ /,-._.--~\ + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + + nf-core/tools version 2.10 - https://nf-co.re + + +INFO Modules installed in '.': + +┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓ +┃ Module Name ┃ Repository ┃ Version SHA ┃ Message ┃ Date ┃ +┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩ +│ custom/dumpsoftwareversions │ https://github.com/nf-cor… │ 911696ea0b62df80e900ef244d… │ Remove quay from │ 2023-05-04 │ +│ │ │ │ biocontainers (#3380) │ │ +│ fastqc │ https://github.com/nf-cor… │ bd8092b67b5103bdd52e300f75… │ Add singularity.registry = │ 2023-07-01 │ +│ │ │ │ 'quay.io' for tests │ │ +│ │ │ │ (#3499) │ │ +│ multiqc │ https://github.com/nf-cor… │ 911696ea0b62df80e900ef244d… │ Remove quay from │ 2023-05-04 │ +│ │ │ │ biocontainers (#3380) │ │ +└─────────────────────────────┴────────────────────────────┴─────────────────────────────┴────────────────────────────┴────────────┘ + +``` + +These version hashes and repository information for the source of the modules are tracked in the modules.json file in the root of the repo. This file will automatically be updated by nf-core/tools when you create, remove or update modules. + +Let’s see if all of our modules are up-to-date: + +```bash +nf-core modules update +``` + +``` + ,--./,-. + ___ __ __ __ ___ /,-._.--~\ + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + + nf-core/tools version 2.10 - https://nf-co.re + + +? Update all modules or a single named module? All modules +? Do you want to view diffs of the proposed changes? No previews, just update everything +INFO Updating 'nf-core/custom/dumpsoftwareversions' +INFO Updating 'nf-core/fastqc' +INFO Updating 'nf-core/multiqc' +INFO Updates complete ✨ +``` + +You can list all of the modules available on nf-core/modules via the command below but we have added search functionality to the nf-core website to do this too! 
+ +```bash +nf-core modules list remote +``` + +In addition, all modules are listed on the website: [https://nf-co.re/modules](https://nf-co.re/modules) + +## Install a remote nf-core module + +To install a remote nf-core module, you can first get information about a tool, including the installation command by executing: + +```bash +nf-core modules info salmon/index +``` + +``` + ,--./,-. + ___ __ __ __ ___ /,-._.--~\ + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + + nf-core/tools version 2.10 - https://nf-co.re + +╭─ Module: salmon/index ───────────────────────────────────────────────────────────────────────────────────────────────────────────╮ +│ 🌐 Repository: https://github.com/nf-core/modules.git │ +│ 🔧 Tools: salmon │ +│ 📖 Description: Create index for salmon │ +╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ +╷ ╷ +📥 Inputs │Description │Pattern +╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━╸ +genome_fasta (file) │Fasta file of the reference genome │ +╶────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────┼───────╴ +transcriptome_fasta (file)│Fasta file of the reference transcriptome │ +╵ ╵ +╷ ╷ +📤 Outputs │Description │ Pattern +╺━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━╸ +index (directory)│Folder containing the star index files │ salmon +╶───────────────────┼───────────────────────────────────────────────────────────────────────���──────────────────────────┼────────────╴ +versions (file) │File containing software versions │versions.yml +╵ ╵ + +💻 Installation command: nf-core modules install salmon/index + +``` + +:::tip{title="Exercise 4 - Identification of available nf-core modules"} + +1. Get information abou the nf-core module `salmon/quant`. +
+ solution 1 + + ``` + nf-core modules info salmon/quant + ``` + +
+ +2. Is there any version of `salmon/quant` already installed locally? +
+ solution 2 + + ``` + nf-core modules list local + ``` + + If `salmon/quant` is not listed, there is no local version installed. + +
+ ::: + +The output from the info command will among other things give you the nf-core/tools installation command, lets see what it is doing: + +```bash +nf-core modules install salmon/index +``` + +``` + + ,--./,-. + ___ __ __ __ ___ /,-._.--~\ + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + + nf-core/tools version 2.10 - https://nf-co.re + + +╭─ Module: salmon/index ───────────────────────────────────────────────────────────────────────────────────────────────────────────╮ +│ 🌐 Repository: https://github.com/nf-core/modules.git │ +│ 🔧 Tools: salmon │ +│ 📖 Description: Create index for salmon │ +╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ + ╷ ╷ + 📥 Inputs │Description │Pattern +╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━╸ + genome_fasta (file) │Fasta file of the reference genome │ +╶────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────┼───────╴ + transcriptome_fasta (file)│Fasta file of the reference transcriptome │ + ╵ ╵ + ╷ ╷ + 📤 Outputs │Description │ Pattern +╺━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━╸ + index (directory)│Folder containing the star index files │ salmon +╶───────────────────┼───────────────────────────────────────────────────────────────────────���──────────────────────────┼────────────╴ + versions (file) │File containing software versions │versions.yml + ╵ ╵ + + 💻 Installation command: nf-core modules install salmon/index + +gitpod /workspace/basic_training/nf-core-demotest (master) $ nf-core modules install salmon/index + + ,--./,-. + ___ __ __ __ ___ /,-._.--~\ + |\ | |__ __ / ` / \ |__) |__ } { + | \| | \__, \__/ | \ |___ \`-._,-`-, + `._,._,' + + nf-core/tools version 2.10 - https://nf-co.re + + +INFO Installing 'salmon/index' +INFO Use the following statement to include this module: + + include { SALMON_INDEX } from '../modules/nf-core/salmon/index/main' +``` + +The module is now installed into the folder `modules/nf-core`. Now open the file `workflow/demotest.nf`. You will find already several `include` statements there from the installed modules (`MultiQC` and `FastQC`): + +```bash title="workflow/demotest.nf" + +include { FASTQC } from '../modules/nf-core/fastqc/main' +include { MULTIQC } from '../modules/nf-core/multiqc/main' +``` + +Now add the above line underneath it: + +```bash title="workflow/demotest.nf" + +include { FASTQC } from '../modules/nf-core/fastqc/main' +include { MULTIQC } from '../modules/nf-core/multiqc/main' +include { SALMON_INDEX } from '../modules/nf-core/salmon/index/main' + +``` + +This makes the module now available in the workflow script and it can be called with the right input data. + + + +We can now call the module in our workflow. Let's place it after FastQC: + +```bash title="workflow/demotest.nf" + +workflow DEMOTEST { + + ... + // + // MODULE: Run FastQC + // + FASTQC ( + INPUT_CHECK.out.reads + ) + ch_versions = ch_versions.mix(FASTQC.out.versions.first()) + + SALMON_INDEX() +``` + +Now we are still missing an input for our module. In order to build an index, we require the reference fasta. Luckily, the template pipeline has this already all configured, and we can access it by just using `params.fasta` and `view` it to insppect the channel content. 
(We will see later how to add more input files.)

```bash
    fasta = Channel.fromPath(params.fasta)

    fasta.view()

    SALMON_INDEX(
        fasta.map { it -> [[id: it.getName()], it] }
    )
    ch_versions = ch_versions.mix(SALMON_INDEX.out.versions.first())

```

Now what is happening here:

To pass our input FastA file to the module, we need to do a small channel manipulation. nf-core/modules typically take the input together with a `meta` map. This is just a hashmap that contains relevant information for the analysis and that should be passed around the pipeline. There are a couple of keys that we share across all modules, such as `id`. So, in order to have a valid input for our module, we just use the fasta file name (`it.getName()`) as our `id`. In addition, we collect the versions of the tools that are run in the module. This will later allow us to track all tools and versions and to generate a report.

Now test your pipeline:

```bash
nextflow run main.nf -profile test,docker --outdir results
```

You should now see that `SALMON_INDEX` is run.

(lots of steps missing here)
exercise to add a different module would be nice! => salmon/quant!
comparison to simple nextflow pipeline from the basic Nextflow training would be nice!)

:::tip{title="Exercise 5 - Installing a remote module from nf-core"}

1. Install the nf-core module `adapterremoval`
   <details>
+ solution 1 + + ```bash + nf-core modules install adapterremoval + ``` + +

2. Which files were added and what do they do?
   <details>
    <summary>solution 2</summary>

    Installation added the module directory `/workspace/basic_training/nf-core-demotest/modules/nf-core/adapterremoval`:

    ```
    .
    ├── environment.yml
    ├── main.nf
    ├── meta.yml
    └── tests
        ├── main.nf.test
        ├── main.nf.test.snap
        └── tags.yml
    ```

    The `tests` directory contains all the information required to perform basic tests for the module; it rarely needs to be changed. `main.nf` is the main file that contains the module code. All input and output variables of the module are described in the `meta.yml` file, whereas the `environment.yml` file contains the dependencies of the module.

   </details>

3. Import the installed `adapterremoval` module into your main workflow.
   <details>
+ solution 3 + + ```bash title="workflows/demotest.nf" + [...] + /* + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + */ + + include { FASTQC } from '../modules/nf-core/fastqc/main' + include { MULTIQC } from '../modules/nf-core/multiqc/main' + include { paramsSummaryMap } from 'plugin/nf-validation' + include { paramsSummaryMultiqc } from '../subworkflows/nf-core/utils_nfcore_pipeline' + include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline' + include { methodsDescriptionText } from '../subworkflows/local/utils_nfcore_demotest_pipeline' + include { ADAPTERREMOVAL } from '../modules/nf-core/adapterremoval/main' + + [...] + + ``` + +
+ +4. Call the `ADAPTERREMOVAL` process in your workflow +
+ solution 4 + + ```bash title="workflows/demotest.nf" + [...] + FASTQC ( + ch_samplesheet + ) + ch_multiqc_files = ch_multiqc_files.mix(FASTQC.out.zip.collect{it[1]}) + ch_versions = ch_versions.mix(FASTQC.out.versions.first()) + + // + // MODULE: ADAPTERREMOVAL + // + ADAPTERREMOVAL( + + ) + [...] + ``` + +

5. Add required parameters for `adapterremoval` to the `ADAPTERREMOVAL` process
   <details>
    <summary>solution 5</summary>

    `adapterremoval` requires three inputs: `meta`, `reads` and `adapterlist`, as outlined in the `meta.yml` of the module. `meta` and `reads` are typically given in one channel as a meta map, whereas the `adapterlist` will be its own channel for which we should give a path. See here:

    ```bash title="adapterremoval/main.nf"
    [...]
    input:
    tuple val(meta), path(reads)
    path(adapterlist)
    [...]
    ```

    The meta map containing the metadata and the reads can be taken directly from the samplesheet, as is the case for FastQC, therefore we can give it the input channel `ch_samplesheet`. The `adapterlist` could either be a fixed path, or a parameter that is given on the command line. For now, we will just add a dummy channel called `adapterlist`, assuming that it will be a parameter given on the command line. With this, the new module call for adapterremoval looks as follows:

    ```bash title="workflows/demotest.nf"
    [...]
    //
    // MODULE: ADAPTERREMOVAL
    //
    ADAPTERREMOVAL(
        ch_samplesheet,
        params.adapterlist
    )
    [...]
    ```

   </details>
+ +6. Add the input parameter `adapterlist` +
    <summary>solution 6</summary>

    In order to use `params.adapterlist` we need to add the parameter to the `nextflow.config`.

    ```bash title="nextflow.config"
    // Global default params, used in configs
    params {

        // TODO nf-core: Specify your pipeline's command line flags
        // Input options
        input       = null
        adapterlist = null

    [...]
    ```

    Then use the `nf-core schema build` tool to have the new parameter integrated into `nextflow_schema.json`. The output should look as follows.

    ```
    gitpod /workspace/basic_training/nf-core-demotest (master) $ nf-core schema build

                                              ,--./,-.
              ___     __   __   __   ___     /,-._.--~\
        |\ | |__  __ /  ` /  \ |__) |__      }  {
        | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                              `._,._,'

        nf-core/tools version 2.13.1 - https://nf-co.re

    INFO     [✓] Default parameters match schema validation
    INFO     [✓] Pipeline schema looks valid (found 32 params)
    ✨ Found 'params.adapterlist' in the pipeline config, but not in the schema. Add to pipeline schema? [y/n]: y
    ```

    Select y on the final prompt to launch a web browser to edit your schema graphically.

   </details>
+ +7. Lint your pipeline +
+ solution 7 + + ```bash + nf-core lint + ``` + +
+ +8. Run the pipeline and inspect the results +
+ solution 8 + + To run the pipeline, be aware that we now need to specify a file containing the adapters. As such, we create a new file called "adapterlist.txt" and add the adapter sequence "[WE NEED AN ADAPTER SEQUENCE HERE]" to it. Then we can run the pipeline as follows: + + ```bash + nextflow run nf-core-demotest/ -profile test,docker --outdir test_results --adapterlist /path/to/adapterlist.txt + + ``` + +
+ +9. Commit the changes +
+ solution 9 + + ```bash + git add . + git commit -m "add adapterremoval module" + ``` + +
+ +::: diff --git a/src/content/docs/contributing/nf_core_basic_training/gitpod_environment.md b/src/content/docs/contributing/nf_core_basic_training/gitpod_environment.md index 09d51b5bdc..511ee6dab8 100644 --- a/src/content/docs/contributing/nf_core_basic_training/gitpod_environment.md +++ b/src/content/docs/contributing/nf_core_basic_training/gitpod_environment.md @@ -1,20 +1,8 @@ --- title: Basic training to create an nf-core pipeline -subtitle: A guide to create Nextflow pipelines using nf-core tools +subtitle: Setting up the gitpod environment for the course --- -## Preparation - -### Prerequisites - -- Familiarity with Nextflow syntax and configuration. - -### Follow the training videos - -This training can be followed either based on this documentation alone, or via a training video hosted on youtube. You can find the youtube video in the Youtube playlist below: - -(no such video yet) - ### Gitpod For this tutorial we will use Gitpod, which runs in the learners web browser. The Gitpod environment contains a preconfigured Nextflow development environment diff --git a/src/content/docs/contributing/nf_core_basic_training/index.md b/src/content/docs/contributing/nf_core_basic_training/index.md index bafb263a47..bacaf5404c 100644 --- a/src/content/docs/contributing/nf_core_basic_training/index.md +++ b/src/content/docs/contributing/nf_core_basic_training/index.md @@ -22,7 +22,7 @@ subtitle: A guide to create Nextflow pipelines using nf-core tools ::: -This training course aims to demonstrate how to build an nf-core pipeline using the nf-core pipeline template and nf-core modules as well as custom, local modules. Be aware that we are not going to explain any fundamental Nextflow concepts, as such we advise anyone taking this course to have completed the [Basic Nextflow Training Workshop](https://training.nextflow.io/). +This training course aims to demonstrate how to build an nf-core pipeline using the nf-core pipeline template and nf-core modules and subworkflows as well as custom, local modules. Be aware that we are not going to explain any fundamental Nextflow concepts, as such we advise anyone taking this course to have completed the [Basic Nextflow Training Workshop](https://training.nextflow.io/). ```md During this course we are going to build a Simple RNA-Seq workflow. @@ -31,6 +31,12 @@ but should only teach the objectives of the course, so please, **DO NOT use this workflow to analyse RNA sequencing data**! ``` +### Follow the training videos + +This training can be followed either based on this documentation alone, or via a training video hosted on youtube. You can find the youtube video in the Youtube playlist below: + +(no such video yet) + ## Overview ### Layout of the pipeline @@ -67,10 +73,25 @@ The course is using gitpod in order to avoid the time expense for downloading an 4. 
**Building a nf-core pipeline using the template** - a) [Adding a nf-core module to your pipeline]() + a) [Adding a nf-core module to your pipeline](/docs/contributing/nf_core_basic_training/add_nf_core_module.md) - b) [Adding a local custom module to your pipeline]() + b) [Adding a local custom module to your pipeline](/docs/contributing/nf_core_basic_training/add_custom_module.md) - c) [Working with Nextflow schema]() + c) [Working with Nextflow schema](/docs/contributing/nf_core_basic_training/nf_schema.md) - d) [Linting your modules]() + d) [Linting your modules](/docs/contributing/nf_core_basic_training/linting_modules.md) + +:::note + +### Key points + +- `nf-core create ` creates a pipeline from the nf-core template. +- `nf-core lint` lints the pipeline code for things that must be completed. +- `nf-core modules list local` lists modules currently installed into your pipeline. +- `nf-core modules list remote` lists modules available to install into your pipeline. +- `nf-core modules install ` installs the tool module into your pipeline. +- `nf-core modules create` creates a module locally to add custom code into your pipeline. +- `nf-core modules lint --all` lints your module code for things that must be completed. +- `nf-core schema build` opens an interface to allow you to describe your pipeline parameters and set default values, and which values are valid. + +::: diff --git a/src/content/docs/contributing/nf_core_basic_training/linting_modules.md b/src/content/docs/contributing/nf_core_basic_training/linting_modules.md new file mode 100644 index 0000000000..b80abf407f --- /dev/null +++ b/src/content/docs/contributing/nf_core_basic_training/linting_modules.md @@ -0,0 +1,14 @@ +--- +title: Basic training to create an nf-core pipeline +subtitle: Linting your modules +--- + +## Lint all modules + +As well as the pipeline template you can lint individual or all modules with a single command: + +``` + +nf-core modules lint --all + +``` diff --git a/src/content/docs/contributing/nf_core_basic_training/nf_core_create_tool.md b/src/content/docs/contributing/nf_core_basic_training/nf_core_create_tool.md index fb2e42d1a4..18debb141a 100644 --- a/src/content/docs/contributing/nf_core_basic_training/nf_core_create_tool.md +++ b/src/content/docs/contributing/nf_core_basic_training/nf_core_create_tool.md @@ -1,6 +1,6 @@ --- title: Basic training to create an nf-core pipeline -subtitle: A guide to create Nextflow pipelines using nf-core tools +subtitle: Create a pipeline with `nf-core create` --- ## Explore nf-core/tools @@ -53,7 +53,7 @@ INFO !!!!!! IMPORTANT !!!!!! ``` Although you can provide options on the command line, it’s easiest to use the interactive prompts. For now we are assuming that we want to create a new nf-core pipeline, so we chose not to customize the template. -It is possible to use nf-core tools for non-nf-core pipelines, but the setup of such pipelines will be handled in a later chapter # ARE WE GOING TO DO THIS? +It is possible to use nf-core tools for non-nf-core pipelines, but the setup of such pipelines will not be handled in this tutorial. ### Pipeline git repo @@ -172,7 +172,6 @@ nextflow run nf-core-demotest/ -profile test,docker --outdir test_results This basic template pipeline contains already the FastQC and MultiQC modules, which do run on a selection of test data. 
- ## Customising the template In many of the files generated by the nf-core template, you’ll find code comments that look like this: @@ -358,786 +357,3 @@ nf-core lint ::: - -### Adding an existing nf-core module - -#### Identify available nf-core modules - -The nf-core pipeline template comes with a few nf-core/modules pre-installed. You can list these with the command below: - -```bash -nf-core modules list local -``` - -``` - ,--./,-. - ___ __ __ __ ___ /,-._.--~\ - |\ | |__ __ / ` / \ |__) |__ } { - | \| | \__, \__/ | \ |___ \`-._,-`-, - `._,._,' - - nf-core/tools version 2.10 - https://nf-co.re - - -INFO Modules installed in '.': - -┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓ -┃ Module Name ┃ Repository ┃ Version SHA ┃ Message ┃ Date ┃ -┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩ -│ custom/dumpsoftwareversions │ https://github.com/nf-cor… │ 911696ea0b62df80e900ef244d… │ Remove quay from │ 2023-05-04 │ -│ │ │ │ biocontainers (#3380) │ │ -│ fastqc │ https://github.com/nf-cor… │ bd8092b67b5103bdd52e300f75… │ Add singularity.registry = │ 2023-07-01 │ -│ │ │ │ 'quay.io' for tests │ │ -│ │ │ │ (#3499) │ │ -│ multiqc │ https://github.com/nf-cor… │ 911696ea0b62df80e900ef244d… │ Remove quay from │ 2023-05-04 │ -│ │ │ │ biocontainers (#3380) │ │ -└─────────────────────────────┴────────────────────────────┴─────────────────────────────┴────────────────────────────┴────────────┘ - -``` - -These version hashes and repository information for the source of the modules are tracked in the modules.json file in the root of the repo. This file will automatically be updated by nf-core/tools when you create, remove or update modules. - -Let’s see if all of our modules are up-to-date: - -```bash -nf-core modules update -``` - -``` - ,--./,-. - ___ __ __ __ ___ /,-._.--~\ - |\ | |__ __ / ` / \ |__) |__ } { - | \| | \__, \__/ | \ |___ \`-._,-`-, - `._,._,' - - nf-core/tools version 2.10 - https://nf-co.re - - -? Update all modules or a single named module? All modules -? Do you want to view diffs of the proposed changes? No previews, just update everything -INFO Updating 'nf-core/custom/dumpsoftwareversions' -INFO Updating 'nf-core/fastqc' -INFO Updating 'nf-core/multiqc' -INFO Updates complete ✨ -``` - -You can list all of the modules available on nf-core/modules via the command below but we have added search functionality to the nf-core website to do this too! - -```bash -nf-core modules list remote -``` - -#### Install a remote nf-core module - -To install a remote nf-core module from the website, you can first get information about a tool, including the installation command by executing: - -```bash -nf-core modules info salmon/index -``` - -``` - ,--./,-. 
- ___ __ __ __ ___ /,-._.--~\ - |\ | |__ __ / ` / \ |__) |__ } { - | \| | \__, \__/ | \ |___ \`-._,-`-, - `._,._,' - - nf-core/tools version 2.10 - https://nf-co.re - -╭─ Module: salmon/index ───────────────────────────────────────────────────────────────────────────────────────────────────────────╮ -│ 🌐 Repository: https://github.com/nf-core/modules.git │ -│ 🔧 Tools: salmon │ -│ 📖 Description: Create index for salmon │ -╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ -╷ ╷ -📥 Inputs │Description │Pattern -╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━╸ -genome_fasta (file) │Fasta file of the reference genome │ -╶────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────┼───────╴ -transcriptome_fasta (file)│Fasta file of the reference transcriptome │ -╵ ╵ -╷ ╷ -📤 Outputs │Description │ Pattern -╺━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━╸ -index (directory)│Folder containing the star index files │ salmon -╶───────────────────┼───────────────────────────────────────────────────────────────────────���──────────────────────────┼────────────╴ -versions (file) │File containing software versions │versions.yml -╵ ╵ - -💻 Installation command: nf-core modules install salmon/index - -``` - -:::tip{title="Exercise 4 - Identification of available nf-core modules"} - -1. Get information abou the nf-core module `salmon/quant`. -
- solution 1 - - ``` - nf-core modules info salmon/quant - ``` - -
- -2. Is there any version of `salmon/quant` already installed locally? -
- solution 2 - - ``` - nf-core modules list local - ``` - - If `salmon/quant` is not listed, there is no local version installed. - -
- ::: - -The output from the info command will among other things give you the nf-core/tools installation command, lets see what it is doing: - -```bash -nf-core modules install salmon/index -``` - -``` - - ,--./,-. - ___ __ __ __ ___ /,-._.--~\ - |\ | |__ __ / ` / \ |__) |__ } { - | \| | \__, \__/ | \ |___ \`-._,-`-, - `._,._,' - - nf-core/tools version 2.10 - https://nf-co.re - - -╭─ Module: salmon/index ───────────────────────────────────────────────────────────────────────────────────────────────────────────╮ -│ 🌐 Repository: https://github.com/nf-core/modules.git │ -│ 🔧 Tools: salmon │ -│ 📖 Description: Create index for salmon │ -╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯ - ╷ ╷ - 📥 Inputs │Description │Pattern -╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━╸ - genome_fasta (file) │Fasta file of the reference genome │ -╶────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────┼───────╴ - transcriptome_fasta (file)│Fasta file of the reference transcriptome │ - ╵ ╵ - ╷ ╷ - 📤 Outputs │Description │ Pattern -╺━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━╸ - index (directory)│Folder containing the star index files │ salmon -╶───────────────────┼───────────────────────────────────────────────────────────────────────���──────────────────────────┼────────────╴ - versions (file) │File containing software versions │versions.yml - ╵ ╵ - - 💻 Installation command: nf-core modules install salmon/index - -gitpod /workspace/basic_training/nf-core-demotest (master) $ nf-core modules install salmon/index - - ,--./,-. - ___ __ __ __ ___ /,-._.--~\ - |\ | |__ __ / ` / \ |__) |__ } { - | \| | \__, \__/ | \ |___ \`-._,-`-, - `._,._,' - - nf-core/tools version 2.10 - https://nf-co.re - - -INFO Installing 'salmon/index' -INFO Use the following statement to include this module: - - include { SALMON_INDEX } from '../modules/nf-core/salmon/index/main' -``` - -(lots of steps missing here) -exercise to add a different module would be nice! => salmon/quant! -comparison to simple nextflow pipeline from the basic Nextflow training would be nice!) - -:::tip{title="Exercise 5 - Installing a remote module from nf-core"} - -1. Install the nf-core module `adapterremoval` -
- solution 1 - - ```bash - nf-core modules install adapterremoval - ``` - -
- -2. Which file(s) were/are added and what does it / do they do? -
- solution 2 - - ``` - Installation added the module directory `/workspace/basic_training/nf-core-demotest/modules/nf-core/adapterremoval`: - . - ├── environment.yml - ├── main.nf - ├── meta.yml - └── tests - ├── main.nf.test - ├── main.nf.test.snap - └── tags.yml - - The `test` directory contains all information required to perform basic tests for the module, it rarely needs to be changed. `main.nf` is the main workflow file that contains the module code. All input and output variables of the module are described in the `meta.yml` file, whereas the `environment.yml` file contains the dependancies of the module. - ``` - -
- -3. Import the installed `adapterremoval` pipeline into your main workflow. -
- solution 3 - - ```bash title="workflows/demotest.nf" - [...] - /* - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - IMPORT MODULES / SUBWORKFLOWS / FUNCTIONS - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - */ - - include { FASTQC } from '../modules/nf-core/fastqc/main' - include { MULTIQC } from '../modules/nf-core/multiqc/main' - include { paramsSummaryMap } from 'plugin/nf-validation' - include { paramsSummaryMultiqc } from '../subworkflows/nf-core/utils_nfcore_pipeline' - include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline' - include { methodsDescriptionText } from '../subworkflows/local/utils_nfcore_demotest_pipeline' - include { ADAPTERREMOVAL } from '../modules/nf-core/adapterremoval/main' - - [...] - - ``` - -
- -4. Call the `ADAPTERREMOVAL` process in your workflow -
- solution 4 - - ```bash title="workflows/demotest.nf" - [...] - FASTQC ( - ch_samplesheet - ) - ch_multiqc_files = ch_multiqc_files.mix(FASTQC.out.zip.collect{it[1]}) - ch_versions = ch_versions.mix(FASTQC.out.versions.first()) - - // - // MODULE: ADAPTERREMOVAL - // - ADAPTERREMOVAL( - - ) - [...] - ``` - -
- -5. Add required parameters for `adapterremoval`to the `ADAPTERREMOVAL` process -
- solution 5 - - `adapterremoval` requires three input channels: `meta`, `reads` and `adapterlist`, as outlined in the `meta.yml` of the module. `meta` and `reads` are typically given in one channel as a metamap, whereas the `adapterlist` will be it's own channel for which we should give a path. See here: - - ```bash title="adapterremoval/main.nf" - [...] - input: - tuple val(meta), path(reads) - path(adapterlist) - [...] - ``` - - The meta map containing the metadata and the reads can be taken directly from the samplesheet as is the case for FastQC, therefore we can give it the input channel `ch_samplesheet`. The `adapterlist` could either be a fixed path, or a parameter that is given on the command line. For now, we will just add a dummy channel called `adapterlist` assuming that it will be a parameter given in the command line. With this, the new module call for adapterremoval looks as follows: - - ```bash title="workflows/demotest.nf" - [...] - // - // MODULE: ADAPTERREMOVAL - // - ADAPTERREMOVAL( - ch_samplesheet - params.adapterlist - ) - [...] - ``` - -
- -6. Add the input parameter `adapterlist` -
- solution 7 - In order to use `params.adapterlist` we need to add the parameter to the `nextflow.config`. - - ```bash title="nextflow.config" - /// Global default params, used in configs - params { - - /// TODO nf-core: Specify your pipeline's command line flags - /// Input options - input = null - adapterlist = null - - [...] - ``` - - Then use the `nf-core schema build` tool to have the new parameter integrated into `nextflow_schema.json`. The output should look as follows. - - ``` - gitpod /workspace/basic_training/nf-core-demotest (master) $ nf-core schema build - - ,--./,-. - ___ __ __ __ ___ /,-._.--~\ - |\ | |__ __ / ` / \ |__) |__ } { - | \| | \__, \__/ | \ |___ \`-._,-`-, - `._,._,' - - nf-core/tools version 2.13.1 - https://nf-co.re - - INFO [✓] Default parameters match schema validation - INFO [✓] Pipeline schema looks valid (found 32 params) - ✨ Found 'params.test' in the pipeline config, but not in the schema. Add to pipeline schema? [y/n]: y - ``` - - Select y on the final prompt to launch a web browser to edit your schema graphically. - -
- -7. Lint your pipeline -
- solution 7 - - ```bash - nf-core lint - ``` - -
- -8. Run the pipeline and inspect the results -
- solution 8 - - To run the pipeline, be aware that we now need to specify a file containing the adapters. As such, we create a new file called "adapterlist.txt" and add the adapter sequence "[WE NEED AN ADAPTER SEQUENCE HERE]" to it. Then we can run the pipeline as follows: - - ```bash - nextflow run nf-core-demotest/ -profile test,docker --outdir test_results --adapterlist /path/to/adapterlist.txt - - ``` - -
- -9. Commit the changes -
- solution 9 - - ```bash - git add . - git commit -m "add adapterremoval module" - ``` - -
- -::: - -### Adding a local module - -If there is no nf-core module available for the software you want to include, the nf-core tools package can also aid in the generation of a local module that is specific for your pipeline. To add a local module run the following: - -``` - -nf-core modules create - -``` - -Open ./modules/local/demo/module.nf and start customising this to your needs whilst working your way through the extensive TODO comments! - -### Making a local module for a custom script - -To generate a module for a custom script you need to follow the same steps when adding a remote module. -Then, you can supply the command for your script in the `script` block but your script needs to be present -and _executable_ in the `bin` -folder of the pipeline. -In the nf-core pipelines, -this folder is in the main directory and you can see in [`rnaseq`](https://github.com/nf-core/rnaseq). -Let's look at an publicly available example in this pipeline, -for instance [`tximport.r`](https://github.com/nf-core/rnaseq/blob/master/bin/tximport.r). -This is an Rscript present in the [`bin`](https://github.com/nf-core/rnaseq/tree/master/bin) of the pipeline. -We can find the module that runs this script in -[`modules/local/tximport`](https://github.com/nf-core/rnaseq/blob/master/modules/local/tximport/main.nf). -As we can see the script is being called in the `script` block, note that `tximport.r` is -being executed as if it was called from the command line and therefore needs to be _executable_. - -
- -

TL;DR

- -1. Write your script on any language (python, bash, R, - ruby). E.g. `maf2bed.py` -2. If not there yet, move your script to `bin` folder of - the pipeline and make it - executable (`chmod +x `) -3. Create a module with a single process to call your script from within the workflow. E.g. `./modules/local/convert_maf2bed/main.nf` -4. Include your new module in your workflow with the command `include {CONVERT_MAF2BED} from './modules/local/convert_maf2bed/main'` that is written before the workflow call. -
- -_Tip: Try to follow best practices when writing a script for -reproducibility and maintenance purposes: add the -shebang (e.g. `#!/usr/bin/env python`), and a header -with description and type of license._ - -### 1. Write your script - -Let's create a simple custom script that converts a MAF file to a BED file called `maf2bed.py` and place it in the bin directory of our nf-core-testpipeline:: - -``` - -#!/usr/bin/env python -"""bash title="maf2bed.py" -Author: Raquel Manzano - @RaqManzano -Script: Convert MAF to BED format keeping ref and alt info -License: MIT -""" -import argparse -import pandas as pd - -def argparser(): -parser = argparse.ArgumentParser(description="") -parser.add_argument("-maf", "--mafin", help="MAF input file", required=True) -parser.add_argument("-bed", "--bedout", help="BED input file", required=True) -parser.add_argument( -"--extra", help="Extra columns to keep (space separated list)", nargs="+", required=False, default=[] -) -return parser.parse_args() - -def maf2bed(maf_file, bed_file, extra): -maf = pd.read_csv(maf_file, sep="\t", comment="#") -bed = maf[["Chromosome", "Start_Position", "End_Position"] + extra] -bed.to_csv(bed_file, sep="\t", index=False, header=False) - -def main(): -args = argparser() -maf2bed(maf_file=args.mafin, bed_file=args.bedout, extra=args.extra) - -if **name** == "**main**": -main() - -``` - -### 2. Make sure your script is in the right folder - -Now, let's move it to the correct directory and make sure it is executable: - -```bash -mv maf2bed.py /path/where/pipeline/is/bin/. -chmod +x /path/where/pipeline/is/bin/maf2bed.py -``` - -### 3. Create your custom module - -Then, let's write our module. We will call the process -"CONVERT_MAF2BED" and add any tags or/and labels that -are appropriate (this is optional) and directives (via -conda and/or container) for -the definition of dependencies. - -
- -

Some additional infos that might be of interest

- -
-More info on labels -A `label` will -annotate the processes with a reusable identifier of your -choice that can be used for configuring. E.g. we use the -`label` 'process_single', this looks as follows: - -``` - -withLabel:process_single { -cpus = { check_max( 1 _ task.attempt, 'cpus' ) } -memory = { check_max( 1.GB _ task.attempt, 'memory') } -time = { check_max( 1.h \* task.attempt, 'time' ) } -} - -``` - -
- -
-More info on tags - -A `tag` is simple a user provided identifier associated to -the task. In our process example, the input is a tuple -comprising a hash of metadata for the maf file called -`meta` and the path to the `maf` file. It may look -similar to: `[[id:'123', data_type:'maf'], -/path/to/file/example.maf]`. Hence, when nextflow makes -the call and `$meta.id` is `123` name of the job -will be "CONVERT_MAF2BED(123)". If `meta` does not have -`id` in its hash, then this will be literally `null`. - -
- -
-More info on conda/container directives - -The `conda` directive allows for the definition of the -process dependencies using the [Conda package manager](https://docs.conda.io/en/latest/). Nextflow automatically sets up an environment for the given package names listed by in the conda directive. For example: - -``` - -process foo { -conda 'bwa=0.7.15' - -''' -your_command --here -''' -} - -``` - -Multiple packages can be specified separating them with a blank space e.g. `bwa=0.7.15 samtools=1.15.1`. The name of the channel from where a specific package needs to be downloaded can be specified using the usual Conda notation i.e. prefixing the package with the channel name as shown here `bioconda::bwa=0.7.15` - -``` - -process foo { -conda 'bioconda::bwa=0.7.15 bioconda::samtools=1.15.1' - -''' -your_bwa_cmd --here -your_samtools_cmd --here -''' -} - -``` - -Similarly, we can apply the `container` directive to execute the process script in a [Docker](http://docker.io/) or [Singularity](https://docs.sylabs.io/guides/3.5/user-guide/introduction.html) container. When running Docker, it requires the Docker daemon to be running in machine where the pipeline is executed, i.e. the local machine when using the local executor or the cluster nodes when the pipeline is deployed through a grid executor. - -``` - -process foo { -conda 'bioconda::bwa=0.7.15 bioconda::samtools=1.15.1' -container 'dockerbox:tag' - -''' -your_bwa_cmd --here -your_samtools_cmd --here -''' -} - -``` - -Additionally, the `container` directive allows for a more sophisticated choice of container and if it Docker or Singularity depending on the users choice of container engine. This practice is quite common on official nf-core modules. - -``` - -process foo { -conda "bioconda::fastqc=0.11.9" -container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? -'https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0' : -'biocontainers/fastqc:0.11.9--0' }" - -''' -your_fastqc_command --here -''' -} - -``` - -
- -
- -Since `maf2bed.py` is in the `bin` directory we can directory call it in the script block of our new module `CONVERT_MAF2BED`. You only have to be careful with how you call variables (some explanations on when to use `${variable}` vs. `$variable`): -A process may contain any of the following definition blocks: directives, inputs, outputs, when clause, and the process script. Here is how we write it: - -``` -process CONVERT_MAF2BED { -// HEADER -tag "$meta.id" - label 'process_single' - // DEPENDENCIES DIRECTIVES - conda "anaconda::pandas=1.4.3" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? -'https://depot.galaxyproject.org/singularity/pandas:1.4.3' : -'quay.io/biocontainers/pandas:1.4.3' }" - -// INPUT BLOCK -input: -tuple val(meta), path(maf) - -// OUTPUT BLOCK -output: -tuple val(meta), path('\*.bed') , emit: bed -path "versions.yml" , emit: versions - -// WHEN CLAUSE -when: -task.ext.when == null || task.ext.when - -// SCRIPT BLOCK -script: // This script is bundled with the pipeline in bin -def args = task.ext.args ?: '' -def prefix = task.ext.prefix ?: "${meta.id}" - -""" -maf2bed.py --mafin $maf --bedout ${prefix}.bed -""" -} -``` - -More on nextflow's process components in the [docs](https://www.nextflow.io/docs/latest/process.html). - -### Include your module in the workflow - -In general, we will call out nextflow module `main.nf` and save it in the `modules` folder under another folder called `conver_maf2bed`. If you believe your custom script could be useful for others and it is potentially reusable or calling a tool that is not yet present in nf-core modules you can start the process of making it official adding a `meta.yml` [explained above](#adding-modules-to-a-pipeline). In the `meta.yml` The overall tree for the pipeline skeleton will look as follows: - -``` - -pipeline/ -├── bin/ -│ └── maf2bed.py -├── modules/ -│ ├── local/ -│ │ └── convert_maf2bed/ -│ │ ├── main.nf -│ │ └── meta.yml -│ └── nf-core/ -├── config/ -│ ├── base.config -│ └── modules.config -... - -``` - -To use our custom module located in `./modules/local/convert_maf2bed` within our workflow, we use a module inclusions command as follows (this has to be done before we invoke our workflow): - -```bash title="workflows/demotest.nf" -include { CONVERT_MAF2BED } from './modules/local/convert_maf2bed/main' -workflow { -input_data = [[id:123, data_type='maf'], /path/to/maf/example.maf] -CONVERT_MAF2BED(input_data) -} -``` - -:::tip{title="Exercise 6 - Adding a custom module"} -In the directory `exercise_6` you will find the custom script `print_hello.py`, which will be used for this and the next exercise. - -1. Create a local module that runs the `print_hello.py` script -2. Add the module to your main workflow -3. Run the pipeline -4. Lint the pipeline -5. Commit your changes -
<details>
<summary>solution 1</summary>
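One possible solution for step 1, sketched under the assumption that `print_hello.py` takes no arguments, writes its greeting to stdout, and has been copied into the pipeline's `bin/` directory and made executable:

```groovy title="modules/local/print_hello/main.nf"
process PRINT_HELLO {
    label 'process_single'

    // illustrative placeholder; any environment providing Python works
    conda "conda-forge::python=3.9"

    output:
    path 'hello.txt', emit: txt

    when:
    task.ext.when == null || task.ext.when

    script:
    """
    print_hello.py > hello.txt
    """
}
```

For step 2, add `include { PRINT_HELLO } from '../modules/local/print_hello/main'` to your main workflow file and call `PRINT_HELLO()` inside the `workflow` block; then run, lint, and commit as usual. A fully nf-core-style module would also emit a `versions.yml`, as shown for `CONVERT_MAF2BED` above.

</details>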
:::

### Further reading and additional notes

#### What happens if I want to use containers, but there is no image containing the packages I need?

No worries, this can be done fairly easily thanks to [BioContainers](https://biocontainers-edu.readthedocs.io/en/latest/what_is_biocontainers.html), see the instructions [here](https://github.com/BioContainers/multi-package-containers). If you see the combination that you need in the repo, you can also use [this website](https://midnighter.github.io/mulled) to find out the "mulled" name of this container.

#### I want to know more about software dependencies!

You are in luck, we have more documentation [here](https://nf-co.re/docs/contributing/modules#software-requirements).

#### I want to know more about modules!

See more info about modules in the Nextflow [docs](https://www.nextflow.io/docs/latest/module.html).

## Lint all modules

As well as the pipeline template, you can lint individual or all modules with a single command:

```bash
nf-core modules lint --all
```

:::note

### Key points

- `nf-core create <pipeline name>` creates a pipeline from the nf-core template.
- `nf-core lint` lints the pipeline code for things that must be completed.
- `nf-core modules list local` lists modules currently installed in your pipeline.
- `nf-core modules list remote` lists modules available to install into your pipeline.
- `nf-core modules install <tool>` installs the tool module into your pipeline.
- `nf-core modules create` creates a module locally, to add custom code to your pipeline.
- `nf-core modules lint --all` lints your module code for things that must be completed.
- `nf-core schema build` opens an interface that allows you to describe your pipeline parameters, set default values, and define which values are valid.
:::

diff --git a/src/content/docs/contributing/nf_core_basic_training/nf_schema.md b/src/content/docs/contributing/nf_core_basic_training/nf_schema.md
new file mode 100644
index 0000000000..4bde325e93
--- /dev/null
+++ b/src/content/docs/contributing/nf_core_basic_training/nf_schema.md
@@ -0,0 +1,64 @@
---
title: Basic training to create an nf-core pipeline
subtitle: Working with Nextflow schema
---

## Nextflow Schema

All nf-core pipelines can be run with `--help` to see usage instructions. We can try this with the demo pipeline that we just created:

```bash
cd ../
nextflow run nf-core-demo/ --help
```

### Working with Nextflow schema

If you peek inside the `nextflow_schema.json` file you will see that it is quite an intimidating thing. The file is large and complex, and very easy to break if edited manually.

Thankfully, we provide a user-friendly tool for editing this file: `nf-core schema build`.

To see this in action, let's add some new parameters to `nextflow.config`:

```groovy
params {
    demo = 'param-value-default'
    foo = null
    bar = false
    baz = 12
    // ...
}
// rest of the config file...
```

Then run `nf-core schema build`:

```bash
cd nf-core-demo/
nf-core schema build
```

The CLI tool should then prompt you to add each new parameter. Here in the schema editor you can edit:

- Description and help text
- Type (string / boolean / integer etc.)
- Grouping of parameters
- Whether a parameter is required, or hidden from help by default
- Enumerated values (choose from a list)
- Min / max values for numeric types
- Regular expressions for validation
- Special formats for strings, such as file-path
- Additional fields for files, such as mime-type

:::tip{title="Exercise 7 - Using the Nextflow schema to add command line parameters"}

1. Feed a string to your custom script from exercise 6 from the command line. Use `nf-core schema build` to add the parameter to the `nextflow.config` file.

:::
diff --git a/src/content/docs/contributing/nf_core_basic_training/template_walk_through.md b/src/content/docs/contributing/nf_core_basic_training/template_walk_through.md
index 6bed8a9529..21f7d67f97 100644
--- a/src/content/docs/contributing/nf_core_basic_training/template_walk_through.md
+++ b/src/content/docs/contributing/nf_core_basic_training/template_walk_through.md
@@ -1,6 +1,6 @@
 ---
 title: Basic training to create an nf-core pipeline
-subtitle: A guide to create Nextflow pipelines using nf-core tools
+subtitle: Exploring the nf-core template files
 ---

 ### Template code walk through