Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Template: Add gpu profile #3272

Open
wants to merge 5 commits into
base: dev
Choose a base branch
from
Open

Conversation

mashehu
Copy link
Contributor

@mashehu mashehu commented Nov 7, 2024

No description provided.

@mashehu mashehu requested a review from sateeshperi November 7, 2024 13:26
Copy link

codecov bot commented Nov 7, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 75.72%. Comparing base (9412504) to head (4c88397).
Report is 13 commits behind head on dev.

Additional details and impacted files

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mashehu mashehu requested a review from nictru November 7, 2024 15:37
nictru and others added 2 commits November 7, 2024 16:38
Co-authored-by: Sateesh_Peri <33637490+sateeshperi@users.noreply.github.com>
Co-authored-by: Sateesh_Peri <33637490+sateeshperi@users.noreply.github.com>
@nictru
Copy link

nictru commented Nov 7, 2024

So the flow of information in the scdownstream pipeline looks as following:

  1. Users provide the gpu profile
  2. The gpu profile prepares the containerization tools for GPU mounting and sets the hidden pipeline parameter use_gpu to true -> can be used for if-clauses in workflows
  3. Processes with GPU support get the process_gpu label -> can be used to handle all GPU-enabled processes in a certain way. Different executors need different tweaking to handle tasks from these processes correctly. I added a bit of documentation here.
  4. This section makes sure that processes have an ext variable which reflects if they should use GPU or not. This can be useful if the same module supports usage both with and without GPU. Example: cellbender

This implementation is the best I could come up with so far, but not many really "senior" people have had a look at it AFAIK. Maybe we can discuss which elements of the suggested approach should be part of the template and where we can perform some improvements.

@nictru
Copy link

nictru commented Nov 7, 2024

Another thought: Should this not be an optional part of the template?

@mashehu
Copy link
Contributor Author

mashehu commented Nov 7, 2024

it's teeny-tiny enough, that we might keep it in, imo, similar to arm... but also not 100% sure

@nictru
Copy link

nictru commented Nov 7, 2024

But is the goal to only add the profile to the template?

I think it would at least be nice to have a common structure for all GPU-enabled modules, even if it only means adding the process_gpu label

@sateeshperi sateeshperi requested a review from GallVp November 7, 2024 18:12
@GallVp
Copy link
Member

GallVp commented Nov 7, 2024

Thank you @nictru

  1. Users provide the gpu profile

The gpu profile being added to the template does not have the use_gpu configuration variable. I can see that you have added it to the profile in the pipeline: https://github.com/nf-core/scdownstream/blob/3231971f309d1ac025e7180b69852c3637c975dd/nextflow.config#L182

I don't think we need a separate configuration variable.

  1. The gpu profile prepares the containerization tools for GPU mounting and sets the hidden pipeline parameter use_gpu to true -> can be used for if-clauses in workflows

In the scdownstream pipeline, use_gpu is defined as a pipeline parameter (https://github.com/nf-core/scdownstream/blob/3231971f309d1ac025e7180b69852c3637c975dd/nextflow.config#L40) and a configuration variable (https://github.com/nf-core/scdownstream/blob/3231971f309d1ac025e7180b69852c3637c975dd/nextflow.config#L182). These two are different so setting gpu in the -profile, will not set the use_gpu pipeline parameter. It will only set the configuration variable.

I think we only need the use_gpu pipeline parameter and should get rid of the use_gpu configuration variable. The use_gpu label in the base.config file can be changed to,

process {
    withLabel: 'use_gpu' {
        ext.use_gpu = { params.use_gpu }
    }
}

At pipeline execution, the user must set the gpu profile and the use_gpu pipeline parameter to utilise the GPU-based tools.

  1. Processes with GPU support get the process_gpu label -> can be used to handle all GPU-enabled processes in a certain way. Different executors need different tweaking to handle tasks from these processes correctly. I added a bit of documentation here.

Thank you. This is very helpful.

  1. This section makes sure that processes have an ext variable which reflects if they should use GPU or not. This can be useful if the same module supports usage both with and without GPU. Example: cellbender

This is quite clever. Nice!

@nictru
Copy link

nictru commented Nov 8, 2024

Hey @GallVp, thanks for the input.

Some notes:

It might sound stupid, but I did not know there was a difference between pipeline parameters and configuration variables. I agree that we should only have one.

I am not fully aware about what you can and can't do with configuration variables. I also was not able to find any documentation about them at all. I only used the configuration variable in scdownstream. Not sure if one can use configuration variables for workflow if clauses?

  • If no, then the pipeline parameter should be preferred, but it should still be set to true by the gpu profile and hidden from the parameter documentation
  • If yes, I think then the configuration variable is better, because it prevents users from using one without the other

At pipeline execution, the user must set the gpu profile and the use_gpu pipeline parameter to utilise the GPU-based tools.

I would prefer if setting the gpu profile would be sufficient. Keeping them separate will probably lead to a lot of users using one without the other, which almost certainly will lead to errors. I am not aware of any use cases where this would be an advantage, but I'm eager to hear.

Copy link

@nictru nictru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not approving yet, because we should first decide what the implementation should look like and add all necessary parts

@GallVp
Copy link
Member

GallVp commented Nov 9, 2024

Not sure if one can use configuration variables for workflow if clauses?

No, configuration variables are not available in the workflows. Maybe, there is a backdoor that I am not aware of.

How about we get rid of both the variable and the parameter? Instead, we modify the profile as,

gpu {
    process {
        withLabel: process_gpu {
            ext.use_gpu = true
        }
    }
    docker.runOptions = '-u $(id -u):$(id -g) --gpus all'
    apptainer.runOptions = '--nv'
    singularity.runOptions = '--nv'
}

The workflows can use workflow.profile.contains('gpu') in place of the use_gpu parameter.

The above solution, however, does not take care of the arm profile. So, we need to have a gpu_arm profile as well.

@nictru
Copy link

nictru commented Nov 13, 2024

Hey, I just tested the suggested approach and it seems to work as expected.
I think we could do it like this, but I would prefer having the withLabel block in the base.config - just to make sure that all the withLabel configs are collected in one place.

I did this using the following:

withLabel: process_gpu {
    ext.use_gpu = {workflow.profile.contains('gpu')}
}

So that we don't need a config variable. But this is more a cosmetic topic and not a hill-to-die-on.

@mashehu
Copy link
Contributor Author

mashehu commented Nov 13, 2024

feel free to add the changes, @nictru

Co-authored-by: Nico Trummer <nictru32@gmail.com>
@GallVp
Copy link
Member

GallVp commented Nov 13, 2024

I did this using the following:

withLabel: process_gpu {
    ext.use_gpu = { workflow.profile.contains('gpu') }
}

I agree. This is more in line with the existing infrastructure. Thank you!

Copy link

@nictru nictru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy now :)

@ewels ewels modified the milestones: 3.1, 3.2 Nov 21, 2024
@ewels
Copy link
Member

ewels commented Nov 21, 2024

Needs opt-in functionality in the nf-core pipelines create. See also #3261

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants