Implement parallel startup with startup intervals between multiple tasks in multirun mode #3002

Open
Jinshijiming opened this issue Jan 6, 2025 · 0 comments
Labels
enhancement Enhancement request

Jinshijiming commented Jan 6, 2025

🚀 Feature Request

Implement parallel startup with startup intervals between multiple tasks in multirun mode

Motivation

Neural network training requires loading the model and dataset onto a GPU.
We choose the GPU for each task by sorting the GPUs by their available resources.
Loading takes time, so when parallel tasks start simultaneously, each task's selection routine runs before the memory usage of the other tasks becomes visible. The tasks are therefore likely to choose the same GPU, which can cause an out-of-memory error.
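As a rough illustration of the requested behavior, here is a minimal Python sketch of a parallel launch with a fixed delay between task starts. It is not the project's API: `launch_staggered`, `start_interval`, `max_workers`, and the `train` placeholder are all hypothetical names chosen for this example, and it uses threads from the standard library purely for brevity.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Optional, Sequence


def launch_staggered(
    tasks: Sequence[Callable[[], Any]],
    start_interval: float,
    max_workers: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Any]:
    """Run tasks in parallel, pausing start_interval seconds between submissions.

    Hypothetical helper for illustration only. By default there is one worker
    per task, so every task starts as soon as it is submitted.
    """
    workers = max_workers if max_workers is not None else max(1, len(tasks))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = []
        for index, task in enumerate(tasks):
            if index > 0:
                # Give the previous task time to claim its GPU memory
                # before the next task inspects the available resources.
                sleep(start_interval)
            futures.append(pool.submit(task))
        # Collect results in submission order.
        return [future.result() for future in futures]


if __name__ == "__main__":
    def train(job_id: int) -> int:
        # Placeholder for "pick the freest GPU, then load model and data".
        time.sleep(0.2)
        return job_id

    jobs = [lambda i=i: train(i) for i in range(3)]
    print(launch_staggered(jobs, start_interval=0.1))
```

The interval only helps if it is longer than the time a task needs to allocate its GPU memory. If `max_workers` is smaller than the number of tasks, queued tasks start whenever a worker becomes free, so the spacing is then guaranteed only between submissions.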

Jinshijiming added the enhancement (Enhancement request) label Jan 6, 2025
Jinshijiming changed the title from "In multirun mode, implement parallel startup with startup intervals between multiple tasks" to "Implement parallel startup with startup intervals between multiple tasks in multirun mode" Jan 20, 2025