Docker + GPUs in Google HPC toolkit? #1622
-
Here is an update based on a direct discussion. We previously integrated enroot and Pyxis and successfully ran CPU jobs via Slurm on Debian 11. However, the CUDA Toolkit is not initialized properly inside the containerized environment, which makes the GPU unavailable. I just tested an alternative approach using Docker and the NVIDIA Container Toolkit and saw initial success outside of a Slurm job environment. I will update here when I have further news about execution within Slurm.
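For anyone who wants to repeat the Docker + NVIDIA Container Toolkit check outside of Slurm, here is a minimal sketch of what such a test could look like. It is not the exact command used above. The image tag and the helper names `gpu_check_command` and `run_gpu_check` are assumptions for illustration; `--gpus all` is the standard `docker run` flag for exposing GPUs once the NVIDIA Container Toolkit is installed.

```python
import shutil
import subprocess

# Assumed placeholder image; substitute any CUDA-enabled image you actually use.
DEFAULT_IMAGE = "nvidia/cuda:12.2.0-base-ubuntu22.04"


def gpu_check_command(image=DEFAULT_IMAGE, gpus="all"):
    """Build a `docker run` command that runs nvidia-smi inside a container."""
    return ["docker", "run", "--rm", "--gpus", gpus, image, "nvidia-smi"]


def run_gpu_check(image=DEFAULT_IMAGE):
    """Run the check.

    Returns True if nvidia-smi exits with status 0 inside the container,
    False if it fails, and None if the docker CLI is not on PATH.
    """
    if shutil.which("docker") is None:
        return None
    result = subprocess.run(
        gpu_check_command(image), capture_output=True, text=True
    )
    return result.returncode == 0
```

Calling `run_gpu_check()` on a GPU node should return `True` if the container can see the GPU. Running the same check from inside a Slurm job would be the next step.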
-
@yaroslavvb I believe this can now be closed with the recommendation to either create a docker.socket file that is writable by your users or add the users to a docker group. Please feel free to reopen this discussion if this is not the case. Thanks!
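To verify that either option took effect for a given user, something like the following sketch may help. It assumes Docker's default socket path `/var/run/docker.sock` and a group named `docker`; both may differ on your image, and the helper names are hypothetical.

```python
import grp
import os
import pwd

# Assumed default Docker socket path; adjust if your setup differs.
DOCKER_SOCKET = "/var/run/docker.sock"


def user_in_group(user, group_name="docker"):
    """Return True if `user` belongs to `group_name` (primary or supplementary)."""
    try:
        group = grp.getgrnam(group_name)
    except KeyError:
        return False  # the group does not exist on this host
    if user in group.gr_mem:
        return True
    try:
        # Also accept the case where the group is the user's primary group.
        return pwd.getpwnam(user).pw_gid == group.gr_gid
    except KeyError:
        return False  # unknown user


def can_use_docker(socket_path=DOCKER_SOCKET):
    """Return True if the current process can read and write the Docker socket."""
    return os.access(socket_path, os.R_OK | os.W_OK)
```

Note that a new group membership only applies to sessions started after the change, so `can_use_docker()` can still return `False` in an existing login shell even when `user_in_group(...)` is already `True`.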
-
Machine learning often needs Docker + GPU. Any tips on how to get this kind of configuration in the HPC toolkit?
Here's how Oracle does it: https://github.com/oracle-quickstart/oci-hpc