-
i've used NVIDIA GPUs and the GPU operator with OCP but not with OKD/Fedora CoreOS. when the operator is deployed to a cluster and configured properly, it will attempt to build drivers for the GPU on each node that needs them. on OCP this requires build entitlements for the nodes that will do the compiling. i don't know for sure, but i suspect this compiling process might break down on FCOS, because the operator will be looking for specific packages to install so that it can complete the compilation. it may work, but i have a feeling there will need to be some way for the operator to identify an FCOS host and then set up the appropriate build packages to make the driver.
-
No entitlements are needed.
https://hackmd.io/-vpQKhC8SVmDewmJLQgGnw
- References:
- OpenShift Commons Briefing Youtube: bit.ly/ocbgpu
- OpenShift Commons Briefing Deck: bit.ly/GPUONOPENSHIFT
On Tue, 28 Sept 2021 at 20:08, Michael McCune ***@***.***> wrote:
i've used the NVidia GPUs and operator with OCP but not with OKD/Fedora
CoreOS. when the operator is deployed to a cluster, and configured
properly, it will attempt to build drivers for the GPU on each node that
needs it. on OCP this requires build entitlements for the nodes that will
do the compiling. i don't know for sure, but i have a suspicion that this
compiling process might break down on FCOS because the operator will be
looking for specific packages to install so that it can complete the
compilation.
it *may* work, but i have a feeling there will need to be some way for
the operator to identify an FCOS host and then setup the appropriate build
packages to make the driver.
-
Thanks for the links as well, I'll check them later.
-
Hello, Resolving RHEL version...
I hacked the /etc/os-release file on the node, but the pod still wasn't able to compile the driver. Unfortunately, as I see in the link you sent, that approach involves rebuilding the image for the driver pod. Thank you once again for your answers. Best regards,
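For anyone debugging the same step, here is a minimal Python sketch of how a tool could tell a Fedora CoreOS host apart from an RHCOS one by reading os-release fields. This is not the operator's actual logic; the function names and the two sample files are hypothetical, and the field values (`ID=fedora` with `VARIANT_ID=coreos`, `ID="rhcos"`) are assumptions about what those hosts typically report, so check them against your own nodes.

```python
import shlex


def parse_os_release(text):
    """Parse os-release style KEY=VALUE lines into a dict (quotes removed)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex handles the shell-style quoting used in os-release values.
        parts = shlex.split(raw)
        fields[key] = parts[0] if parts else ""
    return fields


def classify_host(fields):
    """Hypothetical helper: label a host from its parsed os-release fields."""
    if fields.get("ID") == "fedora" and fields.get("VARIANT_ID") == "coreos":
        return "fcos"
    if fields.get("ID") == "rhcos":
        return "rhcos"
    return "other"


# Illustrative samples only (assumed values, not copied from a real node).
FCOS_SAMPLE = 'NAME="Fedora Linux"\nID=fedora\nVERSION_ID=34\nVARIANT_ID=coreos\n'
RHCOS_SAMPLE = 'NAME="Red Hat Enterprise Linux CoreOS"\nID="rhcos"\nID_LIKE="rhel fedora"\n'

fcos_kind = classify_host(parse_os_release(FCOS_SAMPLE))
rhcos_kind = classify_host(parse_os_release(RHCOS_SAMPLE))
```

To try it on a node, read `/etc/os-release` into a string and pass it to `parse_os_release`. Editing that file by hand only changes what such a check reports; it does not make matching kernel headers or build packages available, which fits the failure described above.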
-
Hi,
Do you have experience with using an NVIDIA GPU (model GV100GL [Tesla V100 PCIe 16GB]) in OKD 4 with Fedora CoreOS? I have one physical node connected to a 3.11 cluster with the NVIDIA driver installed manually, and I have to upgrade the environment to version 4. I've just found that there is an NVIDIA GPU Operator available in OperatorHub, which is supported by Red Hat in OpenShift, but I have not tried it yet in OKD 4.
Thank you for answers and suggestions.
Best regards,
Chris