Kubernetes cluster with both GPU and non-GPU nodes
- The GPU worker nodes must have the NVIDIA drivers and nvidia-docker 2.0 installed.
nvidia-container-runtime must be configured as the default runtime for Docker. Set the default runtime in /etc/docker/daemon.json:
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
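Docker only picks up changes to daemon.json after a restart, so it can help to check the file before restarting. The following is a minimal sketch, not part of the original setup: the helper name has_nvidia_default_runtime is hypothetical, and the check is a plain text match rather than a full JSON parse.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given daemon.json
# sets "nvidia" as the default Docker runtime.
has_nvidia_default_runtime() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1"
}

# Usage on a GPU node (assumes systemd and root access):
#   has_nvidia_default_runtime /etc/docker/daemon.json && sudo systemctl restart docker
#   docker info | grep -i 'default runtime'
```

After the restart, `docker info` should report nvidia as the default runtime if the configuration was applied.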
- Add a label and a taint to the GPU worker nodes, so that only workloads that tolerate the taint are scheduled on them
kubectl label nodes <node-gpu> app=gpu
kubectl taint nodes <node-gpu> app=gpu:NoSchedule
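With the taint above in place, a pod lands on a GPU node only if it tolerates app=gpu:NoSchedule, and the node selector keeps it off non-GPU nodes. The sketch below writes an illustrative pod manifest. The file name gpu-pod-example.yaml, the pod name, and the <cuda-image> placeholder are assumptions, not taken from the repository. The nvidia.com/gpu resource is the one advertised by the NVIDIA device plugin.

```shell
#!/bin/sh
# Write a hypothetical pod manifest that targets the labeled and tainted GPU nodes.
cat > gpu-pod-example.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: gpu-example
spec:
  restartPolicy: Never
  # Matches the label set with: kubectl label nodes <node-gpu> app=gpu
  nodeSelector:
    app: gpu
  # Matches the taint set with: kubectl taint nodes <node-gpu> app=gpu:NoSchedule
  tolerations:
    - key: app
      operator: Equal
      value: gpu
      effect: NoSchedule
  containers:
    - name: cuda
      # Placeholder: replace with a CUDA-enabled image of your choice
      image: <cuda-image>
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF
```

Once the image placeholder is filled in and the device plugin is installed, the manifest can be applied with `kubectl apply -f gpu-pod-example.yaml`.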
- Install the nvidia-device-plugin on the GPU worker nodes
kubectl apply -f https://raw.githubusercontent.com/thangtq710/gpu-k8s/master/files/nvidia-device-plugin.yml
- Verify the setup by running a test deployment
kubectl apply -f https://raw.githubusercontent.com/thangtq710/gpu-k8s/master/files/deployment-gpu-test.yml
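To confirm the result, it may be useful to check that the GPU nodes advertise GPUs and that the test pods were scheduled on them. This is a rough sketch: the function names are hypothetical, and the pods are assumed to run in the current namespace.

```shell
#!/bin/sh
# List every node with the number of GPUs it reports as allocatable.
# Nodes without a working device plugin show <none>.
list_gpu_nodes() {
    kubectl get nodes \
        -o 'custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Show each pod together with the node it was scheduled on.
list_pods_with_nodes() {
    kubectl get pods -o wide
}

# Usage:
#   list_gpu_nodes
#   list_pods_with_nodes
```

The test pods should appear only on nodes that report a GPU count.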