The MPIJob EFA example here doesn't apply cleanly. It fails with the following error:
Error from server (BadRequest): error when creating "mpijob.yaml": MPIJob in version "v2beta1" cannot be handled as a MPIJob: strict decoding error: unknown field "spec.mpiReplicaSpecs.launcher.template.spec.imagePullPolicy", unknown field "spec.mpiReplicaSpecs.worker.template.spec.imagePullPolicy"
The issue is that imagePullPolicy must be specified on each container, not on the pod spec. Changing the launcher (and likewise the worker) to read like this allows the manifest to apply:
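For reference, a minimal sketch of where the field belongs in a pod template. This is not the example's actual manifest: the container name, image, and the `Always` value are placeholders I'm assuming for illustration, and only the position of imagePullPolicy matters.

```yaml
# Fragment of a replica's pod template (launcher or worker).
# Names and values below are hypothetical placeholders.
template:
  spec:
    # imagePullPolicy is NOT valid at this level (pod spec).
    containers:
      - name: example-container              # hypothetical name
        image: example.com/mpi-image:latest  # hypothetical image
        imagePullPolicy: Always              # valid here, on the container; value is an assumption
```

The same change would apply to both replica specs named in the error, since each has its own pod template.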
Edit: even with this fix, I'm unable to get the job running. The launcher's connection to the worker on port 22 (SSH) fails: Connection reset by 172.17.5.245 port 22.