Advance csi-node-driver-registrar version to 1.1.0 #248
Conversation
Note that this commit also has a side artifact in the form of a /sys mount change made by the kustomize pass in a file that is not relevant to this PR.
So the missing parts were a side effect of generating and adding files in overlapping branches.
Joy came too early: the first impression that "it works now" was based on observing that the pod did not enter the CrashLoop state. Instead, the driver-registrar container keeps retrying without a timeout and never reaches a functional state.
The problem we see in this deployment trial is similar to what is reported here: SELinux=enabled has been pointed out as a trigger for the connection failure.
qOlev Kartau <notifications@github.com> writes:
In a sense the new semantics are not so good, because they hide the "cannot
connect to socket" problem.
It would be worthwhile to file a feature request:
- implement a readiness probe for the sidecar
- return "ready" only once connected to the driver
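The feature request above could be sketched roughly like this. This is a minimal Go sketch under stated assumptions, not the sidecar's actual code: the handler path, the socket path, and the in-memory probe helper are all hypothetical, chosen only to show "ready" being reported after the driver connection exists.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// connected flips to true once the registrar has reached the driver socket.
var connected atomic.Bool

// readyz answers a readiness probe: 200 only after the connection to the
// CSI driver socket has been established, 503 while still retrying.
func readyz(w http.ResponseWriter, r *http.Request) {
	if connected.Load() {
		w.WriteHeader(http.StatusOK)
		fmt.Fprint(w, "ok")
		return
	}
	http.Error(w, "not connected to driver socket", http.StatusServiceUnavailable)
}

// probe issues one in-memory request against readyz and returns the
// HTTP status code, standing in for the kubelet's periodic probe.
func probe() int {
	rec := httptest.NewRecorder()
	readyz(rec, httptest.NewRequest(http.MethodGet, "/readyz", nil))
	return rec.Code
}

func main() {
	fmt.Println("before connect:", probe()) // 503
	connected.Store(true)                   // simulate the driver socket coming up
	fmt.Println("after connect:", probe())  // 200
}
```

With such a probe, an indefinitely retrying sidecar would still surface the "cannot connect to socket" problem to Kubernetes instead of hiding it.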
But how come only driver-registrar has a problem with that?
Force-pushed from 610ec49 to 0206d95
Force-pushed from 0206d95 to f2b0d82
Although the deployment issue which prompted this trial turned out to be caused by a different misconfiguration, we can still consider this PR an independent contribution.
Looks good. I also verified with the Kubernetes-CSI WG that csi-node-driver-registrar is indeed compatible with 1.13, and that the resulting merge leaves the deployment files in a consistent state (merge manually, run make test-kustomize).
One deployment trial shows that driver deployment fails with version 1.0.2 but succeeds with 1.1.0.
In the failing case, node-driver-registrar fails to connect to the local CSI socket, times out after 60s, and causes the pod to exit and CrashLoop. For a reason that is still not explained, this scenario repeats multiple times.
There is an explanation for why the connection can take longer the first time: the node driver is in turn waiting to register with the controller, and the controller is still in its starting stage.
But such a wait should not happen on the second start of the node pod.
In 1.1.0 the timeout and exit were removed, and driver-registrar keeps trying. That seems to help in the current deployment case.
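The behavior change described above can be sketched as follows. This is a Go sketch with hypothetical helper names and a simulated dial function, not the registrar's actual implementation; it only contrasts a bounded retry (1.0.2-style, which exits and lets the pod CrashLoop) with an unbounded retry (1.1.0-style).

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// connectAfter simulates dialing the CSI socket: it fails until the
// driver has "come up" after readyAt attempts (a hypothetical threshold).
func connectAfter(readyAt int) func() error {
	attempt := 0
	return func() error {
		attempt++
		if attempt >= readyAt {
			return nil
		}
		return errors.New("connect: no such file or directory")
	}
}

// retryWithDeadline models the 1.0.2 behavior: give up after maxAttempts,
// which makes the container exit and the pod enter CrashLoopBackOff.
func retryWithDeadline(dial func() error, interval time.Duration, maxAttempts int) error {
	for i := 0; i < maxAttempts; i++ {
		if err := dial(); err == nil {
			return nil
		}
		time.Sleep(interval)
	}
	return errors.New("timed out waiting for driver socket")
}

// retryForever models the 1.1.0 behavior: keep trying until connected,
// leaving failure reporting to something like a readiness probe.
func retryForever(dial func() error, interval time.Duration) {
	for dial() != nil {
		time.Sleep(interval)
	}
}

func main() {
	// The driver needs 5 attempts to come up; the deadline allows only 3.
	err := retryWithDeadline(connectAfter(5), time.Millisecond, 3)
	fmt.Println("1.0.2-style:", err)

	// The unbounded loop simply waits the driver out.
	retryForever(connectAfter(5), time.Millisecond)
	fmt.Println("1.1.0-style: connected")
}
```

This also shows why the wait on a second pod start is surprising: once the controller is already registered, the simulated dial should succeed on an early attempt, well inside the old 60s deadline.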