
Failed to create IBM Cloud VPC Block CSI Driver on Single Node OpenShift in IBM Cloud VPC #155

Closed
aazraq opened this issue Jun 28, 2023 · 13 comments



aazraq commented Jun 28, 2023

Environment details:
Single Node OpenShift cluster running on IBM Cloud VPC VSI.
OpenShift: 4.12

Problem Description:
I tried to follow the instructions here, built the image from the master branch, and pushed it to IBM Cloud Container Registry. However, this command failed:

bash deploy/kubernetes/driver/kubernetes/deploy-vpc-block-driver.sh stage

with this error:

This will install 'stage' version of vpc csi driver!
darwin22
Error: invalid Kustomization: json: cannot unmarshal string into Go struct field Kustomization.patches of type types.Patch
error: no objects passed to apply

I contacted @ambiknai, who was very helpful and generous with her time. She troubleshot the issue with me and gave me manual YAML files, which progressed the CSI driver creation, but it is still failing with the issue below.

Attached are the logs.
csi-provisioner.log
ibm-vpc-block-csi-controller.log


ambiknai commented Jun 28, 2023

Thanks @aazraq for opening this issue.

Adding a few findings:

{"level":"fatal","timestamp":"2023-06-28T12:13:04.113Z","caller":"cmd/main.go:116","msg":"Failed to initialize driver...","name":"ibm-vpc-block-csi-driver","CSIDriverName":"IBM VPC block driver","error":"Controller_Helper: Failed to initialize node metadata: error: Unable to fetch instance ID from node provider ID - "}

Point of failure - https://github.com/IBM/ibm-csi-common/blob/master/pkg/metadata/metadata.go#L89-L97

Added this label manually to bring the driver pods up:

ibm-cloud.kubernetes.io/machine-type: upi

The pods came to a healthy state.

We see errors on the PVC:

 Warning ProvisioningFailed  43s (x7 over 106s)  vpc.block.csi.ibm.io_ibm-vpc-block-csi-controller-6c879448cd-9r25v_0ed93c94-66e5-4cab-9a88-fb4b364f0cd4 failed to provision volume with StorageClass "ibmc-vpc-block-5iops-tier": error generating accessibility requirements: no available topology found

@ambiknai (Contributor) commented:

  1. After adding the above label, also added
     ibm-cloud.kubernetes.io/vpc-instance-id: 0787_d639182e-44cc-4ab7-bbfc-8f69e9606664
     (both labels can be applied with kubectl; see the sketch below)
  2. Restarted all pods

The node pod was still in CrashLoopBackOff; node-driver-registrar is unable to connect to the socket.

I ran strace:

  1. In the worker node terminal, run ps aux | grep "node-driver-registrar"
  2. Get the PID
  3. Then run strace -p $(pidof node-driver-registrar)
clock_gettime(CLOCK_MONOTONIC, {tv_sec=241466, tv_nsec=927945202}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1688114739, tv_nsec=31720390}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=241466, tv_nsec=929814102}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=241466, tv_nsec=930230648}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=241466, tv_nsec=930500755}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1688114739, tv_nsec=34867350}) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=241466, tv_nsec=931948439}) = 0
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/csi/csi.sock"}, 16) = -1 EACCES (Permission denied)
close(3)                                = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1688114739, tv_nsec=37431080}) = 0

So there is a permission issue on the socket.


jeffnowicki commented Jul 12, 2023

Suggest following up on this reported issue (now closed), which looks like the problem you are hitting: kubernetes-csi/node-driver-registrar#36 (comment)

@ambiknai (Contributor) commented:

@jsafrane Could you please help check which SELinux permission is missing here?

@jsafrane (Contributor) commented:

This smells like SELinux. In RHEL, we traditionally run both the CSI driver and the driver registrar as privileged containers, so both can access the host (/var/lib/kubelet/*). Even a pod that runs as root can do very little on the host because of SELinux, unless it's privileged.

Running the registrar as privileged: true has not been necessary since kubernetes/kubernetes#73241 (Kubernetes 1.15?); even an unprivileged pod that runs as root can write to /var/lib/kubelet/plugins and /var/lib/kubelet/plugins_registry.

I don't know what prevents the registrar from accessing /csi/csi.sock.

  • What SELinux context does the driver socket have on the host?
    ls -aZ /var/lib/kubelet/plugins and ls -aZ /var/lib/kubelet/plugins_registry could tell you. It should be system_u:object_r:container_file_t:s0 in both cases.
  • How does the node-driver-registrar run? ps -auxZ | grep registrar. A privileged container should have system_u:system_r:spc_t:s0, allowing the registrar to access anything on the host. A non-privileged registrar should have something like system_u:system_r:container_t:s0:c101,c943, allowing it to access anything labeled system_u:object_r:container_file_t:s0, but nothing else on the host.

If you see anything else: who changed the contexts, and why?

@ambiknai (Contributor) commented:

@aazraq Could you share the output of the above commands from your setup?


ambiknai commented Oct 4, 2023

@jsafrane Another team reported a similar issue where the node-driver pods were restarting.

I ran all the commands you mentioned in the comment above.

[root@test-ckehqph20ccgdlvbd0j0-dmhyper71-default-00000d87 /]# ps -aux| grep "node-driver-registrar"
root       5805  0.0  0.1 725532 23580 ?        Ssl  09:45   0:11 /csi-node-driver-registrar --v=5 --csi-address=/csi/csi.sock --kubelet-registration-path=/var/data/kubelet/csi-plugins/vpc.block.csi.ibm.io/csi.sock
root      34910  0.0  0.0   3884  2040 pts/0    S+   14:13   0:00 grep --color=auto node-driver-registrar
[root@test-ckehqph20ccgdlvbd0j0-dmhyper71-default-00000d87 /]# strace -p 5805
strace: Process 5805 attached
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0
futex(0xc0003f8148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0
futex(0xc00005c548, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0
write(6, "\0", 1)                       = 1
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
epoll_pwait(4, [], 128, 740, NULL, 16478342516424) = 0
futex(0xc0003f8148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc00005c548, FUTEX_WAKE_PRIVATE, 1) = 1
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/csi/csi.sock"}, 16) = -1 EACCES (Permission denied)
close(3)                                = 0
futex(0xc00005c948, FUTEX_WAKE_PRIVATE, 1) = 1
write(6, "\0", 1)                       = 1
nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
epoll_pwait(4, [{events=EPOLLIN, data={u32=17034584, u64=17034584}}], 128, 3660, NULL, 16487601007594) = 1
read(5, "\0", 16)                       = 1
epoll_pwait(4, [], 128, 984, NULL, 16484926956312) = 0
futex(0xc00005c948, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc0003f8148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc00005c548, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc0003f8148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc00005c548, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc0003f8148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc0003f8148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc0003f8148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc00005c948, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
epoll_pwait(4, [{events=EPOLLIN, data={u32=17034584, u64=17034584}}], 128, 1630, NULL, 16487601007594) = 1
read(5, "\0", 16)                       = 1
epoll_pwait(4, [], 128, 958, NULL, 16486934167865) = 0
futex(0xc0003f8148, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc00005c948, FUTEX_WAKE_PRIVATE, 1) = 1
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_UNIX, sun_path="/csi/csi.sock"}, 16) = -1 EACCES (Permission denied)
close(3)                                = 0
futex(0xc00005c948, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc00005c548, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0x100f4a8, FUTEX_WAIT_PRIVATE, 0, NULL^Cstrace: Process 5805 detached
 <detached ...>

[root@test-ckehqph20ccgdlvbd0j0-dmhyper71-default-00000d87 /]# ls -aZ /var/lib/kubelet/plugins
   system_u:object_r:container_file_t:s0 .  system_u:object_r:container_var_lib_t:s0 ..
[root@test-ckehqph20ccgdlvbd0j0-dmhyper71-default-00000d87 /]# ls -aZ /var/lib/kubelet/plugins_registry
   system_u:object_r:container_file_t:s0 .  system_u:object_r:container_var_lib_t:s0 ..
[root@test-ckehqph20ccgdlvbd0j0-dmhyper71-default-00000d87 /]# ps -auxZ | grep registrar
system_u:system_r:container_runtime_t:s0 root 5789 0.0  0.0 8304  1992 ?        Ss   09:45   0:00 /usr/bin/conmon -b /var/data/crioruntimestorage/overlay-containers/d817c068ed9c86bea0d3024903e707e2c63ad11d4456aebd22ab0f4e394214c8/userdata -c d817c068ed9c86bea0d3024903e707e2c63ad11d4456aebd22ab0f4e394214c8 --exit-dir /var/run/crio/exits -l /var/log/pods/kube-system_ibm-vpc-block-csi-node-tz5m4_222f01fb-d7cd-4075-8fba-120ff0b32ccd/csi-driver-registrar/0.log --log-level info -n k8s_csi-driver-registrar_ibm-vpc-block-csi-node-tz5m4_kube-system_222f01fb-d7cd-4075-8fba-120ff0b32ccd_0 -P /var/data/crioruntimestorage/overlay-containers/d817c068ed9c86bea0d3024903e707e2c63ad11d4456aebd22ab0f4e394214c8/userdata/conmon-pidfile -p /var/data/crioruntimestorage/overlay-containers/d817c068ed9c86bea0d3024903e707e2c63ad11d4456aebd22ab0f4e394214c8/userdata/pidfile --persist-dir /var/data/criorootstorage/overlay-containers/d817c068ed9c86bea0d3024903e707e2c63ad11d4456aebd22ab0f4e394214c8/userdata -r /usr/bin/runc --runtime-arg --root=/run/runc --socket-dir-path /var/run/crio --syslog -u d817c068ed9c86bea0d3024903e707e2c63ad11d4456aebd22ab0f4e394214c8 -s
system_u:system_r:container_t:s0:c138,c358 root 5805 0.0  0.1 725532 23592 ?    Ssl  09:45   0:11 /csi-node-driver-registrar --v=5 --csi-address=/csi/csi.sock --kubelet-registration-path=/var/data/kubelet/csi-plugins/vpc.block.csi.ibm.io/csi.sock
system_u:system_r:spc_t:s0      root      38152  0.0  0.0   3876  2092 pts/0    S+   14:15   0:00 grep --color=auto registrar
The node DaemonSet pod spec:
    spec:
      containers:
      - args:
        - --v=5
        - --csi-address=$(ADDRESS)
        - --kubelet-registration-path=$(DRIVER_REGISTRATION_SOCK)
        env:
        - name: ADDRESS
          value: /csi/csi.sock
        - name: DRIVER_REGISTRATION_SOCK
          value: /var/data/kubelet/csi-plugins/vpc.block.csi.ibm.io/csi.sock
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: icr.io/ext/sig-storage/csi-node-driver-registrar:v2.7.0
        imagePullPolicy: Always
        name: csi-driver-registrar
        resources:
          limits:
            cpu: 40m
            memory: 80Mi
          requests:
            cpu: 10m
            memory: 20Mi
        securityContext:
          privileged: false
          runAsNonRoot: false
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /csi
          name: plugin-dir
        - mountPath: /registration
          name: registration-dir
      - args:
        - --v=5
        - --endpoint=unix:/csi/csi.sock
        - --sidecarEndpoint=$(SIDECAREP)
        env:
        - name: SIDECAREP
          value: /sidecardir/provider.sock
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        envFrom:
        - configMapRef:
            name: ibm-vpc-block-csi-configmap
        image: us.icr.io/armada-master/ibm-vpc-block-csi-driver:v5.1.13
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 5
          httpGet:
            path: /healthz
            port: healthz
            scheme: HTTP
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 3
        name: iks-vpc-block-node-driver
        ports:
        - containerPort: 9808
          name: healthz
          protocol: TCP
        resources:
          limits:
            cpu: 120m
            memory: 300Mi
          requests:
            cpu: 30m
            memory: 75Mi
        securityContext:
          privileged: true
          runAsNonRoot: false
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /sidecardir
          name: secret-sidecar-sock-dir
        - mountPath: /var/data/kubelet
          mountPropagation: Bidirectional
          name: kubelet-data-dir
        - mountPath: /var/lib/kubelet
          mountPropagation: Bidirectional
          name: kubelet-lib-dir
        - mountPath: /csi
          name: plugin-dir
        - mountPath: /dev
          name: device-dir
        - mountPath: /etc/udev
          name: etcudevpath
        - mountPath: /run/udev
          name: runudevpath
        - mountPath: /lib/udev
          name: libudevpath
        - mountPath: /sys
          name: syspath
        - mountPath: /etc/storage_ibmc
          name: customer-auth
          readOnly: true
        - mountPath: /etc/storage_ibmc/cluster_info
          name: cluster-info
          readOnly: true
      - args:
        - --csi-address=/csi/csi.sock
        image: icr.io/ext/sig-storage/livenessprobe:v2.9.0
        imagePullPolicy: IfNotPresent
        name: liveness-probe
        resources:
          limits:
            cpu: 20m
            memory: 40Mi
          requests:
            cpu: 5m
            memory: 10Mi
        securityContext:
          privileged: false
          runAsNonRoot: false
          runAsUser: 0
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /csi
          name: plugin-dir
      - args:
        - --endpoint=$(ENDPOINT)
        env:
        - name: ENDPOINT
          value: unix:/sidecardir/provider.sock
        - name: TOKEN_EXPIRY_DIFF
          value: 20m
        - name: PROFILE_CAPACITY
          value: "1"
        image: icr.io/obs/armada-storage-secret:v1.2.26
        imagePullPolicy: Always
        name: storage-secret-sidecar
        resources:
          limits:
            cpu: 40m
            memory: 80Mi
          requests:
            cpu: 10m
            memory: 20Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /sidecardir
          name: secret-sidecar-sock-dir
        - mountPath: /var/run/secrets/tokens
          name: vault-token
      dnsPolicy: ClusterFirst
      priorityClassName: system-node-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: ibm-vpc-block-node-sa
      serviceAccountName: ibm-vpc-block-node-sa
      terminationGracePeriodSeconds: 30
      tolerations:
      - operator: Exists
      volumes:
      - name: vault-token
        projected:
          defaultMode: 420
          sources:
          - serviceAccountToken:
              expirationSeconds: 600
              path: vault-token
      - emptyDir: {}
        name: secret-sidecar-sock-dir
      - hostPath:
          path: /var/data/kubelet/plugins_registry/
          type: Directory
        name: registration-dir
      - hostPath:
          path: /var/data/kubelet
          type: Directory
        name: kubelet-data-dir
      - hostPath:
          path: /var/lib/kubelet
          type: Directory
        name: kubelet-lib-dir
      - hostPath:
          path: /var/data/kubelet/csi-plugins/vpc.block.csi.ibm.io/
          type: DirectoryOrCreate
        name: plugin-dir
      - hostPath:
          path: /dev
          type: Directory
        name: device-dir
      - hostPath:
          path: /etc/udev
          type: Directory
        name: etcudevpath
      - hostPath:
          path: /run/udev
          type: Directory
        name: runudevpath
      - hostPath:
          path: /lib/udev
          type: Directory
        name: libudevpath
      - hostPath:
          path: /sys
          type: Directory
        name: syspath
      - name: customer-auth
        secret:
          defaultMode: 420
          secretName: storage-secret-store
      - configMap:
          defaultMode: 420
          name: cluster-info
        name: cluster-info
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate

Do you see any gap here, @jsafrane?


jsafrane commented Oct 5, 2023

Sorry, I was wrong. This is not true:

Non-privileged registrar should have something like system_u:system_r:container_t:s0:c101,c943, allowing the registrar to access anything with system_u:object_r:container_file_t:s0, but nothing else on the host.

Only privileged containers can write to a directory labeled system_u:object_r:container_file_t:s0, so the node-driver-registrar container needs to be privileged too (securityContext.privileged: true instead of false).


ambiknai commented Oct 5, 2023

@jsafrane Is this specific to the RHCOS environment? The above deployment works fine on managed clusters (IBM IKS and ROKS).


ambiknai commented Oct 6, 2023

Also, with privileged: true, the registrar starts and registers, but then keeps losing the connection:

I1006 04:45:27.798059       1 main.go:166] Version: v2.5.0
I1006 04:45:27.798159       1 main.go:167] Running node-driver-registrar in mode=registration
I1006 04:45:27.799183       1 main.go:191] Attempting to open a gRPC connection with: "/csi/csi.sock"
I1006 04:45:27.799240       1 connection.go:154] Connecting to unix:///csi/csi.sock
W1006 04:45:37.800122       1 connection.go:173] Still connecting to unix:///csi/csi.sock
I1006 04:45:41.990139       1 main.go:198] Calling CSI driver to discover driver name
I1006 04:45:41.990172       1 connection.go:183] GRPC call: /csi.v1.Identity/GetPluginInfo
I1006 04:45:41.990181       1 connection.go:184] GRPC request: {}
I1006 04:45:42.001521       1 connection.go:186] GRPC response: {"name":"vpc.block.csi.ibm.io","vendor_version":"vpcBlockDriver-"}
I1006 04:45:42.096155       1 connection.go:187] GRPC error: <nil>
I1006 04:45:42.096174       1 main.go:208] CSI driver name: "vpc.block.csi.ibm.io"
I1006 04:45:42.096236       1 node_register.go:53] Starting Registration Server at: /registration/vpc.block.csi.ibm.io-reg.sock
I1006 04:45:42.098137       1 node_register.go:62] Registration Server started at: /registration/vpc.block.csi.ibm.io-reg.sock
I1006 04:45:42.098833       1 node_register.go:92] Skipping HTTP server because endpoint is set to: ""
I1006 04:45:42.405188       1 main.go:102] Received GetInfo call: &InfoRequest{}
I1006 04:45:42.405742       1 main.go:109] "Kubelet registration probe created" path="/var/lib/kubelet/csi-plugins/vpc.block.csi.ibm.io/registration"
I1006 04:45:42.538956       1 main.go:120] Received NotifyRegistrationStatus call: &RegistrationStatus{PluginRegistered:true,Error:,}
E1006 04:46:25.366965       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 04:47:25.369386       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 04:48:25.371619       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 04:49:25.385575       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 04:50:25.367922       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 04:51:25.367381       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 04:53:55.368184       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 04:57:35.367967       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 05:03:35.381544       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 05:04:35.368036       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 05:10:35.369785       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 05:11:35.390963       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 05:17:45.369021       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 05:18:45.375188       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 05:24:45.372015       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 05:25:45.368662       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 05:31:45.373822       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
E1006 05:32:45.369119       1 connection.go:132] Lost connection to unix:///csi/csi.sock.
[root@pres-ckf46i210uc77rmi4dgg-hypershift1-default-0000012b /]# ls -aZ /var/lib/kubelet/plugins
   system_u:object_r:container_file_t:s0 .  system_u:object_r:container_var_lib_t:s0 ..
[root@pres-ckf46i210uc77rmi4dgg-hypershift1-default-0000012b /]# ls -aZ /var/lib/kubelet/plugins_registry
   system_u:object_r:container_file_t:s0 .  system_u:object_r:container_var_lib_t:s0 ..     system_u:object_r:container_file_t:s0 vpc.block.csi.ibm.io-reg.sock
[root@pres-ckf46i210uc77rmi4dgg-hypershift1-default-0000012b /]# ps -auxZ | grep registrar
system_u:system_r:spc_t:s0      root      22360  0.0  0.0   3876  2100 pts/0    S+   05:32   0:00 grep --color=auto registrar
system_u:system_r:container_runtime_t:s0 root 25286 0.0  0.0 8304 2068 ?        Ss   04:45   0:00 /usr/bin/conmon -b /var/data/crioruntimestorage/overlay-containers/5a5c69621c9eeec22fd8235960005c6daa2b844c1f1f0e94f42a6ff74aeea14c/userdata -c 5a5c69621c9eeec22fd8235960005c6daa2b844c1f1f0e94f42a6ff74aeea14c --exit-dir /var/run/crio/exits -l /var/log/pods/kube-system_ibm-vpc-block-csi-node-zljrx_8129be92-e91d-461f-b3ff-d9b13836f98b/csi-driver-registrar/0.log --log-level info -n k8s_csi-driver-registrar_ibm-vpc-block-csi-node-zljrx_kube-system_8129be92-e91d-461f-b3ff-d9b13836f98b_0 -P /var/data/crioruntimestorage/overlay-containers/5a5c69621c9eeec22fd8235960005c6daa2b844c1f1f0e94f42a6ff74aeea14c/userdata/conmon-pidfile -p /var/data/crioruntimestorage/overlay-containers/5a5c69621c9eeec22fd8235960005c6daa2b844c1f1f0e94f42a6ff74aeea14c/userdata/pidfile --persist-dir /var/data/criorootstorage/overlay-containers/5a5c69621c9eeec22fd8235960005c6daa2b844c1f1f0e94f42a6ff74aeea14c/userdata -r /usr/bin/runc --runtime-arg --root=/run/runc --socket-dir-path /var/run/crio --syslog -u 5a5c69621c9eeec22fd8235960005c6daa2b844c1f1f0e94f42a6ff74aeea14c -s
system_u:system_r:spc_t:s0      root      25320  0.0  0.1 716540 22696 ?        Ssl  04:45   0:01 /csi-node-driver-registrar --v=5 --csi-address=/csi/csi.sock --kubelet-registration-path=/var/lib/kubelet/csi-plugins/vpc.block.csi.ibm.io/csi.sock
[root@pres-ckf46i210uc77rmi4dgg-hypershift1-default-0000012b /]# 
[root@pres-ckf46i210uc77rmi4dgg-hypershift1-default-0000012b /]# 
[root@pres-ckf46i210uc77rmi4dgg-hypershift1-default-0000012b /]# strace -p 25320
strace: Process 25320 attached
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
write(6, "\0", 1)                       = 1
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
write(6, "\0", 1)                       = 1
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
write(6, "\0", 1)                       = 1
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
epoll_pwait(4, [], 128, 0, NULL, 0)     = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
write(6, "\0", 1)                       = 1
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
write(6, "\0", 1)                       = 1
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
epoll_pwait(4, [], 128, 0, NULL, 0)     = 0
futex(0xc000054d50, FUTEX_WAKE_PRIVATE, 1) = 1
epoll_pwait(4, [], 128, 0, NULL, 12279) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
write(6, "\0", 1)                       = 1
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
futex(0xf86ed0, FUTEX_WAIT_PRIVATE, 0, NULL) = 0

@ambiknai (Contributor) commented:

Thank you @jsafrane, privileged: true did solve the issue.

@jsafrane (Contributor) commented:

/close

@k8s-ci-robot (Contributor) commented:

@jsafrane: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
