sbom-action not working with docker-in-docker GitHub actions runner #424

Closed
apr-1985 opened this issue Jun 21, 2023 · 18 comments
Labels
needs-reproduction missing steps to reproduce or steps have not been confirmed

Comments

@apr-1985

I build a docker container using the docker/setup-buildx-action@v2.6.0 and docker/build-push-action@v4.1.0 actions, but do not push the container to Artifactory at this point.

The image is built and loaded into docker fine; however, when I try to run this action it cannot find the image locally and appears to be looking for a podman sock rather than docker.

      - name: Generate SBOM
        id: sbom
        uses: anchore/sbom-action@v0
        with:
          image: "${{ inputs.registry }}/${{ inputs.image_name }}:${{ steps.image_date_tag.outputs.TAG_NAME }}"

gives the error (PII removed from container and repo names)

[0000] DEBUG no socket address was found. Trying default address: /run/user/1001/podman/podman.sock from-lib=stereoscope
  [0000] DEBUG looking for socket file: stat /run/user/1001/podman/podman.sock: no such file or directory from-lib=stereoscope
  
  [0000] DEBUG image: source=OciRegistry location=docker-local.my.artifactory.systems/gha-scaleset-runner:20230621-121746 from-lib=stereoscope
  
  [0000] DEBUG pulling image info directly from registry image="docker-local.my.artifactory.systems/gha-scaleset-runner:20230621-121746" from-lib=stereoscope
  [0000] DEBUG no registry credentials configured, using the default keychain from-lib=stereoscope
  
  2023/06/21 12:22:03 error during command execution: 1 error occurred:
  	* failed to construct source from user input "docker-local.my.artifactory.systems/gha-scaleset-runner:20230621-121746": could not fetch image "docker-local.my.artifactory.systems/gha-scaleset-runner:20230621-121746": unable to use OciRegistry source: failed to get image descriptor from registry: GET https://docker-local.my.artifactory.systems/v2/gha-scaleset-runner/manifests/20230621-121746: MANIFEST_UNKNOWN: The named manifest is not known to the registry.; map[manifest:gha-scaleset-runner/20230621-121746/manifest.json]

I am not sure why it is looking for a podman socket when all the documentation refers to using docker and I cannot find a config option for this.

Thanks for the help.

@willmurphyscode
Contributor

Hi @apr-1985, thanks for reporting this issue!

Can you help me understand one thing: Is this a setup that previously worked, and stopped working, or is this a new setup?

@willmurphyscode willmurphyscode self-assigned this Jun 23, 2023
@apr-1985
Author

> Hi @apr-1985, thanks for reporting this issue!
>
> Can you help me understand one thing: Is this a setup that previously worked, and stopped working, or is this a new setup?

Thanks for the reply.
It is a brand new workflow.

The setup is kubernetes self hosted runners using the github scale set runners https://github.com/actions/actions-runner-controller
The runners use a dind container in the pod for running containers as the nodes have no docker sockets.

As far as I am aware nothing has podman installed.

Cheers

Adam

@willmurphyscode
Contributor

I think stereoscope is failing to talk to the Docker socket, and so falling back to podman (in case the user is running podman and not docker).

On my system, if I quit docker and run docker logout, and then run syft against a private docker image, I get a similar output (replacing the repo/tag of the real image with $PRIVATE b/c it's a private image):

$ syft -vv $PRIVATE
[0000] DEBUG no socket address was found. Trying default address: /run/user/502/podman/podman.sock from-lib=stereoscope
[0000] DEBUG looking for socket file: stat /run/user/502/podman/podman.sock: no such file or directory from-lib=stereoscope
[0000] DEBUG image: source=OciRegistry location=$PRIVATE from-lib=stereoscope
[0000] DEBUG pulling image info directly from registry image="$PRIVATE" from-lib=stereoscope
[0000] DEBUG no registry credentials configured, using the default keychain from-lib=stereoscope
2023/06/23 13:13:59 error during command execution: 1 error occurred:
	* failed to construct source from user input "$PRIVATE": could not fetch image "$PRIVATE": 
  unable to use OciRegistry source: failed to get image descriptor from registry: 
  GET https://index.docker.io/v2/$PRIVATE: UNAUTHORIZED: authentication required; [map[Action:pull Class: Name:$PRIVATE Type:repository]]

So I think the debug output you're seeing is syft trying Docker, then falling back to Podman, then falling back to using its built-in OCI Registry client directly, then failing. It's definitely a little surprising that it doesn't mention docker at all in the output. Are you sure the action has authorization and connectivity to talk to the docker socket at that point?
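
The fallback order described above can be modelled in a few lines of shell. This is only a simplified sketch, not stereoscope's actual code: the function name `pick_source` is hypothetical, the probes are reduced to "does a socket file exist at this path", and the `Podman` label is illustrative (the `DockerDaemon` and `OciRegistry` labels are taken from the debug output).

```shell
#!/bin/sh
# Simplified model of the fallback order: Docker, then Podman, then registry.
# NOT stereoscope's implementation; pick_source is a hypothetical helper.
pick_source() {
    docker_sock="$1"   # e.g. /var/run/docker.sock
    podman_sock="$2"   # e.g. /run/user/1001/podman/podman.sock

    if [ -S "$docker_sock" ]; then
        echo "DockerDaemon"
    elif [ -S "$podman_sock" ]; then
        echo "Podman"          # illustrative label
    else
        echo "OciRegistry"     # last resort: query the registry directly
    fi
}

# Report which source this model would pick on the current machine.
pick_source /var/run/docker.sock "/run/user/$(id -u)/podman/podman.sock"
```

On a runner with neither socket file present, the model lands on the registry, which is consistent with the MANIFEST_UNKNOWN error for an image that was built locally but never pushed.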

@apr-1985
Author

Hi.
I added in a docker images step to my pipeline to test this. There shouldn't be any auth issues as the image is local to the container having just been built.

The section of workflow is now

      - name: Build Docker Image
        uses: docker/build-push-action@v4.1.0
        with:
          context: .
          load: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

      - name: Test Docker
        run: docker images

      - name: Generate SBOM
        id: sbom
        uses: anchore/sbom-action@v0
        with:
          image: "${{ inputs.registry }}/${{ inputs.image_name }}:${{ steps.image_date_tag.outputs.TAG_NAME }}"

The test docker step works and shows the images with the correct tags; however, the Syft Action fails with the podman error described above.
See the screenshot (apologies for the censoring).

Screenshot 2023-06-27 at 08 44 36

I think the issue probably comes from the fact that the GitHub Scaleset implementation uses a Docker In Docker container inside the pod, and the Runner container uses the DOCKER_HOST env var to ship all the docker commands off to that DinD container:

    env:
    - name: DOCKER_HOST
      value: tcp://localhost:2376
    - name: DOCKER_TLS_VERIFY
      value: "1"
    - name: DOCKER_CERT_PATH
      value: /certs/client

This means there is no docker sock in the runner container, so if the sbom-action does not respect the DOCKER_HOST env var and just looks for /var/run/docker.sock (or similar), it won't find it on the runner.
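
A quick way to see which endpoint a DOCKER_HOST-aware client would target in a given step is a small shell check like the one below. It is a sketch under assumptions: `describe_docker_endpoint` is a hypothetical helper, and the default of unix:///var/run/docker.sock is the one shown in the error messages in this thread.

```shell
#!/bin/sh
# Describe the Docker endpoint implied by the current environment.
# describe_docker_endpoint is a hypothetical helper for debugging only.
describe_docker_endpoint() {
    # Fall back to the default socket when DOCKER_HOST is unset or empty.
    endpoint="${DOCKER_HOST:-unix:///var/run/docker.sock}"
    case "$endpoint" in
        unix://*)
            path="${endpoint#unix://}"
            if [ -S "$path" ]; then
                echo "unix socket present: $path"
            else
                echo "unix socket missing: $path"
            fi
            ;;
        tcp://*)
            echo "tcp endpoint: ${endpoint#tcp://}"
            ;;
        *)
            echo "other endpoint: $endpoint"
            ;;
    esac
}

describe_docker_endpoint
```

Run as a workflow step just before the SBOM step, it shows whether that step sees a TCP endpoint (as in the DinD setup) or falls back to a local socket that does not exist on the runner.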

Thanks for the help looking at this

@willmurphyscode
Contributor

Thanks @apr-1985 for the detailed follow-up!

I think you're right that something is off with stereoscope's handling of the DOCKER_HOST environment variable, looking at https://github.com/anchore/stereoscope/blob/cd49355d934e9e09339e0b690398afe7bd9f63f1/internal/docker/client.go#L19-L51

It looks like we special-case hosts that start with ssh, but we probably need to also respect people setting DOCKER_HOST to a local tcp port. I'll move this to the backlog, and we'll try to pick it up soon. I'd also be more than happy to review a pull request if that's something you're interested in.

@willmurphyscode willmurphyscode transferred this issue from anchore/sbom-action Jun 28, 2023
@willmurphyscode
Contributor

It looks like the client itself doesn't automatically respect the DOCKER_HOST env var:

https://github.com/moby/moby/blob/b6ad25bf5e718142a03ae1027933e8b976dfc923/client/client.go#L133-L155

@apr-1985
Author

apr-1985 commented Jul 4, 2023

Hi, Apologies I have been away.
Thanks for your continued investigation of this.

@willmurphyscode
Contributor

Here is some testing I did after disabling the default docker socket on my laptops (the one at /var/run/docker.sock):

$ syft docker:busybox:latest
2023/07/10 11:54:40 error during command execution: 1 error occurred:
  * failed to construct source from user input "docker:busybox:latest": could not
  fetch image "busybox:latest": scheme "docker" specified; image retrieval
  using scheme parsing (busybox:latest) was unsuccessful: unable to use
  DockerDaemon source: unable to inspect existing image: Cannot connect to the
  Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?;
  image retrieval without scheme parsing (docker:busybox:latest) was
  unsuccessful: unable to determine image source

$ DOCKER_HOST="unix://$HOME/.docker/run/docker.sock" syft docker:busybox:latest
 ✔ Loaded image
 ✔ Parsed image
 ✔ Cataloged packages      [1 packages]
NAME     VERSION  TYPE
busybox  1.36.1   binary

It looks like syft is already respecting DOCKER_HOST; I just haven't found the code that makes it do that.

@willmurphyscode
Contributor

@apr-1985 one thing you might try doing is explicitly specifying that syft should be using docker:

      - name: Generate SBOM
        id: sbom
        uses: anchore/sbom-action@v0
        with:
          image: "docker:${{ inputs.registry }}/${{ inputs.image_name }}:${{ steps.image_date_tag.outputs.TAG_NAME }}"

(note docker prefixed on the image input). Please let us know if that helps. Thanks!

@apr-1985
Author

Hi thanks for the update and the continued investigation.
I have tried as requested but it is still failing on looking for the docker sock

  [0000] DEBUG image: source=DockerDaemon location=docker-local.artifactory.systems/gha-scaleset-runner:20230711-090433 from-lib=stereoscope
  
  [0000]  WARN scheme "docker" specified, but it coincides with a common image name; re-examining user input "docker:docker-local.artifactory.systems/gha-scaleset-runner:20230711-090433" without scheme parsing because image retrieval using scheme parsing was unsuccessful: unable to use DockerDaemon source: unable to inspect existing image: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
  
  [0000] DEBUG image: source=UnknownSource location=docker:docker-local.artifactory.systems/gha-scaleset-runner:20230711-090433 from-lib=stereoscope
  2023/07/11 09:08:42 error during command execution: 1 error occurred:
  	* failed to construct source from user input "docker:docker-local.artifactory.systems/gha-scaleset-runner:20230711-090433": could not fetch image "docker-local.artifactory.systems/gha-scaleset-runner:20230711-090433": scheme "docker" specified; image retrieval using scheme parsing (docker-local.artifactory.systems/gha-scaleset-runner:20230711-090433) was unsuccessful: unable to use DockerDaemon source: unable to inspect existing image: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?; image retrieval without scheme parsing (docker:docker-local.artifactory.systems/gha-scaleset-runner:20230711-090433) was unsuccessful: unable to determine image source

@willmurphyscode
Contributor

Thanks for testing again and for the logs @apr-1985. I'll keep looking into this.

@willmurphyscode
Contributor

So I've confirmed in a Linux VM (fedora running under lima, in case that matters for some reason in the future) that syft is respecting DOCKER_HOST even when it's a tcp port instead of a local socket. Running the same command below with and without DOCKER_HOST in the environment:

$ DOCKER_HOST=tcp://localhost:2376 syft packages docker:busybox:latest
 ✔ Loaded image
 ✔ Parsed image
 ✔ Cataloged packages      [1 packages]
NAME     VERSION  TYPE
busybox  1.36.1   binary
# re-run without DOCKER_HOST to prove that was why it worked previously
$ syft packages docker:busybox:latest

2023/07/11 21:34:04 error during command execution: 1 error occurred:
	* failed to construct source from user input "docker:busybox:latest": could not fetch image "busybox:latest": scheme "docker" specified; image retrieval using scheme parsing (busybox:latest) was unsuccessful: unable to use DockerDaemon source: unable to inspect existing image: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?; image retrieval without scheme parsing (docker:busybox:latest) was unsuccessful: unable to determine image source

So it looks like DOCKER_HOST is respected by syft correctly in this case. It seems like there are two possibilities:

  1. The environment variable is not being set for syft when syft runs
  2. Something else about the setup is making syft try to talk to docker on the wrong host.
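
The first possibility can be checked directly from a `run:` step placed next to the SBOM step. The sketch below is an assumption-laden debugging aid, not part of syft or the action: `missing_docker_env` is a hypothetical helper, and the three variable names are the ones from the runner pod spec quoted earlier in this thread.

```shell
#!/bin/sh
# List which of the Docker client variables are unset or empty in this step.
# missing_docker_env is a hypothetical helper for debugging only.
missing_docker_env() {
    for name in DOCKER_HOST DOCKER_TLS_VERIFY DOCKER_CERT_PATH; do
        # Indirect lookup of the variable called $name (POSIX sh has no ${!name}).
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

missing="$(missing_docker_env)"
if [ -n "$missing" ]; then
    echo "not set in this step:"
    echo "$missing"
else
    echo "DOCKER_HOST, DOCKER_TLS_VERIFY and DOCKER_CERT_PATH are all set"
fi
```

If all three are set here but syft still reports unix:///var/run/docker.sock, the second possibility (something else in the setup redirecting syft) becomes the more likely one.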

@willmurphyscode willmurphyscode transferred this issue from anchore/stereoscope Jul 11, 2023
@apr-1985
Author

This is a strange one, as the env var is definitely set: I can run Docker commands in the workflow (e.g. the docker images step from my earlier comment), build my test container, and push it to Artifactory.

Does the Syft Action run in a container, so that the env var is not being passed through? Or is it a JS action?

@kzantow
Contributor

kzantow commented Jul 18, 2023

This sbom-action is a JS action.

@jdagostino9188

We are hitting this same issue in our env. The action is not working with a dind actions runner deployment.

@willmurphyscode
Contributor

Thanks for the comment @jdagostino9188 and @apr-1985! I wonder, can this issue be reproduced just by running syft in docker-in-docker without involving GitHub actions? That would make it much easier to work on. I'll try to reproduce it that way when I get a chance, but if anyone else could try to reproduce the issue just by running syft with docker in docker, or any way to reproduce it locally in general, that would be a huge help.

@willmurphyscode willmurphyscode changed the title from "Steroscope looking for podman rather than docker sock" to "sbom-action not working with docker-in-docker GitHub actions runner" Jul 25, 2023
@willmurphyscode willmurphyscode removed their assignment Aug 31, 2023
@willmurphyscode willmurphyscode moved this from In Progress to Ready in OSS Feb 12, 2024
@wagoodman wagoodman added the needs-investigation and needs-reproduction labels and removed the needs-investigation label Mar 19, 2024
@apr-1985
Author

I have circled back round to this again.
Using the latest Action release it is all working on my setup now, so whatever the issue was seems to have been resolved organically.

This can probably be closed.

@kzantow
Contributor

kzantow commented Jul 18, 2024

Thanks for following up, @apr-1985 ! Please let us know if the problem resurfaces!

@kzantow kzantow closed this as not planned Jul 18, 2024
@github-project-automation github-project-automation bot moved this from Ready to Done in OSS Jul 18, 2024