Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: 177 remove unneeded dependencies due to security concerns #185

Conversation

BGrasnick
Copy link
Contributor

@BGrasnick BGrasnick commented Aug 29, 2023

  • Feat: remove deprecated mount of /sys/kernel/debug which is not needed anymore since add an option to use libbpf attacher kepler#752
  • Feat: make pod use HostPID to avoid any possible PID collision in the future
  • Feat: make as many mounted host volumes as possible ReadOnly: true
  • Feat: reduce priviliges granted through SecurityContextConstraints as minimal as possible
  • Test: add test for HostPID: true
  • Test: volume mounts are set up correctly
  • Test: add tests for SCC allows
  • Test: add tests for volumes
  • Chore: move utility functions to pkg/utils/k8s/k8s.go

handles one part of #177

…d anymore since sustainable-computing-io/kepler#752

Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
… future

Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
… minimal as possible

Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
no /sys/kernel/debug and as many VolumeMounts as possible ReadOnly

Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
@BGrasnick BGrasnick changed the title feat/177 remove unneeded dependencies due to security concerns feat: 177 remove unneeded dependencies due to security concerns Aug 29, 2023
@BGrasnick
Copy link
Contributor Author

I tested this with both kind locally as well as an OpenShift cluster running version 4.11.47.
All go unit tests ran successfully locally.

I was not so sure about the testing code because I only have 1 test case each but wanted to follow the previous approach started with the NodeSelection and Tolerations test. However, maybe this is a bit too much and could be made more minimal. I am open to your feedback and suggestions :)

Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
@BGrasnick BGrasnick force-pushed the feat/177-remove-unneeded-dependencies-due-to-security-concerns branch from b858848 to 45ad03c Compare August 29, 2023 16:24
AllowHostPID: SCC.AllowHostPID,
AllowHostPorts: SCC.AllowHostPorts,
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: (feel free to ignore): I think these generic utility functions are better placed in pkg/utils/k8s so that these are useful in e2e as and in controllers as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@sthaha
Copy link
Collaborator

sthaha commented Aug 29, 2023

@BGrasnick , thanks a lot for the contribution 🤗. Looks good to me. Since this needs to be tested in openshift 4.13 (due to SCC chnage), we will have to do it manually. @vimalk78 does this align with what you were also trying to achieve?

@vprashar2929 could you please take a look as well?

{Name: "lib-modules", MountPath: "/lib/modules"},
{Name: "tracing", MountPath: "/sys"},
{Name: "kernel-src", MountPath: "/usr/src/kernels"},
{Name: "kernel-debug", MountPath: "/sys/kernel/debug"},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need sys/kernel/debug for bcc tracepoints to compile. can we remove this when bcc is removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, it should be obsolete now that sustainable-computing-io/kepler#752 is merged and released

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BGrasnick Could you please clarify which release contains that PR?

git tag --contains 885c1bc37ba18e00725149cf06c3f53c86846b59

returns nothing 🤔

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note that this version of operator deploy 0.5.4 version of kepler. See https://github.com/sustainable-computing-io/kepler-operator/blob/v1alpha1/pkg/components/exporter/exporter.go#L51

Would you like to update that as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what happened with the commit and its SHA but apparently the change is in the main branch and also releases 0.5.2, 0.5.3 and 0.5.4. It was first merged, then reverted and the revert was reverted again...

But in quay there is also a release-0.5.3-libbpf, release-0.5.4-libbpf and latest-libbpf build.

Regarding the version I don't quite understand because as far as I can tell, this PR also uses release-0.5.4 version of kepler.

Also I just tried locally with our 4.11 OpenShift cluster and from what I can see this PR works with both works with latest and latest-libbpf images of kepler. There were some missing metrics when using release-0.5.4 but I think this is due to sustainable-computing-io/kepler#877. So it should work with both bcc and libbpf without mounting sys/kernel/debug but I am no expert in that area and would love if you could confirm this.

If I understand correctly, right now the user can choose whether they want to use bcc or libbpf by using a specific image of kepler. If sys/kernel/debug is necessary for bcc maybe we can make this optional and enable choosing bcc vs libbpf image in the spec of kepler CR and then optionally mount sys/kernel/debug? Is the plan for the future to offer both or go for only one of them?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this either since it is @vimalk78 who has been looking into kepler implementation (including bcc)

@rootfs , could you please help us resolve this?
I guess the question is if we require {Name: "kernel-debug", MountPath: "/sys/kernel/debug"}, mount for 0.5.4 to work properly given it uses bcc?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sthaha This is a good question. The k8s kepler exporter yaml doesn't have it now https://github.com/sustainable-computing-io/kepler/blob/main/manifests/config/exporter/exporter.yaml, but for some unknown reason it is required for openshift when first experimented https://github.com/sustainable-computing-io/kepler/blob/main/manifests/config/exporter/patch/patch-openshift.yaml

Would you please check if this is still the requirement for openshift? It would be great if kepler doesn't need this mount any more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BGrasnick I would love to include this patch in the release we are going to make today. Since we don't have a consensus on whether {Name: "kernel-debug", MountPath: "/sys/kernel/debug"} is required, would you mind restoring that back which gives buys us time to properly investigate this issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we could do that and separate it into another PR and then both together would fix the issue. If you don't mind, what is the release cadence so when could this /sys/kernel/debug fix be released in the future?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #198 as a follow-up PR for the /sys/kernel/debug mount. Let's continue the discussion there! @vimalk78 @sthaha @rootfs

AllowHostPID: SCC.AllowHostPID,
AllowHostPorts: SCC.AllowHostPorts,
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@sthaha
Copy link
Collaborator

sthaha commented Aug 30, 2023

@BGrasnick, could you also please rebase this patch on to the latest v1alpha1 which has the fix for "broken" e2e tests.

@BGrasnick
Copy link
Contributor Author

Rebased and put the util functions from the tests into pkg/utils/k8s :)
If there is anything else we could improve, let me know!

@@ -121,6 +121,7 @@ func NewDaemonSet(detail components.Detail, k *v1alpha1.Kepler) *appsv1.DaemonSe
Labels: podSelector,
},
Spec: corev1.PodSpec{
HostPID: true,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @sthaha, I added this because kepler mounts /proc and therefore can access the PID namespace of the host. I am not sure about the internals of kepler yet but apparently it does not interfere with the host PID namespace yet. However, if we don't set HostPID: true, the kepler pod will have its own PID namespace. In this kepler PID namespace maybe some process for example has PID 0 or 1 and another process on the host could have the same PID. If in the future kepler would somehow interact with this PID there could be collisions and unwanted behavior like interacting with systemd without knowing. To prevent this, HostPID: true should be set so there are no PID collisions between host and kepler pod.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rootfs may be kepler manifests should have this as well

… it is obsolete

Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
@BGrasnick BGrasnick force-pushed the feat/177-remove-unneeded-dependencies-due-to-security-concerns branch from 618fe62 to 92840d6 Compare August 31, 2023 10:15
Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
Copy link
Collaborator

@sthaha sthaha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @BGrasnick

@sthaha sthaha merged commit 368f975 into sustainable-computing-io:v1alpha1 Sep 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants