Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: add method field to exporter spec #287

Conversation

BGrasnick
Copy link
Contributor

@BGrasnick BGrasnick commented Oct 23, 2023

Addresses #195.

Adds a new enum to kepler exporter spec that allows choosing which method/image to use: bcc or libbpf.

Needs to be changed once kepler 0.7.0 is released and default will be libbpf.

Testing was a little challenging and I am not super happy with the solution and open for suggestions.
As of now, the RELATED_IMAGE_KEPLER var gets set for Config.Image and determines the image used.
This is done in main.go lines 78 & 79. As this was needed for the test, I added lines 111-113 in exporter.go to set a default in case Config.Image is empty.
However, because of this and the test, now the image & its version is hardcoded in 3 different locations which is not good.
I am open to any help here :)

Was tested manually on an OpenShift 4.11 cluster.

@sthaha: any thoughts?

…ose corresponding image

Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
Signed-off-by: Bastien Grasnick <bastien.grasnick@deutschebahn.com>
@sthaha
Copy link
Collaborator

sthaha commented Oct 23, 2023

@BGrasnick , thank you for the PR. @vimalk78 is already working on enabling something similar but using an (undocumented) annotation since we know this (hack) is going to be short lived.
The reason I call it a hack is because after 0.7, we will have to deprecate the method unless bcc is going to be supported for the foreseeable future.

@BGrasnick
Copy link
Contributor Author

Ah ok nevermind then! I wasn't sure if bcc will be deprecated or still be used in the future and thought it might be helpful also in the future because of your answer in #195.
When will this workaround come? We want to roll out kepler but need this or a release of 0.7.0 for kepler to work as we depend on libbpf...

@sthaha
Copy link
Collaborator

sthaha commented Oct 24, 2023

@BGrasnick , it's good to validate our assumptions :)

@marceloamaral @rootfs, Could you please confirm if bcc will be deprecated? In this PR @BGrasnick adds a field in exporter spec to allow choosing between bcc and libbpf, depending on which, operator will deploy appropriate image.

We (@vimalk78 ) are currently working on a solution to allow this through a hack. We intent to add the ability to honour an annotation instead of a field in the spec, so that we don't have to delete the field later on when bcc won't be used anymore - which is causes API breakage.

But, this is all made under the assumption that there will only be libbpf image in future (from 0.7). Is that the right assumption to make? or would continue to use bcc, or some other mechanism later?

@BGrasnick
Copy link
Contributor Author

@sthaha yeah definitely sounds good to check again what is planned for the future.

My team and organization want to deploy kepler to our cloud/VM-based OpenShift clusters ASAP because we love the integrated dashboards and how easy use to use the operator is. We want to make this available for all teams on our clusters but cannot do it right now because we need to use libbpf instead of bcc. And there is neither a release date for kepler 0.7.0 when libbpf will become default nor a preview for it as far as I know. Do you know of any other way how we could use the libbpf version of kepler with the operator right now?

@sthaha
Copy link
Collaborator

sthaha commented Oct 24, 2023

My team and organization want to deploy kepler to our cloud/VM-based OpenShift clusters ASAP because we love the integrated dashboards and how easy use to use the operator is. We want to make this available for all teams on our clusters but cannot do it right now because we need to use libbpf instead of bcc.

@BGrasnick , do note there always exists the possibility of creating your own bundle with any kepler image and use operator-sdk to deploy the bundle.

E.g. here is how I create my bundle that uses kepler libbpf bundle that gets pushed to quay as - quay.io/sthaha/kepler-operator-bundle:0.0.0-dev

make operator-build bundle bundle-build operator-push bundle-push IMG_BASE=quay.io/sthaha VERSION=0.0.0-dev KEPLER_IMG=quay.io/sustainable_computing_io/kepler:release-0.6.1-libbpf

There exists a HACK as well (but use it as your own risk)

  1. Use community catalog to deploy operator
  2. Deploy Kepler
  3. Release all kepler - pod images as follows
for x in $(kubectl get pods -n openshift-kepler-operator -l app.kubernetes.io/component=exporter -l app.kubernetes.io/managed-by=kepler-operator -o name); do
 oc set image -n openshift-kepler-operator $x kepler-exporter=quay.io/sustainable_computing_io/kepler:release-0.6.1-libbpf ;
done

We have observed that daemonset controllers do not revert back the image.

@marceloamaral
Copy link

We might have only libbpf in the future, there is no reason, as far as I know, for maintaining duplicated code.
However, I am not sure when we will remove the bcc. In the upcoming release, we plan to set libbpf as the default, and perhaps in the subsequent release, we may consider removing bcc.

Are there any use cases that I might not be aware of that require bcc?
If not, it's probably unnecessary to enable it in the operator.

@sthaha
Copy link
Collaborator

sthaha commented Oct 25, 2023

@marceloamaral , thank you for taking time to answer.

Are there any use cases that I might not be aware of that require bcc?

Not that I am aware of but definitely there are more use-cases for libbpf than bcc. This also makes me wonder if it would be possible for us to make a release of kepler 0.7.0 that only switches to libbpf provided libbpf transition is mature enough.
This would allow us to make another operator release which deploys libbpf based kepler allows users of openshift 4.11+ to try out kepler and provide feedback.

If not, it's probably unnecessary to enable it in the operator.

@BGrasnick, I agree with Marcelo's comment. Since this issue is short lived, the hack @vimalk78 is working on should suffice.

@BGrasnick
Copy link
Contributor Author

@marceloamaral thanks for your answer!

@sthaha yeah of course that makes sense! If kepler doesn't need/support bcc in the future then there is no meaning in implementing this functionality of method choosing. However, as you outlined it would be great to fast track a release of kepler and kepler-operator with libbpf. I don't know if other people already use kepler on OpenShift + AWS with bcc but for us it does not work...

@sthaha
Copy link
Collaborator

sthaha commented Oct 26, 2023

@BGrasnick I have started a discussion about switching to libbpf - sustainable-computing-io/kepler#1027

@BGrasnick
Copy link
Contributor Author

@sthaha That's a great idea! Alternatively, if it does not match up with the release plan for kepler, how about releasing a new kepler-operator 0.10.0 or 0.9.1 where we use release-0.6.1-libbpf of kepler?

@sthaha
Copy link
Collaborator

sthaha commented Oct 26, 2023

how about releasing a new kepler-operator 0.10.0 or 0.9.1 where we use release-0.6.1-libbpf of kepler?

I am happy with that proposal if we can validate in kepler CI that both images produce the same / comparable values for all metrics.

@sthaha
Copy link
Collaborator

sthaha commented Oct 26, 2023

@BGrasnick , I am closing this in favour of #293

@sthaha sthaha closed this Oct 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants