prefetch-task-rhsm-integration #1205

brianwcook · 2024-07-26T17:33:43Z

Update: this PR has been reworked to use the new ssl options sub-key introduced in cachi2's RPM package manager.

This PR causes the prefetch task to react to the same ACTIVATION_KEY parameter that is used for non-hermetic builds. The container will use the pipeline-provided activation key to register with Red Hat's subscription manager, ~~container and set the proper environment variables~~ and augment the Cachi2 input to use the generated entitlement certificates before executing Cachi2.

The following points are pertinent:

* The RHSM (Red Hat Subscription Management) files are generated by this task, by running subscription-manager register immediately before running cachi2.
* They are never used again.

Therefore this implementation is safe from certificate revocation / rotation behavior of RHSM.
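As an illustration of that lifecycle, here is a minimal Python sketch. It is not the task's actual code (the task uses shell steps); the function name `prefetch_with_registration`, the injectable `run` callable, and the `prefetch` callback are all hypothetical. It only shows the ordering: register, prefetch, and then always unregister so the generated certificates are not reused.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    """Default runner: execute the command and raise on a non-zero exit."""
    subprocess.run(list(cmd), check=True)


def prefetch_with_registration(
    org: str,
    activation_key: str,
    prefetch: Callable[[], None],
    run: Runner = _run,
) -> None:
    """Register with RHSM, run the prefetch callback, then unregister.

    `prefetch` is a hypothetical stand-in for whatever invokes cachi2.
    """
    # Registration generates fresh entitlement certificates for this run only.
    run(["subscription-manager", "register",
         "--org", org, "--activationkey", activation_key])
    try:
        prefetch()
    finally:
        # Unregister even if the prefetch failed, so nothing outlives the run.
        run(["subscription-manager", "unregister"])
```

Because the runner is injectable, the ordering can be checked without a real subscription-manager binary.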

eskultety · 2024-07-29T13:03:33Z

task/prefetch-dependencies-oci-ta/0.1/prefetch-dependencies-oci-ta.yaml

+            subscription-manager register \
+              --org $(cat "/activation-key/orgid") \
+              --activationkey $(cat "/activation-key/activationkey")


Is the user running inside the container privileged?

Error: this command requires root access to execute

Where did you get that error? It is already running properly in Konflux using a git ref pointing at my fork of build-definitions. The buildah task already has sufficient permissions with no modifications.

Default UBI container.

the buildah task is running as user 0 already so imo this is a non-issue.

Checked the buildah task, there's nothing going on in terms of user setting really (apart from ID mapping), so by default this runs as root. I guess the container itself isn't privileged, so while not ideal, I guess it's acceptable, so consider my initial comment retracted.

However, that makes things easier for cachi2 then and since we're assuming the default root user inside the container, this whole RHSM registration should IMO be baked into cachi2 rpm-dnf backend rather than in the tekton task for proper integration - we could also leverage dbus to communicate with RHSM for better error handling rather than dealing with shell.

brianwcook · 2024-07-30T11:55:22Z

IMHO the way I have implemented it is the most versatile option.

If your task is running on a static Jenkins or gitlab runner which is registered with RHSM, set your environment variables and pull content.
If your task is running on Openshift with insights operator, set your environment variables and pull content.
If you are running in an unregistered environment like Konflux is, you can register a container with activation key and then set variables and pull content.
In addition if you want to run your own client-cert protected yum repo, my code also works for that.

It also introduces no new dependencies on yum, dnf or subscription manager and so is insulated from changes there. The client certificate scheme hasn't changed in over a decade and should be quite stable.

eskultety · 2024-07-30T12:12:25Z

IMHO the way I have implemented it is the most versatile option.
* If your task is running on a static Jenkins or gitlab runner which is registered with RHSM, set your environment variables and pull content.

* If your task is running on Openshift with insights operator, set your environment variables and pull content.

* If you are running in an unregistered environment like Konflux is, you can register a container with activation key and then set variables and pull content.

* In addition if you want to run your own client-cert protected yum repo, my code also works for that.
It also introduces no new dependencies on yum, dnf or subscription manager and so is insulated from changes there. The client certificate scheme hasn't changed in over a decade and should be quite stable.

Which of ^these use cases does the following not comply with, rendering it less versatile?

this whole RHSM registration should IMO be baked into cachi2 rpm-dnf backend rather than in the tekton task

This changes the container build to use UBI9 so that it is supportable by a major user (Red Hat) with subscription-enabled repositories. The change requires using createrepo_c from PyPI since the createrepo_c rpm is not distributed as part of the UBI9 content set and it is desirable to keep this image freely redistributable. Changing to UBI keeps maintenance to a minimum (just one image flavor), but in the future multiple images could be maintained if required. The subscription-manager package is included to support konflux-ci/build-definitions#1205 and containerbuildsystem#580, where it will be used to obtain TLS certificates for authenticating to private repositories. Signed-off-by: Brian Cook <bcook@redhat.com>

task/prefetch-dependencies/0.1/prefetch-dependencies.yaml


brianwcook · 2024-11-02T19:00:01Z

/ok-to-test

brianwcook · 2024-11-04T01:32:54Z

/retest

brianwcook · 2024-11-04T14:56:53Z

@eskultety I wrote some simple tests to ensure that the input manipulation here was working as intended and they are here (https://github.com/brianwcook/cachi2-input-stdz). At some point I think it should become a part of tests for the task but those are not actually possible yet, so just an FYI for now.

brianwcook · 2024-11-04T21:58:47Z

/ok-to-test

eskultety

@brianwcook Terribly sorry to have gone so long without a review, it has now become my top-most priority to help you finish the work.

task/prefetch-dependencies-oci-ta/0.1/prefetch-dependencies-oci-ta.yaml

eskultety · 2024-11-18T14:38:55Z

task/prefetch-dependencies-oci-ta/0.1/prefetch-dependencies-oci-ta.yaml

+        mkdir -p /shared/rhsm/consumer
+
+        if [ -e /activation-key/org ]; then
+          cp -r --preserve=mode "$ACTIVATION_KEY_PATH" /tmp/activation-key


Why do we need to copy the data? Why not read it directly from the volume?

It could be possible to read it directly from the volume but I tried to keep this code in sync with the buildah task which does a copy.

eskultety · 2024-11-18T14:45:51Z

task/prefetch-dependencies-oci-ta/0.1/prefetch-dependencies-oci-ta.yaml

+
+          echo "Registering with Red Hat subscription manager."
+          subscription-manager register --org "$(cat /tmp/activation-key/org)" --activationkey "$(cat /tmp/activation-key/activationkey)"
+
+          # copy generated certificates to /shared/rhsm
+          cp /etc/pki/entitlement/*.pem /shared/rhsm/entitlement/
+          cp /etc/pki/consumer/*.pem /shared/rhsm/consumer/
+
+          file="$(find /shared/rhsm/entitlement -regextype egrep -regex '.*[0-9]+\.pem' -printf %f)"
+          echo "file: $file"
+          basename "$file" .pem >/shared/RHSM_ID
+          echo "./RHSM_ID:"
+          cat /shared/RHSM_ID
+
+          # trust the CA used for Red Hat CDN
+          cp /etc/rhsm-host/ca/redhat-uep.pem /shared/rhsm/redhat-uep.pem
+        fi
+    - name: preprocess-input
+      image: quay.io/redhat-appstudio/cachi2@sha256:eb34cfe3fea20997eebd8164dc93eedb2fd7a60dc1fb4afcc1b1ff43df9d6667
+      args:
+        - $(params.input)
+      env:
+        - name: INPUT
+          value: $(params.input)
+        - name: ACTIVATION_KEY
+          value: $(params.ACTIVATION_KEY)
+      script: |
+        #!/bin/python3
+        import json
+        import os
+        import sys
+


Why do we need several steps to handle this, especially since it all runs within a context of the cachi2 container? Now, I'm a big adversary of spaghetti code, but in this case it's all related to cachi2 deps prefetch (just wrapped by an optional subman register/unregister work), do we need to handle it this way in separate steps? Maybe I'm missing something, but if this were a single bigger script you wouldn't need the shared volume, would you? You'd also make it more readable and less opaque that way IMO.
If I'm not mistaken in my thought process, then you wouldn't need this inline Python at all and it could be replaced with more straightforward Bash, I think.

Actually the reason I separated it into more steps is so I avoid implementation in Bash. To me using Python seemed like a much better choice. I also tested the Python part with unit tests that I hope will one day accompany this task to prevent regressions.

If this should be implemented in Bash it will need to be done by someone else. I don't think I have the Bash skills to do it.

eskultety · 2024-11-19T09:22:10Z

task/prefetch-dependencies-oci-ta/0.1/prefetch-dependencies-oci-ta.yaml

+        fi
+
+        echo "false" >/shared/registered
+        ACTIVATION_KEY_PATH="/activation-key"


Noob question - how is the ACTIVATION_KEY parameter even used? It doesn't seem to be, since you're hardcoding the volume path here. What if someone changes the value, how do we probe the right file system location? I'm clearly missing something here, which is confusing me.

The path is hardcoded but the secret that is populated to that path has a default value of 'activation-key' in the task parameter declarations and can be overridden in the params passed to task invocation. It is the secret name the user needs to be concerned with, the path is an internal implementation detail.
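To make the split between "secret name" and "mount path" concrete, here is a small Python sketch. It assumes the secret is mounted at a fixed directory with the two keys `org` and `activationkey` as files; the helper name `read_activation_key` is hypothetical and not part of the task.

```python
from pathlib import Path
from typing import Optional, Tuple


def read_activation_key(mount: str = "/activation-key") -> Optional[Tuple[str, str]]:
    """Return (org, activation_key) from the mounted secret, or None if absent.

    The caller never needs to know which secret was mounted here; the
    ACTIVATION_KEY param only selects which secret populates this path.
    """
    base = Path(mount)
    org_file = base / "org"
    key_file = base / "activationkey"
    # No secret provided: skip registration entirely.
    if not org_file.is_file() or not key_file.is_file():
        return None
    return org_file.read_text().strip(), key_file.read_text().strip()
```

Returning `None` when the files are missing mirrors the `if [ -e /activation-key/org ]` guard in the diff: registration only happens when a key is actually present.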

eskultety · 2024-11-19T09:25:55Z

task/prefetch-dependencies/0.1/prefetch-dependencies.yaml

+        subscription-manager register --org "$(cat /tmp/activation-key/org)" --activationkey "$(cat /tmp/activation-key/activationkey)"
+
+        # copy generated certificates to /shared/rhsm
+        cp /etc/pki/entitlement/*.pem /shared/rhsm/entitlement/


Related to my earlier comment on not splitting this among different steps, I'm confused why we'd want a shared volume full of these default pre-populated subman system locations with data instead of handling this within the same context and referencing only these standard and expected system locations instead of shared volumes.

I suspect that this bit of code just evolved this way but I tried to keep it consistent between the buildah task and this one. If we want to try to remove the copying and just use the volume directly I would prefer to handle it in a followup PR in both this task and the buildah task. Side note: this could be a candidate for use with future stepActions.

I suspect that this bit of code just evolved this way but I tried to keep it consistent between the buildah task and this one.

Is this: https://github.com/konflux-ci/build-definitions/blob/main/task/buildah-oci-ta/0.2/buildah-oci-ta.yaml#L436-L462 the bit of buildah task work you're referring to?

@brianwcook I'd like an answer on ^this question as that stopped me from expanding our discussion in this thread further and I think it's paramount to the rest of this PR.

The general pattern even predates that section - this bit existed before activation keys were supported at all. There was some issue with trying to mount directly but I cannot recall what it was.

I also think that this code can be consolidated into a stepAction in the future and to do that we 1) need to keep the data on a shared emptyDir as it is now and 2) would benefit from keeping the code in sync as best we can between prefetch and buildah tasks.

task/prefetch-dependencies/0.1/prefetch-dependencies.yaml

pipelines/docker-build-multi-platform-oci-ta/README.md

eskultety · 2024-11-19T09:31:21Z

pipelines/docker-build-multi-platform-oci-ta/README.md

@@ -135,6 +135,7 @@ This pipeline is pushed as a Tekton bundle to [quay.io](https://quay.io/reposito
 ### prefetch-dependencies-oci-ta:0.1 task parameters
 |name|description|default value|already set by|
 |---|---|---|---|
+|ACTIVATION_KEY| Name of secret which contains subscription activation key| activation-key| |


Commit message:

s/THe/The

THe input is modified, injecting the entitlement certs

How is the input modified? Yes, we copy stuff to a shared volume, most importantly the certs that get passed to cachi2 to use for TLS auth, but other than that, how is the user input modified?

added example of how the input is modified.

eskultety · 2024-11-20T11:16:23Z

task/prefetch-dependencies-oci-ta/0.1/prefetch-dependencies-oci-ta.yaml

+          file="$(find /shared/rhsm/entitlement -regextype egrep -regex '.*[0-9]+\.pem' -printf %f)"
+          echo "file: $file"
+          basename "$file" .pem >/shared/RHSM_ID
+          echo "./RHSM_ID:"
+          cat /shared/RHSM_ID


This is IMO not needed at all, IIUC you only added it to report the RHSM_ID in the logs, but I wonder what value that brings compared to reporting the TLS auth credentials as part of cachi2's own debug logging system.

Lines 208, 210 and 211 are there for visibility; we could remove them, but that would make any necessary debugging harder. Lines 207 and 209 are crucial functionality.

The "RHSM_ID" is not known before subscription-manager register is run.
That is why the code has to parse this value from the file names. Those files also don't exist before subscription-manager register is run. What is happening (and there is no alternative) is this:

1. The user provides org id and activation key values in a Kube secret.
2. The code runs subscription-manager register --org [org] --activationkey [key].
3. Two files are generated during registration, [rhsm_id].pem and [rhsm_id]-key.pem.
4. The prefetch input has to be modified on the fly, adding certificate options with those file names for all the occurrences of the rpm package manager in the prefetch input, even if all the user passed as input is rpm.

We can extract the file names on demand right before executing cachi2. The intermediate RHSM_ID handling is just opaque and makes it visibly ~~hard~~ harder to read, so why complicate things? :)

You cannot extract the names only once, modify the input in a separate step, and omit passing the names between steps.

Passing data between steps by writing to an emptyDir is a common pattern to save Tekton results space. It is not unexpected in this context or abnormally complicated. If you want to avoid it for some reason then we have to extract the names from the files twice. In any case, I don't think the difference between the two approaches is worth the time spent discussing it and I find parsing the name once to be the simpler way (hence the way I went).
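For readers unfamiliar with the pattern, here is a minimal Python sketch of the emptyDir handoff, written under assumptions: the helper names `write_rhsm_id` and `read_rhsm_id` are hypothetical, and the regex simply mirrors the `find ... -regex '.*[0-9]+\.pem'` plus `basename ... .pem` lines from the diff. One step parses the ID once and writes it to a shared file; a later step reads it back.

```python
import re
from pathlib import Path

# Mirrors the find regex in the diff: matches "<digits>.pem" but not
# "<digits>-key.pem", because digits must directly precede ".pem".
CERT_NAME = re.compile(r".*[0-9]+\.pem")


def write_rhsm_id(entitlement_dir: str, id_file: str) -> str:
    """Step 1: derive the ID from the certificate file name and persist it."""
    names = [p.name for p in Path(entitlement_dir).iterdir()
             if CERT_NAME.fullmatch(p.name)]
    if len(names) != 1:
        raise RuntimeError(f"expected exactly one entitlement cert, found {names}")
    rhsm_id = names[0][: -len(".pem")]  # same effect as `basename "$file" .pem`
    Path(id_file).write_text(rhsm_id + "\n")
    return rhsm_id


def read_rhsm_id(id_file: str) -> str:
    """Step 2 (a later step): read the ID back from the shared volume."""
    return Path(id_file).read_text().strip()
```

The trade-off discussed in this thread is visible here: the shared file costs one extra write and read, but the file-name parsing lives in exactly one place.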

eskultety · 2024-11-20T11:18:13Z

task/prefetch-dependencies-oci-ta/0.1/prefetch-dependencies-oci-ta.yaml

+            cert = ("/shared/rhsm/entitlement/%s.pem" % rhsm_id)
+            key = ("/shared/rhsm/entitlement/%s-key.pem" % rhsm_id)


Your code already assumes there's only ever going to be a single entitlement key-cert pair of files, which is a reasonable assumption ATM, so you don't really need this RHSM_ID filename construction logic. One file is going to be named xyz-key.pem, the other xyz.pem; no RHSM involvement is needed at all.

The construction is necessary. Even though the code limits the user to one set of certificates, the certificate names are not known until subscription-manager register is run.

The construction is necessary. Even though the code limits the user to one set of certificates, the certificate names are not known until subscription-manager register is run.

It's true that you don't know the actual names, but I don't believe the construction really is necessary. Regardless of whether this is Bash or Python, you can extract the right names dynamically without knowing RHSM_ID at the time of use, because you have the most important bit of information - the files are named the same sans the -key suffix.
What I'm trying to say is, we should only be using and passing around the data that we absolutely need instead of generating/passing around data we don't.
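A rough Python sketch of that alternative, assuming the entitlement directory holds exactly one xyz.pem / xyz-key.pem pair; the function name `find_entitlement_pair` is hypothetical. It derives both paths from the directory listing alone, with no intermediate ID.

```python
from pathlib import Path
from typing import Tuple

KEY_SUFFIX = "-key.pem"


def find_entitlement_pair(entitlement_dir: str) -> Tuple[str, str]:
    """Return (cert_path, key_path) using only the shared naming convention."""
    keys = sorted(Path(entitlement_dir).glob("*" + KEY_SUFFIX))
    if len(keys) != 1:
        raise RuntimeError(f"expected exactly one key file, found {len(keys)}")
    key = keys[0]
    # The certificate has the same name minus the "-key" suffix.
    cert = key.with_name(key.name[: -len(KEY_SUFFIX)] + ".pem")
    if not cert.is_file():
        raise FileNotFoundError(cert)
    return str(cert), str(key)
```

This trades the shared RHSM_ID file for a directory scan in whichever step needs the paths.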

...but the names of the keys need to be input into the python step so it can be injected into input JSON.

Why? IIUC the shared volume entitlement directory contains only 2 files, from which you can derive the intent of each.

If the user input is simply rpm and an activation key is presented, the input string has to be transformed, before being passed to cachi2, to:

[{"type": "rpm", "options": {"ssl": {"client_key": null, "client_cert": null, "ca_bundle": null, "verify": 1}}}]

And then the client_key and client_cert values need to be populated with the names of the pem files generated by subscription-manager register.
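The first half of that transformation (bare string to skeleton) could look roughly like the Python below. This is a sketch, not the task's actual script: `normalize_input` is a hypothetical name, and the handling of JSON objects and arrays is an assumption about which input shapes are accepted.

```python
import json
from typing import Any, Dict, List

# Skeleton taken from the example above; values are filled in later.
SSL_DEFAULTS = {"client_key": None, "client_cert": None,
                "ca_bundle": None, "verify": 1}


def normalize_input(raw: str) -> List[Dict[str, Any]]:
    """Turn 'rpm', a JSON object, or a JSON array into a list of entries,
    making sure every rpm entry carries an options.ssl skeleton."""
    text = raw.strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        parsed = {"type": text}  # bare package manager name such as "rpm"
    if isinstance(parsed, dict):
        entries = [parsed]
    elif isinstance(parsed, list):
        entries = parsed
    else:
        raise ValueError(f"unsupported prefetch input: {raw!r}")
    for entry in entries:
        if entry.get("type") == "rpm":
            ssl = entry.setdefault("options", {}).setdefault("ssl", {})
            for name, default in SSL_DEFAULTS.items():
                ssl.setdefault(name, default)  # keep any user-supplied value
    return entries
```

With this sketch, `json.dumps(normalize_input("rpm"))` produces the same JSON shown above, and non-rpm entries pass through untouched.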

brianwcook · 2024-12-03T23:46:16Z

/retest

brianwcook · 2024-12-04T14:37:44Z

/retest

task/prefetch-dependencies/0.1/prefetch-dependencies.yaml

This adds steps to the prefetch task to detect when a Red Hat subscription activation key is provided. When prefetch is configured for the RPM package manager and an activation key is provided, the pod will be registered with Red Hat's subscription management service so that protected content can be fetched.

The activation key is provided via the param ACTIVATION_KEY. This is expected to be the name of a secret with two keys: org and activationkey. For more information see https://access.redhat.com/solutions/3341191.

The task modifies the prefetch input on the fly in order to inject the necessary entitlement files used for mTLS auth. For example, for simple input like 'rpm', the input will first be transformed to:

[ { "type": "rpm", "options": { "ssl": { "client_key": null, "client_cert": null, "ca_bundle": null, "verify": 1 } } } ]

After this the entitlement certificate information will be added to ALL instances of the rpm package manager present (in case the input is a JSON array). After prefetch the container is unregistered.

Signed-off-by: Brian Cook <bcook@redhat.com>
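The "added to ALL instances" step from this commit message can be sketched in Python as follows. The function name `inject_entitlement` and the optional `ca_bundle` argument are hypothetical; the paths in the usage note come from the diff hunks in this PR, with a made-up ID of 123 for illustration.

```python
from typing import Any, Dict, List, Optional


def inject_entitlement(
    entries: List[Dict[str, Any]],
    cert: str,
    key: str,
    ca_bundle: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Set the client cert/key on every rpm entry; leave other entries alone."""
    for entry in entries:
        if entry.get("type") != "rpm":
            continue
        ssl = entry.setdefault("options", {}).setdefault("ssl", {})
        ssl["client_cert"] = cert
        ssl["client_key"] = key
        if ca_bundle is not None:
            ssl["ca_bundle"] = ca_bundle
    return entries
```

For example, calling it with `/shared/rhsm/entitlement/123.pem`, `/shared/rhsm/entitlement/123-key.pem` and `/shared/rhsm/redhat-uep.pem` would populate every rpm entry in a mixed array while a `gomod` entry in the same array stays unchanged.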

brianwcook requested review from Tojaj, brunoapimentel and chmeliik July 26, 2024 17:33

brianwcook marked this pull request as draft July 26, 2024 17:33

eskultety reviewed Jul 29, 2024

brunoapimentel mentioned this pull request Aug 6, 2024

Use UBI9 base image for container build containerbuildsystem/cachi2#586

Merged

1 task

eskultety reviewed Aug 8, 2024

openshift-merge-robot added the needs-rebase label Aug 23, 2024

brianwcook force-pushed the prefetch-task-rhsm-integration branch from 22f5975 to cd8cad9 Compare October 29, 2024 21:18

openshift-merge-robot removed the needs-rebase label Oct 29, 2024

brianwcook force-pushed the prefetch-task-rhsm-integration branch from 1e0fb3d to 64ae5f9 Compare October 30, 2024 02:53

openshift-merge-robot added the needs-rebase label Nov 1, 2024

brianwcook force-pushed the prefetch-task-rhsm-integration branch from e032884 to 78a571d Compare November 1, 2024 20:30

openshift-merge-robot removed the needs-rebase label Nov 1, 2024

brianwcook force-pushed the prefetch-task-rhsm-integration branch 4 times, most recently from 46f4e82 to 95e2128 Compare November 2, 2024 02:40

brianwcook marked this pull request as ready for review November 2, 2024 02:42

openshift-ci bot requested a review from mkosiarc November 2, 2024 02:42

brianwcook force-pushed the prefetch-task-rhsm-integration branch from b6881e4 to e652f28 Compare November 4, 2024 13:12

brianwcook requested a review from eskultety November 4, 2024 14:40

brianwcook force-pushed the prefetch-task-rhsm-integration branch 2 times, most recently from 74ebe1e to a55a347 Compare November 4, 2024 21:58

brianwcook force-pushed the prefetch-task-rhsm-integration branch 2 times, most recently from 716753f to 2b126d8 Compare November 5, 2024 12:58

rhmdnd mentioned this pull request Nov 5, 2024

Enable hermetic pipeline builds openshift/file-integrity-operator#598

Open

brianwcook force-pushed the prefetch-task-rhsm-integration branch from 2b126d8 to e0002e2 Compare November 7, 2024 03:10

brianwcook force-pushed the prefetch-task-rhsm-integration branch from e0002e2 to 04d5e91 Compare November 15, 2024 17:14

eskultety reviewed Nov 19, 2024

eskultety reviewed Nov 20, 2024

openshift-merge-robot added the needs-rebase label Nov 21, 2024

brianwcook force-pushed the prefetch-task-rhsm-integration branch from 04d5e91 to cd9a6c7 Compare November 21, 2024 16:54

openshift-merge-robot removed the needs-rebase label Nov 21, 2024

brianwcook force-pushed the prefetch-task-rhsm-integration branch 2 times, most recently from b299f95 to 1e21f62 Compare November 21, 2024 17:31

brianwcook force-pushed the prefetch-task-rhsm-integration branch from 1e21f62 to 90749c1 Compare December 3, 2024 13:29

brianwcook force-pushed the prefetch-task-rhsm-integration branch 3 times, most recently from 840e480 to 46dcf0a Compare December 4, 2024 20:40

MartinBasti reviewed Dec 4, 2024

brianwcook force-pushed the prefetch-task-rhsm-integration branch from 46dcf0a to d9579c2 Compare December 4, 2024 21:35

MartinBasti approved these changes Dec 5, 2024

MartinBasti added this pull request to the merge queue Dec 5, 2024

Merged via the queue into konflux-ci:main with commit 5f62fe3 Dec 5, 2024
16 checks passed

rh-tap-build-team bot mentioned this pull request Dec 5, 2024

build-definitions update redhat-appstudio/infra-deployments#5065

Closed
