Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prefetch-task-rhsm-integration #1205

Merged

Conversation

brianwcook
Copy link
Contributor

@brianwcook brianwcook commented Jul 26, 2024

update: this PR has been reworked to use the new ssl options sub-key introduced into cachi2's RPM package manager.

This PR causes the prefetch task to react to the same ACTIVATION_KEY parameter that is used for non-hermetic builds. The container will use the pipeline-provided activation key to register with Red Hat's subscription manager, container and set the proper environment variables and augment the Cachi2 input to use the generated entitlement certificates before executing Cachi2.

The following points are pertinent:

  • the RHSM (Red Hat subscription management) files generated by subscription-manager register are generated by this task (by running subscription-manager register, immediately before running cachi2
  • they are never used again

Therefore this implementation is safe from certificate revocation / rotation behavior of RHSM.

@brianwcook brianwcook marked this pull request as draft July 26, 2024 17:33
Comment on lines 221 to 335
subscription-manager register \
--org $(cat "/activation-key/orgid") \
--activationkey $(cat "/activation-key/activationkey")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the user running inside the container privileged?

Error: this command requires root access to execute

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did you get that error? It is running properly already in Konflux using git ref pointing at my fork of build-defintions. The buildah task already has sufficient permissions with no modifications.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default UBI container.

Copy link
Contributor Author

@brianwcook brianwcook Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the buildah task is running as user 0 already so imo this is a non-issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked the buildah task, there's nothing going on in terms of user setting really (apart from ID mapping), so by default this runs as root. I guess the container itself isn't privileged, so while not ideal, I guess it's acceptable, so consider my initial comment retracted.

However, that makes things easier for cachi2 then and since we're assuming the default root user inside the container, this whole RHSM registration should IMO be baked into cachi2 rpm-dnf backend rather than in the tekton task for proper integration - we could also leverage dbus to communicate with RHSM for better error handling rather than dealing with shell.

@brianwcook
Copy link
Contributor Author

brianwcook commented Jul 30, 2024

IMHO the way I have implemented it is the most versatile option.

  • If your task is running on a static Jenkins or gitlab runner which is registered with RHSM,set your environment variables and pull content
  • If your task is running on Openshift with insights operator, set your environment variables and pull content.
  • If you are running in an unregistered environment like Konflux is, you can register a container with activation key and then set variables and pull content.
  • In addition if you want to run your own client-cert protected yum repo, my code also works for that.

It also introduces no new dependencies on yum, dnf or subscription manager and so is insulated from changes there. The client certificate scheme hasn't changed in over a decade and should be quite stable.

@eskultety
Copy link
Contributor

IMHO the way I have implemented it is the most versatile option.

* If your task is running on a static Jenkins or gitlab runner which is registered with RHSM,set your environment variables and pull content

* If your task is running on Openshift with insights operator, set your environment variables and pull content.

* If you are running in an unregistered environment like Konflux is, you can register a container with activation key and then set variables and pull content.

* In addition if you want to run your own client-cert protected yum repo, my code also works for that.

It also introduces no new dependencies on yum, dnf or subscription manager and so is insulated from changes there. The client certificate scheme hasn't changed in over a decade and should be quite stable.

Which of ^these use cases does the following not comply with, rendering it less versatile?

this whole RHSM registration should IMO be baked into cachi2 rpm-dnf backend rather than in the tekton task

brianwcook added a commit to brianwcook/cachi2 that referenced this pull request Aug 6, 2024
This changes the container build to use UBI9 so that it is supportable
by a major user (Red Hat) with subscription enabled repositories. The
change requires using createrepo_c from PyPyi since the createrepo_c rpm
is not distributed as part of the UBI9 content set and it is desireable
to keep this image freely redistributable. Chaniging to UBI keeps
maintenance to a minimum (just one image flavor) but in the future
multiple images could be maintained if required.

The subscription-manager package is included to support
konflux-ci/build-definitions#1205 and
containerbuildsystem#580 where it will be
used to obtain TLS certificates to send to authenticate to private
repositories.

Signed-off-by: Brian Cook <bcook@redhat.com>
brianwcook added a commit to brianwcook/cachi2 that referenced this pull request Aug 6, 2024
This changes the container build to use UBI9 so that it is supportable
by a major user (Red Hat) with subscription enabled repositories. The
change requires using createrepo_c from PyPyi since the createrepo_c rpm
is not distributed as part of the UBI9 content set and it is desireable
to keep this image freely redistributable. Chaniging to UBI keeps
maintenance to a minimum (just one image flavor) but in the future
multiple images could be maintained if required.

The subscription-manager package is included to support
konflux-ci/build-definitions#1205 and
containerbuildsystem#580 where it will be
used to obtain TLS certificates to send to authenticate to private
repositories.

Signed-off-by: Brian Cook <bcook@redhat.com>
github-merge-queue bot pushed a commit to containerbuildsystem/cachi2 that referenced this pull request Aug 12, 2024
This changes the container build to use UBI9 so that it is supportable
by a major user (Red Hat) with subscription enabled repositories. The
change requires using createrepo_c from PyPyi since the createrepo_c rpm
is not distributed as part of the UBI9 content set and it is desireable
to keep this image freely redistributable. Chaniging to UBI keeps
maintenance to a minimum (just one image flavor) but in the future
multiple images could be maintained if required.

The subscription-manager package is included to support
konflux-ci/build-definitions#1205 and
#580 where it will be
used to obtain TLS certificates to send to authenticate to private
repositories.

Signed-off-by: Brian Cook <bcook@redhat.com>
@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch from 22f5975 to cd8cad9 Compare October 29, 2024 21:18
@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch from 1e0fb3d to 64ae5f9 Compare October 30, 2024 02:53
@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch from e032884 to 78a571d Compare November 1, 2024 20:30
@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch 4 times, most recently from 46f4e82 to 95e2128 Compare November 2, 2024 02:40
@brianwcook brianwcook marked this pull request as ready for review November 2, 2024 02:42
@openshift-ci openshift-ci bot requested a review from mkosiarc November 2, 2024 02:42
@brianwcook
Copy link
Contributor Author

/ok-to-test

@brianwcook
Copy link
Contributor Author

/retest

@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch from b6881e4 to e652f28 Compare November 4, 2024 13:12
@brianwcook brianwcook requested a review from eskultety November 4, 2024 14:40
@brianwcook
Copy link
Contributor Author

@eskultety I wrote some simple tests to ensure that the input manipulation here was working as intended and they are here (https://github.com/brianwcook/cachi2-input-stdz). At some point I think it should become a part of tests for the task but those are not actually possible yet, so just an FYI for now.

@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch 2 times, most recently from 74ebe1e to a55a347 Compare November 4, 2024 21:58
@brianwcook
Copy link
Contributor Author

/ok-to-test

@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch 2 times, most recently from 716753f to 2b126d8 Compare November 5, 2024 12:58
@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch from 2b126d8 to e0002e2 Compare November 7, 2024 03:10
@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch from e0002e2 to 04d5e91 Compare November 15, 2024 17:14
Copy link
Contributor

@eskultety eskultety left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brianwcook Terribly sorry to have gone so long without a review, it has now become my top-most priority to help you finish the work.

mkdir -p /shared/rhsm/consumer

if [ -e /activation-key/org ]; then
cp -r --preserve=mode "$ACTIVATION_KEY_PATH" /tmp/activation-key
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to copy the data? Why not reading it directly from the volume

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be possible to read it directly from the volume but I tried to keep this code in sync with the buildah task which does a copy.

Comment on lines 199 to 230

echo "Registering with Red Hat subscription manager."
subscription-manager register --org "$(cat /tmp/activation-key/org)" --activationkey "$(cat /tmp/activation-key/activationkey)"

# copy generated certificates to /shared/rhsm
cp /etc/pki/entitlement/*.pem /shared/rhsm/entitlement/
cp /etc/pki/consumer/*.pem /shared/rhsm/consumer/

file="$(find /shared/rhsm/entitlement -regextype egrep -regex '.*[0-9]+\.pem' -printf %f)"
echo "file: $file"
basename "$file" .pem >/shared/RHSM_ID
echo "./RHSM_ID:"
cat /shared/RHSM_ID

# trust the CA used for Red Hat CDN
cp /etc/rhsm-host/ca/redhat-uep.pem /shared/rhsm/redhat-uep.pem
fi
- name: preprocess-input
image: quay.io/redhat-appstudio/cachi2@sha256:eb34cfe3fea20997eebd8164dc93eedb2fd7a60dc1fb4afcc1b1ff43df9d6667
args:
- $(params.input)
env:
- name: INPUT
value: $(params.input)
- name: ACTIVATION_KEY
value: $(params.ACTIVATION_KEY)
script: |
#!/bin/python3
import json
import os
import sys

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need several steps to handle this, especially since it all runs within a context of the cachi2 container? Now, I'm a big adversary of spaghetti code, but in this case it's all related to cachi2 deps prefetch (just wrapped by an optional subman register/unregister work), do we need to handle it this way in separate steps? Maybe I'm missing something, but if this were a single bigger script you wouldn't need the shared volume, would you? You'd also make it more readable and less opaque that way IMO.
If I'm not mistaken in my thought process then you wouldn't need this inline Python at all and could be replaced with a more straightforward Bash I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the reason I separated it into more steps is so I avoid implementation in Bash. To me using Python seemed like a much better choice. I also tested the Python part with unit tests that I hope will one day accompany this task to prevent regressions.

If this should be implemented in Bash it will need to be done by someone else. I don't think I have the Bash skills to do it.

fi

echo "false" >/shared/registered
ACTIVATION_KEY_PATH="/activation-key"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noob question - how is the ACTIVATION_KEY parameter even used, it doesn't seem to since you're hardcoding the volume path here? What if someone changes the value, how do we probe the right file system location? I'm clearly missing something here making me confused.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path is hardcoded but the secret that is populated to that path has a default value of 'activation-key' in the task parameter declarations and can be overridden in the params passed to task invocation. It is the secret name the user needs to be concerned with, the path is an internal implementation detail.

subscription-manager register --org "$(cat /tmp/activation-key/org)" --activationkey "$(cat /tmp/activation-key/activationkey)"

# copy generated certificates to /shared/rhsm
cp /etc/pki/entitlement/*.pem /shared/rhsm/entitlement/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to my earlier comment on not splitting this among different steps, I'm confused why we'd want a shared volume full of these default pre-populated subman system locations with data instead of handling this within the same context and referencing only these standard and expected system locations instead of shared volumes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that this bit of code just evolved this way but I tried to keep it consistent between the buildah task and this one. If we want to try to remove the copying and just use the volume directly I would prefer to handle it in a followup PR in both this task and the buildah task. Side note: this could be a candidate for use with future stepActions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that this bit of code just evolved this way but I tried to keep it consistent between the buildah task and this one.

Is this: https://github.com/konflux-ci/build-definitions/blob/main/task/buildah-oci-ta/0.2/buildah-oci-ta.yaml#L436-L462 the bit of buildah task work you're referring to?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brianwcook I'd like an answer on ^this question as that stopped me from expanding our discussion in this thread further and I think it's paramount to the rest of this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the general pattern even predates that section - this bit existed before activation keys were supported at all. There was some issue with trying to mount directly but I cannot recall what it was.

I also think that this code can be consolidated into a stepAction in the future and to do that we 1) need to keep the data on a shared emptyDir as it is now and 2) would benefit from keeping the code in sync as best we can between prefetch and buildah tasks.

@@ -135,6 +135,7 @@ This pipeline is pushed as a Tekton bundle to [quay.io](https://quay.io/reposito
### prefetch-dependencies-oci-ta:0.1 task parameters
|name|description|default value|already set by|
|---|---|---|---|
|ACTIVATION_KEY| Name of secret which contains subscription activation key| activation-key| |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message:

s/THe/The

THe input is modified, injecting the entitlement certs

How is the input modified? Yes, we copy stuff to a shared volume, most importantly the certs that get passed to cachi2 to use for TLS auth, but other than that, how is the user input modified?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added example of how the input is modified.

Comment on lines +207 to +211
file="$(find /shared/rhsm/entitlement -regextype egrep -regex '.*[0-9]+\.pem' -printf %f)"
echo "file: $file"
basename "$file" .pem >/shared/RHSM_ID
echo "./RHSM_ID:"
cat /shared/RHSM_ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is IMO not needed at all, IIUC you only added it to report the RHSM_ID in the logs, but I wonder what value that brings compared to reporting the TLS auth credentials as part of cachi2's own debug logging system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

208, 210 and 211 for visibility and we could remove them but it would make any necessary debugging harder. 207 and 209 are crucial functionality.

the "RHSM_ID" is not known before subsription-manager register is run.
That is why the code has to pase this value from the file names. Those files also don't exist before subscrip-manager-register is run. What is happening (and there is no alternative) is this

  1. user provides org id and activation key values in Kube secret
  2. code runs subscription-manager register --org [org] --key key
  3. Two files are generated during registration, [rhsm_id.pem] and `[rhsm_id].key
  4. prefetch input has to be modified on the fly adding certificate options with those file namesfor all the occurences of the rpm manager in the prefetch input - even if all the user passed as input is rpm.

Copy link
Contributor

@eskultety eskultety Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the "RHSM_ID" is not known before subsription-manager register is run. That is why the code has to pase this value from the file names. Those files also don't exist before subscrip-manager-register is run. What is happening (and there is no alternative) is this

1. user provides org id and activation key values in Kube secret

2. code runs subscription-manager register --org [org] --key key

3. Two files are generated during registration,  `[rhsm_id.pem]` and `[rhsm_id].key

4. prefetch input has to be modified on the fly adding certificate options with those file namesfor _all_ the occurences of the rpm manager in the prefetch input - even if all the user passed as input is `rpm`.

We can extract the file names on demand right before executing cachi2. The intermediate RHSM_ID handling is just opaque and makes it visibly hard harder to read, why complicating stuff? :)

Copy link
Contributor Author

@brianwcook brianwcook Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you cannot extract the names only once, modify the input in a separate task and omit passing the names between steps.

Passing data between steps by writing to an emptyDir is a common pattern to save Tekton results space. It is not unexpected in this context or abnormally complicated. If you want to avoid it for some reason then we have to extract the names from the files twice. In any case, I don't think the difference between the two approaches is worth the time spent discussing it and I find parsing the name once to be the simpler way (hence the way I went).

Comment on lines +249 to +250
cert = ("/shared/rhsm/entitlement/%s.pem" % rhsm_id)
key = ("/shared/rhsm/entitlement/%s-key.pem" % rhsm_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code already assumes there's only ever going to be a single set of entitlement key-cert pair of files which is a reasonable assumption ATM, so you don't really need this RHSM_ID filename construction logic, one file is going to be named xyz-key.pem the other xyz.pem no RHSM involvement needed at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The construction is necessary. Even though the code limits the user to one set of certificates, the certificate names are not known until subscription-manager register is run.

Copy link
Contributor

@eskultety eskultety Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The construction is necessary. Even though the code limits the user to one set of certificates, the certificate names are not known until subscription-manager register is run.

It's true that you don't know the actual names, but I don't believe the construction really is necessary, regardless of whether this is Bash or Python, you can extract the right names dynamically without knowing RHSM_ID at the time of use, because you have the most important bit of information - the files are named the same sans the -key suffix.
What I'm trying to say is, we should only be using and passing around data what we absolutely have to and need instead of generating/passing around data we don't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...but the names of the keys need to be input into the python step so it can be injected into input JSON.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? IIUC the shared volume entitlementdirectory contains only 2 files from which you can derive the intent of each of those.

Copy link
Contributor Author

@brianwcook brianwcook Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the user input is simply rpm and an activation key is presented, the input string has to be transformed before being passed to cachi2 to:

[{"type": "rpm", "options": {"ssl": {"client_key": null, "client_cert": null, "ca_bundle": null, "verify": 1}}}]

And then the client_key and client_cert values need to be populated with the name of the pem file generated by subscription-manager register

@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch from 04d5e91 to cd9a6c7 Compare November 21, 2024 16:54
@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch 2 times, most recently from b299f95 to 1e21f62 Compare November 21, 2024 17:31
@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch from 1e21f62 to 90749c1 Compare December 3, 2024 13:29
@brianwcook
Copy link
Contributor Author

/retest

1 similar comment
@brianwcook
Copy link
Contributor Author

/retest

@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch 3 times, most recently from 840e480 to 46dcf0a Compare December 4, 2024 20:40
This adds steps to the prefetch task to detect when a Red Hat
subscription activation key is provided. When prefetch is configured
for RPM package manager and an acivation key is provided, the pod will
be registered with Red Hat's subscription management service so that
protected content can be fetched.

The activation key is provided via the param ACTIVATION_KEY. This is
expected to be the name of a secret with two keys: org and
activationkey. For more information see https://access.redhat.com
solutions/3341191.

The task modifies the prefetch input on the fly in order to inject the
necessary entitlement files used for mTLS auth. For example, for simple
input like 'rpm', the input will first be transformed to:

[
  {
    "type": "rpm",
    "options": {
      "ssl": {
        "client_key": null,
        "client_cert": null,
        "ca_bundle": null,
        "verify": 1
      }
    }
  }
]

After this the entitelement certificate information will be added to
ALL instances of rpm package manager present (in case the input is a
JSON array.)

After prefetch the container is unregistered.

Signed-off-by: Brian Cook <bcook@redhat.com>
@brianwcook brianwcook force-pushed the prefetch-task-rhsm-integration branch from 46dcf0a to d9579c2 Compare December 4, 2024 21:35
@MartinBasti MartinBasti added this pull request to the merge queue Dec 5, 2024
Merged via the queue into konflux-ci:main with commit 5f62fe3 Dec 5, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants