Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run a custom script at renewal time #57

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

Moosemorals
Copy link

Some applications don't notice when their certificates are changed on disk after auto-renewal.

step ca renew has an --exec option to allow arbitary code to be run at renewal time, but the existing autocert-renewal container doesn't hook into that.

This pach updates autocert-controller to add a volumeMount to the renewal container if the user has created an autocert.step.sm/renewalVolume annotation (with a value of the name of the volume to mount).

It also updates autocert-renewal to check for a file at /renewer/renewal.sh and, if found, adds an --exec=/renewer/renewal.sh option to step ca renew.

It's a bit of a hack, I'm not entirely happy with it, but it does work, allowing users to signal the main application in the appropriate way after a renewal.

Feedback/advice/suggestions welcome/encouraged!

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
This is a first aproximation to a solution.

It's a bit brute force - mount a config map in the pod with a script to
run at renewal time.

I'll need to update controller to check for an attribute (that specifies
a script path) so the voulmeMount is only inserted if there's a good
chance that it's going to work.
Bring in upstream changes
If a pod is annotated with `autocert.step.sm/renewalVolume: script',
then controller will add a volume map to the named volume to the renewal
container.
@github-actions github-actions bot added the needs triage Waiting for discussion / prioritization by team label Jun 12, 2022
@dopey dopey requested a review from maraino June 15, 2022 17:16
@maraino
Copy link
Collaborator

maraino commented Jun 15, 2022

Hi @Moosemorals, it makes sense to be able to execute a script after creating or renewing a certificate; your implementation has good ideas, but I don't think it is the right approach.

As you mentioned, it will require shareProcessNamespace: true. This can be solved if the main pod has some inotify-based script, but this is not my main concern. A user might want also to change the bootstrapper image or perhaps add something more complex that is difficult to achieve with the --exec option.

Right now, you always have the option to change the images or even add volumes in the autocert config.yaml. However, these are global variables, and making it work on every deployment might not be trivial.

We'll have to figure out a way not just to add the --exec option, an easy way to mount a ConfigMap with one or more files, and execute those instead of just step ca renew .... As I said, config.yaml is an option, but it will be hard to generalize. A better option that comes on the top of my head (which might or might not work) is to label the ConfigMap with something special, so it gets automatically loaded for a specific pod/deployment on the bootstrapper or the renewer image. Then, a script automatically runs instead of the predefined one if it has a fixed name or, if needed, it has a label with the command to run.

We'll have to come up with a design on how to make this work, but I think it will provide more flexibility than this.

I'd like to hear from you, what do you think?

@areed @mmalone @jdoss @tashian Ideas?

@areed
Copy link
Contributor

areed commented Jun 16, 2022

I think we should solve the problem of restarting services after renewal in the simplest way possible for users.

I propose adding an annotation autocert.step.sm/term: nginx. When that is present:

  1. the admission controller sets shareProcessNamespace: true on the pod
  2. the renewal container runs kill $(pgrep "nginx") after renewal

@Moosemorals
Copy link
Author

I think that's probably a little too simple, I'd want to be able to at least choose the signal (eg, postgres will reload config on a SIGHUP, haproxy wants a SIGUSR2)

@areed
Copy link
Contributor

areed commented Jun 16, 2022

Would something like autocert.step.sm/kill/9: nginx work?

@Moosemorals
Copy link
Author

I'm new to this whole k8s thing so I don't know the culture/style, but I'd just add another attribute for signal name/number, unless there's some cost I'm missing?
(Although I'm still voting for running a script, as that gives most flexibility to users.)

I had a bit of a play with shared volumes, but the problem comes down to the renew container (and step ca renew specifically) needs to send some kind of message to the prime container at renew time. Once that message is received, then the prime container can do whatever it needs

Shared process space allows signals, which are easy at both ends.

Without that, then something like a shared volume and inotify should work but that would need more cooperation from the prime container (some services can be setup to listen on a unix socket for commands, but I think that's probably less common than signals, and the format for the 'reload config/certs' command will almost certainly be different per service) (also, do unix sockets work cross container?)

@maraino
Copy link
Collaborator

maraino commented Jun 16, 2022

@areed I kind of prefer something more flexible and mount a configmap with a special set of labels for different actions:

  1. For which pod I should mount this - mandatory.
  2. A fixed filename to replace completely the bootstrap script.
  3. A fixed name to replace the completely renew script.
  4. An action for --exec (or a set of environment variables, STEP_EXEC defines the flag --exec too.
  5. Pre-bootstrap-script, Pre-renew-script to run.
  6. Post-*-script to run.
  7. If we need to set shareProcessNamespace: true
    ...

Then both bootstrap and renewer scripts can be aware of all that.

@maraino
Copy link
Collaborator

maraino commented Jun 16, 2022

My proposal is perhaps too complex and for just --exec it might make more sense something simpler, but if we're mounting configmaps as this PR proposes, I think we can make it more flexible. but if it's just sending a signal I might accept it.

This is the set of related flags that step ca renew accepts:

--pid=value
    The process id to signal after the certificate has been renewed. By
    default the the SIGHUP (1) signal will be used, but this can be
    configured with the --signal flag.

--pid-file=file
    The file from which to read the process id that will be signaled after
    the certificate has been renewed. By default the the SIGHUP (1) signal
    will be used, but this can be configured with the --signal flag.

--signal=number
    The signal number to send to the selected PID, so it can reload the
    configuration and load the new certificate. Default value is SIGHUP (1)

--exec=command
    The command to run after the certificate has been renewed.

@Moosemorals
Copy link
Author

I had a look at --pid-file and --signal, and I can't see a way of using them that doesn't use both shared process and a shared volume.

@maraino
Copy link
Collaborator

maraino commented Jun 16, 2022

I had a look at --pid-file and --signal, and I can't see a way of using them that doesn't use both shared process and a shared volume.

AFAIK you will always need shareProcessNamespace: true.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@Moosemorals Moosemorals marked this pull request as draft October 25, 2022 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs triage Waiting for discussion / prioritization by team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants