policy-server seems to pull cached OCI artifacts regardless of hot cache #883

Open
viccuad opened this issue Aug 21, 2024 · 6 comments

@viccuad
Member

viccuad commented Aug 21, 2024

Steps to reproduce:

  • Deploy Kubewarden with Audit Scanner enabled and configured to run every 2 minutes.
  • Deploy the verify-image-signatures policy, configured to verify Application Collection images following the instructions in kubewarden/docs#443.

The PolicyServer still seems to query the OCI registry instead of reading from its cache when calling:
https://github.com/kubewarden/policy-evaluator/blob/3cd66b932b199037e677e3e204d4d9742e23edc8/src/callback_handler/sigstore_verification.rs#L251-L266

Acceptance criteria

  • Verify that the policy-server cache for context-aware calls is correctly configured.
  • Configure the cache in policy-evaluator as needed.
  • Add tests as needed.

@viccuad viccuad changed the title policy-server cache seems to pull cached OCI artifacts every 2 minutes policy-server seems to pull cached OCI artifacts regardless of hot cache Aug 21, 2024
@viccuad viccuad added this to the 1.17 milestone Sep 12, 2024
@flavio flavio moved this to Todo in Kubewarden Sep 20, 2024
@flavio flavio self-assigned this Sep 24, 2024
@flavio
Member

flavio commented Sep 24, 2024

I've tested the code. Everything is working as described:

  • the results obtained from the registry are cached for 1 minute
  • only successful results are cached

If a container image is not signed, fetching its signature fails, and failures are not cached. Hence, whenever a workload uses an unsigned image, we keep reaching out to the remote registry until a signature blob is found.
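The behaviour described above (a 1-minute expiration, with only successful results cached) can be sketched roughly as follows. This is a minimal illustration, not the actual policy-evaluator code: `TtlCache` and `get_or_fetch` are hypothetical names, and the current time is passed in explicitly to keep the example deterministic.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical TTL cache that stores only successful lookups.
pub struct TtlCache {
    ttl: Duration,
    // key -> (time the value was stored, cached value)
    entries: HashMap<String, (Instant, String)>,
}

impl TtlCache {
    pub fn new(ttl: Duration) -> Self {
        Self {
            ttl,
            entries: HashMap::new(),
        }
    }

    /// Returns the cached value for `key` if it is still fresh at `now`.
    /// Otherwise calls `fetch` and caches the result only when it is `Ok`.
    pub fn get_or_fetch<F>(&mut self, key: &str, now: Instant, fetch: F) -> Result<String, String>
    where
        F: FnOnce() -> Result<String, String>,
    {
        if let Some((stored_at, value)) = self.entries.get(key) {
            if now.saturating_duration_since(*stored_at) < self.ttl {
                return Ok(value.clone());
            }
        }
        // An error is returned to the caller here and nothing is stored,
        // so the next lookup for the same key calls `fetch` again.
        let value = fetch()?;
        self.entries.insert(key.to_string(), (now, value.clone()));
        Ok(value)
    }
}
```

With this shape, a lookup for an unsigned image returns an error on every call and never populates the cache, which matches the repeated registry traffic described above.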

In the setup described above, the audit-scanner performs an assessment every 2 minutes, which is longer than the 1-minute cache lifetime. That means the cache is always empty when a scan starts. Within a single scan, however, if multiple workloads use the same image, the remote registry is queried only once.
Keep in mind that the cache is specific to each policy-server instance. When running multiple policy-server instances, each one reaches out to the registry for the same image, but only once per instance.
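The interaction between the scan interval and the cache lifetime can be checked with a small simulation. This is a sketch under simplifying assumptions: a single policy-server instance, a single signed image, and all lookups of one scan happening at the same instant. The function name `registry_fetches` is made up for illustration.

```rust
/// Counts registry fetches for one image across periodic scans.
/// Times are in seconds; every workload in a scan is assumed to be
/// evaluated at the instant the scan starts.
pub fn registry_fetches(
    ttl_secs: u64,
    scan_interval_secs: u64,
    scans: u64,
    workloads_per_scan: u64,
) -> u64 {
    let mut fetches = 0;
    // Time at which the cached entry was stored, if any.
    let mut cached_at: Option<u64> = None;

    for scan in 0..scans {
        let now = scan * scan_interval_secs;
        for _ in 0..workloads_per_scan {
            let fresh = matches!(cached_at, Some(t) if now - t < ttl_secs);
            if !fresh {
                // Cache miss: go to the registry and store the result.
                fetches += 1;
                cached_at = Some(now);
            }
        }
    }
    fetches
}
```

Under these assumptions, `registry_fetches(60, 120, 5, 10)` yields 5 (one fetch per scan, regardless of the 10 workloads), while a 10-minute lifetime, `registry_fetches(600, 120, 5, 10)`, yields 1. With several policy-server instances, each instance would incur these fetches independently.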

We could provide a configuration knob that sets the cache expiration time.

@recena: do you have any opinion? I know the potential bug was reported by you.

@flavio
Member

flavio commented Sep 24, 2024

Moving to blocked, waiting for feedback

@flavio flavio moved this from Todo to Blocked in Kubewarden Sep 24, 2024
@recena

recena commented Sep 24, 2024

I'm not sure if I understand the scenario, but:

  1. We should cache valid (signed) images
  2. A 1-minute TTL is too short

@flavio flavio removed this from the 1.17 milestone Sep 25, 2024
@flavio
Member

flavio commented Oct 3, 2024

We do cache the valid images, but cache entries expire after 1 minute. That's because someone might overwrite a tag in the meantime.

For example:

  • Time 10:00:00: we verify registry.local/nginx:1_alpine, we fetch data about it from the registry. This information is cached for X minutes
  • Time 10:02:00: someone overwrites registry.local/nginx:1_alpine
  • Time 10:05:00: we have to verify registry.local/nginx:1_alpine again. If the cache entry has expired, everything is fine. Otherwise we reuse the details of the original image from 10:00:00, which no longer match what the tag points to

Right now we're conservative, being a security project, and we let the cache expire after 1 minute.
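The timeline above can be replayed in a few lines. This is only an illustrative sketch: the registry is a plain in-memory map, times are seconds counted from 10:00:00, and the digests `sha256:original` and `sha256:overwritten` as well as all function names are placeholders.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a registry: maps a tag to the digest it currently points to.
pub type Registry = HashMap<String, String>;

/// A cached tag resolution and the time (in seconds) at which it was fetched.
pub struct CachedDigest {
    pub digest: String,
    pub fetched_at: u64,
}

/// Resolves `tag` at time `now`, reusing the cached digest while it is younger than `ttl`.
pub fn resolve(
    registry: &Registry,
    cache: &mut HashMap<String, CachedDigest>,
    tag: &str,
    now: u64,
    ttl: u64,
) -> Option<String> {
    if let Some(entry) = cache.get(tag) {
        if now.saturating_sub(entry.fetched_at) < ttl {
            return Some(entry.digest.clone());
        }
    }
    // Cache miss or expired entry: ask the registry and refresh the cache.
    let digest = registry.get(tag)?.clone();
    cache.insert(
        tag.to_string(),
        CachedDigest { digest: digest.clone(), fetched_at: now },
    );
    Some(digest)
}

/// Replays the timeline and returns the digest seen at 10:05:00 for a given `ttl` in seconds.
pub fn digest_seen_after_overwrite(ttl: u64) -> Option<String> {
    let tag = "registry.local/nginx:1_alpine";
    let mut registry = Registry::new();
    let mut cache = HashMap::new();

    // 10:00:00: first verification, the result is cached.
    registry.insert(tag.to_string(), "sha256:original".to_string());
    let _ = resolve(&registry, &mut cache, tag, 0, ttl);

    // 10:02:00: someone overwrites the tag.
    registry.insert(tag.to_string(), "sha256:overwritten".to_string());

    // 10:05:00: second verification.
    resolve(&registry, &mut cache, tag, 300, ttl)
}
```

In this sketch, `digest_seen_after_overwrite(60)` returns the overwritten digest, while `digest_seen_after_overwrite(600)` still returns the original one, which is the stale-data case the short expiration is meant to avoid.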

I think we should allow the user to configure the cache expiration time. That way the user can pick a value that balances the two concerns: querying the registry too often versus serving stale data.

@flavio flavio moved this from Blocked to Todo in Kubewarden Oct 4, 2024
@flavio flavio added this to the 1.18 milestone Oct 4, 2024
@flavio
Member

flavio commented Oct 4, 2024

We're going to refine this card as part of 1.18, and work on this improvement during 1.19.

I would like to come up with a solution that allows the policy to configure the caching interval, so that the Kubernetes admin can set a value they are comfortable with.

@kkaempf kkaempf modified the milestones: 1.18, 1.19 Oct 24, 2024
@flavio
Member

flavio commented Oct 24, 2024

I propose defining a new host capability for caching. The new capability will allow the policy author to cache arbitrary data.

We will then update the verify-image-signatures policy to allow the Kubernetes admin to define the expiration criteria of the cached data.

Steps required to solve this issue:

  • Write an RFC about the caching host capability
  • Implement the caching host capability inside policy-evaluator, and propagate the change to policy-server and kwctl
  • Adapt the verify-image-signatures policy to make use of this new capability
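To make the proposal more concrete, here is one possible shape for such a capability. No RFC exists yet at this point in the thread, so everything below is an assumption: the `CacheCapability` trait, its `set`/`get` methods and the `InMemoryCache` type are invented for illustration and are not part of any existing Kubewarden API.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical interface for a caching host capability.
/// The caller chooses the expiration of each entry.
pub trait CacheCapability {
    /// Stores arbitrary bytes under `key`, valid for `ttl` starting at `now`.
    fn set(&mut self, key: &str, value: Vec<u8>, ttl: Duration, now: Instant);
    /// Returns the stored bytes if the entry exists and has not expired at `now`.
    fn get(&self, key: &str, now: Instant) -> Option<Vec<u8>>;
}

/// Simple in-memory implementation used only to illustrate the interface.
pub struct InMemoryCache {
    // key -> (expiration instant, stored bytes)
    entries: HashMap<String, (Instant, Vec<u8>)>,
}

impl InMemoryCache {
    pub fn new() -> Self {
        Self {
            entries: HashMap::new(),
        }
    }
}

impl CacheCapability for InMemoryCache {
    fn set(&mut self, key: &str, value: Vec<u8>, ttl: Duration, now: Instant) {
        self.entries.insert(key.to_string(), (now + ttl, value));
    }

    fn get(&self, key: &str, now: Instant) -> Option<Vec<u8>> {
        match self.entries.get(key) {
            Some((expires_at, value)) if now < *expires_at => Some(value.clone()),
            _ => None,
        }
    }
}
```

With an interface along these lines, the verify-image-signatures policy could read the expiration from its settings and pass it as `ttl`, leaving the trade-off between registry traffic and stale data to the Kubernetes admin.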

@flavio flavio moved this from Todo to Pending review in Kubewarden Oct 24, 2024
@flavio flavio removed the status in Kubewarden Oct 25, 2024
@flavio flavio removed this from the 1.19 milestone Nov 4, 2024