🐛 Fix operator-controller not using updated registries configuration #1554
Working on tests
@@ -250,6 +302,11 @@ func (i *ContainersImageRegistry) unpackImage(ctx context.Context, unpackPath st
	if err != nil {
		return fmt.Errorf("error creating image source: %w", err)
	}
	defer func() {
		if err := layoutSrc.Close(); err != nil {
			panic(err)
just a question - imho panicking here seems a little excessive, but I see it in multiple places, so I think it's getting caught
Just following convention here :) Not sure it's the right way to handle it either but I just noticed the source close was missing so I wanted to catch that while I was in here.
maybe let's proceed with the convention for this PR, but create a discussion in upstream slack? I'm also a bit 🤔 about these panics. Maybe we should level set on a tenet for when to use it and why?
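For reference in that discussion, one common alternative to panicking in a deferred Close is to fold the close error into the function's return value. The sketch below is only an illustration of that pattern: `fakeSource`, `errCloseFailed`, and this `unpack` signature are hypothetical and not taken from the PR, and `errors.Join` assumes Go 1.20 or newer.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errCloseFailed is a hypothetical error used only for this illustration.
var errCloseFailed = errors.New("close failed")

// fakeSource is a hypothetical stand-in for an image source whose Close
// may fail. It is not the containers/image type.
type fakeSource struct{ closeErr error }

func (s *fakeSource) Close() error { return s.closeErr }

// unpack uses a named return value so the deferred Close can report its
// error to the caller instead of panicking. Any error from work is kept
// and the close error is joined onto it.
func unpack(src io.Closer, work func() error) (err error) {
	defer func() {
		if cerr := src.Close(); cerr != nil {
			err = errors.Join(err, fmt.Errorf("error closing image source: %w", cerr))
		}
	}()
	return work()
}

func main() {
	err := unpack(&fakeSource{closeErr: errCloseFailed}, func() error { return nil })
	fmt.Println("unpack returned:", err)
	fmt.Println("caller can still match the cause:", errors.Is(err, errCloseFailed))
}
```

Whether a failed Close should fail the whole unpack is exactly the policy question raised above; the pattern only makes the error visible to the caller.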
	if err != nil {
		continue
	} else if !t.Equal(i.RegistriesConfTimestamp) {
		// We've found the timestamped symlink and it's been updated
turns out there might be an issue with relying on created symlinks: if you use subPath to mount a specific file from ConfigMap, the symlinks are not created at all. See https://stackoverflow.com/a/50687707 and kubernetes/kubernetes#50345 for context.
Note: can't seem to have the powers to unresolve a previous discussion, so adding it here.
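A check that does not depend on the symlink layout could compare the modification time of the file itself, since `os.Stat` follows symlinks when they exist and works on a plain file when they do not. This is a minimal sketch under assumptions: `confChanged` and `demo` are hypothetical names, and it assumes the mounted file's modification time actually changes when the content is replaced, which is not verified here for subPath mounts.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// confChanged reports whether the modification time of path differs from
// last and returns the time it observed. os.Stat follows symlinks, so the
// check does not depend on how the file was projected into the container.
func confChanged(path string, last time.Time) (time.Time, bool, error) {
	info, err := os.Stat(path)
	if err != nil {
		return last, false, err
	}
	mt := info.ModTime()
	return mt, !mt.Equal(last), nil
}

// demo writes a throwaway registries.conf and checks it three times:
// on first sight, while unchanged, and after its mtime moves forward.
func demo() (bool, bool, bool, error) {
	dir, err := os.MkdirTemp("", "registries")
	if err != nil {
		return false, false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "registries.conf")
	if err := os.WriteFile(path, []byte("# empty\n"), 0o600); err != nil {
		return false, false, false, err
	}

	// A zero timestamp means "never read", so the first check reports a change.
	seen, first, err := confChanged(path, time.Time{})
	if err != nil {
		return false, false, false, err
	}
	_, again, err := confChanged(path, seen)
	if err != nil {
		return false, false, false, err
	}

	// Simulate an update by moving the modification time forward.
	later := seen.Add(2 * time.Second)
	if err := os.Chtimes(path, later, later); err != nil {
		return false, false, false, err
	}
	_, updated, err := confChanged(path, seen)
	if err != nil {
		return false, false, false, err
	}
	return first, again, updated, nil
}

func main() {
	fmt.Println(demo())
}
```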
Given that, an fsnotify solution is starting to seem more attractive to me now 😅
Going back to a question I saw @perdasilva ask.. What is the cost of invalidating every single time? I assume that would mean that Unpack would be re-reading files and re-loading essentially the entire configuration at every pass.
WDYT about starting out with "invalidate every time" to keep code complexity low, with a comment that explains that we pay this reload cost unnecessarily, and that if it becomes a performance issue, we should follow up later and optimize to invalidate only on changes.
I'm fine with that personally. Have we encountered any performance issues otherwise so far? Just so we aren't adding on top of anything else. AFAIK we aren't, so I'd be OK with invalidating on each pass with a follow up issue to track it.
@perdasilva now kicking myself because I completely misinterpreted that question you asked - my bad!! I thought you were saying my first change was invalidating on each pass and didn't read it as a recommendation 😅 😅
Running an extremely rudimentary time elapsed check on a local e2e run, the cache invalidation can take between 500ns and 3µs. I think that's acceptable though it may be worth mentioning every unpack will have to wait on the config mutex from any prior unpack.
all good ^^
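The "invalidate every time" approach agreed on in this thread can be sketched roughly as follows. This is a toy model under assumptions, not the PR's code: `confCache`, its fields, and the `source` loader are hypothetical stand-ins for the library-level configuration cache, and the mutex models the point that each unpack waits on the config lock held by any prior unpack.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// confCache is a hypothetical stand-in for a process-wide configuration
// cache. It is not the containers/image implementation.
type confCache struct {
	mu     sync.Mutex
	conf   string
	valid  bool
	loads  int
	source func() string // hypothetical loader, e.g. one that reads registries.conf
}

// invalidate drops the cached configuration. Callers must hold mu.
func (c *confCache) invalidate() { c.valid = false }

// get returns the configuration, reloading it if the cache is invalid.
// Callers must hold mu.
func (c *confCache) get() string {
	if !c.valid {
		c.conf = c.source()
		c.valid = true
		c.loads++
	}
	return c.conf
}

// unpack invalidates unconditionally and reloads, paying the reload cost
// on every pass whether or not the configuration changed. It returns the
// configuration it used and the time spent on invalidate plus reload.
func (c *confCache) unpack() (string, time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	start := time.Now()
	c.invalidate()
	conf := c.get()
	return conf, time.Since(start)
}

func main() {
	current := "v1"
	c := &confCache{source: func() string { return current }}

	conf, took := c.unpack()
	fmt.Println(conf, took)

	current = "v2" // simulate an updated registries configuration
	conf, took = c.unpack()
	fmt.Println(conf, took, "loads:", c.loads)
}
```

If the reload ever became a measurable cost, the unconditional `invalidate` call is the single place where a change check would go.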
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1554 +/- ##
==========================================
- Coverage 66.84% 66.70% -0.15%
==========================================
Files 57 57
Lines 4588 4595 +7
==========================================
- Hits 3067 3065 -2
- Misses 1298 1304 +6
- Partials 223 226 +3
Now that |
Absolutely - I'll include the change in there as well.
@dtfranz @joelanford one thing I was thinking about is that (and maybe this has changed since) the unpack flow blocks the reconciliation thread. I wonder if we should just spawn the unpack process, update the status and re-enqueue after 30 seconds (or whatever), check the status of the process, update status, repeat until the process reaches a terminal state.
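A rough sketch of that non-blocking flow, using only the standard library: the first reconcile starts the unpack in a goroutine and asks to be requeued, and later reconciles only report status until a terminal phase is reached. Every name here (`unpacker`, `job`, `phase`, `reconcile`, `wait`, `requeueInterval`) is hypothetical, and a real controller would return the requeue delay through its own reconcile result type.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type phase string

const (
	phaseUnpacking phase = "Unpacking"
	phaseUnpacked  phase = "Unpacked"
	phaseFailed    phase = "Failed"
)

// requeueInterval mirrors the "30 seconds (or whatever)" in the comment.
const requeueInterval = 30 * time.Second

type job struct {
	phase phase
	err   error
	done  chan struct{} // closed once the phase is terminal
}

type unpacker struct {
	mu   sync.Mutex
	jobs map[string]*job
}

func newUnpacker() *unpacker { return &unpacker{jobs: map[string]*job{}} }

// reconcile never waits for the unpack itself. It starts the work on the
// first call for a key, then returns the current phase and how long the
// caller should wait before checking again (0 once the phase is terminal).
func (u *unpacker) reconcile(key string, unpack func() error) (phase, time.Duration) {
	u.mu.Lock()
	defer u.mu.Unlock()

	j, ok := u.jobs[key]
	if !ok {
		j = &job{phase: phaseUnpacking, done: make(chan struct{})}
		u.jobs[key] = j
		go func() {
			err := unpack()
			u.mu.Lock()
			defer u.mu.Unlock()
			j.phase, j.err = phaseUnpacked, nil
			if err != nil {
				j.phase, j.err = phaseFailed, err
			}
			close(j.done)
		}()
	}
	if j.phase == phaseUnpacking {
		return j.phase, requeueInterval
	}
	return j.phase, 0
}

// wait blocks until the job for key reaches a terminal phase. It exists
// only so the demo does not have to sleep for the requeue interval.
func (u *unpacker) wait(key string) {
	u.mu.Lock()
	j := u.jobs[key]
	u.mu.Unlock()
	if j != nil {
		<-j.done
	}
}

func main() {
	u := newUnpacker()
	release := make(chan struct{})
	slowUnpack := func() error { <-release; return nil }

	fmt.Println(u.reconcile("bundle", slowUnpack)) // still running, requeue requested
	close(release)
	u.wait("bundle")
	fmt.Println(u.reconcile("bundle", slowUnpack)) // terminal phase, no requeue
}
```

The sketch keeps finished jobs in the map indefinitely; a real implementation would need a policy for clearing them and for retrying failures.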
/unhold
…ates without restart Signed-off-by: dtfranz <dfranz@redhat.com>
lgtm =D great work! thank you!!!
NOTE: Same issue applies to catalogd, I'll make a similar change there. Done
With this change, operator-controller will be able to invalidate the configuration cache of containers/image and accept new configuration at runtime when changes occur.
For code simplicity's sake, there's a few things to take note of that can be addressed if deemed necessary:
- The cache always invalidates on the first unpack, as we've not yet read the registries.conf file. This is easily addressed but I wanted to keep main() clear of the file reading op.
- We only check the registries.conf file. Checking the entire directory is more involved since we have to traverse the folder tree along with the symlinks dropped in there by k8s (WalkDir does not support symlinks, but it might get easier in go 1.25?)
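As a rough illustration of the directory-wide check mentioned in the second point: `filepath.WalkDir` does not follow symbolic links, but each visited entry can still be resolved with `os.Stat`, which does. The sketch below takes the newest modification time across a tree that way. `latestModTime` and `demo` are hypothetical names, and the `..v1` / `..data` layout is only loosely modeled on what Kubernetes projects into a ConfigMap volume; it is an assumption, not something taken from the PR.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

// latestModTime returns the newest modification time under root. WalkDir
// does not descend into symlinked directories, so every entry is resolved
// with os.Stat to read the modification time of the symlink's target.
func latestModTime(root string) (time.Time, error) {
	var latest time.Time
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		info, err := os.Stat(path) // follows symlinks, unlike d.Info()
		if err != nil {
			if d.Type()&fs.ModeSymlink != 0 {
				return nil // dangling symlink: nothing to compare
			}
			return err
		}
		if info.ModTime().After(latest) {
			latest = info.ModTime()
		}
		return nil
	})
	return latest, err
}

// demo builds a small tree with a real directory, a directory symlink, and
// a file symlink, then reports whether touching the real file moves the
// result of latestModTime forward.
func demo() (bool, error) {
	root, err := os.MkdirTemp("", "conf")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)

	dataDir := filepath.Join(root, "..v1")
	if err := os.Mkdir(dataDir, 0o700); err != nil {
		return false, err
	}
	target := filepath.Join(dataDir, "registries.conf")
	if err := os.WriteFile(target, []byte("# empty\n"), 0o600); err != nil {
		return false, err
	}
	if err := os.Symlink("..v1", filepath.Join(root, "..data")); err != nil {
		return false, err
	}
	link := filepath.Join(root, "registries.conf")
	if err := os.Symlink(filepath.Join("..data", "registries.conf"), link); err != nil {
		return false, err
	}

	before, err := latestModTime(root)
	if err != nil {
		return false, err
	}
	later := before.Add(time.Hour)
	if err := os.Chtimes(target, later, later); err != nil {
		return false, err
	}
	after, err := latestModTime(root)
	if err != nil {
		return false, err
	}
	return after.After(before), nil
}

func main() {
	fmt.Println(demo())
}
```

This relies on the real timestamped directory living inside the walked tree; a symlink pointing to a directory outside root would have only its top-level modification time considered.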