Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: jwk aquire lock when generating #3865

Closed

Conversation

awill1988
Copy link
Contributor

Update the locking behavior of jwk.GetOrGenerateKeys to acquire a read lock to fetch keys and only acquire the write lock when generating and persisting new keys.

Related issue(s)

#3863

Checklist

  • I have read the contributing guidelines.
  • I have referenced an issue containing the design document if my change
    introduces a new feature.
  • I am following the
    contributing code guidelines.
  • I have read the security policy.
  • I confirm that this pull request does not address a security
    vulnerability. If this pull request addresses a security vulnerability, I
    confirm that I got the approval (please contact
    security@ory.sh) from the maintainers to push
    the changes.
  • I have added tests that prove my fix is effective or that my feature
    works.
  • I have added or changed the documentation.

Further Comments

I can attempt to write up more unit tests if that is required, but since this is not a significant change I am wondering if the existing tests can cover it?

I am also not sure what would need to be written for documentation but I am happy to add it where it is preferred.

On the related issue there are some reproducible steps that can be used to demonstrate the benefit of this refactor, but I can specify steps or update here if needed.

@awill1988 awill1988 changed the title refactor(jwk): aquire read lock unless generating refactor: jwk aquire read lock unless generating Oct 28, 2024
jwk/helper.go Outdated
getLock(set).Lock()
defer getLock(set).Unlock()

getLock(set).RLock()
Copy link
Member

@aeneasr aeneasr Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering why the lock is there in the first place. We're using a SQL backend for fetching the keys, and that's atomic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a good point..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is that the intention was to protect the initialization so that it's only done once within a single instance. Otherwise if there's more than one concurrent caller to this function, and each read the uninitialized state, they'd each generate and persist their own key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@terev if it were to keep the map[string]*RWMutex or maybe just map[string]*Mutex and lock on key generation, do you think that would work for the key set initialization condition?

Copy link
Contributor

@terev terev Oct 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think a plain mutex per keyset is fine. But maybe keep the rwlock for accessing the map. But actually use the read lock in the right case there. But to be clear I'm not part of ory so if aeneasr disagrees that's cool too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But to be clear I'm not part of ory so if aeneasr disagrees that's cool too.
thanks anyways @terev 😄!

Okay, @aeneasr, I think I'm following what you're saying about not even needing the read lock.

I'll read up on the test/conformance to grok what's going on there 🤞🏻

Also found a couple snapshots that looked like they needed a quick patch so I grabbed those just in case.

@awill1988 awill1988 changed the title refactor: jwk aquire read lock unless generating refactor: jwk aquire lock when generating Oct 29, 2024
if errors.Is(err, x.ErrNotFound) || keys != nil && len(keys.Keys) == 0 {
r.Logger().Warnf("JSON Web Key Set \"%s\" does not exist yet, generating new key pair...", set)

l := getLock(set)
defer l.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case you missed it, I think you meant to lock the returned lock then defer unlock. Same thing below on line 69.

At the end of this if block it may make sense to directly return FindPrivateKey(keys). If we don't it opens up the potential for deadlock if FindPrivateKey fails and we try to generate another keyset. Also I don't really think it would help to generate another keyset if the one we just generated didn't have a private key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops!

@alnr
Copy link
Contributor

alnr commented Oct 29, 2024

This whole code is unsafe in the presence of multiple hydra instances. We would need to change the database code to do an INSERT INTO ... ON UPDATE DO NOTHING RETURNING * to properly synchronize.

@aeneasr
Copy link
Member

aeneasr commented Oct 29, 2024

This whole code is unsafe in the presence of multiple hydra instances. We would need to change the database code to do an INSERT INTO ... ON UPDATE DO NOTHING RETURNING * to properly synchronize.

While this is true, it doesn't really matter, because the whole thing will eventually settle. Even if we created 20 keys, it will be like if we rotated through 20 keys and just use the newest key. So the impact is really minimal in my view

@alnr
Copy link
Contributor

alnr commented Oct 29, 2024

That is:

  1. SELECT * FROM HYDRA_JWK
  2. a) if found, return
  3. b) if not found, generate a new one in memory (could be done under a mutex to throttle CPU usage)
  4. INSERT INTO hydra_jwk ON CONFLICT DO NOTHING
  5. SELECT * FROM hydra jwk
  6. Continue with the key returned from the last select.

Postgres and other DBs which support RETURNING could do 4+5 in one step as an optimization (optional, since this is a rare event).

@aeneasr
Copy link
Member

aeneasr commented Oct 29, 2024

I think that analysis is correct. So all we need really is a lock for when we generate the key and not for reading, which also solves the issue here

aeneasr added a commit that referenced this pull request Oct 29, 2024
@aeneasr aeneasr mentioned this pull request Oct 29, 2024
7 tasks
@aeneasr
Copy link
Member

aeneasr commented Oct 29, 2024

@awill1988
Copy link
Contributor Author

awill1988 commented Oct 29, 2024

closing in favor of #3866, 🙏🏻 for taking over this issue @aeneasr

@awill1988 awill1988 closed this Oct 29, 2024
@awill1988 awill1988 deleted the awill/jwk-helper-use-read-lock branch October 29, 2024 15:02
aeneasr added a commit that referenced this pull request Oct 30, 2024
aeneasr added a commit that referenced this pull request Oct 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants