fix: Resolve race condition of Duchy claimTask in spanner implementation. #1726
Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @renjiezh)
For some additional context, this is mostly reverting #1726. Using a single transaction can cause lock contention, because the unclaimed tasks query locks all claimable Computations DB rows. The solution is to perform the write in a separate transaction that re-reads the single row and ensures it is still claimable. The old version of this check was flawed in that it incorrectly assumed the JVM clock is monotonic. This PR introduces an additional check based on LockExpirationTime, which is less susceptible to clock skew.
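To make the two-step claim concrete, here is a minimal in-memory sketch in Python. It is not the Spanner implementation from this PR: `ComputationStore`, `list_claimable`, `try_claim`, and `claim_task` are hypothetical names, and a `threading.Lock` stands in for the single-row read-write transaction. It also assumes that the LockExpirationTime check amounts to verifying that the value seen by the initial query is unchanged at write time, which the comment above does not spell out.

```python
import threading
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass
class Computation:
    computation_id: str
    lock_owner: Optional[str]
    lock_expiration_time: float  # stand-in for the LockExpirationTime column


class ComputationStore:
    """In-memory stand-in for the Computations table (hypothetical, not Spanner)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Computation] = {}
        # Models the isolation of a single-row read-write transaction.
        self._write_lock = threading.Lock()

    def insert(self, row: Computation) -> None:
        self._rows[row.computation_id] = row

    def list_claimable(self, now: float) -> List[Computation]:
        """Step 1: query claimable rows without holding locks for the write."""
        # Return copies so later writes do not mutate the caller's snapshot.
        return [replace(r) for r in self._rows.values()
                if r.lock_expiration_time <= now]

    def try_claim(self, observed: Computation, owner: str,
                  now: float, lock_duration: float) -> bool:
        """Step 2: separate transaction that re-reads one row before writing."""
        with self._write_lock:
            current = self._rows[observed.computation_id]
            # Assumed check: if the expiration changed since step 1, another
            # worker claimed the row. This compares stored values only, so it
            # does not depend on the local clock being monotonic.
            if current.lock_expiration_time != observed.lock_expiration_time:
                return False
            # Clock-based check, kept as a secondary guard.
            if current.lock_expiration_time > now:
                return False
            current.lock_owner = owner
            current.lock_expiration_time = now + lock_duration
            return True


def claim_task(store: ComputationStore, owner: str, now: float,
               lock_duration: float = 300.0) -> Optional[str]:
    """Try each claimable candidate in turn; return the claimed ID or None."""
    for candidate in store.list_claimable(now):
        if store.try_claim(candidate, owner, now, lock_duration):
            return candidate.computation_id
    return None
```

In this model, a worker whose clock runs far ahead would pass a purely clock-based check against a lock that another worker has just taken. The comparison against the previously observed expiration value rejects that claim regardless of what the local clock reports.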
Issue #1722