fix: Resolve race condition of Duchy claimTask in spanner implementation. #1726

Merged
renjiezh merged 2 commits into main from renjiez-claim-race
Aug 1, 2024

Conversation

renjiezh (Contributor)

Issue #1722

renjiezh changed the title from "fix: Resolve race condition of Duchy claimTask with spanner impl." to "fix: Resolve race condition of Duchy claimTask in spanner implementation." on Jul 31, 2024
renjiezh requested a review from SanjayVas on July 31, 2024 18:51
SanjayVas (Member) left a comment:

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @renjiezh)

stevenwarejones (Collaborator) left a comment:

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @renjiezh)

renjiezh enabled auto-merge (squash) on August 1, 2024 18:34
renjiezh merged commit 64075c3 into main on Aug 1, 2024
4 checks passed
renjiezh deleted the renjiez-claim-race branch on August 1, 2024 20:49
renjiezh added a commit that referenced this pull request Aug 8, 2024
For additional context, this mostly reverts #1726. Using a single transaction can cause lock contention, because the unclaimed tasks query locks all claimable Computations DB rows. The solution is to perform the write in a separate transaction that re-reads the single row and ensures that it is still claimable. The old version of this check was flawed in that it incorrectly assumed that the JVM clock is monotonic. This PR introduces an additional check based on LockExpirationTime, which is less susceptible to clock skew.
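
To illustrate the pattern described in the commit message above, here is a minimal in-memory sketch in Python. It is not the Duchy or Spanner code: `ComputationStore`, `query_claimable`, `try_claim`, and `claim_task` are hypothetical names, and the exact form of the check is an assumption. The sketch assumes the claimability check compares the re-read `LockExpirationTime` against the value seen by the earlier query, so the decision depends on stored values rather than on the local clock moving forward.

```python
import threading
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Row:
    lock_owner: Optional[str]
    # Hypothetical stand-in for the LockExpirationTime column (seconds).
    lock_expiration_time: float


class ComputationStore:
    """Toy in-memory stand-in for the Computations table (not Spanner)."""

    def __init__(self, rows: Dict[str, Row]) -> None:
        self._rows = dict(rows)
        # Models the per-transaction isolation of the real database.
        self._txn_lock = threading.Lock()

    def query_claimable(self, now: float) -> List[Tuple[str, float]]:
        """Read-only query: snapshot of (id, expiration) for expired locks."""
        with self._txn_lock:
            return [
                (cid, row.lock_expiration_time)
                for cid, row in sorted(self._rows.items())
                if row.lock_expiration_time <= now
            ]

    def try_claim(self, cid: str, observed_expiration: float, owner: str,
                  now: float, lock_duration: float) -> bool:
        """Separate single-row write: re-read, verify, then update."""
        with self._txn_lock:
            row = self._rows[cid]
            # Assumed check: if the stored expiration changed since the
            # query, another worker claimed the row in the meantime.
            if row.lock_expiration_time != observed_expiration:
                return False
            row.lock_owner = owner
            row.lock_expiration_time = now + lock_duration
            return True


def claim_task(store: ComputationStore, owner: str, now: float,
               lock_duration: float = 60.0) -> Optional[str]:
    """Claims the first candidate that is still claimable, if any."""
    for cid, observed in store.query_claimable(now):
        if store.try_claim(cid, observed, owner, now, lock_duration):
            return cid
    return None
```

In this sketch the query holds no row locks while candidates are examined, and a worker acting on a stale snapshot simply fails the re-read check and moves on to the next candidate instead of overwriting another worker's claim.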
ple13 pushed a commit that referenced this pull request Aug 16, 2024
4 participants