-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JitArm64: Fix creqv/crorc setting eq bit #13266
JitArm64: Fix creqv/crorc setting eq bit #13266
Conversation
913ecce
to
82695ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
crXXX assigns to bits_1_to_31_are_set but doesn't use it anywhere. I presume it's supposed to be an argument for the SetCRFieldBit call at the end of the function?
Yes! That's an embarrassing mistake :( |
82695ea
to
6678af3
Compare
AND(CR, CR, LogicalImm(0xFFFF'FFFF'0000'0000, GPRSize::B64)); | ||
ORR(CR, CR, in); | ||
if (!negate) | ||
EOR(CR, CR, LogicalImm(1ULL << 0, GPRSize::B64)); | ||
EOR(CR, CR, LogicalImm(bits_1_to_31_are_set ? 0xFFFF'FFFFULL : 1ULL, GPRSize::B64)); | ||
else if (bits_1_to_31_are_set) | ||
AND(CR, CR, LogicalImm(0xFFFF'FFFF'0000'0001ULL, GPRSize::B64)); | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suppose we replaced this whole CR_EQ_BIT case with a BFI of the lower 32 bits (which would handle the combination where negate is true and bits_1_to_31_are_set is false), then EOR with 0x1, 0xFFFF'FFFE, or 0xFFFF'FFFF for the other three combinations respectively. That would make each combination 1 or 2 instructions; are there any downsides I'm missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would certainly be possible to replace the initial AND+ORR with a BFI. This would save one instruction, but BFI is two cycles on many CPUs, so on those CPUs it wouldn't be any faster (aside from any gains in icache hit rate). What I'm wondering is whether BFI being two cycles will hurt pipelining in the case where CR
is known early but in
is known late. Currently, the AND can fully execute before in
is known, and I'm not aware if BFI is able to do the first half of its calculation (so to speak) before in
is known.
Moving to BFI might very well be advantageous overall, but since it's not entirely clear, I'd like to leave it for a separate PR, because I want to prioritize getting the correctness fix merged.
When I wrote 71e9766, there was an interaction I didn't take into account: When setting eq, SetCRFieldBit assumes that all bits in the passed-in host register except the least significant bit are 0. But if we use EON or ORN, all bits except the least significant bit get set to 1. This can cause eq to end up unset when it should be set. This commit fixes the issue. crandc is unaffected by the issue because the "1" bits get ANDed with "0" bits from the first operand. Note that in practice, we never have both bits_1_to_31_are_set and negate at once, so while it looks like this commit adds an extra AND instruction in some cases, those cases don't happen in practice, meaning this fix shouldn't affect performance.
6678af3
to
25e412a
Compare
For the eq and gt bits specifically, setting negate_result is one instruction shorter than not setting it.
25e412a
to
85cd0ca
Compare
FifoCI detected that this change impacts graphical rendering. Here are the behavior differences detected by the system:
automated-fifoci-reporter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Untested, LGTM.
When I wrote 71e9766, there was an interaction I didn't take into account: When setting eq, SetCRFieldBit assumes that all bits in the passed-in host register except the least significant bit are 0. But if we use BIC/EON/ORN, all bits except the least significant bit get set to 1. This can cause eq to end up unset when it should be set.
This pull request fixes the issue.