Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CodeGen] inefficient code for multiple select instructions with same condition #122551

Open
weiguozhi opened this issue Jan 10, 2025 · 2 comments

Comments

@weiguozhi
Copy link
Contributor

The example test case is at https://godbolt.org/z/fEEonKhn9.

Function foo is what I got from source code, there are two selects in two basic blocks, they use the same condition, it generates two pairs of "test + jns" instructions. An optimal result should be same as function expected, there is only one pair of "test + jns".

@sharkautarch
Copy link

sharkautarch commented Jan 14, 2025

seems like CodeGen Prepare pass is actually duplicating an icmp:
https://godbolt.org/z/W6M4dzP44
interestingly enough, simply adding a gvn pass after the CodeGen Prepare removes the duplicated icmp
I'm guessing it does that because it wants to make local variables bb-local variables
but in this situation, this leads to an increase in the number of bb-local-live variables
I mean, a gvn pass would normally never happen after codegen prepare, but the normal instcombine version doesn't either (tho there is a machine instcombine version)
tho I will say that when manually adding the instcombine pass after the codegen prepare pass, the instcombine is unable to remove the duplicated icmp
Assuming that Codegen Prepare duplicating the icmp was intentional, maybe there should be a late machine-code version of the gvn pass, since the normal gvn pass is able to cleanup the duplicated icmp? Though I'm not sure how tricky it would be to do gvn over machine code...
that, or in the codegen prepare pass, maybe it should add metadata to duplicated expressions, as a way to make it easier to (in the case that it isn't profitable to keep the duplicated expressions duplicated) cleanup the duplicated expressions in a late machine pass?

Edit: the more I look into the Codegen Prepare pass, the more it seems like this duplicated icmp is a bug, but I have no clue what is causing the Codegen Prepare pass to do this. It seems to still duplicate the icmp even when branch optimizations in the Codegen Prepare pass are turned off, so IG at least I know it's not that part...

@weiguozhi
Copy link
Contributor Author

Edit: the more I look into the Codegen Prepare pass, the more it seems like this duplicated icmp is a bug, but I have no clue what is causing the Codegen Prepare pass to do this. It seems to still duplicate the icmp even when branch optimizations in the Codegen Prepare pass are turned off, so IG at least I know it's not that part...

The function CodeGenPrepare::sinkCmpExpression does optimization work on the function foo. Let's consider several different versions of foo:

  • Version 1, without sinkCmpExpression. In first BB the condition %6 should be stored into a general register with "setcc al". And in later basic block it needs to be tested with "cmp + Jcc".
  • Version 2, with sinkCmpExpression. In first BB the condition %6 will not be saved with extra instruction. In later basic block the original condition is tested again with "test + jns". It is the current generated code, it is a little better than version 1.
  • Version 3, another option that CodeGenPrepare didn't consider. We can move the user of %6 to the first BB if possible. In our case the instruction %spec.select.i and its dependent instruction %buf are moved to the first BB, and then we get the function expected. With this method the second pair of "test + jns" is completely removed, it also reduces 1 used register for either the condition code or the source operand for the condition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants