-
Notifications
You must be signed in to change notification settings - Fork 13.7k
[LLVM] Add IRNormalizer Pass #68176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LLVM] Add IRNormalizer Pass #68176
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
Thank you @justinfargnoli and @AidanGoldfarb for taking over this work and reaching out! I will review your changes in the coming week. CC @ChrisCummins FYI |
Added @plotfi as a reviewer; one drive-by question is whether adding a new LLVM tool like this requires an RFC to be posted to Discourse or not (https://llvm.org/docs/DeveloperPolicy.html#introducing-new-components-into-llvm). (I don't have any opinions on the tool; just trying to help the review move along.) |
Thanks for reaching out guys. I recall approving the original PR on phab. I can re-review if needed. This work was approved to land before, so personally I think it might be overkill to RFC it on discourse again but I am not certain of the right process to take.
|
Please don't use the term canon/canonicalize for this pass or tool. LLVM has an existing notion of "canonicalization" which does not coincide with what is being done here. |
I don't have strong opinions on whether this should have an RFC or not, but the original review didn't come with an RFC that I could find either. Given that there are some concerns about what the name of the tool is, I weakly lean towards running an RFC just to make sure everyone is comfortable with the tool. But again, I don't insist, it's just a drive-by comment because I happened to notice the PR. |
The term comes from an older pass that did similar algorithms on MIR. They are only borrowing the term from it. |
@nikic I don't think the name "canon" or "canonicalizer" will lead to any confusion here. This is in fact what the pass is doing and mimics the name of MIR canonicalizer with similar assumptions as @plotfi noted . Using the term "canonicalization" for the current/existing set of passes is not really well defined. As for the RFC, there was a discussion thread in the LLVM Dev Mailing: https://lists.llvm.org/pipermail/llvm-dev/2019-August/134475.html |
Using the term for MIR is okay, because it does not have a pre-existing notion of canonical form. Doing the same for IR is not okay, because it already has a canonical form and this pass does not produce it. Calling this PS: The PR title and patch description could use an update, as this does not introduce an actual llvm-canon tool, only a normal opt pass (which is the right thing to do, but doesn't match the description). |
Renamed to IRNormalizer. Requesting another round of review. |
I added a couple more comments regarding the name change and formatting of the docs, but apart from these the patch looks good to me. |
Ping for another round of review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this! LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please update the patch description with some information on what kind of normalization the pass does?
It would also be great to post some examples, because I don't really get what kind of renaming and reordering this does just looking at the test coverage.
One bit that's particularly non-obvious to me is why this normalization has to involve hashes? That seems like it would make things more unstable and hard to understand.
I would recommend trying to run this pass (with all options enabled) over all existing tests to make sure that it doesn't cause any crashes/verifier failures. You're moving instructions around, and it's easy to get that wrong when invoke, callbr or catchswitch are involved.
I would also try to add this pass at the end of the clang pipeline and run llvm-test-suite to verify that the normalization this does is indeed semantics-preserving.
(Review notes are just nits, I haven't tried to understand the main logic yet.)
Here are the slides that were presented at the LLVM Dev Meeting in 2019. It might be useful to include the link in the commit message for reference. @nikic Thanks for catching more things in the review! Especially the commutative ordering of operands/arguments. When it comes to including hashes in the naming algorithm (renaming is controlled using an option for already named values), the ideas is that by just looking at an instruction further down (a use) you can tell whether any inputs are different. And the other way around, by looking at initial instructions you can see how their "output footprint" has changed after transformations. The example below is a bit outdated, but reflects the general idea we had when considering naming algorithms: |
Thanks for this suggestion! It helped iron out a lot of bugs in the implementation.
I'm running into some linking errors when trying to build |
Hi @plotfi, thanks for pinging me on this! Generally speaking, two things deserve additional focus:
Big thanks to @dtcxzyw for running the test-suite. Since @dtcxzyw did not find any test failures, this MR is ready for another round of review. If things look good to @plotfi and @nikic, then we should be good to get this merged! |
1 similar comment
Did a pretty cursory look over this PR. LGTM, I would prefer if this were landed and added improvements as needed later. Thank you again for working on this. |
Looks like these failed tests on llvm-clang-x86_64-expensive-checks-ubuntu builder are related with this PR:
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/7518 Here is the relevant piece of the build log for the reference
|
@vvereschaka, I agree. Note: The problem is that I likely won't be able to fix this for a few days. @vvereschaka, should I revert 1295d2e in the meantime? |
Reverting the changes is the optimal way in those cases. Yes, would you revert this PR. Thank you. |
This reverts commit 1295d2e.
Reverts #68176 Introduced BuildBot failure: #68176 (comment)
`IRNormalizer` will reorder instructions. Thus, we need to invalidate analyses. Done in cd500d2. This should resolve the [BuildBot failure](#68176 (comment)). --- Original PR: #68176 Original commit: 1295d2e Reverted with: 8a12e01 --- Add the llvm-canon tool. Description from the [original PR](https://reviews.llvm.org/D66029#change-wZv3yOpDdxIu): > Added a new llvm-canon tool which aims to transform LLVM Modules into a canonical form by reordering and renaming instructions while preserving the same semantics. This tool makes it easier to spot semantic differences while diffing two modules which have undergone different transformation passes. The current version of this tool can: - Reorder instructions within a function. - Rename instructions based on the operands. - Sort commutative operands. This code was originally written by @michalpaszkowski and [submitted to mainline LLVM](14d3585). However, it was quickly [reverted](335de55) to do BuildBot errors. Michal presented his version of the tool in [LLVM-Canon: Shooting for Clear Diffs](https://www.youtube.com/watch?v=c9WMijSOEUg). @AidanGoldfarb and I ported the code to the new pass manager, added more tests, and fixed some bugs related to PHI nodes that may have been the root cause of the BuildBot errors that caused the patch to be reverted. Additionally, we rewrote the implementation of instruction reordering to fix cases where the original algorithm would break use-def chains. Note that this is @AidanGoldfarb and I's first time submitting to LLVM. Please liberally critique the PR! CC @plotfi for initial review. --------- Co-authored-by: Aidan <aidan.goldfarb@mail.mcgill.ca>
Add the llvm-canon tool. Description from the original PR:
The current version of this tool can:
This code was originally written by @michalpaszkowski and submitted to mainline LLVM. However, it was quickly reverted to do BuildBot errors.
Michal presented his version of the tool in LLVM-Canon: Shooting for Clear Diffs.
@AidanGoldfarb and I ported the code to the new pass manager, added more tests, and fixed some bugs related to PHI nodes that may have been the root cause of the BuildBot errors that caused the patch to be reverted. Additionally, we rewrote the implementation of instruction reordering to fix cases where the original algorithm would break use-def chains.
Note that this is @AidanGoldfarb and I's first time submitting to LLVM. Please liberally critique the PR!
CC @plotfi for initial review.