Skip to content

[LLVM] Add IRNormalizer Pass #68176

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 69 commits into from
Oct 22, 2024
Merged

Conversation

justinfargnoli
Copy link
Contributor

@justinfargnoli justinfargnoli commented Oct 4, 2023

Add the llvm-canon tool. Description from the original PR:

Added a new llvm-canon tool which aims to transform LLVM Modules into a canonical form by reordering and renaming instructions while preserving the same semantics. This tool makes it easier to spot semantic differences while diffing two modules which have undergone different transformation passes.

The current version of this tool can:

  • Reorder instructions within a function.
  • Rename instructions based on the operands.
  • Sort commutative operands.

This code was originally written by @michalpaszkowski and submitted to mainline LLVM. However, it was quickly reverted to do BuildBot errors.

Michal presented his version of the tool in LLVM-Canon: Shooting for Clear Diffs.

@AidanGoldfarb and I ported the code to the new pass manager, added more tests, and fixed some bugs related to PHI nodes that may have been the root cause of the BuildBot errors that caused the patch to be reverted. Additionally, we rewrote the implementation of instruction reordering to fix cases where the original algorithm would break use-def chains.

Note that this is @AidanGoldfarb and I's first time submitting to LLVM. Please liberally critique the PR!

CC @plotfi for initial review.

@michalpaszkowski michalpaszkowski self-requested a review October 4, 2023 04:32
@github-actions
Copy link

github-actions bot commented Oct 4, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@michalpaszkowski
Copy link
Member

Thank you @justinfargnoli and @AidanGoldfarb for taking over this work and reaching out! I will review your changes in the coming week.

CC @ChrisCummins FYI

@AaronBallman AaronBallman requested a review from plotfi October 5, 2023 11:31
@AaronBallman
Copy link
Collaborator

Added @plotfi as a reviewer; one drive-by question is whether adding a new LLVM tool like this requires an RFC to be posted to Discourse or not (https://llvm.org/docs/DeveloperPolicy.html#introducing-new-components-into-llvm). (I don't have any opinions on the tool; just trying to help the review move along.)

@plotfi
Copy link
Collaborator

plotfi commented Oct 5, 2023

Thanks for reaching out guys.

I recall approving the original PR on phab. I can re-review if needed. This work was approved to land before, so personally I think it might be overkill to RFC it on discourse again but I am not certain of the right process to take.

Added @plotfi as a reviewer; one drive-by question is whether adding a new LLVM tool like this requires an RFC to be posted to Discourse or not (https://llvm.org/docs/DeveloperPolicy.html#introducing-new-components-into-llvm). (I don't have any opinions on the tool; just trying to help the review move along.)

@nikic
Copy link
Contributor

nikic commented Oct 5, 2023

Please don't use the term canon/canonicalize for this pass or tool. LLVM has an existing notion of "canonicalization" which does not coincide with what is being done here.

@AaronBallman
Copy link
Collaborator

Thanks for reaching out guys.

I recall approving the original PR on phab. I can re-review if needed. This work was approved to land before, so personally I think it might be overkill to RFC it on discourse again but I am not certain of the right process to take.

Added @plotfi as a reviewer; one drive-by question is whether adding a new LLVM tool like this requires an RFC to be posted to Discourse or not (https://llvm.org/docs/DeveloperPolicy.html#introducing-new-components-into-llvm). (I don't have any opinions on the tool; just trying to help the review move along.)

I don't have strong opinions on whether this should have an RFC or not, but the original review didn't come with an RFC that I could find either. Given that there are some concerns about what the name of the tool is, I weakly lean towards running an RFC just to make sure everyone is comfortable with the tool. But again, I don't insist, it's just a drive-by comment because I happened to notice the PR.

@plotfi
Copy link
Collaborator

plotfi commented Oct 9, 2023

Please don't use the term canon/canonicalize for this pass or tool. LLVM has an existing notion of "canonicalization" which does not coincide with what is being done here.

The term comes from an older pass that did similar algorithms on MIR. They are only borrowing the term from it.

@michalpaszkowski
Copy link
Member

michalpaszkowski commented Oct 9, 2023

Please don't use the term canon/canonicalize for this pass or tool. LLVM has an existing notion of "canonicalization" which does not coincide with what is being done here.

@nikic I don't think the name "canon" or "canonicalizer" will lead to any confusion here. This is in fact what the pass is doing and mimics the name of MIR canonicalizer with similar assumptions as @plotfi noted . Using the term "canonicalization" for the current/existing set of passes is not really well defined.

As for the RFC, there was a discussion thread in the LLVM Dev Mailing: https://lists.llvm.org/pipermail/llvm-dev/2019-August/134475.html

@nikic
Copy link
Contributor

nikic commented Oct 9, 2023

Using the term for MIR is okay, because it does not have a pre-existing notion of canonical form. Doing the same for IR is not okay, because it already has a canonical form and this pass does not produce it. Calling this ir-normalizer instead of ir-canonicalizer should convey about the same intent without overloading an existing, widely used term.

PS: The PR title and patch description could use an update, as this does not introduce an actual llvm-canon tool, only a normal opt pass (which is the right thing to do, but doesn't match the description).

@justinfargnoli justinfargnoli changed the title llvm-canon [LLVM] Add Normalizer Pass Oct 14, 2023
@justinfargnoli justinfargnoli changed the title [LLVM] Add Normalizer Pass [LLVM] Add IRNormalizer Pass Oct 14, 2023
@justinfargnoli
Copy link
Contributor Author

Renamed to IRNormalizer. Requesting another round of review.

@michalpaszkowski
Copy link
Member

I added a couple more comments regarding the name change and formatting of the docs, but apart from these the patch looks good to me.

@justinfargnoli
Copy link
Contributor Author

Ping for another round of review.

Copy link
Member

@michalpaszkowski michalpaszkowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this! LGTM!

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please update the patch description with some information on what kind of normalization the pass does?

It would also be great to post some examples, because I don't really get what kind of renaming and reordering this does just looking at the test coverage.

One bit that's particularly non-obvious to me is why this normalization has to involve hashes? That seems like it would make things more unstable and hard to understand.

I would recommend trying to run this pass (with all options enabled) over all existing tests to make sure that it doesn't cause any crashes/verifier failures. You're moving instructions around, and it's easy to get that wrong when invoke, callbr or catchswitch are involved.

I would also try to add this pass at the end of the clang pipeline and run llvm-test-suite to verify that the normalization this does is indeed semantics-preserving.

(Review notes are just nits, I haven't tried to understand the main logic yet.)

@michalpaszkowski
Copy link
Member

Here are the slides that were presented at the LLVM Dev Meeting in 2019. It might be useful to include the link in the commit message for reference.

@nikic Thanks for catching more things in the review! Especially the commutative ordering of operands/arguments.

When it comes to including hashes in the naming algorithm (renaming is controlled using an option for already named values), the ideas is that by just looking at an instruction further down (a use) you can tell whether any inputs are different. And the other way around, by looking at initial instructions you can see how their "output footprint" has changed after transformations.

The example below is a bit outdated, but reflects the general idea we had when considering naming algorithms:
image
image

@justinfargnoli
Copy link
Contributor Author

I would recommend trying to run this pass (with all options enabled) over all existing tests to make sure that it doesn't cause any crashes/verifier failures. You're moving instructions around, and it's easy to get that wrong when invoke, callbr or catchswitch are involved.

Thanks for this suggestion! It helped iron out a lot of bugs in the implementation.

I would also try to add this pass at the end of the clang pipeline and run llvm-test-suite to verify that the normalization this does is indeed semantics-preserving.

I'm running into some linking errors when trying to build llvm-test-suite. Pinging for another round of review in the meantime.

@justinfargnoli justinfargnoli requested a review from nikic November 20, 2023 06:34
@justinfargnoli
Copy link
Contributor Author

Hi @plotfi, thanks for pinging me on this! Generally speaking, two things deserve additional focus:

  • UX testing to find:
    • Improvements to existing features
    • New features to add
  • Additional testing to ensure that the pass preserves the semantics of the IR.
    • e.g. I want to add this pass to the nvcc optimization pipeline and run it on NVIDIA's internal test suite.
    • If Meta has similar test suites, it would be great to see if this pass introduces any functional issues on it.

I cannot reproduce these failures on my x86 machine.

Big thanks to @dtcxzyw for running the test-suite. Since @dtcxzyw did not find any test failures, this MR is ready for another round of review.

If things look good to @plotfi and @nikic, then we should be good to get this merged!

@justinfargnoli justinfargnoli requested a review from nikic September 8, 2024 04:38
@justinfargnoli
Copy link
Contributor Author

Ping @plotfi @nikic for review.

1 similar comment
@justinfargnoli
Copy link
Contributor Author

Ping @plotfi @nikic for review.

@justinfargnoli justinfargnoli added the awaiting-review Has pending Phabricator review label Sep 23, 2024
@plotfi
Copy link
Collaborator

plotfi commented Oct 17, 2024

Did a pretty cursory look over this PR.

LGTM, I would prefer if this were landed and added improvements as needed later.

Thank you again for working on this.

@justinfargnoli justinfargnoli merged commit 1295d2e into llvm:main Oct 22, 2024
9 checks passed
@vvereschaka
Copy link
Contributor

Looks like these failed tests on llvm-clang-x86_64-expensive-checks-ubuntu builder are related with this PR:

LLVM ERROR: Function @test changed by IRNormalizerPass without invalidating analyses
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/opt -S -passes=normalize
1.	Running pass "function(normalize)" on module "<stdin>"
2.	Running pass "normalize" on function "test"
 #0 0x00005643d9c9fdaa llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:22
...

https://lab.llvm.org/buildbot/#/builders/187/builds/2031

@llvm-ci
Copy link
Collaborator

llvm-ci commented Oct 22, 2024

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-debian running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/7518

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/IRNormalizer/regression-infinite-loop.ll' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
RUN: at line 2: /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt -S -passes=normalize < /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/IRNormalizer/regression-infinite-loop.ll | /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/IRNormalizer/regression-infinite-loop.ll
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt -S -passes=normalize
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/IRNormalizer/regression-infinite-loop.ll
LLVM ERROR: Function @test changed by IRNormalizerPass without invalidating analyses
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt -S -passes=normalize
1.	Running pass "function(normalize)" on module "<stdin>"
2.	Running pass "normalize" on function "test"
 #0 0x00000000045c7697 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt+0x45c7697)
 #1 0x00000000045c514e llvm::sys::RunSignalHandlers() (/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt+0x45c514e)
 #2 0x00000000045c7d4f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f95c2d96140 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x13140)
 #4 0x00007f95c28aad51 raise (/lib/x86_64-linux-gnu/libc.so.6+0x38d51)
 #5 0x00007f95c2894537 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22537)
 #6 0x00000000045290aa llvm::report_fatal_error(llvm::Twine const&, bool) (/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt+0x45290aa)
 #7 0x00000000028458da void llvm::detail::UniqueFunctionBase<void, llvm::StringRef, llvm::Any, llvm::PreservedAnalyses const&>::CallImpl<llvm::PreservedCFGCheckerInstrumentation::registerCallbacks(llvm::PassInstrumentationCallbacks&, llvm::AnalysisManager<llvm::Module>&)::$_18>(void*, llvm::StringRef, llvm::Any&, llvm::PreservedAnalyses const&) StandardInstrumentations.cpp:0:0
 #8 0x00000000043f760e llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt+0x43f760e)
 #9 0x0000000000d0b26d llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) crtstuff.c:0:0
#10 0x00000000043fb840 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt+0x43fb840)
#11 0x0000000000d0b03d llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) crtstuff.c:0:0
#12 0x00000000043f6407 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt+0x43f6407)
#13 0x000000000085cf50 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) (/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt+0x85cf50)
#14 0x0000000000852635 optMain (/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt+0x852635)
#15 0x00007f95c2895d7a __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x23d7a)
#16 0x000000000084e99a _start (/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt+0x84e99a)
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/IRNormalizer/regression-infinite-loop.ll

--

********************


@justinfargnoli
Copy link
Contributor Author

justinfargnoli commented Oct 22, 2024

@vvereschaka, I agree.

Note: The problem is that IRNormalizer produces a change that invalidates an analysis. It's not that IRNormalizer should declare that it's not preserving an analysis.

I likely won't be able to fix this for a few days. @vvereschaka, should I revert 1295d2e in the meantime?

@vvereschaka
Copy link
Contributor

Reverting the changes is the optimal way in those cases. Yes, would you revert this PR. Thank you.

justinfargnoli added a commit that referenced this pull request Oct 22, 2024
justinfargnoli added a commit that referenced this pull request Oct 22, 2024
justinfargnoli added a commit that referenced this pull request Nov 14, 2024
`IRNormalizer` will reorder instructions. Thus, we need to invalidate
analyses. Done in cd500d2. This should
resolve the [BuildBot
failure](#68176 (comment)).

---

Original PR: #68176
Original commit: 1295d2e
Reverted with: 8a12e01

---

Add the llvm-canon tool. Description from the [original
PR](https://reviews.llvm.org/D66029#change-wZv3yOpDdxIu):

> Added a new llvm-canon tool which aims to transform LLVM Modules into
a canonical form by reordering and renaming instructions while
preserving the same semantics. This tool makes it easier to spot
semantic differences while diffing two modules which have undergone
different transformation passes.

The current version of this tool can:

- Reorder instructions within a function.
- Rename instructions based on the operands.
- Sort commutative operands.

This code was originally written by @michalpaszkowski and [submitted to
mainline
LLVM](14d3585).
However, it was quickly
[reverted](335de55)
to do BuildBot errors.

Michal presented his version of the tool in [LLVM-Canon: Shooting for
Clear Diffs](https://www.youtube.com/watch?v=c9WMijSOEUg).

@AidanGoldfarb and I ported the code to the new pass manager, added more
tests, and fixed some bugs related to PHI nodes that may have been the
root cause of the BuildBot errors that caused the patch to be reverted.
Additionally, we rewrote the implementation of instruction reordering to
fix cases where the original algorithm would break use-def chains.

Note that this is @AidanGoldfarb and I's first time submitting to LLVM.
Please liberally critique the PR!

CC @plotfi for initial review.

---------

Co-authored-by: Aidan <aidan.goldfarb@mail.mcgill.ca>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review Has pending Phabricator review help wanted Indicates that a maintainer wants help. Not [good first issue].
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants