Skip to content

[OpenMP][LLVM] Update alloca IP after PrivCB in OMPIRBUIlder #93920

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jun 5, 2024

Conversation

ergawy
Copy link
Member

@ergawy ergawy commented May 31, 2024

Fixes a crash uncovered by pr77666.f90 in the test suite when delayed privatization is enabled by default.

In particular, whenever PrivCB (the callback responsible for generating privatizaiton logic for an OMP variable) generates a multi-block privatization region, the insertion point diverges: the BB component of the IP can become a different BB from the parent block of the instruction iterator component of the IP. This PR updates the IP to make sure that the BB is the parent block of the instruction iterator.

@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir flang:openmp clang:openmp OpenMP related changes to Clang labels May 31, 2024
@llvmbot
Copy link
Member

llvmbot commented May 31, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-flang-openmp

Author: Kareem Ergawy (ergawy)

Changes

Fixes a crash uncovered by pr77666.f90 in the test suite.

In particular, whenever PrivCB (the callback responsible for generating privatizaiton logic for an OMP variable) generates a multi-block privatization region, the insertion point diverges: the BB component of the IP can become a different BB from the parent block of the instruction iterator component of the IP. This PR updates the IP to make sure that the BB is the parent block of the instruction iterator.


Full diff: https://github.com/llvm/llvm-project/pull/93920.diff

2 Files Affected:

  • (added) flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 (+20)
  • (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+3)
diff --git a/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 b/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90
new file mode 100644
index 0000000000000..0363494c59ecf
--- /dev/null
+++ b/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90
@@ -0,0 +1,20 @@
+! RUN: %flang -S -emit-llvm -fopenmp -mmlir --openmp-enable-delayed-privatization \
+! RUN:   -o - %s 2>&1 | FileCheck %s
+
+subroutine foo(x)
+  integer, allocatable :: x, y
+!$omp parallel private(x, y)
+  x = y
+!$omp end parallel
+end
+
+! CHECK-LABEL: define void @foo_
+! CHECK:         ret void
+! CHECK-NEXT:  }
+
+! CHECK-LABEL: define internal void @foo_..omp_par
+! CHECK-DAG:     call ptr @malloc
+! CHECK-DAG:     call ptr @malloc
+! CHECK-DAG:     call void @free
+! CHECK-DAG:     call void @free
+! CHECK:       }
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index cb4de9c8876dc..eab41eb8a35b2 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -1583,6 +1583,9 @@ IRBuilder<>::InsertPoint OpenMPIRBuilder::createParallel(
     } else {
       Builder.restoreIP(
           PrivCB(InnerAllocaIP, Builder.saveIP(), V, *Inner, ReplacementValue));
+      InnerAllocaIP = {InnerAllocaIP.getPoint()->getParent(),
+                       InnerAllocaIP.getPoint()};
+
       assert(ReplacementValue &&
              "Expected copy/create callback to set replacement value!");
       if (ReplacementValue == &V)

@llvmbot
Copy link
Member

llvmbot commented May 31, 2024

@llvm/pr-subscribers-flang-fir-hlfir

Author: Kareem Ergawy (ergawy)

Changes

Fixes a crash uncovered by pr77666.f90 in the test suite.

In particular, whenever PrivCB (the callback responsible for generating privatizaiton logic for an OMP variable) generates a multi-block privatization region, the insertion point diverges: the BB component of the IP can become a different BB from the parent block of the instruction iterator component of the IP. This PR updates the IP to make sure that the BB is the parent block of the instruction iterator.


Full diff: https://github.com/llvm/llvm-project/pull/93920.diff

2 Files Affected:

  • (added) flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 (+20)
  • (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+3)
diff --git a/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90 b/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90
new file mode 100644
index 0000000000000..0363494c59ecf
--- /dev/null
+++ b/flang/test/Lower/OpenMP/delayed-privatization-lower-allocatable-to-llvm.f90
@@ -0,0 +1,20 @@
+! RUN: %flang -S -emit-llvm -fopenmp -mmlir --openmp-enable-delayed-privatization \
+! RUN:   -o - %s 2>&1 | FileCheck %s
+
+subroutine foo(x)
+  integer, allocatable :: x, y
+!$omp parallel private(x, y)
+  x = y
+!$omp end parallel
+end
+
+! CHECK-LABEL: define void @foo_
+! CHECK:         ret void
+! CHECK-NEXT:  }
+
+! CHECK-LABEL: define internal void @foo_..omp_par
+! CHECK-DAG:     call ptr @malloc
+! CHECK-DAG:     call ptr @malloc
+! CHECK-DAG:     call void @free
+! CHECK-DAG:     call void @free
+! CHECK:       }
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index cb4de9c8876dc..eab41eb8a35b2 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -1583,6 +1583,9 @@ IRBuilder<>::InsertPoint OpenMPIRBuilder::createParallel(
     } else {
       Builder.restoreIP(
           PrivCB(InnerAllocaIP, Builder.saveIP(), V, *Inner, ReplacementValue));
+      InnerAllocaIP = {InnerAllocaIP.getPoint()->getParent(),
+                       InnerAllocaIP.getPoint()};
+
       assert(ReplacementValue &&
              "Expected copy/create callback to set replacement value!");
       if (ReplacementValue == &V)

Fixes a crash uncovered by [pr77666.f90](https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/gomp/pr77666.f90) in the test suite.

In particular, whenever `PrivCB` (the callback responsible for
generating privatizaiton logic for an OMP variable) generates a
multi-block privatization region, the insertion point diverges: the BB
component of the IP can become a different BB from the parent block of
the instruction iterator component of the IP. This PR updates the IP to
make sure that the BB is the parent block of the instruction iterator.
@ergawy ergawy force-pushed the delayed_privatization_25 branch from e4dfb33 to 926cf8d Compare May 31, 2024 05:19
@ergawy ergawy changed the title [OpenMP][][LLVM] Update alloca IP after PrivCB in OMPIRBUIlder [OpenMP][LLVM] Update alloca IP after PrivCB in OMPIRBUIlder May 31, 2024
@kiranchandramohan
Copy link
Contributor

Probably similar to #92430

@@ -1583,6 +1583,9 @@ IRBuilder<>::InsertPoint OpenMPIRBuilder::createParallel(
} else {
Builder.restoreIP(
PrivCB(InnerAllocaIP, Builder.saveIP(), V, *Inner, ReplacementValue));
InnerAllocaIP = {InnerAllocaIP.getPoint()->getParent(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the problem leading to this is that we've split the Alloca block [because the variable is allocatable], and the actual insert point is now the branch of the second block.

I don't think we want the new AllocaIP to be in that second block, in case someone adds more alloca ops - they should go into the first block.

My PR, that Kiran mentioned, has several of this kind of fixes [I will check if it fixes this case too,]. Whenever possible, I try to make the new location at the terminator of the original AllocaIP block.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My PR, that Kiran mentioned, has several of this kind of fixes [I will check if it fixes this case too,]. Whenever possible, I try to make the new location at the terminator of the original AllocaIP block.

Thanks for looking at this @Leporacanthicus. I tried your PR with a reproducing example and so far it does not handle this case. If you like, you can take the test I added in this PR and add it to yours and handle that case in your PR as well. Either way is fine with me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kiranchandramohan @Leporacanthicus based on the above comment, and since neither PRs solves the "1 BB for all allocas" problem and since #92430 does not solve the crash handled by this PR, I think there is not conflict between this PR and #92430. I think we can proceed with both. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hitting further problems (in the real code that we're trying to make work), so I'm OK with this going in if it is helping you get the job done. I personally would prefer it going into the oriiginal allocaIP BB - I'm well aware that not ALL things go in the original BB, but we don't need to make it worse... :)

Also, if all places that need this fix use code like builder.SetInsertPoint(allocaIP.getBlock()->getTerminator()); or perhaps:

    allocaIP =
        InsertPointTy(allocaIP.getBlock(),
                      allocaIP.getBlock()->getTerminator()->getIterator());

then it makes it a little easier to find ALL such places, and not have to find half a dozen different variants that do slightly different things.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    allocaIP =
        InsertPointTy(allocaIP.getBlock(),
                      allocaIP.getBlock()->getTerminator()->getIterator());

I can do that. I actually tried that locally and did not commit it since it did not make much difference in terms of generated IR. However, it makes sense to commit it given the point you mentioned about consistency.

Copy link
Contributor

@Leporacanthicus Leporacanthicus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@llvmbot llvmbot added the clang Clang issues not falling into any other category label Jun 4, 2024
@ergawy ergawy merged commit 7db4e6c into llvm:main Jun 5, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:openmp OpenMP related changes to Clang clang Clang issues not falling into any other category flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants