-
Notifications
You must be signed in to change notification settings - Fork 16
Unstructured OpenMP code #1178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unstructured OpenMP code #1178
Conversation
Address several problems with OpenMP constructs and unstructured code. In fir, code that contains a GOTO, EXIT, or any of a number of other branches is _unstructured_. OpenMP generally prohibits branching into or out of an OpenMP construct, but allows unstructured branches where the source and target are both local to the construct. A structured loop is implemented with a fir.do_loop op, and a structured IF is implemented with a fir.if op. Unstructured loops and IFs are implemented with explicit branches between basic blocks. This PR allows an OpenMP construct to immediately follow unstructured code (see PR 1077), and allows an OpenMP construct to contain unstructured code (see Issue 1120). The same issues are likely present in OpenACC code. The infrastructure changes in this PR should also be valid for OpenACC code, but file OpenACC.cpp is not changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks reasonable to me. Please wait for a more enlightened opinion regarding these CFGs/OpenMP changes like @schweitzpgi and @kiranchandramohan.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Thanks @vdonaldson for this patch. It will help us unblock our SNAP application work. Really appreciate that your work in fixing this.
I have a few minor comments for tests and one question otherwise.
I think I can close #1077 in favour of this patch. Is that OK?
@Leporacanthicus can you give this patch a test with the SNAP code without your workarounds?
@@ -0,0 +1,111 @@ | |||
! RUN: bbc %s -fopenmp -o "-" | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: A comment on what each of these test is testing might be helpful for future reference.
Nit: It might also be good to have PFT tests for these cases like the one in 1077.
d01b9ef#diff-280157740e4d34bdf8b6c3fa1aafdd92f337a59badc9e36162ec1484a7e36eca
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added comments for each test subroutine. Subroutine ss1 exercises the 1077 functionality.
@@ -0,0 +1,111 @@ | |||
! RUN: bbc %s -fopenmp -o "-" | FileCheck %s | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Suggestion for comment: Test that OpenMP constructs can be target of branches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment
!$omp end master | ||
print* | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Suggestion for comment: Test unstructured code inside OpenMP construct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment
print*, 'ss2-C', i | ||
print* | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Suggestion for comment: Test unstructured code inside a do loop inside OpenMP do construct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a comment
auto *eval = constructAndDirectiveStack.back(); | ||
if (eval->isExecutableDirective()) { | ||
// A construct at the end of an (unstructured) OpenACC or OpenMP | ||
// construct region must have an exit target inside the region. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the SS3 test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is exercised by test subroutines ss2 and ss3.
@@ -93,9 +93,27 @@ static void genObjectList(const Fortran::parser::OmpObjectList &objectList, | |||
} | |||
} | |||
|
|||
/// Create empty blocks for the current region. | |||
/// These blocks replace blocks parented to an enclosing region. | |||
void createEmptyRegionBlocks( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this function called for each OpenMP construct/region?
I don't see it changing the current region. Will this work for nested OpenMP regions (where the outer and inner have unstructured code)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is intended to operate on the current region, as updated before the call in createBodyOfOp.
I'm trying this patch now on SNAP (with small changes, rather than the larger changes I have to "work around unstructured constructs") I hit this:
I'm attempting to produce a small reproducer. I'm not suggesting this is a blocker for getting this patch in, but that there may be some more work needed for this to be completely solved - up to others to decide whether we should "work more on this patch" or "add separate patch later". |
Here's a small reproducer:
Note that a previously posted program with |
Good news: As far as I can tell, the remainder of the SNAP code compiles and builds correctly with this patch. |
That was the intent of adding a more narrowly focused implementation of the 1077 code here, along with some related comment updates. We've still got problems to solve though. |
Thanks for this small test case. It turns up at least one more problem variation, and hits at the intersection of some of the other known issues. The most obvious problem is that the The test case as is hits a |
Feel free to work on that in a separate patch. I can see how this patch helps a whole lot, and I don't know enough about this subject to say which is the better approach - two patches, or try to fix everything in one patch.
Yes, I have reverted the commit that causes that check - it hits a couple of different places in SNAP. I think there is an updated patch in review at the moment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks, Val for this patch. I guess we can fix the other issue in a separate patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I have raised #1192 for the remaining known issue in SNAP.
With this fix, I'm able to compile all of SNAP, and with four $OMP directives commented out, the application runs to completion with what appears to be correct results.
I've pushed a change that takes a step toward resolving the issue with (implicit or explicit) nested constructs. A full solution to that problem may need special code for some cases, such as the implicit As a general note, unstructured code (in the fir sense) in OpenMP code is probably not all that common. One way to artificially "create" more tests that exercise unstructured OpenMP code generation is to use the I'll merge this PR as a base for OpenMP developers to address the remaining issues. |
Since the FIR operations are mostly structured, it is only the functions that could contain multiple blocks inside an operation. This changes with OpenMP since OpenMP regions can contain multiple blocks. For unstructured code, the blocks are created in advance and belong to the top-level function. This caused code in OpenMP region to be placed under the function level. In this fix, if the OpenMP region is unstructured then new blocks are created inside it. Note1: This is part of upstreaming from the fir-dev branch of https://github.com/flang-compiler/f18-llvm-project. The code in this patch is a subset of the changes in flang-compiler#1178. Reviewed By: vdonaldson Differential Revision: https://reviews.llvm.org/D126293 Co-authored-by: Val Donaldson <vdonaldson@nvidia.com> Co-authored-by: Eric Schweitz <eschweitz@nvidia.com> Co-authored-by: Valentin Clement <clementval@gmail.com>
Since the FIR operations are mostly structured, it is only the functions that could contain multiple blocks inside an operation. This changes with OpenMP since OpenMP regions can contain multiple blocks. For unstructured code, the blocks are created in advance and belong to the top-level function. This caused code in OpenMP region to be placed under the function level. In this fix, if the OpenMP region is unstructured then new blocks are created inside it. Note1: This is part of upstreaming from the fir-dev branch of https://github.com/flang-compiler/f18-llvm-project. The code in this patch is a subset of the changes in flang-compiler/f18-llvm-project#1178. Reviewed By: vdonaldson Differential Revision: https://reviews.llvm.org/D126293 Co-authored-by: Val Donaldson <vdonaldson@nvidia.com> Co-authored-by: Eric Schweitz <eschweitz@nvidia.com> Co-authored-by: Valentin Clement <clementval@gmail.com>
Unstructured regions presents some issues for OpenMP code generation. While there are no branches out of the OpenMP region, there can be branches inside. This required the availability of an artificial target at the end of an OpenMP region. This was implemented by insertion an artifical `continue` and marking it as a target for a branch. (flang-compiler#1178) The artificial target is not required for OpenMP loops. Since the DO loop end can itself be a target of a branch. Moreover, insertion of the continue between the end of the loop and the end of the OpenMP loop construct presents problems since the OpenMP MLIR loop construct models both the loop and the construct. This can cause the terminator of the OpenMP loop construct to be missed. This patch solves the issue by skipping the insertion of the continue. Note: This issue is only hit if the `end openmp loop` directive is missed. This patch fixes the issues in: -> #58378 -> flang-compiler#1426 Fixes #58378 Reviewed By: vdonaldson Differential Revision: https://reviews.llvm.org/D151700
Address several problems with OpenMP constructs and unstructured code.
In fir, code that contains a GOTO, EXIT, or any of a number of other
branches is unstructured. OpenMP generally prohibits branching into
or out of an OpenMP construct, but allows unstructured branches where
the source and target are both local to the construct. A structured
loop is implemented with a fir.do_loop op, and a structured IF is
implemented with a fir.if op. Unstructured loops and IFs are implemented
with explicit branches between basic blocks. This PR allows an OpenMP
construct to immediately follow unstructured code (see PR 1077), and
allows an OpenMP construct to contain unstructured code (see Issue 1120).
The same issues are likely present in OpenACC code. The infrastructure
changes in this PR should also be valid for OpenACC code, but file
OpenACC.cpp is not changed.