-
Notifications
You must be signed in to change notification settings - Fork 110
Fix atomic exhaustive tests. #1207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| #if defined(RAJA_TEST_EXHAUSTIVE) | ||
| , RAJA::omp_parallel_for_static_exec< > | ||
| , RAJA::omp_parallel_for_static_exec<4> | ||
| , RAJA::omp_parallel_for_nowait_static_exec< > |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are these exec policies removed from the list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason, they no longer exist as valid policies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, these tests are already covered in lines 58-59, and in the forall/region tests. We can safely remove these two lines without further additions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot these were for atomic policies, so added them back in. Tested with nvcc+gcc.
| reduce_minloc.reset(resetVal[0]); | ||
| reduce_maxloc.reset(resetVal[0]); | ||
| reduce_minloctup.reset(resetVal[0]); | ||
| reduce_maxloctup.reset(resetVal[0]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@trws Apparently, the maxloctup.reset() was getting optimized out in the Sequential version of this test when compiled with NVCC + GCC. Changing the ordering fixes this for some reason, but doing so feels clunky. If there is some better way to get around this, please let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it get optimized away with GCC only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, with GCC only. Not getting this error with other compilers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you say "optimized out" do you mean the side-effect didn't take effect so it was failing in the assertion below? That's the kind of thing that really shouldn't happen unless we accidentally have UB somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, the check on line 157 fails. It returns the initialized values, or more specifically, the default initialized values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a definite cause for concern. It probably shouldn't matter for sequential anyway, but is it valid to use reset after compute but before get like this? I know we have places where get actually completes the reduction, and can't remember if all of that is wired up right to allow this order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the cuda and hip backends this should be safe, as reset also finalizes the reduction by calling get before resetting the value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that seems odd to me. what is a use case for calling reset before getting the reduced value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than the author of the test not knowing about calling get before reset, there is probably no reason for this to happen. 😃
I'll modify the PR to put a get call into reset().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's odd, inserting get into the reset call does not help. Nor does manually calling get in the test itself. I'll still push the changes to reset, but keep the test with this modified ordering of calls.
| endif() | ||
| endif (EXTERNAL_CAMP_SOURCE_DIR) | ||
|
|
||
| if (RAJA_ENABLE_TARGET_OPENMP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidbeckingsale are you OK with these changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, seems reasonable - matches what we do in Umpire to ensure correct defines are passed to camp.
|
This is one you should file an issue for so we have a record of it.
Something is fishy here. Maybe something we should point `ubsan` toward
and see what happens.
|
a67b262
Good idea, created one here #1217. |
|
@davidbeckingsale @rhornung67 This should be good to go. I added |
Summary