Skip to content

[SYCL] Specialize atomic fetch_add for floating point types #2765

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jan 21, 2021

Conversation

AGindinson
Copy link
Contributor

@AGindinson AGindinson commented Nov 12, 2020

The new EXT/SPV_EXT_shader_atomic_float_add SPIR-V extension
allows us to further specialize atomic::fetch_add() for
floating point types. In device mode, we'll now be creating
an external call to a built-in-like __spirv_AtomicFAddEXT().
This is similar to what is done for other atomic binary
instructions, e.g. the integer specialization of fetch_add()
being mapped onto __spirv_AtomicIAdd().

Furthermore, atomic::fetch_sub() is also re-implemented
to use __spirv_AtomicFAddEXT(), the added operand being
a negation of the original one.

The new implementation can be exposed if a dedicated macro is
defined: SYCL_USE_NATIVE_FP_ATOMICS. Otherwise, a fallback
is used, where the atomic operation is done via spinlock emulation.
At the moment of committing this, only Intel GPUs support the
"native" implementation, which relies on a SPIR-V extension.

Tests for the feature have been finalized in
intel/llvm-test-suite#104.

Signed-off-by: Artem Gindinson artem.gindinson@intel.com

@AGindinson AGindinson force-pushed the private/agindins/atomic-fadd-draft branch 3 times, most recently from 9668b1b to d141d43 Compare November 12, 2020 11:11
@AlexeySotkin AlexeySotkin self-requested a review November 12, 2020 12:08
@AGindinson AGindinson force-pushed the private/agindins/atomic-fadd-draft branch 3 times, most recently from c7961b1 to 7579487 Compare November 30, 2020 12:57
Pennycook
Pennycook previously approved these changes Nov 30, 2020
Copy link
Contributor

@Pennycook Pennycook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me.

One quick question to make sure I understand: am I right in thinking that we now default to the native FAdd implementation for all our devices? So the old compare-exchange implementation will now only be tested for CUDA devices?

@bader We should add a TODO somewhere to implement AtomicFAddEXT in the CUDA backend.

@bader
Copy link
Contributor

bader commented Nov 30, 2020

We should add a TODO somewhere to implement AtomicFAddEXT in the CUDA backend.

I suggest adding a new issue with "cuda" tag - https://github.com/intel/llvm/issues.

@AGindinson
Copy link
Contributor Author

One quick question to make sure I understand: am I right in thinking that we now default to the native FAdd implementation for all our devices? So the old compare-exchange implementation will now only be tested for CUDA devices?

Yes, this is aimed for all device BEs. I'll create an issue for CUDA once the extension support is pulled down into this repository.

@AGindinson AGindinson force-pushed the private/agindins/atomic-fadd-draft branch from 7579487 to 8fd82f0 Compare December 2, 2020 12:41
@AGindinson AGindinson changed the title [DRAFT][SYCL] Support atomic float add in headers & SPIR-V [SYCL] Specialize atomic fetch_add for floating point types Dec 2, 2020
@AGindinson AGindinson force-pushed the private/agindins/atomic-fadd-draft branch from 02e8042 to d8712b5 Compare December 8, 2020 20:49
@AGindinson
Copy link
Contributor Author

@Pennycook, could you please check if __SYCL_EMULATE_NATIVE_ATOMICS__ is a fine name for the macro (as introduced by d8712b5)?

Also, should the IR checks in test/atomic_ref/ be updated to include the emulated checks under the macro defined, or would it be sufficient to update github.com/intel/llvm-test-suite to use the macro for non-supporting targets and make sure all tests pass?

@AGindinson
Copy link
Contributor Author

/summary:run

@AGindinson AGindinson force-pushed the private/agindins/atomic-fadd-draft branch from 56da357 to fc3ffef Compare December 9, 2020 09:35
@Pennycook
Copy link
Contributor

@Pennycook, could you please check if __SYCL_EMULATE_NATIVE_ATOMICS__ is a fine name for the macro (as introduced by d8712b5)?

That commit says __SYCL_EMULATE_FLOAT_ATOMICS__ rather than __SYCL_EMULATE_NATIVE_ATOMICS__. But I think __SYCL_EMULATE_FLOAT_ATOMICS__ is a good name.

Also, should the IR checks in test/atomic_ref/ be updated to include the emulated checks under the macro defined, or would it be sufficient to update github.com/intel/llvm-test-suite to use the macro for non-supporting targets and make sure all tests pass?

Good question. If we're documenting __SYCL_EMULATE_FLOAT_ATOMICS__ then I think we need to update the tests -- a device that implements AtomicFAdd would need to run correctly with and without the macro enabled, and so we should test that to ensure we don't accidentally break things. If we're not documenting __SYCL_EMULATE_FLOAT_ATOMICS__, and it's only a temporary implementation detail, I think it's sufficient to ensure that the existing tests pass when the macro is set to the expected value.

@AGindinson
Copy link
Contributor Author

That commit says SYCL_EMULATE_FLOAT_ATOMICS rather than SYCL_EMULATE_NATIVE_ATOMICS. But I think SYCL_EMULATE_FLOAT_ATOMICS is a good name.

That was an unfortunate typo on my side, sorry for that.

A device that implements AtomicFAdd would need to run correctly with and without the macro enabled.

Fair enough. And yet I tend to agree that:

it's only a temporary implementation detail

My personal approach would be to rely on the E2E tests passing with correct macro presence/absence. Should there be a lack of consensus among the reviewers, though, I'm prepared to lay this back and make all testing stages as detailed as possible. How would you recommend treating this? @AlexeySotkin, @AlexeySachkov, could you comment as well, please?

@AGindinson AGindinson marked this pull request as ready for review January 13, 2021 13:50
@AGindinson AGindinson requested a review from a team as a code owner January 13, 2021 13:50
@AGindinson AGindinson requested a review from v-klochkov January 13, 2021 13:50
@AGindinson AGindinson force-pushed the private/agindins/atomic-fadd-draft branch from fc3ffef to f7a585a Compare January 13, 2021 14:32
@AGindinson AGindinson requested a review from Pennycook January 13, 2021 14:40
Artem Gindinson added 11 commits January 21, 2021 15:54
Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
New macro name is __SYCL_USE_NATIVE_FP_ATOMICS__

Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
…here yet

Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
Make the user-exposed macro consistent with naming standards
for C++ libraries, ours in particular.

https://eel.is/c++draft/lex.name

Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
@AGindinson AGindinson force-pushed the private/agindins/atomic-fadd-draft branch from e484016 to be71125 Compare January 21, 2021 12:55
@AGindinson
Copy link
Contributor Author

Updates:

  1. Switched to using the emulated implementation as default.
  2. Renamed the macro correspondingly - it's SYCL_USE_NATIVE_FP_ATOMICS from now on. This is reflected in the test-suite PR: Address the atomics' macro rework: emulation runs by default llvm-test-suite#104
  3. Removed the changes from atomic.hpp, as this would've regressed the common atomic class for other targets. atomic_ref remains as the go-to class for native atomics.

romanovvlad
romanovvlad previously approved these changes Jan 21, 2021
Copy link
Contributor

@romanovvlad romanovvlad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
@bader bader merged commit 37a9a2a into intel:sycl Jan 21, 2021
AGindinson added a commit to AGindinson/llvm that referenced this pull request Mar 3, 2021
Minor implementation details aside, this is a follow-up to intel#2765.
The end-to-end tests are already done, the latest update being
intel/llvm-test-suite#118.

Signed-off-by: Artem Gindinson <amgindinson@gmail.com>
bader pushed a commit that referenced this pull request Mar 4, 2021
Minor implementation details aside, this is a follow-up to #2765.
The end-to-end tests are already done, the latest update being
intel/llvm-test-suite#118.

Signed-off-by: Artem Gindinson <amgindinson@gmail.com>
@AGindinson AGindinson deleted the private/agindins/atomic-fadd-draft branch September 22, 2021 08:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants