Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Move Bfloat16 math functions out of experimental #11506

Draft
wants to merge 4 commits into
base: sycl
Choose a base branch
from

Conversation

JackAKirk
Copy link
Contributor

Bfloat16 math functions are generally supported via software implementation as described in the bfloat16 extension document.
These APIs are straightforward and should be future proof, as such it makes sense to mark them experimental ASAP.
This PR moves bfloat16 math functions back into the bfloat16 extension, and resultantly out of experimental namespace.
This makes it more straightforward to allow dpct to access bfloat16 optimizations available on intel and Nvidia hardware: oneapi-src/SYCLomatic#1341

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@JackAKirk JackAKirk temporarily deployed to WindowsCILock October 11, 2023 13:56 — with GitHub Actions Inactive
@JackAKirk JackAKirk temporarily deployed to WindowsCILock October 11, 2023 14:22 — with GitHub Actions Inactive
@JackAKirk JackAKirk closed this Oct 12, 2023
@JackAKirk JackAKirk reopened this Oct 12, 2023
@JackAKirk JackAKirk temporarily deployed to WindowsCILock October 12, 2023 11:33 — with GitHub Actions Inactive
@JackAKirk JackAKirk temporarily deployed to WindowsCILock October 12, 2023 12:00 — with GitHub Actions Inactive
@JackAKirk JackAKirk marked this pull request as ready for review October 12, 2023 19:56
@JackAKirk JackAKirk requested review from a team as code owners October 12, 2023 19:56
`isnan`, `ceil`, `floor`, `cos`, `sin`, `exp`, `exp2`, `exp10`, `log`, `log2`,
`log10`, `rint`, `sqrt`, `rsqrt` and `trunc` SYCL floating point math functions.
These functions can be used as element wise operations on matrices, supplementing
the `bfloat16` support in the sycl_ext_oneapi_matrix extension.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This last sentence seems confusing because this extension does not add support to these math functions for joint_matrix types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can remove that sentence.

@@ -83,9 +83,21 @@ tolerate lower precision. Some implementations may still perform operations
on this type using 32-bit math. For example, they may convert the `bfloat16`
value to `float`, and then perform the operation on the 32-bit `float`.

This extension also adds `bfloat16` support to the `fma`, `fmin`, `fmax`, `fabs`,
`isnan`, `ceil`, `floor`, `cos`, `sin`, `exp`, `exp2`, `exp10`, `log`, `log2`,
`log10`, `rint`, `sqrt`, `rsqrt` and `trunc` SYCL floating point math functions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do not need to list each math function here. Instead, let's say something like:

This extension also adds bfloat16 support to several of the SYCL math functions, which are listed below in the section "Math Functions".

Duplicating the names here raises the possibility that the list may get out of date in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure sounds good.

The bfloat16 type is supported on all devices. DPC++ currently supports this
type natively on Intel Xe HP GPUs and Nvidia GPUs with
The bfloat16 type and math functions are supported on all devices. DPC++
currently supports this type natively on Intel Xe HP GPUs and Nvidia GPUs with
Compute Capability >= SM80. On other devices, and in host code, it is emulated
in software.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes! How is it that this extension does not have a feature-test macro? There should be a section "Feature test macro" as defined in the template.

Does the implementation define a feature-test macro, and we just forgot to document it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, but I can add one.


template <size_t N>
sycl::marray<bool, N> isnan(sycl::marray<bfloat16, N> x);
} // namespace sycl::ext::oneapi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This synopsis does not match the form we use for the existing isnan overloads. That existing form looks like:

template<typename NonScalar>
/*return-type*/ isnan(NonScalar x)

With the following specification:

Constraints: Available only if all of the following conditions are met:

  • NonScalar is marray, vec, or the __swizzled_vec__ type; and

  • The element type is float, double, or half.

Returns: If NonScalar is marray, returns true for each element of x only if x[i] has a NaN value. If NonScalar is vec or the __swizzled_vec__ type, returns -1 for each element of x if x[i] has a NaN value and
returns 0 otherwise.

I'd like this extension to use the same form, essentially broadening the constraint to allow NonScalar to be marray with element type bfloat16.

I also think we should extend the existing functions in the sycl namespace, rather than adding new functions in the sycl::ext::oneapi namespace. This is still consistent with the extension naming guidelines because the bfloat16 type is defined in the extension namespace. Thus, there is no danger of conflict with another extension and users still know they are using an extension when calling these functions with bfloat16.

A complicating factor is that DPC++ is in a state of transition from the old form of these math function to the new SYCL 2020 conformant form. Currently, DPC++ only defines the functions in the conformant form if the user opts-in by predefining a macro. I think the most reasonable path for the bfloat16 versions of these functions is to follow suit. By default, they will have the old non-conformant form, and they will only have the conformant form when the user pre-defines the macro. @steffenlarsen might have some input here, though.

I'm not sure how to handle this in the spec. One option is to add a general disclaimer NOTE saying that DPC++ currently implements the math functions as defined in this extension spec when the application predefines the macro. When the macro is not defined, the math functions accept bfloat16 parameters in a manner that is consistent with the standard types. We can then show an example of a typical function to illustrate the difference.

One other note of interest is that the published SYCL 2020 spec (revision 7) does not document the new form of these functions. The WG changed the specification after revision 7 was published. I plan to publish revision 8 within the next week, and that will show the new forms.

I also have a question about vec support. The math functions currently support both marray and vec whenever a non-scalar is supported. This extension currently only supports bfloat16 with marray. Why are we excluding support for vec? I wonder if it would actually be easier to implement if we supported both marray and vec since that's how the other types are handled.

This comment applies, obviously, to all the other functions below too.

Copy link
Contributor Author

@JackAKirk JackAKirk Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All makes sense to me. For the vec question. I don't think it was excluded. I didn't add it initially when creating the first iteration of the bfloat16 math function extension because we did not have a reason to add vec when we had marray support, at least in the cuda backend; but I imagined that at some point other backends would likely motivate the addition of vec bfloat16 math functions.
Even if they are only added in order to have math functions for scalar, vec, and marray types that are consistent with the rest of the types, I see that this could make sense.
I can make such a change to the spec, and add a generic implementation for bfloat16 math functions for vec type in this PR?
Or perhaps I could delay this till later if you think that is better?
I suppose features can still be added to supported extensions so long as they are new additions rather than changes to existing functions/types etc?

Copy link
Contributor Author

@JackAKirk JackAKirk Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reflection, I think it would make sense to wait until the revision 8 is published, and then use the new math function definition patterns for the bfloat16 math functions too.

Since bfloat16 math functions are new, would it be better to just support them using the new math function definition patterns (forms)? As far as I understand it the move to the new form doesn't actually change how a user would define the function in practice, just that the new forms use a mixture of overloading and templating instead of the previous situation that just used templates?

If the above is the case it would seem overly complex to introduce multiple macros for bfloat16 math functions, especially if they form part of the extension for the bfloat16 class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reflection, I think it would make sense to wait until the revision 8 is published, and then use the new math function definition patterns for the bfloat16 math functions too.

I expect revision 8 to be published soon. Yes, I think it would be good for this extension specification to use the same pattern and specification style that revision 8 has.

Since bfloat16 math functions are new, would it be better to just support them using the new math function definition patterns (forms)? As far as I understand it the move to the new form doesn't actually change how a user would define the function in practice, just that the new forms use a mixture of overloading and templating instead of the previous situation that just used templates?

Most application code won't be impacted by the move from old to new form, but there are some edge cases that will be visible to applications. I think I may have been premature about the macro. Our plan is to have a macro in the headers, which selects between the old and new forms of the builtin functions, but I think it is not implemented yet.

I agree that in theory the support for bfloat16 could be added with only the new form (not selected by the macro) because we have no legacy code to support with the old form. I was just thinking that it might be hard to implement this way. I assume the current implementation of the math functions has a set of constraints for the NonScalar template parameter. Therefore, I was thinking that you would simply broaden these constraints to allow NonScalar to be marray<bfloat16>. Once we add the macro, wouldn't this mean that the bfloat16 support was also affected by the macro?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the current implementation of the math functions has a set of constraints for the NonScalar template parameter.
Therefore, I was thinking that you would simply broaden these constraints to allow NonScalar to be marray<bfloat16>. > Once we add the macro, wouldn't this mean that the bfloat16 support was also affected by the macro?

At the moment the bfloat16 math functions support marray<bfloat16, N> where {N} means any positive value of size_t type.
So as I understand it this wouldn't affect bfloat16 support.

However as you pointed out before there is no sycl::vec support for bfloat16. I can only guess at whether Intel etc will want to introduce this or not. In relation to this question:

Although the vec question would as far as I can tell potentially just be an addition that I suppose could be added later; it is one of a few questions that seem worthwhile trying to address sooner rather than later.
Another issue that is not clear to me is where the bfloat16 functions in the intel::math namespace fit in relation to this bfloat16 extension. To be more concrete here are a few questions I have:

Although the existing bfloat16 math functions that are detailed in this PR appear to be uncontroversial, maybe it makes sense to get clearer answers to the above before moving any bfloat16 math functions out of experimental.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can only answer this question:

Is the long term plan that bfloat16 type and associated functions be a core part of sycl, or available as a header only library?

Yes, I think we should consider eventually adding bfloat16 as a core part of SYCL. We don't want it to be a header-only library because it needs some underlying h/w support for efficiency.

We need someone to be the "owner" of the bfloat16 feature, who can decide what functions make sense to add to this extension.

Copy link
Contributor

@KseniyaTikhomirova KseniyaTikhomirova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from RT source code side. Will approve after Greg's comments fix.

Removed overly verbose list of math functions supported in intro.

Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@JackAKirk JackAKirk marked this pull request as draft April 10, 2024 18:18
Copy link
Contributor

github-actions bot commented Oct 8, 2024

This pull request is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be automatically closed in 30 days.

@github-actions github-actions bot added the Stale label Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants