[libclc] Enable -ffp-contract=fast-honor-pragmas except for exp/trig/hyperbolic funcs #153137
…unctions
According to the OpenCL spec, native_* functions have implementation-defined accuracy and typically have better performance. We can enable floating-point contraction optimizations for them.
I think fp contract should be globally enabled in the build, and selectively disabled in the handful of places where it is problematic (namely specific blocks in expF, sinbF, and trig reductions).
libclc/CMakeLists.txt
@@ -304,7 +304,7 @@ set_source_files_properties(
  ${CMAKE_CURRENT_SOURCE_DIR}/opencl/lib/generic/math/native_sin.cl
  ${CMAKE_CURRENT_SOURCE_DIR}/opencl/lib/generic/math/native_sqrt.cl
  ${CMAKE_CURRENT_SOURCE_DIR}/opencl/lib/generic/math/native_tan.cl
- PROPERTIES COMPILE_OPTIONS -fapprox-func
+ PROPERTIES COMPILE_OPTIONS "-fapprox-func;-ffp-contract=fast"
Also maybe should use -ffp-contract=fast-honor-pragmas, not sure if the stupid interpretation ever got fixed for fast
…o exponential/trigonometric/hyperbolic funcs
@@ -6,6 +6,8 @@
  //
  //===----------------------------------------------------------------------===//

+ #pragma clang fp contract(off)
This can be much more targeted. The problematic areas can be specific block scopes inside of individual functions. I'd suggest running the conformance test with it enabled globally, and then finding the specific places that require this
e.g. in exp f32
- float e = BUILTIN_RINT_F32(ph);
- float a = ph - e + pl;
+ float a, e;
+ {
+ #pragma OPENCL FP_CONTRACT OFF
+ e = BUILTIN_RINT_F32(ph);
+ a = ph - e + pl;
+ }
+
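A minimal C sketch of why a block like the one above can be sensitive to contraction. It is not taken from libclc: the function names are hypothetical, and the fused case is emulated in double precision (where a product of two floats is exact) rather than produced by a real FMA instruction. `volatile` is used only so the unfused case behaves the same whatever contraction setting the compiler was given.

```c
/* Unfused: the product is rounded to float first, then the subtraction
   happens. The volatile store forces that intermediate rounding so the
   compiler cannot contract the two operations (demo-only trick). */
static float mul_sub_separate(float a, float b, float c) {
    volatile float p = a * b;
    return p - c;
}

/* Stand-in for what a contracting compiler may emit: the product is kept
   exact (a float*float product fits in a double) and the result is rounded
   to float only at the end. */
static float mul_sub_fused(float a, float b, float c) {
    double exact = (double)a * (double)b;
    return (float)(exact - (double)c);
}
```

With a = b = 1 + 2^-12 and c = 1 + 2^-11, the unfused form returns 0 because the product rounds to c, while the fused form returns the 2^-24 that the rounding discarded. Code that splits a value into high and low parts typically relies on one of these two behaviours, so letting the compiler switch between them silently can cost accuracy.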
> This can be much more targeted. The problematic areas can be specific block scopes inside of individual functions. I'd suggest running the conformance test with it enabled globally, and then finding the specific places that require this

Thanks, I'll run the OpenCL CTS on an Intel GPU to find the places.
Enabling -ffp-contract=fast-honor-pragmas globally improves performance.
Disable it in functions that may have problems with the flag.