-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[libclc] Add (fast) normalize to CLC; add half overloads #139759
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
939d417
to
8f72f0f
Compare
Does it pass with -cl-denorms-are-zero and with denormal support? |
The OpenCL-CTS doesn't actually test the geometrics builtins with that option - that's just for bruteforce. It passes on an OpenCL implementation which advertises support for single-precision denormals, yes. |
Sounds like a bug
Ideally would check it works with denormal flushing forced on, that's the main risk I can see. If I were to stare at this long enough I could probably guess at where the constant was derived from |
For simplicity the half overloads just call into the float versions of the builtin. Note that in the move some floating-point constants were combined. The vector2 versions of normalize used slightly different constants to the vector3 and vector4 versions of the same builtin. For float it was 0x1.0p-65 vs 0x1.0p-66 and for double 0x1.0p-513 vs 0x1.0p-514. I wasn't sure if this was necessary so this commit replaces the vector2 versions of the constants with the vector3/vector4 ones. The OpenCL-CTS seems okay with it. If this is incorrect then it's not very difficult to split them back out again.
8f72f0f
to
b7a74a8
Compare
Yeah, it sounds simpler just to keep the old constants. I've done that now, and there's no codegen changes. I'll update the PR description. |
For simplicity the half overloads just call into the float versions of the builtin. Otherwise there are no codegen changes to any target.