Open
Labels: bug, cuda, hip
Description
Describe the bug
DPC++ does not use a correctly rounded sqrt() on AMD GPUs, even if -fno-fast-math is explicitly passed.
This is contrary to the behavior of both hipcc and AdaptiveCpp, which correctly round sqrt by default, and it can lead to misleading benchmark results, or at least make them difficult to interpret.
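For anyone who wants to confirm the numerical effect on actual results, here is a minimal host-side sketch of a check for whether a single-precision sqrt result is correctly rounded. The function names (reference_sqrt, is_correctly_rounded_sqrt, sqrt_ulp_error) are hypothetical helpers, not part of DPC++ or ROCm. It assumes the host's double-precision std::sqrt is itself correctly rounded; under that assumption, rounding the double result to float gives the correctly rounded float result.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reference value: compute sqrt in double and round once to float.
// Assumption: the host's std::sqrt(double) is correctly rounded.
inline float reference_sqrt(float x) {
    return static_cast<float>(std::sqrt(static_cast<double>(x)));
}

// True if `candidate` (e.g. a value copied back from the device)
// is bit-for-bit the correctly rounded float sqrt of x.
inline bool is_correctly_rounded_sqrt(float x, float candidate) {
    return candidate == reference_sqrt(x);
}

// Distance in units in the last place between `candidate` and the reference.
// Only meaningful for finite, non-negative x and candidate, where the
// float bit patterns are ordered like the values themselves.
inline std::int64_t sqrt_ulp_error(float x, float candidate) {
    const float ref = reference_sqrt(x);
    std::uint32_t bits_ref = 0;
    std::uint32_t bits_cand = 0;
    std::memcpy(&bits_ref, &ref, sizeof bits_ref);
    std::memcpy(&bits_cand, &candidate, sizeof bits_cand);
    const std::int64_t a = static_cast<std::int64_t>(bits_ref);
    const std::int64_t b = static_cast<std::int64_t>(bits_cand);
    return a > b ? a - b : b - a;
}
```

Running a kernel that writes sqrt(x) for a range of inputs and feeding the results through sqrt_ulp_error on the host would show whether any results differ from the correctly rounded value.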
To reproduce
- Print the compiler invocation, e.g. using:
  icpx -fsycl -fsycl-targets=amdgcn-amd-amdhsa -Xsycl-target-backend --offload-arch=gfx906 /dev/null -fno-fast-math -###
- Observe in the output that it links with the bitcode library oclc_correctly_rounded_sqrt_off.bc
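The check in the steps above can be automated. The following is a rough sketch, assuming the printed invocation has been captured into a string and that the control libraries follow the oclc_<name>_on.bc / oclc_<name>_off.bc naming seen in this report. The function name parse_oclc_knobs is a hypothetical helper, not an existing tool.

```cpp
#include <map>
#include <regex>
#include <string>

// Scan captured compiler output for ROCm control bitcode libraries named
// oclc_<name>_on.bc or oclc_<name>_off.bc and report each knob's state.
// Libraries without an _on/_off suffix are ignored.
std::map<std::string, bool> parse_oclc_knobs(const std::string& text) {
    static const std::regex knob_re(R"(oclc_([A-Za-z0-9_]+?)_(on|off)\.bc)");
    std::map<std::string, bool> knobs;
    for (std::sregex_iterator it(text.begin(), text.end(), knob_re), end;
         it != end; ++it) {
        const std::string name = (*it)[1].str();
        const bool enabled = ((*it)[2].str() == "on");
        knobs[name] = enabled;  // if a knob appears twice, the last one wins
    }
    return knobs;
}
```

With the invocation from this report, the returned map would be expected to contain correctly_rounded_sqrt mapped to false, which is the behavior being reported.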
Environment
This was observed on Linux with oneAPI 2024.0.2.
Additional context
I suspect that oclc_correctly_rounded_sqrt_off is simply the default because all ROCm bitcode library configuration knobs are initialized to false (daz = off, finite_only = off, unsafe_math = off, correctly_rounded_sqrt = off).
correctly_rounded_sqrt, however, may have to be treated differently, because it is the only knob where a setting of off does not correspond to the more precise behavior.