Add an optional diagnostic for the use of double-precision ops in kernels #5783

Closed
@al42and

Description

Is your feature request related to a problem? Please describe

In some (many?) cases, the kernels are intended to be run on a GPU, most of which have reduced FP64 performance compared to FP32. However, it's very easy in C++ to accidentally introduce double-precision arithmetic.

Such issues can be detected with a profiler, but it might be helpful to have an opt-in, compile-time diagnostic to warn about double-precision ops in the device kernels. No warning is to be emitted unless the user explicitly requested it.

Toy example (based on real cases). Excerpt:

  sycl::event ev = queue.submit([&](sycl::handler &cgh) {
    auto buffer_dev = buffer.get_access<sycl::access_mode::write>(cgh);
    cgh.parallel_for<class DummyKernel>(
        sycl::range<1>{numThreads}, [=](sycl::id<1> threadId) {
          int x = threadId.get(0);
          float x2 = x * 2.0;      // Forgot F suffix
          float y = fma(x, 2, x2); // Used fma instead of sycl::fma<float>
          buffer_dev[threadId] = y;
        });
  });

Part of the SPIR-V output (clang++ -fsycl -ffast-math -Wall simple.cpp -O3 -o simple && SYCL_DUMP_IMAGES=1 ./simple):

   %conv_i_i = OpUConvert %uint %33
  %conv2_i_i = OpConvertSToF %double %conv_i_i
    %mul_i_i = OpFMul %double %conv2_i_i %double_2
  %conv3_i_i = OpFConvert %float %mul_i_i
  %conv5_i_i = OpFConvert %double %conv3_i_i
         %47 = OpExtInst %double %1 fma %conv2_i_i %double_2 %conv5_i_i
  %conv6_i_i = OpFConvert %float %47

So, we're getting unnecessary conversions between FP64 and FP32, and FP64 ops when FP32 were probably intended.
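For comparison, here is a host-only sketch of where the promotion comes from and what a float-only variant could look like. It uses plain `std::fma` instead of SYCL so that it builds without a SYCL toolchain; the function names `accidental_fp64` and `fp32_only` are made up for illustration. In the real kernel the equivalent change would presumably be an `f` suffix on the literal and a `sycl::fma` call with all-`float` arguments.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Hypothetical host-side stand-in for the kernel body in the toy example.
// The static_asserts document the types the compiler deduces.
inline float accidental_fp64(int x) {
  auto x2 = x * 2.0;  // int * double literal: the product is a double
  static_assert(std::is_same<decltype(x2), double>::value,
                "double literal promotes the product to double");
  float x2f = static_cast<float>(x2);  // narrowed back, as in the original
  auto y = std::fma(x, 2, x2f);        // integer arguments select the double overload
  static_assert(std::is_same<decltype(y), double>::value,
                "mixed int/float arguments give a double fma");
  return static_cast<float>(y);
}

// Float-only variant: convert once, then keep every operand a float.
inline float fp32_only(int x) {
  const float xf = static_cast<float>(x);
  auto x2 = xf * 2.0f;  // float * float literal stays float
  static_assert(std::is_same<decltype(x2), float>::value,
                "float literal keeps the product in float");
  auto y = std::fma(xf, 2.0f, x2);  // all-float arguments select the float overload
  static_assert(std::is_same<decltype(y), float>::value,
                "all-float arguments give a float fma");
  return y;
}

// Small self-check: both variants agree for values that are exact in float.
inline void run_checks() {
  assert(accidental_fp64(3) == 12.0f);
  assert(fp32_only(3) == 12.0f);
}
```

Both functions return the same value for small inputs, which is exactly why the problem is easy to miss: the only visible difference is in the generated instructions.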

Describe the solution you would like

Have an optional -Wsycl-fp64-on-device warning switch.

Describe alternatives you have considered

  • A kernel attribute enabling the diagnostic for a single kernel?
  • Things like float x2 = x * 2.0 can probably be optimized in -ffast-math mode?

Additional context

The example above is compiled correctly, so it's not a bug. But it is likely not what the user intended, so a diagnostic would be a quick way to catch such problems without even having to run a profiler.
