### Is your feature request related to a problem? Please describe

`joint_matrix with parameters matrix_type::fp16, use::a, Rows=16, Cols=16 is not supported on this device`

### Describe the solution you would like

Thank you for explaining that this is not supported.

### Describe alternatives you have considered

_No response_

### Additional context

Running the program on an Intel GPU shows the error.

https://github.com/zjin-lcf/HeCBench/tree/master/src/wmma-sycl

The CUDA and HIP programs run successfully on the NVIDIA and AMD GPUs, respectively.