
Missing warning with clang-llvm compiler #8836

Open
@vidyalatha-badde

Description


Hi all,

I'm working on migrating the cuda-samples from the NVIDIA CUDA Toolkit and running the migrated code on an NVIDIA GPU (Tesla P100-PCIE-12GB) using the open-source DPC++ compiler.
I've successfully migrated the convolutionSeparable application and can run it on the NVIDIA GPU. Since there is a performance gap, I started experimenting with the code, and during those trials I observed that nvcc issues a warning when a variable is set but never used, while the open-source DPC++ compiler (invoked as clang++) issues no such warning.
Please find the migrated code attached below:
dpct_output.zip

Steps to reproduce:
Compilation:
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -I ../../../Common/ *.cpp
Execution:
./a.out

Basically, there are two parts in the convolutionSeparable.dp.cpp file: one loads the input data and the other performs the computation. To understand where the migrated code spends most of its time, I commented out the computation part (in both the rowkernelgpu and columnkernelgpu functions) and ran the code again. Here is the result:

[./a.out] - Starting...
Running on Tesla P100-PCIE-12GB
Image Width x Height = 3072 x 3072

Allocating and initializing host arrays...
Allocating and initializing CUDA arrays...
Running GPU convolution (16 identical iterations)...

convolutionSeparable, Throughput = 53907.5108 MPixels/sec, **Time = 0.00018 s**, Size = 9437184 Pixels, NumDevsUsed = 1, Workgroup = 0

Reading back GPU results...
Checking the results...
Shutting down...

Repeating the same experiment with the native CUDA code, nvcc throws a warning as shown below:
warning #550-D: variable "s_Data" was set but never used

with output

[./convolutionSeparable] - Starting...
GPU Device 0: "Pascal" with compute capability 6.0

Image Width x Height = 3072 x 3072
Allocating and initializing host arrays...
Allocating and initializing CUDA arrays...
Running GPU convolution (16 identical iterations)...

convolutionSeparable, Throughput = 127100.1251 MPixels/sec, **Time = 0.00007 s**, Size = 9437184 Pixels, NumDevsUsed = 1, Workgroup = 0

Reading back GPU results...
Shutting down...

As you can see, the migrated code's time (0.00018 s) is nearly 2.5x the CUDA code's time (0.00007 s).

Could someone let me know why there is such a time gap? Is there any significant meaning behind the nvcc warning?
What exactly does it mean, and why does clang/LLVM not emit that warning?

Please let me know if you need any other information.

Thanks in advance
~Vidya.
