[SYCL][CUDA][HIP] warp misaligned address on CUDA and results mismatch on HIP #5007

Closed
@zjin-lcf

Description

Running the example https://github.com/zjin-lcf/HeCBench/blob/master/aop-sycl/main.cpp, built with CUDA support, on a P100 GPU reports a "warp misaligned address" error. The error may be caused by the shared local memory "double4 lsums" in the kernel prepare_svd_kernel<256, PayoffPut>. The SYCL program runs successfully on an Intel GPU.

Have you encountered a warp misaligned address error when porting a CUDA program?

Running the example built with HIP support produces a result that does not match the result of the HIP/CUDA version.

To reproduce

make HIP=yes
./main

==============
Num Timesteps         : 100
Num Paths             : 32K
Num Runs              : 1
T                     : 1.000000
S0                    : 3.600000
K                     : 4.000000
r                     : 0.060000
sigma                 : 0.200000
Option Type           : American Put
==============
GPU Longstaff-Schwartz: 0.39776070   (expected: 0.44783124)

Labels

bug (Something isn't working), cuda (CUDA back-end), hip (Issues related to execution on HIP backend), runtime (Runtime library related issue)