Closed
Labels
bug (Something isn't working), cuda (CUDA back-end), hip (Issues related to execution on HIP backend), runtime (Runtime library related issue)
Description
Running the example https://github.com/zjin-lcf/HeCBench/blob/master/aop-sycl/main.cpp, built with CUDA support, on a P100 GPU reports a "warp misaligned address" error. The error may be caused by the shared local memory array "double4 lsums" in the kernel prepare_svd_kernel<256, PayoffPut>. The same SYCL program runs successfully on an Intel GPU.
Have you encountered a "warp misaligned address" error when porting a CUDA program?
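For readers unfamiliar with this class of error: a misaligned-address fault generally means a load or store was issued at an address that is not a multiple of the alignment the access requires. The host-side C++ sketch below is not taken from the example; it only illustrates the arithmetic. The struct `Double4`, its 32-byte alignment, and the helpers `is_aligned`, `align_up`, and `element_offset` are all hypothetical names and assumptions chosen for illustration. The actual alignment of `double4` depends on the SYCL or CUDA implementation, and this sketch does not establish the root cause of the reported failure.

```cpp
#include <cstddef>

// Hypothetical stand-in for a four-double vector type. The 32-byte
// alignment is an assumption for illustration only; the real double4
// alignment depends on the SYCL or CUDA implementation in use.
struct alignas(32) Double4 {
  double x, y, z, w;
};

// True when a byte offset is a multiple of the required alignment.
constexpr bool is_aligned(std::size_t byte_offset, std::size_t alignment) {
  return byte_offset % alignment == 0;
}

// Round a byte offset up to the next multiple of `alignment`.
// `alignment` must be a power of two.
constexpr std::size_t align_up(std::size_t byte_offset, std::size_t alignment) {
  return (byte_offset + alignment - 1) & ~(alignment - 1);
}

// Byte offset of element `i` of a Double4 array that starts at
// `base_offset` inside a larger local-memory arena.
constexpr std::size_t element_offset(std::size_t base_offset, std::size_t i) {
  return base_offset + i * sizeof(Double4);
}
```

Under these assumptions, an array that starts 16 bytes into the arena places element 3 at byte 112, which is not a multiple of 32. Rounding the base up with `align_up(16, 32)` moves the array to byte 32 and element 3 to byte 128, which is.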
When the example is built with HIP support, the result does not match that of the HIP/CUDA versions of the program:
To reproduce
make HIP=yes
./main
==============
Num Timesteps : 100
Num Paths : 32K
Num Runs : 1
T : 1.000000
S0 : 3.600000
K : 4.000000
r : 0.060000
sigma : 0.200000
Option Type : American Put
==============
GPU Longstaff-Schwartz: 0.39776070 (expected: 0.44783124)