ROCR should either
use dmabuf by default when running on the mainline kernel, or
at least assert with a reasonable error message when HSA_ENABLE_IPC_MODE_LEGACY=0 is not set but the kernel doesn't support the ioctls required for legacy mode.
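The two requested behaviours can be sketched as a small selection routine. This is only an illustration, not ROCR's actual code: `IpcMode`, `parse_legacy_flag`, `select_ipc_mode`, and the `allow_dmabuf_fallback` switch are hypothetical names, the result of the kernel capability probe is passed in as a plain `bool`, and the parsing rule (any value other than `0` means legacy) is an assumption.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical names throughout; none of this is taken from ROCR-Runtime.
enum class IpcMode { Legacy, DmaBuf };

// Interpret the raw value of HSA_ENABLE_IPC_MODE_LEGACY.
// Assumption: unset -> no preference, "0" -> dmabuf, anything else -> legacy.
std::optional<bool> parse_legacy_flag(const char* value) {
  if (value == nullptr) return std::nullopt;
  return std::string(value) != "0";
}

// legacy_ioctls_supported is the result of some kernel capability probe,
// which is outside the scope of this sketch.
IpcMode select_ipc_mode(std::optional<bool> legacy_flag,
                        bool legacy_ioctls_supported,
                        bool allow_dmabuf_fallback) {
  // The user explicitly asked for dmabuf with HSA_ENABLE_IPC_MODE_LEGACY=0.
  if (legacy_flag.has_value() && !*legacy_flag) return IpcMode::DmaBuf;

  // Legacy mode is wanted (or is the default) and the kernel can do it.
  if (legacy_ioctls_supported) return IpcMode::Legacy;

  // First request: default to dmabuf when the kernel lacks the legacy ioctls
  // and the user expressed no preference.
  if (!legacy_flag.has_value() && allow_dmabuf_fallback) return IpcMode::DmaBuf;

  // Second request: fail early with a readable message instead of an
  // obscure failure later on.
  throw std::runtime_error(
      "legacy IPC mode is not supported by the running kernel; "
      "set HSA_ENABLE_IPC_MODE_LEGACY=0 to use dmabuf");
}
```

A caller might feed it `parse_legacy_flag(std::getenv("HSA_ENABLE_IPC_MODE_LEGACY"))` once at startup, so that an unsupported configuration is reported before any IPC handle is created.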
Operating System
any
CPU
any
GPU
any
ROCm Version
ROCm 6.3.0
ROCm Component
ROCR-Runtime
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response
Hi @IMbackK,
Using dmabuf for IPC is the long-term goal, but this feature was implemented only recently and we still have some stability issues with that implementation. This is why we have disabled it in the current code. These stability issues are currently being investigated. Once they are fixed, we will enable the option to allow it.
IMO ROCR should then at least print something like "not supported configuration" and abort when it runs on the mainline kernel and the IPC mechanism is used.
Yes, we are currently looking for an elegant way to handle this error because the current issue is that ROCr gets the same error code when the IOCTL does not exist and when the IOCTL fails for other reasons.
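One way to live with the ambiguous error code, sketched here under assumptions, is to remember whether the legacy ioctl has ever succeeded and to mention the "kernel may not support this" possibility only while that is still unknown. `LegacyIpcState` and `describe_ioctl_result` are hypothetical names, and the `rc`/`err` arguments stand in for the return value and `errno` of whatever ioctl ROCR actually issues.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper types; not part of ROCR-Runtime.
enum class LegacyIpcSupport { Unknown, Supported };

struct LegacyIpcState {
  LegacyIpcSupport support = LegacyIpcSupport::Unknown;
};

// rc:  return value of the ioctl call (0 on success, -1 on failure).
// err: errno captured immediately after the call.
// Returns an empty string on success, otherwise a diagnostic message.
std::string describe_ioctl_result(LegacyIpcState& state, int rc, int err) {
  if (rc == 0) {
    // A single success proves that the kernel implements the ioctl.
    state.support = LegacyIpcSupport::Supported;
    return "";
  }

  std::string msg = "legacy IPC ioctl failed: ";
  msg += std::strerror(err);

  // The error code alone cannot tell "ioctl missing" from "ioctl failed",
  // so only hint at a missing ioctl while no call has ever succeeded.
  if (state.support == LegacyIpcSupport::Unknown) {
    msg += "; the running kernel may not support legacy IPC mode, "
           "consider setting HSA_ENABLE_IPC_MODE_LEGACY=0";
  }
  return msg;
}
```

This does not resolve the ambiguity itself; it only makes the first failure on an unsupported kernel point the user at the likely cause.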
Problem Description
Currently ROCR uses dmabuf over kfd only when HSA_ENABLE_IPC_MODE_LEGACY is set to zero; see:
ROCR-Runtime/runtime/hsa-runtime/core/util/flag.h
Line 237 in b02b842
This happens with no regard for whether the ioctls required for legacy mode are even supported by the running kernel, causing obscure and difficult-to-debug issues like ROCm/rccl#1454.