Fix ROCm global load inline assembly in Marlin sparse kernel
Modify the cp_async4 functions to use the correct extern declaration of __builtin_amdgcn_global_load_lds on ROCm platforms. This fixes the global load path that the Marlin sparse kernel uses on ROCm and keeps the kernel's memory loading code buildable across platforms.
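
For readers unfamiliar with the pattern, here is a minimal host-side sketch of the kind of platform dispatch this message describes. It is not the code from this commit: the name `cp_async4_sketch`, the 16-byte copy size, and the `kBackend` label are assumptions made for illustration. The ROCm builtin call is deliberately left out because its exact declaration is not shown here, so a plain `memcpy` stands in for the real load on every platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Compile-time label for the platform the file is being built for.
// __HIP_PLATFORM_AMD__ and __CUDACC__ are the usual compiler-provided macros;
// the project may use its own guard macro instead (assumption).
#if defined(__HIP_PLATFORM_AMD__)
constexpr const char* kBackend = "rocm";
#elif defined(__CUDACC__)
constexpr const char* kBackend = "cuda";
#else
constexpr const char* kBackend = "host";
#endif

// Assumed transfer size: four 32-bit words (16 bytes) per call.
constexpr std::size_t kCopyBytes = 16;

// Hypothetical stand-in for a cp_async4-style helper.
// In a real kernel the ROCm branch would call the extern-declared
// __builtin_amdgcn_global_load_lds and the CUDA branch would issue an
// asynchronous copy; both are replaced by memcpy in this sketch.
inline void cp_async4_sketch(void* dst, const void* src) {
  assert(dst != nullptr && src != nullptr);
  std::memcpy(dst, src, kCopyBytes);
}
```

Callers see a single helper with one signature, so a platform-specific fix like the one in this commit can stay confined to the helper's body.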