Description
Hi, I'm running a cupy test with dask and ran into a CUDA error that reproduces only with dask-cuda, not with cupy alone. I'm hoping to get some help here. The error:
2024-05-30 07:53:26,923 - distributed.worker - WARNING - Compute Failed
Key: ('uniform-fb64484a444179abec146c7c4ac41c23', 103, 0)
Function: _apply_random_func
args: (<class 'cupy.random._bit_generator.XORWOW'>, 'uniform', SeedSequence(
entropy=1,
spawn_key=(103,),
), (65536, 256), [0.0, 1.0], {})
kwargs: {}
Exception: "CUDARuntimeError('cudaErrorDevicesUnavailable: CUDA-capable device(s) is/are busy or unavailable')"
The script I'm running, along with the dependency versions:
import dask
from dask import array as da
import dask_cuda
from dask_cuda import LocalCUDACluster
from distributed import Client, wait
import cupy

print(dask.__version__)
print(dask_cuda.__version__)
print(cupy.__version__)
# 2024.1.1
# 24.04.00
# 13.1.0


def make_regression(n_samples: int, n_features: int) -> tuple[da.Array, da.Array]:
    rng = da.random.default_rng(1)
    X = rng.uniform(size=(n_samples, n_features), chunks=(256**2, -1))
    y = X.sum(axis=1)
    return X, y


def main(client: Client) -> None:
    X, y = make_regression(n_samples=2**25, n_features=256)
    X, y = client.persist([X, y])
    wait([X, y])


if __name__ == "__main__":
    with LocalCUDACluster() as cluster:
        with Client(cluster) as client:
            with dask.config.set({"array.backend": "cupy"}):
                main(client)
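For scale, here is a rough sketch of the sizes this script implies. It assumes `uniform` produces float64 values (8 bytes each), which I have not verified for the cupy backend; `describe_workload` is a hypothetical helper written only for this illustration. The chunk shape it computes matches the `(65536, 256)` shape in the failing task above.

```python
import math


def describe_workload(n_samples, n_features, chunk_rows, itemsize=8):
    """Return chunk count and sizes in bytes for a row-chunked 2-D array.

    itemsize=8 is an assumption (float64 elements).
    """
    n_chunks = math.ceil(n_samples / chunk_rows)
    chunk_bytes = chunk_rows * n_features * itemsize
    total_bytes = n_samples * n_features * itemsize
    return {
        "n_chunks": n_chunks,
        "chunk_shape": (chunk_rows, n_features),
        "chunk_bytes": chunk_bytes,
        "total_bytes": total_bytes,
    }


# Same parameters as the script: 2**25 rows, 256 features, 256**2 rows per chunk.
info = describe_workload(n_samples=2**25, n_features=256, chunk_rows=256**2)
print(info["n_chunks"], "chunks of", info["chunk_bytes"] / 2**20, "MiB each")
print("total:", info["total_bytes"] / 2**30, "GiB")
```

Under that assumption, `X` alone is 512 chunks of 128 MiB, or 64 GiB in total, so persisting it keeps a large amount of data resident across the workers. Whether that is related to the error is not established here.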
If I run cupy alone, it can generate random numbers just fine:
>>> import cupy
>>> rng = cupy.random.default_rng(1)
>>> rng.uniform(size=256**2)
array([0.85849577, 0.01410829, 0.28965574, ..., 0.6419934 , 0.35287664,
0.16616132])
Lastly, I'm running the test on an EGX cluster with 4 T4s:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 495.29.05 Driver Version: 495.29.05 CUDA Version: 11.5 |
|-------------------------------+----------------------+----------------------+
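Since the `nvidia-smi` output above is cut off, one thing that could be checked separately is each GPU's compute mode. The CUDA runtime documentation describes `cudaErrorDevicesUnavailable` as often being caused by the exclusive-process or prohibited compute modes; whether that applies on this cluster is an assumption, not something confirmed here. The sketch below uses the `nvidia-smi --query-gpu` interface; the helper names are hypothetical.

```python
import subprocess

# Query index, name and compute mode for every GPU as header-less CSV.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,compute_mode",
    "--format=csv,noheader",
]


def parse_compute_modes(csv_text):
    """Parse lines like '0, Tesla T4, Default' into (index, name, mode) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, rest = line.split(",", 1)
        name, mode = rest.rsplit(",", 1)
        rows.append((int(index), name.strip(), mode.strip()))
    return rows


def non_default_gpus(rows):
    """Return the GPUs whose compute mode is anything other than 'Default'."""
    return [row for row in rows if row[2] != "Default"]


def query_compute_modes():
    """Run nvidia-smi and parse its output (needs an NVIDIA driver installed)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_modes(result.stdout)
```

On the machine itself, `non_default_gpus(query_compute_modes())` would list any GPU that is not in the default compute mode; an empty list would rule this explanation out.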