ray job submission keeps 503 server error #2201

Open
xushijie opened this issue Dec 12, 2023 · 0 comments

Issue

I followed the kuberay-gpu-training-example instructions to install a Ray cluster on my desktop, which has one GPU. I created a fresh local cluster instead of a remote one on GCP. The installation succeeds and I can access the dashboard at http://localhost:8265/. However, job submission from the console fails with the following message:
[image: screenshot of the error message]

Meanwhile, http://localhost:8265/api/version returns a JSON response:
{"version": "4", "ray_version": "2.2.0", "ray_commit": "b6af0887ee5f2e460202133791ad941a41f15beb"}
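The same check can be scripted. This is a minimal sketch using only the Python standard library, assuming the port-forward to localhost:8265 is active. The helper names `parse_version_payload` and `fetch_version` are hypothetical and are not part of Ray.

```python
import json
import urllib.request


def parse_version_payload(raw: str) -> dict:
    """Parse the JSON body returned by the dashboard's /api/version endpoint."""
    payload = json.loads(raw)
    # Keep only the three fields seen in the response above.
    return {
        "version": payload.get("version"),
        "ray_version": payload.get("ray_version"),
        "ray_commit": payload.get("ray_commit"),
    }


def fetch_version(address: str = "http://localhost:8265", timeout: float = 5.0) -> dict:
    """Query <address>/api/version; raises urllib.error.HTTPError on a 4xx/5xx reply."""
    with urllib.request.urlopen(f"{address}/api/version", timeout=timeout) as resp:
        return parse_version_payload(resp.read().decode("utf-8"))
```

Calling `fetch_version()` while the port-forward is running should return the fields shown above. The Ray version it reports can then be compared with the output of `ray --version` on the machine that runs `ray job submit`. Whether a version difference is related to the 503 is not confirmed here.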

Any suggestions on how to fix this?

The steps I followed were:

helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
helm install kuberay-operator kuberay/kuberay-operator --version 1.0.0

# Create a Ray cluster
kubectl apply -f https://raw.githubusercontent.com/ray-project/ray/master/doc/source/cluster/kubernetes/configs/ray-cluster.gpu.yaml

# port forwarding
kubectl port-forward --address 0.0.0.0 services/raycluster-head-svc 8265:8265

# Test cluster (optional)
ray job submit --address http://localhost:8265 -- python -c "import ray; ray.init(); print(ray.cluster_resources())"

anyscalesam transferred this issue from ray-project/ray on Jun 21, 2024