
[Bug]: Can't serve on ray cluster although passing VLLM_HOST_IP #13521

Open
@hahmad2008

Description


Your current environment

vLLM API server version 0.7.2
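A minimal way to confirm the installed version (assuming a standard pip install):

python -c "import vllm; print(vllm.__version__)"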

🐛 Describe the bug

I create a cluster with two instances, each with 1 GPU.

  • head
RAY_num_heartbeats_timeout=600 ray start --head --node-ip-address HEAD-IP \
                                                  --port 6379 \
                                                  --ray-client-server-port 10001 \
                                                  --object-manager-port=8076  \
                                                  --node-manager-port=8077

--------------------
Ray runtime started.
--------------------

Next steps
  To add another node to this Ray cluster, run
    ray start --address='HEAD-IP:6379'

  • worker
ray start --object-manager-port=8076 \
            --address='HEAD-IP:6379' \
            --node-manager-port=8077 
  • serve the model with two options, after both nodes are up (a ray status sanity check is sketched after the commands below): first with --tensor-parallel-size 1 --pipeline-parallel-size 2, then with --tensor-parallel-size 2. With both I get the following error:

  • Error:

2025-02-19 07:41:17,494	INFO worker.py:1654 -- Connecting to existing Ray cluster at address: HEAD-IP:6379...
2025-02-19 07:41:17,507	INFO worker.py:1832 -- Connected to Ray cluster. View the dashboard at 127.0.0.1:8265 
(autoscaler +18s) Tip: use `ray status` to view detailed cluster status. To disable these messages, set RAY_SCHEDULER_EVENTS=0.
(autoscaler +18s) Error: No available node types can fulfill resource request {'node:HEAD-IP:6379': 0.001, 'GPU': 1.0}. Add suitable node types to this cluster to resolve this issue.
  • commands (a hedged variant is sketched after these):
VLLM_HOST_IP=HEAD-IP:6379 vllm serve NousResearch/Meta-Llama-3.1-8B-Instruct  --max-model-len 8192  --gpu-memory-utilization 0.8 \
  --tensor-parallel-size 1 --pipeline-parallel-size 2 --distributed-executor-backend ray 
VLLM_HOST_IP=HEAD-IP:6379 vllm serve NousResearch/Meta-Llama-3.1-8B-Instruct  --max-model-len 8192  --gpu-memory-utilization 0.8 \
 --tensor-parallel-size 2  --distributed-executor-backend ray 
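
Before serving, a quick sanity check on the head node (a minimal sketch, assuming the standard Ray CLI) to confirm both nodes and both GPUs are registered:

# Run on the head node; the resource totals should list 2 nodes and 2.0 GPU.
ray status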
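
One variant worth noting (an assumption, not a confirmed fix): the rejected resource request above names node:HEAD-IP:6379, which suggests the :6379 port is being folded into the Ray node resource that vLLM requests. With VLLM_HOST_IP set to the bare head-node IP (HEAD-IP stands in for the real address), the first command would look like:

# Assumption: VLLM_HOST_IP takes an IP address only, without the Ray port.
VLLM_HOST_IP=HEAD-IP vllm serve NousResearch/Meta-Llama-3.1-8B-Instruct --max-model-len 8192 --gpu-memory-utilization 0.8 \
  --tensor-parallel-size 1 --pipeline-parallel-size 2 --distributed-executor-backend ray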

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Metadata

Labels: bug (Something isn't working), ray (anything related with ray)
Status: Need User Input
Assignees: no one assigned
Milestone: no milestone