Commit 0b7d050

youkaichao authored and Alvant committed
[doc][faq] add warning to download models for every nodes (vllm-project#5783)
Signed-off-by: Alvant <alvasian@yandex.ru>
1 parent 90da4f2 commit 0b7d050

1 file changed: +4 -1 lines changed

docs/source/serving/distributed_serving.rst

Lines changed: 4 additions & 1 deletion
@@ -35,4 +35,7 @@ To scale vLLM beyond a single machine, install and start a `Ray runtime <https:/
     $ # On worker nodes
     $ ray start --address=<ray-head-address>
 
-After that, you can run inference and serving on multiple machines by launching the vLLM process on the head node by setting :code:`tensor_parallel_size` to the number of GPUs to be the total number of GPUs across all machines.
+After that, you can run inference and serving on multiple machines by launching the vLLM process on the head node by setting :code:`tensor_parallel_size` to the number of GPUs to be the total number of GPUs across all machines.
+
+.. warning::
+    Please make sure you downloaded the model to all the nodes, or the model is downloaded to some distributed file system that is accessible by all nodes.
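
The two requirements in this hunk (a :code:`tensor_parallel_size` equal to the total GPU count, and a model that every node can read) can be checked before launch. The following is a minimal sketch and not part of the commit. The helper names ``total_tensor_parallel_size`` and ``model_is_present`` are hypothetical, and treating the presence of ``config.json`` as evidence of a downloaded model is an assumption that fits Hugging Face-style checkpoints.

```python
from pathlib import Path


def total_tensor_parallel_size(gpus_per_node):
    """Sum the GPU counts of all nodes (hypothetical helper).

    The documentation sets tensor_parallel_size to the total number
    of GPUs across all machines, which is this sum.
    """
    if not gpus_per_node or any(n <= 0 for n in gpus_per_node):
        raise ValueError("each node must report at least one GPU")
    return sum(gpus_per_node)


def model_is_present(model_dir):
    """Return True if model_dir looks like a downloaded model.

    Assumption: a usable checkpoint directory contains config.json.
    Run this on every node, or once against a shared file system path.
    """
    path = Path(model_dir)
    return path.is_dir() and (path / "config.json").is_file()


if __name__ == "__main__":
    # Example: two machines with 4 GPUs each.
    print(total_tensor_parallel_size([4, 4]))
```

Running ``model_is_present`` on each node with the same path is one way to catch the situation the warning describes, where only the head node has the weights.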
