Commit 9c82a1b

[Doc] Update installation doc (#3746)
[Doc] Update installation doc for build from source and explain the dependency on torch/cuda version (#3746)
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
1 parent b6d1035 commit 9c82a1b

1 file changed (+21, -12 lines)

docs/source/getting_started/installation.rst

Lines changed: 21 additions & 12 deletions
@@ -19,7 +19,7 @@ You can install vLLM using pip:

 .. code-block:: console

-    $ # (Optional) Create a new conda environment.
+    $ # (Recommended) Create a new conda environment.
     $ conda create -n myenv python=3.9 -y
     $ conda activate myenv
@@ -28,24 +28,19 @@ You can install vLLM using pip:

 .. note::

-    As of now, vLLM's binaries are compiled on CUDA 12.1 by default.
-    However, you can install vLLM with CUDA 11.8 by running:
+    As of now, vLLM's binaries are compiled with CUDA 12.1 and public PyTorch release versions by default.
+    We also provide vLLM binaries compiled with CUDA 11.8 and public PyTorch release versions:

     .. code-block:: console

         $ # Install vLLM with CUDA 11.8.
-        $ export VLLM_VERSION=0.2.4
+        $ export VLLM_VERSION=0.4.0
         $ export PYTHON_VERSION=39
-        $ pip install https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux1_x86_64.whl
+        $ pip install https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux1_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
-
-        $ # Re-install PyTorch with CUDA 11.8.
-        $ pip uninstall torch -y
-        $ pip install torch --upgrade --index-url https://download.pytorch.org/whl/cu118
-
-        $ # Re-install xFormers with CUDA 11.8.
-        $ pip uninstall xformers -y
-        $ pip install --upgrade xformers --index-url https://download.pytorch.org/whl/cu118
+
+    In order to be performant, vLLM has to compile many CUDA kernels. The compilation unfortunately introduces binary incompatibility with other CUDA and PyTorch versions, even for the same PyTorch version with different build configurations.
+
+    Therefore, it is recommended to install vLLM in a **fresh** conda environment. If you have a different CUDA version or want to use an existing PyTorch installation, you need to build vLLM from source. See below for instructions.

 .. _build_from_source:
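To illustrate the compatibility note added above, here is a quick way to check which CUDA version your currently installed PyTorch was built against (an illustrative sketch, not part of this commit):

.. code-block:: console

    $ # Print the PyTorch version and the CUDA version it was built with.
    $ # If the CUDA version is neither 12.1 nor 11.8, neither set of
    $ # prebuilt wheels will match, and a source build is required.
    $ python -c "import torch; print(torch.__version__, torch.version.cuda)"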

@@ -77,6 +72,20 @@ You can also build and install vLLM from source:

     $ # Use `--ipc=host` to make sure the shared memory is large enough.
     $ docker run --gpus all -it --rm --ipc=host nvcr.io/nvidia/pytorch:23.10-py3

+If you don't want to use Docker, it is recommended to have a full installation of the CUDA Toolkit. You can download and install it from `the official website <https://developer.nvidia.com/cuda-toolkit-archive>`_. After installation, set the environment variable ``CUDA_HOME`` to the installation path of the CUDA Toolkit, and make sure that the ``nvcc`` compiler is in your ``PATH``, e.g.:
+
+.. code-block:: console
+
+    $ export CUDA_HOME=/usr/local/cuda
+    $ export PATH="${CUDA_HOME}/bin:$PATH"
+
+Here is a sanity check to verify that the CUDA Toolkit is correctly installed:
+
+.. code-block:: console
+
+    $ nvcc --version # verify that nvcc is in your PATH
+    $ ${CUDA_HOME}/bin/nvcc --version # verify that nvcc is in your CUDA_HOME
+
 .. note::
     If you are developing the C++ backend of vLLM, consider building vLLM with
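Beyond the sanity check above, it may also help to confirm that the toolkit found by ``nvcc`` agrees with the CUDA version of the PyTorch you plan to build against (a sketch under the same assumptions, not part of the committed doc):

.. code-block:: console

    $ # Both commands should report the same CUDA release, e.g. 12.1;
    $ # a mismatch typically leads to build or runtime errors.
    $ nvcc --version | grep "release"
    $ python -c "import torch; print(torch.version.cuda)"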
