[Docker] Adding number of nvcc_threads during build as envar (#1893)
AguirreNicolas authored Dec 7, 2023
1 parent 42c02f5 commit 24f60a5
Showing 3 changed files with 6 additions and 2 deletions.
3 changes: 3 additions & 0 deletions Dockerfile
@@ -32,6 +32,9 @@ COPY vllm/__init__.py vllm/__init__.py

 # max jobs used by Ninja to build extensions
 ENV MAX_JOBS=$max_jobs
+# number of threads used by nvcc
+ARG nvcc_threads=8
+ENV NVCC_THREADS=$nvcc_threads
 RUN python3 setup.py build_ext --inplace

# image to run unit testing suite
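
The build arguments touched by this commit are supplied on the `docker build` command line. As a rough sketch of how a build script might assemble that command, the snippet below uses only the flags that appear in this commit's documentation change. The helper name `docker_build_command` and its parameters are hypothetical and are not part of vLLM.

```python
from typing import List, Optional


def docker_build_command(
    max_jobs: Optional[int] = None,
    nvcc_threads: Optional[int] = None,
) -> List[str]:
    """Hypothetical helper: assemble the `docker build` argument list."""
    cmd = ["docker", "build", ".", "--target", "vllm-openai",
           "--tag", "vllm/vllm-openai"]
    # Both build arguments are optional. When nvcc_threads is omitted,
    # the Dockerfile's `ARG nvcc_threads=8` default applies.
    if max_jobs is not None:
        cmd += ["--build-arg", f"max_jobs={max_jobs}"]
    if nvcc_threads is not None:
        cmd += ["--build-arg", f"nvcc_threads={nvcc_threads}"]
    return cmd


print(" ".join(docker_build_command(max_jobs=8, nvcc_threads=2)))
```

To actually run the build, the list could be passed to `subprocess.run` with `DOCKER_BUILDKIT=1` added to the environment, matching the documented invocation.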
2 changes: 1 addition & 1 deletion docs/source/serving/deploying_with_docker.rst
@@ -29,7 +29,7 @@ You can build and run vLLM from source via the provided dockerfile. To build vLL

 .. code-block:: console

-    $ DOCKER_BUILDKIT=1 docker build . --target vllm-openai --tag vllm/vllm-openai --build-arg max_jobs=8
+    $ DOCKER_BUILDKIT=1 docker build . --target vllm-openai --tag vllm/vllm-openai # optionally specify: --build-arg max_jobs=8 --build-arg nvcc_threads=2

 To run vLLM:

3 changes: 2 additions & 1 deletion setup.py
@@ -138,7 +138,8 @@ def get_torch_arch_list() -> Set[str]:

 # Use NVCC threads to parallelize the build.
 if nvcc_cuda_version >= Version("11.2"):
-    num_threads = min(os.cpu_count(), 8)
+    nvcc_threads = int(os.getenv("NVCC_THREADS", 8))
+    num_threads = min(os.cpu_count(), nvcc_threads)
     NVCC_FLAGS += ["--threads", str(num_threads)]

ext_modules = []
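
The thread-count selection in this hunk can be exercised without a CUDA toolchain. The following is a minimal sketch, not the code in `setup.py`. It assumes the intended behaviour is to read `NVCC_THREADS`, fall back to 8 when the variable is unset, and never exceed the number of CPUs. The helper name `resolve_nvcc_threads` and its `env`, `cpu_count`, and `default` parameters are hypothetical, introduced only to make the logic testable.

```python
import os
from typing import Mapping, Optional


def resolve_nvcc_threads(
    env: Optional[Mapping[str, str]] = None,
    cpu_count: Optional[int] = None,
    default: int = 8,
) -> int:
    """Hypothetical helper: pick the value passed to nvcc's --threads flag."""
    env = os.environ if env is None else env
    if cpu_count is None:
        # os.cpu_count() may return None; treat that as a single CPU.
        cpu_count = os.cpu_count() or 1
    # The fallback belongs to the lookup, not to int(): a second argument
    # to int() is a numeric base, so int(value, 8) would parse octal.
    requested = int(env.get("NVCC_THREADS", default))
    return min(cpu_count, requested)


print(resolve_nvcc_threads({}, cpu_count=16))                      # 8
print(resolve_nvcc_threads({"NVCC_THREADS": "2"}, cpu_count=16))   # 2
print(resolve_nvcc_threads({"NVCC_THREADS": "32"}, cpu_count=16))  # 16
```

Passing the environment and CPU count as parameters is a design choice for this sketch only. It lets the three cases (unset, below the CPU count, above the CPU count) be checked deterministically.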