build: fixes to enable vLLM slim runtime image #1058
Conversation
This reverts commit 0fe5b94.
""" WalkthroughThe Dockerfile for the vLLM container has been updated to separate the build and installation steps for the NIXL Python module using wheel artifacts. Additional system dependencies and runtime binaries are included, and the Python environment setup is streamlined by adjusting how dependencies and executables are installed and managed. Changes
Actionable comments posted: 0
♻️ Duplicate comments (1)
container/Dockerfile.vllm (1)
491-494: Reevaluate `build-essential` and `python3-dev` in the runtime image
Pulling in full build toolchains inflates the runtime container by hundreds of megabytes. Since all C/C++ components (UCX, NIXL) are precompiled and shipped as wheels, these packages may no longer be needed. Consider removing them to further slim the image.
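One hedged way to confirm the toolchain is no longer needed: rebuild the runtime stage without the two packages and smoke-test the backend. The image tag below is a placeholder, not the PR's actual tag:

```sh
# Rebuild with build-essential/python3-dev removed, then (image tag assumed):
docker run --rm dynamo-vllm-runtime:slim \
  python3 -c "import vllm; print(vllm.__version__)"
# A clean import suggests the toolchain was only a build-time dependency.
```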
🧹 Nitpick comments (8)
container/Dockerfile.vllm (8)
158-161: Validate `uv build` command quoting and duplication

The multi-line `uv build` invocation for ARM64 uses nested double quotes and a trailing semicolon, which can be fragile in shell parsing. Consider unifying the two branches to avoid duplication and simplify quoting, for example:

```dockerfile
RUN cd /opt/nixl && \
    uv build . --out-dir /workspace/wheels/nixl \
    $( [ "$ARCH" = "arm64" ] && echo "--config-settings='setup-args=-Dgds_path=/usr/local/cuda/targets/sbsa-linux'" )
```

This removes the inner double quotes, eliminates the duplicated command, and makes the conditional flag injection clearer.
164-166: Consider relocating the NIXL wheel installation

The `RUN uv pip install /workspace/wheels/nixl/*.whl` step in the base image duplicates wheel installation logic between stages. Moving this into the `wheel_builder` stage (as noted by the TODO) will speed up the runtime build and tighten layer caching.
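A minimal sketch of that arrangement; the stage names and base images here are assumptions, not the PR's exact layout:

```dockerfile
# Build the NIXL wheel once, in its own stage.
FROM base AS wheel_builder
RUN cd /opt/nixl && \
    uv build . --out-dir /workspace/wheels/nixl

# The runtime stage consumes the prebuilt wheel instead of rebuilding it.
FROM runtime_base AS runtime
COPY --from=wheel_builder /workspace/wheels/nixl /workspace/wheels/nixl
RUN uv pip install /workspace/wheels/nixl/*.whl
```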
496-498: Prune unnecessary files from `/opt/dynamo/bindings`

The copy from `ci_minimum` brings in wheels, headers, and other artifacts under `/opt/dynamo/bindings`. At runtime you only need the C API shared libraries (`.so`). Excluding include directories and wheels will reduce image size; see the sketch below.
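A hedged example, assuming the shared libraries sit in a `lib/` subdirectory of the bindings tree:

```dockerfile
# Copy only the C API shared libraries; the lib/ layout is an assumption.
COPY --from=ci_minimum /opt/dynamo/bindings/lib/*.so /opt/dynamo/bindings/lib/
```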
504-507: Optimize UCX and NIXL artifact copy

Currently, you copy the entire source trees at `/usr/local/ucx` and `/usr/local/nixl`, including headers and docs. At runtime you only need the `.so` files under `lib/*` and the plugin directories. Restricting the copy to libraries will significantly shrink the final image; a sketch follows.
508-511: Ensure LD_LIBRARY_PATH doesn't override critical paths

By redefining `LD_LIBRARY_PATH` you may inadvertently mask CUDA or system libraries. It's safer to append your custom paths:

```dockerfile
ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/nixl/lib/${ARCH_ALT}-linux-gnu:/usr/local/nixl/lib/${ARCH_ALT}-linux-gnu/plugins:/usr/local/ucx/lib
```

This preserves the earlier defaults.
514-515: Remove redundant venv activation

You've already prepended the venv's `bin` to `PATH` (line 489), so sourcing the activate script in `.bashrc` isn't needed and could alter shell startup behavior.
518-520: Add `--no-cache-dir` to `pip install`

To prevent pip from caching packages in the venv and to reduce layer size, include the `--no-cache-dir` flag:

```sh
uv pip install --no-cache-dir --requirement /tmp/requirements.txt
```
523-530: Verify wheelhouse installation and symlinks

- Confirm that the package name `ai-dynamo[vllm]` matches the wheel metadata (underscores vs. hyphens) to avoid install failures.
- Instead of symlinking all venv binaries into `/usr/local/bin`, consider targeting only the Dynamo CLI executables to prevent shadowing system tools:

```sh
ln -sf $VIRTUAL_ENV/bin/dynamo* /usr/local/bin/
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
container/Dockerfile.vllm (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Build and Test - vllm
🔇 Additional comments (1)
container/Dockerfile.vllm (1)
499-503: ❓ Verification inconclusive

Verify nats-server and etcd binary dependencies

Copying these binaries directly may omit shared library dependencies required at runtime. Please run something like the following to confirm no libraries are missing in the slim image:
🏁 Script executed:
```sh
ldd /usr/bin/nats-server
ldd /usr/local/bin/etcd/etcd
```

Length of output: 167
Verify nats-server and etcd runtime dependencies inside the built image
The host sandbox can't locate the binaries, so please run these commands in your Docker image to confirm no missing shared libraries:

```sh
docker run --rm -it <your-image> ldd /usr/bin/nats-server
docker run --rm -it <your-image> ldd /usr/local/bin/etcd/etcd
```

Ensure that no "not found" entries appear.
Overview:
OPS-41: This PR provides a minimal vLLM Dynamo runtime image which contains all the necessary dependencies to run the Dynamo CLI with the vLLM backend. This includes support for:
The resulting runtime container is around 12.2 GB, compared to approximately 39.5 GB for the current vLLM devel image. Once this PR is approved and merged, the next steps would be:
Details:
container/Dockerfile.vllm
- Added a `wheelbuilder` stage with the UCX backend
- Copied `nixl` and `ucx` artifacts into the runtime stage and installed NIXL via `uv pip`
- Installed `build-essential` along with `python3-dev`, since this is an indirect dependency of vLLM required for disaggregated serving (a condensed sketch of the resulting stage follows)
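The sketch below pulls the listed changes together; the stage names, base image, and paths are assumptions drawn from this review, not the Dockerfile verbatim:

```dockerfile
# Condensed sketch of the slim runtime stage (names and paths assumed).
ARG RUNTIME_BASE
FROM ${RUNTIME_BASE} AS runtime

# Indirect vLLM dependency needed for disaggregated serving.
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential python3-dev && \
    rm -rf /var/lib/apt/lists/*

# Prebuilt UCX/NIXL artifacts from the wheel-builder stage.
COPY --from=wheelbuilder /usr/local/ucx /usr/local/ucx
COPY --from=wheelbuilder /usr/local/nixl /usr/local/nixl
COPY --from=wheelbuilder /workspace/wheels/nixl /workspace/wheels/nixl

# Install the NIXL wheel into the runtime environment via uv.
RUN uv pip install /workspace/wheels/nixl/*.whl
```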
Steps for testing

Where should the reviewer start?
container/Dockerfile.vllm