build: fixes to enable vLLM slim runtime image #1058


Merged
merged 29 commits into main from tusharma/slim-runtime-vllm-build on May 29, 2025

Conversation

@nv-tusharma (Contributor) commented May 13, 2025

Overview:

OPS-41: This PR provides a minimal vLLM Dynamo runtime image that contains all the dependencies needed to run the Dynamo CLI with the vLLM backend. This includes support for:

  • NATS & ETCD
  • NIXL with UCX plugin support
  • All Dynamo CLI commands (dynamo-run, dynamo serve, dynamo deploy, etc.)

The resulting runtime container is around 12.2 GB, versus approximately 39.5 GB for the current vLLM devel image. Once this PR is approved and merged, the next steps are:

  1. Enable this image in the nightly CI build
  2. Provide the vLLM image to the dynamo deploy team for their deployments

Details:

container/Dockerfile.vllm

  • Build the NIXL wheel in the wheel_builder stage with the UCX backend
  • Copy the NIXL and UCX artifacts into the runtime stage and install NIXL via uv pip (see the sketch after this list)
  • Copy the bindings from the ci_minimum image into the runtime image
  • Copy nats-server & etcd into the runtime stage
  • Install build-essential along with python3-dev, since these are indirect dependencies of vLLM required for disaggregated serving
  • Install the common container requirements.txt: https://github.com/ai-dynamo/dynamo/blob/main/container/deps/requirements.txt
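
A minimal sketch of the wheel-build/runtime split described above. Stage names other than wheel_builder, the base image name, and the nats-server/etcd source are assumptions drawn from this description and the review comments, not verbatim from container/Dockerfile.vllm:

# Sketch only; dynamo_base is a placeholder for the PR's actual base stage.
FROM dynamo_base AS wheel_builder
# Build the UCX-enabled NIXL wheel once, in the builder stage
RUN cd /opt/nixl && uv build . --out-dir /workspace/wheels/nixl

FROM dynamo_base AS runtime
# Copy only the prebuilt artifacts into the slim runtime stage
COPY --from=wheel_builder /workspace/wheels/nixl /tmp/wheels/nixl
COPY --from=wheel_builder /usr/local/ucx /usr/local/ucx
# Install NIXL from the wheel, then drop the temporary wheel directory
RUN uv pip install /tmp/wheels/nixl/*.whl && rm -rf /tmp/wheels
# nats-server and etcd binaries for the runtime (source stage assumed)
COPY --from=dynamo_base /usr/bin/nats-server /usr/bin/nats-server
COPY --from=dynamo_base /usr/local/bin/etcd /usr/local/bin/etcd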

Steps for testing

  1. ./container/build.sh --target runtime
  2. ./container/run.sh -it --image dynamo:latest-vllm-runtime
  3. Run the aggregated and disaggregated examples from this folder: https://github.com/ai-dynamo/dynamo/tree/main/examples/llm
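
To sanity-check the size figures from the overview, the built images can be compared locally (tag names assume the build script's defaults):

docker images --format '{{.Repository}}:{{.Tag}}  {{.Size}}' | grep dynamo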

Where should the reviewer start?

container/Dockerfile.vllm

Summary by CodeRabbit

  • Chores
    • Improved the Docker image build process by separating build and installation steps for Python modules.
    • Added necessary system dependencies and runtime components for better performance.
    • Streamlined Python environment setup and package installation for enhanced reliability and maintainability.

coderabbitai bot (Contributor) commented May 28, 2025

"""

Walkthrough

The Dockerfile for the vLLM container has been updated to separate the build and installation steps for the NIXL Python module using wheel artifacts. Additional system dependencies and runtime binaries are included, and the Python environment setup is streamlined by adjusting how dependencies and executables are installed and managed.

Changes

File(s): container/Dockerfile.vllm
Change summary: Refactored the NIXL build/install to build wheels first and then install them; added system dependencies; copied runtime binaries and libraries; streamlined the Python venv setup and package installation.

Poem

In Docker’s warren, wheels now spin,
NIXL builds tidy, let the fun begin!
With binaries and libs, dependencies align,
Python’s venv sparkles, everything’s fine.
A rabbit hops by, gives a wink and a cheer—
“Your containers are faster, more nimble this year!” 🐇
"""


📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9cdc46b and 5bd4957.

📒 Files selected for processing (1)
  • container/Dockerfile.vllm (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • container/Dockerfile.vllm
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Test - vllm


coderabbitai bot (Contributor) left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
container/Dockerfile.vllm (1)

491-494: Reevaluate build-essential and python3-dev in the runtime image
Pulling in full build toolchains inflates the runtime container by hundreds of megabytes. Since all C/C++ components (UCX, NIXL) are precompiled and shipped as wheels, these packages may no longer be needed. Consider removing them to further slim the image.

🧹 Nitpick comments (8)
container/Dockerfile.vllm (8)

158-161: Validate uv build command quoting and duplication
The multi-line uv build invocation for ARM64 uses nested double quotes and a trailing semicolon, which can be fragile in shell parsing. Consider unifying the two branches to avoid duplication and simplify quoting, for example:

RUN cd /opt/nixl && \
    uv build . --out-dir /workspace/wheels/nixl \
    $( [ "$ARCH" = "arm64" ] && echo "--config-settings='setup-args=-Dgds_path=/usr/local/cuda/targets/sbsa-linux'" )  

This removes the inner double quotes, eliminates duplicate commands, and makes the conditional flag injection clearer.


164-166: Consider relocating the NIXL wheel installation
The RUN uv pip install /workspace/wheels/nixl/*.whl step in the base image duplicates wheel installation logic between stages. Moving this into the wheel_builder stage (as noted by the TODO) will speed up the runtime build and tighten layer caching.


496-498: Prune unnecessary files from /opt/dynamo/bindings
The copy from ci_minimum brings in wheels, headers, and other artifacts under /opt/dynamo/bindings. At runtime you only need the C API shared libraries (.so). Excluding include directories and wheels will reduce image size.
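
A hedged sketch of the narrower copy; the exact layout of the ci_minimum artifacts isn't shown here, so the lib/ path below is an assumption:

# Copy only the C API shared libraries; leave wheels and headers behind
COPY --from=ci_minimum /opt/dynamo/bindings/lib/*.so /opt/dynamo/bindings/lib/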


504-507: Optimize UCX and NIXL artifact copy
Currently, you copy the entire source trees at /usr/local/ucx and /usr/local/nixl, including headers and docs. For runtime you only need the .so files under lib/* and plugin directories. Restricting the copy to libraries will significantly shrink the final image.
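
For instance, restricted to the library directories referenced in the LD_LIBRARY_PATH suggestion below (the source stage name is assumed, and ARG ARCH_ALT must be in scope):

# Copy runtime libraries and NIXL plugins only; skip headers and docs
COPY --from=wheel_builder /usr/local/ucx/lib /usr/local/ucx/lib
COPY --from=wheel_builder /usr/local/nixl/lib/${ARCH_ALT}-linux-gnu /usr/local/nixl/lib/${ARCH_ALT}-linux-gnu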


508-511: Ensure LD_LIBRARY_PATH doesn’t override critical paths
By redefining LD_LIBRARY_PATH you may inadvertently mask CUDA or system libraries. It’s safer to append your custom paths:

ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/nixl/lib/${ARCH_ALT}-linux-gnu:/usr/local/nixl/lib/${ARCH_ALT}-linux-gnu/plugins:/usr/local/ucx/lib

This preserves earlier defaults.


514-515: Remove redundant venv activation
You’ve already prepended the venv’s bin to PATH (line 489), so sourcing the activate script in .bashrc isn’t needed and could alter shell startup behavior.
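
A minimal sketch of relying on PATH alone (the venv location is an assumption):

# With the venv's bin directory on PATH, sourcing activate is unnecessary
ENV VIRTUAL_ENV=/opt/dynamo/venv
ENV PATH=$VIRTUAL_ENV/bin:$PATH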


518-520: Add --no-cache to pip install
To prevent uv from caching packages during the install and reduce layer size, include uv's --no-cache flag (the equivalent of pip's --no-cache-dir):

uv pip install --no-cache --requirement /tmp/requirements.txt

523-530: Verify wheelhouse installation and symlinks

  1. Confirm that the package name ai-dynamo[vllm] matches the wheel metadata (underscores vs. hyphens) to avoid install failures.
  2. Instead of symlinking all venv binaries into /usr/local/bin, consider targeting only the Dynamo CLI executables to prevent shadowing system tools.
ln -sf $VIRTUAL_ENV/bin/dynamo* /usr/local/bin/

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c12f61a and 77cc9ff.

📒 Files selected for processing (1)
  • container/Dockerfile.vllm (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Test - vllm
🔇 Additional comments (1)
container/Dockerfile.vllm (1)

499-503:

❓ Verification inconclusive

Verify nats-server and etcd binary dependencies
Copying these binaries directly may omit shared library dependencies required at runtime. Please run something like the script below to confirm there are no missing libraries in the slim image.


🏁 Script executed:

ldd /usr/bin/nats-server
ldd /usr/local/bin/etcd/etcd



Verify nats-server and etcd runtime dependencies inside the built image
The host sandbox can’t locate the binaries, so please run these commands in your Docker image to confirm no missing shared libraries:

docker run --rm -it <your-image> ldd /usr/bin/nats-server
docker run --rm -it <your-image> ldd /usr/local/bin/etcd/etcd

Ensure that no “not found” entries appear.

@nv-tusharma merged commit 93ca9df into main on May 29, 2025
10 checks passed
@nv-tusharma deleted the tusharma/slim-runtime-vllm-build branch on May 29, 2025 at 04:58