Conversation
Code Review
This pull request correctly upgrades vLLM to version 0.17.0 and updates its dependencies accordingly. The code changes are consistent with this upgrade. However, it seems a local configuration for a PyPI index has been accidentally included in many of the dependency lock files. This should be removed to avoid breaking builds.
Force-pushed from fb6c09d to 56b154c.
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Force-pushed from 56b154c to 2b1741a.
```diff
-opentelemetry-proto==1.39.0 \
-    --hash=sha256:1e086552ac79acb501485ff0ce75533f70f3382d43d0a30728eeee594f7bf818 \
-    --hash=sha256:c1fa48678ad1a1624258698e59be73f990b7fc1f39e73e16a9d08eef65dd838c
+opentelemetry-proto==1.34.1 \
```
I think this is compiled from https://github.com/ray-project/ray/blob/15a473454084a739264ce66290d7d4fc1b3926b4/python/requirements/serve/tracing-reqs.txt, and all opentelemetry libraries should have the same version.
Hmm... that does not really make sense to me.
- How does `tracing-reqs.txt` get pulled in with this change?
- Should that be upgraded to 1.39.0 for consistency?

I think some additional dependency of vLLM 0.17 is pulling the version down.
@elliot-barn could you help investigate? For example, what will happen if we enforce `opentelemetry-proto>=1.39.0` as a constraint?
- opentelemetry-proto 1.40.0 depends on protobuf>=5.0,<7.0
- opentelemetry-proto 1.39.1 depends on protobuf>=5.0,<7.0
- opentelemetry-proto 1.39.0 depends on protobuf>=5.0,<7.0

We are still on protobuf 4.25.8, and the py313 dependency upgrade initiative will bring us to 5.29.6. In other words, with protobuf pinned at 4.25.8, the resolver cannot select opentelemetry-proto 1.39.0 or newer.
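The version-consistency expectation above can be checked mechanically. Here is a hypothetical helper (the function name and the sample lock text are invented for illustration, not Ray tooling) that collects the pinned version of every opentelemetry-* package from a pip-compile style lock file:

```python
import re

def opentelemetry_versions(lock_text: str) -> dict[str, str]:
    """Map each pinned opentelemetry-* package in a pip-compile style
    lock file ('name==version \\' lines) to its pinned version."""
    versions = {}
    for line in lock_text.splitlines():
        m = re.match(r"(opentelemetry-[a-z0-9-]+)==([0-9.]+)", line.strip())
        if m:
            versions[m.group(1)] = m.group(2)
    return versions

# Invented sample reproducing the mismatch discussed in this thread.
sample = """\
opentelemetry-api==1.39.0 \\
    --hash=sha256:0000000000000000000000000000000000000000000000000000000000000000
opentelemetry-proto==1.34.1 \\
    --hash=sha256:1111111111111111111111111111111111111111111111111111111111111111
"""
pins = opentelemetry_versions(sample)
# More than one distinct version means the packages have drifted apart.
assert len(set(pins.values())) > 1
```

A check like this could run over the regenerated lock files to flag any opentelemetry-* divergence before merge.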
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
aslonnie left a comment
I would like to understand more on why opentelemetry-proto needs to be downgraded.
```diff
-    --hash=sha256:c1fa48678ad1a1624258698e59be73f990b7fc1f39e73e16a9d08eef65dd838c
+opentelemetry-proto==1.34.1 \
+    --hash=sha256:16286214e405c211fc774187f3e4bbb1351290b8dfb88e8948af209ce85b719e \
+    --hash=sha256:eb4bb5ac27f2562df2d6857fc557b3a481b5e298bc04f94cc68041f00cebcbd2
```
Unintended opentelemetry-proto downgrade across non-LLM lock files
Medium Severity
opentelemetry-proto is downgraded from 1.39.0 to 1.34.1 not only in LLM lock files but also in unrelated base, slim, and serve lock files that have no dependency on vLLM. As the PR reviewers noted, a vLLM upgrade should not pull down opentelemetry-proto in non-LLM environments. This broad, unexplained regression may indicate a dependency resolution issue — possibly a transitive constraint leaking through shared constraint files — and risks incompatibilities with components expecting the newer proto version.
Additional Locations (2)
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
```yaml
- --python-version=${PYTHON_VERSION_STR}
- --unsafe-package ray
- --python-platform=x86_64-manylinux_2_31
- --index https://download.pytorch.org/whl/${CUDA_CODE}
```
PyTorch CUDA index removed from LLM build configs
High Severity
The --index https://download.pytorch.org/whl/${CUDA_CODE} flag was removed from LLM depset configs that build for CUDA targets (cu128, cu130_py312). Other GPU configs in the same repo (ray_base_extra_testdeps_gpu, ray_ml_base_extra_testdeps_cuda, release_multimodal_inference_benchmarks_tests) still use --index https://download.pytorch.org/whl/cu128. vLLM 0.17.0 docs still recommend --extra-index-url for the PyTorch CUDA wheel index. Without this index, the uv resolver may not find CUDA-specific PyTorch wheels for cu128/cu130 builds, potentially resolving CPU-only PyTorch when lock files are regenerated.
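This class of drift could be caught mechanically by scanning each depset's flags for the PyTorch wheel index. The sketch below uses an invented dict layout as a simplified stand-in for the real rayllm.depsets.yaml structure (the function name, depset names, and flags are illustrative):

```python
def cuda_depsets_missing_index(depsets):
    """Return names of CUDA-targeted depsets whose flags do not include
    the PyTorch CUDA wheel index. `depsets` maps depset name to a list
    of flag strings (a simplified stand-in for the real YAML config)."""
    missing = []
    for name, flags in depsets.items():
        targets_cuda = any(tag in name for tag in ("cu128", "cu130", "cuda"))
        has_index = any("download.pytorch.org/whl" in flag for flag in flags)
        if targets_cuda and not has_index:
            missing.append(name)
    return missing

depsets = {
    # Illustrative: the CUDA index flag was dropped from this one...
    "rayllm_py312_cu128": ["--python-platform=x86_64-manylinux_2_31"],
    # ...but retained here, mirroring the inconsistency flagged above.
    "ray_ml_base_extra_testdeps_cuda": [
        "--index https://download.pytorch.org/whl/cu128",
    ],
}
print(cuda_depsets_missing_index(depsets))  # ['rayllm_py312_cu128']
```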
Additional Locations (2)
1. Restore `--index https://download.pytorch.org/whl/${CUDA_CODE}` in rayllm.depsets.yaml. This was accidentally dropped, causing cu128 lockfiles to resolve torchaudio from PyPI instead of the CUDA index.
2. Add `numexpr>=2.10` to llm-test-requirements.txt. The CI base Docker image has numexpr compiled against NumPy 1.x, but the lockfile installs NumPy 2.x, causing a binary incompatibility crash. Including numexpr in the lockfile ensures a compatible version overwrites the base image's broken one.
3. Regenerate all 16 LLM lockfiles.

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Master also lacks numexpr in lockfiles with the same NumPy 2.2.6. The CPU test numexpr failure is a base image issue, not caused by this branch.

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
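The `numexpr>=2.10` requirement could also be verified against a regenerated lock file. A hypothetical helper (names invented, not Ray tooling):

```python
import re

def lock_satisfies(lock_text: str, package: str, minimum: str) -> bool:
    """Check that a lock file pins `package` at a version >= `minimum`.
    Versions are compared as integer tuples, so '2.11.0' >= '2.10'."""
    m = re.search(rf"^{re.escape(package)}==([0-9.]+)", lock_text, re.MULTILINE)
    if not m:
        return False  # package not pinned at all
    pinned = tuple(int(part) for part in m.group(1).split("."))
    floor = tuple(int(part) for part in minimum.split("."))
    return pinned >= floor

# Invented lock contents for illustration.
lock = "numpy==2.2.6\nnumexpr==2.11.0\n"
assert lock_satisfies(lock, "numexpr", "2.10")   # pin present and new enough
assert not lock_satisfies(lock, "scipy", "1.0")  # missing package fails
```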
Force-pushed from 518bd4c to d9d7a78.
docker/ray-llm/Dockerfile (Outdated)
```diff
 uv pip install --system --no-cache-dir --no-deps \
-    --index-strategy unsafe-best-match \
+    --index-strategy first-index \
```
Index strategy inconsistency may break Docker image builds
Medium Severity
The --index-strategy was changed from unsafe-best-match to first-index, inconsistent with every other Dockerfile in the project (base-deps, base-extra, base-slim), which all use unsafe-best-match with an explicit comment explaining its necessity. The lock file rayllm_py311_cu128.lock is compiled using the PyTorch index (via rayllm.depsets.yaml which still has --index https://download.pytorch.org/whl/${CUDA_CODE}), so hashes may originate from that index. With first-index, uv stops at PyPI for packages like torch==2.10.0 and torchvision==0.25.0 and never checks the PyTorch index — if any hashes don't match PyPI's wheels, the build will fail.
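The difference between the two strategies can be illustrated with a toy model. This is not uv's actual resolver; the function, the index dicts, and the use of a lexicographic max as "best" are all invented stand-ins:

```python
def resolve(package, indexes, strategy):
    """Toy model of uv index strategies. `indexes` is an ordered list of
    {package: [candidate versions]} dicts, first index = highest priority."""
    if strategy == "first-index":
        # Stop at the first index that carries the package at all.
        for index in indexes:
            if package in index:
                return max(index[package])
        return None
    if strategy == "unsafe-best-match":
        # Consider candidates from every index before choosing.
        candidates = [v for index in indexes for v in index.get(package, [])]
        return max(candidates) if candidates else None
    raise ValueError(f"unknown strategy: {strategy}")

pypi = {"torch": ["2.10.0+cpu"]}             # CPU-only build on PyPI
pytorch_index = {"torch": ["2.10.0+cu128"]}  # CUDA build on the extra index

# first-index never reaches the PyTorch index once PyPI carries torch:
assert resolve("torch", [pypi, pytorch_index], "first-index") == "2.10.0+cpu"
# unsafe-best-match sees both indexes and can select the CUDA build:
assert resolve("torch", [pypi, pytorch_index], "unsafe-best-match") == "2.10.0+cu128"
```

The same stop-at-first-hit behavior is why, as noted above, first-index can also cause a hash mismatch: the wheel uv finds on PyPI need not be the one whose hash was recorded from the PyTorch index.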
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
```yaml
  - ray_img_depset_${PYTHON_SHORT}
output: python/deplocks/base_extra_testdeps/ray-llm-base_extra_testdeps_py${PYTHON_VERSION}.lock
append_flags:
  - --index https://download.pytorch.org/whl/${CUDA_CODE}
```
PyTorch CUDA index removed inconsistently from CUDA depsets
Medium Severity
The --index https://download.pytorch.org/whl/${CUDA_CODE} flag was removed from the ray_base_extra_testdeps_llm_cuda depset in rayimg.depsets.yaml and from the common settings in llm_release_tests.depsets.yaml, but similar CUDA depsets like ray_base_extra_testdeps_gpu and ray_ml_base_extra_testdeps_cuda in the same file still retain it. The LLM CUDA depset expands dependencies that include sentence-transformers, which depends on PyTorch — without the CUDA wheel index, the resolver may pull CPU-only PyTorch wheels for what's explicitly a CUDA build.
Additional Locations (1)
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Force-pushed from 68b5b08 to 9a06636.
Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
```diff
 # exact versions, so integrity is maintained through version pinning.
 uv pip install --system --no-cache-dir --no-deps \
     --index-strategy unsafe-best-match \
+    --no-verify-hashes \
```
Disabling hash verification weakens supply chain security
Medium Severity
Adding --no-verify-hashes disables integrity checking for all packages installed from the lock file. The lock files still contain hashes, but they are completely ignored during installation. This means a compromised or tampered package on the CUDA index (or any alternate index used via unsafe-best-match) could be installed without detection. While version pinning provides some defense, hash verification is the primary protection against supply chain attacks where an index serves a modified binary for a pinned version. A more targeted fix — such as regenerating hashes from the actual CUDA index, or excluding only the mismatched packages — would preserve integrity checking for the majority of dependencies.
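What `--no-verify-hashes` turns off is, in essence, the following check. A minimal sketch using hashlib; the function and variable names are invented for illustration:

```python
import hashlib

def verify_artifact(data: bytes, pinned_hashes: list[str]) -> bool:
    """The integrity check that --no-verify-hashes disables, in miniature:
    a downloaded artifact must match one of the sha256 pins recorded in
    the lock file."""
    digest = "sha256:" + hashlib.sha256(data).hexdigest()
    return digest in pinned_hashes

wheel = b"pretend wheel contents"
pins = ["sha256:" + hashlib.sha256(wheel).hexdigest()]

assert verify_artifact(wheel, pins)              # untampered: accepted
assert not verify_artifact(b"tampered!", pins)   # modified binary: rejected
```

With verification disabled, the second case installs silently, which is exactly the supply-chain exposure described above.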

