rohan-uiuc commented Nov 12, 2025

PR Type

Feature

Short Description

Implements support for on-the-fly model weight downloads from HuggingFace when the local model weights directory doesn't exist. This lets users launch models without manually downloading and mounting weight directories.

The code now checks if the model weights directory exists before attempting to bind mount it. If the directory doesn't exist, it skips the bind mount and uses the model identifier from --model in vllm_args (or falls back to model_name). Users must pass the full HuggingFace model identifier (e.g., Qwen/Qwen2.5-7B-Instruct) via --model in vllm_args for automatic downloads to work.
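
For reviewers, the decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `resolve_model_source` is a hypothetical helper name, and it assumes `vllm_args` is a plain dict keyed by flag name.

```python
from pathlib import Path


def resolve_model_source(model_weights_dir, model_name, vllm_args):
    """Return (model_source, bind_paths) for a launch.

    Hypothetical helper for illustration only; it is not part of vec_inf.
    Assumes vllm_args is a dict such as {"--model": "Qwen/Qwen2.5-7B-Instruct"}.
    """
    weights = Path(model_weights_dir)
    if weights.is_dir():
        # Local weights exist: serve from the directory and bind mount it.
        return str(weights), [str(weights)]
    # No local weights: skip the bind mount and pass an identifier that
    # vLLM can resolve on HuggingFace, falling back to model_name.
    return vllm_args.get("--model", model_name), []


source, binds = resolve_model_source(
    "/no/such/weights/Qwen2.5-7B-Instruct",
    "Qwen2.5-7B-Instruct",
    {"--model": "Qwen/Qwen2.5-7B-Instruct"},
)
print(source, binds)
```

Note that the fallback to `model_name` only yields a downloadable identifier if `model_name` already is a full HuggingFace repo ID, which is why passing `--model` explicitly is required for automatic downloads.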

Fixes #166

Tests Added

  • test_generate_server_setup_singularity_no_weights: Verifies server setup doesn't include model weights path when directory doesn't exist
  • test_generate_launch_cmd_singularity_no_local_weights: Verifies launch command uses HF model identifier when local weights are missing
  • test_generate_model_launch_script_singularity_no_weights: Verifies batch mode correctly handles missing model weights
  • All existing tests pass (28 tests in test_slurm_script_generator.py, 116+ total tests)
  • Verified end-to-end: model downloads and serves successfully from HuggingFace when local weights don't exist and --model is specified in vllm_args

codecov-commenter commented Nov 12, 2025

Codecov Report

❌ Patch coverage is 90.47619% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.80%. Comparing base (d11de79) to head (c68cb35).
⚠️ Report is 6 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| vec_inf/client/_slurm_script_generator.py | 90.47% | 2 Missing ⚠️ |

@@            Coverage Diff             @@
##             main     #167      +/-   ##
==========================================
- Coverage   90.83%   90.80%   -0.04%     
==========================================
  Files          14       14              
  Lines        1342     1359      +17     
==========================================
+ Hits         1219     1234      +15     
- Misses        123      125       +2     

| Files with missing lines | Coverage Δ |
|---|---|
| vec_inf/client/_slurm_templates.py | 100.00% <ø> (ø) |
| vec_inf/client/_slurm_script_generator.py | 96.77% <90.47%> (-1.06%) ⬇️ |

