
Pre-download ML models for multi-architecture Docker builds #285

Merged
aponcedeleonch merged 5 commits into main from feature/pre-download-models
Jan 23, 2026

Conversation

@aponcedeleonch
Member

  • Add scripts/download_models.py to download architecture-independent ONNX models
    (FastEmbed, tiktoken, LLMLingua) before Docker build
  • Remove model-downloader stage from Dockerfile, copy pre-downloaded models from
    build context instead
  • Add download-models Taskfile target and make build/build-local depend on it
  • Create reusable download-models.yml GitHub workflow that uploads models as artifacts
  • Update image-build.yml, release.yml, integration-tests.yml, offline-tests.yml
    to download models artifact before building images
  • Add optimum[onnxruntime] to dev dependencies for LLMLingua ONNX export
  • Add models/ to .gitignore

aponcedeleonch and others added 3 commits January 23, 2026 17:12
- Add llmlingua_model_path field to MCPOptimizerConfig
- Add _get_default_model_paths() to calculate paths relative to package root
- Add _setup_model_paths() to set defaults only if models directory exists
- Update load_config() to call _setup_model_paths()
- Add LLMLINGUA_MODEL_PATH to env_mappings

Config now automatically uses models/ directory when models are pre-downloaded.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@claude

claude bot commented Jan 23, 2026

PR Review

This PR successfully refactors the Docker build to pre-download ML models before building, enabling multi-architecture support. The implementation is solid with good structure and documentation.

✅ Strengths

  • Well-structured script: download_models.py follows project conventions with proper argument parsing, error handling, and helpful output
  • Good separation of concerns: Download logic cleanly separated from Dockerfile build
  • Reusable workflow: The GitHub Actions workflow for model downloading is well-designed
  • Appropriate caching: Models only downloaded once and reused across builds

🔍 Issues Found

1. Inconsistent dependency group usage (scripts/download_models.py:88)

  • Error message says uv sync --dev but pyproject.toml defines offline-models group
  • Should be: uv sync --group offline-models (matches line 145)
  • Also affects lines 108, 145

2. Redundant logging in config.py (_setup_model_paths)

  • Debug logs at lines 518, 523, 528 provide minimal value
  • The existence check already validates the path
  • Consider removing or only logging when paths are set

3. Error handling could be more specific (download_models.py:115-149)

  • FileNotFoundError catch at line 144 is misleading - subprocess.run() doesn't raise this for missing modules
  • Consider checking for optimum availability before running subprocess
  • The check=False + manual returncode check could be simplified

4. Minor: Variable naming inconsistency

  • export_llmlingua_model() vs download_fastembed_model() - mixing "export" and "download" terminology
  • Consider download_and_export_llmlingua_model() for clarity

💡 Suggestions

Performance: The script runs sequentially. Since downloads are independent, consider parallel execution using concurrent.futures for faster CI runs.
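
A thread pool fits here because the downloads are independent and I/O-bound; a sketch (the downloader callables are stand-ins, not the script's real functions):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all_parallel(downloaders):
    """Run independent download callables concurrently.

    `downloaders` maps a model name to a zero-argument callable.
    Returns (results, errors) so one failed download does not hide
    the others.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=len(downloaders) or 1) as pool:
        futures = {pool.submit(fn): name for name, fn in downloaders.items()}
        for fut in as_completed(futures):
            name = futures[fut]
            try:
                results[name] = fut.result()
            except Exception as exc:  # collect per-model failures
                errors[name] = exc
    return results, errors
```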

Code clarity (download_models.py:59-60):

```python
files = list(cache_path.iterdir())
return len(files) > 0
```

Can be simplified to: `return any(cache_path.iterdir())`

Security & Performance

  • ✅ No security concerns
  • ✅ Build performance improved by removing model downloads from Docker
  • ✅ No breaking changes detected
  • ✅ Proper use of architecture-independent ONNX models

Minor Style Notes

  • Test file formatting changes (test_mcp_client.py) look like automated formatter output - good
  • Dockerfile comments are clear and helpful

Overall excellent work! The main issue is the incorrect error message. Other suggestions are optional improvements.

- Remove misleading FileNotFoundError catch, add _check_optimum_available()
  to verify module availability before running subprocess
- Rename export_llmlingua_model() to download_and_export_llmlingua_model()
  for clearer terminology
- Use subprocess check=True and catch CalledProcessError for cleaner error handling
- Remove redundant ImportError catches (fastembed/tiktoken imported at module level)
- Remove verbose debug logs from _setup_model_paths() in config.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
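
The `check=True` pattern from this commit looks roughly like the following (a sketch; the actual export command and messages are elided):

```python
import subprocess
import sys


def run_export(cmd: list[str]) -> bool:
    """Run an export subprocess, failing loudly on a non-zero exit.

    With check=True, subprocess.run raises CalledProcessError instead
    of requiring a manual returncode check.
    """
    try:
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError as exc:
        print(f"export failed with exit code {exc.returncode}", file=sys.stderr)
        return False
    return True
```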
therealnb previously approved these changes Jan 23, 2026
Contributor

@therealnb left a comment


I like it

The download-models workflow was being called multiple times (from
image-build, integration-tests, offline-tests, mcp-tef-integration-tests),
causing artifact name conflicts (409 Conflict error).

Changes:
- Move download-models call to code-checks.yml (runs once at the start)
- Remove download-models job from individual test workflows
- All workflows now download the artifact named 'ml-models' directly
- Add GitHub cache for models to speed up subsequent runs
- release.yml now depends on code_checks for the artifact

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
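
The caching step described in this commit might use `actions/cache` along these lines (a sketch; the path and cache key are assumptions, not the merged workflow):

```yaml
# Sketch: cache pre-downloaded models between workflow runs.
- name: Cache ML models
  uses: actions/cache@v4
  with:
    path: models/
    key: ml-models-${{ hashFiles('scripts/download_models.py') }}

- name: Download models on cache miss
  run: uv run python scripts/download_models.py
```

Keying on the download script's hash invalidates the cache whenever the model list changes.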
@aponcedeleonch aponcedeleonch merged commit 25d38cb into main Jan 23, 2026
7 checks passed
@aponcedeleonch aponcedeleonch deleted the feature/pre-download-models branch January 23, 2026 16:51