Conversation
Pull Request Overview
This PR updates the GPU Docker configuration to install llmcompressor from the git main branch instead of a PyPI release to incorporate a critical bug fix for _update_mamba_mask that's not yet available in the published package.
- Adds `git` as a system dependency to support installing from GitHub repositories
- Changes the `llmcompressor` installation from a PyPI version constraint to a git repository installation
- Adds documentation explaining why the git installation is necessary
docker/Dockerfile.gpu (outdated)
```dockerfile
# Commit 4cfc0e6 (Oct 14, 2025) added _update_mamba_mask to tracing_ignore
# This fix is not yet in the latest PyPI release (0.8.1, Oct 8, 2025)
```
The date 'Oct 14, 2025' appears to be a typo. Since the current date is November 2025, this likely refers to October 2024. Please verify and correct the year.
```diff
-# Commit 4cfc0e6 (Oct 14, 2025) added _update_mamba_mask to tracing_ignore
-# This fix is not yet in the latest PyPI release (0.8.1, Oct 8, 2025)
+# Commit 4cfc0e6 (Oct 14, 2024) added _update_mamba_mask to tracing_ignore
+# This fix is not yet in the latest PyPI release (0.8.1, Oct 8, 2024)
```
docker/Dockerfile.gpu (outdated)
```dockerfile
# Commit 4cfc0e6 (Oct 14, 2025) added _update_mamba_mask to tracing_ignore
# This fix is not yet in the latest PyPI release (0.8.1, Oct 8, 2025)
```
The date 'Oct 8, 2025' appears to be a typo. This likely refers to October 2024. Please verify and correct the year.
```diff
-# Commit 4cfc0e6 (Oct 14, 2025) added _update_mamba_mask to tracing_ignore
-# This fix is not yet in the latest PyPI release (0.8.1, Oct 8, 2025)
+# Commit 4cfc0e6 (Oct 14, 2024) added _update_mamba_mask to tracing_ignore
+# This fix is not yet in the latest PyPI release (0.8.1, Oct 8, 2024)
```
docker/Dockerfile.gpu (outdated)
```dockerfile
# Install llmcompressor from git main branch to get the _update_mamba_mask fix
# Commit 4cfc0e6 (Oct 14, 2025) added _update_mamba_mask to tracing_ignore
# This fix is not yet in the latest PyPI release (0.8.1, Oct 8, 2025)
RUN pip install git+https://github.com/vllm-project/llm-compressor.git
```
Installing from the main branch without pinning to a specific commit or tag can lead to unpredictable builds and potential breaking changes. Consider pinning to the specific commit (4cfc0e6) mentioned in the comment: `RUN pip install git+https://github.com/vllm-project/llm-compressor.git@4cfc0e6`
```diff
-RUN pip install git+https://github.com/vllm-project/llm-compressor.git
+RUN pip install git+https://github.com/vllm-project/llm-compressor.git@4cfc0e6
```
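One way to apply this pinning advice while keeping the commit easy to bump is a build argument. This is a sketch only; the `LLMCOMPRESSOR_REF` argument name is an illustrative assumption, not part of the PR, and the default reuses the short SHA from the review (the full 40-character SHA would be a stronger pin):

```dockerfile
# Sketch (assumed ARG name): pin the VCS install via a build argument so
# the commit can be changed at build time without editing the Dockerfile.
ARG LLMCOMPRESSOR_REF=4cfc0e6
RUN pip install "git+https://github.com/vllm-project/llm-compressor.git@${LLMCOMPRESSOR_REF}"
```

Building with `docker build --build-arg LLMCOMPRESSOR_REF=<sha> ...` would then override the pin without a Dockerfile change.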
Automated review 🤖
- Summary of Changes
- Key Changes & Positives
- Potential Issues & Recommendations
- Language/Framework Checks
- Security & Privacy
- Build/CI & Ops
- Tests
- Approval Recommendation
…compressor

Root Cause (VERIFIED):
The error "torch.fx.proxy.TraceError: symbolically traced variables cannot be used as inputs to control flow" occurs because _update_mamba_mask() in GraniteMoeHybrid models contains control flow that cannot be traced by torch.fx.

Investigation Process:
1. Searched llmcompressor issue #1603 and PR #1599 for similar fixes
2. Found the DatasetArguments.tracing_ignore list in the llmcompressor source
3. Verified _update_mamba_mask was added in commit 4cfc0e6 (Oct 14, 2025)
4. Confirmed the latest PyPI release (0.8.1, Oct 8, 2025) predates the fix

The Fix (VERIFIED):
Install llmcompressor from the git main branch instead of PyPI to get commit 4cfc0e6, which adds "_update_mamba_mask" to the default tracing_ignore list in DatasetArguments.

Changes:
- Added git to system packages (required for pip git+https installs)
- Changed from: pip install "llmcompressor>=0.8.0"
- Changed to: pip install git+https://github.com/vllm-project/llm-compressor.git

This ensures the quantization engine skips tracing _update_mamba_mask during AWQ sequential tracing, preventing the TraceError.

Reference: vllm-project/llm-compressor@4cfc0e6
Force-pushed from 362cf8c to a39e51e.