Add config flag for VLLM_DISABLE_COMPILE_CACHE #30108
Signed-off-by: Elizabeth Thomas <email2eliza@gmail.com>
Code Review
This pull request introduces a new configuration flag, disable_compile_cache, to control the vLLM compile cache, with the environment variable VLLM_DISABLE_COMPILE_CACHE taking precedence. The changes are well-implemented across the configuration, backend, and interface files. The documentation has also been updated accordingly. My main feedback is to refactor the newly added tests for better readability and maintainability.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Documentation preview: https://vllm--30108.org.readthedocs.build/en/30108/
Hi @elizabetht, the pre-commit checks have failed. Please run:
uv pip install pre-commit
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch. For future commits,
This pull request has merge conflicts that must be resolved before it can be
yewentao256
left a comment
Thanks for the work! But I don't think this flag is needed: we already have the env variable, and adding a flag on top of it increases complexity.
@yewentao256 issue #29917 asks for a config flag for this env variable. @ProExpertProg @zou3519 could you weigh in on this?
Yeah, this flag is needed so that we can see whether the cache is enabled or disabled when printing the config. The env var should stay around so that the cache can be disabled without modifying code.
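For illustration only, the behavior described in this comment could be sketched as follows. `SketchCompilationConfig` and `apply_env_override` are hypothetical names, not vLLM's actual classes or functions, and the parsing rule (a set env var always wins, with "1" meaning disabled) is an assumption rather than what this PR implements.

```python
import os
from dataclasses import dataclass, replace
from typing import Mapping, Optional

ENV_VAR = "VLLM_DISABLE_COMPILE_CACHE"  # env var named in this PR


@dataclass(frozen=True)
class SketchCompilationConfig:
    """Hypothetical stand-in; not vLLM's actual CompilationConfig."""

    disable_compile_cache: bool = False


def apply_env_override(
    config: SketchCompilationConfig,
    environ: Optional[Mapping[str, str]] = None,
) -> SketchCompilationConfig:
    """Return a config whose flag reflects the env var when it is set."""
    environ = os.environ if environ is None else environ
    raw = environ.get(ENV_VAR)
    if raw is None:
        # Env var unset: the config flag stands as given.
        return config
    # Assumption: "1" disables the cache, any other value leaves it enabled.
    return replace(config, disable_compile_cache=(raw.strip() == "1"))


# The effective setting is visible when the config is printed.
print(apply_env_override(SketchCompilationConfig(), {ENV_VAR: "1"}))
# prints: SketchCompilationConfig(disable_compile_cache=True)
```

Folding the env var into the config object once, up front, means every later reader of the config (including anything that logs it) sees the effective value instead of re-reading the environment.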
ProExpertProg
left a comment
Please just address the comments
# Build a config dict for cache checking that includes the disable flag
cache_check_config = self.inductor_config.copy()
cache_check_config["vllm_disable_compile_cache"] = (
Instead of adding the key to inductor config, let's just add a parameter
# Remove vllm-specific keys that are not valid inductor config options
# before passing to standalone_compile. These keys are used internally
# by vLLM for cache control but would cause AttributeError in torch._inductor.
current_config.pop("vllm_disable_compile_cache", None)
Let's just add a separate param. Could also pass the whole CompilationConfig around
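A rough sketch of the "separate param" suggestion, under stated assumptions: `compile_with_cache`, `build`, and the in-memory `_CACHE` are hypothetical stand-ins for illustration and do not reflect vLLM's compiler interface or its on-disk cache. The point is only that the backend config dict is forwarded untouched, so no vLLM-specific key has to be added and later popped.

```python
from typing import Any, Callable, Dict

# Hypothetical in-memory cache keyed by graph name (illustration only).
_CACHE: Dict[str, Callable[..., Any]] = {}


def compile_with_cache(
    name: str,
    build: Callable[[Dict[str, Any]], Callable[..., Any]],
    inductor_config: Dict[str, Any],
    *,
    disable_cache: bool = False,
) -> Callable[..., Any]:
    """Compile via `build`, consulting the cache unless it is disabled.

    `build` stands in for the real backend compile call. It receives a copy
    of the backend config with no extra keys injected.
    """
    if disable_cache:
        # Bypass the cache entirely: neither read from it nor write to it.
        return build(dict(inductor_config))
    if name not in _CACHE:
        _CACHE[name] = build(dict(inductor_config))
    return _CACHE[name]
```

Because cache control travels as its own keyword argument, the backend never sees an unknown option, which is the failure mode the `pop` call in the snippet above was working around.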
not always correct. You can disable it via setting `VLLM_DISABLE_COMPILE_CACHE=1`.
not always correct. You can disable it by either:
- Setting the config flag: `--compilation-config '{"disable_compile_cache": true}'`
Btw, positive flags are generally better than negative flags. Something like enable_compile_cache. Because double negation gets confusing.
I know the envvar is already negative, but we should (in the future) add a positive envvar and then deprecate the negative envvar. In that sense, I'd prefer that we have the config flag be positive.
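As a sketch of how that migration might look (not something this PR implements): the config flag is positive, a positive env var is consulted first, and the existing negative env var is still honored with a deprecation warning. `VLLM_ENABLE_COMPILE_CACHE` and `resolve_enable_compile_cache` are hypothetical names invented for this example, and the "1"/"0" parsing is an assumption.

```python
import os
import warnings
from typing import Mapping, Optional

POSITIVE_ENV = "VLLM_ENABLE_COMPILE_CACHE"   # hypothetical; does not exist today
NEGATIVE_ENV = "VLLM_DISABLE_COMPILE_CACHE"  # existing env var named in this PR


def resolve_enable_compile_cache(
    enable_compile_cache: bool = True,
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Resolve a positive flag; env vars override the config value."""
    environ = os.environ if environ is None else environ
    positive = environ.get(POSITIVE_ENV)
    if positive is not None:
        return positive.strip() == "1"
    negative = environ.get(NEGATIVE_ENV)
    if negative is not None:
        warnings.warn(
            f"{NEGATIVE_ENV} is deprecated in this sketch; use {POSITIVE_ENV}.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Invert exactly once here, so callers never reason about a
        # double negative.
        return negative.strip() != "1"
    return enable_compile_cache
```

Call sites then read `if enable_compile_cache:` rather than `if not disable_compile_cache:`, which is the readability gain the comment is after.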
Purpose
Add a config flag for VLLM_DISABLE_COMPILE_CACHE. If the env var is set, it overrides the config flag.
Test Plan
Test Result