[V0][V1][Core] Add outlines integration for V1, and update V0 integration. #15975
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a small and essential subset of CI tests runs to quickly catch errors. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
NOTE: Can't be merged until the next version of outlines_core is released.

Thank you for the PR! I will review it this week.
I reviewed the V0 code path. One ask is to add tests for the case where the cache is disabled.
And we should update requirements/common.txt to the lowest supported version of outlines-core.
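For reference, a requirements entry of roughly this shape would express that floor; the version below is a placeholder, not the actual minimum, which would be whichever outlines-core release first ships the APIs this PR depends on:

```text
# requirements/common.txt (placeholder version shown for illustration only)
outlines_core >= X.Y.Z
```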
This pull request has merge conflicts that must be resolved before it can be merged.
First round of review. A few things need to be addressed here, but great progress so far.
This pull request has merge conflicts that must be resolved before it can be merged.
reasoner=None,
vocab_size=32000)
Can you explain why all of the changes were needed to this file? I don't think it has changed in a while, so I'm surprised to see changes needed for the V0 code, which we're trying to change as little as possible.
The old integration would silently fail in this test when an arbitrary sequence of token ids was passed in, since it would just default to the initial state of the DFA when the sequence's prior state could not be found in the `self._fsm_state` defaultdict. In contrast, the new version, which uses the `Guide` object, errors in this case instead of defaulting, so we need to make sure we are passing in sequences whose sequential dependency is satisfied (the `Guide` has seen all tokens in the sequence prior to the most recent token).
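As a minimal sketch of that difference: the `defaultdict` portion below mirrors the old behaviour described above and runs as-is, while the `Guide` portion is only sketched in comments with an assumed method name, since it depends on the unreleased `outlines_core` API.

```python
from collections import defaultdict

# Old V0 behaviour: per-sequence FSM states live in a defaultdict(int), so
# looking up a prefix the processor has never advanced silently returns 0
# (the DFA's initial state) instead of raising.
fsm_state = defaultdict(int)
arbitrary_token_ids = [17, 42, 99]
prior_state = fsm_state[hash(tuple(arbitrary_token_ids[:-1]))]
print(prior_state)  # -> 0, even though this prefix was never fed to the FSM

# New behaviour (sketch): the Guide object tracks its own state, so tokens
# must be fed strictly in order; an unseen or out-of-order prefix errors
# instead of silently restarting from the initial state:
#
#     guide = Guide(index)
#     for token_id in arbitrary_token_ids:
#         guide.advance(token_id)  # assumed method name; raises on invalid input
```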
CI also needs to pass here. If you think any of the failures aren't the fault of your PR, I have some notes here on what to do next: #18782
Signed-off-by: Nathan Hoos <thwackyy.y@gmail.com>
@russellb the two failing tests seem to be unrelated. Both concern parts of the codebase which have not been modified, and they only started failing after I updated the base branch of the PR, leading me to believe it is an upstream issue. Further, on this CI run from last night, roughly when I updated the base branch, the same tests are failing.
Thanks for the updates. Can you add outlines testing to the V1 tests here?
Signed-off-by: Nathan Hoos <thwackyy.y@gmail.com>
@russellb sorry about the hold-up, I had to go to Mexico for a family situation last week. V1 tests are added and passing. The failing speculative decoding and quantization tests are currently failing on main according to this CI run. The LoRA test that fails seems to be unrelated to anything modified here.
Thanks for all the hard work and perseverance! The CI failure is not relevant and will be fixed separately.

But I will still wait for @russellb to take a look once he's back.
Adds outlines as a guided decoding backend for V1, and updates the integration for V0.

The aim of this is threefold:

- Add an outlines integration for V1.
- Drop the dependency on `outlines`, and only use `outlines_core`.
- Use the new `write_mask_into` method on `Guide` to write a bitmask in-place for use in logits masking.

Because the dependency on `outlines` will be removed, support for grammar-based decoding with the outlines backend will also be removed (the CFG classes reside in the `outlines` package).

cc @aarnphm
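To make the bitmask point concrete, here is a minimal sketch of how an in-place bitmask could be produced and applied to logits. The `Vocabulary`/`Index`/`Guide` constructors and the `write_mask_into` call shape are assumptions based on the `outlines_core` API this PR targets, not the final vLLM integration; the unpacking step is purely illustrative.

```python
# Sketch only: assumed outlines_core API, not the final vLLM integration.
import torch
from outlines_core import Guide, Index, Vocabulary

vocab = Vocabulary.from_pretrained("gpt2")  # assumed constructor
index = Index(r"[0-9]{3}", vocab)           # compile a regex into an index
guide = Guide(index)

vocab_size = 50257  # e.g. GPT-2; in vLLM this would be the padded vocab size
# One bit per token id, packed into 32-bit words.
bitmask = torch.zeros((vocab_size + 31) // 32, dtype=torch.int32)
# Assumed signature: write the allowed-token bitmask directly into our buffer.
guide.write_mask_into(bitmask.data_ptr(), bitmask.numel(), bitmask.element_size())

# Expand the packed bitmask into a boolean mask and apply it to the logits:
# tokens whose bit is 0 are disallowed at this decoding step.
logits = torch.randn(vocab_size)
bits = ((bitmask.unsqueeze(-1) >> torch.arange(32)) & 1).to(torch.bool)
allowed = bits.flatten()[:vocab_size]
logits = logits.masked_fill(~allowed, float("-inf"))
```

Writing the mask in place avoids allocating a fresh tensor per decoding step, which is the motivation for preferring `write_mask_into` over returning a new mask.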