To switch the engine from V0 to V1, we need to comprehensively support the sampling parameters defined in https://github.com/vllm-project/vllm/blob/main/vllm/sampling_params.py
While most of the key parameters are already supported, some are still missing:
TODO (help wanted):
- `n` (parallel sampling): [V1] V1 engine implements parallel sampling (AsyncLLM and LLMEngine) #10980 @afeldman-nm
- `guided_decoding` (structured decoding): [V1][Core] Support for Structured Outputs #12388 @aarnphm
- `logit_bias`: Support logit_bias in v1 Sampler #13079 @houseroad
- `min_p`: [V1][Core] min_p sampling support #13191 @AoyuQC
- `bad_words` (originally implemented via logits processor): [V1] Support bad_words in sampler #13376 @22quinn
- `allowed_token_ids` (originally implemented via logits processor): [v1] Support allowed_token_ids in v1 Sampler #13210 @houseroad
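For context on one of the items above: `min_p` filtering keeps only tokens whose probability is at least `min_p` times the probability of the most likely token. A minimal pure-Python sketch of that step (illustrative only; vLLM's V1 sampler operates on torch tensors, and the function name here is hypothetical):

```python
import math

def apply_min_p(logits, min_p):
    """Mask tokens whose probability falls below min_p * max_prob.

    Pure-Python sketch of min_p sampling; not vLLM's actual
    implementation, which works on batched torch tensors.
    """
    # Numerically stable softmax over the logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Threshold scales with the most likely token's probability.
    threshold = min_p * max(probs)
    # Tokens below the threshold get -inf, excluding them from sampling.
    return [x if p >= threshold else float("-inf")
            for x, p in zip(logits, probs)]
```

With `min_p=0.1`, a token whose probability is less than 10% of the top token's probability is masked out before sampling.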
Parameters that will not be supported in V1:
- `best_of`
- `logits_processors`
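For reference, the logits-processor mechanism dropped here is what V0 used for `allowed_token_ids` and `bad_words`: a callable that edits the logits before sampling. A pure-Python sketch of the `allowed_token_ids` masking step (illustrative only; the function name is hypothetical and V1 implements this inside the sampler instead):

```python
def mask_allowed(logits, allowed_token_ids):
    """Logits-processor-style masking: set every token outside
    allowed_token_ids to -inf so it can never be sampled.

    Sketch of the V0 approach; V1 moves equivalent logic into
    the sampler itself rather than user-supplied processors.
    """
    allowed = set(allowed_token_ids)
    return [x if i in allowed else float("-inf")
            for i, x in enumerate(logits)]
```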