Tags: hongxiayang/vllm
[V1][Spec Decode] Update N-gram Proposer Interface (vllm-project#15750) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
[V1][Spec Decode] Update target_logits in place for rejection sampling (vllm-project#15427) Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
[V1] Minor V1 async engine test refactor (vllm-project#15075) Signed-off-by: andoorve <murali.andoorveedu@mail.utoronto.ca> Co-authored-by: andoorve <murali.andoorveedu@mail.utoronto.ca>
[Bugfix] Fix LoRA extra vocab size (vllm-project#15047) Signed-off-by: Jee Jee Li <pandaleefree@gmail.com>
[Bugfix] Make Gemma3 MM V0 only for now (vllm-project#14971) Signed-off-by: Roger Wang <ywang@roblox.com>
[V1][Spec Decode] Support random sampling for spec decode (vllm-project#13933) Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
[Bugfix] Fix deepseekv3 grouped topk error (vllm-project#13474) Signed-off-by: Chen-XiaoBing <chenxb002@whu.edu.cn>
[Misc] Improve error message for incorrect pynvml (vllm-project#12809) Signed-off-by: youkaichao <youkaichao@gmail.com>
Disable chunked prefill and/or prefix caching when MLA is enabled (vllm-project#12642) From @mgoin in vllm-project#12638. I cannot push to that branch, therefore a new PR to unblock release. --------- Signed-off-by: mgoin <michael@neuralmagic.com> Signed-off-by: simon-mo <simon.mo@hey.com> Co-authored-by: mgoin <michael@neuralmagic.com>