Issues: vllm-project/vllm
Issues list
[V1][Prototype] MTP Support
frontend
speculative-decoding
v1
#17683
opened May 5, 2025 by
ruisearch42
3 tasks
[Bugfix][V1][Spec Dec] Add generator to request even when no seed is provided.
speculative-decoding
v1
#17509
opened May 1, 2025 by
luyuzhe111
[INTEL_HPU][v0] Enable spec decode on HPU
speculative-decoding
#17014
opened Apr 23, 2025 by
xuechendi
[V1][Metrics] Add API for accessing in-memory Prometheus metrics
documentation
Improvements or additions to documentation
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
speculative-decoding
v1
#17010
opened Apr 22, 2025 by
markmc
[Misc] Replace cuda hard code with current_platform
speculative-decoding
#16983
opened Apr 22, 2025 by
shen-shanshan
Draft
[V1] LogitsProcessor interface
ci/build
documentation
Improvements or additions to documentation
frontend
multi-modality
Related to multi-modality (#4194)
speculative-decoding
structured-output
tool-calling
tpu
Related to Google TPUs
v1
#16728
opened Apr 16, 2025 by
afeldman-nm
Draft
[V1][Spec Decode] Non greedy sample with EAGLE / Reduce memory allocation for Rejection Sampler
documentation
Improvements or additions to documentation
needs-rebase
speculative-decoding
v1
#16077
opened Apr 4, 2025 by
ekagra-ranjan
2 tasks done
[SpecDecode] Support EAGLE in V1
speculative-decoding
v1
#15901
opened Apr 1, 2025 by
WoosukKwon
7 of 10 tasks
[Misc] Disable pin_memory in AsyncMetricsCollector for spec decode tensor allocation
needs-rebase
speculative-decoding
#15886
opened Apr 1, 2025 by
esmeetu
[Misc] Improve cli help show
ci/build
needs-rebase
speculative-decoding
#15455
opened Mar 25, 2025 by
reidliu41
[V1][Spec Decode] Remove warning on N-gram
needs-rebase
speculative-decoding
v1
#15361
opened Mar 23, 2025 by
WoosukKwon
[SpecDecode] Make spec decoding extensible to different backends
ci/build
speculative-decoding
#15195
opened Mar 20, 2025 by
MengqingCao
[Spec Decode] Make speculative decoding compatible with pipeline parallelism
needs-rebase
speculative-decoding
#15173
opened Mar 20, 2025 by
xyang16
[Frontend]Reduce vLLM's import time
ci/build
frontend
multi-modality
Related to multi-modality (#4194)
needs-rebase
speculative-decoding
structured-output
v1
#15128
opened Mar 19, 2025 by
Chen-0210
[Bugfix] Fix hidden_states reshape failed and no_proposals error when…
speculative-decoding
#15032
opened Mar 18, 2025 by
ptkang
[Feature] Eagle Chunked Prefill Support
speculative-decoding
#14922
opened Mar 17, 2025 by
luyuzhe111
[Misc] QoL: add speculative_model to SpeculativeConfig
speculative-decoding
v1
#14509
opened Mar 9, 2025 by
andylolu2
[Misc] Using ruff-format for smaller sets of directories
ci/build
documentation
Improvements or additions to documentation
misc
multi-modality
Related to multi-modality (#4194)
needs-rebase
speculative-decoding
v1
#14485
opened Mar 8, 2025 by
aarnphm
[Core] Add DoRA Support
ci/build
documentation
Improvements or additions to documentation
frontend
multi-modality
Related to multi-modality (#4194)
needs-rebase
speculative-decoding
v1
#14389
opened Mar 7, 2025 by
ChloeL19
[Hardware][CPU] Vllm int8 quantization enablement for ARM CPU
ci/build
documentation
Improvements or additions to documentation
frontend
multi-modality
Related to multi-modality (#4194)
speculative-decoding
v1
#14129
opened Mar 3, 2025 by
nishith-fujitsu
[Bugfix] Make memory profiler account for speculative draft model weights
speculative-decoding
#14067
opened Feb 28, 2025 by
benchislett
[Bugfix] Enable speculative decoding for models with nearly-identical vocab sizes
speculative-decoding
#13849
opened Feb 25, 2025 by
benchislett