Pull requests: vllm-project/tpu-inference


[Chore] Update import path for vllm.utils
#901 opened Oct 20, 2025 by wdhongtw
[WIP] Add Qwen3-Omni model
#896 opened Oct 19, 2025 by eitanporat
add jax support for Qwen2VL
#893 opened Oct 18, 2025 by shungcp
Added the docker login instructions
#891 opened Oct 17, 2025 by hosseinsarshar
[Doc] Docker guide extended
#890 opened Oct 17, 2025 by hosseinsarshar
[GPT-OSS] JAX implementation of GPT-OSS
#861 opened Oct 14, 2025 by bzgoogle
Enable spmd on lora
#829 opened Oct 10, 2025 by vanbasten23
lora spmd
#802 opened Oct 8, 2025 by vanbasten23 (Draft)
Prototyping load weight scale for qwen3.
#741 opened Sep 25, 2025 by inho9606
[Test only] Remove the model cache
#725 opened Sep 22, 2025 by QiliangCui
extract docker build step (wip)
#713 opened Sep 19, 2025 by CienetStingLin
prefill decode microbenchmark for QWen3
#699 opened Sep 16, 2025 by mailvijayasingh
[Misc] Upgrade to flax 0.11.2
#603 opened Aug 28, 2025 by py4
Llama4scout logit checking
#475 opened Aug 13, 2025 by gpolovets1
Parallelize llama4 implementation weight loading
#457 opened Aug 12, 2025 by KWang1998