Issues: vllm-project/vllm
[Misc]: Will the kv-cache be computed and stored if max_tokens=1? (misc) #9902 · opened Nov 1, 2024 by donpromax
[help wanted]: add sliding window support for flashinfer (misc) #9854 · opened Oct 30, 2024 by youkaichao
[Misc]: Remove max_tokens field for chat completion requests when no longer supported by the OpenAI client (misc) #9845 · opened Oct 30, 2024 by gcalmettes
[Misc]: Eagle reformat checkpoint compatible with vLLM (misc) #9816 · opened Oct 29, 2024 by sssrijan-amazon
[Misc]: Unable to load Llama 3B model on A10 GPU (misc) #9753 · opened Oct 28, 2024 by hrsmanian
[Usage]: Qwen2VL model mrope implementation in CUDA graph (misc) #9546 · opened Oct 21, 2024 by gujiewen
[Misc]: offline inference gives inconsistent results for Qwen2-7B (misc) #9450 · opened Oct 17, 2024 by poppybrown
[Misc]: [Question] vLLM's model loading & instance contract: one model per vLLM instance, or multiple models per instance? (misc) #9429 · opened Oct 16, 2024 by yx-lamini
[Misc]: I'm trying to host my fine-tuned Llama-3-8B-Instruct in vLLM (misc) #9361 · opened Oct 15, 2024 by preethiisenthil
[Misc]: remove dropout-related code from the Triton flash attention kernel (misc) #9322 · opened Oct 13, 2024 by HaiShaw
[help wanted]: write tests for Python-only development (misc) #9315 · opened Oct 12, 2024 by youkaichao
[Misc]: Debugging the paged attention issue for customized LLMs (misc) #9231 · opened Oct 10, 2024 by protossw512
[Misc]: Segmentation Fault in vLLM API Server during Model Initialization (NCCL Error: Unhandled System Error) (misc) #9156 · opened Oct 8, 2024 by shreyasp-07
[Misc]: Need to understand support for torch.compile in Q4 roadmap (misc) #9072 · opened Oct 4, 2024 by amd-abhikulk
[Question]: Apply LoRA adapter on quantized model (misc) #8945 · opened Sep 29, 2024 by Tejaswgupta
[Misc]: Strange "leaked shared_memory" warnings reported by multiprocessing when using vLLM (misc) #8803 · opened Sep 25, 2024 by shaoyuyoung
[Tracking Issue][Help Wanted]: FlashInfer backend improvements (help wanted, misc) #8786 · opened Sep 24, 2024 by comaniac
[Misc]: Enable Dependabot to help manage known vulnerabilities in dependencies (misc) #8734 · opened Sep 23, 2024 by fcanogab