Commits

Commits on Dec 30, 2024

[V1] [6/N] API Server: Better Shutdown (vllm-project#11586)
robertgshaw2-neuralmagic authored

Commits on Dec 28, 2024

[V1] [5/N] API Server: unify Detokenizer and EngineCore input (vllm-project#11545)
robertgshaw2-neuralmagic authored
[V1] [4/N] API Server: ZMQ/MP Utilities (vllm-project#11541)
robertgshaw2-neuralmagic authored

Commits on Dec 26, 2024

[2/N] API Server: Avoid ulimit footgun (vllm-project#11530)
robertgshaw2-neuralmagic authored
[1/N] API Server (Remove Proxy) (vllm-project#11529)
robertgshaw2-neuralmagic authored

Commits on Nov 11, 2024

[V1] AsyncLLM Implementation (vllm-project#9826)

authored

Commits on Nov 2, 2024

[V1] Fix EngineArgs refactor on V1 (vllm-project#9954)
robertgshaw2-neuralmagic authored

Commits on Oct 28, 2024

Fix beam search eos (vllm-project#9627)
robertgshaw2-neuralmagic authored

Commits on Oct 17, 2024

Support BERTModel (first encoder-only embedding model) (vllm-project#9056)

authored

Commits on Aug 31, 2024

[BugFix][Core] Multistep Fix Crash on Request Cancellation (vllm-project#8059)
robertgshaw2-neuralmagic authored

Commits on Aug 21, 2024

Commits on Aug 18, 2024

[ Bugfix ] Fix Prometheus Metrics With zeromq Frontend (vllm-project#7279)
robertgshaw2-neuralmagic and njhill authored

Commits on Aug 7, 2024

[ BugFix ] Move zmq frontend to IPC instead of TCP (vllm-project#7222)
robertgshaw2-neuralmagic authored

Commits on Aug 6, 2024

[ BugFix ] Fix ZMQ when VLLM_PORT is set (vllm-project#7205)
robertgshaw2-neuralmagic authored

Commits on Aug 3, 2024

[ Frontend ] Multiprocessing for OpenAI Server with zeromq (vllm-project#6883)

authored

Commits on Jul 25, 2024

[ Misc ] fp8-marlin channelwise via compressed-tensors (vllm-project#6524)
robertgshaw2-neuralmagic and mgoin authored

Commits on Jul 18, 2024

Commits on Jul 15, 2024

[CI/Build] Cross python wheel (vllm-project#6394)
robertgshaw2-neuralmagic authored

Commits on Jul 14, 2024

Commits on Jul 13, 2024

[ Misc ] More Cleanup of Marlin (vllm-project#6359)
robertgshaw2-neuralmagic authored