About Concurrency and Stability #637
gdw439 announced in Announcements
Replies: 1 comment
https://www.baseten.co/blog/how-we-built-bei-high-throughput-embedding-inference/#performance-benefits-from-baseten-infrastructure Answer: It depends. It has many more features around embeddings than vLLM.
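Since the question is about concurrency under load, here is a minimal sketch of how one might drive an OpenAI-compatible `POST /v1/embeddings` endpoint with many requests in flight at once. The `embed` function below is a hypothetical stand-in for the HTTP call (a real client would use e.g. `aiohttp` or the `openai` SDK); only the payload shape follows the OpenAI embeddings API.

```python
import asyncio

# Hypothetical stand-in for an HTTP call to an OpenAI-compatible
# POST /v1/embeddings endpoint. It echoes back one fixed-size vector
# per input string, mimicking the response's "data" list.
async def embed(payload: dict) -> dict:
    await asyncio.sleep(0.01)  # simulate network + inference latency
    return {
        "data": [
            {"embedding": [0.0] * 8, "index": i}
            for i, _ in enumerate(payload["input"])
        ]
    }

async def main() -> list[dict]:
    batches = [["hello world"], ["concurrency", "stability"]]
    # "model" name is a placeholder; "input" accepts a list of strings.
    payloads = [{"model": "my-embedding-model", "input": b} for b in batches]
    # Issue all requests concurrently; sustained throughput under this
    # kind of load is what embedding-focused servers are tuned for.
    return await asyncio.gather(*(embed(p) for p in payloads))

results = asyncio.run(main())
print(len(results))  # 2
```

Swapping the stub for a real HTTP call keeps the same structure: `asyncio.gather` fans the requests out, and per-request error handling can be added around each `embed` call to probe stability.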
Feature request
Hello! I recently came across this popular OpenAI-compatible inference framework and found it very interesting. I'd like to know more about its concurrency and stability—specifically, how it compares to vLLM.
Motivation
Project feature.
Your contribution
No.