[Usage]: Recommended setting for running vLLM for CPU #5682

Open
jerin-scalers-ai opened this issue Jun 19, 2024 · 2 comments
Labels
usage How to use vllm x86 CPU

Comments


jerin-scalers-ai commented Jun 19, 2024

How would you like to use vllm

What are the recommended settings for running vLLM on CPU to achieve high performance? For instance, on a dual-socket server with 96 cores per socket, how many cores should each replica be given (via --cpuset-cpus) when running multiple replicas of vLLM?
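To make the question concrete, here is a minimal sketch of how candidate --cpuset-cpus values could be enumerated so that no replica spans both sockets. It rests on assumptions that are not confirmed for this machine: core IDs are numbered contiguously per socket (0-95 on socket 0, 96-191 on socket 1) and SMT siblings are ignored; the real layout should be checked with `lscpu`. The function name `cpuset_ranges` and the replica count are hypothetical, and the sketch does not say which split performs best.

```python
def cpuset_ranges(sockets, cores_per_socket, replicas_per_socket):
    """Return one "start-end" core range per replica.

    Assumption (not verified for any real machine): core IDs are
    contiguous within a socket, so socket s owns cores
    [s * cores_per_socket, (s + 1) * cores_per_socket - 1].
    """
    if replicas_per_socket < 1:
        raise ValueError("replicas_per_socket must be at least 1")
    if cores_per_socket % replicas_per_socket != 0:
        raise ValueError("cores_per_socket must divide evenly among replicas")

    size = cores_per_socket // replicas_per_socket
    ranges = []
    for socket in range(sockets):
        base = socket * cores_per_socket
        for replica in range(replicas_per_socket):
            start = base + replica * size
            # Each range stays inside one socket by construction.
            ranges.append(f"{start}-{start + size - 1}")
    return ranges


if __name__ == "__main__":
    # Hypothetical split: 2 replicas per socket on the 2 x 96-core server.
    for cpus in cpuset_ranges(sockets=2, cores_per_socket=96, replicas_per_socket=2):
        print(f"--cpuset-cpus={cpus}")
```

Each printed value would be passed to one container. Whether 1, 2, or more replicas per socket gives the best throughput is exactly what this issue is asking.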

@jerin-scalers-ai jerin-scalers-ai added the usage How to use vllm label Jun 19, 2024
@mgoin mgoin added the x86 CPU label Jun 19, 2024
@zhouyuan
Contributor

related: #5735

@zhouyuan
Contributor

related: #6212


3 participants