
[Feature] Support more than 64 CPU threads #2696

Open
fixerivan opened this issue Jul 19, 2024 · 5 comments
Labels
enhancement (New feature or request), need-info (Further information from issue author is requested)

Comments

@fixerivan

Feature Request

Currently the GPT4All settings only allow setting the maximum CPU thread count to 64.

When I set it to 192 (which matches my current hardware setup), it always reverts to 64.

Ollama by itself supports 192, which I have tried, so maybe this is just a UI restriction? I didn't look at the code.

Thanks

@fixerivan fixerivan added the enhancement New feature or request label Jul 19, 2024
@chrisbarrera
Contributor

I'm not from Nomic, but I have to ask: what is the benefit of even 64 CPU threads? Have you benchmarked 64 threads against 32, 16, or even 8, and found that higher counts are better past a certain point? My understanding, borne out by my own tests, is that after 6-8 CPU threads the memory bus is saturated and additional threads tend to accomplish nothing. Maybe you can use a few more than that on an Epyc, just don't expect much beyond that point to actually help. If I am wrong, I would appreciate learning from the tests you have done.

@cosmic-snow
Collaborator

Looks like code is here:

void MySettings::setThreadCount(int value)
{
    if (threadCount() == value)
        return;
    value = std::max(value, 1);
    value = std::min(value, QThread::idealThreadCount());
    m_settings.setValue("threadCount", value);
    emit threadCountChanged();
}

Which means the thread count is clamped to whatever Qt considers the upper limit, i.e. QThread::idealThreadCount(). I'm unsure whether you'd get more performance out of a higher value, in any case.
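
As an aside (not from the thread): a minimal Qt sketch to check what idealThreadCount() reports on a given machine. The Windows note in the comments is an assumption; Qt builds that derive this value from the current processor group would report at most 64 logical processors.

// Minimal sketch: print the value GPT4All clamps against.
// Assumption: on Windows, if Qt derives this from the current
// processor group, it reports at most 64 logical processors.
// Build against Qt Core (e.g. via qmake or CMake).
#include <QThread>
#include <cstdio>

int main()
{
    std::printf("QThread::idealThreadCount() = %d\n",
                QThread::idealThreadCount());
    return 0;
}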

@supersonictw
Contributor

supersonictw commented Jul 19, 2024

@cosmic-snow
Collaborator

> Is it caused due to this?

Maybe, although I don't know the Qt internals, and that Q&A is really old. In any case, as chrisbarrera said, I'm not even sure it would help to go past the Qt-defined limit.

@cebtenzzre
Member

llama.cpp on CPU is memory-bottlenecked in practice, so using more CPU threads doesn't provide much benefit. The default of 4 threads is enough on my machine. Try with ollama or the llama.cpp CLI and see if you actually get any t/s improvement compared to 64 threads; you may actually see a slowdown.
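
For example (not from the thread), llama.cpp's llama-bench tool accepts a comma-separated list of thread counts, which makes a quick scaling sweep easy; the model path below is a placeholder:

# Sweep thread counts and compare tokens/s; model path is a placeholder.
llama-bench -m ./model.gguf -t 8,16,32,64,128,192

If tokens/s plateaus or drops well before 64 threads, that would suggest raising the GPT4All limit wouldn't help.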

@cebtenzzre cebtenzzre added the need-info Further information from issue author is requested label Jul 29, 2024