ggml : add option for controlling work distribution across threads

See https://github.com/ggerganov/llama.cpp/pull/1507

See also this comment: https://github.com/ggerganov/llama.cpp/pull/1507#issuecomment-1605302021

> I guess we can extend ggml to be able to choose work chunk distribution method - either at compile time, or via a context parameter. We can factor out the range selections from the ggml forward implementations to make implementation more concise and extensible in the future
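As a rough illustration of what "factoring out the range selection" could look like, here is a minimal sketch. All names (`work_dist`, `work_range`, `work_range_for`, `scale_rows`) are hypothetical and not part of the ggml API. It assumes the work is a set of `nr` rows split across `nth` threads, with `ith` as the index of the calling thread, and shows two possible distribution methods selected by a parameter.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical names for illustration only; none of this is existing ggml API.
enum work_dist {
    WORK_DIST_CONTIGUOUS,  // thread ith gets one block of consecutive rows
    WORK_DIST_INTERLEAVED, // thread ith gets rows ith, ith + nth, ith + 2*nth, ...
};

struct work_range {
    int64_t first; // first row handled by this thread
    int64_t last;  // one past the last row
    int64_t step;  // stride between rows handled by this thread
};

// Select the rows of an nr-row operation that thread ith (of nth) should process.
static struct work_range work_range_for(enum work_dist dist, int64_t nr, int ith, int nth) {
    assert(nr >= 0 && nth > 0 && ith >= 0 && ith < nth);

    struct work_range r;
    if (dist == WORK_DIST_INTERLEAVED) {
        r.first = ith;
        r.last  = nr;
        r.step  = nth;
    } else {
        const int64_t dr = (nr + nth - 1) / nth; // rows per thread, rounded up
        r.first = dr * ith;
        r.last  = r.first + dr < nr ? r.first + dr : nr;
        if (r.first > nr) {
            r.first = nr; // more threads than rows: empty range
        }
        r.step = 1;
    }
    return r;
}

// Example kernel: scale every row assigned to this thread by s.
// The loop is the same regardless of the distribution method.
static void scale_rows(float * x, int64_t nc, float s, struct work_range r) {
    for (int64_t ir = r.first; ir < r.last; ir += r.step) {
        for (int64_t ic = 0; ic < nc; ic++) {
            x[ir*nc + ic] *= s;
        }
    }
}
```

With a helper of this shape, each forward implementation would only iterate over the returned range, and the choice of method could come from a compile-time define or a context parameter, as suggested in the comment.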

---

Another thing to investigate is the use of `sched_yield()`, and whether to make it user-configurable:

https://github.com/ggerganov/whisper.cpp/pull/1275/commits/09a6325de56856490ae9046bf0030ceedc04028a
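One possible shape for such an option is sketched below. It is not taken from the linked commit: the `wait_policy` enum and the `wait_until_set` helper are hypothetical, and the sketch assumes a POSIX system (for `sched_yield()`) and C11 atomics. It shows a worker waiting on a flag, where the caller chooses between a pure busy-wait and yielding the CPU on each iteration.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical names for illustration only; none of this is existing ggml API.
enum wait_policy {
    WAIT_POLICY_SPIN,  // busy-wait without giving up the CPU
    WAIT_POLICY_YIELD, // call sched_yield() between checks
};

// Block until *flag becomes true, using the waiting strategy chosen by the caller.
static void wait_until_set(atomic_bool * flag, enum wait_policy policy) {
    assert(flag != NULL);

    while (!atomic_load_explicit(flag, memory_order_acquire)) {
        if (policy == WAIT_POLICY_YIELD) {
            sched_yield(); // let other runnable threads use this core
        }
    }
}
```

Which policy performs better likely depends on the thread count relative to the available cores and on the scheduler, which is the argument for exposing it as a setting rather than hard-coding it.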

ggml : add option for controlling work distribution across threads #291

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

ggml : add option for controlling work distribution across threads #291

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions