
worker "pool" for nested paralellization #361

@epruesse

Description


If I understand correctly, plan(tweak(multicore, workers=8)) means that the first nesting level gets 8 parallel workers and the second nesting level gets no parallelism. I could hard-allocate workers to each level, but that is difficult in practice because it requires knowing the thread usage of every package down the call tree.
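For context, the hard-allocation approach I'd like to avoid looks roughly like this with nested plans (the worker counts are made up; in newer versions of future the inner count may need to be marked with I() to override the built-in nested-parallelism protection):

```r
library(future)

## Hard-allocate workers per nesting level: 2 at the outer level,
## 4 per outer worker at the inner level (illustrative numbers).
plan(list(
  tweak(multicore, workers = 2),
  tweak(multicore, workers = I(4))
))
```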

What I'm looking for is a "worker pool" like implementation. A naive greedy allocation using a semaphore that decrements every time a thread is forked off would be a good start. So that if I have a loop of three calling a package that has uses future.apply on a huge vector but takes very long to even get there, the NN workers can be busy for as much of the time as possible.

Interaction with OMP in particular is a problem, of course; a lot of things seem to use it. IIRC, Intel TBB auto-detects the number of "useful" threads and adjusts that value as it goes based on system load. Something like this would need extra housekeeping, but the concept of "don't start more threads if all my workers/CPUs are busy", or even "don't start more threads if we are at XY% memory", would be very useful for running things in parallel robustly.
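As a stop-gap for the OMP interaction, the per-process thread counts can at least be pinned from R before the threaded code runs, e.g. inside each worker (a hedged example; RhpcBLASctl is a separate package and the counts are illustrative):

```r
## Cap library-level threading so it does not multiply with
## process-level parallelism (counts are illustrative).
Sys.setenv(OMP_NUM_THREADS = "1")
RhpcBLASctl::omp_set_num_threads(1)
RhpcBLASctl::blas_set_num_threads(1)
```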

Labels: Backend API, feature request
