Round up saturation-factor
#7116
Merged
This would ensure that any `worker-saturation` value > 1.0 will always result in a worker being oversaturated with at least one task, even for a single-threaded worker.
We don't see a strong signal either way in benchmarking that increasing the saturation factor helps runtime. But consensus is currently that submitting 0 extra tasks to workers feels too radical as an initial change.
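To make the rounding behavior described above concrete, here is a minimal sketch comparing rounding down with rounding up. The function names `slots_rounded_down` and `slots_rounded_up` are hypothetical and only illustrate the arithmetic; they are not claimed to be the scheduler's actual code.

```python
import math


def slots_rounded_down(nthreads: int, saturation: float) -> int:
    """Hypothetical helper: task limit per worker when truncating."""
    # Truncation drops the fractional part, so 1 thread * 1.1 stays at 1 task.
    return max(int(nthreads * saturation), 1)


def slots_rounded_up(nthreads: int, saturation: float) -> int:
    """Hypothetical helper: task limit per worker when rounding up."""
    # Any fractional excess becomes one whole extra task.
    return max(math.ceil(nthreads * saturation), 1)


if __name__ == "__main__":
    for nthreads in (1, 2, 4):
        for saturation in (1.0, 1.1):
            print(
                f"nthreads={nthreads} saturation={saturation}: "
                f"down={slots_rounded_down(nthreads, saturation)} "
                f"up={slots_rounded_up(nthreads, saturation)}"
            )
```

In this sketch, a saturation of exactly 1.0 gives the same limit under both schemes, while any value slightly above 1.0 yields at least one extra task when rounding up, including on a single-threaded worker.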
For clusters where scheduler<->worker latency is relatively high, giving workers an extra task might make more of a difference than we see in our benchmarks.
Likewise, if data access is very slow, getting a head start and overlapping communication (to a database, S3, etc.) with computation might be more beneficial.
I'm personally still on the fence about this (because it certainly can increase memory usage, but we don't have strong evidence it affects runtime, and it's a little more complicated to explain), but staying a little closer to current behavior seems okay for now.
cc @crusaderky @fjetter @hendrikmakait
pre-commit run --all-files