-
-
Notifications
You must be signed in to change notification settings - Fork 717
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't end computations until cluster is truly idle #7790
Conversation
707afb1
to
777521d
Compare
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 26 files ± 0 26 suites ±0 15h 12m 20s ⏱️ - 43m 22s For more details on these failures, see this check. Results for commit 0cf5145. ± Comparison against base commit e887fde. This pull request removes 2 and adds 5 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results. |
777521d
to
57b8868
Compare
57b8868
to
37d6e01
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @crusaderky! This logic is much more sensible.
distributed/scheduler.py
Outdated
@@ -1775,6 +1775,19 @@ def _clear_task_state(self) -> None: | |||
): | |||
collection.clear() # type: ignore | |||
|
|||
@property | |||
def fully_idle(self) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is there a way we can stress the semantic difference between idle
and fully_idle
(e.g., by renaming fully_idle -> is_idle
or idle -> idle_workers
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed fully_idle -> is_idle
Testing for Scheduler.total_occupancy to be zero is not a great pick for declaring a computation finished and starting the next one.
This PR will not create a new computation as long as there are
This reduces (but doesn't fully eliminate) use cases where you have overlapping Computation objects (read #7776 for more).