Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't end computations until cluster is truly idle #7790

Merged
merged 3 commits into from
May 23, 2023

Conversation

crusaderky
Copy link
Collaborator

Testing for Scheduler.total_occupancy to be zero is not a great pick for declaring a computation finished and starting the next one.
This PR will not create a new computation as long as there are

  • any long-running tasks
  • any tasks with unmet resource constraints
  • zero workers, but queued tasks

This reduces (but doesn't fully eliminate) use cases where you have overlapping Computation objects (read #7776 for more).

@github-actions
Copy link
Contributor

github-actions bot commented Apr 18, 2023

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

       26 files  ±  0         26 suites  ±0   15h 12m 20s ⏱️ - 43m 22s
  3 606 tests +  3    3 496 ✔️ +  2     105 💤 ±0  5 +1 
45 629 runs  +39  43 455 ✔️ +38  2 169 💤 +1  5 ±0 

For more details on these failures, see this check.

Results for commit 0cf5145. ± Comparison against base commit e887fde.

This pull request removes 2 and adds 5 tests. Note that renamed tests count towards both.
distributed.tests.test_scheduler ‑ test_computations
distributed.tests.test_scheduler ‑ test_computations_futures
distributed.tests.test_computations ‑ test_computations
distributed.tests.test_computations ‑ test_computations_futures
distributed.tests.test_computations ‑ test_computations_long_running
distributed.tests.test_computations ‑ test_computations_no_resources
distributed.tests.test_computations ‑ test_computations_no_workers

♻️ This comment has been updated with latest results.

@crusaderky crusaderky self-assigned this May 10, 2023
@crusaderky crusaderky marked this pull request as ready for review May 10, 2023 11:06
@crusaderky crusaderky requested a review from fjetter as a code owner May 10, 2023 11:06
@crusaderky crusaderky removed the request for review from fjetter May 10, 2023 11:06
@hendrikmakait hendrikmakait added the needs review Needs review from a contributor. label May 10, 2023
Copy link
Member

@hendrikmakait hendrikmakait left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @crusaderky! This logic is much more sensible.

@@ -1775,6 +1775,19 @@ def _clear_task_state(self) -> None:
):
collection.clear() # type: ignore

@property
def fully_idle(self) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Is there a way we can stress the semantic difference between idle and fully_idle (e.g., by renaming fully_idle -> is_idle or idle -> idle_workers)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed fully_idle -> is_idle

@hendrikmakait hendrikmakait removed the needs review Needs review from a contributor. label May 22, 2023
@crusaderky crusaderky merged commit fcd921c into dask:main May 23, 2023
@crusaderky crusaderky deleted the computation_idle branch May 23, 2023 09:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants