Restarting process #458

liamhuber · 2024-09-18T21:16:18Z

To work with long-duration nodes on executors that survive the shutdown of the parent workflow/node python process (e.g. executorlib using slurm), we need to be able to tell the run paradigm to serialize the results, and to try to load such a serialization if we come back and find the node still marked as running.

This introduces new attributes Node.serialize_results to trigger the result serialization, and a private Node._do_clean to let power users (i.e. me writing the unit tests) stop the serialized results from getting cleaned up automatically at read-time.

Under the hood, Node now directly implements Runnable.on_run and Runnable.run_args leveraging the new detached path from #457 to make sure that each run has access to a semantically relevant path for writing the temporary output file (using cloudpickle). Child classes of Node implement new abstract methods Node._on_run and Node._run_args in place of the previous Runnable abstract methods they implemented.
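The delegation described above can be sketched roughly as follows. This is a simplified stand-in, not the actual pyiron_workflow classes: the property-based signatures, the `result_file` attribute, the `run_result.pkl` file name, and the `Add` example node are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Callable


class Runnable(ABC):
    """Simplified stand-in for the run-cycle base class."""

    @property
    @abstractmethod
    def on_run(self) -> Callable[..., Any]:
        """The callable executed when the object runs."""

    @property
    @abstractmethod
    def run_args(self) -> tuple[tuple, dict]:
        """Positional and keyword arguments passed to `on_run`."""

    def run(self) -> Any:
        args, kwargs = self.run_args
        return self.on_run(*args, **kwargs)


class Node(Runnable, ABC):
    """Implements the Runnable hooks itself; children fill in the private ones."""

    def __init__(self, label: str, root: Path):
        self.label = label
        self.root = root

    @property
    def result_file(self) -> Path:
        # Hypothetical: a per-node location for the temporary output file
        return self.root / self.label / "run_result.pkl"

    @property
    @abstractmethod
    def _on_run(self) -> Callable[..., Any]:
        """What the child class actually wants to execute."""

    @property
    @abstractmethod
    def _run_args(self) -> tuple[tuple, dict]:
        """The arguments the child class wants to execute it with."""

    @property
    def on_run(self) -> Callable[..., Any]:
        # Node owns this hook, so it could wrap `_on_run` with
        # serialization to `result_file` before handing it to the executor
        return self._on_run

    @property
    def run_args(self) -> tuple[tuple, dict]:
        return self._run_args


class Add(Node):
    """Hypothetical child node that adds two numbers."""

    def __init__(self, label: str, root: Path, a: float, b: float):
        super().__init__(label, root)
        self.a, self.b = a, b

    @property
    def _on_run(self) -> Callable[..., Any]:
        return lambda a, b: a + b

    @property
    def _run_args(self) -> tuple[tuple, dict]:
        return (self.a, self.b), {}
```

Because `Node` sits between `Runnable` and the child classes, any wrapping of the run callable (such as writing results to disk) would only need to be written once, in `Node.on_run`.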

TODO:

- ~~Figure out how to get the node/parent workflow to checkpoint itself after it has set its status to "running" when it is going to rely on serialization~~ Rebase this onto main; since the Runnable.run cycle has been broken down, getting a right-before-running checkpoint should be quite easy
- Try it out on cmmc to make sure it actually works in production
- Document it, probably in the deepdive

When the flag is set, results are serialized with cloudpickle. If the node is already running and the results file exists, the results are loaded from that file instead of running again; the results file gets cleaned up either way. This is really only _useful_ when the node is running on an executor process that _doesn't_ die when the parent process dies.
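A minimal sketch of that save-then-recover flow, under stated assumptions: the function names `run_and_save` and `load_if_finished` are hypothetical, the standard-library `pickle` is swapped in for cloudpickle to keep the example dependency-free, and the `running` and `do_clean` arguments only mimic the node status and the private `Node._do_clean` attribute.

```python
import pickle
from pathlib import Path


def run_and_save(compute, result_file: Path):
    """Executor side: compute the result, then persist it to disk."""
    result = compute()
    result_file.parent.mkdir(parents=True, exist_ok=True)
    with result_file.open("wb") as f:
        pickle.dump(result, f)
    return result


def load_if_finished(result_file: Path, *, running: bool, do_clean: bool = True):
    """Parent side after a restart: return (found, result).

    A result is only picked up when the node was left in the "running"
    state and the executor has already written the results file.
    """
    if not (running and result_file.exists()):
        return False, None
    with result_file.open("rb") as f:
        result = pickle.load(f)
    if do_clean:
        # Remove the temporary file once it has been read
        result_file.unlink()
    return True, result
```

If `load_if_finished` reports that nothing was found, the caller would fall back to its usual behaviour for a node in that state. Passing `do_clean=False` leaves the file in place, which is mainly convenient for tests that want to inspect it.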

github-actions · 2024-09-18T21:16:30Z

👈 Launch a binder notebook on branch pyiron/pyiron_workflow/restarting_process

codacy-production · 2024-09-18T21:18:18Z

Coverage summary from Codacy

See diff coverage on Codacy

| Coverage variation | Diff coverage |
| --- | --- |
| ✅ +0.06% (target: -1.00%) | ✅ 96.92% |

Coverage variation details

| | Coverable lines | Covered lines | Coverage |
| --- | --- | --- | --- |
| Common ancestor commit (`c33c9e1`) | 3244 | 2961 | 91.28% |
| Head commit (`4b41fdd`) | 3288 (+44) | 3003 (+42) | 91.33% (+0.06%) |

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

| | Coverable lines | Covered lines | Diff coverage |
| --- | --- | --- | --- |
| Pull request (#458) | 65 | 63 | 96.92% |

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

Codacy stopped sending the deprecated coverage status on June 5th, 2024.

coveralls · 2024-09-18T21:19:21Z

Pull Request Test Coverage Report for Build 10948924861

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

0 of 0 changed or added relevant lines in 0 files are covered.
24 unchanged lines in 2 files lost coverage.
Overall coverage increased (+0.06%) to 91.332%

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| draw.py | 4 | 92.57% |
| node.py | 20 | 91.46% |

Totals
Change from base Build 10927133806:	0.06%
Covered Lines:	3003
Relevant Lines:	3288

💛 - Coveralls

liamhuber · 2024-09-25T19:13:37Z

Superseded by #476

liamhuber added 5 commits September 18, 2024 11:48

Refactor: Make Node directly responsible for on_run

7b83509

Refactor: Make Node directly responsible for run_args

0edcfb9

Remove unused import

1dcae11

Remove intermediate value

0026331

liamhuber mentioned this pull request Sep 19, 2024

Why are running and failed still coexisting? #461

Open

liamhuber added the format_black trigger the Black formatting bot label Sep 19, 2024

Format black

4b41fdd

liamhuber closed this Sep 25, 2024

liamhuber deleted the restarting_process branch September 25, 2024 19:13
