
[minor] 0.10.0 #413


Merged

merged 44 commits into main from minor_bump on Aug 22, 2024

Conversation

liamhuber (Member) commented Aug 8, 2024

I want to release a patch version before the next minor bump, to facilitate easy conversion from existing storage formats to pickle. That patch release then serves as a landing platform for all the subsequent work, which requires a minor bump.
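As a rough sketch of the intended migration path -- the backend names and method signatures below are illustrative assumptions, not the package's confirmed API:

```python
from pyiron_workflow import Workflow

# Hypothetical conversion recipe: while the legacy backend is still
# supported, load the saved state, then re-save it in pickle format.
wf = Workflow("my_project")   # assumed to pick up existing save files
wf.load(backend="h5")         # legacy format; backend name assumed
wf.save(backend="pickle")     # the format everything lands on going forward
```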

liamhuber and others added 24 commits August 6, 2024 20:07
With the introduction of a pickle backend, this test wound up failing. I haven't been able to work out exactly why, but since pickle is much faster, my guess is that something in the test parallelization gets into a race condition with deleting a saved workflow from another test. Simply renaming it here solved the issue, which is an innocuous change.
Ok, the reason workflow storage tests were failing after introducing pickle was that the original version of the test had the try/finally scope and the for-loop scope nested in the wrong order. I got away with it earlier because of interplay between the outermost member of the loop and the default storage backend. Actually fixing the problem is as simple as ensuring the "finally delete" clause happens after _each_ loop step. This does that and reverts the renaming of the workflow.
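The ordering bug described here is a generic Python pitfall, sketched below with hypothetical `wf`, `backends`, and `check_round_trip` names:

```python
# Broken: the cleanup wraps the whole loop, so it runs only once, after
# every backend has already written files -- iterations can collide.
try:
    for backend in backends:
        wf.save(backend=backend)
        check_round_trip(wf)  # hypothetical assertions
finally:
    wf.delete_storage()

# Fixed: try/finally nested inside the loop, so each iteration deletes
# its own save files before the next backend gets a turn.
for backend in backends:
    try:
        wf.save(backend=backend)
        check_round_trip(wf)
    finally:
        wf.delete_storage()
```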
You're only supposed to use it as a decorator to start with, so the kwarg was senseless
This is necessary (but not sufficient) to get `as_dataclass_node` decorated classes to pickle. The field name is a bit off compared to `Function` and `Macro`, as now we decorate a class definition instead of a function definition, but it's close.
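For context, `pickle` stores classes by reference -- module plus qualified name -- and looks the class up again on load, so a class manufactured inside a decorator is unpicklable until its recorded location points somewhere importable. A toy illustration of the general mechanism (not `pyiron_workflow`'s actual factory machinery):

```python
import pickle

def node_like_decorator(cls):
    # Defined in a function body, Wrapped would normally carry
    # __qualname__ "node_like_decorator.<locals>.Wrapped", which
    # pickle cannot resolve. Adopting the decorated class's identity
    # makes the by-reference lookup land back on the wrapper.
    class Wrapped(cls):
        decorated = True

    Wrapped.__module__ = cls.__module__
    Wrapped.__qualname__ = cls.__qualname__
    Wrapped.__name__ = cls.__name__
    return Wrapped

@node_like_decorator
class MyData:
    pass

# The decorator rebound the name "MyData" to the wrapper at its
# definition site, so pickling by module + qualname round-trips.
assert isinstance(pickle.loads(pickle.dumps(MyData())), MyData)
```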
Now that it's pickling as well as anything else
When the pickle backend fails
Regardless of what the specified storage backend was.


github-actions bot commented Aug 8, 2024

Binder 👈 Launch a binder notebook on branch pyiron/pyiron_workflow/minor_bump

* Remove all storage backends except for pickle

* Remove pyiron_contrib dependence and references

* Remove h5io_browser dependence

* Remove no-bidict workaround

It was only necessary for h5io since that can't handle bidict's custom reconstructor

* Remove channel recursion workaround

It was only necessary for h5io since that can't handle recursion. We keep the purging of connections, because those might actually extend the data scope by reaching non-owner nodes

* [minor] Remove node job (#415)

* Remove node job

* Remove pyiron_base and h5io dependencies
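The channel-recursion item in the list above reflects a real difference between serializers: `pickle` keeps a memo table of every object it has already written, so cyclic references round-trip without any workaround. A self-contained demonstration:

```python
import pickle

a = {"name": "a"}
b = {"name": "b", "partner": a}
a["partner"] = b  # cycle: a -> b -> a

# pickle records each object in its memo the first time it is seen and
# emits back-references afterwards, so the cycle never recurses forever.
restored = pickle.loads(pickle.dumps(a))
assert restored["partner"]["partner"] is restored
```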

codacy-production bot commented Aug 8, 2024

Coverage summary from Codacy

See diff coverage on Codacy

| Coverage variation | Diff coverage |
| --- | --- |
| -1.23% (target: -1.00%) | 78.31% |

Coverage variation details

| Commit | Coverable lines | Covered lines | Coverage |
| --- | --- | --- | --- |
| Common ancestor (3bd6241) | 3599 | 3331 | 92.55% |
| Head (cef10fb) | 3099 (-500) | 2830 (-501) | 91.32% (-1.23%) |

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

| | Coverable lines | Covered lines | Diff coverage |
| --- | --- | --- | --- |
| Pull request (#413) | 249 | 195 | 78.31% |

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

Codacy stopped sending the deprecated coverage status on June 5th, 2024.

coveralls commented Aug 8, 2024

Pull Request Test Coverage Report for Build 10517318730

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 141 unchanged lines in 12 files lost coverage.
  • Overall coverage decreased (-1.2%) to 91.32%

| File with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| nodes/transform.py | 1 | 99.39% |
| nodes/for_loop.py | 2 | 98.22% |
| create.py | 4 | 93.44% |
| mixin/semantics.py | 5 | 96.91% |
| nodes/static_io.py | 6 | 85.71% |
| nodes/macro.py | 8 | 92.65% |
| channels.py | 11 | 94.78% |
| nodes/composite.py | 13 | 90.4% |
| io.py | 14 | 94.19% |
| node.py | 21 | 90.95% |
Totals Coverage Status
Change from base Build 10374248662: -1.2%
Covered Lines: 2830
Relevant Lines: 3099

💛 - Coveralls

* Remove all storage backends except for pickle

* Remove pyiron_contrib dependence and references

* Remove h5io_browser dependence

* Remove no-bidict workaround

It was only necessary for h5io since that can't handle bidict's custom reconstructor

* Remove channel recursion workaround

It was only necessary for h5io since that can't handle recursion. We keep the purging of connections, because those might actually extend the data scope by reaching non-owner nodes

* Remove node job

* Remove pyiron_base and h5io dependencies

* [patch] Remove version constraints on storage

Tbh, I've forgotten why they were there. Now that we exclusively use pickle, I am cautiously optimistic py3.10 will work fine

* 🐛 fix test_docs indentation

I screwed up the indentation of the return value

* [minor] Remove registration and NodePackage

Just import and use nodes like any other python class.

* Extend error message

* Remove print
@liamhuber liamhuber changed the title [minor] Minor bump [minor] 0.10.0 Aug 8, 2024
liamhuber and others added 5 commits August 8, 2024 16:23
Some stuff slipped through the cracks in earlier PRs
* Refactor storage test

* Remove empty line

* Rename workflow in tests

With the introduction of a pickle backend, this test wound up failing. I haven't been able to work out exactly why, but since pickle is much faster, my guess is that something in the test parallelization gets into a race condition with deleting a saved workflow from another test. Simply renaming it here solved the issue, which is an innocuous change.

* Introduce `pickle` as a backend for saving

* Fix root cause of storage conflict

Ok, the reason workflow storage tests were failing after introducing pickle was that the original version of the test had the try/finally scope and the for-loop scope nested in the wrong order. I got away with it earlier because of interplay between the outermost member of the loop and the default storage backend. Actually fixing the problem is as simple as ensuring the "finally delete" clause happens after _each_ loop step. This does that and reverts the renaming of the workflow.

* Again, correctly order try/finally and for-loops

* Remove keyword argument from pure-decorator

You're only supposed to use it as a decorator to start with, so the kwarg was senseless

* Add factory import source

This is necessary (but not sufficient) to get `as_dataclass_node` decorated classes to pickle. The field name is a bit off compared to `Function` and `Macro`, as now we decorate a class definition instead of a function definition, but it's close.

* Bring the dataclass node in line with function and macro

* Leverage new pyiron_snippets.factory stuff to find the class

* Mangle the stored dataclass qualname so it can be found later

* Add tests

* Update docs examples to reflect new naming

* Update snippets dependency

* [dependabot skip] Update env file

* Use new pyiron_snippets syntax consistently

* Expose `as_dataclass_node` in the API

Now that it's pickling as well as anything else

* [patch] Fall back on cloudpickle

When the pickle backend fails

* [minor] Make pickle the default storage backend

* Format black

* Fall back on loading any storage contents

Regardless of what the specified storage backend was.

* Format black

---------

Co-authored-by: pyiron-runner <pyiron@mpie.de>
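The "[patch] Fall back on cloudpickle" item in the list above describes a two-tier save; a minimal sketch of that pattern follows (the function and the exact exceptions caught are assumptions, not the package's code). The later "fall back on loading any storage contents" item is its mirror image on the read side: on load, try every available backend rather than only the requested one.

```python
import pickle

import cloudpickle  # third-party; serializes things plain pickle cannot

def save(obj, path):
    # Prefer the standard library for speed and portability; fall back
    # to cloudpickle for objects it rejects, e.g. locally defined classes.
    try:
        with open(path, "wb") as handle:
            pickle.dump(obj, handle)
    except (pickle.PicklingError, TypeError, AttributeError):
        with open(path, "wb") as handle:
            cloudpickle.dump(obj, handle)
```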
@liamhuber liamhuber changed the base branch from main to patch_bump August 9, 2024 16:52
Base automatically changed from patch_bump to main August 13, 2024 17:18
liamhuber and others added 4 commits August 15, 2024 12:21
* [patch] Don't touch storage (#421)

* Remove unused import

* Never interact directly with the storage object

Always go through HasStorage methods. In practice, this was only .delete.

* Break the recursive ownership between HasStorage and StorageInterface (#423)

* [patch] Merge `HasStorage` back into `Node` (#424)

* Refactor: merge HasStorage back into Node

* Indicate that StorageInterface is abstract

* Refactor: Move storage out of the mixin submodule

* 🐛 pass node through methods

We were never getting to the cloudpickle deletion branch

* Format black

---------

Co-authored-by: pyiron-runner <pyiron@mpie.de>

* [minor] Checkpointing (#425)

* 🐛 compare like-objects

* Distinguish saving and checkpointing

* Rename flag to reflect checkpoint behaviour

* Reflect save/checkpoint difference in StorageInterface

It only needs save now -- checkpointing is the responsibility of the node itself

* [minor] No `storage` property (#426)

* Expose the backend as a node method kwarg

* Use the backend kwarg directly in tests

* Expose backend in storage deletion too

In case someone has their own custom backend

* Remove the storage property

And route the `storage_backend` kwarg to the post-setup routine

* Format black

* Allow the backend to be specified for checkpoints

* Update storage_backend docs and type hint

* Use storage_backend kwarg as intended in tests

Including a couple bug-fixes, where I was looping over backends but forgot to pass the loop value to the save method!

* 🐛 fix JSON edit typo

* Check for stored content to autoload

Even if the autoload backend is a custom interface

* Give Node.delete_storage consistent use of the backend arg

* Pass backend iterator variable to deletion method in tests

* Remove unused method

* Allow specifically mentioned backends to pass on missing files

* Refactor: remove code duplication

* Format black

---------

Co-authored-by: pyiron-runner <pyiron@mpie.de>
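With #426, the backend stops being state on a `storage` property and becomes an argument to the node-level methods; a hedged sketch of the resulting call pattern (method names taken from the commit titles, signatures assumed):

```python
wf.save(backend="pickle")             # choose the backend per call
wf.load(backend="pickle")             # loading takes the same choice
wf.delete_storage(backend="pickle")   # and so does deletion, in case
                                      # someone has a custom backend
```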

* [minor] Autoload (#427)

* Refactor: rename storage_backend to autoload

* Update autoload default behaviour on the base class

And note the difference in Workflow

* Add a test to demonstrate the issue

And ensure that, for macros at least, the default behaviour stays safe

* Format black

---------

Co-authored-by: pyiron-runner <pyiron@mpie.de>

* [minor] Housekeeping (#428)

* Clean up after macro test

* Refactor: rename overwrite_save to delete_existing_savefiles

* 🐛 Actually deactivate auto-loading for Node

* Remove void __init__ implementation

It just passes through

* Consistent and sensible kwarg ordering, and a doc

* Refactor: rename run_after_init to autorun

It's shorter and, IMO, better in line with the other kwarg names like "autoload"

* Delete storage in the order the nodes are instantiated

Otherwise if it fails before hard_input is defined, the finally clause never deletes the self.n1 storage because there's a second failure _inside_ finally

* Simplify setup and allow loading and running at init (#429)

* [minor] Shift responsibility (#430)

* Remove unused property

* Shorten variable name

* Refactor: rename method

* Call base method and remove specialist

* Condense has_contents

* Remove unused property

* Move methods from Node to StorageInterface

* Have StorageInterface.load just return the new instance

* Break dependence on Node.working_directory

* Refactor: rename obj to node in the storage module

Since it is the interface for Node instances

* 🐛 pass argument

* Remove tidy method

* Clean up using pathlib

* Rely on pathlib

This is almost exclusively because `pyiron_snippets.files.DirectoryObject` has no option to avoid the `self.create()` call at instantiation. That means if I want to peek at stuff, I need to create then clean it, which is killer slow. I'm very open to re-introducing `files` dependence again in the future, I just don't have the time to implement the necessary fixes, with test, and go through the release cycle. This is all under-the-hood, so should be safely changeable with a `pyiron_workflow` patch change later anyhow.

* Remove the post-setup cleanup

No longer necessary now that storage isn't creating directories willy nilly

* Simplify and remove unused

* Allow semantic objects to express as pathlib.Path

* Leverage new semantic path for storage

* Remove unneeded exception

* Remove unused imports

* Use a generator in the storage module for available backends

* Test loading failure

* Fix docstring

* Format black

* Remove unused import

* Don't overload builtin name

---------

Co-authored-by: pyiron-runner <pyiron@mpie.de>
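The pathlib rationale in the list above is concrete: a `pathlib.Path` is a cheap description of a location, and creating directories is a separate, explicit step. A minimal sketch of the pattern (directory names hypothetical):

```python
from pathlib import Path

save_location = Path("my_workflow") / "my_node"

# Peeking costs nothing and creates nothing on disk
has_contents = save_location.exists() and any(save_location.iterdir())

# Directories appear only when a save actually happens
save_location.mkdir(parents=True, exist_ok=True)
(save_location / "node.pickle").write_bytes(b"...")
```

Presumably `Semantic.as_path()` maps a node's place in its parent graph onto such a `Path` in the same lazy fashion.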

* [minor] interface flexibility (#432)

* Expose existing kwargs on the node methods

* Refactor: slide

* Allow storage to specify either a node xor a filename directly

* Add tests

* Expose backend-specific kwargs

And use them for the PickleStorage

* Add storage to the API

* Pass kwargs through the node interface to storage

* Add docstrings

* Refactor: rename methods

* Shorten kwargs docstrings

* Format black

* Deduplicate test class name

* Remove unnecessary passes

Codacy actually catching real things for once

---------

Co-authored-by: pyiron-runner <pyiron@mpie.de>
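Per the "node xor filename" item in this list, the storage interface derives the file location from a node, or takes a filename directly, but never both. A hedged sketch of what that mutually exclusive interface could look like (`PickleStorage` is named in the commits; the argument names are assumed):

```python
interface = PickleStorage()

interface.save(node=my_node)                   # location derived from the node's path
loaded = interface.load(filename="wf.pickle")  # or address a file directly

# An xor design raises if both (or neither) are supplied:
# interface.save(node=my_node, filename="wf.pickle")  # -> ValueError
```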

---------

Co-authored-by: pyiron-runner <pyiron@mpie.de>
# Conflicts:
#	.binder/environment.yml
#	.ci_support/environment.yml
#	.ci_support/lower_bound.yml
#	docs/environment.yml
#	notebooks/deepdive.ipynb
#	pyiron_workflow/mixin/storage.py
#	pyiron_workflow/node.py
#	pyiron_workflow/nodes/composite.py
#	pyiron_workflow/nodes/for_loop.py
#	pyiron_workflow/nodes/macro.py
#	pyiron_workflow/nodes/static_io.py
#	pyiron_workflow/workflow.py
#	pyproject.toml
#	tests/unit/nodes/test_macro.py
#	tests/unit/test_node.py
#	tests/unit/test_workflow.py
The one remaining use case was to generate a local subdirectory for `Node.draw`, but this is easy enough to replace with `Semantic.as_path()`

I snuck in some docstring edits too
* Update overview

* Update readme

* 🐛 pass body node executor when provided at init

* Update quickstart

* 🐛 facilitate dataclass subclassing

This is actually a weird issue with Python more than with us: a class inheriting from a dataclass passes `dataclasses.is_dataclass`, but doesn't act like one (see the sketch after this commit list).

* 🐛 fix readme example whitespace

* 🐛 don't let channel readiness be impacted when type checking is off

* Remove unused import

* Expose the custom storage error as part of the API

* Rewrite the deepdive

* Update Node docs

* Update node subclass docstrings

* Fix readme demo inconsistencies

* Clean up the save file produced by the deepdive
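The dataclass-subclassing quirk flagged above is reproducible with nothing but the standard library: `dataclasses.is_dataclass` answers `True` for any subclass of a dataclass, but an undecorated subclass gains no fields of its own.

```python
from dataclasses import dataclass, fields, is_dataclass

@dataclass
class Base:
    x: int = 0

class Child(Base):  # inherits from a dataclass, but is not decorated itself
    y: int = 1      # a plain class attribute, invisible to dataclasses

print(is_dataclass(Child))              # True -- __dataclass_fields__ is inherited
print([f.name for f in fields(Child)])  # ['x'] -- no 'y'
Child(x=2)                              # fine
# Child(y=2)                            # TypeError: unexpected keyword argument
```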
* [minor] Replace `HasToDict` mixin with `HasStateDisplay` mixin

Functionality is similar -- the point is to get a nice JSON representation for child classes. But the public interface is different (hence the minor flag), and in particular we don't use `to_dict`, which might interfere with other pyiron activities. There are also some changes to exactly what the display looks like, but since it's just a nice/useful thing for humans to look at, this is non-critical and can be improved with patches as needed. A sketch of the idea follows this commit list.

* Remove unused import
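A minimal sketch of the display-mixin idea referenced above (names and structure assumed; this is not the real implementation):

```python
class HasStateDisplay:
    """Recursively render instance state as a JSON-friendly structure.

    Unlike a `to_dict`, this is only for human eyes -- there is no
    round-trip promise -- so values can safely degrade to their repr.
    """

    def display_state(self) -> dict:
        return {
            key: value.display_state()
            if isinstance(value, HasStateDisplay)
            else repr(value)
            for key, value in self.__dict__.items()
        }

    def _repr_json_(self) -> dict:
        # standard IPython/Jupyter hook, so notebooks render it nicely
        return self.display_state()
```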
In optimistic anticipation of tracking the connection of input and output without using an explicit cache (e.g. traitlets to track when channel values change)
@liamhuber liamhuber marked this pull request as ready for review August 22, 2024 23:56
@liamhuber liamhuber merged commit 3041942 into main Aug 22, 2024
15 of 17 checks passed
@liamhuber liamhuber deleted the minor_bump branch August 22, 2024 23:56