
[patch] Bump development status to Beta #482


Merged: 3 commits, Sep 30, 2024
5 changes: 3 additions & 2 deletions docs/README.md
@@ -17,18 +17,19 @@
`pyiron_workflow` is a framework for constructing workflows as computational graphs from simple python functions. Its objective is to make it as easy as possible to create reliable, reusable, and sharable workflows, with a special focus on research workflows for HPC environments.

Nodes are formed from python functions with simple decorators, and the resulting nodes can have their data inputs and outputs connected.
Unlike regular python functions, they operate in a delayed way.
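
For orientation, here is a minimal sketch of node creation and delayed execution. The decorator path (`Workflow.wrap.as_function_node`) and the output-scraping behavior shown here are assumptions about the package's usage, not something this diff establishes:

```python
from pyiron_workflow import Workflow

# Assumed decorator location; the key point is that a plain function becomes a node
@Workflow.wrap.as_function_node
def AddOne(x: int) -> int:
    y = x + 1
    return y  # the returned variable name is assumed to become the output label

node = AddOne(x=41)   # instantiating the node does not execute it
result = node.run()   # execution is delayed until explicitly requested
print(result)         # 42
```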

By allowing (but not demanding, in the case of data DAGs) users to specify the execution flow, both cyclic and acyclic graphs are supported.
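
As a sketch of what explicitly specifying execution flow could look like, reusing `AddOne` from the sketch above; the `>>` signal syntax and the `starting_nodes` attribute are assumptions here rather than something this diff establishes:

```python
wf = Workflow("flow_example")
wf.first = AddOne(x=0)
wf.second = AddOne(x=wf.first)  # data connection: feed first's output to second
wf.first >> wf.second           # assumed: ">>" wires the execution (signal) order
wf.starting_nodes = [wf.first]  # assumed attribute naming the user-chosen start
wf.run()
```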

By scraping type hints from decorated functions, both new data values and new graph connections are (optionally) required to conform to hints, making workflows strongly typed.
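
A sketch of what the hint checking implies in practice; the attribute path for assigning input and the exact failure mode are assumptions:

```python
@Workflow.wrap.as_function_node  # same assumed decorator path as above
def Halve(x: int) -> float:
    half = x / 2
    return half

n = Halve()
# With hints scraped from the signature, a non-conforming value is expected to
# be rejected at assignment/connection time rather than silently accepted.
n.inputs.x = "not an int"
```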

Individual node computations can be shipped off to parallel processes for scalability. (This is a beta-feature at time of writing; standard python executors like `concurrent.futures.ThreadPoolExecutor` and `ProcessPoolExecutor` work, and the `Executor` executor from [`executorlib`](https://github.com/pyiron/executorlib) is supported and tested; `executorlib`'s more powerful flux- and slurm- based executors have not been tested and may fail.)
Individual node computations can be shipped off to parallel processes for scalability. Standard python executors like `concurrent.futures.ThreadPoolExecutor` and `ProcessPoolExecutor` work, but so does, e.g., the `Executor` executor from [`executorlib`](https://github.com/pyiron/executorlib), which facilitates running on HPC. It is also straightforward to run an entire graph on a remote process, e.g. a SLURM allocation, by locally saving the graph and remotely loading, running, and re-saving. Cf. [this notebook](../notebooks/hpc_example.ipynb) for some simple examples.
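
A sketch of pushing one node's computation to a separate process, reusing `AddOne` from above. Whether the `executor` attribute takes a live executor instance or a re-instantiable (constructor, args, kwargs) recipe may depend on the version, so the assignment below is an assumption:

```python
from concurrent.futures import ProcessPoolExecutor

wf = Workflow("parallel_example")
wf.heavy = AddOne(x=1)
# Assumed interface: point the node at a way of obtaining an executor
wf.heavy.executor = (ProcessPoolExecutor, (), {})
wf.run()
```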

Once you're happy with a workflow, it can easily be turned into a macro for use in other workflows. This allows the clean construction of increasingly complex computation graphs by composing simpler graphs.
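
A sketch of wrapping a small graph as a macro, reusing `AddOne` from above; the decorator path and the convention that the first argument is the macro instance are assumptions:

```python
@Workflow.wrap.as_macro_node  # assumed decorator path
def AddTwo(self, x: int):
    # "self" is assumed to be the macro instance that owns the child nodes
    self.first = AddOne(x=x)
    self.second = AddOne(x=self.first)
    return self.second  # assumed: the returned child (or its output) defines the macro output

nested = AddTwo(x=0)
print(nested.run())  # 2, under the assumptions above
```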

Nodes (including macros) can be stored in plain text as python code, and imported by future workflows for easy access. This encourages and supports an ecosystem of useful nodes, so you don't need to re-invent the wheel. When these python files are in a properly managed git repository and released in a stable channel (e.g. conda-forge), they fulfill most requirements of the [FAIR](https://en.wikipedia.org/wiki/FAIR_data) principles.

Executed or partially-executed graphs can be stored to file, either by explicit call or automatically after running. These can be reloaded (automatically on instantiation, in the case of workflows) and examined/rerun, etc.
Executed or partially-executed graphs can be stored to file, either by explicit call or automatically after running. These can be reloaded (automatically on instantiation, in the case of workflows) and examined/rerun, etc. If your workflow fails, it will (by default) save a recovery file for you to restore it at the time of failure.
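
A sketch of the save/reload cycle described above, reusing `AddOne`; the reload-on-instantiation behavior is stated in the text, but the method name `save()` and the output-access path are assumptions:

```python
wf = Workflow("persistent_example")
wf.step = AddOne(x=1)
wf.run()
wf.save()  # assumed name for the explicit save call

# A fresh instantiation with the same label is expected to find and load the save
reloaded = Workflow("persistent_example")
print(reloaded.step.outputs.y.value)  # "y" assumed as the scraped output label
```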

## Installation

12 changes: 12 additions & 0 deletions pyiron_workflow/mixin/run.py
@@ -33,6 +33,18 @@ class Runnable(UsesState, HasLabel, HasRun, ABC):
Child classes can optionally override :meth:`process_run_result` to do something
with the returned value of :meth:`on_run`, but by default the returned value just
passes cleanly through the function.

The `run` cycle is broken down into sub-steps:
- `_before_run`: prior to the `running` status being set to `True`
- `_run`: after the `running` status has been set to `True`
- `_finish_run`: what is done to the results of running, and when `running` is
set to `False`
- `_run_exception`: What to do if an exception is encountered
- `_run_finally`: What to do after _every_ run, regardless of whether an exception
was encountered

Child classes can extend the behavior of these sub-steps, including introducing
new keyword arguments.
"""

def __init__(self, *args, **kwargs):
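As a sketch of extending one of these sub-steps in a child class (only the method names come from the docstring above; the signatures, and the omission of `Runnable`'s other abstract requirements, are assumptions):

```python
from pyiron_workflow.mixin.run import Runnable

class LoggedRunnable(Runnable):  # hypothetical subclass for illustration only
    def _run_finally(self, *args, **kwargs):
        # Extend the step that runs after every attempt, exception or not
        super()._run_finally(*args, **kwargs)
        print(f"{self.label}: run attempt finished")
```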
10 changes: 9 additions & 1 deletion pyiron_workflow/node.py
@@ -76,6 +76,8 @@ class Node(
- In addition to operations, some methods exist for common routines, e.g.
casting the value as `int`.
- When running their computation, nodes may or may not:
- If already running, check for serialized results from a process that
survived the death of its original process
- First update their input data values using kwargs
- (Note that since this happens first, if the "fetching" step later occurs,
any values provided here will get overwritten by data that is flowing
@@ -100,10 +102,12 @@
the execution flow
- Running the node (and all aliases of running) returns a representation of data
held by the output channels (or a futures object)
- If an error is encountered _after_ reaching the state of actually computing the
- If an error is encountered _after_ reaching the state of actually running the
node's task, the status will get set to failure
- Nodes can be instructed to run at the end of their initialization, but will exit
cleanly if they get to checking their readiness and find they are not ready
- Nodes can suppress raising errors they encounter by setting a runtime keyword
argument.
- Nodes have a label by which they are identified within their scope, and a full
label which is unique among the entire semantic graph they exist within
- Nodes can run their computation using remote resources by setting an executor
@@ -140,6 +144,10 @@ class Node(
IO data is not pickle-able.
- Saving is triggered manually, or by setting a flag to make a checkpoint save
of the entire graph after the node runs.
- Saving the entire graph can be set to happen at the end of a particular
node's run with a checkpoint flag.
- A specially named recovery file for the entire graph will (by default) be
automatically saved if the node raises an exception.
- The pickle storage interface comes with all the same caveats as pickle and
is not suitable for storage over indefinitely long time periods.
- E.g., if the source code (cells, `.py` files...) for a saved graph is
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -19,7 +19,7 @@ readme = "docs/README.md"
keywords = [ "pyiron",]
requires-python = ">=3.10, <3.13"
classifiers = [
"Development Status :: 3 - Alpha",
"Development Status :: 4 - Beta",
"Topic :: Scientific/Engineering",
"License :: OSI Approved :: BSD License",
"Intended Audience :: Science/Research",