[Feature] Export to Python Workflow Definition by jan-janssen · Pull Request #882 · pyiron/executorlib

jan-janssen · 2026-01-04T17:22:48Z

As demonstrated in https://github.com/pyiron-dev/executorlib-export-python-workflow-definition/

Example:

from executorlib import SingleNodeExecutor, get_item_from_future

function_str = """
def get_sum(x, y):
    return x + y

def get_prod_and_div(x, y):
    return {"prod": x * y, "div": x / y}

def get_square(x):
    return x ** 2
"""

with open("workflow.py", "w") as f:
    f.write(function_str)

from workflow import get_sum, get_prod_and_div, get_square

with SingleNodeExecutor(export_workflow_filename="workflow.json") as exe:
    future_prod_and_div = exe.submit(get_prod_and_div, x=1, y=2)
    future_prod = get_item_from_future(future_prod_and_div, key="prod")
    future_div = get_item_from_future(future_prod_and_div, key="div")
    future_sum = exe.submit(get_sum, x=future_prod, y=future_div)
    future_result = exe.submit(get_square, x=future_sum)
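
Once the `with` block exits, the exported file can be inspected with the standard `json` module. The sketch below is illustrative only. It assumes the file holds top-level `nodes` and `edges` lists, which this page does not confirm; the per-entry keys mirror the ones visible in the review diffs further down. It reads a small hand-written sample rather than real executorlib output, and `summarize_workflow` is a hypothetical helper.

```python
import json
import os
import tempfile

# Hand-written sample; the top-level "nodes"/"edges" layout is an assumption.
sample_workflow = {
    "nodes": [
        {"id": 0, "type": "function", "value": "workflow.get_sum"},
        {"id": 1, "type": "input", "value": 1, "name": "x"},
        {"id": 2, "type": "input", "value": 2, "name": "y"},
    ],
    "edges": [
        {"target": 0, "targetPort": "x", "source": 1, "sourcePort": None},
        {"target": 0, "targetPort": "y", "source": 2, "sourcePort": None},
    ],
}


def summarize_workflow(path):
    """Return (number of nodes, number of edges) stored in a workflow JSON file."""
    with open(path) as f:
        content = json.load(f)
    return len(content.get("nodes", [])), len(content.get("edges", []))


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "workflow.json")
    with open(sample_path, "w") as f:
        json.dump(sample_workflow, f)
    node_count, edge_count = summarize_workflow(sample_path)
```

Pointing `summarize_workflow` at the real `workflow.json` would give the same kind of counts that the new tests assert on, provided the top-level layout matches the assumption.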

Summary by CodeRabbit

New Features
- Added support for exporting workflow dependency graphs to JSON files across all executor types (Flux, SLURM, and single-node executors).
Tests
- Added new test suite validating workflow graph export functionality with arithmetic and NumPy array workflows, ensuring proper graph structure and output.

coderabbitai · 2026-01-04T17:22:59Z

Walkthrough

Adds an optional export_workflow_filename parameter to multiple executors and the interactive DependencyTaskScheduler, and implements export_dependency_graph_function to serialize workflow nodes/edges to a JSON file; the filename is used on scheduler exit to write the workflow JSON.
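
The "write on exit" pattern described above can be sketched with a plain context manager. This is a minimal illustration, not executorlib's implementation: `RecordingScheduler`, its `submit` method, and the JSON layout are all hypothetical, and only the parameter name `export_workflow_filename` comes from the PR.

```python
import json
import os
import tempfile
from typing import Optional


class RecordingScheduler:
    """Hypothetical stand-in for a scheduler that exports its graph on exit."""

    def __init__(self, export_workflow_filename: Optional[str] = None):
        self._export_workflow_filename = export_workflow_filename
        self._nodes = []

    def submit(self, fn, **kwargs):
        # Record the call instead of executing it.
        self._nodes.append(
            {"id": len(self._nodes), "function": fn.__name__, "kwargs": kwargs}
        )

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Write the recorded graph only when a filename was supplied.
        if self._export_workflow_filename is not None:
            with open(self._export_workflow_filename, "w") as f:
                json.dump({"nodes": self._nodes}, f)
        return False  # never swallow exceptions raised inside the with block


def add(x, y):
    return x + y


with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, "workflow.json")
    with RecordingScheduler(export_workflow_filename=target) as scheduler:
        scheduler.submit(add, x=1, y=2)
    with open(target) as f:
        exported = json.load(f)
```

Deferring the write to `__exit__` means the file reflects every task submitted inside the block, at the cost of producing nothing if the process dies before the block closes.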

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Executor constructors: `src/executorlib/executor/flux.py`, `src/executorlib/executor/slurm.py`, `src/executorlib/executor/single.py` | Added optional `export_workflow_filename` parameter to public executor constructors (`FluxJobExecutor`, `FluxClusterExecutor`, `SlurmClusterExecutor`, `SlurmJobExecutor`, `SingleNodeExecutor`, `TestClusterExecutor`) and threaded the value through to the underlying scheduler/executor construction calls. |
| Scheduler integration: `src/executorlib/task_scheduler/interactive/dependency.py` | Added `export_workflow_filename: Optional[str]` to `DependencyTaskScheduler.__init__`, stored as `_export_workflow_filename`, updated `_generate_dependency_graph` logic, and changed `__exit__` to call export vs. plot depending on the provided filename. Imported the export function. |
| Graph export implementation: `src/executorlib/task_scheduler/interactive/dependency_plot.py` | Added `export_dependency_graph_function(node_lst, edge_lst, file_name="workflow.json")` which formats nodes/edges (handles functions, inputs, numpy arrays) and writes a JSON workflow file. Added `json` and `numpy as np` imports. |
| Tests: `tests/test_singlenodeexecutor_pwd.py` | Added tests exercising SingleNodeExecutor with `export_workflow_filename="workflow.json"`, asserting node/edge counts and removing the generated file in teardown. |

Sequence Diagram

sequenceDiagram
    participant User
    participant Executor as Executor<br/>(Flux/Slurm/Single)
    participant Scheduler as DependencyTask<br/>Scheduler
    participant Export as export_dependency<br/>_graph_function
    participant FS as File System

    User->>Executor: init(export_workflow_filename)
    Executor->>Scheduler: __init__(..., export_workflow_filename=...)
    Scheduler->>Scheduler: store _export_workflow_filename

    Note over Executor,Scheduler: Workflow runs, tasks scheduled/executed

    Executor->>Scheduler: __exit__()
    alt _export_workflow_filename provided
        Scheduler->>Export: export_dependency_graph_function(nodes, edges, filename)
        Export->>Export: build JSON structure (nodes, edges, ports)
        Export->>FS: write JSON file (filename)
    else no export filename
        Scheduler->>Scheduler: call plot_dependency_graph_function(plot_filename)
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Worker for overflow queue #763 — Modifies Flux executor constructor/scheduler wiring in the same flux.py constructors where export_workflow_filename was added.
Upgrade terminate function #715 — Alters the same Flux executor constructors and parameter threading patterns that this change extends.
Add plot functionality for new future selector #820 — Changes interactive dependency plotting plumbing; closely related to the new export/plot branching in DependencyTaskScheduler.

Poem

🐰 A ribbon of nodes in a neat JSON line,

I hop and I nibble, this graph looks fine.
From Flux to Slurm and the scheduler's nest,
I save every edge so your workflows rest.
Hooray for exports — a carrot for the test! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 56.25%, which is below the required threshold of 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main feature being added: exporting workflow definitions to a JSON file format, which is the primary change across all modified files. |

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/executorlib/task_scheduler/interactive/dependency_plot.py (1)
263-271: Non-JSON-serializable values may cause TypeError at runtime.

The else branch serializes n["value"] directly. If the value is a non-JSON-serializable object (e.g., a custom class instance, datetime, or other complex types), json.dump will raise a TypeError. Consider adding a fallback to convert such values to strings.
🔎 Example defensive approach
         else:
+            try:
+                # Test if value is JSON serializable
+                json.dumps(n["value"])
+                value = n["value"]
+            except (TypeError, ValueError):
+                value = str(n["value"])
             pwd_nodes_lst.append(
                 {
                     "id": n["id"],
                     "type": n["type"],
-                    "value": n["value"],
+                    "value": value,
                     "name": n["name"],
                 }
             )
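
An alternative to the try/except probe above is the `default` hook of `json.dump` / `json.dumps`, which the encoder calls only for objects it cannot serialize itself. The sketch below is a suggestion in the spirit of the nitpick, not code from the PR; `dump_node` is a hypothetical helper name.

```python
import json
from datetime import datetime


def dump_node(node):
    """Serialize a node dict, stringifying values json cannot encode."""
    # `default` is invoked only for objects the encoder does not know how to handle.
    return json.dumps(node, default=str)


plain = dump_node({"id": 0, "type": "input", "value": 3, "name": "x"})
fallback = dump_node(
    {"id": 1, "type": "input", "value": datetime(2026, 1, 4), "name": "t"}
)
```

Either approach trades round-trip fidelity for robustness: a stringified value cannot be turned back into the original object when the workflow file is loaded.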

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7b357c5 and 8683249.

📒 Files selected for processing (5)

src/executorlib/executor/flux.py
src/executorlib/executor/single.py
src/executorlib/executor/slurm.py
src/executorlib/task_scheduler/interactive/dependency.py
src/executorlib/task_scheduler/interactive/dependency_plot.py

🧰 Additional context used

🧬 Code graph analysis (1)

src/executorlib/task_scheduler/interactive/dependency.py (1)

src/executorlib/task_scheduler/interactive/dependency_plot.py (2)

export_dependency_graph_function (237-307)

plot_dependency_graph_function (206-234)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)

GitHub Check: unittest_openmpi (ubuntu-24.04-arm, 3.13)
GitHub Check: unittest_mpich (ubuntu-latest, 3.13)
GitHub Check: unittest_mpich (ubuntu-latest, 3.12)
GitHub Check: unittest_mpich (ubuntu-latest, 3.11)
GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
GitHub Check: unittest_mpich (ubuntu-22.04-arm, 3.13)
GitHub Check: unittest_mpich (ubuntu-24.04-arm, 3.13)
GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
GitHub Check: unittest_openmpi (macos-latest, 3.13)
GitHub Check: unittest_openmpi (ubuntu-22.04-arm, 3.13)
GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
GitHub Check: unittest_win
GitHub Check: unittest_old
GitHub Check: unittest_flux_openmpi
GitHub Check: minimal

🔇 Additional comments (10)

src/executorlib/executor/slurm.py (2)

105-106: LGTM!

The export_workflow_filename parameter is correctly added to the constructor signature and properly propagated to DependencyTaskScheduler. The documentation is appropriately updated.

Also applies to: 231-231

320-321: LGTM!

The export_workflow_filename parameter is correctly added and propagated in SlurmJobExecutor, consistent with the pattern in SlurmClusterExecutor.

Also applies to: 404-404

src/executorlib/executor/flux.py (2)

109-110: LGTM!

The export_workflow_filename parameter is correctly integrated into FluxJobExecutor with proper documentation and propagation to the underlying DependencyTaskScheduler.

Also applies to: 195-195

301-302: LGTM!

The export_workflow_filename parameter is correctly integrated into FluxClusterExecutor, following the same consistent pattern as other executor classes.

Also applies to: 430-430

src/executorlib/executor/single.py (2)

98-99: LGTM!

The export_workflow_filename parameter is correctly integrated into SingleNodeExecutor with proper documentation and propagation.

Also applies to: 177-177

270-271: LGTM!

The export_workflow_filename parameter is correctly integrated into TestClusterExecutor, maintaining consistency with the other executor implementations.

Also applies to: 368-368

src/executorlib/task_scheduler/interactive/dependency.py (3)

68-74: Verify the conditional logic for _generate_dependency_graph.

The logic sets _generate_dependency_graph = True when plot_dependency_graph_filename is not None OR when export_workflow_filename is None. This means if neither filename is provided, the graph is still generated (but not saved anywhere useful since plot_dependency_graph_function with filename=None displays inline in Jupyter).

Was the intention to only generate the graph when at least one filename is provided, or is the inline Jupyter display the expected fallback behavior?
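
The question can be made concrete by tabulating the condition as this comment describes it next to a hypothetical "at least one filename" variant. Both function names are invented for illustration, and neither body is copied from executorlib's source.

```python
from itertools import product


def generate_as_described(plot_filename, export_filename):
    # Condition as worded in the review comment above.
    return plot_filename is not None or export_filename is None


def generate_if_any_filename(plot_filename, export_filename):
    # Hypothetical alternative: generate only when some output is requested.
    return plot_filename is not None or export_filename is not None


options = [None, "file"]
# Map (plot_filename, export_filename) -> (described result, alternative result).
table = {
    (plot, export): (
        generate_as_described(plot, export),
        generate_if_any_filename(plot, export),
    )
    for plot, export in product(options, options)
}
```

The two predicates disagree exactly in the rows where no plot filename is given, which are the cases the comment asks about.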

219-230: LGTM!

The __exit__ method correctly dispatches to either export_dependency_graph_function or plot_dependency_graph_function based on the provided filename. The export path correctly uses the new JSON export function when export_workflow_filename is specified.

15-20: LGTM!

The import of export_dependency_graph_function is correctly added alongside the existing imports from the same module.

src/executorlib/task_scheduler/interactive/dependency_plot.py (1)

2-8: Add numpy to the project dependencies.

The code imports numpy at line 8 for handling np.ndarray serialization in the new export_dependency_graph_function function. However, numpy is not declared as a dependency in pyproject.toml. Since this function is called from src/executorlib/task_scheduler/interactive/dependency.py and numpy is not transitively available through the graph optional dependencies, add numpy to either the main dependencies or the graph optional dependency group.
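
Besides declaring numpy as a dependency, one possible way to keep it optional is a guarded import. The sketch below shows that pattern under the assumption that arrays only need converting to lists; `to_jsonable` is a hypothetical helper and not part of the PR.

```python
try:
    import numpy as np  # optional; only needed when arrays are present
except ImportError:  # numpy is not installed
    np = None


def to_jsonable(value):
    """Convert numpy arrays to lists; return every other value unchanged."""
    if np is not None and isinstance(value, np.ndarray):
        return value.tolist()
    return value


converted = to_jsonable([1, 2, 3])
```

With this pattern the module imports cleanly without numpy, and array handling activates only when numpy is available.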

coderabbitai · 2026-01-04T17:30:25Z

src/executorlib/task_scheduler/interactive/dependency_plot.py

+    pwd_edges_lst.append(
+        {
+            "target": final_node["id"],
+            "targetPort": None,
+            "source": max([e["target"] for e in pwd_edges_lst]),
+            "sourcePort": None,
+        }
+    )


⚠️ Potential issue | 🟠 Major

Handle empty edge list to prevent ValueError on max().

If edge_lst is empty, pwd_edges_lst will be empty, and max([e["target"] for e in pwd_edges_lst]) will raise ValueError: max() arg is an empty sequence. This can occur with single-node workflows or graphs without edges.

🔎 Proposed fix

-    pwd_edges_lst.append(
-        {
-            "target": final_node["id"],
-            "targetPort": None,
-            "source": max([e["target"] for e in pwd_edges_lst]),
-            "sourcePort": None,
-        }
-    )
+    if pwd_edges_lst:
+        pwd_edges_lst.append(
+            {
+                "target": final_node["id"],
+                "targetPort": None,
+                "source": max(e["target"] for e in pwd_edges_lst),
+                "sourcePort": None,
+            }
+        )

🤖 Prompt for AI Agents

In src/executorlib/task_scheduler/interactive/dependency_plot.py around lines 293-300, the code calls max() on pwd_edges_lst which raises ValueError when pwd_edges_lst is empty; update the logic to handle an empty list by computing source only if pwd_edges_lst has elements (e.g., source = max(e["target"] for e in pwd_edges_lst)), otherwise set source to a safe fallback (for example final_node["id"] or None) or skip adding the edge entirely; implement a short conditional that selects the fallback and then append the dict using that source to avoid the max() on an empty sequence.
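
The "skip the edge" option can also be written with the `default` keyword of the built-in `max`, which avoids the `ValueError` on an empty iterable. This is a minimal sketch with hypothetical helper names (`last_target`, `append_result_edge`); the edge keys follow the snippet above.

```python
def last_target(edges):
    """Return the largest "target" id, or None when there are no edges."""
    return max((e["target"] for e in edges), default=None)


def append_result_edge(edges, final_node_id):
    """Append an edge to the final node only when a source can be determined."""
    source = last_target(edges)
    if source is not None:
        edges.append(
            {
                "target": final_node_id,
                "targetPort": None,
                "source": source,
                "sourcePort": None,
            }
        )
    return edges


empty_case = append_result_edge([], final_node_id=5)
filled_case = append_result_edge(
    [{"target": 2, "targetPort": "x", "source": 0, "sourcePort": None}],
    final_node_id=5,
)
```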

codecov · 2026-01-04T18:32:09Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.41%. Comparing base (7b357c5) to head (c0cf665).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #882      +/-   ##
==========================================
+ Coverage   93.34%   93.41%   +0.07%     
==========================================
  Files          38       38              
  Lines        1817     1837      +20     
==========================================
+ Hits         1696     1716      +20     
  Misses        121      121

coderabbitai

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/executorlib/task_scheduler/interactive/dependency.py (1)
37-38: Docstring contains duplicate attribute name with incorrect type.

Line 37 describes _generate_dependency_graph (bool) and line 38 reuses the same attribute name with type (str). The second entry should be _plot_dependency_graph_filename (str) based on the actual attribute.
🔎 Proposed fix
     Attributes:
         _future_hash_dict (Dict[str, Future]): A dictionary mapping task hash to future object.
         _task_hash_dict (Dict[str, Dict]): A dictionary mapping task hash to task dictionary.
         _generate_dependency_graph (bool): Whether to generate the dependency graph.
-        _generate_dependency_graph (str): Name of the file to store the plotted graph in.
+        _plot_dependency_graph_filename (str): Name of the file to store the plotted graph in.
+        _export_workflow_filename (str): Name of the file to store the exported workflow graph in.

🧹 Nitpick comments (2)

src/executorlib/task_scheduler/interactive/dependency.py (1)

216-227: Export and plot are mutually exclusive when both filenames are provided.

When both export_workflow_filename and plot_dependency_graph_filename are specified, only the export is performed due to the if/else structure. If this is intentional, consider documenting this precedence in the class docstring. Otherwise, consider supporting both operations when both filenames are provided.

🔎 Proposed fix to support both operations

         if self._generate_dependency_graph:
             node_lst, edge_lst = generate_nodes_and_edges_for_plotting(
                 task_hash_dict=self._task_hash_dict,
                 future_hash_inverse_dict={
                     v: k for k, v in self._future_hash_dict.items()
                 },
             )
             if self._export_workflow_filename is not None:
-                return export_dependency_graph_function(
+                export_dependency_graph_function(
                     node_lst=node_lst,
                     edge_lst=edge_lst,
                     file_name=self._export_workflow_filename,
                 )
-            else:
-                return plot_dependency_graph_function(
+            if self._plot_dependency_graph_filename is not None:
+                plot_dependency_graph_function(
                     node_lst=node_lst,
                     edge_lst=edge_lst,
                     filename=self._plot_dependency_graph_filename,
                 )
-        else:
-            return None
+        return None

tests/test_singlenodeexecutor_pwd.py (1)

8-15: Minor: Trailing whitespace on line 10.

Line 10 has trailing whitespace after return x + y. Pre-commit hooks should catch this, but worth noting.
🔎 Proposed fix
 def get_sum(x, y):
-    return x + y
-    
+    return x + y
+
 def get_prod_and_div(x, y):

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8683249 and c0cf665.

📒 Files selected for processing (2)

src/executorlib/task_scheduler/interactive/dependency.py
tests/test_singlenodeexecutor_pwd.py

🧰 Additional context used

🧬 Code graph analysis (2)

src/executorlib/task_scheduler/interactive/dependency.py (1)

src/executorlib/task_scheduler/interactive/dependency_plot.py (2)

export_dependency_graph_function (237-307)

plot_dependency_graph_function (206-234)

tests/test_singlenodeexecutor_pwd.py (2)

src/executorlib/executor/single.py (1)

SingleNodeExecutor (20-194)

src/executorlib/standalone/select.py (1)

get_item_from_future (42-54)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
GitHub Check: notebooks_integration

🔇 Additional comments (5)

src/executorlib/task_scheduler/interactive/dependency.py (2)

16-16: LGTM!

Import addition aligns with the new export functionality.

49-49: LGTM!

The new export_workflow_filename parameter is correctly added and the initialization logic properly enables dependency graph generation when either filename is provided.

Also applies to: 67-71

tests/test_singlenodeexecutor_pwd.py (3)

18-21: LGTM!

Proper tearDown implementation to clean up the generated workflow.json file after each test.

23-36: LGTM!

The test effectively validates the arithmetic workflow export with chained futures using get_item_from_future. The assertion that future_result.result() is None correctly reflects the graph generation mode behavior where tasks are recorded but not executed.

38-47: LGTM!

Good coverage for NumPy array handling in workflow export. The test verifies that numpy arrays are properly serialized in the workflow graph (converted to lists per export_dependency_graph_function implementation).
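
The test shape described in these comments (generate a file, assert on node and edge counts, remove the file in `tearDown`) can be sketched as follows. The sketch writes a hand-made JSON file as a stand-in for running an executor, and the top-level `nodes`/`edges` keys, the class name, and the file name are assumptions made for illustration.

```python
import json
import os
import unittest


class ExportFileTest(unittest.TestCase):
    filename = "workflow_sketch.json"

    def tearDown(self):
        # Remove the generated file so tests do not leak state.
        if os.path.exists(self.filename):
            os.remove(self.filename)

    def test_counts(self):
        # Stand-in for running an executor with export_workflow_filename set.
        with open(self.filename, "w") as f:
            json.dump(
                {
                    "nodes": [{"id": 0}, {"id": 1}],
                    "edges": [{"source": 0, "target": 1}],
                },
                f,
            )
        with open(self.filename) as f:
            content = json.load(f)
        self.assertEqual(len(content["nodes"]), 2)
        self.assertEqual(len(content["edges"]), 1)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExportFileTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```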

jan-janssen and others added 2 commits January 4, 2026 18:22

Export to Python Workflow Definition

c2a7d4c

[pre-commit.ci] auto fixes from pre-commit.com hooks

338d9c9

for more information, see https://pre-commit.ci

Format black

8683249

jan-janssen marked this pull request as draft January 4, 2026 17:28

coderabbitai bot reviewed Jan 4, 2026

jan-janssen and others added 2 commits January 4, 2026 19:29

Update dependency.py

c8be904

[pre-commit.ci] auto fixes from pre-commit.com hooks

df6c8ea

for more information, see https://pre-commit.ci

jan-janssen added 3 commits January 5, 2026 08:47

Add test

b36d3a4

extend test

e146432

fix

c0cf665

jan-janssen marked this pull request as ready for review January 5, 2026 08:05

coderabbitai bot reviewed Jan 5, 2026

jan-janssen merged commit 3ea87a0 into main Jan 10, 2026
56 of 63 checks passed

jan-janssen deleted the pwd branch January 10, 2026 07:06

jan-janssen merged 8 commits into main from pwd
