Test restart_limit by jan-janssen · Pull Request #813 · pyiron/executorlib

jan-janssen · 2025-09-08T16:52:45Z

Summary by CodeRabbit

Tests
- Expanded test coverage for single-node execution scenarios with block allocation and restart limits, improving confidence in crash/restart handling under multi-worker conditions.
- Strengthens validation of failure-handling behavior to reduce regressions and enhance overall reliability.
- No user-facing functionality or API changes; runtime behavior remains unchanged.

coderabbitai · 2025-09-08T16:52:53Z

Walkthrough

A test in tests/test_singlenodeexecutor_noblock.py was updated to pass an additional argument resource_dict={"restart_limit": 2} when constructing SingleNodeExecutor in TestFunctionCrashes.test_single_node_executor_block_allocation. No other files or interfaces were changed.

Changes

Cohort / File(s)	Summary
Tests `tests/test_singlenodeexecutor_noblock.py`	Updated SingleNodeExecutor instantiation in TestFunctionCrashes.test_single_node_executor_block_allocation to include resource_dict={"restart_limit": 2} alongside max_workers=2 and block_allocation=True.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Possibly related PRs

Add tests for function crashes #802 — Also modifies TestFunctionCrashes.test_single_node_executor_block_allocation in the same test file; likely part of coordinating changes to SingleNodeExecutor test parameters.

Poem

A rabbit taps the keys with cheer,
Adds restart_limit, crisp and clear.
Two workers hop, allocations block—
Tests now burrow, tick and clock.
In nested nodes, we safely roam,
Green carrots greet a passing comb. 🥕

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch restart_limit

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2025-09-08T16:55:20Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.12%. Comparing base (9bacadf) to head (32a9f6e).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #813   +/-   ##
=======================================
  Coverage   98.12%   98.12%           
=======================================
  Files          33       33           
  Lines        1649     1649           
=======================================
  Hits         1618     1618           
  Misses         31       31

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

tests/test_singlenodeexecutor_noblock.py (2)
160-160: Cover the non-block-allocation path too.

To ensure parity, add a sibling test that sets restart_limit without block allocation.

Apply:
@@
     def test_single_node_executor_block_allocation(self):
         with self.assertRaises(ExecutorlibSocketError):
-            with SingleNodeExecutor(max_workers=2, block_allocation=True, resource_dict={"restart_limit": 2}) as exe:
+            with SingleNodeExecutor(max_workers=2, block_allocation=True, resource_dict={"restart_limit": 2}) as exe:
                 f = exe.submit(exit_funct)
                 print(f.result())
 
+    def test_single_node_executor_no_block_allocation_with_restart_limit(self):
+        with self.assertRaises(ExecutorlibSocketError):
+            with SingleNodeExecutor(max_workers=2, resource_dict={"restart_limit": 2}) as exe:
+                f = exe.submit(exit_funct)
+                print(f.result())
160-160: Optional: add a behavior test that proves limited restarts happen.

Current test asserts the error path; consider an additional test where a function fails twice (e.g., via a counter persisted in a temp file) and then succeeds, and assert success only when restart_limit >= 2.

I can draft such a deterministic test if you want.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9bacadf and 32a9f6e.

📒 Files selected for processing (1)

tests/test_singlenodeexecutor_noblock.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

tests/test_singlenodeexecutor_noblock.py (1)

executorlib/executor/single.py (1)

SingleNodeExecutor (20-190)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)

GitHub Check: unittest_mpich (ubuntu-latest, 3.11)
GitHub Check: unittest_win
GitHub Check: unittest_old
GitHub Check: unittest_mpich (ubuntu-22.04-arm, 3.13)
GitHub Check: unittest_mpich (ubuntu-latest, 3.12)
GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
GitHub Check: unittest_mpich (ubuntu-24.04-arm, 3.13)
GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
GitHub Check: unittest_mpich (macos-latest, 3.13)
GitHub Check: unittest_flux_openmpi
GitHub Check: unittest_openmpi (ubuntu-22.04-arm, 3.13)
GitHub Check: unittest_openmpi (ubuntu-24.04-arm, 3.13)
GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
GitHub Check: notebooks
GitHub Check: notebooks_integration
GitHub Check: unittest_slurm_mpich
GitHub Check: unittest_flux_mpich
GitHub Check: unittest_openmpi (macos-latest, 3.13)
GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)

coderabbitai · 2025-09-08T16:58:54Z

tests/test_singlenodeexecutor_noblock.py

    def test_single_node_executor_block_allocation(self):
        with self.assertRaises(ExecutorlibSocketError):
-            with SingleNodeExecutor(max_workers=2, block_allocation=True) as exe:
+            with SingleNodeExecutor(max_workers=2, block_allocation=True, resource_dict={"restart_limit": 2}) as exe:


💡 Verification agent

🧩 Analysis chain

Good addition; please verify the key propagates end-to-end.

Passing resource_dict={"restart_limit": 2} looks correct and aligns with the SingleNodeExecutor docstring. However, default_resource_dict in executorlib/executor/single.py doesn’t include restart_limit, so this key must be explicitly handled downstream (e.g., in create_single_node_executor). Please verify that restart_limit is actually consumed and not silently ignored in both block_allocation=True/False paths.

Run this to confirm usage coverage and call paths:

🏁 Script executed:

#!/bin/bash # Where is restart_limit referenced? rg -nC3 -P '\brestart_limit\b' # Where is create_single_node_executor defined and how does it use resource_dict? rg -nC5 -P 'def\s+create_single_node_executor\b' rg -nC5 -P 'create_single_node_executor\(' # Check that worker (re)spawn logic reads restart_limit rg -nC3 -P '(restart|respawn|relaunch).*restart_limit|restart_limit.*(restart|respawn|relaunch)'

Length of output: 17219

🏁 Script executed:

#!/bin/bash # Inspect merging of default_resource_dict and resource_dict sed -n '140,200p' executorlib/executor/single.py # Inspect create_single_node_executor implementation sed -n '360,450p' executorlib/executor/single.py # Check for any direct usage of restart_limit in this file rg -n 'restart_limit' -C5 executorlib/executor/single.py

Length of output: 9024

restart_limit ignored in executor pipeline
The restart_limit key from resource_dict isn’t read anywhere in create_single_node_executor (executorlib/executor/single.py:create_single_node_executor) nor by the spawned schedulers, so setting resource_dict={"restart_limit":2} has no effect. Add explicit handling—e.g. extend create_single_node_executor to pop restart_limit and pass it into the OneProcessTaskScheduler/BlockAllocationTaskScheduler or their spawners.

🤖 Prompt for AI Agents

In tests/test_singlenodeexecutor_noblock.py around line 160, the test sets resource_dict={"restart_limit": 2} but this key is ignored by create_single_node_executor and the spawned schedulers; update executorlib/executor/single.py:create_single_node_executor to explicitly extract (pop) "restart_limit" from resource_dict and forward it as a parameter to the scheduler factory or spawner (OneProcessTaskScheduler/BlockAllocationTaskScheduler) when constructing/starting schedulers; ensure scheduler constructors/spawners accept a restart_limit argument (add parameter and store/use it) and propagate it into the process/task restart logic, and update any call sites that create schedulers to pass the extracted restart_limit so the test's resource_dict takes effect.

Test restart_limit

32a9f6e

coderabbitai bot reviewed Sep 8, 2025

View reviewed changes

jan-janssen merged commit 79debaa into main Sep 8, 2025
86 of 91 checks passed

jan-janssen deleted the restart_limit branch September 8, 2025 17:14

coderabbitai bot mentioned this pull request Feb 22, 2026

[Feature] Validate resource_dict using pydantic #933

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test restart_limit#813

Test restart_limit#813
jan-janssen merged 1 commit intomainfrom
restart_limit

jan-janssen commented Sep 8, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Sep 8, 2025 •

edited

Loading

Uh oh!

codecov bot commented Sep 8, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Sep 8, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jan-janssen commented Sep 8, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Sep 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

codecov bot commented Sep 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 8, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jan-janssen commented Sep 8, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Sep 8, 2025 •

edited

Loading

codecov bot commented Sep 8, 2025 •

edited

Loading