Conversation
Walkthrough

A test in tests/test_singlenodeexecutor_noblock.py was updated to pass an additional argument resource_dict={"restart_limit": 2} when constructing SingleNodeExecutor in TestFunctionCrashes.test_single_node_executor_block_allocation. No other files or interfaces were changed.

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@           Coverage Diff           @@
##             main     #813   +/-   ##
=======================================
  Coverage   98.12%   98.12%
=======================================
  Files          33       33
  Lines        1649     1649
=======================================
  Hits         1618     1618
  Misses         31       31
```

☔ View full report in Codecov by Sentry.
Actionable comments posted: 1
🧹 Nitpick comments (2)
tests/test_singlenodeexecutor_noblock.py (2)
160-160: Cover the non-block-allocation path too. To ensure parity, add a sibling test that sets restart_limit without block allocation.
Apply:
```diff
@@ def test_single_node_executor_block_allocation(self):
         with self.assertRaises(ExecutorlibSocketError):
-            with SingleNodeExecutor(max_workers=2, block_allocation=True, resource_dict={"restart_limit": 2}) as exe:
+            with SingleNodeExecutor(max_workers=2, block_allocation=True, resource_dict={"restart_limit": 2}) as exe:
                 f = exe.submit(exit_funct)
                 print(f.result())
+
+    def test_single_node_executor_no_block_allocation_with_restart_limit(self):
+        with self.assertRaises(ExecutorlibSocketError):
+            with SingleNodeExecutor(max_workers=2, resource_dict={"restart_limit": 2}) as exe:
+                f = exe.submit(exit_funct)
+                print(f.result())
```
160-160: Optional: add a behavior test that proves limited restarts happen. The current test asserts the error path; consider an additional test where a function fails twice (e.g., via a counter persisted in a temp file) and then succeeds, and assert success only when restart_limit >= 2.
I can draft such a deterministic test if you want.
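One way to make such a test deterministic is a function that fails until an attempt counter persisted on disk reaches a threshold. The sketch below is independent of executorlib: `run_with_restarts` is a stand-in for the executor's restart loop, not executorlib API, and all names here are illustrative.

```python
import os
import tempfile


def flaky_funct(counter_file: str, fail_times: int) -> str:
    # Persist an attempt counter across "restarts" (separate invocations).
    count = 0
    if os.path.exists(counter_file):
        with open(counter_file) as f:
            count = int(f.read())
    count += 1
    with open(counter_file, "w") as f:
        f.write(str(count))
    if count <= fail_times:
        raise RuntimeError(f"planned failure {count}")
    return "ok"


def run_with_restarts(funct, restart_limit: int, *args):
    # Stand-in for the executor's restart loop: one initial attempt
    # plus up to restart_limit restarts.
    for attempt in range(restart_limit + 1):
        try:
            return funct(*args)
        except RuntimeError:
            if attempt == restart_limit:
                raise


counter = os.path.join(tempfile.mkdtemp(), "counter.txt")
# Fails twice, succeeds on the third attempt, so restart_limit >= 2 is needed.
print(run_with_restarts(flaky_funct, 2, counter, 2))  # → ok
```

With restart_limit=1 the same function exhausts its budget on the second failure and the error propagates, which is exactly the assertion the suggested test would make.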
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/test_singlenodeexecutor_noblock.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_singlenodeexecutor_noblock.py (1)
executorlib/executor/single.py (1)
SingleNodeExecutor(20-190)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: unittest_mpich (ubuntu-latest, 3.11)
- GitHub Check: unittest_win
- GitHub Check: unittest_old
- GitHub Check: unittest_mpich (ubuntu-22.04-arm, 3.13)
- GitHub Check: unittest_mpich (ubuntu-latest, 3.12)
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
- GitHub Check: unittest_mpich (ubuntu-24.04-arm, 3.13)
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
- GitHub Check: unittest_mpich (macos-latest, 3.13)
- GitHub Check: unittest_flux_openmpi
- GitHub Check: unittest_openmpi (ubuntu-22.04-arm, 3.13)
- GitHub Check: unittest_openmpi (ubuntu-24.04-arm, 3.13)
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
- GitHub Check: notebooks
- GitHub Check: notebooks_integration
- GitHub Check: unittest_slurm_mpich
- GitHub Check: unittest_flux_mpich
- GitHub Check: unittest_openmpi (macos-latest, 3.13)
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
```diff
     def test_single_node_executor_block_allocation(self):
         with self.assertRaises(ExecutorlibSocketError):
-            with SingleNodeExecutor(max_workers=2, block_allocation=True) as exe:
+            with SingleNodeExecutor(max_workers=2, block_allocation=True, resource_dict={"restart_limit": 2}) as exe:
```
💡 Verification agent
🧩 Analysis chain
Good addition; please verify the key propagates end-to-end.
Passing resource_dict={"restart_limit": 2} looks correct and aligns with the SingleNodeExecutor docstring. However, default_resource_dict in executorlib/executor/single.py doesn’t include restart_limit, so this key must be explicitly handled downstream (e.g., in create_single_node_executor). Please verify that restart_limit is actually consumed and not silently ignored in both block_allocation=True/False paths.
Run this to confirm usage coverage and call paths:
🏁 Script executed:

```shell
#!/bin/bash
# Where is restart_limit referenced?
rg -nC3 -P '\brestart_limit\b'

# Where is create_single_node_executor defined and how does it use resource_dict?
rg -nC5 -P 'def\s+create_single_node_executor\b'
rg -nC5 -P 'create_single_node_executor\('

# Check that worker (re)spawn logic reads restart_limit
rg -nC3 -P '(restart|respawn|relaunch).*restart_limit|restart_limit.*(restart|respawn|relaunch)'
```

Length of output: 17219
🏁 Script executed:
#!/bin/bash
# Inspect merging of default_resource_dict and resource_dict
sed -n '140,200p' executorlib/executor/single.py
# Inspect create_single_node_executor implementation
sed -n '360,450p' executorlib/executor/single.py
# Check for any direct usage of restart_limit in this file
rg -n 'restart_limit' -C5 executorlib/executor/single.pyLength of output: 9024
restart_limit is ignored in the executor pipeline

The restart_limit key from resource_dict isn't read anywhere in create_single_node_executor (executorlib/executor/single.py) nor by the spawned schedulers, so setting resource_dict={"restart_limit": 2} has no effect. Add explicit handling: for example, extend create_single_node_executor to pop restart_limit and pass it into the OneProcessTaskScheduler/BlockAllocationTaskScheduler or their spawners.
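A minimal sketch of that handling, with hypothetical names (the real create_single_node_executor signature, the resource_dict keys, and the scheduler constructors may differ):

```python
def create_single_node_executor_sketch(resource_dict: dict, scheduler_factory):
    # Pop restart_limit so the remaining resource_dict can be passed through
    # unchanged to the worker spawner; default to 0 (no restarts).
    restart_limit = resource_dict.pop("restart_limit", 0)
    return scheduler_factory(resource_dict=resource_dict, restart_limit=restart_limit)


# Toy factory standing in for OneProcessTaskScheduler / BlockAllocationTaskScheduler:
def toy_factory(resource_dict, restart_limit):
    return {"resource_dict": resource_dict, "restart_limit": restart_limit}


print(create_single_node_executor_sketch({"cores": 1, "restart_limit": 2}, toy_factory))
```

Popping (rather than reading) the key matters: it keeps restart_limit out of the per-task resource dictionary that downstream spawners validate.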
🤖 Prompt for AI Agents

In tests/test_singlenodeexecutor_noblock.py around line 160, the test sets resource_dict={"restart_limit": 2}, but this key is ignored by create_single_node_executor and the spawned schedulers. Update executorlib/executor/single.py:create_single_node_executor to explicitly extract (pop) "restart_limit" from resource_dict and forward it as a parameter to the scheduler factory or spawner (OneProcessTaskScheduler/BlockAllocationTaskScheduler) when constructing/starting schedulers. Ensure scheduler constructors/spawners accept a restart_limit argument (add the parameter and store/use it) and propagate it into the process/task restart logic, and update any call sites that create schedulers to pass the extracted restart_limit so the test's resource_dict takes effect.
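On the scheduler side, the stored limit would then gate respawns. A hypothetical sketch of that restart logic (class name and hook are illustrative, not executorlib API):

```python
class RestartLimitedSchedulerSketch:
    # Hypothetical scheduler fragment: counts worker crashes and only
    # allows respawning while the restart budget is not exhausted.
    def __init__(self, restart_limit: int = 0):
        self._restart_limit = restart_limit
        self._restarts = 0

    def should_respawn(self) -> bool:
        if self._restarts < self._restart_limit:
            self._restarts += 1
            return True
        return False  # budget exhausted: surface the error instead


sched = RestartLimitedSchedulerSketch(restart_limit=2)
print([sched.should_respawn() for _ in range(4)])  # → [True, True, False, False]
```

With restart_limit=2 the scheduler permits exactly two respawns before giving up, which is the behavior the test's ExecutorlibSocketError assertion ultimately exercises.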