-
Notifications
You must be signed in to change notification settings - Fork 6
[Feature] Implement run time limits #930
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
1ff45bc
dbafe20
a7e0b10
4e9fa1b
0e85b85
d4b84cf
8490745
5a42db3
f882c05
5d57c1d
9c97a99
3997de5
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,9 +1,11 @@ | ||
| import os | ||
| import unittest | ||
| from time import sleep | ||
|
|
||
| import numpy as np | ||
|
|
||
| from executorlib import FluxJobExecutor | ||
| from executorlib.api import ExecutorlibSocketError | ||
|
|
||
|
|
||
| try: | ||
|
|
@@ -20,6 +22,11 @@ def calc(i): | |
| return i | ||
|
|
||
|
|
||
| def delayed_calc(i): | ||
| sleep(2) | ||
| return i | ||
|
|
||
|
|
||
| def mpi_funct(i): | ||
| from mpi4py import MPI | ||
|
|
||
|
|
@@ -110,6 +117,24 @@ def test_single_task(self): | |
| [[(1, 2, 0), (1, 2, 1)], [(2, 2, 0), (2, 2, 1)], [(3, 2, 0), (3, 2, 1)]], | ||
| ) | ||
|
|
||
| def test_run_time_limit(self): | ||
| with FluxJobExecutor( | ||
| max_cores=1, | ||
| resource_dict={"cores": 1}, | ||
| flux_executor=self.executor, | ||
| block_allocation=False, | ||
| pmi_mode=pmi, | ||
| ) as p: | ||
| f1 = p.submit(delayed_calc, 1, resource_dict={"run_time_limit": 1}) | ||
| f2 = p.submit(delayed_calc, 2, resource_dict={"run_time_limit": 5}) | ||
| self.assertFalse(f1.done()) | ||
| self.assertFalse(f2.done()) | ||
| self.assertEqual(f2.result(), 2) | ||
| self.assertTrue(f1.done()) | ||
| self.assertTrue(f2.done()) | ||
| with self.assertRaises(ExecutorlibSocketError): | ||
| f1.result() | ||
|
Comment on lines
+120
to
+136
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Timing logic is sound; one redundant assertion and a minor flakiness risk to be aware of. The core timing invariant holds in both sequential and parallel execution:
Two minor points:
♻️ Suggested cleanup (remove redundant assertion) self.assertEqual(f2.result(), 2)
self.assertTrue(f1.done())
- self.assertTrue(f2.done())
with self.assertRaises(ExecutorlibSocketError):
f1.result()🤖 Prompt for AI Agents |
||
|
|
||
| def test_output_files_cwd(self): | ||
| dirname = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "..", "..")) | ||
| os.makedirs(dirname, exist_ok=True) | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -55,7 +55,8 @@ def test_generate_slurm_command(self): | |
| exclusive=True, | ||
| openmpi_oversubscribe=True, | ||
| slurm_cmd_args=["--help"], | ||
| run_time_limit=250, | ||
| ) | ||
| self.assertEqual(len(command_lst), 12) | ||
| reply_lst = ['srun', '-n', '1', '-D', '/tmp/test', '-N', '1', '--cpus-per-task=2', '--gpus-per-task=1', '--exact', '--oversubscribe', '--help'] | ||
| self.assertEqual(len(command_lst), 13) | ||
| reply_lst = ['srun', '-n', '1', '-D', '/tmp/test', '-N', '1', '--cpus-per-task=2', '--gpus-per-task=1', '--exact', '--oversubscribe', '--time=5', '--help'] | ||
| self.assertEqual(command_lst, reply_lst) | ||
|
Comment on lines
+58
to
62
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🧩 Analysis chain🏁 Script executed: ast-grep --pattern 'def generate_slurm_command($$$)'Repository: pyiron/executorlib Length of output: 4450 Replace The current implementation at line 165 of 🧰 Tools🪛 Ruff (0.15.1)[error] 61-61: Probable insecure usage of temporary file or directory: "/tmp/test" (S108) 🤖 Prompt for AI Agents |
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--timeconversion uses floor+1 instead of ceiling, wasting a minute for exact multiples of 60.run_time_limit // 60 + 1producesfloor(seconds / 60) + 1, notceil(seconds / 60). For exact multiples (e.g.,run_time_limit=3600), this allocates 61 minutes instead of 60. Non-exact multiples (e.g.,3601 s) happen to produce the correct ceiling, so the formula is inconsistent.🐛 Proposed fix using proper ceiling arithmetic
🤖 Prompt for AI Agents